Sunday, November 15, 2015

Games of Timing for Security in Dynamic Environments


Last week I had the pleasure of attending the 6th International Conference of Decision and Game Theory for Security based in the (very swanky) Grand Royale Hotel next to Hyde Park. The first of these conferences occured in 2010 and has continued to attract
international scholars and researchers. Decision and Game theory has proved to be a valuable and systematic framework used to deal with the intricacies involved in making rational decisions in security.

Games of timing have been an interest of game theorists for quite some time now. However, according to recent literature, most have been assuming that the cost and effectiveness of actions are time-independent. An example of this is the game of FlipIt[1]
between an attacker and defender playing for possession of a resource.
 A talk at this year's conference given by Ben Johnson removes this assumption by setting up a model that captures the discovery, repair and exploitation of software vulnerabilities. An example given by Ben was Microsoft officially stopping all support of Windows XP in 2014. Security professionals conjectured that attackers would begin ``stockpiling'' vulnerabilities until this time in order to exploit them fully.
The question that Johnson et. al.'s paper [2] and his talk tries to answer is whether the attacker should wait to exploit these vulnerabilities considering there is a risk of the internal security team discovering them. If you require a more in depth description of anything discussed here then please refer to the authors' paper.[2]

The authors consider the life cycle of the software as finite with an end time $t = T$.  The rate of vulnerability discovery is an arbitrary function of time $V(t)$. The defender's security investment $d(t):[0,T] \to \mathbb{R}_{\geq 0}$ is a function of time representing the level of her security investment. The timing of the attacker's exploitation of vulnerabilities is $a(t):[0,T] \to \mathbb{R}_{\geq 0}$. This is the amount of time the attacker will wait to exploit the vunerability when discovered at time $t$.

The repair process for vulnerabilities is determined by an exponential decay function $e^{-\lambda \tau}$. This represents the probability that a vulnerability remains exploitable at a time $\tau$ after discovery. The authors consider their model in both continuous and discrete time. For the purpose of this blog post, i'll just discuss continuous-time.

The defender's total cost over the time interval $[0,T]$ is simply $\int^T_{t=0} d(t) dt$. The expected loss to the defender if the attacker waits for time $a(t)$ to exploit a vulnerability found at time $t$ is $e^{-\lambda a(t)} \frac{R}{d(t + a(t))}$ where $R$ is a scaling factor between security costs and losses. These can be combined to find the defenders total payoff
$$
 U_d = - \int^T_{t=0} \bigg{(}d(t) + V(t)  e^{-\lambda a(t)} \frac{R}{d(t + a(t))}\bigg{)}dt
$$
Likewise, the attacker's payoff is given by
$$
 U_a =  \int^T_{t=0} \bigg{(} V(t)  e^{-\lambda a(t)} \frac{R}{d(t + a(t))}\bigg{)}dt
$$
The authors then turn to looking for conditions for useful Nash Equilibria[3] to exist.
Interestingly, it can be shown that the attacker will never wait if the vulnerability function satisfies
$$
\frac{V(t+a)}{V(t)} \geq e^{-\lambda a(t)}
$$
for every $t \in [0,T]$ and $a(t) \in [0, T-t]$. This begins to provide guidelines for finding vulnerability repair rates in order to stop the attacker from stockpiling vulnerabilities. Further work suggested by Ben is to study the Stackelberg equilibria[4] of the game.


References
[1] -  https://eprint.iacr.org/2012/103.pdf
[2] -  http://aronlaszka.com/papers/johnson2015games.pdf
[3] - https://en.wikipedia.org/wiki/Nash_equilibrium
[4] - https://en.wikipedia.org/wiki/Stackelberg_competition