## Friday, April 15, 2016

### Study group: Protecting Obfuscation against Algebraic Attacks

The study group this week was on the paper 'Protecting Obfuscation against Algebraic Attacks' by Boaz Barak, Sanjam Garg, Yael Tauman Kalai, Omer Paneth and Amit Sahai.

In this blog post we will consider a slimmed-down version of the accomplishments of the paper for the sake of brevity. Moreover, while the idea presented is complicated, it is not particularly complex, and in such circumstances it is often better to favour concision over precision.

Introduction
Obfuscation is the idea of mixing up a program so that it becomes unintelligible but functionality is preserved. Essentially, a person's access to a function is replaced with access to an oracle which computes that function.

Common examples of its use are:
• Crippleware: software can be obfuscated so that it is difficult to reverse engineer and so that premium parts of software can be packaged with non-premium parts by 'crippling' them with obfuscation, so that a product key can later be used to unlock the premium content.
• Public key encryption from private key encryption: Given a private key encryption scheme, one can obfuscate an encryption circuit with the secret key embedded and release this publicly. This (roughly) gives a public key scheme.

What this paper achieves
In this paper, the authors build on the core of an older and well-known breakthrough obfuscation construction of 2013 by Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, Amit Sahai and Brent Waters. In doing so, they present a new obfuscator; while the starting point of the new construction is exactly the same as the old, they depart from it when providing countermeasures against the (speculated) attacks against the scheme.

This construction is in the multi-linear model, which roughly translates to assuming the existence of Graded Encoding Schemes (GESs). (Note that in the literature, the terms 'Multi-linear Maps' and 'Graded Encoding Schemes' are used interchangeably, though there are important differences. In this discussion, we are considering Graded Encoding Schemes.)

GESs provide a way of encoding (NB that it is not encrypting) elements of some plaintext space so that they have an associated level, which is any subset of $\{1,...,k\}$ for some integer $k$. Intuitively, they offer a good way of preventing an adversary from doing 'too much' with the data he is given (in particular, 'too many' multiplications). Importantly, they allow multi-linear operations, and therefore, in particular, matrix multiplication. The properties they have which are pertinent for this post are the following:
• Encodings may be added only if they are encoded at the same level.
• Encodings may be multiplied only if the levels at which the elements are encoded are disjoint. When a multiplication is performed, the resulting element is encoded at the union of these disjoint levels.
• One can check if a top level encoding (i.e. an element encoded at level $\{1,...,k\}$) is an encoding of zero.

Since their inception, GESs have been rife with problems: no current construction is secure for all desired applications. Fortunately, for their use in obfuscation, all known attacks on GESs (at least for the CLT15 GES at the time of writing) require more information than an obfuscation of a function provides. With a relatively modest overhead (compared with the notorious time and space requirements for GESs), a recent paper propounded the use of random matrices to achieve a similar result to the paper we are discussing without the use of a GES.

Starting point
We start with a circuit which we want to obfuscate; i.e., a Boolean function $C:\{0,1\}^n\to\{0,1\}$. From this, we create a matrix branching program: this is a sequence of pairs of matrices where each pair has an associated input bit. (We will not cover here how exactly one does this, but a good starting point for more information would be the paper from 1986 by David Barrington.) We denote this sequence of pairs by $(B_{i,0},B_{i,1})_{\textsf{inp}(i)}$ where $\textsf{inp}(i)$ is a function describing which bit of the input $x$ the $i$th point in the branching program inspects. A single input bit can be inspected multiple times throughout the program. To evaluate the program on an input $x$, for each point in the program $i$, we check the value $b$ of the $\textsf{inp}(i)$th input bit of $x$ and select the matrix $B_{i,b}$. We then multiply all of these matrices in order and look at the result, $\prod_{i=1}^{l} B_{i,x_{\textsf{inp}(i)}}$ where $l$ is the length of the branching program and $x_{\textsf{inp}(i)}$ is the $\textsf{inp}(i)$th bit of input $x$. If the product is the identity matrix, we interpret this as $C(x)=1$, and otherwise $C(x)=0$.

The end goal is to 'mix up' matrices so that, if we give them away, they don't reveal information about the underlying circuit but they still allow evaluations of the original function.

We start by encoding (using the GES) the $i$th pair of matrices under the singleton level $\{i\}$. This is useful because, as previously noted, we can still perform multi-linear operations like matrix multiplication, but not do 'too many' operations with them (for example, we cannot compute powers of matrices). The product received from an evaluation of the branching program will be a top-level encoding, to which we can then apply our zero-testing procedure (which, recall, requires that an element be top-level in order to check whether or not it encodes zero).

Attacks and Preventative Measures
Attacks on an obfuscation scheme are when an adversary seeks to glean some auxiliary information about the structure of the branching program by seeing what happens when computations are done erroneously. Attacks fall into four categories, and we will see how this paper deals with each of them. The first two are covered in the same way as the original construction on which this paper builds.
1. Non-algebraic attacks: performing non-multi-linear operations on the matrix elements, or failing to respect the algebraic structure of the matrices as matrices over the plaintext ring. These attacks are largely dealt with by encoding with the GES.
2. Out of order attacks: taking one matrix from each pair from the branching program and finding the product without respecting their branching program order. This is actually very simple to deal with, and involves a trick by Joe Kilian which he described in his paper of 1988. We choose a set of random matrices $\{R_0,\ldots,R_{l-1}\}$ (where $l$ is the length of the branching program), set $R_l=R_0$, and then define: \begin{eqnarray} \widetilde{B_{i,0}}&=R_{i-1}B_{i,0}R_{i}^{-1}\\ \widetilde{B_{i,1}}&=R_{i-1}B_{i,1}R_{i}^{-1} \end{eqnarray} for $i=1,\ldots,l$. This randomised version of the branching program computes the same output as the original, but now products that don't respect the branching program order will be encodings of random elements.
3. Mixed Input attacks: it was stated that more than one point in the branching program may inspect the same input bit (i.e. $\textsf{inp}$ is not necessarily injective). This attack involves performing an unfaithful evaluation by switching from reading, say, the $j$th bit of the input as 0 at one point in the program and then as 1 at some later point. This attack is dealt with using the innovative idea of straddling set systems. Encoding matrices of a branching program at levels given by sets in these systems, instead of at singleton sets, effectively locks together matrices at different points in the branching program. This is exactly what we need in order to prevent this kind of attack (and also the next).

We provide an elucidating example rather than the technical definition (details can be found in the paper, linked above): Let $S=C\cup D$ where $C=\{\{1\},\{2,3\},\{4,5\}\}$ and $D=\{\{1,2\},\{3,4\},\{5\}\}$. We call $S$ a straddling set system for the universe $U=\{1,2,3,4,5\}$. Note that $C$ and $D$ are the only exact covers of $U$ in $S$.

Now, suppose that we had a branching program which considered input bit 1 at positions 1 and 3 in the branching program, and input bit 2 at positions 2 and 4. I.e., we have
 $1$ $2$ $3$ $4$ $\textsf{inp}(1)=1$ $\textsf{inp}(2)=2$ $\textsf{inp}(3)=1$ $\textsf{inp}(4)=2$ $B_{1,0}$ $B_{2,0}$ $B_{3,0}$ $B_{4,0}$ $B_{1,1}$ $B_{2,1}$ $B_{3,1}$ $B_{4,1}$
We define a straddling set system for each input bit: $S_1$ is
$C_1=\{\{1\},\{2,3\}\}$ and $D_1=\{\{1,2\},\{3\}\}$
and $S_2$ is
$C_2=\{\{4\},\{5,6\}\}$ and $D_2=\{\{4,5\},\{6\}\}$

Then denoting by $(M,A)$ an encoding of the matrix $M$ at level $A\subset \{1,\ldots,k\}$, we encode as follows:
 $1$ $2$ $3$ $4$ $\textsf{inp}(1)=1$ $\textsf{inp}(2)=2$ $\textsf{inp}(3)=1$ $\textsf{inp}(4)=2$ $(B_{1,0},\{1\})$ $(B_{2,0},\{4\})$ $(B_{3,0},\{2,3\})$ $(B_{4,0},\{5,6\})$ $(B_{1,1},\{1,2\})$ $(B_{2,1},\{4,5\})$ $(B_{3,1},\{3\})$ $(B_{4,1},\{6\})$
Since $C_1$ and $D_1$ are the only exact covers of $U_1$ and $C_2$ and $D_2$ are the only exact covers of $U_2$, if we want to obtain a top-level product at the end, choosing, for example, $B_{1,0}$ forces us also to choose $B_{3,0}$ (and we cannot choose $B_{3,1}$). This exactly prevents mixed-input attacks.
4. Partial evaluation attacks: computing subproducts of matrices and comparing them. These attacks are dealt with using more complicated straddling set systems, but the idea of 'locking' matrices together is essentially the same. The branching program is extended so that each point is dependent on two input bits, and several redundant matrices are inserted into the program so that for every pair of input bits, there is a point in the program which inspects both of these bits. The straddling sets can then be constructed so that each input bit affects the matrix the evaluator must choose at every point in the program. Thus it greatly restricts the operations that can be performed, and stops these partial evaluation attacks.
The paper concludes with a proof that this obfuscator achieves what is known as virtual black-box obfuscation in the ideal graded encoding model. The really interesting thing in this author's opinion is the aforementioned translation of the straddling set technique to the obfuscator (for which, admittedly, a security proof has yet to materialise) which doesn't require a GES.