Tuesday, May 5, 2015

Eurocrypt 2015: Threshold Implementations

On Thursday morning, Vincent Rijman gave an invited talk on Threshold Implementations designed to be more secure against side-channel attacks. He started off discussing Shannon's Product Cipher[1](like a modern SP network). From a cryptanalysis point of view, this can be very difficult to break due to high diffusion. What this means is that if one character of plaintext is changed then several characters of ciphertext will be changed. Similarly, changing a character of the ciphertext will change several characters of plaintext. However, if we introduce side-channel attacks[2], attacks based on gaining information from the physical implementation of a cryptosystem such as timing information, power consumption, or the example given by Vincent, electromagnetic radiation. 

Changing the value of a logic cell from `0' to `1' consumes energy. The amount of energy the hardware consumes can depend on many things, such as transister design, process variations, length of connection line, crosstalk between lines and many more. In order to counter these attacks a few methods have been devised; 
  • The first is to try to balance the power consumption equally, this means it limits the amount of information gained from the power consumption. 
  • The next is Leakage-Resilient Cryptography such as Ephemeral Keys[3]. 
  • Finally, the method of masking. This splits the sensitive information into $d+1$ shares where $d$ is called the masking order. 
This talk concentrated on the latter technique. In the 2003 paper by Ishai et. al [4] they use masking in private circuits. In 2003 Trichina [5] wrote a paper on a masked multplier. But on its own this is not enough, since there are transient effects in hardware. What this means is that changing values takes time. We call this a transition period. These delays depend on the circuit lay-out. Transient effects account for almost all the power consumption of CMOS[6] circuits. The propogation of these transient effects depends on the input of the combinational circuit, and these dependencies are non-linear. Thus, we have a security breakdown.

Vincent then introduced a threshold implementation that doesn't rely on the behaviour of hardware implementations of combinational logic. The setup assumes that combinational logic leaks information on all its inputs. They do this by the method of secret sharing. An example of this is if we have a function $f$ and an input $x$ giving the output $y$. I.e. $y = f(x)$. We can split the function and hence the input into 3 seperate parts $f_1, f_2, f_3$ and $x_1, x_2,x_3$, respectively. We then create the setup such that the inputs of each split function has only 2 of the split inputs. So that we have
f_1(x_2, x_3) = y_1, \\
f_2(x_1, x_3) = y_2, \\
f_3(x_1, x_2) = y_3.

y_1+y_2+y_3 = y, \\
x_1+x_2+x_3 = x.
This looks extremely similar to multi-party computation. The main theorem that follows from this is: $\textit{The average power consumption of the circuit is independent of x.}$ An extremely informal proof of this is as follows. For each $i$, $f_i$ is independent of the input $x_i$, the power consumption of $f_i$ is independent of $x_i$. If $(x_{i-1}, x_{i+1})$ is independent of $x$, then the power consumption of $f_i$ is independent of $x$. The average power consumption of the circuit is equal to the sum of the average power consumptions of $f_i$. Hence, the power consumption of the circuit is independent of $x$. A few assumptions were necessary for this. The $x_i$'s need to be uniformally random. The implementation of the $f_i$'s depend only on $(x_1, \ldots, x_{i+1}, x_{i-1}, \ldots, x_n)$ and finally we obviously need suitable $f_i$ to exist! For linear $f$ it's easy to have the property
f_1(x_2, x_3, \ldots, x_n) +  \ldots + f_n(x_1, x_2, \ldots, x_{n-1}) = f(x_1, x_2, \ldots, x_n)
however for most interesting $f$ this is a research problem. A simple example of this is the multiplier. We have
z = f(x,y) = x \cdot y \\
z_1 = f_1(x_2,x_3,y_2,y_3) = x_2 y_2 + x_2 y_3 + x_3 y_2, \\
z_2 = f_1(x_1,x_3,y_1,y_3) = x_3 y_3 + x_1 y_3 + x_3 y_1, \\
z_3 = f_1(x_1,x_2,y_1,y_2) = x_1 y_1+ x_1 y_2 + x_1 y_3.
This is secure, even with transient effects. If we consider some facts we can gain about arbitrary functions. The size of the hardware increases with the number of shares. Functions with algebraic degree $n$ requires $n+1$ shares and strong ciphers have high algebraic degree. What this means is that the stronger the cipher, the higher the algebraic degree and hence the amount of shares required increases; meaning the larger the hardware required. 

A potential solution to this is using registers. The combinational logic between registers has lower algebraic degree and registers limit the propagation of transient effects. For this to work we need a few more assumptions. The inputs of each stage need to be uniformly distributed. The input of the second step must equal the output of the first step and the outputs of the first step need to be uniformly distributed. This can be done by remasking, or an extra property for $f_i$ is needed. This property is as follows; every $y_i$-tuple for the same $y$ must get an equal amount of "hits". The original multiplier does not fit this extra assumption, some binary representations of $y$ are actually "hit" more often. Changing the definition of the multiplier to what follows solves this.

z_1 = (x_3 + x_4)(y_2 + y_3) + x_2 + x_4, \\
z_2 = (x_1+ x_3)(y_1 + y_4) + y_1 + x_4, \\
z_3 = (x_2 + x_4)(y_1 + y_4) + x_2, \\
z_4 = (x_1 + x_2)(y_2 + y_3)  + y_1.
Further work suggested by Vincent is security against fault attacks, which induce hardware failure while measuring signals. Other suggested future work is security against attacks using non-linear combinations of signals measured at different times.

[1] - http://en.wikipedia.org/wiki/Product_cipher
[2] - http://en.wikipedia.org/wiki/Side-channel_attack
[3] - http://en.wikipedia.org/wiki/Ephemeral_key
[4] - http://www.cs.berkeley.edu/~daw/papers/privcirc-crypto03.pdf
[5] - https://eprint.iacr.org/2003/236.pdf
[6] - http://en.wikipedia.org/wiki/CMOS

Saturday, May 2, 2015

52 Things: Number 30: Roughly outline the BR security definition for key agreement

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this week we look at a security definition for authenticated key exchange.

Establishing a shared key between two parties is one of the oldest problems in cryptography, and turns out to be much harder than standard encryption, even when just considering definitions. Although the classic Diffie-Hellman protocol from 1976 seems to solve the problem, it provides no authenticity guarantee - i.e. that a key has been agreed with the right person - since a man-in-the-middle attack can easily be performed.

To model this kind of attack, and others, we need a security definition. There are two main approaches when defining the security of a key exchange protocol, namely those based on a symbolic model and those using a computational model. In the symbolic model, which become popular in the '90s after the classic paper on BAN logic, techniques from formal methods are used to model and analyse a protocol. The symbolic model is good for identifying attacks, but it is difficult for the underlying logic to capture all classes of attacks, so analysis in this model does not provide great security guarantees, but can be semi-automated using theorem provers.

In their seminal 1993 paper, Bellare and Rogaway instead created a game-based security definition for authenticated key exchange in a computational model, similar to the IND-CPA and IND-CCA definitions for encryption. In this model, cryptographic primitives are not assumed to be unbreakable, but instead we attempt to quantify the success probability of an adversary by computing their 'advantage' in a security game. The main feature of an adversary that we wish to encompass is that all communication is under the adversary's control: they can read, modify, delay and replay messages. They can also run any number of instances of the protocol simultaneously with other parties. The intuition behind the AKA security game is that the only way an adversary can get a party to accept an agreed key is by forwarding honest messages from a genuine protocol run, in which case they cannot possibly learn anything new.

The security game consists of a number of different oracles that an adversary can query. The three main oracles are the corruption oracle, which allows the adversary to take control of a chosen party, the key registration oracle, which registers a public key for any chosen user, and the message oracle, which is the main oracle used for passing messages. Note that messages are not sent directly between the participants, instead the adversary does this using the message oracle.

The message oracle is the main oracle allowing the adversary to create protocol sessions with parties (where they aim to establish a short-term, or ephemeral, shared key) and send messages. When querying the oracle, they can take one of the following actions:
  • Start a new session between two users
  • Learn the secret key of any terminated session
  • Send a message in an existing session and receive the response
The security game follows the real-or-random paradigm, similarly to standard definitions of encryption, by choosing a secret bit $b$; if $b=0$ then the adversary is given a random key for its challenge, otherwise it gets the real key. After interacting with the oracles, the adversary chooses a single session that has terminated, in which both parties are not corrupted and there is no 'matching' conversation where the key has been revealed (to prevent a trivial break), and receives a challenge key for this session. They win the game if they correctly guess $b$.

A protocol is said to be a secure authenticated key exchange protocol if it is correct, and any adversary's strategy is the above game is no better than random guessing. The above outline is only a rough sketch, of course, and there are many further details in the paper.

Thursday, April 30, 2015

Eurocrypt 2015: When life gives you Fully Homomorphic Encryption...

On Wednesday morning, Kristin Lauter gave an invited talk about the Practical Applications of Homomorphic Encryption. She started by talking about a recent competition on Secure Genome Analysis sponsored by the (American) National Institutes of Health at the iDASH Privacy & Security Workshop, which received a substantial amount of media coverage.

Kristin reasoned that this excitement was caused in part by the fact that with the rise of relatively inexpensive genome sequencing, privacy is becoming a fundamental problem. A person's genome can give information on genetic diseases that the person has, as well as partial information on the genetic make-up of close family members. Leaking this information can lead to genetic discrimination, for example from health insurance providers. Still, this kind of data can be put to good use as well, such as personalized medicine, biomedical research and ancestry or paternity tests, so we would like to give researchers access to this data without breaking the privacy of the data donors.

Therefore, the aim of the competition was to devise ways of performing analysis on genomic data in a secure manner, which was implemented with two different tracks. The first track considered the scenario where one party owns all the data, like a hospital that wants to protect the privacy of its patients, which is an ideal scenario for Homomorphic Encryption. The second track considered the scenario where there are multiple data owners that mutually distrust one another, which is an ideal scenario for Multi-Party Computation. There were teams from both industry and academia. The used data was part real data from donors and part simulated data.

Kristin described a few different tasks: counting the frequency of minor alleles, and computing the Hamming and Edit distances for the alignment of genomes. The frequency count can be performed by purely additive homomorphic solutions, but schemes based on RLWE allowing for more homomorphism were still competitive. On the other end of the spectrum was the Edit distance, which consists of a relatively deep circuit. However, in practice it is often good enough to have an approximation, as it allows for a shallower circuit such that even existing schemes can provide a practical solution. Each task had a different winner, indicating that there is currently no best solution for all applications.

This competition spurred fundamental progress in the field of secure genome analysis through the interaction between computer scientists, mathematicians, bio-informaticians and policy-makers. Kristin therefore also highlighted the importance of communicating the concept and possibilities of Homomorphic Encryption to the general public, in order to find more good applications. One major point of the invited talk was that there are applications out there right now that can be implemented using relatively shallow circuits, such as the aforementioned secure genome analysis, but also the evaluation of private models in the cloud and machine learning. These applications can be solved by the schemes we have today, even with their inefficiency drawbacks.

That is not to say there are no more challenges to be solved. Storage costs are a problem as Fully Homomorphic Encryption schemes suffer from substantial ciphertext expansion. The security is not always as well understood as we would like. The efficiency does not scale very well to deep circuits and large amounts of data. These are but some of the challenges that Kristin outlined at the end of her talk. So let us work together and make some lemonade!

Co-chairing Eurocrypt two years running, reflections of a program chair (Part 1)

How do you get selected as program co-chair in the first place? Well having been asked for ‘two' major IACR events (CHES 2008, and then for the pair Eurocrypt 2014 and 2015) it seems to me that the process is that somebody drops your name in a board or steering committee meeting … you better chose your ‘friends’ carefully! Once asked there is no way out as such opportunities do not come often and so you better go with the flow! 

I was asked often over the last couple of years how the ‘co-chairing’ was for me.  Now, because CHES embraced the tradition of having two chairs from the beginning (one selected from academia and one from industry) I have only ever have been a co-chair and I know nothing else. Hence, my standard answer was ‘fine’ (remember also my prior remark about friends) ...

Silly jokes aside: I cannot imagine doing such a role alone. Being part of a small team suits me. I was also blessed by working with three exceptional colleagues: Pankaj Rohatgi during CHES 2008, Phong Nguyen during Eurocrypt 2014 and Marc Fischlin during Eurocrypt 2015. I had a great rapport with all three, which contributed to being able to enjoy the madness! Over the three times that I co-chaired, I experienced different ways to distribute the workload: by dividing up the set of papers into two subsets, by dividing up into topics, and alternatively by simply time-slicing: one takes the lead on the various tasks when the other one can’t. All these strategies worked. 

Another question I was often asked  was how much time it takes. Well ‘a lot’ is my answer. I am not a fan of exact time keeping (no matter how many work sheets my employer asks me to fill out ... I have a ‘template’ anyway), because it costs me time that I don’t have! But if you want to get a feel for the input required then think about looking after about 200 ‘researcher entities’. They send you a paper and then they want feedback, which you get based on other people’s opinion. This should not be a time consuming task once all PC members are appointed, as the number of interactions is limited to a few which require manual involvement (thanks to Shai’s effort with our Web Review System). If only that was the case. My Inbox lists just under 1000 emails for Eurocrypt. Removing automated notifications for submissions and results and final versions, this leaves around 400 emails. Because I can’t afford a personal assistant, I have to read/think/act/respond/ myself, which in some cases takes a few minutes and in many cases requires many more  minutes during the ‘act’ and ‘respond’ phase. Luckily I have held a personal Fellowship since 2011 and so could spend plenty of time …

Yet another area of interest was that around how we (i.e. Marc and I) managed to convince the PC to go with our suggested tweaks for the review process (we introduced a two stage process and changed the submission requirements around paper length). My answer was ‘with difficulty’. There was at first a number of people who expressed their concerns in a very vocal manner. Which is fair enough. After quite some email exchanges, it became clearer that not everybody who was more ‘quiet’ was against our proposal. After some people offered us their support more openly, we were lucky enough that everybody ‘gave it a go’. 

It’s useful, I find, to experiment, but only if afterwards one also asks for feedback and reflects on it. So after all notifications had been sent out in 2015, and people had some time to ‘cool down’, I sent out a Doodle poll asking people for their views. Here are the results (121 people responded, I list the number of votes each statement received):

I like the fact that there were two rounds: 92 
I hated the fact that there were two rounds: 12
I think it was useful to only elicit rebuttals for the second round papers: 63
I think that everyone should have been allowed to write a rebuttal: 38
I think the increase in page limit was useful: 86
I think we should stick to 12 pages as limit for always and ever: 12
I want to have a page limit of 20 pages LNCS in the future: 38
I want to have a page limit of 25 pages LNCS in the future: 19
I want to have a page limit of 30 pages LNCS in the future: 31

I also received anonymous comments which I will supply to the IACR board in my report. They mainly touch on the perceived lack of impact of the rebuttals that authors wrote on the final decision. I think this is a valid point. As all second round authors were given the chance to write a rebuttal, most authors did take that opportunity. But for many papers reviewers hadn’t directly required clarifications from authors and so these papers probably didn’t benefit much then from that opportunity. Some papers however did get questions very directly to be addressed during the rebuttals and there the rebuttal had a useful role to play (even though it was limited to a single
iteration, i.e. we refrained from allowing lengthy conversations between reviewers and authors, which we inevitably would have had to moderate to preserve anonymity). 

I was to do some chairing again (which I will not any time soon) then I would want to discuss the question of only giving selected papers the opportunity to respond (based on reviewer recommendations). But one thing I took away from the review process is that the two stages, where one is short and the second one focused on only a smaller set of papers makes a lot of sense. 

This is the end of part 1 … 

... come back to the blog in a few days time to find out about what it was like to make decisions about accepting/rejecting papers, our attempts to create parallel sessions, and to read up on how the attendees though about the experiment! 

Eurocrypt 2015 Rump Session

It was on a sunny evening that the rump session commenced. The call for presentations was one of the most convoluted one ever conceived in the history of IACR. The inevitable effect was a fair amount of confusion during the submission process, leading to a brilliant program.

Elisabeth Oswald (with Marc Fischlin loitering in the background) gave an insightful view in early perceptions of the new submission format and the use of parallel sessions. They had conducted a questionnaire among Eurocrypt authors and program committee members. One of the questions related to the submission page limit, should it be 20, 25, or 30? The responses were split between short and long, with very little support for the middlle. (One day later, during the membership meeting Cachin mentioned the Board had settled on 30 pages LNCS style.) Yvo Desmedt followed up with a plea to publish corrections to proofs etc. more prominently.

Christina Bruszka (with Marc Fischlin lurking in the background) presented the best future conference award to Jens Groth, who will be chairing the 15th IMA Conference on Cryptography and Coding. For most other events announcement slides were made available during the break. Of note were TCC 2016, which will take twice next year! To unclutter the cryptacademic calendar, the TCC community has decided to migrate their conference from February/March to November. The migration will be in steps, so next year TCC 2016-A will be in January in Tel Aviv (submission deadline will be 13th July with notification on October 2nd, roughly a week before Eurocrypt'16 submission). TCC 2016-B will take place later in the year (November or December). Another event worth noticing taking place next week already (at the TU Eindhoven, Netherlands) is Security in Times of Surveillance.

The final talk of the rump session was by Ivan Damgård (with Marc Fischlin substituted by little fish swimming about). He had conceived a new cryptosystem secure in the presence of powerful adversaries by cleverly inverting the paradigm used by quantum cryptography. There if you try to eavesdrop the message, you will destroy the message. The dual approach advocated by Ivan is that if you try to eavesdrop the message, the message will destroy you! I am always amazed how long it sometimes takes to find ideas that are so obvious in hindsight. This is definitely one of them. Ivan had found a way to implement his revolutionary idea by means of a recently discovered fish. There were still some minor niggles that would prevent imminent deployment (for instance, tampering with the ciphertext could lead to destruction of the legitimate receiver), but a very promising research direction nonetheless. Ivan won the Best Rump Award for his presentation, fully deserved in my opinion!

Wednesday, April 29, 2015

Eurocrypt 2015: Bootstrapping for HElib

Fully homomorphic encryption allows evaluation of arbitrary functions on encrypted data. In an FHE scheme, the ciphertexts are "noisy". This noise is necessary for security reasons, but can interfere with decryption if it grows too large, and the noise increases with each homomorphic operation. This means that there is a limited number of homomorphic operations which can be performed, and we then have what is called a Somewhat Homomorphic Encryption scheme (SHE). In this context, bootstrapping is used as a noise management technique to turn an SHE scheme into an FHE one. This is done by homomorphically decrypting the ciphertext before the noise grows too large, in order to reduce the said noise so that more homomorphic operations can be performed. It is implemented by augmenting the public key with an encryption of the secret key (under the public key) $Enc_{pk}(sk)$, then by a homomorphic decryption which results in a "fresh" ciphertext. Note that for this to work, circular security is assumed.

There are three generations of homomorphic schemes, the first being the scheme proposed by Gentry. Then, there are the packing scheme (which is faster asymptotically) and the small noise one. So far as we know,  slow noise growth is incompatible with the ciphertext packing technique. This paper implements the ciphertext packing scheme, described by [BGV12].

Quick overview of the BGV scheme

The parameters of the scheme are integers $m$, $p$ and $q$. The plaintext space is integers in the group $R=\mathbb{Z}[x]/ (\Phi_{m}, p)$, where $\Phi_{m}$ is the $m^{th}$ cyclotomic polynomial. Ciphertexts and secret keys are vectors in $R$, and we require $q$ to be much larger than $p$. A ciphertext $c$ encrypts a plaintext $m$ with respect to a secret key $sk$ if it holds that $m=[[ \langle c, sk \rangle ]_{q}]_{p}$, where $[.]_{p}$ denotes modular reduction in an interval centered around zero (rather than in ${0, 1, ..., p-1}$) and $\langle ., . \rangle$ stands for inner product. The homomorphic operations are done over $R$. The scheme is also parameterised by a sequence of gradually decreasing moduli $q_{i}$, where we perform a modulus switch in order to reduce noise levels (bootstrapping is then performed upon reaching the last modulus).

Work presented

Bootstrapping for HElib which was presented today by Shai Halevi follows the method of [GHS12], and generalises it.  We take $q$ such that $q=2^{e}+1$ for some $e$ not too small. Recryption is done by storing $Enc_{pk}(sk)$ with respect to a larger plaintext space where $p'=2^{e+1}$. We then compute $u=[ \langle sk, c \rangle]_{2^{e}+1}$, and extract the plaintext as the xor of the least and most significant bits of $u$. $u$ is very easily and quickly computed, and it is the extracting of the most and least significant bits that is the most expensive part of the recryption procedure. It is done in three steps, the first of which is a linear transformation which outputs ciphertexts that have the coefficients of $u$ in the plaintext slots. Then a bit extraction operation is performed and an inverse linear transformation, which recovers a ciphertext encrypting the initial plaintext.

The linear transformations are achieved through imposing a hypercube structure to the group $R_{p}=\mathbb{Z}[x]/(\Phi_{m}, p)$ with one dimension per factor of $m$, where we recall that $p$ can be any prime power. The inverse linear transformation is implemented as a sequence of multi-point polynomial evaluations operations, one per each dimension of the hypercube. It can happen that some of these linear transformations are linear over the base field, but not over the extension. To avoid that, various constraints are applied to the parameters.


This paper generalises [GHS12] and [AP13] by extending their methods from the prime $p=2$ to any prime power $p^{r}$. The implementation presented supports recryption of packed ciphertexts, i.e. recryption of ciphertexts encrypting vectors of plaintexts. The procedure also has low enough depth so as to allow for a significant processing in between recryptions while remaining efficient. The ePrint version can be found here.

[GHS12]: http://eprint.iacr.org/2011/566.pdf
[AP13]: http://www.cc.gatech.edu/~cpeikert/pubs/simpleboot.pdf

Tuesday, April 28, 2015

Eurocrypt 2015: Nobody Should Know What You Bought Last Summer

It has emerged that Bitcoin does not really live up to the promise of anonymous transactions because the transactions can be linked publicly. While there are mixers that offer to break this link (but sometimes simply steal your money), it would be desirable to have unlinkability built in to a crypto currency. The challenge here is the following: If the transaction graph is public as it is in Bitcoin, it is easy to find the predecessor of a new transaction to check its validity. For real anonymity, one has to be able to prove that one knows the secret key of one of the existing transaction without revealing which one.

Jens Groth presented such a scheme at his talk. More technically, the scheme allows to prove that one out of a list of Pedersen commitments hides zero. Recall that Pedersen commitments are homomorphic with respect to a group operation both for the secret and the randomness. This makes them particularly suited for the so-called Sigma protocols, which consist of three rounds and can be made non-interactive in the random oracle model by hashing the first message to get the challenge. Furthermore, they offer special soundness, that is, one can extract the secret from a number of correct answers to different challenges, and special honest-verifier zero-knowledge, which means that the protocol can be simulated without knowing the secret if the challenge is chosen independently of the first message.

In summary, they authors propose a Sigma protocol to prove that one out of $N$ Pedersen commitments hides a zero. It provides computational $(\log N + 1)$-soundness and logarithmic communication. The core idea is to understand the $N$ commitments as the leaves of a binary tree and then let the prover commit to the bits of the path to the commitment he knows. This allows to obliviously compute a commitment of the secret known to the prover from the list of commitments. The communication is logarithmic in the size of the list because it is already known to the verifier. On the downside, the verification complexity still is linear in the list size, which seems to be inherent to the problem.

The paper can be found on ePrint.