Friday, December 16, 2011

IMA Cryptography & Coding - Day 2

The invited talk on Day 2 of IMACC was given by Ivan Damgård on secure multiparty computation (MPC). The goal of MPC is to jointly compute some function on a number of players' secret inputs. Ivan began with a nice explanation of how to achieve this with information theoretic security. He described how MACs can be used to verify the computations of each player, thus achieving security against active adversaries.

The talk then went on to describe recent results for the case of a dishonest majority, where more than half of the players may be corrupt. Clearly this scenario is important to focus on if MPC schemes are to be deployed in the real world. Early efforts to obtain this level of security were very inefficient, but some recent constructions use a homomorphic encryption scheme to improve this. This was first suggested at Eurocrypt 2011 by Bendlin, Damgård, Orlandi and Zakarias (pdf). By dividing the computation into two phases ('offline' and 'online'), they manage to perform a secure multiplication in around 8ms. These results are improved upon in a recent paper by Damgård, Pastro, Smart and Zakarias (pdf), which reduces the complexity and storage requirements of the scheme even further. Multiparty computation is an active area of research in Bristol, and it was nice to see it represented in Oxford with Ivan's excellent talk.

One session that interested me in the afternoon was on cryptanalysis and security analysis. This began with Martin Albrecht presenting joint work with Kenny Paterson on breaking a recent IBE scheme of Chen et al. (pdf). The basic attack involved exploiting the relationship between the master secret key and the private keys for each identity that are generated from the master key. Specifically, each private key is given by a quadratic function in the master key components. So given enough of these keys, an attacker has a system of multivariate quadratic equations in the master key that can be solved using a Gröbner basis computation.
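To make the attack strategy concrete, here is a toy illustration (not the actual Chen et al. system) of how a lex Gröbner basis "triangularises" a small set of quadratic key equations so that the unknowns can be read off by back-substitution. It uses sympy, and the coefficients are invented purely for demonstration.

```python
# Toy example only: a made-up master key (s1, s2) = (3, 5) and three invented
# quadratic relations that vanish at it, standing in for extracted private keys.
from sympy import symbols, groebner, solve

s1, s2 = symbols('s1 s2')
key = {s1: 3, s2: 5}

templates = [s1*s2 + 2*s1 - s2, s1**2 + 3*s2, s2**2 - s1]
eqs = [t - t.subs(key) for t in templates]   # each equation is zero at the true key

G = groebner(eqs, s1, s2, order='lex')       # lex order gives a 'triangular' basis
print(G)                                     # last polynomial is univariate in s2
print(solve(list(G), [s1, s2]))              # the true key (3, 5) appears among the roots
```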
Whilst this attack was considered in the original paper, the authors severely underestimated the efficiency with which Gröbner bases can be computed. Apparently this problem is surprisingly common in the literature, and stems from using an analysis of the XL algorithm to obtain security estimates, whereas in practice the F4 and F5 algorithms are far more efficient. Although the original paper proved the security of the scheme assuming the security of DHIES, this attack does not allow you to break DHIES. The key flaw is that the proof models a specific function as a random oracle, when in fact the output of this function depends on the secret key. So the use of a random oracle here is completely inappropriate, and invalidates the security proof.

The third talk in the session, given by Julia Borghoff, followed in a similar vein, attacking the lightweight stream cipher A2U2. The attack also involved creating a system of multivariate quadratic equations and using Gröbner basis computations to solve this. Firstly an efficient chosen plaintext attack was presented, followed by a more complex 'guess and determine' attack requiring only a known plaintext. However, some doubts were raised afterwards as to how feasible this attack would be in practice due to the high number of Gröbner basis applications required.

In between these talks on cryptanalysis, our recent PhD graduate Steve Williams presented his work on the security analysis of the SSH key exchange protocol. Although the application layer of SSH has been thoroughly analysed in the past, no-one had yet analysed the vital key exchange protocol used to establish the shared secret. Since the initial messages transmitted in the protocol contain a signature, it is not possible to obtain indistinguishability of the key generated in this stage. So the security of this key can only be shown to be one-way, which Steve successfully proved. A transformation is then applied to this shared key, and Steve went on to show that this transformation in fact means that the final keys generated in SSH are indistinguishable.

Thursday, December 15, 2011

IMA Coding and Crypto, Day 1 Morning

This year is the 25th anniversary of the conference colloquially known as Cirencester. It is traditionally held every two years at the agricultural college in Cirencester. It's a rather remote location, which has the effect that in the evening everybody cosies up in the college bar with its big wood fire. A few years back I remember I kept having to climb a big mound of something or other just in order to get telephone reception.

Anyway, this year's Cirencester is not actually in Cirencester; instead it is in Oxford, at Lady Margaret Hall. This college used to be women only, but these days it is mixed (just like Royal Holloway, University of London). It is located a bit north of the centre, about a 10-15 minute walk, and the buildings are all rather new (certainly by Oxford college standards). So it's not quite as remote as the real Cirencester, but it is a decent emulation (including some of the agricultural whiffs that occasionally manage to make it into the lecture theatre).

The first talk of the conference was an invited talk by David Naccache, consisting of three parts. The first part dealt with code obfuscation. The goal of code obfuscation is simple: you take a piece of code and modify it in such a way that it behaves exactly the same as the original, yet all other information about the original code is hidden. For instance, a programmer might want to hide the control flow of a program, or a cryptographer might want to hide a key that is a fixed input to the program. Obfuscation is a rather hard problem cryptographically, rife with negative results. The main negative result is due to Barak et al., who showed that there exist programs that cannot be obfuscated at all: any obfuscation will invariably leak a predicate about the original. In his talk, Naccache described programs with a much stronger degree of unobfuscatability: any (functionally equivalent) obfuscation will necessarily leak the original code. The main tool of this remarkable result is the concept of quines, that is, programs that print their own source code. While this may sound rather negative, Naccache also pointed out a positive consequence: their transformation takes any program and turns it into a fully unobfuscatable one, which creates a potential mechanism for detecting forgeries of some original piece of code.
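For readers who have not met quines before, here is a minimal Python example of a program that prints its own source code; it is only meant to illustrate the concept, not the construction used in the talk.

```python
# The two lines below print themselves exactly when run on their own (the classic %r trick).
s = 's = %r\nprint(s %% s)'
print(s % s)
```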

The second part was dedicated to identifying fingerprints. While there are many algorithms for this already, most boil down to comparing a fingerprint found at a crime scene against a database, checking for a match one by one. These matches are quite costly, and Naccache showed how one can use relatively lightweight statistical methods to precompute an ordering on the database that increases the probability of finding a hit sooner rather than later (thus reducing the overall search time).

Finally, Naccache discussed the problem of figuring out which computation was performed based on inputs and outputs only. Past work had looked at the partial extraction of an exponent from the projective representation of a scalar multiple; here the focus was on low-depth arithmetic circuits over a finite field. As it is still work in progress, the details were left for later.

After a very short break, the first contributed talk was delivered by our very own Peter Scholl. I think it was his first conference presentation and he did a sterling job! The topic under investigation was improved key generation for Gentry's fully homomorphic encryption scheme. When Gentry proposed his breakthrough scheme, the efficiency left much to be desired and in fact it was completely impractical to deploy the scheme for a realistic security setting. Much progress has been made in the meantime, but this progress does not always blend well: Gentry and Halevi presented a vastly improved key generation method and Smart and Vercauteren presented several significant improvements to other parts of the scheme, yet the Gentry-Halevi key generation sadly does not apply to the parameters Smart and Vercauteren need for their key ideas. In this work, Peter (in joint work with Nigel) adapted the ideas of Gentry and Halevi to the Smart-Vercauteren setting. It turned out that by introducing several neat new techniques and optimizations, key generation suddenly becomes feasible. I would especially like to draw attention to one of the conclusions, namely that you really need to implement these more complicated schemes to see where the challenges are and what can be done about the bottlenecks. This point was also brought home in later talks, for instance the excellent invited talk by Ivan Damgård and the inspiring talk by Mike Scott.

After Peter, Frederik Armknecht (fresh from presenting at Asiacrypt) gave a talk on constructing homomorphic encryption schemes from coding theory. The advantages of using coding theory are obvious: the computations involved are almost guaranteed to be simple, plus linear codes naturally support one homomorphism (addition). The hope was that this simplicity could be leveraged to create an efficient scheme supporting both additions and multiplications. As it turns out, this is not as easy as one might hope. The schemes presented by Armknecht still have some limitations that reduce their overall usability. Chief amongst these are fairly strict restrictions on the number of ciphertexts allowed (which also implies a secret-key only setting) and a limit on the number of multiplications that can be handled. Nonetheless, one can imagine that in some scenarios these schemes deliver exactly what is needed in a relatively cheap and painless way.

Tuesday, December 13, 2011

HB+ and its variants

Today's study group, led by Luke and Gareth, was on the topic of HB+ and variants thereof, with particular focus on the following three papers:
  • Efficient Authentication from Hard Learning Problems (Kiltz et al, Eurocrypt 2011) [pdf]
  • Parallel and Concurrent Security of the HB and HB+ Protocols (Katz et al, Eurocrypt 2006) [pdf]
  • HB#: Increasing the Security and Efficiency of HB+ (Gilbert et al, Eurocrypt 2008) [pdf]

The motivating problem for HB-related schemes is secure authentication for low-cost devices such as RFID tags, whose power and storage constraints necessitate lightweight protocols. The motivation for the original HB protocol (Hopper and Blum, Asiacrypt 2001) [pdf] was actually secure human-executable identification, performed using only such undemanding computations as dot products and XORs -- so, although their proposal does not fulfill all the desired security requirements, it is nevertheless an excellent starting point for developing (very) low complexity schemes.

HB authentication derives its security from the NP-hard 'Learning Parity with Noise' (LPN) problem, which corresponds to the task of decoding random linear codes:
  • Let A be a (q × k) binary matrix; x a random k-bit vector; η ∈ (0,½) a (fixed) noise parameter; ν a random q-bit vector such that HW(ν) ≤ ηq. Given A, η and z = (Ax)⊕ν, the LPN problem is to find a k-bit vector x' s.t. HW((Ax')⊕z) ≤ ηq.
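As a quick illustration of this set-up, the numpy sketch below samples a random LPN instance along the lines of the definition above. For simplicity it draws each noise bit independently with probability η rather than enforcing the hard weight bound, which is the usual sampling convention.

```python
import numpy as np

def lpn_instance(q, k, eta, rng=None):
    """Sample (A, z) with z = Ax + nu over GF(2); x is the planted secret."""
    rng = rng or np.random.default_rng()
    A = rng.integers(0, 2, size=(q, k))
    x = rng.integers(0, 2, size=k)
    nu = (rng.random(q) < eta).astype(int)   # each noise bit is 1 with probability eta
    z = (A @ x + nu) % 2
    return A, z, x                            # x returned only so candidate solutions can be checked

A, z, x = lpn_instance(q=128, k=32, eta=0.125)
print("noise weight:", int(((A @ x) % 2 != z).sum()))   # roughly eta*q = 16
```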

The basic HB protocol is as follows:
  1. Tag and reader share a key x ∈ {0,1}^n.
  2. For rounds j = 1,...,r:
    1. Reader draws a_j ∈_R {0,1}^n and sends it to the tag
    2. Tag computes u_j = a_j · x and draws ε_j ∈_η {0,1} (a bit equal to 1 with probability η)
    3. Tag sends z_j = u_j ⊕ ε_j to the reader
    4. If z_j = a_j · x for a clear majority of j ∈ {1,...,r} then the reader accepts the tag as authentic.

This is clearly attractive as the only computations performed on the tag are bit-wise computable dot products and single-bit XORs. However, it turns out to be vulnerable to an active attack in which the adversary retrieves x bit by bit by sending challenges a_j with the k-th bit set to 1 and all other bits set to 0, i.e. a_j = (0,0,...,1,...,0,0). Then z_j = x_k ⊕ ε_j, so the majority value over r rounds reveals x_k.
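The following short simulation (a sketch with made-up parameters, assuming the honest tag behaviour described above) shows the bit-by-bit recovery in action: querying the tag with unit vectors and taking a majority vote over the noisy responses recovers each key bit.

```python
import numpy as np
rng = np.random.default_rng(0)

n, r, eta = 16, 101, 0.25               # key length, rounds per bit, noise rate
x = rng.integers(0, 2, size=n)          # the tag's secret key

def hb_tag_response(a):
    """Honest HB tag: returns <a, x> XORed with a noise bit that is 1 w.p. eta."""
    return (int(a @ x) + int(rng.random() < eta)) % 2

recovered = []
for k in range(n):
    e_k = np.zeros(n, dtype=int)
    e_k[k] = 1                          # challenge (0,...,0,1,0,...,0)
    votes = [hb_tag_response(e_k) for _ in range(r)]
    recovered.append(int(sum(votes) > r / 2))   # majority vote cancels the noise

print("key recovered correctly:", recovered == [int(b) for b in x])
```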

To resist the above attack Juels and Weis (Crypto 2005) [pdf] proposed HB+:
  1. Tag and reader share two keys x, y ∈ {0,1}^n.
  2. For rounds j = 1,...,r:
    1. Tag draws b_j ∈_R {0,1}^n and sends it to the reader
    2. Reader draws a_j ∈_R {0,1}^n and sends it to the tag
    3. Tag computes u_j = (a_j · x) ⊕ (b_j · y) and draws ε_j ∈_η {0,1}
    4. Tag sends z_j = u_j ⊕ ε_j to the reader
    5. If z_j = (a_j · x) ⊕ (b_j · y) for a clear majority of j ∈ {1,...,r} then the reader accepts the tag as authentic.

This is provably secure in the 'detection-based' adversarial model, and therefore no longer vulnerable to the active attack described above. However, several problems remain:
  • The revised protocol introduces the problem of generating random vectors on the tag.
  • The proof supposes that the rounds are performed sequentially and does not extend to a parallelised implementation (desirable as a strategy to reduce communication complexity).
  • The false rejection rates (under the proposed acceptance thresholds) are very high -- as much as 44% for 80 rounds with a noise level of ¼. (See table 1 of (Gilbert et al, Eurocrypt 2008)).
  • The adequacy of the adversarial model (which supposes the adversary can interact only with the tag before impersonating it to the reader) has been questioned.

In fact, HB+ is vulnerable to a man-in-the-middle attack in which the adversary is able to interact with the reader as well as the tag before attempting to impersonate (Gilbert et al, 2005) [pdf]. We discussed the fact that such an attack poses a material threat: whilst it would be difficult to carry out an opportunistic MIM on a nearby tag carried by a passer-by, an attacker in possession of a tag with incentive to clone it could very well use a reader of his own to do so, with comparatively unconstrained time and resources.

The same authors solve this problem, and also answer the parallelisation challenge, with RANDOM-HB#, in which the secrets become (n_X × m) and (n_Y × m) binary matrices X and Y rather than n_X- and n_Y-bit vectors x and y, and the protocol operates (in one go) in matrix form rather than round by round, so that the final verification consists of the comparison of two m-bit vectors a · X ⊕ b · Y and z. This adaptation is secure against the above MIM (known as GRS-MIM after the authors), as well as having a much smaller false rejection rate. However, when the adversary is afforded more power (i.e. he is allowed to modify more of the elements of the protocol) another MIM attack has been found -- though with increased complexity.
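A sketch of a single RANDOM-HB#-style run in matrix form is below, just to show how the one-shot verification replaces the round-by-round check; the dimensions and threshold here are placeholders rather than the parameters proposed in the paper.

```python
import numpy as np
rng = np.random.default_rng(1)

nX, nY, m, eta, thresh = 80, 512, 1164, 0.25, 0.35    # illustrative sizes only
X = rng.integers(0, 2, size=(nX, m))                  # shared secret matrices
Y = rng.integers(0, 2, size=(nY, m))

# one protocol run: tag sends b, reader sends a, tag replies z = aX xor bY xor nu
b = rng.integers(0, 2, size=nY)
a = rng.integers(0, 2, size=nX)
nu = (rng.random(m) < eta).astype(int)
z = (a @ X + b @ Y + nu) % 2

# reader recomputes aX xor bY and accepts if the error weight is below thresh*m
err = (z + (a @ X + b @ Y)) % 2
print("accept:", int(err.sum()) <= thresh * m)
```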

The last paper we looked at was concerned with constructing MACs based on the hardness of LPN (Kiltz et al, Eurocrypt 2011). The protocol can be informally described as follows:
  1. Tag and reader share a key x ∈ {0,1}^n.
  2. Reader selects a random subset of bits of x and sends to the tag.
  3. Tag computes a noisy inner product z of the selected bits and sends to the reader.
  4. Reader verifies that z matches the inner product (without noise) 'most of the time'.

This achieves 2-round authentication with active security. The proof derives from the hardness of the subspace LPN problem (Pietrzak 2010 [pdf]) and, because it doesn't use rewinding techniques, it remains valid against quantum adversaries.

Single-sample DPA (aka horizontal DPA)

This week's study group ranged from 'old' papers by Colin Walter (the Big Mac attack) to current work of Mike's, where the focus is on applying DPA techniques to a single trace. More specifically, we discussed the following papers:

  • Colin D. Walter “Sliding Windows Succumbs to Big Mac Attack” [pdf]
  • Colin D. Walter “Longer Keys May Facilitate Side Channel Attacks” [pdf]
  • Jean-Christophe Courrège et al. “Simple Power Analysis on Exponentiation Revisited” [pdf]
  • Werner Schindler and Kouichi Itoh “Exponent Blinding Does Not Always Lift (Partial) Spa Resistance to Higher-Level Security” [pdf]

Stefan kicked off the study group by explaining to us why longer keys may facilitate Side Channel Attacks (SCAs). Common sense would suggest that longer keys should be used to provide more security. This is true for the mathematical strength of a cipher; however, Colin Walter shows in his work that for embedded RSA crypto-systems the increase in leaked data through longer keys in fact lowers security. We discussed two types of attack, namely a timing and a power analysis attack, that may become possible as the key length increases. The implementation under attack uses a variant of the binary "square-and-multiply" algorithm to compute the RSA exponentiation. As so often when trying to attack such implementations, the goal is to distinguish between squares and multiplications to extract secret key material.

One way to distinguish these two core operations is to look at timing variations that occur when computing the modular products. Typically, a form of Montgomery's modular multiplication (MMM) is used to compute these products. From a side-channel perspective, the interesting aspect of MMM is the fact that it includes a final conditional subtraction which reduces the output to less than the modulus. This final subtraction occurs with higher probability for squares than for multiplications, and this forms a timing side-channel that can be exploited by an attacker who observes a number of exponentiations. The data leaked through this side-channel is proportional to the square of the key length, and thus longer keys end up being less safe. The simplest countermeasure to thwart this kind of attack is to use an unconditional subtraction and select the appropriate result afterwards.
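To make the leaky step explicit, here is a minimal big-integer sketch of Montgomery multiplication (real implementations work word by word, but the data-dependent final subtraction is the same; the helper names are ours).

```python
# Minimal Montgomery multiplication sketch: the 'if u >= n' branch is the
# conditional final subtraction whose occurrence rate leaks timing information.
def montgomery_setup(n):
    k = n.bit_length()
    R = 1 << k                              # R = 2^k > n, gcd(R, n) = 1 for odd n
    n_prime = (-pow(n, -1, R)) % R          # n * n_prime = -1 mod R
    return R, n_prime

def mont_mul(a, b, n, R, n_prime):
    """Return a*b*R^{-1} mod n for inputs already in Montgomery form."""
    t = a * b
    m = (t * n_prime) % R
    u = (t + m * n) >> n.bit_length()       # exact division by R
    if u >= n:                              # data-dependent extra subtraction
        u -= n
    return u

# quick self-test with toy numbers
n = 101
R, n_prime = montgomery_setup(n)
aR, bR = (57 * R) % n, (33 * R) % n
assert mont_mul(mont_mul(aR, bR, n, R, n_prime), 1, n, R, n_prime) == (57 * 33) % n
```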

The second attack that Stefan described exploits the power leakage of the internal hardware multiplier. For MMM a number of single-precision multiplications need to be computed. The idea here is to fix one of the multiplier inputs while the second input is more or less random, and then average the power traces over a number of multiplications. The power trace that we end up with is then dependent on the Hamming weight of the fixed operand. If we apply this technique to all chunks of a long-integer multiplication, we can build templates during pre-computation which we can then use during the actual exponentiation to guess which operation has been computed. The nice property of this attack is that we only need to collect a single power trace of an exponentiation to characterise the different operations and mount the actual attack.

In the second part, Mike talked a bit about the work he is doing at the moment. For example, on an ARM chip it is apparently easy to identify the occurrence of the different single-precision multiplication operations in the power trace while executing an exponentiation. Using a horizontal DPA, the idea is to recover in each iteration the i-th operation (whether a square or a multiplication has been executed) under the assumption that we have successfully recovered the first i − 1 operations. Next, an attacker makes a prediction (e.g., using the Hamming weight) of what the power consumption should look like for each of the two core operations. Whichever of these two power profiles correlates best with the actual power trace reveals another operation of the exponentiation. Collecting one power trace of the device under attack is enough to mount such an attack.
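The numpy sketch below simulates the decision step on invented data: the "trace" is the Hamming weight of the words that were actually processed plus noise, and the hypothesis whose prediction correlates better wins. It is only meant to illustrate the correlation test, not the full key-recovery procedure.

```python
import numpy as np
rng = np.random.default_rng(2)

def hamming_weight(words):
    return np.array([bin(int(w)).count("1") for w in words])

# Pretend the i-th long-integer operation consists of 64 single-precision
# multiplications; each leaks HW(processed word) plus Gaussian noise.
words_if_square   = rng.integers(0, 2**32, size=64)   # predicted words under hypothesis "square"
words_if_multiply = rng.integers(0, 2**32, size=64)   # predicted words under hypothesis "multiply"

true_op = "square"
trace = hamming_weight(words_if_square if true_op == "square" else words_if_multiply)
trace = trace + rng.normal(0, 2.0, size=64)            # one noisy measurement

corr_sq = np.corrcoef(trace, hamming_weight(words_if_square))[0, 1]
corr_mu = np.corrcoef(trace, hamming_weight(words_if_multiply))[0, 1]
print("guess:", "square" if corr_sq > corr_mu else "multiply")
```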

Mike also described another attack on the exponentiation that exploits a bias in the Hamming weight of the output of single-precision multiplication and squaring operations. Basically, what is exploited in this attack is the fact that the second least significant bit of the output of a single-precision squaring operation will always be zero. With this information we can build templates by characterising the squaring and multiplication operations, and try to distinguish between them. Mike reports a 95% success rate in choosing the right operation using this technique. The advantage of this attack is that the bias is caused by the operation itself, so it will remain there irrespective of the message input.
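The bias itself is easy to verify: since x² mod 4 is always 0 or 1, bit 1 of a square is always zero, whereas for a product of two independent uniform words it is set a substantial fraction of the time (about 3/8). A quick check:

```python
import random

# bit 1 (second least significant bit) of squares vs. products of random 32-bit words
squares  = [(random.getrandbits(32) ** 2 >> 1) & 1 for _ in range(10_000)]
products = [((random.getrandbits(32) * random.getrandbits(32)) >> 1) & 1 for _ in range(10_000)]
print(sum(squares))              # always 0: x^2 mod 4 is 0 or 1
print(sum(products) / 10_000)    # roughly 3/8 for uniform random operands
```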

Finally, Mike briefly discussed some techniques to attack implementations that use exponent blinding as countermeasure.

Thursday, December 8, 2011

Dagstuhl: Cloud Computing

This week I have been at Schloss Dagstuhl as a co-organizer of a workshop on security for cloud computing. Dagstuhl is a computer science research centre in Germany, and every week they invite a bunch of computer scientists to the old Schloss to discuss a particular research topic.

We discussed various topics: securing data on the cloud, computing on this data, verifying the computation and, more interestingly, the economics of whether any of it actually makes sense. On this last topic there were two interesting talks by Ari Juels and Radu Sion. Both essentially contended that economically it may make no sense to perform computation on encrypted data in the cloud. Indeed Radu went even further and suggested that the so-called economic benefits of general cloud computing may be illusory, especially if the cost of data transfer to and from the cloud outweighs the benefit of the outsourced computation. This was argued by measuring the cost of various operations in terms of pico-cents.

A related question, debated quite hotly, was whether ANY cryptographic method for computing on encrypted data could have widespread deployment/usage. As usual the "Danish Sugar Beet" auction was mentioned as the application of Multi-Party Computation technology; but this is a niche application and none of us could come up with a commercially compelling large-scale application of such techniques. In all cases there seemed to be impediments to deployment, often of a non-technical nature, such as: in whose interest would it be to commercially make such systems available?

Dagstuhl is a great place to visit. If you ever get an invite you need to go. Organization is a bit chaotic; but that is the charm. We made the programme up as we went along, and so could respond to the interesting topics that were arising. However, it is probably best to go not in the bleak mid-winter.

Asiacrypt 2011 Day 4

One talk I was interested in today was on the paper "Short signatures from weaker assumptions". In the paper, the authors first propose several new constructions of (m,1)-programmable hash functions (PHFs) for any m ≥ 1. They then show how to use the new (m,1)-PHFs to generically construct short yet efficient hash-and-sign signatures based on weaker hardness assumptions: the q-DH and RSA assumptions. (Note that before this work, the existing PHF-based short signatures relied on stronger assumptions: the strong q-DH and strong RSA assumptions.) The resulting signature schemes from weaker assumptions are secure in the standard model.

The concrete q-DH signature schemes are the first hash-and-sign schemes from the q-DH assumption, and the proposed RSA signature schemes offer a considerable efficiency improvement compared to previous standard-model RSA-based signatures. Interestingly, the resulting signature schemes offer different trade-offs between signature size, efficiency and public-key size. The bigger the parameter m in the (m,1)-PHF, the larger the public key and the shorter the signatures. Therefore, to obtain extremely short signatures, the public key can get quite large.

Wednesday, December 7, 2011

Asiacrypt 2011 Day 3

There were two talks discussing secret sharing at Asiacrypt day 3: one on the paper "Computational verifiable secret sharing revisited" and the other on the paper "Natural generalization of threshold secret sharing".

In the former paper, the authors demonstrate that homomorphism of commitments is not necessary for computational verifiable secret sharing (computational VSS), in either the synchronous or the asynchronous communication model. The authors consider computational VSS as a standalone primitive and propose new VSS schemes based only on the definitional properties of commitments, which are almost as good as the existing VSS schemes based on homomorphic commitments.

In the latter paper, the authors search for new families of ideal access structures that are natural generalizations of threshold secret sharing. The authors also analyse the efficiency of the methods used to construct ideal secret sharing schemes. By using integer polymatroids, the authors propose a number of new families of ideal multipartite access structures.


Tuesday, December 6, 2011

ASIACRYPT

The first two days of Asiacrypt 2011 were really great. I met lots of people and discussed various topics, beyond my own research area. Yesterday, there were two talks that I liked very much. One of them was on deniable encryption, about which Ming has already blogged. The other talk was in the same session, on lossy encryption and selective opening attacks (SOA). The motivation for SOA comes from computationally secure multiparty computation (MPC) tolerating an adaptive adversary. In such protocols, the adversary can see all the messages in the protocol (the messages exchanged between the honest parties are safe as they are encrypted) and, over the course of time, can decide which parties to corrupt (this strategy is called adaptive corruption). Ideally, if the messages of the individual parties are completely independent and the underlying encryption scheme is IND-CPA secure, then the security of the MPC protocol is preserved. However, in a typical MPC protocol, the messages exchanged between the parties are not completely independent. In that case, the big question is the following:

suppose the adversary "selectively" gets the messages and the corresponding randomness used to encrypt them for some of the parties in the protocol. Does the security of the messages and randomness of the remaining honest parties then still hold (given that the messages and randomness of the honest parties need not be independent)?

If an encryption scheme satisfies this property then it is said to be secure against "selective opening attack" (SOA) and using such encryption schemes, we can design computationally secure MPC protocols against adaptive adversary.

The speaker (Benoit Libert) showed how to design a lossy encryption scheme secure against SOA. Now let me briefly discuss what a lossy encryption scheme is. It can be viewed as a generalization of a "dual mode encryption scheme", where there are two modes of encryption: a typical injective or normal mode, where there is a one-to-one correspondence (injective mapping) between plaintext and ciphertext, while the second mode is a "lossy" mode, where there is no such association. This means that the ciphertext will be (essentially) independent of the message.

The speaker briefly outlined in his talk how to design a lossy encryption scheme, secure against SOA, from general assumptions such as re-randomizable encryption, homomorphic encryption and oblivious transfer (OT). I really enjoyed the talk and found the concepts of lossy encryption and SOA very interesting. Actually, just a few days back, I was asking Nigel whether it is possible for anyone to efficiently compute a message m' and corresponding randomness r' which result in a given ciphertext C, where C is actually an encryption of some different message m under different randomness r. That is, given C = Enc(m, r), m and r, can one efficiently find another m' and r' such that C = Enc(m', r') = Enc(m, r)? And he said that in general it's not possible to do so efficiently, unless the encryption scheme is non-committing. I was not aware of this notion of non-committing encryption and after hearing about it, I started to read up on it. And then when I listened to yesterday's talk about lossy encryption, it further widened my interest in this topic. I think it will be a good topic to be discussed in a study group.

Asiacrypt 2011 Day 2

The first talk at Asiacrypt day 2 was on "Bridging broadcast encryption and group key agreement". The work bridges two well-studied primitives, broadcast encryption and group key agreement, with a new hybrid primitive referred to as contributory broadcast encryption (CBE). In CBE, a group of members contribute to the public group encryption key. A sender can securely broadcast to any subset of the group members, chosen in an ad hoc way. CBE incorporates the ideas of broadcast encryption and group key agreement. In the set-up phase, a group of members interact via open networks to negotiate a common encryption key while each member keeps a different secret decryption key. Anyone can encrypt a message to any subset of the group members by using the common encryption key, and only the intended receivers can decrypt the ciphertext. CBE allows the sender to exclude some group members from reading the ciphertext (compared to group key agreement) and does not need a fully trusted third party to set up the system (compared to broadcast encryption). Finally, a CBE construction proven to be semi-adaptively secure under the decision BDHE assumption in the standard model is also proposed.

Monday, December 5, 2011

Asiacrypt 2011 Day 1

One of the interesting talks today was given by Peter Sebastian Nordholt on "Lower and upper bounds for deniable public-key encryption". A deniable cryptosystem allows a sender or receiver to deny a message exchange and thus combat coercion. For example, if the adversary can threaten the communicating parties into revealing their internal states or parameters after the execution of the communication, the cryptosystem is still secure under this kind of coercion. According to which parties can be coerced by the adversary, we can distinguish between three kinds of deniability: sender deniability, receiver deniability and bi-deniability. A deniable public-key encryption scheme is a public-key encryption scheme which is deniable. The main contribution of this paper is to derive upper and lower bounds on how secure a deniable public-key encryption scheme can be as a function of the key size.

For the lower bounds, the authors have the following results:

  1. Receiver deniable: a public-key encryption scheme with l-bit keys can be at most (1/2)*((l+1)^(-1))-receiver deniable

  2. Sender deniable: the authors don't know a non-trivial lower bound

  3. Bi-deniable: at most (1/2)*((l+1)^(-1)) -bi-deniable.

For the upper bounds, the authors have the following results:

  1. Receiver deniable: let k be the length of the secret key; there exists a (1/n)-receiver-deniable public-key encryption scheme with key length l=O((n^2)*k).

  2. Sender deniable: let k be the length of the secret key; there exists a (1/n)-sender-deniable public-key encryption scheme with key length l=O(n*k).

  3. Bi-deniable: let k be the length of the secret key; there exists a (1/n)-bi-deniable public-key encryption scheme with key length l=O((n^4)*k).

Thursday, December 1, 2011

ICISC 2011: Day 1

Yesterday was Day 1 of ICISC 2011. This year the conference is making its 14th appearance, and this time it received the largest number of submissions in its history (128 in total), with an acceptance ratio of 25.2%. The emphasis on Day 1 was mostly on practical crypto: one session on side channels, one on network security and one invited talk on lightweight crypto. In addition to this, there was one session on hash functions and one on public key cryptography. I was asked to chair the session on hash functions. This was the first time I chaired any session and, interestingly, I was given $100 for it :)

I could not make much out of yesterday's talks, as most of them were on topics that are not my cup of tea. However, I did like the invited talk on lightweight cryptography by Thomas Peyrin. Actually, this was the first time I listened to any talk on this topic. The speaker very nicely introduced what lightweight cryptography means and what the major goals and challenges are. What I took from his talk is that the major goal is to design lightweight crypto primitives, specifically lightweight block ciphers and lightweight hash functions, which provide an appropriate level of security and speed while not using many resources (in this context, space), so that they can be used in applications like RFID. Various guidelines for the design of these primitives, meeting the above requirements, were also discussed. The speaker also talked about his latest lightweight hash function and a lightweight block cipher, published at CHES 2011 and CRYPTO 2011. However, I could not make much out of them as it became too technical for me. Moreover, I was having severe jet lag (Seoul is around 9 hours ahead of Bristol time) and even now I am finding it difficult to adjust to the local time.

Overall, the first day experience at ICISC 2011 was OK. But I am looking forward in particular to the last day's talks, as there are several talks related to protocols and theoretical cryptography and I am hopeful that they will be interesting.

Wednesday, November 30, 2011

Study group: more leakage resilient stuff

Leakage, Leakage, Leakage ...

Once more we looked into recent papers attempting to deal with leakage in cryptographic schemes, this time we focused on public key cryptography: signature and encryption schemes in the bounded leakage and continuous leakage model were discussed/examined in our last study group.

After last week's session on definitions and models, and 'fundamentals' such as defining LR-OWFs via SPRs (handy to know, as signature schemes can be derived from OWFs), we started off with a paper by Boyle et al. (Eurocrypt 2011) that gives "fully leakage resilient signatures" in the bounded leakage model and then goes on to show how to adapt the construction to the continuous leakage model. Let's first look more closely at what the title implies/promises.
Fully LR in this context alludes to the idea that an adversary can query a leakage oracle not only on the secret key, but on the entire secret state, including any randomness used during signature creation. Further, it can tolerate 'a lot of leakage'. This is in contrast to some previous work (referenced in the paper) which only allowed 'not too much' leakage about the secret key. Clearly this is a relevant improvement, as it is well known that for a wide range of practical signature schemes leakage about the ephemeral keys is what brings those systems down. Relevant articles not mentioned in the paper, such as Nguyen and Shparlinski, "The Insecurity of the Elliptic Curve Digital Signature Algorithm with Partially Known Nonces" (Designs, Codes and Cryptography 30(2), 2003), show that knowing a handful (quite literally!) of bits of a couple of ephemeral keys allows one to easily recover the secret signing key ...

Well, what else does this new construction give us: no random oracles used in the proof, no OCLI assumption, and the authors argue that there are even some instantiations which one could call theoretically efficient. A side remark mid-way through the paper, though, reveals some previously unmentioned restrictions on the leakage: when it comes down to the actual proof, no leakage is assumed for the randomness used in key generation or refreshing (in the continuous leakage case). Especially the latter assumption is rather undesirable for practice ...

Conclusion: a construction that does everything we want, in a model which is almost good enough for practice. (Quite some praise, as I look at it through the lens of a practitioner.)

The second paper we looked at was by Kiltz and Pietrzak (Asiacrypt 2010), entitled 'Leakage Resilient El Gamal Encryption'. The authors of this paper took a refreshing approach: rather than cooking up a scheme from scratch, they had a (brief?) look at what practitioners already do to secure El Gamal type schemes in practice and attempted to prove some properties of the well-known defences used. So what do practitioners do in practice? Assume a pair of (secret key, public key) = (x, X=g^x). Then encryption proceeds something like this: the message (represented as a group element) is blinded with an ephemeral key y which is transmitted via DH, hence the ciphertext is a pair (c1, c2) = (g^y, mX^y). Decryption recovers m by computing c2/c1^x. In practice an attacker will most likely attempt to reveal x by targeting the exponentiation c1^x, as c1 can be observed/chosen and the exponentiation routine will most likely use x in very small chunks. To prevent this, practitioners use a technique called blinding (typically applied to both exponent and message).
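For concreteness, here is a textbook sketch of the El Gamal flavour just described, over a toy prime-order subgroup (the parameters are tiny and chosen purely for illustration; no encoding or padding is included).

```python
import secrets

# toy group: safe prime p = 2q+1 with p = 467, q = 233; g = 4 generates the order-q subgroup
p, q, g = 467, 233, 4

x = secrets.randbelow(q - 1) + 1            # secret key
X = pow(g, x, p)                            # public key X = g^x

def encrypt(m, X):
    y = secrets.randbelow(q - 1) + 1        # ephemeral key
    return pow(g, y, p), (m * pow(X, y, p)) % p     # (c1, c2) = (g^y, m*X^y)

def decrypt(c1, c2, x):
    return (c2 * pow(c1, q - x, p)) % p     # c2 / c1^x, since c1^q = 1 in the subgroup

m = pow(g, 42, p)                           # a message encoded as a group element
c1, c2 = encrypt(m, X)
assert decrypt(c1, c2, x) == m
```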

The reason why I point this fact out explicitly is that in this article the authors claim that 'even though side-channel leakage may contain lots of data, only a small fraction can actually be exploited in each measurement'. Wrong. I'll try to clarify what actually happens: each point (= sampled data point; a collection of data points gives a 'trace', which is the actual observation/measurement) contains information about the internal state of the device at that particular point in time. If only a small part of the secret has 'contributed' to the leakage at this point, then only this small part of the secret leaks at this point (in time). As one trace consists of many points (remember, we observe 'lots of data'), there is 'lots of leakage' contained within one trace. In particular, implementations of exponentiations will produce distinct 'patterns' (i.e. have different leakage characteristics corresponding to 0 and 1 bits in the secret) that will almost certainly allow one to extract the entire secret given one single observation. This is a key point which is often overlooked by the theory community (by assuming that not the entire secret leaks given one query to the leakage oracle). It implies that without countermeasures (including blinding) along the lines developed mainly within the CHES community, you don't even need to start considering LR schemes developed by theorists: the basic assumption that not the entire secret leaks in one observation actually requires you to do something; it is not 'naturally' guaranteed.

Back to the paper and its statements about the security of blinding schemes. Many schemes are known to practitioners, and on a high level they work like this: rather than working with your secret exponent x directly, share it (depending on your underlying algebraic structure) either multiplicatively or additively (whether to use additive or multiplicative sharing is typically influenced by performance considerations). Additive sharing in the case of El Gamal encryption is not a good idea: if rather than using x directly we were to share it and use x' = x+r_i and x'' = x-r_i, it would be easy to calculate the last 'few' bits of x by knowing the last t bits of x' and x'', as those bits would be given by x' mod 2^t + x'' mod 2^t (minus a fudge factor if there was an overflow mod p). In practice people use what was suggested already by Kocher in "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems": one chooses a random r_i, computes its inverse rinv_i, and shares x such that x = x'*x'' with x' = x*r_i and x'' = rinv_i. This can be done reasonably efficiently, and one can even update r_i and rinv_i easily (by e.g. just squaring them) so that they are not re-used in subsequent encryption rounds (would the proof in the paper by Kiltz and Pietrzak extend to this way of updating/refreshing the shares?).
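A small self-contained sketch of this multiplicative exponent blinding is below, reusing the toy group from the El Gamal sketch above (again illustrative only; the refresh-by-squaring at the end is the trick mentioned in the paragraph).

```python
import secrets

p, q, g = 467, 233, 4                   # same toy order-q subgroup as above
x = secrets.randbelow(q - 1) + 1        # the long-term secret exponent

def blinded_exp(c, x, r, r_inv):
    """Compute c^x mod p via the multiplicative split x' = x*r, x'' = r^{-1} (mod q)."""
    return pow(pow(c, (x * r) % q, p), r_inv, p)

r = secrets.randbelow(q - 1) + 1
r_inv = pow(r, -1, q)                   # modular inverse (Python 3.8+)
assert blinded_exp(g, x, r, r_inv) == pow(g, x, p)

# cheap share refresh mentioned above: squaring both values keeps r * r_inv = 1 mod q
r, r_inv = (r * r) % q, (r_inv * r_inv) % q
assert blinded_exp(g, x, r, r_inv) == pow(g, x, p)
```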

The cool thing about the paper is that the authors are able to prove (in the case of bilinear groups, using the generic group model) that the above outlined strategy (minus the efficient update of shares mentioned at the end) is actually CCA1 secure.

Friday, November 25, 2011

ECRYPT2 Vampire2 Retreat

Leakage, Leakage, Leakage!

We hosted another Vampire research retreat here in Bristol. Roughly 25 people interested in the physical security of cryptography came together to discuss and share ideas about practically relevant attacks and countermeasures. We had a really fantastic crowd coming from all over Europe (France, Luxembourg, Belgium, Germany, UK and Austria), with a wide range of interests. We ended up discussing possible new avenues to attack recent schemes (masking, re-keying), tried to make progress on difficult practical challenges such as working with very noisy data, and worked on fundamental questions related to target function properties (aka cipher properties) and DPA resistance.

A big thanks here from us organisers to all participants for their motivation, interest, and contributions. Another equally big thanks goes to the EC for their commitment to ECRYPT, which enables us to come together in this way!

Hope to see many of this retreat's participants also on our next one in March in Leuven!

Wednesday, November 23, 2011

Study group: Definitions of leakage-resilience in theory and practice

Leakage, leakage, leakage.
Consider a leaky device that's supposed to handle a secret key for encryption, what can we say about its security?
If the device leaks the whole secret key, obviously there's no security in the encryption, so we somehow need to restrict what leakage an adversary can observe if we want to give any positive guarantees.

At this point the views of theoreticians and practitioners start to diverge. The gist of today's talk was that theoreticians work in models based on the state of the art at the time the model is created, but take some time to catch up when practitioners find new results.

What leakage?

There are several models floating around in the cryptographic theory.
The main distinction is between bounding the type of leakage an adversary can observe (i.e. what functions of the secret key) and the amount of leakage.
If we look at bounding the amount of leakage, there are three main models.
First, the "relative leakage" model which bounds the amount of leakage (in bits) compared to the size of the secret. Secondly, the "bounded retrieval" model in which the total amount of leakage the adversary can observe is bounded.
Finally, the "continuous leakage" model in which the factor of time comes into play: the device in question runs in a sequence of time periods and the leakage that the adversary can observe in each period is bounded, but there's no overall bound. This means the device must refresh its keys as it transitions from one period to the next.

Criticism of the Models

All these models have one practical drawback: they don't really reflect what's going on in practice. On the one hand they're too strong, the adversary's observations usually being limited only by a constraint such as "polynomial-size leakage". Practitioners can only dream of observing that much leakage, and if they could, none of today's devices could stand up to such a powerful adversary.

On the other hand, the models miss out the main challenges in today's practice, designing more efficient ways to observe and evaluate leakage. In practice, the success of an attack will depend on the quality of the statistical model used to analyse the leakage, the "noise" in the leakage, the number and quality of observations one can make and so on; current research involves ways to improve these factors or find trade-offs between them.
Theoretical leakage models on the other hand are all-or-nothing, you get the leakage in one go or you don't.

OCLI - really?

Many years ago, when power analysis first appeared on the scene, it was noticed that CMOS logic requires very little power to "hold" a state; what causes power spikes is actually the changing of states, whether from 0 to 1 or vice versa.
And so it was proposed to base theoretical models on the assumption that one could only observe leakage from data that is being computed on, as opposed to just sitting around in memory - Only Computation Leaks Information (OCLI).
Unfortunately today's practice does not support this assumption.
The cold-boot attack (cool the RAM with a freezing spray, tear it out, copy its contents before it thaws) is an attack on data just sitting in memory.
There are many other types of side channel beside power consumption - a computer virus could be considered a form of leakage to the adversary.
Even defining what "data being computed upon" means isn't that easy: at the hardware level, with all the buffers, caches, pipelines and other optimisations in today's processors, even a chunk of x86 assembly code won't give you a simple answer to when which gate is switched.
At the software level, things get even worse. The abstraction offered by an operating system may shift memory pages around in a manner mostly oblivious to applications running on it, not to speak of several processes running in parallel.
Finally with today's nanoscale logic, even the principle that power consumption is primarily related to changes in state does not necessarily hold anymore.

Some theoretical examples

The theory of leakage resilience is not all useless, though. We looked at two examples.
If f is a one-way function, what happens if a few bits of an input are leaked?
It's fairly easy to see that if there's a preimage-extractor given a few (at most logarithmically many) bits of input then one can just guess or brute-force these bits to construct a preimage-extractor that doesn't require any leakage.
More interestingly, if f is second-preimage resistant then f will be leakage-resilient one-way, as long as the leakage is small compared to the amount by which f compresses its input.
Suppose you're given a preimage and want to find a second one that maps to the same image as the first, and you have a preimage-extractor that needs some leakage to work.
You can feed it the image of your challenge and answer its leakage requirements with the preimage that you already know.
If you do all this formally and get the math right the result is that you have a decent probability of the extractor giving you a different preimage to the one you know.

A real example

Finally we'll look at an example of a deployed protocol in set-top boxes that was designed to be leakage resilient.
There are many "common sense" components to reduce the exposure to known attacks: a root key is used only to derive message-keys using a hash function tree construction.
This prevents an attacker from getting too many traces on the root key itself.
A public, fresh random message id is chosen for each message to encrypt. This prevents an attacker from repeatedly encrypting with the same parameters.
Different constants are attached to each hash input in the construction to prevent correlations between different hash calls.
The message is encrypted in blocks, or rather chunks of blocks that just about fit in a smartcard's memory and the key is refreshed after each chunk.
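A very rough sketch of these ideas is below. To be clear, this is NOT the deployed protocol; it is an invented illustration, assuming SHA-256 as the hash, of the three ingredients mentioned: a per-message key derived from the root key and a public message id, constants for domain separation of each hash call, and a key refresh after every chunk.

```python
import hashlib
import os

def h(constant: bytes, *parts: bytes) -> bytes:
    """Hash with a per-purpose constant prepended for domain separation."""
    return hashlib.sha256(constant + b"".join(parts)).digest()

def encrypt_message(root_key: bytes, message: bytes, chunk_size: int = 1024):
    msg_id = os.urandom(16)                        # public, fresh per message
    k = h(b"derive", root_key, msg_id)             # per-message key, not the root key itself
    out = []
    for i in range(0, len(message), chunk_size):
        chunk = message[i:i + chunk_size]
        stream = b""
        while len(stream) < len(chunk):            # toy hash-based keystream for one chunk
            stream += h(b"stream", k, len(stream).to_bytes(8, "big"))
        out.append(bytes(c ^ s for c, s in zip(chunk, stream)))
        k = h(b"refresh", k)                       # refresh the key after every chunk
    return msg_id, b"".join(out)
```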

There's no proof of this construction using today's models and it seems unlikely that there will be one any time soon but we're not aware of any published break of this particular protocol.

Conclusion

There's a big gap between theory and practice in side-channel attacks, and practitioners (like today's speaker) sometimes wonder if the theory bears much relation to their work at all.
We still don't have a satisfactory theoretical model that adequately reflects what goes on in practical attacks.

Tuesday, November 15, 2011

Study Group: Polly Cracker, Revisited

Today along came Polly and it was cracking! In other words, our study group for this week was dedicated to Polly Cracker, Revisited by Albrecht, Farshim, Faugère, and Perret. The paper will be presented at Asiacrypt 2011 (LNCS 7073) later this year.

Polly Cracker schemes use ideals of multivariate polynomial rings and are based on the presumed hardness of computing good (Gröbner) bases for these ideals. The first schemes were introduced in the early nineties (just when I went to university) and there has been a steady amount of work on them, though they never really gained the widespread acceptance that, for instance, factoring, discrete log, or more recently lattice-based cryptosystems acquired.

We started our study group with a review of the relevant mathematics. David gave us an excellent turbo tutorial on rings, ideals, monomial orderings and finally Gröbner bases. It was a very useful reminder of the basics and inevitably I had to think back of times long gone by, when, as part of a coursework, I had to prove the Pythagorean theorem using Gröbner bases (definitely one of the geometrically least insightful ways of going about proving this theorem).

After this refresher, Jake took over to discuss the Polly Cracker, Revisited paper. The first part of the paper (and our discussion) deals with the noiseless version of Polly Cracker. This can be considered the more classical problem. Three different hardness assumptions can be considered, all based on an adversary with access to an oracle returning randomly sampled elements in the ideal: in the first version (GB), the adversary has to reconstruct a Gröbner basis; in the second version (IR), the adversary needs to compute the remainder (modulo the ideal) of a random challenge (in the ring); and in the third version (IM), the adversary just needs to decide whether some challenge element is in the ideal or not. Of course, in the paper all the relevant probability distributions are worked out properly. While it is quite obvious that the ability to compute a Gröbner basis suffices to solve the other two problems, the other direction is less straightforward and in fact Albrecht et al. only provide a conditional implication. They also show how to create a symmetric encryption scheme that satisfies a bounded version of IND-CPA security (here the adversary has only a limited number of calls to its left-or-right and encryption oracles).

The interesting stuff appears afterwards, when the authors introduce errors or perturbations, in analogy with the LWE and approximate GCD problems. Indeed, they even show that their approach is a proper generalization of these two settings. Unsurprisingly, this allows the creation of an IND-CPA secure scheme. More exciting is the fact that it also shows that Regev's scheme based on LWE allows for a fixed number of multiplications.

We did not discuss the paper in great detail, but Jake did mention one interesting avenue for continued research. Given that this new approach allows one to cast both LWE and approximate GCD in the same framework, can one also capture ring-LWE? If so, this might enable a better comparison of the various fully homomorphic encryption (FHE) schemes out there. The hope expressed by Jake was that this might allow a reduction to standard LWE (for the current batch of ring-LWE schemes), which would boost our confidence in those schemes.

Wednesday, November 9, 2011

Study Group: Physical security and FPGAs - Part II

In this week's study group we resumed the topic of FPGA Security and discussed issues covered by 4 papers:

Amir Moradi and Alessandro Barenghi and Timo Kasper and Christof Paar
“On the Vulnerability of FPGA Bitstream Encryption against Power Analysis Attacks – Extracting Keys from Xilinx Virtex-II FPGAs” [pdf]

Amir Moradi and Markus Kasper and Christof Paar “On the Portability of Side-Channel Attacks – An Analysis of the Xilinx Virtex 4, Virtex 5, and Spartan 6 Bitstream Encryption Mechanism” [pdf]

Tim Güneysu and Amir Moradi “Generic Side-Channel Countermeasures for Reconfigurable Devices” [pdf]

G. Canivet, P. Maistri, R. Leveugle, J. Clédière, F. Valette and M. Renaudin
“Glitch and Laser Fault Attacks onto a Secure AES Implementation on a SRAM-Based FPGA” [pdf]

The study group was moderated by Dan and Simon. Dan started his talk with a short introduction to FPGAs and described the main building blocks in SRAM-based FPGAs, such as LUTs, embedded memory blocks, and dedicated blocks for multiplications or additions. He also briefly described the potential weaknesses in terms of security.

In short, an FPGA can be used to implement a third-party design, and those designs sometimes have to be protected against counterfeiting. One mechanism that tries to solve this problem is bitstream encryption, where the vendor tool encrypts a bitstream using a symmetric key KBIT; once the bitstream is loaded onto the FPGA, an internal mechanism decrypts it using the (previously loaded) same symmetric key KBIT. The bitstream itself can contain a variety of designs, one of which might be, e.g., the Advanced Encryption Standard (AES) used as a cryptographic primitive. To perform an operation, this AES primitive must in turn be supplied with a symmetric key KAES.

There were three main questions set at the beginning of the Study Group that both Dan and Simon tried to answer based on the above-mentioned papers.


  • Can we recover the symmetric key used in the cryptographic primitive KAES?
  • Can we recover the symmetric key KBIT used in bitstream protection via Differential Power Analysis?
  • What kind of countermeasures can we use to prevent DPA attacks that reveal KAES?


The first presented paper addresses question 1. The authors describe a fault injection platform for SRAM-based FPGA devices. Faults can be induced in the experiment by voltage glitches or laser attacks. Apart from the practical issues, another interesting feature of the paper is the fault model for SRAM-based FPGA devices. In the presented model, the authors distinguish the following types of outcome:
  • no effect: the internal state of the circuit was not modified at all,
  • silent errors: the state of the circuit is modified, but the result is correct and no error is detected,
  • false positives: there actually is an alteration, and an alarm is raised, but the result is nonetheless correct,
  • detected errors: the fault led to an unexpected result and it was successfully detected,
  • undetected errors: no error is detected, but the cipher output is not the expected value.

There are a few techniques available in the literature to implement error detection. One of them is based on temporal redundancy, where the computation is performed twice (or more) and the results are compared. The authors implemented AES as a target algorithm and used Dual Data Rate (DDR) as a fault detection scheme. The DDR implementation uses the rising and falling edges of the clock to update different sets of registers and thus allows data to be processed twice in each clock cycle. The time "saved" can then be used to compute the result again and check it against faults. This architecture was attacked by laser-based fault injection during runtime. The results showed that although the DDR AES implementation is very resistant to transient faults, the FPGA can be vulnerable to faults that modify the configuration bitstream.

Another interesting paper presented (by Simon) during the study group dealt with countermeasures against KAES leakage from an FPGA device; i.e., what can we do with off-the-shelf SRAM-based FPGA fabric to prevent DPA? The authors focus in this case on a Virtex-5 device from Xilinx and combine a few techniques to make DPA (especially first-order DPA) more difficult, e.g. generating Gaussian noise (using shift register LUTs, BRAM write collisions, short circuits in switch boxes), clock randomization (using Digital Clock Managers, DCMs), preventing clock frequency manipulation, and block memory content scrambling. As a result, the authors present an FPGA armed with countermeasures ready to be used by non-side-channel-aware engineers. In a case study evaluation, the authors showed once again that, in general, just increasing the noise level is not a good way to make your design secure against power attacks. More useful was the idea of using the Digital Clock Managers available on the FPGA chip and modifying their output to achieve pseudorandom clock behaviour. In other words, the rising edge of the clock used in the design is "floating" within some range, which makes the power consumption more unpredictable. The strength of this approach was increased by preventing clock frequency manipulation, to mitigate known workarounds for clock randomization techniques. The combination of all the applied countermeasures increased the total number of trace measurements required to at least 100 million, and even that did not guarantee the success of first-order DPA.

The last two papers presented by Simon describe practical attacks on the bitstream configuration mechanism. These are the first attacks described in the open literature that aim to reveal the bitstream protection key using side-channel methods, and according to the authors they were successful for Virtex-II, Virtex-4 and Virtex-5 devices. Because of the black-box approach (which means that the attacker has very limited knowledge of the internal design), the authors had to answer a few additional questions to become familiar with the overall architecture, e.g., what type of algorithm has been used, what is the mode of operation, and is it a pipelined, loop or software implementation. A number of additional steps (for example, improving the signal-to-noise ratio via filtering and alignment techniques) had to be performed as well. Interestingly, in the case of Virtex-4 and Virtex-5, GPUs were used to speed up the computation; with the increasing complexity of DPA attacks this approach might be very useful for future studies.

Friday, November 4, 2011

Study group: Models of Malware

The study group topic for discussion this week was on formal methods of modeling malware and viruses. In the late 80's Fred Cohen, at the time a student of Leonard Adleman, described an abstract definition of viruses and used it to prove that there is no algorithm that can perfectly detect all possible viruses (text). Adleman himself published a paper a year later, using an abstract definition of a computer virus to answer questions on the detection of, and protection against, viruses (text). Since the late 80's the capability of malware designers has increased and the complexity of malware itself has evolved, resulting in a renewed interest in abstract definitions and modeling of malware and viruses. In the study group Theo introduced some more recent research that follows on from Cohen and Adleman's earlier work: "Toward an Abstract Computer Virology" by G. Bonfante, M. Kaczmarek, and J.-Y. Marion (pdf) and "Math on Malware" by Henk-Jan van der Molen (pdf).

Bonfante et al. suggest a newer definition for viruses that encapsulates Adleman's previous definitions, amongst others, and use the new definition to make an interesting observation on virus detection and to propose a countermeasure against a particular type of virus propagation. A virus is a program with a propagation mechanism that can be described in terms of some computable function B that selects from a list of available programs and then replicates the virus inside the target. In other words, B is the delivery mechanism: some functional property or flaw in the programming language that the virus uses to infect the system. A common approach in previous attempts at formalising viruses has been to define a virus as a self-replicating device; a virus has the capability to act on a description of itself. This leads to a definition that uses Kleene's recursion theorem to construct the viral mechanism. The self-reproduction is provided by a duplication function that copies the virus into a program in different ways.

There are three particular types of duplication function, mimicking three types of virus behaviour. A crushing is simply a virus that has some behaviour involving copying itself, e.g. to a directory. A cloning is a virus that copies itself into some parts of a program environment, without increasing the overall program size (typically by overwriting honest code). Finally, an ecto-symbiotic virus essentially attaches itself to a host program, without overwriting it. Further definitions that cover polymorphic viruses (viruses that in some sense mutate when they reproduce) are discussed in the paper. The authors also show how these definitions encapsulate Adleman's original ones by translating them into the new formalism. An interesting result is that it can be proven, under the definitions approximately described above, that there are at least some propagation mechanisms B for which it is decidable whether a particular program is a virus.

A slightly different question is to ask how malware or viruses propagate throughout a network once they are introduced into the environment. Henk-Jan van der Molen focuses on using ideas from epidemiology and network theory to study how infection rates change over time once a network is compromised. In particular, a highly useful metric would quantify how effective particular countermeasures and security methods are at reducing infection rates (and increasing clean-up rates). For instance, potential tools for alleviation such as deployment of anti-virus software, periodic re-imaging, increased user education, and incident management/disaster recovery procedures have varying economic costs, so knowing their real effectiveness as countermeasures can increase efficiency.

The particular network model chosen for analysis is the "Susceptible-Infected-Susceptible" (SIS) model. From a biological standpoint, it captures a scenario whereby, after contracting a disease and recovering, humans are still susceptible to re-acquiring the condition later on. This is perhaps the best fit for modeling a computer network, as machines that are cleaned/re-imaged/patched may still have potential vulnerabilities. Without going into the mathematical description too heavily, the SIS model starts by assuming that in the early phase infection rates grow slowly, because there are few infected computers to spread the malware, and that the number of infected computers returned to the susceptible state is proportional to the number of infected computers. Two additional factors are the probability of resuscitation, which models the effectiveness of the countermeasures in cleaning an infected node back to its susceptible state, and the contamination factor, which models the probability that an infected node infects a susceptible neighbour.
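
As a rough illustration of the dynamics (not the exact equations from the paper), a discrete-time SIS simulation with homogeneous mixing might look like this; the contamination factor and resuscitation probability below are made-up values:

def sis_step(infected: float, total: int, beta: float, gamma: float) -> float:
    # beta: contamination factor - chance an infected node infects a susceptible one
    # gamma: resuscitation probability - chance an infected node is cleaned per step
    susceptible = total - infected
    new_infections = beta * infected * susceptible / total
    recoveries = gamma * infected
    return max(0.0, min(float(total), infected + new_infections - recoveries))

def simulate(total=10_000, initially_infected=10, beta=0.3, gamma=0.1, steps=200):
    infected = float(initially_infected)
    history = [infected]
    for _ in range(steps):
        infected = sis_step(infected, total, beta, gamma)
        history.append(infected)
    return history

# In this simple model the infection level settles near total * (1 - gamma/beta)
# when beta > gamma, and dies out otherwise - so raising the clean-up rate
# (e.g. by periodic re-imaging) above the contamination rate drives it to zero.
print(round(simulate()[-1]))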

The SIS model is analysed in three particular scenarios. The first is the 'corporate' scenario, with statistics from malware infections in organisations used to estimate the resuscitation and contamination factors. In the 'practice' scenario, the goal of the malware is assumed to be to infect as many computers as quickly as possible. In this instance, the parameter estimation is difficult because there are few sources of reliable statistics available. In the final 'cyberwarfare' scenario, the malware needs to remain undetected for as long as possible to keep target nodes in the infected state. One interesting result drawn from the model is that with a periodic reset of software, in the corporate scenario the steady state of malware infections drops almost to zero. With the increasing proliferation of virtualisation techniques in business, executing this strategy frequently may become more and more viable.

Another mitigating factor in the spread of malware is the diversity of the software installed on a network. A fully Windows-based network could potentially be entirely compromised by a single Windows-based virus, but a network comprised of a mix of Mac OS, Linux and Windows could not be infected to the same extent by just one such virus. An analysis of the SIS model seems to support this hypothesis, with increasing diversity resulting in slower infection rates. However, the practical and economic implications of this strategy are much harder to capture: the costs of supporting diverse software are higher, vendors typically try to encourage lock-in, a reliance on software supporting shared data standards is introduced, and so on.

The research poses a challenging yet important problem. Through modeling it will be possible to improve our understanding of virus and malware propagation and the effectiveness of tools and procedures as countermeasures, but a further understanding of the implications and costs of the mitigating factors is also required to have a complete solution.

Monday, October 31, 2011

InfoSecHiComNet 2011

Last week I was attending a conference we helped organise in India entitled InfoSecHiComNet 2011. The conference received 122 submissions, of which 14 went into the conference program. These were complemented by invited talks by Jorn-Marc Schmidt, Ingrid Verbauwhede, Benedikt Gierlich, Saibal Pal, Palash Sarkar, and Sanjay Burman.

We expect that this conference will be repeated under the less general theme of "Cryptography and Cryptographic Engineering", which we hope will help encourage cryptography research in India. This conference is expected to complement CHES, since the number of submissions generally received by CHES demonstrates the popularity of research into topics related to cryptographic engineering. The intention is also to hold conferences that will not be in direct conflict with Indocrypt.


Tuesday, October 25, 2011

Study Group: Physical security and FPGAs - Part I

In this study group Marcin and Philip started the discussion on Physical security of FPGAs which will be followed by a second part in two weeks. They based the study group mostly on two papers by Saar Drimer, "Volatile FPGA design security - a survey" [pdf], "Security for volatile FPGAs (Technical Report)" [pdf] and on the paper "Protecting multiple cores in a single FPGA design" [pdf] by Drimer, Güneysu, Kuhn and Paar.

As we have both theoretical and practical cryptographers in our group, Marcin and Phil started by describing the basic technology of a Field-Programmable Gate Array: unlike ASICs (Application-Specific Integrated Circuits), the function of an FPGA is not determined when it is manufactured. Instead, FPGAs have to be configured; based on their configuration mechanism, they are either
  • volatile, i.e. the configuration can be changed, or
  • non-volatile where the configuration is "burnt" into the device, e.g. using anti-fuses.
The key manufacturers of FPGAs are Xilinx, Altera and Actel, which was bought by Microsemi.

The study group focused on volatile FPGAs, which essentially consist of an array of SRAM lookup tables (LUTs), which map a fixed number of input bits (e.g. 4) to a smaller number of output bits (e.g. 1), plus configurable interconnect. The LUTs basically emulate the truth tables of the logic gates used in ASICs, while the configurable interconnect ensures that essentially any LUTs can be "wired together" as required (a small software sketch of a LUT follows the table below). Being SRAM, the LUTs have to be reconfigured at boot time, which is what makes the FPGA volatile. A comparison of FPGAs to CPUs and ASICs results (roughly) in the following table:

                      CPU           FPGA     ASIC
Speed                 low           medium   high
Power consumption     high          medium   low
Implementation cost   low           low      high (tapeout, masks)
Cost per unit         medium        high     low
Updateability         good          doable   none
Implementation        C, Java, ...  HDL      HDL
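
To make the LUT description above a little more concrete, here is a small software model of a 4-input LUT (only an analogy for the SRAM cells, but it shows how the configuration bits simply are the truth table):

class LUT4:
    """Toy model of a 4-input, 1-output SRAM lookup table."""

    def __init__(self, truth_table_bits: int):
        # The 16 configuration bits are the truth table: bit i is the output
        # produced for input combination i. Reconfiguring the FPGA means
        # rewriting these bits, which is why SRAM-based devices are volatile.
        self.config = truth_table_bits & 0xFFFF

    def evaluate(self, a: int, b: int, c: int, d: int) -> int:
        index = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | ((d & 1) << 3)
        return (self.config >> index) & 1

# Example: configure the LUT as a 4-input AND gate (only input 0b1111 gives 1).
and4 = LUT4(1 << 15)
assert and4.evaluate(1, 1, 1, 1) == 1
assert and4.evaluate(1, 0, 1, 1) == 0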

Of course FPGAs come with a number of security-related issues, such as
  • leakage (side channel attacks)
  • semi-invasive attacks (fault injection)
  • invasive attacks
  • trustworthy (remote) configurations (hw trojans, reliability)
  • protection of Intellectual Property in configuration bitstreams (IP cores).
An FPGA is configured using a "bitstream", which consists of the data to be written to the LUTs and of the configuration bits for the interconnect. Since updateability is one of the key features of an FPGA, one would of course want to do this remotely and allow only trustworthy, i.e. authenticated, bitstreams to be loaded into the FPGA. At the same time, the bitstreams contain valuable IP, and whoever can read a bitstream can either clone the device (by loading the bitstream into another FPGA) or reverse engineer the IP from it. Furthermore, it is common that a System Designer doesn't implement all the IP in a bitstream himself - instead he licenses IP cores from several IP core vendors. (Commonly, an embedded system on an FPGA consists of a microprocessor core, some interface cores such as UART, USB or Ethernet, some accelerator cores e.g. for graphics, crypto or signal processing, and a bus core to connect all of the other cores. These may all be supplied by different IP core vendors.) So the IP core vendors have to ensure that their cores are only used in devices for which a license has been bought from them.

The first protocol presented by Marcin and Phil is a simple remote update protocol for which the FPGA needs to have
  • a fixed update logic (UL), alongside the configurable logic, which can compute a keyed MAC,
  • a unique and public fingerprint F,
  • a version number V of its current bitstream (to guarantee freshness)
  • and of course a key which will be used as input for the keyed MAC.
Then, the FPGA has to identify itself and its current bitstream version to the update server so that both sides know F and V. Additionally, both sides need to generate nonces in order to avoid replay attacks; for the FPGA it is sufficient to use a counter, while the update server should create a truly random nonce. The bitstream is then sent to the FPGA together with its MAC value, which serves both as an authentication and as an error-protection mechanism. It is important for the remote update protocol to provide a backup solution in case something goes wrong during the update - otherwise the FPGA would be left unusable. One option is to have a default bitstream stored in some non-volatile ROM without using too much (expensive) memory. Note also that the remote update protocol does not need to decrypt the bitstream - commonly the encrypted bitstream is stored in the non-volatile memory as is, and is only decrypted when it is loaded into the configurable logic at boot time, using a key and decryption routine embedded in the boot logic.
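
A rough sketch of this message flow is given below; the field names, encodings and the choice of HMAC-SHA-256 are illustrative assumptions of mine rather than details taken from the papers.

import hmac, hashlib, secrets

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

# FPGA side: identify itself (fingerprint F, version V) with a counter nonce.
def fpga_hello(fingerprint: bytes, version: int, counter: int) -> bytes:
    return fingerprint + version.to_bytes(4, "big") + counter.to_bytes(8, "big")

# Update server: fresh random nonce, then a MAC over everything that fixes the
# context, including the (already encrypted) new bitstream.
def server_update(key: bytes, hello: bytes, new_version: int, enc_bitstream: bytes):
    server_nonce = secrets.token_bytes(16)
    payload = hello + server_nonce + new_version.to_bytes(4, "big") + enc_bitstream
    return server_nonce, new_version, enc_bitstream, mac(key, payload)

# FPGA update logic: recompute the MAC before overwriting the stored bitstream;
# on failure (or a broken update) it falls back to the default bitstream in ROM.
def fpga_accept(key: bytes, hello: bytes, server_nonce: bytes, new_version: int,
                enc_bitstream: bytes, tag: bytes) -> bool:
    payload = hello + server_nonce + new_version.to_bytes(4, "big") + enc_bitstream
    return hmac.compare_digest(tag, mac(key, payload))

k = secrets.token_bytes(16)                       # the MAC key inside the update logic
hello = fpga_hello(b"unique-fingerprint-F", 7, 1)
nonce, v, blob, tag = server_update(k, hello, 8, b"...encrypted bitstream...")
assert fpga_accept(k, hello, nonce, v, blob, tag)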

The second protocol presented by Marcin and Phil addresses the bitstream decryption and IP core protection problem. Basically, if the bitstream is completely owned by the System Designer (SD), the SD could just encrypt it using a symmetric key he has loaded into the FPGA. But IP core vendors want to protect their IP even though it will always be used by SDs as part of a larger system. So a more complex scheme is needed, and a Diffie-Hellman style key exchange can provide it, as the presented 4-stage protocol with three parties (FV - FPGA Vendor, CV - Core Vendor, SD) shows. The prerequisites on the FPGA are
  • a unique key K_FPGA,
  • a unique identifier F_ID (both stored in non-volatile memory inside the FPGA),
  • some additional hardwired registers to store a bitstream encryption key for each CV, and
  • a hardwired decryption module which has access to all of the above keys.
Obviously, none of the keys should be accessible to the outside world. The 4 stages are:
  1. Setup: FV generates a secret x, computes the public value g^x, embeds the secret into the personalisation bitstream PB, encrypts PB under K_FPGA and sends [F_ID, g^x, ENC_{K_FPGA}(PB)] to SD.
  2. Licensing: For each CV and each FPGA device, SD sends (F_ID, g^x) to the CV, who chooses a secret y, computes the public value g^y and uses a key derivation function KDF (specified by the FV) to compute a key K_CV = KDF(y, g^x, F_ID), which he uses to encrypt his IP core; he then sends [ENC_{K_CV}(core), g^y] to the SD.
  3. Personalisation: Having assembled his own bitstream containing all the encrypted cores of the CVs, the SD first loads the encrypted PB onto the FPGA, which the FPGA is able to decrypt using K_FPGA. The PB contains an implementation of the KDF and the vendor secret x. Furthermore, the SD provides all the g^y values to the FPGA, which in turn uses the KDF to compute all K_CV = KDF(x, g^y, F_ID) and stores them in the registers of the decryption logic. (The computation works thanks to Diffie-Hellman: both sides derive the key from g^{xy} and F_ID.) Now the personalisation logic (within the configurable part of the FPGA) isn't needed anymore, as the (static) decryption logic is fully prepared to decrypt the bitstreams.
  4. Configuration: The SD sends all encrypted bitstreams to the FPGA, which decrypts them using the respective K_CV's stored in the decryption registers during personalisation.
Note that the SD learns neither the secret x nor any of the secrets y, and therefore cannot compute any K_CV to decrypt the IP cores he licensed. Additionally, each K_CV is tied to the individual FPGA chip via F_ID, so cloning a device is not possible.
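
To see why stages 2 and 3 leave the core vendor and the FPGA with the same key while the system designer learns nothing, here is a toy Diffie-Hellman computation; the prime, generator, KDF and all constants are placeholders chosen only to make the sketch run, not values from the paper.

import hashlib

p = 2**127 - 1                       # toy prime modulus - far too small for real use
g = 3

def kdf(shared: int, f_id: bytes) -> bytes:
    # Placeholder KDF: hash the shared DH value together with the device ID.
    return hashlib.sha256(shared.to_bytes(16, "big") + f_id).digest()

f_id = b"example-device-id"          # per-device identifier F_ID
x = 0x1234567890ABCDEF               # FPGA vendor's secret (embedded in PB)
y = 0xFEDCBA0987654321               # core vendor's secret

gx, gy = pow(g, x, p), pow(g, y, p)  # the only values the system designer handles

k_cv_vendor = kdf(pow(gx, y, p), f_id)   # computed by the core vendor to encrypt the core
k_cv_fpga   = kdf(pow(gy, x, p), f_id)   # computed inside the FPGA during personalisation
assert k_cv_vendor == k_cv_fpga          # both sides hold K_CV; the SD never can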

Tuesday, October 18, 2011

Study group: RSA-based standard model signatures

Nigel and Ming-Feng presented the results of 4 papers on provably secure digital signatures in the standard model. Ming-Feng started with Cramer and Shoup's seminal paper from 2000 [ps], and then discussed reductions of the signature length by Fischlin (PKC 2003) [pdf]. Nigel then mentioned the first (stateful) scheme based on the RSA assumption by Hohenberger and Waters [pdf], and dedicated a more thorough presentation to their follow-up work at CRYPTO 2009 [pdf], which is a stateless scheme.

The Cramer-Shoup signature scheme was the first to have a standard-model proof of security against adaptive chosen-message attacks. It is stateless (which is now the standard for digital signatures, meaning the signer need not keep any state information) and based on the strong RSA assumption. The RSA assumption is: given an RSA modulus n (the product of two large primes) and r, z ∈ Z_n^*, it is hard to find y ∈ Z_n^* s.t. y^r ≡ z (mod n). For the strong RSA assumption the adversary has more flexibility: given n and z, she has to find both an r > 1 and a y ∈ Z_n^* s.t. y^r ≡ z (mod n).

Note that while the latter assumption is stronger than the former, the best known attack against both of them is to factor the modulus n.

A signature scheme SIG=(KG,Sign,Ver) consists of algorithms for key generation, signing and verification, which can be sketched as follows (pk is the public key and sk is the signing key):
  • 1^k → KG → (sk, pk)
  • (sk,m) → Sign → σ
  • (pk,m,σ) → Ver → 1/0

A scheme is secure against chosen-message attacks if an adversary getting the public key and a signing oracle that returns signatures on messages of its choice cannot output a message it has never queried and a valid signature on it.

Note that the "textbook" RSA signature scheme (where a signature is simply m^d mod n) is not secure under chosen-message attack, since signing is homomorphic: the product of two signatures is a valid signature on the product of the two messages. "Hashed RSA" is secure in the random-oracle model.
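
A minimal illustration of the homomorphic forgery, with toy parameters:

# Textbook RSA: sig(m1) * sig(m2) = (m1 * m2)^d mod n, so two signing queries
# give a valid signature on a message that was never queried.
p, q = 61, 53                          # toy primes - far too small for real use
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent

def sign(m): return pow(m, d, n)
def verify(m, s): return pow(s, e, n) == m % n

m1, m2 = 42, 55                        # messages submitted to the signing oracle
forgery = (sign(m1) * sign(m2)) % n
assert verify((m1 * m2) % n, forgery)  # valid signature on m1*m2, never queried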

Cramer and Shoup proposed the first scheme that is provably secure against chosen-message attacks in the standard model. The scheme has 2 parameters l and l' with l+1<l'. Typical assignments are l=160, l'=512.

Key generation

  1. generate an RSA modulus n = pq where p = 2p'+1 and q = 2q'+1 for p' and q' large primes (such p and q are called safe primes)
  2. generate random quadratic residues h, x ∈ QR_n
  3. generate an (l+1)-bit prime e' and
  4. define pk=(n,h,x,e') and sk=(p,q)
Signing
  1. fix a hash function H with l-bit outputs (e.g. SHA-1 for l = 160)
  2. choose a random (l+1)-bit prime e ≠ e'
  3. choose a random y' ∈ QR_n
  4. compute x' = (y')^{e'} · h^{-H(m)} mod n
  5. compute y such that y^e = x · h^{H(x')} mod n (the e-th root can be computed using the factorisation of n)
  6. define σ = (e, y, y')
Verification
  1. check that e is an odd (l+1)-bit integer and e ≠ e'
  2. compute x' = (y')^{e'} · h^{-H(m)} mod n
  3. check that x = y^e · h^{-H(x')} mod n
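
To see the equations in action, here is a toy run with unrealistically small parameters (and a fixed per-signature prime e, which a real signer would sample freshly); it is only a sanity check that signing and verification fit together, not an implementation of the scheme.

import hashlib

pp, qq = 11, 23                       # p' and q'
p, q = 2 * pp + 1, 2 * qq + 1         # the safe primes 23 and 47
n = p * q                             # 1081; QR_n has order p'q' = 253
h, x = pow(3, 2, n), pow(5, 2, n)     # "random" quadratic residues
e_prime = 17                          # the fixed prime e' from the public key

def H(data: bytes) -> int:            # toy l-bit hash with l = 8
    return hashlib.sha256(data).digest()[0]

def sign(m: bytes):
    e = 19                            # really: a fresh random (l+1)-bit prime != e'
    y_prime = pow(7, 2, n)            # really: a random element of QR_n
    x_prime = pow(y_prime, e_prime, n) * pow(h, -H(m), n) % n
    d = pow(e, -1, pp * qq)           # e-th roots in QR_n, using the factorisation
    y = pow(x * pow(h, H(str(x_prime).encode()), n) % n, d, n)
    return e, y, y_prime

def verify(m: bytes, sig) -> bool:
    e, y, y_prime = sig
    if e == e_prime:                  # (a real verifier also checks e is an odd (l+1)-bit number)
        return False
    x_prime = pow(y_prime, e_prime, n) * pow(h, -H(m), n) % n
    return x == pow(y, e, n) * pow(h, -H(str(x_prime).encode()), n) % n

assert verify(b"hello", sign(b"hello"))
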
The proof distinguishes different kinds of forgers, depending on whether the forgery reuses the e of a signature obtained from a signing query, reuses the x', or both. While forgeries that reuse these parts can be reduced to the standard RSA assumption, for forgeries with a new e we need to resort to the strong RSA assumption.

In his PKC '03 paper, Fischlin gives a variant of Cramer-Shoup signatures (also provably secure under the strong RSA assumption) which has shorter signatures: instead of |σ| = 2|n| + l + 1, we now have |σ| = |n| + 2l + 1, but the public keys are longer.


While the CS signature scheme was state of the art, it requires the strong RSA assumption. The reason is that every signature requires its own value e; so in the security reduction, in order to simulate the signing oracle, the simulator (who does not know the factorisation of n) has to be allowed to "cheat" by creating new e's.

It thus seems that we cannot get around this stronger assumption for RSA-based signatures. However, at EUROCRYPT 2009, Hohenberger and Waters presented a scheme that was proven secure under the (standard) RSA assumption. The caveat is that it is a stateful scheme, meaning that the signer has to store a counter and increment it every time it issues a signature; otherwise all security guarantees are lost. Their signatures are short and look like standard hash-and-sign signatures à la H(M)^d mod n.

At CRYPTO the same year, they extended their ideas to define a stateless signature scheme based on RSA. They start by defining a scheme that only satisfies security against weak chosen-message attacks, where the adversary has to give the challenger all messages it would like to have signed before it gets to see the public key.

Then they transform their scheme into a fully secure one by using a known trick: chameleon hash functions. These are functions that take as input a message and some randomness, H(m,r) → y, such that given a trapdoor τ one can, for any y and m, efficiently compute r s.t. H(m,r) = y.

The good news is that there exists a chameleon hash function based on the RSA assumption. Given a weakly (not weekly) secure scheme and a chameleon hash function, one can generically construct a fully secure scheme as follows.

Add the description of the hash function to the public and private key. To sign m, pick a random r and sign x = H(m,r): σ' = Sign(sk, x). The signature is then defined as σ = (σ', r). To verify it, just recompute the hash and verify σ' on it.
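
For concreteness, here is a sketch of one well-known RSA-based chameleon hash, CH(m, r) = h^m · r^e mod N; this particular construction is my illustration (the paper may use a different one), but it shows the trapdoor property the transformation relies on: whoever knows d = e^{-1} mod φ(N) can open any hash value to any message.

import secrets

p, q = 2**31 - 1, 2**61 - 1           # toy primes - far too small for real use
N, phi = p * q, (p - 1) * (q - 1)
e = 65537                             # public exponent, larger than the message space
d = pow(e, -1, phi)                   # the trapdoor
h = 3

def ch(m: int, r: int) -> int:        # chameleon hash, for m < e and r in Z_N^*
    return pow(h, m, N) * pow(r, e, N) % N

def open_to(y: int, m2: int) -> int:  # trapdoor: find r2 with ch(m2, r2) = y
    return pow(y * pow(h, -m2, N) % N, d, N)

m1, r1 = 12345, secrets.randbelow(N - 2) + 2
y = ch(m1, r1)
r2 = open_to(y, 54321)
assert ch(54321, r2) == y             # same hash value, different message
# Without the trapdoor, finding such a collision would yield an e-th root of h,
# i.e. break the RSA assumption - which is what the security proof uses.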

Now, given this transformation, all that needs to be done is to construct a weakly secure scheme, which is done as follows (remember, the simulator's goal is, given N, e and y, to compute y^{1/e} mod N):

Key generation
  1. pick an RSA modulus N, pick h ← Z_N^*
  2. pick a pseudo-random function F_k: {0,1}^* → {0,1}^l
  3. pick a random c ← {0,1}^l
  4. define H_{k,c}(z) = F_k(i||z) ⊕ c, where i is the smallest number s.t. H_{k,c}(z) is an odd prime
  5. output pk = (N, h, c, k), sk = (c, k, p, q)
(Whereas in the EUROCRYPT paper, the state i is responsible for choosing the primes, now the picked prime depends on the input, not on the index of the signature.)

Signing
  1. let m^{(i)} be defined as the first i bits of m
  2. define e_i = H_{k,c}(m^{(i)}) (thus, given an n-bit message, we generate n prime numbers, one per prefix)
  3. define σ = h^{(e_1···e_n)^{-1}} mod N, where the inverse of the product is computed modulo φ(N) using the secret key
Verification
  1. compute e_i = H_{k,c}(m^{(i)})
  2. check σ^{e_1···e_n} = h (mod N)
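
As a sanity check of the signing and verification equations (not of the security argument), here is a toy implementation; the tiny modulus, the 16-bit prime search and the use of HMAC-SHA-256 as the PRF F_k are all placeholder choices of mine.

import hashlib, hmac, math, secrets

L_BITS, MSG_BITS = 16, 8              # toy output length l and message length

def is_odd_prime(v: int) -> bool:
    if v < 3 or v % 2 == 0:
        return False
    return all(v % f for f in range(3, math.isqrt(v) + 1, 2))

def H(k: bytes, c: int, z: str) -> int:
    # H_{k,c}(z) = F_k(i || z) XOR c, for the smallest i giving an odd prime
    i = 0
    while True:
        digest = hmac.new(k, f"{i}|{z}".encode(), hashlib.sha256).digest()
        candidate = int.from_bytes(digest[:L_BITS // 8], "big") ^ c
        if is_odd_prime(candidate):
            return candidate
        i += 1

# key generation (toy modulus built from two Mersenne primes)
p, q = 2**31 - 1, 2**61 - 1
N, phi = p * q, (p - 1) * (q - 1)
h = 2
k, c = secrets.token_bytes(16), secrets.randbits(L_BITS)

def prefixes(m: str):
    return [m[:i] for i in range(1, len(m) + 1)]

def sign(m: str) -> int:
    product = math.prod(H(k, c, z) for z in prefixes(m))
    assert math.gcd(product, phi) == 1          # holds w.h.p. here, always for real sizes
    return pow(h, pow(product, -1, phi), N)     # sigma = h^{1/(e_1...e_n)} mod N

def verify(m: str, sigma: int) -> bool:
    product = math.prod(H(k, c, z) for z in prefixes(m))
    return pow(sigma, product, N) == h

m = format(secrets.randbits(MSG_BITS), f"0{MSG_BITS}b")   # an 8-bit message
assert verify(m, sign(m))
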
The proof idea is the following. First, after the adversary outputs m_1, ..., m_n, we fix pk. Since the output m* is a forgery, it must be different from all the m_i, so there has to be a position j s.t. (m*)^{(j)} is different from all the m_i^{(j)}.

Suppose we guess this position; then we know what (m*)^{(j)} is, and can fix c s.t. H_{k,c} applied to (m*)^{(j)} returns e, the target from the RSA challenge. To simulate signatures on the queried messages m_i, we raise y to the power of all the primes e_{i,j} = H_{k,c}(m_i^{(j)}) and set the result as h. Now we can take e_{i,j}-th roots of h. (This is similar to how Boneh and Boyen transform a challenge for the strong Diffie-Hellman assumption into signatures in their scheme.)