Bristol Cryptography Blog: January 2013

Tuesday, January 29, 2013

Study Group: Functional Encryption

Last week's study group was on Functional Encryption and was given by Georg and Joop.

In traditional Public-Key Encryption (PKE), a user Alice encrypts a message x using the public key of the recipient Bob. Bob then uses his corresponding secret key to decrypt the ciphertext y to recover the message x. In some scenarios, the recipient of the ciphertext is not yet known at the time of the encryption or there are more than one recipient who should be able to decrypt the ciphertext, e.g. according to some policy. Functional Encryption (FE), formalized recently by Boneh, Sahai and Waters [PDF], is a generalization of many recent encryption primitives, e.g., Attribute-Based Encryption (ABE) [PDF] and Predicate Encryption (PE) [PDF] which extend traditional PKE to provide fine-grained access to encrypted data.
A FE scheme for a family of circuits C assigns a secret key sk_C for every circuit C∈C.
The holder of a secret key sk_C is able to decrypt a ciphertext y to obtain C(x) and learns nothing more about x.
An FE scheme is formally defined by four algorithms (Setup, KeyGen, Enc, Dec) which are defined as follows:

Setup(1^λ) this PPT algorithm takes a security parameter λ as input and outputs a pair of master public\secret keys (mpk,msk).
KeyGen(msk,C) this PPT algorithm takes as input the master secret key msk and a circuit C∈C and outputs a secret key sk_C.
Enc(mpk,x) this PPT algorithm takes as input the master public key mpk and a message x from the message space X and outputs a ciphertext y.
Dec(sk_C,y) this deterministic algorithm takes as input the secret key sk_C and the ciphertext y and outputs C(x).

Gorbunov, Vaikuntanathan and Wee [PDF] provide a construction for the class of all polynomial-size circuits secure under a simulation-based bounded collusion definition in which the adversary can make bounded number of KeyGen queries and gets a single challenge ciphertext. The definition ensures that any information that the adversary learns from the keys and ciphertext can be obtained by a simulator from only the keys and the output of the circuit.

In the real world the challenge ciphertext is produced by encrypting the challenge message x^* using mpk, whereas in the ideal world, the simulator is equipped with the length of the the challenge message as well as {C_i(x^*), sk_{C_i}, C_i} resulting from the q KeyGen queries the adversary is allowed to make. In the adaptive variant, the adversary is allowed to make further KeyGen queries after it has received the challenge ciphertext.
Gorbunov, Vaikuntanathan and Wee [PDF] show that their simulation-based definition for a single message implies the indistinguishability definition in both the adaptive and non-adaptive cases. The single-message indistinguishability definition also implies the many-messages indistinguishability definition in both the adaptive and non-adaptive settings. However, while the non-adaptive single-message simulation definition implies the many-messages non-adaptive simulation definition, the same does not hold for the adaptive case.

The idea of their construction for the NC1 class of circuits and a message space X =Fⁿis based on a scheme that is secure under a single key query and is summarized as follows:
The system parameters are t, v, N and S.
A new family G is defined from C as G_C,Δ(x,Z₁,…, Z_S) = C(x) + Σ_{i∈ Δ}, where Z_i∈ F and Δ ⊆[S].
The setup algorithm independently runs N times the single key query Setup algorithm for the same circuit family. The master public key is then set to MPK={mpk₁,…, mpk_N}, whereas the master secret key is MSK={msk₁,…,msk_N}.
The KeyGen algorithm then on input (C,MSK) randomly chooses a set Γ ⊆ [N] of size tD+1, where D is the degree of the circuits. It also chooses a set Δ⊆ [S] of size v. It then runs the KeyGen of the single key query scheme with input (msk_i,G_C,Δ ) for all i∈ Γ and returns sk_C=( Γ, Δ, {sk_C,Δ,i}_i∈Γ)_.

To encrypt a message (x₁,… , x_n), choose n random degree t polynomials μ_i(.) such that x_i is the constant term in polynomial μ_i(.). Also, S random degree tD polynomials ζ(.) whose constant term is 0 are chosen. To encrypt the message encrypt (μ₁(i),…μ_n(i),ζ₁(i),…ζ_S(i)) to obtain ciphertexts y_i for i=1,…,N.
To decrypt, compute a degree tD polynomial χ(.) such that χ(i)=SKQDec(sk_C,Δ,i,, y_i) for all i ∈ Γ, where SKQDec is the decryption algorithm of the single key query scheme. The decryption algorithm then outputs χ(0).

Saturday, January 26, 2013

Crypto for 2020 - Day 2

Kenny Patterson talked about theory and practice of key reuse. Key reuse offers some interesting advantages such as reducing storage requirements for certificates and keys, reducing the costs of certification as well as reducing the cryptographic code footprint. A classical scenario where to use this approach would be in an encryption and signature setting. Given all the advantages that key reuse provides in common practice, what can we say about security? At Asiacrypt 2011, an efficient construction based on pairings has been presented for joint security of encryption and signature.

However, in practice, simpler constructions are employed. For instance, the EMV standard uses RSA keys to compute signatures for card authentication and transaction authorization, and to encrypt the PIN between the terminal and the card. No specific analysis has been conducted of whether this approach is detrimental to security. If you are a follower of this blog then you will know that there exists an attack (at least in theory), Nigel described it in some more detail in one of his earlier posts.

Furthermore, Stefan gave a brief overview of the most important developments in the area of physical security in the last 15 years. State-of-the-art attacks can process up to 1 billion measurements of side-channel information (i.e., run time, power consumption or electromagnetic radiation) that typically leaks during the execution of security-critical code. This gives an attacker a lot of information to extract secret key material. That is one of the reasons why there is currently no countermeasure that guarantees absolute side-channel resistance. However, one can make life considerably more difficult for an attacker by employing defence mechanisms that combine higher-order masking with some shuffling of the security-critical code sections or alternatively using a key refreshing solution.

Thursday, January 24, 2013

Crypto for 2020 - security and privacy

The third and final panel discussion here in Tenerife at Crypto for 2020 was entitled 'Crypto for security and privacy', covering the current and expected future challenges to be faced by cryptographers working on protocols and methods for secure, private and/or authenticated communication on the internet. We heard from Nick Mathewson from the Tor project, Zooko Wilcox-O'Hearn, Dan Bernstein and Matthew Green. This is a pretty diverse and multi-faceted topic, so it was interesting to hear some wide-ranging and varied arguments.

The primary theme of the discussion (also quite pervasive throughout the workshop in general) was the divergence between the conceptualization of the problems as understood by theorists designing or analysing protocols and the reality faced by practitioners looking to implement and deploy schemes in the real world.

In terms of specifics, and somewhat related to Kenny Paterson's talk yesterday, the topic of backwards compatibility, poor communication between organisations, standards bodies or academics and influences from commercial or economic pressures resulting in broken or outdated primitives remaining in standards was touched upon. As always in such talks, references to good old TLS were common, but outside of that particular points made by the panel were that we as cryptographers need to consider framing the language we use to describe solutions in a context more useful for implementors, and that it may not be sufficient any more to keep patching up these protocols as they age and become outdated. One particularly weighty point made was that there has been very little academic involvement in the standardisation of many of the major widely deployed cryptographic protocols, and there was a general consensus that engagement between both sides of the equation needs to be both more frequent and more productive.

Matthew Green framed some of this as the problem of having support users who do not use any cryptography with those who do, within the same protocol. The canonical example is that the mechanism for the basic starting point of a user establishing a https connection after hitting enter in the url bar is non-trivial - we've seen some progress in this area with HSTS and certificate pinning, but there is certainly more work to be done.

We also heard about how we should be considering that in many cases on the Internet cryptography is not used by the majority of the userbase. Dan Bernstein discussed how there is potential for building new software to fill these gaps, citing DNSCrypt as an example. One suggestion was how the development of software for secure, authenticated email may be some low-hanging fruit; the barrier for entry is low, and currently doing this is a non trivial task. Tied into the above was a discussion on the usability and accessibility of security within the internet ecosystem - the complexity of configuring TLS within Apache was referenced as an example as how the barrier for entry for a non-expert user to use cryptography correctly is a hindrance for the uptake of crypto protocols.

The workshop is entitled 'Crypto for 2020', so we heard thoughts from each of the panel members on the state of play and guesses on challenges to be faced in the Internet sphere in 7 years time, and how we as cryptographers should be dealing with these. Zooko Wilcox-O'Hearn claimed that the rate of change is accelerating as more and more major players are re-evaluating how much they value security. We're seeing possibly nation-states involved in large scale cyber security operations with Flame, Stuxnet and the recent Red October, and further how these operations may have been running for years without the wider community noticing. An open question that follows is how or whether this 'arms race' or increased focus on security will change how we will have to use and analyse crypto. Nick Mathewson pointed out how the design of crypto protocols in practice lags behind academia by a good number of years, and as such the protocols and schemes we are working on now may be the ones in use in 2020.

Crypto 2020, Day 1

This week the final ECRYPT II event is taking place on Tenerife. With it comes an end to nine successful years of European integration of cryptologic research by the Networks of Excellence ECRYPT I and ECRYPT II. ECRYPT has been very active and it has justly regained a great reputation as organizer of summer schools, workshops, and research retreats. The opportunities that were provided will be sorely missed, especially by PhD students and junior researchers, for whom ECRYPT was a way to meet and build a personal network of excellence. The sad news is that there is no ECRYPT III on the horizon yet, at least not as an EU funded Network of Excellence. Luckily, the experience of ECRYPT shows that schools and workshops can be organized with a relatively limited budget. Given the great value for money, there is a real willingness among those behind ECRYPT to continue offering the experience from which so many past and current PhD students benefitted.

One talk of the morning session I particularly enjoyed was by Joan Daemen. He is part of the Keccak team that won the SHA-3 competition. Joan made a very reasoned call to consider permutations as the basis of symmetric cryptography (whereas traditionally block ciphers or hash functions seem to take this place). The sponge design, as used by SHA-3, is a good example of permutation-based cryptography and Joan explained how other symmetric goals (such as authenticated encryption) can also be achieved by sponge-like constructions. An interesting observation is the distinction between keyed and unkeyed primitives. For the former, an attacker has less information on the state used by a sponge, which arguably means that the sponge can absorb more input per iteration and needs fewer rounds per iteration. It seems that there are still many questions to be answered, e.g. when using permutation-based cryptography what are the possible security-efficiency tradeoffs possible and what can be said about the number of rounds, or is the only option reliance on the opinion of expert designers based on preliminary cryptanalysis?

The afternoon finished with a panel discussion on provable security. The panel consisted of François-Xavier Standaert (FX), Dan Bernstein (DJB), and Giuseppe Persiano (Pino), plus Michel Abdalla as moderator. Given the diverse background of the panel, a lively debate was expected and the personal positioning statements (what they found good and useful about provable security, and what less so) clearly drew the lines.

FX considers constructive proofs useful as they can guide practice, and tight bounds can help in this respect as well. Negative results on the other hand are less useful. He also thinks that theory for the sake of theory is fine, as long as there is no false advertising about practical implications that are in reality not there.

DJB focused on what provable security is trying to do, without wishing to concentrate on proof errors, looseness, limited models etc., as he considers them temporary problems that can largely be resolved. Nonetheless, his view of provable security is rather dim. As an example, he referred to the hash function 4^x 9^y mod p by Chaum et al. This construction is provably secure in the sense that collisions lead to a discrete logarithm. However, finding discrete logarithms in the multiplicative group of integers modulo a prime has become significantly easier since the hash function was first proposed (and if p is a 256 bit prime, finding collisions in the hash function will be easy despite the proof of security).

DJB's interpretation is that a proof of security (or reduction really, but let's not dwell on terminology here) requires structure of a scheme and it is this structure that will often also aid an attacker in breaking the scheme. Rather than looking at reductions, DJB suggest that users should select cryptographic systems based on cryptanalysis, as it is only sustained resistance against cryptanalysis that builds trust in a cryptosystem. DJB's goal is security, not having a proof, and he believes the two are negatively correlated due to the structure required for a proof.

Pino strongly disagreed with DJB, using science as a lens to argue a point. People try to explain a natural phenomenon by building a model for the phenomenon and prove theorems in the model, under the belief that these theorems will have an impact on reality. In cryptology, this knowledge is subsequently used to create real things with certain desired (security) properties. Good science requires that the model describes reality. This means that if a proven scheme is broken, the model must be wrong and needs to be adapted and similarly, if the model claims something to be impossible, it should not be possible in real life. Pino posited that a model should be falsifiable and should not be bizarre, leading to results that are bizar even in retrospect.

The comparison with science, and physics, in particular, led to some heated discussion. One important property of a physics model is its predictive nature: a good model allows prediction of the outcome of new experiments, lending physics an air of verifiability (notwithstanding the possibility of later experiments to falsify current theory). It is not clear how well cryptologic models satisfy this predictive property. Moreover, DJB remarks that in physics there is a common goal of understanding physics, whereas in cryptology choices have to be made (e.g. which candidate will become SHA-3) and the question is what role do security proofs play in this choices, if any.

Michel steers the discussion towards models taking into account side channels and leakage. FX gives a brief presentation on his view, with an emphasis on empirical verifiability and practical relevance. He notices that there is considerable tension between leakage-resilience model on the one hand and side-channel reality on the other. For instance, secure refreshing of the state for every iteration is unlikely to be either practically possible or needed (refreshing at initialization is another matter).

His main point is that the models typically used in leakage-resilient papers seldom capture reality and in some cases make assumptions that are demonstrably false (for instance, a common assumption is that only computation leaks, but at the gate level this is not what is observed in practice). Following Pino's earlier exposition, this prompts the question whether a flaw in the cryptographic model should lead to abolishing the model.

Pino agrees that there are models he is not comfortable with. In his opinion this includes the current leakage models, but also universal composability (UC) which is reknown for its many impossibility results (of things that seem possible in practice) and the random oracle model (where the impossibility results exploit artefacts of the model that). DJB asserts that he random oracle and generic group model are useful in the sense that a reduction using this model excludes certain generic attacks. Pino concludes that he does not understand why people keep using problematic models and offers the insight ``Bad people do bad things.''

All in all a wonderful and lively panel discussion, with great audience participation.

Friday, January 11, 2013

Real World Crypto Workshop - Day 3

After the widespread critique of AES-GCM mode yesterday, the first talk by Shay Gueron today redressed the balance. A number of very impressive performance improvements over the usual AES-CBC+HMAC-SHA1 suite for TLS were given. An interesting mathematical aside was the interpretation of the multiplication step used in GCM mode as a Montgomery multiplication with the field defining polynomial reversed Despite all the optimizations presented for AES-GCM the hash still takes 68% of the computation still.

We then progressed to the "Royal Holloway Hour". Kenny discussed the EMV protocol, and presented details on our Bleichenbacher signing attack on the EMV protocol from last years CT-RSA. An interesting thing was noticed that the attack would be more efficient if one extracted the PIN rather than just obtain the signature on a message. Kenny then went on to discuss the issue of XML and JSON security, where the new standards still use very bad legacy algorithms (PKCS#1v1.5 for public key encryption and AES-CBC mode with no integrity protection for symmetric key encryption). He said how to use a PKCS#1v1.5 decryption oracle to decrypt RSA-OAEP ciphertexts and to breaking AES-CBC allows one to break AES-GCM mode. Thus the presence of the weak algorithms in the standard allows one to break the stronger algorithms. These attacks have been applied to XML Encryption and XML signatures, and awell as JSON encryption and signature algorithms.

The second talk in the Holloway Hour was given by Bertram Poettering. Bertram's aim was to replace certificates (which bind identities to public keys) in such as way as to still maintain a non-interactive verification, but for which a misbehaving CA is penalised by having a fatal failure imposed on it. The proposed technique uses a tagged one-time signature scheme (or TOSS). This is a scheme to sign tag/message pairs. Since the signature is one-time, if a double signature is produced then security is lost.(say by outputing the key, or allowing one to forge other messages). So if the CA issues two certificates on a single ID with different public keys, then the CAs secret key can be extracted. Thus a cheating CA when it cheats has a fatal security breach. So clearly a CA which uses such a scheme is essentially "Giving a TOSS" every time it issues a certificate. It was however unclear how practical this solution would be, and whether the additional cost is worth the limited extra benefit.

The talk by Markus Jakobsson of PayPal discussed various user experiments in the area of security. I will discuss two of the issues presented:

User communication is a big issue. As a bad example of mis-communication, in an experiment making a lock icon bigger made users trust the site more; so simply displaying something can fool a user into thinking they are secure. What we really want, is a user interacting with two IDENTICAL looking web sites should be OK for the legitimate web site and produce an abort for the bad website. A proposed solution (for a phone/tablet etc) is that the user before entering a password (say) into a site has to press an interupt button (say the power button on an iPad); the OS then does a check to say whether the web page is on a white list; if not on a white-list the OS will kill the app. The OS also stops people interacting with the web site without pressing the interupt button. The problem is that the system does not work for "good" web sites which are not on the whitelist (i.e. part of the system).
The second example was about confronting users who are doing something wrong and then lying about it. The experiments were all related to buying items; but they immediately made me feel about how lecturers confront students re plagarism. The basic recommendations from the talk, applied to plagarism, were;

First confront the person with evidence. Do not allow them to commit to a lie first. e.g. Say "This work is very similar to some other work we have found. Have a look".
Make sure the innocent person is given some way of not feeling being accused. e.g. Say "How do you think this has happened? Did you give your work to someone else to help them?"
Give the guilty person a clean simple way out. e.g. Say "We know people make mistakes under pressure; as its the first time this has happened if you admit it then we wont take it further. But make sure it does not happen again".

Dan Boneh talked about the breakthrough result from late last year; namely k-linear maps. Using these maps he showed that one can construct broadcast encryption schemes (such as the ones used in satellite broadcast systems) which have constant size ciphertexts and public key sizes which are logarithmic in the number of recipients in the system. This has been a long standing open problem. The solution was basically to take the old BGW system based on bilinear maps and replace the q-elements in the public key by (log q) elements, and then construct the q-elements back using the binary representations of the associated indexes.

Overall the workshop has been really interesting, with great interaction between academia and industry. The venue of being in Silicon Valley helped ensure that there was a large number of industrial people who were working at the cutting edge.

In the next few days the slides will be made online on the conference website

https://crypto.stanford.edu/RealWorldCrypto/

In addition people might be interested in following the conference feed from Twitter

https://twitter.com/search/realtime?q=%23realworldcrypto&src=hash

Real-World Crypto Workshop: Day 2

Today we saw two big topics being addressed. The first topic was a collection of issues related to electronic payment systems (talks by Kim Wagner and Gaven Watson) and - more general - Authenticated Encryption (talk by Phil Rogaway) and the second topic was about coping with things going wrong (talks by Marc Stevens, Adam Langley and Eric Rescorla). In the following I'm going to focus on the issue of coping with things going wrong.

Both Adam Langley and Eric Rescorla were talking about known security issues within TLS/SSL protocols where fixes are available but with a mixed-to-poor deployment history of those fixes. Eric Rescorla, based on his experience with TLS/SSL pointed out that fixes have a far better chance to achieve high deployment rates fast if they are either

anticipated, provide strong incentives to those who need to deploy them and enjoy strong support from major governments as it was the case with the switch from DES to AES
or if they need to be supported on one side (of a browser - server communication) only, i.e. if they can be applied unilaterally without "breaking the internet".

If however, they need to be supported on both sides and the side that has to pay for it does not reap the profit, as is the case with hash function upgrades where both sides need to support the new hash function and the server has to pay for new certificates while only the clients gain security from being secure against collision attacks, then the deployment of the fix will take a lot more time. Another example of a difficult deployment is the case of AES-GCM which requires adjustments to the API and can not be deployed as a simple drop-in replacement.

Adam Langley took a slightly different tack by pointing out the differences in quality of existing code for cryptographic primitives and urged every cryptographer who presents a new primitive to present it together with a carefully coded (i.e. secure) implementation as this will raise awareness of implementation issues (esp. constant time behaviour) for the designers themselves while significantly reducing the probability that less knowledgeable (with respect to security) implementers create a swamp of vulnerable implementations. Also he presented his "Crypto room 101" of primitives that he didn't want to see in future protocols due to their vulnerability prone implementation: CBC mode, primitives that suffer a "sudden death" in case of entropy failures, DSA, ~~pseudo mersenne primes~~ generalized mersenne primes and AES-GCM.

Adam also stressed the importance of work-arounds¹ for security issues; one fairly novel and impressive workaround was presented by Marc Stevens: Counter-cryptanalysis of the collision attack used by the Flame malware that was discovered in 2012. Using the example of an MD5 based certificate where we know that MD5 is vulnerable the idea of counter-cryptanalysis is to check whether this certificate could have been created by crypt-analytic methods instead of the legitimate signing process.

This is exactly what was done for the Flame malware; Flame did not contain any significant new part except that it is the first known malware to rely on a cryptanalytic attack. A collision attack on MD5 - which is still present as legacy hash function within the PKI infrastructure used by Microsoft (and almost everyone else) - was used to forge a certificate that enabled Flame to insert itself into the Windows security update mechanism. While early examination of Flame already suggested that the certificate might be a forgery, this was only proven using counter-cryptanalysis. In addition, it was possible to show that the creators of flame must have used an unpublished method to construct the MD5 collision, suggesting significant cryptographic skills being involved in the creation of Flame. While this forensic use of counter-cryptanalysis happens offline, Marc also sketched ongoing work to use counter-cryptanalysis to detect attacks online, i.e. while they are happening, so that they can be blocked before systems get compromised.

¹) Eric's stance on work-arounds was more ambiguous as he noted that work-arounds reduce the pressure to fix security issues properly without achieving the same confidence as a proper fix would.

Thursday, January 10, 2013

Real-World Crypto Workshop: Day 1

Today was the first day of the Real-World Crypto Workshop in Stanford, California. This workshop follows on from a similar event held in Cambridge early last year (previously featured in this blog). Like last year the program is filled with lots of interesting speakers so it looks set to be another three days filled with great talks. The workshop aims to provide a meeting ground for both academia and industry. In this blog I will concentrate on two talks from today, one from an academic and the other speaker from industry.

Mihir Bellare's talk "Instantiating Random Oracles", described work aimed to address some of the criticisms of the use of random oracles in security proofs. Random oracles are widely used in proofs of security but there has been much debate in how useful these proofs actually are. Several papers have show separations where schemes can be proved secure when hash functions are modeled as a random oracle but which when implemented in practice with a real hash function would be completely insecure. Such examples however, are not that natural and so the debate has remained on the usefulness of random oracle based proofs. The work in the talk introduces a new security definition for hash functions which would capture the security properties required from a hash function used in practice, when the scheme using such a hash function has a proof which is given in the random oracle model.

The second talk I will describe here was given by Terence Spies from Voltage Security on "Data Privacy Models in the Payments Industry". Focus here was on how credit and debit card transactions are kept secure. Specifically looking at PANs (or Primary Account Numbers), this is the 16-19 digit number shown on the front of each card. In my cases this number is kept in plaintext by the merchant and so is targeted by malicious parties how can then use these details to make fraudulent transactions. The solution to then secure such numbers is to use Format-Preserving Encryption. Such a scheme maintains the format of the message after encryption, i.e. although encrypted it continues to be a 16-19 digit number.