CHES 2017 was held September 25th - 28th in Taipei, Taiwan. This being my first trip to CHES, I was glad to see a mix of academics and people from industry who had ties to cryptographic hardware and embedded systems.
Although I have a limited frame of reference, I feel the standard of the conference was quite high - the presenters all knew what they were talking about in great detail, and were able to accurately describe the contribution they had made to their respective fields.
My favourite talks were in the 'Side-Channel Analysis' and the 'Emerging Attacks' sessions, as the talks in these two sessions in particular were engaging and close to the work I have been doing during my PhD.
However, my obligatory post-conference blog post will be on 'Sliding right into disaster: Left-to-right sliding windows leak', a joint work by Daniel J. Bernstein, Joachim Breitner, Daniel Genkin, Leon Groot Bruinderink, Nadia Heninger, Tanja Lange, Christine van Vredendaal, and Yuval Yarom (I wasn't aware so many people could work on one paper at the same time!).
The contribution of the paper was showing that although the Right-to-Left sliding window doesn't leak a great deal of information, the Left-to-Right sliding window provides just enough to recover the full key (in some cases).
For a brief recap, RSA uses modular exponentiation, and in many implementations the 'sliding window' method is used for efficiency. This can be done either Left-to-Right or Right-to-Left, and although they are very similar, they have very slight differences: the Right-to-Left method tends to be easier to program, uses the same number of multiplications as Left-to-Right, but requires a bit more storage. Both are used in practice: the paper shows that the Libgcrypt crypto library uses the Left-to-Right method (and hence they provide an attack against this implementation).
One way to think about it is that if you want to compute x^25, you would convert the exponent 25 into binary, manipulate this bitstring in some way (depending on whether you are going Left-to-Right or Right-to-Left, and also on the size of your window w), and then parse the bitstring: for every non-zero bit, perform a multiply; for every zero bit, perform a square (or something to that effect).
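To make this concrete, here is a minimal Python sketch of the Left-to-Right variant (an illustrative toy, not Libgcrypt's code): there is a square for every exponent bit, plus a multiplication by a precomputed odd power for each window.

```python
def sliding_window_pow(x, e, m, w=4):
    """Left-to-right sliding-window exponentiation: x**e mod m (toy sketch)."""
    # Precompute the odd powers x^1, x^3, ..., x^(2^w - 1) mod m.
    x2 = (x * x) % m
    odd_powers = {1: x % m}
    for i in range(3, 1 << w, 2):
        odd_powers[i] = (odd_powers[i - 2] * x2) % m

    bits = bin(e)[2:]
    result, i = 1, 0
    while i < len(bits):
        if bits[i] == '0':
            result = (result * result) % m          # square on a zero bit
            i += 1
        else:
            # Longest window of length <= w that ends in a 1-bit.
            j = min(i + w, len(bits))
            while bits[j - 1] == '0':
                j -= 1
            for _ in range(j - i):
                result = (result * result) % m      # one square per bit
            result = (result * odd_powers[int(bits[i:j], 2)]) % m
            i = j
    return result

assert sliding_window_pow(7, 25, 1000003) == pow(7, 25, 1000003)
```

The sequence of squares and multiplications (and which table entry each multiplication touches) is exactly the kind of information the side channel reveals.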
In the Right-to-Left method, due to the way the manipulated bitstring is created, we are guaranteed to have w - 1 zero bits after every non-zero bit. From a leakage point of view, this doesn't provide much information.
However, in the Left-to-Right method, two non-zero bits can be as close as adjacent. This allows us to infer certain details about the bitstring by applying certain rules to what we know (the leakage), and in some cases, working out the value of the key.
If we are able to recover >50% of the key bits this way, we can implement an efficient Heninger-Shacham attack to recover the remaining bits.
The paper was presented by Leon Groot Bruinderink, and he explained it in such a way that I found it clear to understand how the attack works, and how one would protect against this kind of attack (not using Left-to-Right would be a start). The authors also contacted Libgcrypt with details of the attack, and it has been fixed in version 1.7.8.
Aside from the papers, CHES has been pretty amazing: the venue was a 5 star hotel in the centre of Taipei, the food was absolutely incredible (even the banquet, which introduced me to the wonders of sea cucumber), and the excursion to the National Palace Museum in Taipei was exceptionally educational (which as we all know is the best kind of fun).
I would definitely recommend CHES to anyone interested in the more practical side of cryptography, although if it ever takes place in Taiwan again, I strongly suggest you Youtube how to use chopsticks. Unfortunately I never learnt, and after a humiliating trip to the ShiLin Night Market, am now featured on several locals' phones in a video named 'The Tourist who couldn't eat Beef Noodle Soup'.
Tuesday, September 5, 2017
Crypto 2017 - How Microsoft Wants to Fix the Internet (Spoiler: Without Blockchains)
In the second invited talk at Crypto, Cédric Fournet from Microsoft Research presented the recent efforts of Project Everest (Everest VERified End-to-end Secure Transport), which seems to be an attempt to fix the implementation of TLS once and for all. Appropriately for such a gigantic task, more than a dozen researchers on three continents (and the UK) work on making it verifiable and efficient at the same time.
As with every self-respecting talk in the area, it started with the disaster porn that is the history of TLS (our lives depend on it, but it's complicated, there have been 20 years of failures, insert your favourite attack here). However, the Crypto audience hardly needs preaching to (just a reminder that the work isn't done with a proof sketch; performance and error handling also matter), so the talk swiftly moved on to the proposed solutions.
The story starts in 2013 with miTLS, which was the first verified standard-compliant implementation. However, it still involved hand-written proofs and was more of an experimental platform. Enter Everest: They want to tick all the boxes by providing verified drop-in replacements for the HTTPS ecosystem. It covers the whole range from security definitions to code with good performance and side-channel protection.
As an example, Cédric presented Poly1305, a MAC that uses arithmetic modulo $2^{130}-5$ and forms part of the upcoming TLS 1.3 specification. Unsurprisingly, bugs have already been found in OpenSSL's implementation. Project Everest has implemented Poly1305 in ~3,000 lines of code in Low*, a subset of F* (a functional language) that allows both C-style programming (with pointer arithmetic) and verification. Compiling this code with KreMLin (another output of Project Everest) results in machine code that is as fast as hand-written C implementations. The same holds for ChaCha20 and Curve25519.
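For reference, here is what the same MAC looks like as a Python toy (big-integer arithmetic following RFC 8439, so nothing like the optimised, verified Low* code; the test vector is the one from the RFC):

```python
def poly1305_mac(msg: bytes, key: bytes) -> bytes:
    p = (1 << 130) - 5
    # Load and clamp r, load s (little-endian, per RFC 8439).
    r = int.from_bytes(key[:16], "little") & 0x0ffffffc0ffffffc0ffffffc0fffffff
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        # Each (up to 16-byte) block gets a 0x01 byte appended on top.
        n = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = ((acc + n) * r) % p
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")

key = bytes.fromhex("85d6be7857556d337f4452fe42d506a8"
                    "0103808afb0db2fd4abff6af4149f51b")
tag = poly1305_mac(b"Cryptographic Forum Research Group", key)
assert tag == bytes.fromhex("a8061dc1305136c6c22b8baf0c0127a9")
```

The verified Low* code has to compute the same answers while juggling limb representations and pointer arithmetic, which is exactly where implementation bugs tend to creep in.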
However, hand-written assembly is still faster. The project aims to catch up on this with Vale, which was published at USENIX Security this year. Vale promises extensible, automated assembly verification and side-channel protection.
So what is the way forward? TLS 1.3 is on the horizon, bringing various improvements at the cost of a considerable re-design. This requires new implementations, and the project hopes to be first to market with an efficient and verifiable one.
For the rest of the talk, Cédric gave more details on how F* naturally lends itself to security games, and how they proved concrete security bounds for the TLS 1.2 and 1.3 record ciphersuites.
All in all, I think it was a great example for an invited talk at Crypto, putting cryptography in a bigger context and bringing work that isn't necessarily on a cryptographer's radar to our attention.
Tuesday, August 22, 2017
Crypto 2017 - LPN Decoded
Crypto 2017 kicked off this morning in Santa Barbara. After a quick eclipse-watching break, the lattice track ended with the presentation of LPN Decoded by Andre Esser, Robert Kübler, and Alexander May.
Learning Parity with Noise (LPN) is used as the underlying hardness problem in many cryptographic protocols. It is equivalent to decoding random linear codes, and thus offers strong security guarantees. The authors of this paper propose a memory-efficient algorithm for the LPN problem. They also propose the first quantum algorithm for LPN.
Let's first recall the definition of the LPN problem. Let the secret be $s \in \mathbb{F}_2^k$ and consider samples $(\mathbf{a}_i, b_i)$, where $b_i = \langle \mathbf{a}_i, s \rangle + e_i$ for errors $e_i \in \{0,1\}$ with $\Pr(e_i=1) = \tau < 1/2$. From the samples $(\mathbf{a}_i, b_i)$, recover $s$. We see that the two LPN parameters are $k$ and $\tau$. Notice that LPN can be seen as a special case of the Learning with Errors (LWE) problem; in fact, LWE originated as an extension of LPN.
If $\tau$ is zero, we can draw $k$ independent samples and solve for $s$ by Gaussian elimination. This can also be extended to an algorithm for $\tau < 1/2$, by computing a guess $s'$ and testing whether $s'=s$. This works well for small $\tau$, for example $\tau = 1/\sqrt{k}$, used in some public key encryption schemes. Call this approach Gauss.
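As a toy illustration of Gauss (my own sketch, not the authors' code): draw batches of $k$ samples, hope they are error-free, solve by Gaussian elimination over $\mathbb{F}_2$, and test the candidate against fresh samples.

```python
import numpy as np

rng = np.random.default_rng(1)

def lpn_samples(s, m, tau):
    """m LPN samples: b = As + e over F_2 with Pr[e_i = 1] = tau."""
    A = rng.integers(0, 2, size=(m, len(s)), dtype=np.uint8)
    e = (rng.random(m) < tau).astype(np.uint8)
    return A, (A @ s + e) % 2

def solve_f2(A, b):
    """Gauss-Jordan over F_2; returns s with As = b, or None if singular."""
    A, b = A.copy(), b.copy()
    m, k = A.shape
    row = 0
    for col in range(k):
        pivots = np.nonzero(A[row:, col])[0]
        if len(pivots) == 0:
            return None                      # singular: retry with new samples
        r = row + pivots[0]
        A[[row, r]], b[[row, r]] = A[[r, row]], b[[r, row]]
        for rr in range(m):
            if rr != row and A[rr, col]:
                A[rr] ^= A[row]
                b[rr] ^= b[row]
        row += 1
    return b[:k]

k, tau = 16, 0.05
s = rng.integers(0, 2, size=k, dtype=np.uint8)
A_test, b_test = lpn_samples(s, 200, tau)
while True:
    guess = solve_f2(*lpn_samples(s, k, tau))
    if guess is not None and np.mean((A_test @ guess + b_test) % 2) < 2 * tau:
        break
print("recovered:", np.array_equal(guess, s))
```

Each batch is error-free with probability $(1-\tau)^k$, which is why this only works well for small $\tau$.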
For larger constant $\tau$, the best current algorithm is BKW. However, although BKW has the best running time, it cannot be implemented for even medium-size LPN parameters because of its memory consumption. Further, BKW has a bad running-time dependency on $\tau$. Both algorithms also require many LPN oracle calls.
The authors take these two as a starting point. They describe a Pooled Gauss algorithm, which only requires a polynomial number of samples. From those, they look for error-free samples, similar to Gauss. The resulting algorithm has the same space and running time complexity, but requires significantly fewer oracle calls. It has the additional advantage of giving rise to a quantum version, where Grover search can be applied to save a square-root factor in the running time.
They then describe a second hybrid algorithm, where a dimension reduction step is added. Thus Well-Pooled Gauss proceeds in two steps. First, reduce the dimension $k$ to $k'$ (for example, using methods such as BKW). Then, decode the smaller instance via Gaussian elimination. The latter step is improved upon by using the MMT algorithm.
For full results, see the original paper. Their conclusion is that Decoding remains the best strategy for small $\tau$. It also has quantum optimisations and is memory-efficient. The Hybrid approach has in fact no advantage over this for small values of $\tau$. For larger values, however, they manage to solve for the first time an LPN instance of what they call medium parameters ($k = 243$, $\tau = 1/8$) in 15 days.
Tuesday, May 2, 2017
Eurocrypt 2017 - Parallel Implementations of Masking Schemes and the Bounded Moment Leakage Model
Side-channel analysis made its way into Eurocrypt this year thanks to two talks, the first of which given by François-Xavier Standaert on a new model to prove security of implementations in. When talking about provable security in the context of side-channel analysis, there is one prominent model that comes to mind: the d-probing model, where the adversary is allowed to probe d internal variables (somewhat related to d wires inside an actual implementation) and learn them. Another very famous model, introduced ten years later, is the noisy leakage model in which the adversary is allowed probes on all intermediate variables (or wires) but the learnt values are affected by errors due to noise. To complete the picture, it was proved that security in the probing model implies security in the noisy leakage one.
The work of Barthe, Dupressoir, Faust, Grégoire, Standaert and Strub is motivated precisely by the analysis of these two models in relation to how they deal with parallel implementations of cryptosystems. On one hand, the probing model admits a very simple and elegant description and proof techniques, but it is inherently oriented towards serial implementations; on the other hand, the noisy leakage model naturally includes parallel implementations in its scope but, admitting the existence of noise in leakage functions, it lacks simplicity. The latter is particularly important when circuits are analysed with automated tools and formal methods, because these can rarely deal with errors.
The contribution of the paper can then be summarised as the definition of a new model aiming to combine the advantages of both previous models: the Bounded Moment leakage Model (BMM). The authors show how it relates to the probing model and give constructions that are secure in their model. In particular, they prove that the BMM is strictly weaker than the probing model, in that security in the latter implies security in the former, and they give a counterexample showing that the opposite does not hold. The informal definition of the model given during the talk is the following:
An implementation is secure at order o in the BMM if all mixed statistical moments of order up to o of its leakage vectors are independent of any sensitive variable manipulated.
A parallel multiplication algorithm and a parallel refreshing algorithm are the examples brought in to show practical cases where the reduction between the models holds. The statement of the reduction is the following:
A parallel implementation is secure at order o in the BMM if its serialisation is secure at order o in the probing model.

The falsity of the converse is shown in a slightly different setting, namely that of continuous leakage: the adversary does not just learn values carried by some wires by probing them, but can repeat such an operation as many times as desired and move the probes adaptively. Clearly this is a much stronger adversary, in that accumulation of knowledge over multiple probing sessions is possible, which is used as a counterexample to show that security in the continuous BMM does not imply security in the continuous probing model. The refreshing scheme mentioned above can easily be broken in the latter after a number of iterations linear in the number of shares, but not in the former, as adapting the position of the probes does not help: an adversary in the BMM can already ask for leakage on a bounded function of all the shares.
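To get a feel for the definition, here is a toy simulation (my own sketch, not from the paper): a two-share Boolean masking whose shares leak simultaneously through a single noisy Hamming-weight sample. The first-order moments of the leakage are independent of the secret bit, as first-order BMM security requires, while the second-order moments are not.

```python
import numpy as np

rng = np.random.default_rng(0)

def parallel_leakage(secret_bit, n_traces, sigma=0.2):
    """One noisy sample per trace: Hamming weight of both shares at once."""
    share0 = rng.integers(0, 2, size=n_traces)
    share1 = secret_bit ^ share0            # the two shares XOR to the secret
    return share0 + share1 + sigma * rng.standard_normal(n_traces)

l0 = parallel_leakage(0, 1_000_000)
l1 = parallel_leakage(1, 1_000_000)
print(l0.mean(), l1.mean())   # first moments match: both ~1.0
print(l0.var(), l1.var())     # second moments differ: ~1 + sigma^2 vs ~sigma^2
```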
Eurocrypt 2017: On dual lattice attacks against small-secret LWE and parameter choices in HElib and SEAL
This morning, Martin gave a great talk on lattice attacks and parameter choices for Learning With Errors (LWE) with small and sparse secrets. The work presents new attacks on LWE instances, yielding revised security estimates: the exponent of the dual lattice attack is revised by a factor of $2L/(2L+1)$, for $\log q = \Theta(L \log n)$. The paper exploits the fact that most lattice-based FHE schemes use short and sparse secrets. We will write $q$ to denote the LWE modulus throughout.
Let's first have a look at the set-up. Remember that LWE consists of distinguishing between pairs $(A, As+e)$ and $(A, b)$. In the first instance, $A$ is selected uniformly at random and $e$ is selected from a special (usually Gaussian) distribution. In the second one, both $A$ and $b$ are uniformly random. Selecting $s$, as this work shows, is perhaps trickier than previously thought. Theory says that, in order to preserve security, selecting a short and sparse secret $s$ means the dimension must be increased to $n \log_2 q$. Practice says just ignore that and pick a small secret anyway. More formally, HElib typically picks a secret $s$ such that exactly $h = 64$ entries are in $\{-1,1\}$ and all the rest are 0. SEAL picks uniformly random secrets in $\{-1,0,1\}$.
We also recall that the dual lattice attack consists of finding a short vector $w$ such that $wA = 0 \pmod q$, then checking whether
$\langle w, As+e \rangle = \langle wA, s \rangle + \langle w, e \rangle = \langle w, e \rangle$
is short. If we are in the presence of an LWE sample, $e$ is short, so the inner product is short. Short times short is short, as any good cryptographer can tell you.

The improvements presented in this paper rely on three main observations. Firstly, a revised dual lattice attack is presented. This step is done by adapting BKW-style algorithms in order to increase efficiency and can be done in general, i.e. it does not depend on either the shortness or the sparseness of the secret. It is achieved by applying BKZ to the target basis, then re-randomising the result and applying BKZ again, with a different block size.
The second optimisation exploits the fact that we have small secrets. We observe that we can relax the condition on the dual vector somewhat. Indeed, if $s$ is short, then finding $v$ such that $vA$ is short instead of zero is good enough. Therefore, we look for vectors $(v, w)$ in the lattice
$L = \{(y, x) : yA = x \pmod q\}.$
Now in small secret LWE instances, $\|s\| < \|e\|$, and so we may allow $\|w\| > \|v\|$ such that

$\|\langle w, s \rangle\| \approx \|\langle v, e \rangle\|.$
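As a toy illustration of the underlying distinguishing step (a minimal sketch with a planted short dual vector; a real attack finds one with BKZ):

```python
import numpy as np

rng = np.random.default_rng(42)
q, n, m = 12289, 32, 64

# Plant a short dual vector: ternary v with v A = 0 (mod q) by construction.
v = rng.integers(-1, 2, size=m)
v[-1] = 1
A = rng.integers(0, q, size=(m, n))
A[-1] = (-(v[:-1] @ A[:-1])) % q

s = rng.integers(-1, 2, size=n)                      # small secret
e = np.rint(rng.normal(0, 3.2, size=m)).astype(int)  # Gaussian error
b_lwe = (A @ s + e) % q
b_uni = rng.integers(0, q, size=m)

centred = lambda x: ((x + q // 2) % q) - q // 2      # into (-q/2, q/2]
print("LWE sample:", centred(v @ b_lwe % q))         # small: equals <v, e>
print("uniform:   ", centred(v @ b_uni % q))         # spread over all of Z_q
```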
Finally, the sparsity of the small secret is exploited. This essentially relies on the following observation: when s is very sparse, most of the columns of A become irrelevant, so we can just ignore them.
The final algorithm SILKE is the combination of the three above steps. The steps are the following.
- Perform BKZ twice with different block sizes to produce many short vectors
- Scale the normal form of the dual lattice
- If sparse, ignore the presumed zero columns, correct for mistakes by checking shifted distribution
As usual, Martin wins Best Slides Award for including kittens.
Wednesday, April 12, 2017
Is Your Banking App Secure?
Last week I was in Malta for Financial Cryptography and Data Security 2017 to present my recent work on securing the PKCS#11 cryptographic API.
One talk that stood out for me was by researchers from the University of Birmingham, who looked for vulnerabilities in the mobile apps provided by major UK banks.
Sadly, they found major weaknesses in apps from 5 of the 15 banks they investigated.
Several apps use certificate pinning, where the app hard-codes a certificate from a trusted CA and only accepts public keys that are signed by the pinned certificate.
This is good practice, as an attacker can add their own certificate to the phone's trust store, but it won't be accepted by the app.
However, two Android apps (for Natwest and Co-op) accepted any public key signed by the pinned certificate, without checking the domain name!
So the attack works as follows:
- Purchase a certificate for a domain you own from the trusted CA
- The app will accept your public key with this certificate
- Man-in-the-middle all the encrypted traffic between the user and their bank.
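For contrast, here is a minimal sketch of the client-side check the two apps skipped, using Python's standard ssl module (the CA bundle and host name are placeholders):

```python
import socket, ssl

# Pin by trusting ONLY our CA bundle, but still verify the host name.
context = ssl.create_default_context(cafile="pinned_ca.pem")
context.check_hostname = True            # the step that was missing
context.verify_mode = ssl.CERT_REQUIRED

host = "online.example-bank.co.uk"
with socket.create_connection((host, 443)) as sock:
    # server_hostname triggers both SNI and host-name verification
    with context.wrap_socket(sock, server_hostname=host) as tls:
        print(tls.getpeercert()["subject"])
```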
Curiously, the authors note: "Co-op [...] hired two penetration testing companies to test their apps, both of which had missed this vulnerability". It seems odd that such an obvious mistake wasn't picked up in testing.
The group also found that several banks - Santander, First Trust and Allied Irish - served adverts to their app users over unencrypted HTTP, meaning an attacker could spoof these ads and mount a phishing scam, perhaps by displaying a fake 'security warning' and directing users to re-enter their account details on a malicious page. It was pointed out in the talk that we're much more likely to 'feel safe' within an app (and hence trust all the content we see) than, say, visiting a webpage using a laptop, so this kind of in-app phishing scam could be very effective.
There are even more exploits described in the paper.
It was refreshing to hear that the vulnerable banks responded well to the disclosures made by the Birmingham group and patched their apps as a result. But I'm a little baffled that these basic errors were ever made in such security critical applications.
Wednesday, March 29, 2017
PKC 2017: Kenny Paterson accepting bets on breaking TLS 1.3
Kenny Paterson, a member of the TLS 1.3 working group, is willing to bet a beer that the 0-RTT handshake of TLS 1.3 will get broken in its first two years.
In his invited talk, Kenny managed to fill a whole hour on the history of SSL/TLS without even mentioning symmetric cryptography beyond keywords, thus staying within the topic of the conference. Despite all versions of SSL being broken to at least some degree, the standardised TLS became the most important security protocol on the Internet.
The core part of TLS is the handshake protocol, which establishes the choice of ciphers and the session key. Kenny highlighted the high complexity stemming from the many choices (e.g., using a dedicated key exchange protocol or not) and the possible interaction with other protocols in TLS. Together with further weaknesses of the specification, this created the space for the many attacks we have seen. On the upside, these attacks reflect an increased attention by academics, which comes together with an increased attention by society as a whole. Both have laid the ground for improvements in both the deployment and future versions of TLS. For example, the support of forward secrecy has increased from 12 percent to 86 percent according to SSL Pulse.
Turning to concrete attacks, most important in the area of PKC is the Bleichenbacher attack, published as early as Crypto 1998 (a human born then would be considered a full adult at the conference venue now). Essentially, it exploits that RSA with the padding used in TLS is not CCA-secure, and it recovers the session key after roughly $2^{20}$ interactions with a server. Nevertheless, the TLS 1.0 specification published shortly after Bleichenbacher's publication incorporates the problematic padding (recommending mitigation measures), and later versions retain it for compatibility. The DROWN attack shows the danger of this by exploiting the fact that many servers still offer SSLv2 (about 8% of the Alexa top 200k) and that it is common to use the same key for several protocol versions. An attacker can recover the session key of a TLS session by replaying a part of it in an SSLv2 session that uses the same key.
On a more positive note, Kenny presented the upcoming TLS 1.3, which has been under development since 2014. It addresses a lot of the weaknesses of previous versions by involving academics from an early stage and doing away with a lot of the complexity (including reducing options and removing ciphers). It furthermore aims to decrease the latency of the handshake by allowing the parties to send encrypted data as early as possible, reducing the round trip time to one in many cases. The goal of low latency has also led to the inclusion of a QUIC-style 0-RTT mode, that is, the client can send data already in the first message when resuming a session. However, 0-RTT is not fully forward-secure and is therefore confined to a separate API. Nevertheless, Kenny predicts that the sole availability will be too tempting for developers, hence the bet offered.
Concluding, he sees three major shifts in TLS this far: from RSA to elliptic-curve Diffie-Hellman, to Curve25519, and away from SHA-1 in certificates. A fourth shift might happen with the introduction of post-quantum algorithms such as Google's deployment of New Hope. Less optimistically, he expects that implementation vulnerabilities will continue to come up.
Update: An earlier version of this post mentioned the non-existing Curve255199 instead of Curve25519, and it attributed New Hope to Google.
Monday, March 27, 2017
Tools for proofs
Security proofs for even simple cryptographic systems are dangerous and ugly beasts. Luckily, they are only rarely seen: they are usually safely kept in the confines of "future full versions" of papers, or only appear in cartoon-ish form, generically labelled as... "proof sketch".
The following two quotes frame the problem in less metaphorical terms.
"In our opinion, many proofs in cryptography have become essentially unverifiable. Our field may be approaching a crisis of rigor."

Bellare and Rogaway (circa 2004)
"Do we have a problem with cryptographic proofs? Yes, we do [...] We generate more proofs than we carefully verify (and as a consequence some of our published proofs are incorrect)."

Halevi (circa 2005)
Solutions developed by cryptographers, e.g. compositional reasoning and the game-hopping technique, help to structure proofs and reduce their complexity, and therefore alleviate to some extent the pain of having to develop rigorous proofs. Yet, more often than not, proofs are still sketchy and shady.
Help comes from the programming languages community, which has long experience with developing tools for proving that programs work correctly and... cryptographic systems are just programs. Recent progress, e.g. the automated verification of parts of TLS and fully verified security proofs of masking schemes that defeat leakage, is impressive and exciting. More work is under way.
If you want to learn more about how you can get someone else to do the proofs for you or, more realistically, learn about what existing tools can currently do and what they could do in the future, and discuss what is needed and which way to go, then you should attend the
Workshop on Models and Tools for Security Analysis and Proofs
-- Paris, 29th of April; co-located with EuroSnP and Eurocrypt --
which the Bristol Crypto group helps organize. The workshop features as speakers some of the most prominent researchers that are contributing to this direction. You can register for the workshop HERE. Early registration ends March 31st!
But wait...there is more. If you want to explore this area beyond what a one-day workshop allows, then you should consider attending the
-- Nancy, France, July 10th - 13th --
See you all in Paris and/or Nancy!
Tuesday, February 21, 2017
Homomorphic Encryption API Software Library
The Homomorphic Encryption Application Programming Interface (HE-API) software library is an open-source software library being developed as part of the Homomorphic Encryption Applications and Technology (HEAT) project, and is available here. The main purpose of this software library is to provide a common easy-to-use interface for various existing Somewhat Homomorphic Encryption (SHE) libraries. Limited support for fixed-point arithmetic is also provided. Note that the HE-API library is still a work in progress.
Fully Homomorphic Encryption (FHE) is a cryptographic primitive that allows meaningful manipulation of ciphertexts. In spite of several recent advances, FHE remains out of practical reach. Hence a reasonable restriction to make is to limit the set of evaluated circuits to a specified subclass, usually determined by the multiplicative depth of the circuit. Such encryption schemes are called SHE schemes. Various libraries such as HElib, SEAL, FV-NFLlib, HElib-MP, etc., are already available that implement these SHE schemes.
The purpose of this HE-API software library is to provide a common, generic, easy-to-use interface for various existing libraries that implement SHE schemes. The SHE libraries that are currently integrated in the HE-API library are HElib and FV-NFLlib. It may be noted that the FV-NFLlib library is itself an outcome of the HEAT project. At a high level, the HE-API software library abstracts out the technicalities present in the underlying SHE libraries. For instance, the HElib library implements the BGV SHE scheme, while FV-NFLlib implements the FV SHE scheme. Needless to say, the syntax for various classes and routines in the individual libraries will be different, though the underlying semantics are very similar. The HE-API library integrates the underlying SHE libraries under a single interface, thereby shielding the user from syntactic differences. Another feature of the HE-API library is that it contains a minimal, yet complete, set of routines to perform homomorphic computations. The design of this library is motivated by ease of use for non-experts.
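To illustrate the idea only (a hypothetical Python sketch, not the actual HE-API interface), such a common SHE wrapper boils down to a handful of routines that every backend must provide, so that application code can be written once:

```python
from abc import ABC, abstractmethod

class SHEBackend(ABC):
    """Hypothetical common surface over SHE libraries (HElib/BGV, FV-NFLlib/FV, ...)."""

    @abstractmethod
    def keygen(self, mult_depth: int): ...
    @abstractmethod
    def encrypt(self, plaintext: int): ...
    @abstractmethod
    def add(self, ct1, ct2): ...
    @abstractmethod
    def multiply(self, ct1, ct2): ...
    @abstractmethod
    def decrypt(self, ct) -> int: ...

def encrypted_inner_product(backend: SHEBackend, xs, ys):
    """Depth-1 homomorphic inner product, written once for any backend."""
    terms = [backend.multiply(backend.encrypt(x), backend.encrypt(y))
             for x, y in zip(xs, ys)]
    acc = terms[0]
    for t in terms[1:]:
        acc = backend.add(acc, t)
    return backend.decrypt(acc)
```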
Supported Data Types
The following application data types are supported by the HE-API software library.
- Boolean
- Unsigned long integers
- GMP's arbitrary precision integers class: mpz_class
- Polynomials with coefficients of type: unsigned long integers or mpz_class
- Vectors of: unsigned long integers or mpz_class
- Fixed-point numbers
Friday, January 13, 2017
RWC 2017 - Secure MPC at Google
This talk was given by Ben Kreuter and its focus was on the apparent disparity between what we research in academia versus what is required in the real world, specifically in the field of multi-party computation (MPC). MPC is the idea of allowing multiple parties to compute some function on their combined input without any party revealing anything about their input to the other parties (other than what can be learnt from the output alone).
While significant work has been done on making MPC efficient in practice (for example, the work of Yehuda Lindell et al. on high-throughput MPC which was presented by Lindell in the preceding talk), the focus tends to be on generic protocols (e.g. general logic circuits) with strong security guarantees (e.g. malicious security), which invariably leads to large computational overhead. In practice, we usually require only specific protocols, which can therefore be optimised, and comparatively weak security guarantees.
In the real world, network cost is the salient factor, rather than the speed of the protocol, since the parties who are involved in a computation often have to use networks (such as the Internet) which are being used by many other people at the same time and cannot make the best use of the network's full capabilities. The MPC at Google is about computation amongst, for example, mobile phones, laptops and servers; this introduces issues like battery constraints and the possibility of the computation not completing; these considerations, firmly grounded in the real world, are important when developing MPC techniques in research.
Business applications
A large portion of Google's revenue is generated by advertising: the tech giant, well known for its aptitude for accurately determining users' desired search results even when queries are expressed ineloquently, specialises in serving personalised adverts to its wide spectrum of users. The efficacy of an advert is generally measured by the proportion of its viewers who later become customers. Clearly this could be measured by businesses comparing their databases of customers' transactions with Google's databases of who has been shown which adverts. This, however, would be an invasion of privacy: instead, Google and the business can use MPC: more specifically, a private set intersection protocol.
In a private set intersection protocol, the parties involved compute how large the intersection is amongst the sets input by each party, or even some function on those elements in the intersection. So if the business and Google compute a private set intersection protocol on their data, they can determine how well the advertising went.
Roughly speaking, the MPC Google does in the real world is as follows: Google has a set $\{g_1,g_2,...,g_n\}$ of field elements which encodes a set of people who have been shown an advert for a certain product, and a business has a set $\{b_1,b_2,...,b_m\}$ of field elements which encodes a set of people who have been sold the product in question; Google raises each of its elements to a power $G$ and sends the set $\{g_1^G,g_2^G,...,g_n^G\}$ to the business. The business does the same with its elements for some exponent $B$ to get $\{b_1^B,b_2^B,...,b_m^B\}$, encrypts a set of binary vectors under Paillier encryption (which is additively homomorphic), one corresponding to each element in its set, encoding some other property of the sales (like the amount paid), and also computes the set $\{g_1^{GB},g_2^{GB},...,g_n^{GB}\}$. The business sends Google the set of pairs $\{(b_1^B,P(v_1)),(b_2^B,P(v_2)),...,(b_m^B,P(v_m))\}$ along with $\{g_1^{GB},g_2^{GB},...,g_n^{GB}\}$, and Google computes $\{b_1^{GB},b_2^{GB},...,b_m^{GB}\}$ and adds together all encrypted vectors $P(v_i)$ for which there exists some $j$ such that $b_i^{GB}=g_j^{GB}$. It sends this ciphertext back to the business, which decrypts and interprets the result.
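Here is a toy sketch of the core of that flow (my own illustration: the Paillier layer carrying the per-sale values is omitted, and exponentiation modulo the prime $2^{255}-19$ stands in for a properly chosen group):

```python
import hashlib, secrets

p = 2**255 - 19   # a prime; a toy stand-in for a real DDH-hard group

def h(x: str) -> int:
    """Toy hash-to-group of an identifier."""
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % p

google_ads  = ["alice", "bob", "carol"]   # shown the advert
business_tx = ["bob", "carol", "dave"]    # bought the product

G = secrets.randbelow(p - 3) + 2          # Google's secret exponent
B = secrets.randbelow(p - 3) + 2          # business's secret exponent

# Round 1: each side sends its hashed identifiers raised to its exponent.
from_google   = [pow(h(x), G, p) for x in google_ads]
from_business = [pow(h(x), B, p) for x in business_tx]

# Round 2: each side raises the other's elements to its own exponent, so
# matching identifiers collide at h(x)^(GB) without ever being revealed.
double_g = {pow(y, B, p) for y in from_google}
double_b = {pow(y, G, p) for y in from_business}
print("intersection size:", len(double_g & double_b))   # 2 (bob, carol)
```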
This protocol is very simple, and it is only passively secure (players are assumed to execute the protocol faithfully but will possibly try to learn things by inspecting their communication transcripts). An interesting, perhaps somewhat orthogonal, concern to how we approach research from an academic point of view is that it is important that we can convey the security and efficiency of our protocols to the lawyers, managers and software engineers who will eventually be sanctioning, authorising or implementing them. "The lawyers are interesting because you can show them a proof, and two plus two equals four is a negotiable statement here... managers usually trust your expertise...and software engineers are the worst because they already assume [the protocol] is impossible."
An alternative solution using garbled circuits was explored in the recent past, but it turned out that their use required some subtle assumptions regarding the computation and communication which would have made the protocol impractical.
Future work would involve getting a (not too much more expensive) maliciously secure protocol and developing the use of the homomorphic encryption to allow different functions to be computed on the data in the intersection.
Consumer applications
The Android keyboard app by Google, Gboard, logs what a user types so that it can guess words for auto-completion in the future. This data could be used for training machine learning models, and merging results from many local models would enable the formation of guessing algorithms that work well for everyone. However, to do this, the server would need to receive a large dataset of words typed by each user so that this processing could be done. Clearly there is an issue of privacy here; moreover, there is also potentially a differential privacy issue.

This is clearly a good situation in which to use MPC. Each party masks their data using a basic additive secret-sharing scheme: if each party has a vector to input, then for every coordinate, every pair of parties agrees on some random field element, which one party subtracts and the other adds to that coordinate of their vector. When the parties send their masked vectors to Google, the masks cancel when added together.
In practice, they use a PRG and perform a key exchange (in which one key is given to each pair of parties, for every possible pair) at the beginning to achieve the same effect but with much smaller communication overhead. They also have a trick for dealing with device failures (which is important given the application).
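A toy sketch of the pairwise masking idea (plain integers and fresh randomness standing in for the PRG-derived masks of the deployed system):

```python
import numpy as np

rng = np.random.default_rng(7)
n_parties, dim = 5, 4
data = rng.integers(0, 100, size=(n_parties, dim))  # one row per phone

# Every pair (i, j) shares a random mask; i adds it, j subtracts it.
masked = data.astype(np.int64)
for i in range(n_parties):
    for j in range(i + 1, n_parties):
        mask = rng.integers(0, 1_000_000, size=dim)
        masked[i] += mask
        masked[j] -= mask

# Individual rows now look random, but the masks cancel in the sum.
assert np.array_equal(masked.sum(axis=0), data.sum(axis=0))
```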
This talk provided helpful and relevant insight into the importance of matching what we research with what we require in the real world, which is, after all, one of the main reasons for having conferences such as Real World Crypto. Many of the talks are available to watch online here, and I would highly recommend doing so if interested.
Thursday, January 12, 2017
RWC 2017 - Is Password Insecurity Inevitable?
Fresh back from an enlightening trip across the pond, I wanted to write about one of my favourite talks, all about password (in)security, from this year's Real World Cryptography conference.
As we know:
- Passwords protect everything.
- Passwords are terrible.
The basic idea - this isn't new - is to have the user memorise a single master password that they use to access a password store. Then the password store derives unique pseudorandom passwords for each service the user wants to access (Google, Facebook, etc.) The problem with this solution is that the password store becomes a single point of failure: if it is compromised, then an offline dictionary attack to find the master password will compromise all of the user's accounts at once.
Krawczyk et al. suggest an improvement: SPHINX, which amusingly stands for "a password Store that Perfectly Hides from Itself (No eXaggeration)". The first idea is for the password store not to keep hold of (even a hash of) the master password - instead it has an independent secret key $k$, and any time the user wants to log in to a service $S$, they send the master password $pwd$ to the store; the store computes a PRF $PRF(k, pwd | S)$, and this is sent to $S$ as the user's password for $S$. This means that if the store is compromised, the master password and the account passwords can't be learned unless the user communicates with the store. So this works well if the store is in local, offline hardware, where the user is unlikely to use the store after it is compromised by an attacker.
However, the authors go further and replace the PRF with an oblivious PRF. This means the store computes an "encrypted" version of $PRF(k, pwd | S)$ from an "encrypted" $pwd|S$, so doesn't learn the plaintext values of the master password or the service password. In practice this can be achieved by the user (i.e. the user's machine) hashing the string $pwd | S$ into an element $g$ of a Diffie-Hellman group, then computing $h = g^r$, where $r$ is a fresh, random exponent, and sending $h$ to the password store. The store's secret key is an exponent $a$, so it computes $h^a$ and sends this back to the user. The user removes the blinding exponent $r$ (i.e. computes $(h^a)^{r^{-1}} = g^a$) and the result is the unique password for $S$. Now even when the password store is compromised and even if the user communicates with the store, the master password and the account passwords can't be learned.
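Here is a toy sketch of that blinded exchange (my own illustration: exponentiation modulo the prime $2^{255}-19$ stands in for a well-chosen group, and SHA-256 stands in for a proper hash-to-group map):

```python
import hashlib, math, secrets

p = 2**255 - 19   # a prime; a toy stand-in for a real group

def hash_to_group(pwd: str, service: str) -> int:
    data = f"{pwd}|{service}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % p

# --- password store: holds only an independent secret exponent a ---
a = secrets.randbelow(p - 3) + 2
def store_evaluate(blinded: int) -> int:
    return pow(blinded, a, p)     # sees neither pwd|S nor the result

# --- user's machine ---
def derive_password(pwd: str, service: str) -> int:
    g = hash_to_group(pwd, service)
    while True:                   # blinding exponent, invertible mod p-1
        r = secrets.randbelow(p - 3) + 2
        if math.gcd(r, p - 1) == 1:
            break
    response = store_evaluate(pow(g, r, p))          # send g^r, get g^(ra)
    return pow(response, pow(r, -1, p - 1), p)       # unblind to g^a

assert derive_password("master", "github.com") == derive_password("master", "github.com")
assert derive_password("master", "github.com") != derive_password("master", "gitlab.com")
```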
In principle an attacker could recover all the account passwords by compromising both the password store and a service $S$, learning the secret key $a$ and the service password $g^a$, computing $g = H(pwd|S)$ and performing an offline dictionary attack to find $pwd|S$. Then for any other service $S'$, the password can be computed via $H(pwd|S')^a$. But as long as $S$ follows good practice and only stores a hash $H'(g^a)$ of the service password, this attack fails: an offline dictionary attack to recover $g^a$ is infeasible as it's essentially a random group element.
There are no particularly expensive computations involved in using SPHINX, the communication between the user and SPHINX does not need to be secure (so it could be online somewhere) and the store will work regardless of what password protocol is used by the service, so it's extremely flexible. SPHINX therefore strikes me as both useful and practical, which is surely the definition of Real World Cryptography.
Monday, January 9, 2017
RWC 2017 - Post-quantum cryptography in the real-world
A new year takes off and, along with it, thousands of resolutions are formulated. Although I am not the right person to talk about them (my diet will begin next Monday), I wish to discuss a resolution that the cryptographic community as a whole has set for itself in 2017. Because that's what people do at Real World Crypto (RWC): they talk about new trends, topics that could be worth exploring during the new year, and directions for their research and interests. This year, for the first time at RWC, post-quantum cryptography (PQC) was given an entire session, a clear sign that times are changing and the moment has come to bring the discussion to the real world. The message is clear: even if quantum computers are not popping up in tomorrow's newspapers, we can't postpone any longer.
A very simple reason for this was given by Rene Peralta, of the NIST PQC team, during the opening of the session: standardisation takes time, up to seven years if we start right now, and full transition takes even longer. I found Rene's presentation neat and direct: our public-key cryptography fails against quantum computers, and our symmetric cryptography needs some (non-drastic) modifications. The resolution is to "start thinking about it this year, possibly by November 30th, 2017".
However, a question arises quite naturally: are we ready?
The other three talks of the session tried to answer in the affirmative. Among the several PQC proposals found in theoretical papers, two made their way into RWC: well-established lattice-based cryptography, and new-born isogeny-based cryptography, which nevertheless carries with it the pride and sympathy of ECC.
Lattices and funny names: NewHope, Frodo and Crystals
Lattice-based cryptography has three representatives in the run for PQC schemes. Valeria Nikolaenko presented two. The first is called NewHope and is a key-agreement protocol based on the hardness of Ring-LWE. The latter is a problem very favourable to applications because it combines sound theoretical security (a worst-case to average-case reduction) with fast implementations, thanks to specific parameter choices which allow for speed-ups in the computations: NewHope turns out to be even faster than ECC and RSA, but at the price of larger communication. However, there are some concerns about the security of LWE once the ring structure is added. Thus, Frodo ("take off the ring") is designed to achieve the same goal using only standard LWE. The drawback is a degradation in performance, since the tricks hinted at above can no longer be used and keys are generally bigger.
The third lattice-based scheme was presented by Tancrede Lepoint and is a suite called Crystals. It is based on yet another kind of lattice: module lattices, for which a worst-case to average-case reduction is also known. These are less structured lattices (hence possibly calming down the detractors of the ring structure) in which similar implementation speed-ups are possible: the timings are indeed comparable to NewHope's, while the communication is improved.
"Make elliptic curves great again"
Michael Naehrig presented a new proposal for PQC: do you remember curves with plenty of small subgroups, in which the discrete logarithm problem is easy to solve? Now they come in handy again: the curves are considered to be nodes of a graph, whose edges are the isogenies (maps between curves that preserve the group structure) with those small subgroups of order 2 and 3 as kernels. In this new context, given two curves in the graph, it is difficult to come up with the isogeny linking the two. However, this new approach doesn't yet stand up against the other solutions: keys are small, but performance is not a pro (so to speak).
RWC 2017 - Erasing Secrets from RAM
One of my favourite talks from the Real World Crypto 2017 conference was given by Laurent Simon, on Erasing Secrets from RAM.
In short, it was found that in practice, many non-malicious programs handling keys and other sensitive data do not erase their RAM correctly. This would allow an attacker with access to all of a system's volatile memory and CPU state to read any unerased sensitive data.
It was thought that compiler optimisation played a part in this lack of erasure. Take the code below:
void sensitive_function(...) {
    u8 sensitive_buffer[KEY_MAX] = "\0";
    ...
    /* wipe the key material before returning */
    zeromem(sensitive_buffer, KEY_MAX);
}
The compiler may choose to remove the zeromem call, since sensitive_buffer is about to go out of scope anyway (dead-store elimination). This would leave sensitive data in RAM, unbeknownst to the programmer.
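A common defence - my illustration, not something from the paper - is to write the zeroes through a volatile pointer, which the optimiser must assume has observable side effects and so cannot elide (C11's optional memset_s gives a similar guarantee where available):

#include <stddef.h>
#include <stdint.h>

/* a zeromem the compiler cannot optimise away: each write goes through a
   volatile pointer, so it counts as an observable side effect */
static void secure_zeromem(void *buf, size_t len) {
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len--)
        *p++ = 0;
}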
So, the paper presents a tool that allows developers to mark sensitive variables in their code and then see, post-compilation, any potential leakage of sensitive data. They call this tool Secretgrind, and it is built on top of the popular Valgrind.
As it turns out, the compiler optimisation problem mentioned above wasn't actually a problem in practice: they didn't encounter it once in all their testing. Instead, the majority of sensitive leaks were down to developers' mistakes; they had simply forgotten to erase sensitive variables on both the stack and the heap.
There were a few issues with IO APIs' caching optimisations, though. For example, when you read a line from a PEM file, the library may mmap the whole file into memory to save time. This is not immediately obvious, so when you go to delete the line from RAM, the rest of the file is still in memory!
Laurent concluded the talk by saying Secretgrind was still in development, and although he referred to it as a 'hack' (due to its fragility), he wishes for it to be used to "help you guys check your code".
Thursday, January 5, 2017
RWC 2017 - A Formal Security Analysis of the Signal Messaging Protocol
Real World Crypto 2017 kicked off yesterday in New York City. This afternoon, Luke Garratt presented his and his colleagues' work, A Formal Security Analysis of the Signal Messaging Protocol. The Signal protocol is used by the Facebook Messenger app, WhatsApp and the Signal app (to name but a few). It is therefore surprising that there had been no formal security analysis before now, and their work addresses this issue. The paper is motivated by two questions: what should Signal achieve, and does it achieve it?
Let's first look at what kind of security we can hope for. We speak of forward secrecy, which ensures that past messages are not compromised even if the communication keys are compromised at some point in the future. This prevents an adversary from storing all of a target's encrypted conversations and waiting until an attack succeeds to recover them all.
Post-compromise security pushes this even further: not only are past conversations protected, but future ones are too. The Signal protocol achieves this using a technique called ratcheting, in which session keys are updated with each message sent (see the sketch below). Why is this useful? Well, it makes the adversary's life much harder. In order to attack a system with post-compromise security, an adversary must obtain the long-term keys, attack immediately and keep attacking in order to compromise future sessions. Compare this to forward secrecy alone, where an adversary could obtain a long-term key and simply wait for an interesting target before mounting a man-in-the-middle attack (e.g. TLS-DHE), or to no forward secrecy at all, where they could store all ciphertext traffic until they obtain a long-term key and decrypt everything (e.g. TLS-RSA). Times are hard.
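To give a flavour of ratcheting, here is a toy symmetric key chain (my own sketch, not Signal's actual construction: the kdf below is a stand-in for HMAC/HKDF, and real Signal additionally mixes fresh Diffie-Hellman outputs into the chain). Each message gets a one-off key, and the chain key is then overwritten, so compromising today's state reveals nothing about yesterday's keys:

#include <stdint.h>
#include <stdio.h>

/* toy one-way function over the 8 bytes of the key - a stand-in for HKDF */
static uint64_t kdf(uint64_t key, uint64_t label) {
    uint64_t h = 0xcbf29ce484222325ULL ^ label;
    for (int i = 0; i < 8; i++) {
        h ^= (key >> (8 * i)) & 0xff;
        h *= 0x100000001b3ULL;
    }
    return h;
}

int main(void) {
    uint64_t chain_key = 0x1234567890abcdefULL;    /* from the initial handshake */
    for (int msg = 0; msg < 3; msg++) {
        uint64_t message_key = kdf(chain_key, 1);  /* encrypts message #msg */
        chain_key = kdf(chain_key, 2);             /* ratchet forward; the old
                                                      chain key is discarded */
        printf("message %d key: %016llx\n", msg,
               (unsigned long long)message_key);
    }
    return 0;
}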
Their security model captures:
- Full network control by the adversary;
- Perfect forward secrecy;
- Post-compromise security;
- Key compromise impersonation attacks;
- Compromise of some (but not all) random numbers.
Their proof is too long to be featured in this blog post, but Luke promises it is tedious rather than complex. Their conclusion? So far, so good.
In terms of limitations, they note that they have not analysed any implementation of the protocol, so this is a theoretical analysis only. They also assume an honest key distribution and have not (yet) considered multiple devices.