## Tuesday, June 7, 2016

### From IND-CPA to Robust AEAD: Security Notions for Symmetric Encryption

Edit 09/06/16: When I originally wrote this post, I was under the impression that the Robust AE notion had been widely adopted in the symmetric community. Since this is not yet the case, I have edited the post to emphasise that this is a new notion of Rogaway et al. and not a standard one.

This week I'm at the Summer School on Real-World Crypto and Privacy in a beautiful beach resort near Šibenik, Croatia. Obviously spending time by the hotel pool / Mediterranean sea is terribly dull and exhausting, so I was delighted that today I could instead attend a two-hour lecture by Phil Rogaway on security notions for symmetric encryption.

In all seriousness, Phil gave an excellent tour of the (recent) history of symmetric encryption research and I'll try to sketch some of the key points here.

OK, let's first recall the classical notions for symmetric encryption. We assume that, for any key $k$, encryption is a probabilistic function $E_k$ from the message space $\mathcal{M}$ to the ciphertext space $\mathcal{C}$ and that decryption is a deterministic function $D_k: \mathcal{C} \to \mathcal{M}$. Then the advantage of an adversary is the adversary's ability to distinguish between the oracles $E_k(\cdot)$ and $E_k(\$(\cdot))$, where$\$(\cdot)$ returns a random string of the same length as the input. This is called IND-CPA security.

The above notion captures privacy, but not authenticity. But in practice, people want to use a single encryption scheme to achieve both of these goals. So the security definition evolved to probabilistic Authenticated Encryption (pAE): first we allow the decryption function to return $\bot$, meaning the ciphertext is inauthentic. Then we define the privacy advantage just as in the IND-CPA game, but additionally consider the authenticity advantage, which is the probability that an adversary with access to $E_k(\cdot)$ outputs a forgery: a ciphertext that did not come from the encryption oracle and that is decrypted by $D_k$ to something other than $\bot$.

This definition still has practical problems. Firstly, it still requires probabilistic encryption and - as we all know - truly random coins are hard to come by. So we want to swap true randomness with nonces: numbers that should only be used once. Then encryption can be implemented using a deterministic function that takes a nonce as an extra input. Secondly, practitioners often need to transmit some header data along with the ciphertext, and this associated data needs to be authenticated but not encrypted. So here's yet another definition(!): a nonce-based AEAD scheme is a function that takes a key, a nonce, some associated data and a message and returns a ciphertext. The authenticity notion is defined essentially as in pAE, but the privacy notion is strengthened: the adversary is asked to distinguish between $E_k(\cdot,\cdot,\cdot)$ and $\$(E_k(\cdot,\cdot,\cdot)),\$ i.e. the ciphertexts need to look like random strings, not just encryptions of random strings. These definitions assume that the adversary is nonce-respecting: the adversary never repeats the nonce in their encryption queries.

Of course, nonces are not always used just once, as we would like. To move even closer to the messy world of practical security, we introduce Misuse-Resistant Authenticated Encryption (MRAE) where the adversary can repeat nonces, as long as they don't repeat the entire nonce-data-message triple (otherwise they could trivially distinguish the corresponding identical ciphertexts from independent random strings).

"Surely that's the last one!", I hear you cry. Not quite. It's easy to show that MRAE requires significant ciphertext expansion: ciphertexts must be significantly longer than the plaintexts. This is a problem in certain lightweight applications like in the Internet of Things. So, accepting that there's a tradeoff between the amount of ciphertext expansion and the level of security, Phil and others have recently introduced robust AE (RAE), where the encryption function now has an additional argument specifying the amount of ciphertext expansion. One then tries to obtain the best possible security with that amount of expansion. I'll omit the details in the interest of space.

What was most interesting about Phil's talk was to learn how the evolution of the theoretical notions of security was driven by practical considerations. On the other hand, since the security goalposts seem to be moving all the time, perhaps practitioners will just stop trying to reach them.