Tuesday, November 5, 2013

Honey I Blew Up The Password File

Ari Juels today at CCS gave a presentation of his Honeywords idea (joint work with Ron Rivest) which aims to proect against server breaches of password files.

Assume an adversary determines a password file, which we can assume (for most users) allows the adversary to obtain the clear text password (since most users choose easily guessed passwords). This is a real attack which has been mounted against various companies in the past couple of years, often resulting in high profile media coverage which the companies would have rather avoided.

Ari suggested adding to the password file a set of ``fake'' passwords, called Honeywords. Suppose there are 19 such Honeywords and one real one. The adversary on performing a server breach obtains twenty possible passwords, and so if he submits one of these at random he will be detected with probability 19/20, i.e. 95 percent. Such a detection also reveals to the server that they have been compromised.

This simple idea raises two key issues. Firstly how to detect the correct password out of a set of passwords, and how to choose the Honeywords. For the first problem the authors suggest a simple form of distributed cryptography. The first server takes the input password and checks it against the set of twenty words. The index of the input password within this set is then passed to a second server, which simply returns whether the index is the index of a Honeyword or the correct password.  As described the system is clearly secure against passive corruption of one of the two servers.

As for the second problem of how to choose Honeywords, the authors present a number of techniques. The main issue being that simply choosing random words as the Honeyword for some users will easily lead to the detection of the correct password.

Combined with standard hashing techniques for password storage the proposed Honeyword system seems to work well. However, an issue not discussed by the talk is what happens if an adversary actively compromises one of the two servers.


  1. How different is this from k-anonymity ?

  2. I think it is a completely different situation. The idea is to have server one determine whether the input password is one of the k-possible passwords, and then server two to determine whether the resulting index corresponds to a Honeyword or not.

  3. The basic idea of obfuscating real data with fake or bogus, so that attacker cannot distinguish which is real or fake is the crux of k-anonymity, l-diversity related privacy concepts in data mining world.