Thursday, January 8, 2015

So I am going to start in the middle of the talk by describing the part that most people will find most interesting, before looping back around to discuss the rest of the presentation in order. Given below, and received with a flurry of excitement during the presentation itself (lots of camera phones appeared for this slide), is the way Facebook hash their passwords:

1) $cur = 'plaintext'
2) $cur = md5($cur)
3) $salt = randbytes(20)
4) $cur = hmac_sha1($cur, $salt)
5) $cur = cryptoservice::hmac($cur)
6) [= hmac_sha256($cur, $secret)]
7) $cur = scrypt($cur, $salt)
8) $cur = hmac_sha256($cur, $salt)

OK, so why do it like this? Well, while Facebook have the usual security considerations that we all have, they also have one that probably only they can claim: having to deal efficiently with over a billion users! I will now try to explain why each of the lines is there.

1) This is just taking in the plaintext (the password) and is clearly required.

2) md5 hash - this is a pretty standard thing to do, or at least it was about 10 years ago. So why is it still here? The standard way to move away from it would be to keep two tables side by side, one with the md5 hashes and one for whatever the new scheme is. When a user logs in for the first time since the change, you check the password against the md5 hash and then store the new hash for future use. When all users have done this you can delete the md5 hashes and you are done. With a small number of users this seems feasible, but with a billion users that is a lot of data to store, and it could take an (extremely) long time to reach the point where everyone has transferred to the new system. This is why the md5 line is still here, with the remaining lines making the system more secure. Wrapping the existing hashes makes more sense at this scale because the whole table can be updated without waiting for each user to log in, and it can be done within a single table.
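The in-place upgrade can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (legacy_md5, wrap, upgrade_table, verify) are hypothetical, a single salted HMAC-SHA1 stands in for all of the outer layers, and using the salt as the HMAC key is a guess, since the slide does not say which argument is the key.

```python
import hashlib
import hmac
import os


def legacy_md5(password: str) -> bytes:
    """The old scheme: a bare md5 of the password."""
    return hashlib.md5(password.encode("utf-8")).digest()


def wrap(md5_digest: bytes, salt: bytes) -> bytes:
    """Stand-in for the outer layers (assumption: salt is the HMAC key)."""
    return hmac.new(salt, md5_digest, hashlib.sha1).digest()


def upgrade_table(table: dict) -> dict:
    """Wrap every stored md5 hash without needing any plaintext password."""
    upgraded = {}
    for user, md5_digest in table.items():
        salt = os.urandom(20)
        upgraded[user] = (salt, wrap(md5_digest, salt))
    return upgraded


def verify(password: str, record: tuple) -> bool:
    """At login, recompute md5 first and then the same wrapping."""
    salt, stored = record
    return hmac.compare_digest(wrap(legacy_md5(password), salt), stored)
```

The point of the sketch is that upgrade_table only ever touches the stored md5 digests, so it can run over the whole table offline, and no user has to log in for their entry to be strengthened.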

3-4) This is the standard salt-and-hash step. The interesting point here is that 160 bits of salt are used, which seems like a lot. However, it was explained that for every Facebook user from the beginning of time (or Feb 2004 to be precise) until now to have a unique salt, the salt would need to be about 32 bits long. Since salts are assigned randomly (as they should be), you also need to consider the birthday bound on the probability of collisions, which pushes this to 64 bits of salt. The remaining 96 bits (which may seem a bit on the large side) allow for future-proofing against things like new users and multiple password changes (people tend to forget their passwords...).
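To get a feel for the birthday argument, here is a small Python calculation. It uses the standard approximation p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n random salts of b bits each; the figure of roughly 2^32 salts is taken from the text above, and the function name is just for illustration.

```python
import math


def collision_probability(n_salts: int, salt_bits: int) -> float:
    """Approximate chance that at least two of n random salts coincide."""
    exponent = n_salts * (n_salts - 1) / (2 * 2**salt_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


for bits in (32, 64, 160):
    print(bits, collision_probability(2**32, bits))
```

With 32-bit salts a collision is essentially certain, with 64-bit salts it is still somewhat below one half, and with 160-bit salts it is vanishingly small, which is where the extra headroom for new users and password changes comes from.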

5-8) As you all probably know, hash functions are (by design) fast, so the goal here is to slow down brute-forcing of a user's password. The interesting part is lines 5-6, which call this cryptoservice. What this does is send the current value over to a Facebook service which hashes in a secret. This has two advantages: firstly, passwords cannot be brute forced in an offline attack without the secret, and secondly, it allows Facebook to monitor password hashing attempts and block any suspicious-looking activity. The scrypt on line 7 is used to slow down the local computation, while the hmac_sha256 on line 8 is used to shrink the size of the output so that the password database stays manageable. After all, even if each entry in the table is tiny, with a billion users it will still be a very large table: if each entry grows by a single bit, the whole table grows by a gigabit (about 125 MB)!
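Putting the eight steps together, a rough Python sketch of the whole chain might look like the following. It is not Facebook's code: the remote cryptoservice is simulated by a local function with a hypothetical SERVICE_SECRET, the scrypt cost parameters are illustrative guesses, and the choice of which argument acts as the HMAC key is an assumption, as the slide does not specify any of these.

```python
import hashlib
import hmac
import os

# Hypothetical stand-in for the secret held by the remote service.
SERVICE_SECRET = os.urandom(32)


def crypto_service_hmac(value: bytes) -> bytes:
    """Steps 5-6: in reality a network call; here simulated locally."""
    return hmac.new(SERVICE_SECRET, value, hashlib.sha256).digest()


def onion_hash(password: str, salt: bytes) -> bytes:
    cur = password.encode("utf-8")                     # step 1
    cur = hashlib.md5(cur).digest()                    # step 2
    cur = hmac.new(salt, cur, hashlib.sha1).digest()   # step 4 (salt as key: assumption)
    cur = crypto_service_hmac(cur)                     # steps 5-6
    cur = hashlib.scrypt(                              # step 7 (parameters are guesses)
        cur, salt=salt, n=2**14, r=8, p=1,
        maxmem=64 * 1024 * 1024, dklen=64,
    )
    cur = hmac.new(salt, cur, hashlib.sha256).digest() # step 8: shrink to 32 bytes
    return cur


def new_record(password: str) -> tuple:
    salt = os.urandom(20)                              # step 3: 160-bit random salt
    return salt, onion_hash(password, salt)


def check(password: str, salt: bytes, stored: bytes) -> bool:
    return hmac.compare_digest(onion_hash(password, salt), stored)
```

Note how the final HMAC-SHA256 brings the 64-byte scrypt output down to 32 bytes per stored entry, which is the size reduction the last step is there for.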

Various points from the rest of the talk:
Authentication for standard websites tends to be "something you know" (your password), and if you are security conscious you can turn on two-factor authentication to add "something you own" (which tends to be your phone). Facebook have started including other factors as well when you log in. One thing they now consider is where you are: if I always log on from Bristol but five minutes later I log on from Hawaii, then there is probably something wrong and further authentication checks should be made. Of course, now that Tor is becoming more widespread this could just be Tor doing its thing, and I imagine a conversation between Tor and Facebook will be on the cards. The other check they are doing (which again can be seen as "something you own") is "have they logged on from this browser before?" If they have, then it is (more) likely to be the person who logged in last time, but if it is a new device then further authentication should take place, since it is less likely to be the intended user.
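As a toy illustration of those two extra signals, the decision could be sketched like this in Python. Everything here is hypothetical (the LoginHistory type, the field names and the simple "either signal triggers a check" rule); the talk did not describe how Facebook actually combines them.

```python
from dataclasses import dataclass, field


@dataclass
class LoginHistory:
    # Hypothetical per-user record of where and from what they usually log in.
    usual_locations: set = field(default_factory=set)
    known_browsers: set = field(default_factory=set)


def needs_extra_check(history: LoginHistory, location: str, browser_id: str) -> bool:
    """Ask for further authentication if the location or browser is unfamiliar."""
    unfamiliar_location = location not in history.usual_locations
    new_browser = browser_id not in history.known_browsers
    return unfamiliar_location or new_browser


history = LoginHistory({"Bristol"}, {"browser-abc"})
print(needs_extra_check(history, "Bristol", "browser-abc"))  # usual place, known browser
print(needs_extra_check(history, "Hawaii", "browser-abc"))   # sudden change of location
```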