How To Protect Your Users From Password Theft

Monster.com recently disclosed yet another major breach that compromised the personal data of over 1.3 million users. This is not unlike the previous breach in August 2007, though the attack vector was likely different. From a notice on their website (emphasis mine):

We recently learned our database was illegally accessed and certain contact and account data were taken, including Monster user IDs and passwords, email addresses, names, phone numbers, and some basic demographic data. The information accessed does not include resumes.

Considering the well-known tendency to use the same password on multiple websites, compounded with the fact that Monster pledged a comprehensive security review after the first breach, it’s just embarrassing that they are still storing passwords in the clear.

So let’s talk about how to properly store passwords for a web application.

Use a one-way cryptographic hash

Don’t store your passwords in the clear! If you do, an attacker just needs to find one SQL Injection vulnerability and he’s got the password for every one of your users. The idea behind using a one-way algorithm is that the hash value can’t be reversed to “decrypt” the password. So how does authentication work? When a user attempts to log in, you apply the same one-way algorithm to convert the user-provided password into a hash value, and then compare it to the stored hash. If they match, then the user-provided password was correct. At no time is the password ever stored in the clear.

Often, developers will hear the advice “use a hash” and interpret that as “run the plaintext password through MD5 or SHA-1 and store the result.” But that only solves part of the problem — the part about using an irreversible algorithm. It doesn’t protect against pre-computation. Let’s say you’ve used SHA-1 to hash your passwords, and your USERS table looks like this in the database:

USER          PASSWORD_HASH
admin         5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
bob           fbb73ec5afd91d5b503ca11756e33d21a9045d9d
jim           7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53

So if you wanted to obtain the original passwords you’d have to run a dictionary or brute force attack, hashing all possible password options with SHA-1 and comparing the output to the stored hashes. This would take a long time but eventually you’d figure some of them out. But what if you already had a list of all 8-character permutations and their corresponding SHA-1 hashes? Now all you have to do is look up the hashes, rather than computing them on-the-fly. This is the idea behind rainbow tables.

An attacker with a SHA-1 rainbow table covering 8-character alphanumeric combinations would quickly look up those three hashes and obtain the original passwords of “password”, “p4ssword”, and “passw0rd” respectively.
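
To make the pre-computation idea concrete, here is a minimal Python sketch. It uses a plain lookup table rather than a true rainbow table (real rainbow tables use hash chains to trade lookup time for storage), and the candidate list is a made-up stand-in for a full 8-character keyspace.

```python
import hashlib

# Stored hashes copied from the USERS table above.
stored = {
    "admin": "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",
    "bob": "fbb73ec5afd91d5b503ca11756e33d21a9045d9d",
    "jim": "7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53",
}

# Hypothetical candidate list; a real table would cover a far larger space.
candidates = ["letmein", "password", "p4ssword", "passw0rd", "qwerty"]

# Pre-computation step: done once, reusable against any unsalted SHA-1 dump.
lookup = {hashlib.sha1(p.encode("utf-8")).hexdigest(): p for p in candidates}

# Attack step: one dictionary lookup per user, no hashing at attack time.
recovered = {user: lookup.get(h) for user, h in stored.items()}

for user, password in recovered.items():
    print(user, password)
```

The expensive part (building `lookup`) happens before the attacker ever sees your database, which is exactly what salting is meant to prevent.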

Use a salt

The best defense against pre-computation of raw hashes is salting. To salt a password, you append or prepend a random string of bits to the plaintext password and hash the result. You then store the salt value alongside the hash so that it can be used by the authentication routine. Look in the /etc/shadow file of any modern Unix system and you’ll see something like this:

user1:$1$lKorlp4C$RD5TSM6PaZ6oaWRVUuXT40:13740:0:99999:7:::
user2:$1$qOmA0CUm$I6IdbZDTDl6B6m7s77VPe1:13650:0:99999:7:::
user3:$1$nIEInNo5$PSxcLtvGIJArL8r2AQl74.:13749:0:99999:7:::

Let’s look at the “user1” entry in the example above, paying attention to the second field, which contains a bunch of alphanumeric characters separated by dollar signs. The first token, 1, identifies the hashing scheme (here, md5crypt). The second token, lKorlp4C, is the salt. The third token, RD5TSM6PaZ6oaWRVUuXT40, is the one-way hash that was calculated using lKorlp4C as the salt.
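
If it helps to see the field layout spelled out, this short Python sketch splits the user1 entry into its parts. The variable names are my own; only the entry itself comes from the example above.

```python
entry = "user1:$1$lKorlp4C$RD5TSM6PaZ6oaWRVUuXT40:13740:0:99999:7:::"

# Colons separate the shadow fields; the second field holds the password data.
fields = entry.split(":")
username, password_field = fields[0], fields[1]

# The password field starts with "$", so the first element of the split is empty.
_, scheme, salt, digest = password_field.split("$")

print("user:  ", username)
print("scheme:", scheme)
print("salt:  ", salt)
print("hash:  ", digest)
```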

When the user attempts to log in, the system passes the user-provided password along with the stored salt into the hash routine (in this case, md5crypt), and compares the result to the stored hash.

Each bit of salt used doubles the amount of storage and computation required for a pre-computed table. For instance, if we used one bit of salt — either 0 or 1 — the rainbow table would have to account for two variations of every password. Eight bits of salt require 2^8, or 256 variations of every password. Use a sufficiently large salt and pre-computation becomes infeasible. For example, the md5crypt utility uses 48 bits of salt (and for an extra layer of protection, it runs 1000 iterations of MD5 to slow down dictionary attacks).
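
The doubling is easy to check with a few lines of Python. The helper name below is just for illustration.

```python
def salt_variations(bits):
    """Number of distinct salt values, i.e. table copies needed per password."""
    return 2 ** bits

for bits in (1, 8, 48):
    print(f"{bits:>2} bits of salt -> {salt_variations(bits):,} variations per password")
```

At 48 bits, that works out to roughly 281 trillion variations of every single password in the table.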

There are a couple of common mistakes that people make with regard to salting. First, don’t use the same salt every time. If you do, you’re not really increasing the search space because the attacker only has to account for a single salt value. Second, don’t worry about protecting the salt values; they’re not secrets. The added security is derived not from the secrecy of the salt but from the amount by which it increases the resources required for pre-computation.
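
Here is a rough sketch of what per-user salting might look like in application code. Note the substitution: instead of md5crypt, it uses PBKDF2-HMAC-SHA256 from Python's standard library, which follows the same salt-plus-iterations idea. The function names and the iteration count are assumptions for illustration, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated on every call."""
    salt = os.urandom(16)  # 128 bits of salt, stored alongside the digest
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-run the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("password")
print(verify_password("password", salt, digest))   # correct password
print(verify_password("passw0rd", salt, digest))   # wrong password
```

Because the salt is random per call, two users with the same password end up with different stored digests, and the salt can sit in the same database row as the digest without weakening anything.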

If you have OpenSSL installed you can play around with various salt mechanisms and see what the output looks like:

$ openssl passwd -h
Usage: passwd [options] [passwords]
where options are
-crypt             standard Unix password algorithm (default)
-1                 MD5-based password algorithm
-apr1              MD5-based password algorithm, Apache variant
-salt string       use provided salt
-in file           read passwords from file
-stdin             read passwords from stdin
-noverify          never verify when reading password from terminal
-quiet             no warnings
-table             format output as table
-reverse           switch table columns

$ openssl passwd -1 password
$1$LH1SwzJI$0ho4XuPVfGlbWIcNuGIap/
$ openssl passwd -1 password
$1$eAUtQOBh$GlvJwVsyb8In5KKkvnR0E0
$ openssl passwd -1 password
$1$PgaSiWTy$ElLh6uy83Y6T4Y70AGmV20

A quick Google search shows that there is a lot of confusion about salting.

But wait, now my password recovery feature won’t work

What’s that? You say your application has one of those “Forgot My Password” features where a user can type in their username and their current password will be sent to the e-mail address on file? Clearly, that requirement depends on passwords being stored either in the clear or using a reversible mechanism such as symmetric encryption.

The answer here is to redesign your password recovery feature. Don’t let an unnecessary requirement force you into poor security practices. If you must e-mail a password, generate a temporary password that’s only valid for a short time period, and require the user to log in immediately and select a new password. This obviates the need to retrieve the original, forgotten password.
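
One way this could look in practice is sketched below in Python. Everything here is hypothetical: the in-memory dictionary stands in for a database table, the 15-minute window is an arbitrary choice, and the function names are mine. The point is only that the temporary credential is random, short-lived, single-use, and stored as a hash rather than in the clear.

```python
import hashlib
import hmac
import secrets
import time

RESET_TTL_SECONDS = 15 * 60  # assumed validity window

# Hypothetical in-memory store: username -> (token_hash, expires_at).
pending_resets = {}

def issue_reset_token(username, now=None):
    """Create a temporary credential; only its hash is kept server-side."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    pending_resets[username] = (token_hash, now + RESET_TTL_SECONDS)
    return token  # this value is what gets e-mailed to the user

def redeem_reset_token(username, token, now=None):
    """Return True once for a valid, unexpired token; False otherwise."""
    now = time.time() if now is None else now
    record = pending_resets.get(username)
    if record is None:
        return False
    token_hash, expires_at = record
    if now > expires_at:
        del pending_resets[username]  # expired: discard it
        return False
    candidate = hashlib.sha256(token.encode("utf-8")).hexdigest()
    if hmac.compare_digest(candidate, token_hash):
        del pending_resets[username]  # single use
        return True
    return False
```

After a successful redemption, the application would force the user to choose a new password, which then goes through the normal salted-hash path.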

Why not just use symmetric encryption?

Instead of storing passwords in the clear, you could encrypt them using a symmetric algorithm such as AES and have the application encrypt/decrypt as needed. While this solves the plaintext storage problem, it creates a new problem: key management. Where do you store the key? How often does it change? How many people have access to it? What do you do if/when the key is compromised? And so on. The tradeoff really isn’t worth it for something that’s more elegantly solved with salted hashes.

Layered defenses

While you’re rethinking password storage, it might be a good time to consider other common flubs such as password complexity and brute-force protections.

In conclusion

  • Storing passwords in the clear puts your users at unnecessary risk if (when) your application database is compromised
  • Use salted hashes instead of storing passwords in a recoverable format
  • Password recovery mechanisms can be implemented without needing to obtain the original password
  • As with any aspect of security architecture, use layered defenses

Have fun refactoring!


Peter | January 26, 2009 4:28 pm

One problem with storing hashed passwords is that there still are a number of authentication protocols in use (such as EAP-PEAP/MSCHAPv2) that require plaintext passwords. Any server wanting to provide backwards compatibility with clients using these algorithms has to at least provide an option for symmetric encrypted passwords :(

Chris Eng | January 27, 2009 11:54 am

@Peter:

Not sure I follow. Do you mean if my app has to authenticate the user against another backend system that uses PEAP? In that case, I still don’t have to store the user’s password in the clear. I just act as a passthrough for whatever the user sends me. Could you elaborate further?

Nate | January 27, 2009 12:33 pm

Good article, but it misses one key point: use an existing implementation of all these functions, don’t roll your own. The OpenBSD crypt(3) supports all these features and more (including variable-length hashing to slow an attacker even more). The chance of someone making a mistake when reimplementing this is high, especially someone not familiar with all the security evolution the original went through (8 char limit anyone?)

There are even implementations for Python and Java:
http://www.mindrot.org/projects/py-bcrypt/
http://www.mindrot.org/projects/jBCrypt/

Chris Eng | January 28, 2009 1:46 am

@Nate:

Thanks. Yes, that point is definitely worth calling out — I should have emphasized that. I could write an entire post on “don’t roll your own” (and you could probably write dozens). ;)

@Patrick:

I tried to allude to that with the example of md5crypt using 1000 iterations. But I wasn’t aware that the practice was called “key stretching.”

Klau | January 28, 2009 6:08 am

I am a little bit slower, but I still didn’t understand a few things from your post:
suppose the time to break the MD5/SHA-1 hash is x

1. a) using a salt, can it be cracked using the rainbow tables assuming that you don’t know the salt? how much is x now?
b) same question, but this time supposing that you have no idea what’s the value of the salt

Peter | January 28, 2009 12:28 pm

@Chris:

The situation I was referring to is if you are writing an authentication backend supporting multiple clients. The clients under your control should absolutely support protocols allowing hashed passwords, as you suggest. If you are required to support a client that only supports CHAP or EAP-MD5, for instance, your backend still has to have access to the plaintext password.

I’m not trying to criticize your article – it’s very good. I recently was unfortunate enough to have to ‘break’ a system implemented as you suggest so that it can support legacy protocols, and thought others might appreciate the heads up. In my case, I force the administrator to explicitly enable reversibly encrypted passwords and accept the lowered security (similar to the approach taken by MS IAS.)

Chris Eng | January 28, 2009 4:08 pm

@Peter:

Ah, I get it now. You’re the one actually issuing the CHAP challenge, so you need the plaintext password in order to calculate and verify the hash (or whatever sort of calculation CHAP uses for the verification step, I can’t remember). Thanks for the clarification; it’s always interesting to hear stories from the field.

Chris Eng | January 28, 2009 4:20 pm

@Klau:

Let’s assume that the time to recover a password stored as an unsalted SHA-1 is zero (it’s not, because the lookup takes a finite amount of time). But it’s pretty fast. The reason it’s fast is because you’ve pre-calculated the hashes for all possible inputs.

Now let’s assume that you’re trying to recover a password stored as a salted SHA-1 hash, where the salt is 64 bits. Unless you had some serious computational resources and storage available to you, you probably don’t have 2^64 pre-computed rainbow tables (corresponding to each of the 2^64 possible salt values). Since you don’t have a rainbow table, you can’t simply look up the hash, so your only option is to use brute force. Now you have to calculate the hash for billions of possible passwords until you eventually find the one you’re looking for. If we brute force all possible 8-character alphanumeric passwords, you’ll have to do that calculation (62^8)/2, or 109 trillion times, on average, before you recover the password. That’s what we affectionately refer to as “computationally infeasible.”

If you throw away the salt, then you can’t recover the password. But neither can the authentication routine verify it, which is why you have to store the salt alongside the hash.

Tejeddine Mouelhi | February 2, 2009 5:11 am

You wrote ‘an attacker just needs to find one SQL Injection vulnerability and he’s got the password for every one of your users.’

I am a little bit confused. I thought that ‘one SQL injection vulnerability’ can update the database to store a one way hashed password. In that case, the solution you propose does not work (except for the one improved with salt).
And beyond that I will even say that SQLIA is worse than password theft, you can drop tables/database or even shutdown the server.

Am I missing something here?

Kind regards,
P.S. You did not react to my reply on ‘CWE/SANS Top 25 Most Dangerous Programming Errors’ post

Chris Eng | February 2, 2009 11:27 am

@Tejeddine:

One SQL injection vulnerability gives you the password for all the users when said passwords are stored in the clear.

Yes, in many situations, you could also use a SQL injection vulnerability to overwrite a stored password hash. However, this isn’t particularly stealthy and it’ll be pretty obvious to the victim when he tries to log in and can’t. I wasn’t suggesting that people use unsalted hashes; that’s why, immediately after describing unsalted hashes, I pointed out the rainbow table attack.

As for dropping tables or databases, that causes disruption but it doesn’t help the attacker. They’re usually after the data so it doesn’t do them much good to delete it. Information is king.

The “CWE/SANS Top 25 Most Dangerous Programming Errors” post was Chris Wysopal’s, not mine. I’m sure he’ll respond to you when he gets some time.

Five characters are (sometimes) enough « Erich sieht | February 9, 2009 3:55 pm

[...] (online attack) or with the hash value (offline), if he knows it. The hash value is what is normally stored for a password. The password cannot be computed from it, but conversely the hash value of a [...]
