The Problem
The recent RockYou.com password problems have spawned plenty of debate online about the best way to store passwords and build a site securely.
Part of being a good, security-conscious web developer is paranoia, and it's apparent that the RockYou.com developers could have used a little more of it. They made two mistakes in their work, not one. Their first, and most obvious one, is that they had a SQL injection hole somewhere. Their second was their assumption that their measures to protect their data were enough to do so.
A healthy dose of paranoia would have led their developers to make the opposite assumption - that whatever they did to protect the data, sooner or later someone would be able to access it.
The result of this second mistake is that, rather than simply announcing a security hole has been found and closed, they have had to deal with the fact that the passwords of more than 32 million people have been exposed, in plain text, to an unknown number of people. As most people use the same password for multiple places, and most will be unaware that this has happened, we can safely assume that the access details of millions of email accounts are in the open and unchanged. That's a bad day in code-land by anyone's standards.
Hashing
The solution to the problem is to first assume that all data will be exposed at some point to an intruder of some sort. Once you assume that, it becomes important to ensure that the damage resulting from that exposure is minimal.
Which brings me on to hashes. Hashes are one-way functions that generate a representation, usually a number, of the data put into them. They always generate the same hash from the same data, and there is no simple way to reverse the process.
This makes them incredibly useful for password storage. Instead of storing a user's password, you can store the hash of the password. When a user logs in again, instead of checking the password they type in against the one you have stored, you calculate the hash of the password they type in and compare that to the stored hash.
There are lots of different hashing algorithms, the most commonly used being MD5 and SHA1.
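As a rough sketch of the store-and-compare flow described above, here is how it might look using Python's standard hashlib module with SHA1. The function names and the in-memory stored_hashes dictionary are hypothetical stand-ins for a real users table, and this is the plain, unsalted scheme only.

```python
import hashlib

# Hypothetical in-memory "users table": email -> stored hash.
stored_hashes = {}


def hash_password(password):
    """Return the SHA1 hash of a password as a hex string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def register(email, password):
    # Only the hash is kept; the plain-text password is never stored.
    stored_hashes[email] = hash_password(password)


def check_login(email, attempt):
    # Hash what the user typed and compare it to the stored hash.
    stored = stored_hashes.get(email)
    return stored is not None and hash_password(attempt) == stored


register("alice@example.com", "correct horse")
print(check_login("alice@example.com", "correct horse"))
print(check_login("alice@example.com", "wrong guess"))
```

The same input always produces the same hash, which is what makes the comparison at login possible.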
Are Hashes Secure?
Unfortunately, ensuring passwords are stored securely isn't as simple as just storing a simple hash of a password. Two of the strengths of hashes are also their largest potential weaknesses: they are small to store and quick to generate.
To generate SHA1 and MD5 hashes of every word in English, for example, takes moments. To store that amount of data is also trivial. To generate hashes of all combinations of letters and numbers, plus a few commonly used punctuation marks, up to say 8 characters, is much slower but still doable without any special setup or equipment.
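To put rough numbers on the size of that job, here is a back-of-envelope sketch. The alphabet of 70 characters (26 lower-case letters, 26 upper-case letters, 10 digits and 8 punctuation marks) and the keyspace helper are assumptions made for illustration, not figures from any particular attack.

```python
def keyspace(alphabet_size, max_length):
    """Count every string of length 1..max_length over the alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


# Assumed alphabet: upper and lower case letters, digits, 8 punctuation marks.
ALPHABET = 26 + 26 + 10 + 8

for length in (6, 8, 10):
    print(f"up to {length} characters: {keyspace(ALPHABET, length):,} candidates")
```

Each extra character multiplies the number of candidates by the alphabet size, so with these assumptions going from 8 to 10 characters makes the table roughly 4,900 times larger.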
Tables of precalculated hashes of data like this are easily found online or easily generated. If you have a hash of some data (like a password) and you want to see what that data originally was, you can compare the hash to the entries in your precalculated table. If you find a match, you have discovered the data that was originally used to generate the hash - the password you were trying to find out.
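The lookup described above can be sketched in a few lines. The five-word list stands in for a real dictionary file, and the names sha1_hex, lookup_table and dehash are made up for this example.

```python
import hashlib


def sha1_hex(text):
    """Return the SHA1 hash of a string as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Stand-in for a real dictionary file.
WORDLIST = ["password", "letmein", "dragon", "monkey", "rockyou"]

# Precalculate once: hash -> original word.
lookup_table = {sha1_hex(word): word for word in WORDLIST}


def dehash(stolen_hash):
    """Return the original word if its hash is in the table, else None."""
    return lookup_table.get(stolen_hash)


stolen = sha1_hex("letmein")  # pretend this came from a leaked database
print(dehash(stolen))
print(dehash(sha1_hex("x7!Qp2#vL")))  # not in the word list, so no match
```

The table is built once and can then be reused against every hash in every leaked database that uses the same algorithm.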
So basic password hashing is, essentially, useless for the majority of users. It is a simple process to compare hashes of basic passwords to a table of precalculated hashes and thereby "dehash" passwords en masse.
Some people recommend nesting hashes as a way to add complexity and therefore more security. Unfortunately, generating tables of nested hashes is almost as easy as generating tables of plain hashes, and the result is no more secure.
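To illustrate why nesting adds so little, here is a sketch that assumes the nesting scheme is MD5 applied to the SHA1 hex digest. That particular combination, and the names used, are only examples. Building the attacker's table is the same loop as before with one extra hash call per word.

```python
import hashlib


def nested_hash(text):
    """MD5 of the SHA1 hex digest: one example of a 'nested' hash."""
    inner = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return hashlib.md5(inner.encode("utf-8")).hexdigest()


# Stand-in for a real dictionary file.
WORDLIST = ["password", "letmein", "dragon", "monkey", "rockyou"]

# The attacker's table: nested hash -> original word.
nested_table = {nested_hash(word): word for word in WORDLIST}


def dehash_nested(stolen_hash):
    """Return the original word if its nested hash is in the table."""
    return nested_table.get(stolen_hash)


print(dehash_nested(nested_hash("dragon")))
```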
Add Salt!
The solution is to hash more than just the user's password, and this process is called "salting". For example, instead of storing a hash of a user's password, you could store the hash of their email address and their password together.
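A minimal sketch of the "email plus password" idea follows. The function names are hypothetical, and trimming and lower-casing the email address before hashing is an assumption added here so that the same address always produces the same salt.

```python
import hashlib


def salted_hash(email, password):
    """Hash the email address and the password together."""
    # Assumption: normalise the email so "A@x.com" and "a@x.com" match.
    combined = email.strip().lower() + password
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


def check_login(email, attempt, stored_hash):
    # Recompute the salted hash from what the user typed and compare.
    return salted_hash(email, attempt) == stored_hash


stored = salted_hash("alice@example.com", "letmein")
print(check_login("alice@example.com", "letmein", stored))
print(check_login("alice@example.com", "dragon", stored))
```

One consequence of using the email address as the salt is that the hash has to be recalculated whenever a user changes their address, which can only be done at a point where they have just supplied their password.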
This is effective because tables of hashes of generated data of more than about 10 characters start to become problematic to generate and store. At around that point, tables must be generated based upon dictionaries and known words, rather than on programmatically generated lists of all possible passwords in a range.
The average length of "email plus password" is easily in the region of 25 characters. Not only that, but if someone worked out that you were using hashes of "email plus password", they would still need to generate a new table for every password they wanted to dehash.
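The effect on an attacker can be sketched as follows, again with made-up names and a tiny word list. Two users with the same weak password end up with different stored hashes, neither appears in a generic precalculated table, and any guessing has to be redone for each email address.

```python
import hashlib


def sha1_hex(text):
    """Return the SHA1 hash of a string as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


WORDLIST = ["password", "letmein", "dragon"]

# The attacker's generic table of unsalted hashes.
plain_table = {sha1_hex(word): word for word in WORDLIST}

# Two hypothetical users who picked the same weak password.
users = ["alice@example.com", "bob@example.com"]
salted = {email: sha1_hex(email + "letmein") for email in users}

for email, stored in salted.items():
    print(email, stored[:12], "in generic table:", stored in plain_table)


def crack_one(email, target_hash, wordlist):
    """Per-user attack: every guess must be rehashed with this email."""
    for word in wordlist:
        if sha1_hex(email + word) == target_hash:
            return word
    return None


print(crack_one("alice@example.com", salted["alice@example.com"], WORDLIST))
```

The work done to recover one user's password is of no use against the next user, which is what makes bulk extraction so much more expensive.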
This level of complexity, added to a reasonably strong password policy, ensures that if (or when) your user data is exposed, the work involved in extracting usable passwords from it is going to stop all but the most determined attackers. Not only that, but even they will find extraction of data in bulk prohibitively difficult.

13 Comments
Nice article. Unfortunately it seems like a practice many developers neglect until disaster strikes.
#1, Chris Wiliams, 16 December 2009.
A very comprehensive explanation, thank you!
#2, aex, 16 December 2009.
I've always used seemingly long salts (around 40 characters or so) that were randomly generated. My favorite place to get such a salt is from wordpress: https://api.wordpress.org/secret-key/1.1/
Is there any benefit for using something bigger than SHA1?
#3, Ryan Rampersad, USA, 16 December 2009.
Well-written, though it's sad it has to be written at all. I mean c'mon, hashing passwords is web security 101.
One thing to add to improve security on hashes is a "pepper" in addition to salt. The salt is the non-static stored item (like the user's email) that's different on each entry. But a determined attacker who knew that was the salt could still crack it.
Using a static salt (pepper, if you will) that is stored as its own variable in a script (not in the web directory) would then require the attacker to not only access the MySQL database but also get into the filesystem.
BTW addedbytes... Tab doesn't work for these comment boxes, I think because tabindex on them is set to 9-digit numbers, it must confuse the browser into thinking they're not tabable.
- Adam
#4, Adam Wolf, United States, 16 December 2009.
Thanks for sharing - I found the bit about hashing useful.
Eoin
#5, Eoin Redmond, Ireland, 22 December 2009.
HMAC is also a good approach and component of a solution to this problem.
#6, Ryan T, 23 December 2009.
Yes, this is a good procedure for providing good security.
Have you looked at Google's login page?
Once you enter a username and move the focus to the password field, some process starts in the background, and you can see the data transfer progress bar showing some activity.
Is it some security process? I don't know.
#7, Nanjangud, India, 27 December 2009.
Adam Wolf suggests the same solution that I use as standard on my apps:
app-specific salt ( or pepper, as Adam says )
+ random salt generated whenever pass is reset
+ password
...and SHA-512 that all together.
so even if your pass is "bob" it'll be hashing something like "randomsalt-Staticpepper-bob"
PHP's hash function with algo comparisons:
http://us.php.net/manual/en/function.hash.php
Per the comments on that page: "The well known hash functions MD5 and SHA1 should be avoided in new applications. Collission attacks against MD5 are well documented in the cryptographics literature and have already been demonstrated in practice. Therefore, MD5 is no longer secure for certain applications."
#8, James, USA, 27 December 2009.
Good article. Security is always on our minds
#9, avanzaweb, Spain, 31 December 2009.
Thanks for this article. I'm a fairly new developer and did not know what salting was. I was curious after setting salts for my WordPress sites without actually knowing what they did.
Your explanation cleared everything up for me. It seems like a straightforward enough idea.
#10, ralcus, 16 January 2010.
Great post. I knew vaguely what salting was, but now I know exactly what it is. This post has cleared up any questions I had.
Thanks.
#11, Shane Heaters, UK, 19 February 2010.
Will, that is vendicated your opinion, dude. =)
#12, term paper services, 26 February 2010.
1. add salt
2. add user-agent and other fingerprinting techniques
3. use ssl if possible
4. make sure your server is secure and properly configured.
5. don't use a write enabled db user if you're just reading from the db.
6. store your backups in a very secure place.
7. and never ever send user details via email.
I'm often surprised that there is so little hacking, considering how few people follow the basics.
#13, murray, south africa, 4 March 2010.