Blog
Why You Should Always Salt Your Hashes
The Problem
The recent RockYou.com password problems have spawned plenty of debate online about the best way to store passwords and build a site securely.
Part of being a good, security-conscious web developer is paranoia, and it's apparent that the RockYou.com developers could have used a little more of it. They made two mistakes in their work, not one. Their first, and most obvious one, is that they had a SQL injection hole somewhere. Their second was their assumption that their measures to protect their data were enough to do so.
A healthy dose of paranoia would have led their developers to make the opposite assumption - that whatever they did to protect the data, sooner or later someone would be able to access it.
The result of this second mistake is that, rather than simply announcing a security hole has been found and closed, they have had to deal with the fact that the passwords of more than 32 million people have been exposed, in plain text, to an unknown number of people. As most people use the same password for multiple places, and most will be unaware that this has happened, we can safely assume that the access details of millions of email accounts are in the open and unchanged. That's a bad day in code-land by anyone's standards.
Hashing
The solution to the problem is to first assume that all data will be exposed at some point to an intruder of some sort. Once you assume that, it becomes important to ensure that the damage resulting from that exposure is minimal.
Which brings me on to hashes. Hashes are one-way functions that generate a representation, usually a number, of the data put into them. They always generate the same hash from the same data, and there is no simple way to reverse the process.
This makes them incredibly useful for password storage. Instead of storing a user's password, you can store the hash of the password. When a user logs in again, instead of checking the password they type in against the one you have stored, you calculate the hash of the password they type in and compare that to the stored hash.
There are lots of different hashing algorithms, the most commonly used being MD5 and SHA1.
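As a rough sketch of the store-and-compare flow described above, here is how it might look in Python using SHA1 from the standard library's hashlib module. The function names and the sample password are made up for illustration, and the hash here is deliberately plain and unsalted, exactly as described so far.

```python
import hashlib


def hash_password(password: str) -> str:
    """Return the hex SHA-1 digest of a password (plain and unsalted)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def check_login(typed_password: str, stored: str) -> bool:
    """Hash what the user typed and compare it with the stored hash."""
    return hash_password(typed_password) == stored


# At sign-up: keep only the hash, never the password itself.
stored_hash = hash_password("correct horse")

# At login: the typed password is hashed and the two hashes are compared.
print(check_login("correct horse", stored_hash))  # True
print(check_login("wrong guess", stored_hash))    # False
```

The same input always produces the same 40-character hex string, which is what makes the comparison work.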
Are Hashes Secure?
Unfortunately, ensuring passwords are stored securely isn't as simple as just storing a simple hash of a password. Two of the strengths of hashes are also their largest potential weaknesses: they are small to store and quick to generate.
To generate SHA1 and MD5 hashes of every word in English, for example, takes moments. To store that amount of data is also trivial. To generate hashes of all combinations of letters and numbers, plus a few commonly used punctuation marks, up to say 8 characters, is much slower but still doable without any special setup or equipment.
Tables of precalculated hashes of data like this are easily found online or easily generated. If you have a hash of some data (like a password) and you want to see what that data originally was, you can compare the hash to the entries in your precalculated table. If you find a match, you have discovered the data that was originally used to generate the hash - the password you were trying to find out.
So basic password hashing is, essentially, useless for the majority of users. It is a simple process to compare hashes of basic passwords to a table of precalculated hashes and thereby "dehash" passwords en masse.
Some people recommend nesting hashes as a way to add complexity and therefore more security. Unfortunately, generating tables of nested hashes is almost as easy as generating tables of plain hashes, and the result is no more secure.
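To make the table lookup concrete, here is a minimal Python sketch. The four-word candidate list is a hypothetical stand-in for a real dictionary or generated list, and the "leaked" hashes are invented for the example. It builds one table of plain SHA1 hashes and one of nested hashes, then looks both leaked values up.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the hex SHA-1 digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical stand-in for a dictionary or generated list of candidates.
candidates = ["password", "letmein", "123456", "qwerty"]

# Precalculated tables mapping hash -> original text.
plain_table = {sha1_hex(word): word for word in candidates}
# Nesting only costs the table builder one extra hash per entry.
nested_table = {sha1_hex(sha1_hex(word)): word for word in candidates}

# Hashes as an intruder might find them in an exposed database.
leaked_plain = sha1_hex("letmein")
leaked_nested = sha1_hex(sha1_hex("qwerty"))

print(plain_table.get(leaked_plain))    # letmein
print(nested_table.get(leaked_nested))  # qwerty

# A password that is not in the candidate list is not recovered.
print(plain_table.get(sha1_hex("not in the list")))  # None
```

Each lookup is a single dictionary access, however long the table took to build.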
Add Salt!
The solution is to hash more than just the user's password, and this process is called "salting". For example, instead of storing a hash of a user's password, you could store the hash of their email address and their password together.
This is effective because tables of hashes of generated data of more than about 10 characters start to become problematic to generate and store. At around that point, tables must be generated based upon dictionaries and known words, rather than on programmatically generated lists of all possible passwords in a range.
The average length of "email plus password" is easily in the region of 25 characters. Not only that, but if someone worked out that you were using hashes of "email plus password", they would still need to generate a new table for every password they wanted to dehash.
This level of complexity, added to a reasonably strong password policy, ensures that if (or when) your user data is exposed, the work involved in extracting usable passwords from it is going to stop all but the most determined attackers. Not only that, but even they will find extraction of data in bulk prohibitively difficult.
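A minimal sketch of the "email plus password" approach, again in Python with SHA1. The colon separator and the lower-casing of the email address are assumptions made for this example, not part of the scheme described above, and the addresses and passwords are invented.

```python
import hashlib


def salted_hash(email: str, password: str) -> str:
    """Hash the email address and password together.

    Assumptions for this sketch: the email is stripped and lower-cased,
    then joined to the password with a colon.
    """
    salted = email.strip().lower() + ":" + password
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()


def check_login(email: str, typed_password: str, stored: str) -> bool:
    """Recompute the salted hash at login and compare it with the stored one."""
    return salted_hash(email, typed_password) == stored


# Two users who picked the same password end up with different hashes.
alice = salted_hash("alice@example.com", "letmein")
bob = salted_hash("bob@example.com", "letmein")
print(alice == bob)  # False

# A precalculated table of plain password hashes no longer matches.
plain_table = {
    hashlib.sha1(word.encode("utf-8")).hexdigest(): word
    for word in ["letmein", "password"]
}
print(plain_table.get(alice))  # None

print(check_login("Alice@Example.com", "letmein", alice))  # True
```

Because the email address is part of the hashed data, an attacker's table built for one user is useless against the next.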
Some Thoughts on the New Site
It's been a while since I began work on the new version of AddedBytes.com. There was a lot to do. I had a few specific aims for the move:
- New Platform: bBlog, though it had served me well, was no longer being updated. The version I was running was hideously out of date, fairly bloated, and a pain to maintain by the end. It was also causing problems for the VPS it was running on.
- New Host: The site was outgrowing the JaguarPC VPS it was on, and was suffering downtime. I wanted more control over my own hosting and an opportunity to learn more about hosting my own site. So a self-managed VPS was on the list.
- New Design: The old design was thrown together pretty quickly after the incident with the trademark police. I wanted something brighter, cleaner and more spacious.
In no particular order, here are a few useful (hopefully) tricks I learned along the way:
HTML 5
The site uses HTML 5, at least in a basic fashion - <header> tags, that sort of thing. I like a lot of what HTML 5 promises, but so far I'm not sure what the advantages are to you, the reader. Semantic improvement is great, but many of the advantages of HTML 5 seem to be in areas that I just won't be using - audio, video etc.
Hosting
There are a bajillion hosts offering everything under the sun to anyone with a dollar to their name. The small print for each is, it seems, a bit of a minefield. What's the point in offering unlimited bandwidth when CPU usage is capped at such a low level that 6 people visiting on the same day will put you over the limit?
Equally, what's the point of hiring support people without basic technical hosting knowledge? In the course of my regular work, I've encountered support staff at one major international host who told me that it was normal for a site to go offline for three days while it was moved from shared hosting to a VPS. Three days to move an ecommerce site? Normal? Last time I had a site moved by a host it happened in off hours, cost nothing and there was no downtime. The same support staff have told me that it is impossible to host a site with one company and the DNS with another.
I'm in danger of veering off into ranty territory here, so I'll pull myself back from the brink and leave you with what I've taken away from the experience. That being that there is an unholy triangle for web hosting - you want a cheap price, high capacity and great support? Pick two.
Ultimately, I've gone the self-managed route (with Slicehost) for a couple of reasons. First, hosting management is something I dabble with occasionally when a client requires it, but my knowledge is spartan at best and I can increase my own value to an employer if I bring server management skills and experience to the table. Second, I wanted to be forced to fix my own mistakes. When something breaks now, I'm not going to be able to simply file a support ticket and have it fixed. Baptism by fire.
Typography
I'm lucky enough to work with some great designers at GSBA, and have been picking up little bits and pieces along the way. One thing that I have come to realise is that adjusting typography is a neverending mission. One minute you're playing with line heights and word spacing ... the next you're worrying about orphans and adjusting margins to line up letter stems.
You're Awesome
Within a couple of hours of launching the new site I had a collection of emails from people pointing out problems. And these were good bug reports - including a URL, an explanation of what they saw ... sometimes a screenshot. Made replicating and fixing problems a doddle. So thanks!
Character Sets Can Ruin You
The old site's database stored data in ISO-8859-1. The data was actually UTF-8. PHP was running as ISO-8859-1. The site was rendered as UTF-8. I think. I'm still not entirely sure. Moving that data to a system running entirely as UTF-8 was a painful experience. Character sets can be a nightmare, especially when you're setting things up yourself. Pick one and stick with it, for everything. One day, when I can bring myself to (or am forced to) revisit the problems I had with this, I'll write up what I learned and how I got things working.
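For anyone who hasn't hit this before, here is a small Python sketch of what that kind of mismatch does to text. The sample word is made up, and this only illustrates the general failure mode (UTF-8 bytes read as ISO-8859-1), not the exact setup of the old site.

```python
original = "café"

# The text is written out as UTF-8 bytes ...
utf8_bytes = original.encode("utf-8")

# ... but something downstream treats those bytes as ISO-8859-1.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # cafÃ©

# Reversing the wrong step recovers the text, as long as nothing
# else has mangled the bytes in between.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired == original)  # True
```

Every non-ASCII character turns into two or more junk characters, which is why the damage is so obvious on accented names and curly quotes.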
Listen to Other People
Especially when they tell you that when you're working on your iptables configuration, it's important not to log out until you've verified everything works. Because if you log out before you check and something's broken, you cannot get back in. Tough lesson, that one. Which brings me to ...
Backup from the Beginning
Seriously, it's never, ever too early. If you start out by grabbing snapshots and organising them effectively, you'll be much more likely to be in a position to revert a bad change. With self-managed hosting in particular it is very easy to rush a change that looks innocuous but ends with it utterly destroying everything you've worked on so far.
Separation is Good
I moved my email to Google Apps a couple of years ago (and it's been great so far). I dread to think how complicated this would have been if I'd kept my email at the same place as the site. Having the email handled elsewhere hugely reduced the work involved in moving things around. Same with the DNS - it's entirely separate from everything else. The move involved just changing A records around - minimal risk and easy to reset.
Launching Feels Good
It feels good, but nerve-wracking, to press the button and make a move. It feels even better when, a few days later, you haven't had to revert the entire process.
New Server, New Platform, New Design
Way back in 2004, when I first launched this site (under a somewhat more unusual name), I opted not to write something from scratch to run it. No sense in reinventing the wheel.
What I picked was bBlog, an open source PHP and MySQL blogging platform. It was Smarty-based, easy to extend, and easy to administer. There was a nice community developing plugins for it and pushing development along.
Unfortunately, bBlog's development slowed ... and then ground to a halt. Eventually, the project was declared dead. And so began my quest to move to a new platform.
After lots of umming and ahhing, and after trying a few alternatives and asking for suggestions on Twitter, I decided to give MODx a try. It looked like a capable system, easy to extend and modify, and reasonably powerful.
I also decided to change my hosting package. JaguarPC had hosted my site without too much trouble for a couple of years, but I wanted to learn my way around server management, so I opted for a Slicehost package.
And because all good things come in threes, and everything else was going to be new, I decided to put together a new design too. The old one was thrown together rather hastily and although it worked, I wanted to make a clean start.
So here we are ... MODx, Slicehost, new design. There are bound to be a few teething problems (please let me know if you have any problems), but this will also allow me to add some of the new things I've been wanting to do for a while. To that end, comment replies have now been added - woohoo!
Known Issues
- Most tools not working.
- "Remember Me" not working on comments.
- Live preview not working on comments.
Life Chart
Inspired by the brilliant graphs over at Information is Beautiful, I spent some time finishing up a little data visualisation I've been meaning to do.

I seem to be on a permanent quest to reduce time wastage, and I was curious where my time was actually being spent. So, I recorded a week of my life in 5 minute intervals, rounded numbers to the nearest hour, grouped similar activities and popped it in a spreadsheet.
It turned out, fortunately, to be a fairly typical week. Work was hectic, and I spent a few extra hours at the office buried in code. I had Open University work to do, though no more so than usual. I didn't do enough cooking or reading, watched too much TV and definitely didn't get enough sleep. Running is conspicuously absent, though I did have a couple of squash matches.
Unfortunately, this didn't really tell me much I didn't already know. There were no giant time-sinks that I was blissfully unaware of. I need to spend a little less time at the office, in the pub or watching TV and a little more time reading, exercising and sleeping.
By the way, I have no way of knowing how much of an impact, if any, the Observer Effect will have had on the data, but this is fairly unscientific anyway, so I'm not going to lose any sleep over it.
Mathematical Anniversaries
My wife, as is traditional among married folk, celebrates the number of times our Pale Blue Dot has whizzed around our Bright Yellow Dot since we got married, back on the 22nd April 2006. These are known as anniversaries (from the Latin for "return yearly"), and are typically excuses to celebrate events that occurred in the past on the same day of the year as the original events.
People, it seems, are suckers for an anniversary. We celebrate the anniversaries of our own births. We celebrate the anniversaries of the births of our friends and families. Some of us celebrate the anniversaries of the births or deaths of people we've never met. Some of us even celebrate the anniversary of the passing of an entirely arbitrary point in space.
Say what you like, we're big on tradition on this little rock.
Men tend to suffer under this harsh regime of date management. While both we and the fairer sex are obliged to remember all of these days and events, it seems it is usually men who more easily forget them (though it is all much easier since the invention of synchronised online calendars). Not only do we have to remember them, but we are encouraged to buy cards and gifts for people to help them celebrate. It seems it is never enough just to remember - you must provide evidence of your remembering.
This practice of celebrating the number of orbits a marriage has lasted or a person has lived has always seemed a little arbitrary to me. Why hang your hat on that particular astronomical curiosity? Why put special emphasis on multiples of ten of these particular planetary pirouettes? Why not celebrate other milestones?
I feel it is only fair to warn you that things may get a little geeky from this point onwards.
The most astute of you may have calculated by now that the 22nd April 2006 was exactly 1,235 days ago, making yesterday 1,234 days since my wedding. Well done to you. This is, I hope you will agree, a Major Event. It is 3.38 times rarer than the traditional annual anniversary. Our pi-versary (every 3142 days), due in a few years, is far rarer - a pi-versary happens just once every 8.6 years.
I celebrated this latest occasion by buying my wife a 1,234th anniversary card. Similarly, 235 days ago I bought her a 1,000th anniversary card. She, sadly, forgot both of these landmark anniversaries. This may prove useful, should I falter with a later anniversary myself.
By no means are these mathematical anniversaries limited to those specific numbers, or wedding anniversaries, or even days or years passing. My 12,345th birthday will be on the 29th April 2014. My 50th wedding lunar-versary is going to be the 16th January 2010. The list of opportunities to celebrate (or to be offended at friends and families for missing anniversaries, if you prefer) is nearly endless.
Maths and physics, indeed, are full of interesting numbers, sequences and constants. Pi, e, the square root of 2, the golden ratio, square numbers and higher powers, the Fibonacci sequence, the gravitational constant ... all worthy of celebration.
It would be silly to expect you to calculate these anniversaries manually. Therefore I have created a tool to aid you. Please enter the date you wish to commemorate, and click the button to generate a list - extensive but by no means complete - of significant dates to celebrate. The list will be shown on the next page, and you will be able to download the events to your chosen calendar if you like.
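For the curious, the day-count arithmetic is easy to sketch in Python with the standard datetime module. This is not the tool itself, just an illustration: the milestone list and function name are made up, and a year is assumed to average 365.25 days for the rarity figures.

```python
from datetime import date, timedelta

# Hypothetical milestone list: label -> number of days after the event.
MILESTONES = {
    "1,000 days": 1000,
    "1,234 days": 1234,
    "pi-versary (3,142 days)": 3142,
}

# Assumed average length of a year, used for the rarity figures.
DAYS_PER_YEAR = 365.25


def milestone_dates(start: date, day_counts: dict) -> dict:
    """Return the calendar date on which each day-count milestone falls."""
    return {label: start + timedelta(days=n) for label, n in day_counts.items()}


wedding = date(2006, 4, 22)
dates = milestone_dates(wedding, MILESTONES)
for label, when in dates.items():
    print(label, "->", when.isoformat())

# How much rarer than a yearly anniversary, and how many years apart.
print(round(1234 / DAYS_PER_YEAR, 2))  # 3.38
print(round(3142 / DAYS_PER_YEAR, 1))  # 8.6
```

Other units (lunar months, seconds, orbits of other planets) would only need a different multiplier in place of the day counts.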
New Opera Logo
I've long been a slave to the underdog browser known as Opera. While its market share may not be the greatest (though it does very well with alternative devices), it's served me well the last few years, staying fast and feature-packed.
The logo sucks though. Really. It's terrible. Firefox has a great logo, designed by Jon Hicks, as do Safari, Konqueror, Chrome and even IE. But Opera's has always been an also-ran logo - too lopsided to hang out in the dock, too simple to invigorate the imagination, too un-browsery to evoke thoughts of zipping down the information superhighway consuming content at breakneck speeds.
People have talked about it before. Some have suggested and designed alternatives. (I use this one in my RocketDock / OSX Dock / AWN Dock.)
However, the people at Opera (now including the aforementioned Jon Hicks, who credited Oleg Melnychuk with the new icon design) have listened and finally released a new logo. It's still not the greatest logo out there, but at least it isn't lopsided in the dock any more.

Hat tip: Joen @ Noscope.
Personal Development: To Do - Revisited
It's been almost a year since I posted an entry titled Personal Development: To Do, in which I talked about being disorganised (because I am) and wanting to get my house in order. To that end, I wrote up a list of things I wanted to get done.
A year on, I wanted to revisit this and update it as required. To start with, here is the list from August 2008:
- Keep on blogging!
- Keep on making cheat sheets!
- Move AddedBytes (set up server).
- Thin out project folder and pick 2 to work on until finished.
- Write a web service.
- Write SVN Statistics app in Python (learn Python).
- Rewrite site management VB app in Python (learn Python).
- Learn Objective-C and Cocoa by writing a Useful Small Mac App (decide on what app!).
- Learn a new PHP framework.
- Get involved in an open source project.
- Update and release more code from AddedBytes.com under open source license.
Looking at it, I am glad that I've managed to tick a few items off. Moving Added Bytes is almost done, but for a couple of small issues and some indecision. I've thinned out my projects and am working on one or two things at a time, which is good. I've made a few new cheat sheets - not as many as I would have liked, but a few.
About a year ago, I started work at Active Parity, which gave me opportunities to develop and work on several systems I had wanted to spend time with. I've written a couple of small-traffic web services. I've learned my way around the Zend Framework, MODx, Joomla and a variety of other CMSes and systems. I've released more code under open source licenses (not much though, and mostly for MODx).
The rest of the list has not gone quite so well. I've not written a great deal this last year (partly due to not having much in the way of topics - anything I have started writing about, or considered writing about, seems to have been done in exhaustive depth already). My plans to learn Python, Objective-C and Cocoa have made little progress. And although I've been playing with MODx, I've not really got involved with the project itself, or any other open source project.
I'd say I've managed about 50% of the list this year, which I'm ok with. Certainly, my "unfinished project guilt" has been reduced, which was one of the main aims. I'm more motivated than before and finding it easier to spend time on specific things.
I also took on a couple of extra items that weren't there. The largest of those, in terms both of time and how valuable I feel it's been, is that I've started working towards a physics degree with the Open University. I've completed two courses so far, with a few weeks left on my third course, which means I'm about a third of the way through according to the OU's points system.
The OU work is a major time commitment, and that's left less time for me to dedicate to working on other projects. On the other hand, it's been really good for my brain. I can't recommend it highly enough to anyone considering something similar - returning to academic learning after 10 years of dissolving my brain with beer was among the best decisions I made last year.
I'm glad I made the list last year - it helped me to focus my energies better. So I'm going to update it for the next year. This time, I'm going to break it into two parts. One is ongoing things I want to do - things that will be on the list for the foreseeable future and that, more importantly, aren't ever really finished. The second is specific things I want to do in the next year. Some items from the original list, though not done, are on the back burner or now fall under the umbrella of a different item.
Ongoing To-Dos
- Keep on blogging!
- Keep on making cheat sheets!
- Keep projects list to two at a time.
- Release more code under open source licenses.
- Improve server administration skills.
- Continue learning with Open University.
Specific To-Dos
- Move AddedBytes.
- Move projects from Google Code over to GitHub.
- Use Drizzle in a project.
- Master database version control.
- Write site management app in Python (oh yeah ... learn Python).
- Get involved in an open source project.
Trust Not Granted: A guide to what you can and can't do with XBAPs
I've done poorly at updating these last few months ... still moving the site to a new server but work is relentless at the moment and leaving me with little enthusiasm to carry on with more of the same when I get home!
Fortunately, I have friends and colleagues who are not so similarly burdened at the moment, and one of these is Allan Wenham, .NET developer extraordinaire. He has put together a guest post - a short guide to non-permitted actions in XBAPs. Over to you, Al:
So you're going to make an XBAP application, lucky you. First thing you should consider is the limitations imposed by the security model and importantly the framework itself.
Rule 0: You must have at least framework 3.0 to run an XBAP, 3.5 SP1 is highly recommended.
Rule 1: You can't run an XBAP in any browser other than IE with .NET Framework 3.0. .NET Framework 3.5 SP1 adds support for Mozilla Firefox; sadly no other browsers are supported at this time.
Rule 2: There are many things you cannot do with your XBAP in partial trust mode. Here I will provide what I hope is a fairly comprehensive list of non-permitted actions that will give you the dreaded "Trust not Granted" error. Mainly because, at the time of writing, I couldn't find such a list!
- Opening up a new browser window
- Directly connecting with a database
- Any File IO
- Talking to a WCF service that is not on the same hosting server
- Talking to a WCF service that has any binding other than BasicHttpBinding
- Most standard dialogs (as well as input box)
- OS driven Drag and Drop
- Bitmap Effects, although these are deprecated in .NET 3.5 SP1
- Shader Effects
Al works for Venture Finance PLC doing .Net Programming. When he's not building websites, he's writing about Turkish Delight (Fry's specifically). You can email him with questions at wenhamton@gmail.com or leave comments below. Thanks, Al!
Introducing TriviBot
Many years ago, I used to be a regular in the #quiz chat room on DALnet. That's on IRC, for you whippersnappers who don't know what DALnet is. If you don't know what IRC is ... then get off our interland, youngster.
The #quiz channel was great fun. A bot, powered by MoxQuizz, would ask a question. First to get the answer right got a point. First to 30 won the game. Hours flew by and were forever lost.
I was looking for a small project to while away a morning and wanted to play with Twitter's API, so decided to write a basic quizbot for Twitter along similar lines to MoxQuizz. After a couple of hours behind the keyboard, I am pleased to be able to introduce you to @TriviBot.
The idea is pretty simple - answer a question first to get a point. First to 10 (at the moment) wins the game. Leaderboards and all-time high score tables will be put up shortly.
In all honesty, I have no idea how well or badly this kind of idea will translate to Twitter. Twitter is, after all, not a chat room network like IRC. However, I've enjoyed writing the thing and testing it has been good fun.
LetMeGoogleThatForYou Bookmarklet
I'm sure someone must have already done this and I'm just incapable of finding it (despite Googling it myself) but I figured that the only thing missing from the brilliant LetMeGoogleThatForYou was a bookmarklet, so I made one: LetMeGoogleThatForYou. Highlight text, click bookmarklet and voila - patronisation on demand.
