Password Selection

James Young · December 4, 2012

The following cartoon sums up perfectly what this post is about.

Courtesy XKCD 936

As it turns out, the IT industry has spent an awful lot of time trying to convince people to use horribly complicated passwords which are terribly difficult for a human to remember, but really easy for a computer to guess.  This leads to a number of security mistakes that people make while trying to fit into the restrictions of their systems:

  • They use a password which is based on a single dictionary word and then tack on some symbols to fit into the requirements of whatever system they’re using.  Enter stuff like ‘Password2012!’.
  • They use the same base password, and then just increment some number at the end every time it expires.  So they wind up using ‘Password15’, ‘Password16’ and so on.
  • They use a decent password, but use it everywhere.  I’ll talk about password re-use later.
  • They use a decent password, but it’s derived from a formula where the name of the service or similar is bound into the password, making it easy to reverse engineer the password.  For example ‘Myhotmail1!’ or something like that.
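To see why the first two habits above are weak, here is a minimal sketch of how a guessing tool might expand a single dictionary word into "word + number + symbol" variants.  The suffix lists are hypothetical and chosen only for illustration; real cracking rule sets are far larger.

```python
from itertools import product

# Hypothetical mangling rules, for illustration only.
SUFFIX_NUMBERS = [""] + [str(n) for n in range(100)] + [str(y) for y in range(1990, 2031)]
SUFFIX_SYMBOLS = ["", "!", "?", "#", "$", "."]


def candidates(base_words):
    """Yield every 'word + number + symbol' variant of each base word."""
    for word in base_words:
        # Try the word as-is and with a leading capital letter.
        for form in (word.lower(), word.capitalize()):
            for number, symbol in product(SUFFIX_NUMBERS, SUFFIX_SYMBOLS):
                yield form + number + symbol


guesses = set(candidates(["password"]))
print(len(guesses))
print("Password2012!" in guesses, "Password16" in guesses)
```

With these toy rules each base word expands to fewer than two thousand guesses, and both ‘Password2012!’ and ‘Password16’ are among them.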

Now, mathematically speaking, a long password is VERY MUCH harder to guess than a shorter but more complex one.  The cartoon above illustrates this.  Selecting words in your native language, if you pick enough of them, is much, much more secure than a nearly random bunch of punctuation and capital letters.
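The arithmetic behind the cartoon can be sketched in a few lines.  This assumes the figures quoted in the comic: four words drawn uniformly from a 2048-word list, an estimate of about 28 bits for the mangled single-word password, and an attacker making 1000 guesses per second.  The function names are mine, for illustration only.

```python
import math


def passphrase_entropy_bits(list_size, n_words):
    """Entropy of n_words drawn uniformly and independently from list_size words."""
    return n_words * math.log2(list_size)


def years_to_exhaust(bits, guesses_per_second=1000):
    """Years needed to try every possibility at a fixed guess rate."""
    seconds = 2 ** bits / guesses_per_second
    return seconds / (365.25 * 24 * 3600)


four_words = passphrase_entropy_bits(2048, 4)  # 11 bits per word
mangled_word = 28                              # the comic's estimate, not computed here

print(four_words)                               # bits for the four-word passphrase
print(years_to_exhaust(four_words))             # years to exhaust the passphrase space
print(years_to_exhaust(mangled_word) * 365.25)  # days to exhaust the mangled-word space
```

Under those assumptions the mangled word falls in a matter of days, while the four-word passphrase takes centuries.  Each extra word multiplies the attacker's work by the size of the word list.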

Why are random words easier to remember?

Interestingly, such a password is actually easier for a human to remember than a complex symbol-based password.  Why?  Because human memory works in symbols.  Your memory can store between six and eight symbols in the correct order without too many transposition errors.  Four symbols is trivial to remember.  Your memory is able to store words in your native tongue as individual symbols, so a string of four random words is stored as only four symbols: easy to remember, easy to get in the right order.  However, the complex password above with punctuation symbols will be stored by your memory as several symbols: one will be the dictionary word you’ve based it on, then individual symbols for each of the modifications you made to it.  Suddenly you’re trying to store six or more symbols in your memory at once.  This is near your upper limit, and transposition errors and other mistakes creep in.

So quotes and stuff are great, right?

Actually, no.  See, it turns out that humans are absolutely terrible at picking random strings of text.  Cracking dictionaries now contain the most common quotes that people tend to use, meaning their effective strength is greatly reduced.

You’re best off selecting a number of RANDOM words.

Ok, so REALLY random words.  What do we do?

Which brings us to the following resources:

The password generator I wrote uses the General Service List, a list of roughly 2,000 commonly used English words.  The list is very similar to the one that the XKCD Generator uses, with the notable exception that my generator uses a Mersenne Twister random number generator, which produces much better quality random numbers than the Math.random() implementation in base JavaScript.  Both generators use client-side JavaScript, so they don’t record or otherwise log any of the passwords generated.

The beauty here is that it doesn’t matter if the word list is publicly available.  It doesn’t matter that the algorithm used to generate the passwords is published.  The generated passwords are still strong.
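For anyone who wants to see the idea in code, here is a minimal sketch in Python.  It is not the generator described above: the word list is a tiny placeholder rather than the General Service List, and instead of a Mersenne Twister it draws words with Python's `secrets` module, which uses the operating system's random source.  The function name and parameters are hypothetical.

```python
import secrets

# Placeholder list for illustration only; a real list would hold roughly 2,000 words.
WORDS = ["correct", "horse", "battery", "staple", "garden", "window", "yellow", "river"]


def generate_passphrase(words, n_words=4, separator=" "):
    """Pick n_words uniformly at random (with replacement) and join them."""
    return separator.join(secrets.choice(words) for _ in range(n_words))


print(generate_passphrase(WORDS))
```

The strength comes entirely from the random selection, so publishing the list and the code gives an attacker nothing beyond the size of the search space.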

Go forth and pick good passwords!
