Analyzing the XKCD Passphrase Comic
I rarely see any discussion of password strength without seeing the XKCD comic below brought up to illustrate that a long pass phrase is better than a shorter random jumble of characters. I have been arguing the same point for fifteen years, so I do agree with it, although adding a little more randomness and complexity is still necessary.
In 2006 I wrote Pafwert, a random but smart password generator, to illustrate this concept. Pass phrases are easier to remember, easier to type (we type in whole words), and are generally much stronger passwords. My philosophy has always been that length is more important than any other factor for password strength.
But not everyone agrees. Most often the argument against the pass phrase technique is that a password made up of 4 whole words isn’t much different from a 4-character password: you just need to adjust the brute-force tools to work with whole words instead of characters. While this is somewhat true, it doesn’t take much to turn this technique into something extremely effective.
How Strong are Pass Phrases?
To determine password strength, we generally determine how many passwords have similar characteristics. In other words, if finding a password is like finding a needle in a haystack, the critical question is: how big is that haystack?
To do the math on this, we need to determine how large a set of words the average English-speaking user would likely choose from. Some English language dictionaries include well over 150,000 words but most linguists agree that the average-intelligence English speaker has a vocabulary of somewhere between 7,000 and 15,000 words.
What is misleading about these numbers is that dictionary words are only a small part of our vocabulary. Consider these other non-dictionary words:
- Proper nouns such as McDonalds, Lady Gaga, Instagram, JQuery, and possibly hundreds of thousands of other words that are part of our daily vocabulary.
- Domain names like facebook.com, flickr.com, and thousands of others.
- Popular slang and social jargon (see your average Facebook post).
- Alternate spellings, leetspeek, etc.
- Acronyms such as WWW, CISPA, SSN, WWII, and SMS.
- Words from other languages
- Programming language elements and function names
- And don’t forget written-out numbers. You will never find “1,276,209” in a dictionary, and there are millions of those.
Forget dictionary words, our vocabularies are HUGE.
So how many actual words do we know? It is impossible to say, but a very conservative estimate would be a minimum of about 25,000 words. Realistically the number is much higher, but we will use 25,000 here just for illustration.
Now if we are picking 4 random words from a set of 25,000 words, the number of possible combinations is 25,000^4 or 390,625,000,000,000,000 (noted as #1 on the table below), which is about the strength of a 9-10 character alphanumeric password (see this chart). But passwords are case-sensitive and we often capitalize one of the words, so realistically we are talking about 50,000 words, or 50,000^4 = 6,250,000,000,000,000,000 possible combinations (noted by #2 on the table below), which is about as strong as a 10-11 character alphanumeric password.
What’s interesting to note is that even a 3-word phrase drawn from those 50,000 words results in 50,000^3 or 125,000,000,000,000 possibilities, so even that would be roughly equivalent to a 7-8 character alphanumeric password, which is the most commonly seen password length.
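The arithmetic above can be checked with a short Python sketch. It assumes each word is picked independently and at random, and that "alphanumeric" means the 62 characters a-z, A-Z, and 0-9; the function names are just illustrative.

```python
import math

ALPHANUMERIC = 62  # assumption: a-z, A-Z, 0-9


def phrase_space(vocabulary_size, word_count):
    """Number of possible phrases when each word is drawn independently."""
    return vocabulary_size ** word_count


def equivalent_length(space, alphabet=ALPHANUMERIC):
    """Length of a random password over `alphabet` with the same search space."""
    return math.log(space) / math.log(alphabet)


# The three cases discussed in the text: (vocabulary size, words per phrase).
for vocab, words in [(25_000, 4), (50_000, 4), (50_000, 3)]:
    space = phrase_space(vocab, words)
    print(f"{vocab:,} words x {words}: {space:,} phrases, "
          f"about {equivalent_length(space):.1f} alphanumeric characters")
```

The fractional lengths it prints fall inside the 9-10, 10-11, and 7-8 character ranges quoted above.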
Making Them Even Stronger
Most people have already developed techniques to make passwords stronger by adding some numbers or otherwise mutating a word so that it would not appear in a dictionary. That is why we often see passwords like dr@gon or freddy2000. These are very weak passwords by themselves, but if you use the same technique in a pass phrase you can make it much stronger.
Remember, we are dealing with numbers that grow exponentially so a technique that is mediocre with a short password is incredibly effective with a long password.
Now consider the following pass phrase: Picking at 200 p1ckles
Or this one: I’m alway sthe first
Or this one: How bout the 0xFC?
It’s a simple technique and a minor change, but by doing this we have greatly expanded our 50,000 words. Many password cracking tools are very good at generating word permutations and can very quickly create and try hundreds of variants of a single dictionary word. But when those variants have to be tried for each of 4 words in combination, the numbers grow very fast.
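To get a feel for how quickly variants of a single word pile up, here is a minimal Python sketch. The substitution table and the `mutations` function are hypothetical and far smaller than the rule sets real cracking tools use; it only covers a few character swaps plus optional capitalization of the first letter.

```python
from itertools import product

# Hypothetical substitution table, for illustration only.
LEET = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def mutations(word):
    """Return every variant built from per-letter substitutions,
    each with and without a capitalized first letter."""
    # For each letter, list the characters it may turn into (itself included).
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    variants = {"".join(combo) for combo in product(*choices)}
    variants |= {v.capitalize() for v in variants}
    return variants


variants = mutations("pickles")
print(len(variants), "variants of 'pickles'")
print("p1ckles" in variants)
```

Even this tiny rule set turns one seven-letter word into a few dozen candidates, and an attacker has to account for that at every position in the phrase.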
Say, for example, that for each of our original 25,000 words there are approximately 100 different mutations. That means we now potentially have a vocabulary of 2,500,000 words. And 2,500,000^4 equals 39,062,500,000,000,000,000,000,000 possible combinations of 4-word phrases (shown as #3 on the table above), which is stronger than a 14-character alphanumeric password.
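As a rough check of that figure, the following Python sketch repeats the calculation. The 100-mutations-per-word number is the same ballpark assumption made above, and the 62-character alphanumeric alphabet (a-z, A-Z, 0-9) is an assumption as well.

```python
import math

BASE_WORDS = 25_000        # vocabulary estimate used in the text
MUTATIONS_PER_WORD = 100   # rough assumption from the text
WORDS_IN_PHRASE = 4
ALPHANUMERIC = 62          # assumption: a-z, A-Z, 0-9

vocabulary = BASE_WORDS * MUTATIONS_PER_WORD
space = vocabulary ** WORDS_IN_PHRASE
# Length of a random alphanumeric password with the same search space.
equivalent = math.log(space) / math.log(ALPHANUMERIC)

print(f"{vocabulary:,} words -> {space:,} phrases")
print(f"about {equivalent:.1f} alphanumeric characters")
```

Because the mutation factor applies to every word, multiplying the vocabulary by 100 multiplies the number of 4-word phrases by 100^4, or 100,000,000.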
So yeah, the XKCD recommendation is valid. And all you have to do is add a few simple mutations to make that method far stronger.