
classical cryptography: cryptanalysis
the basics

The most important tools in the cryptanalyst's toolbox are several tendencies of languages that are generally consistent and help make plaintext 'stand out.' These allow the cryptanalyst (i.e., you) to compare the properties of ciphertext to the properties of normal plaintext. The more official name for these 'tendencies' is frequency distributions, which represent the general distribution of characters in regular writing for a language. Other frequency distributions, including those for digraphs and trigraphs, show the most common two- and three-letter combinations in a language.
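As a rough illustration of digraph and trigraph counting, here is a minimal Python sketch. The function name `ngram_counts` and the sample sentence are made up for this example, and the choice not to count combinations across spaces or punctuation is an assumption, not something the lesson specifies.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count runs of n consecutive letters, ignoring case.

    Assumption: combinations do not span word boundaries or punctuation,
    so each run of letters is scanned separately.
    """
    counts = Counter()
    word = []
    # The trailing space flushes the final word.
    for ch in text.lower() + " ":
        if "a" <= ch <= "z":
            word.append(ch)
        else:
            for i in range(len(word) - n + 1):
                counts["".join(word[i:i + n])] += 1
            word = []
    return counts


sample = "the other day they went there then"
digraphs = ngram_counts(sample, 2)    # two-letter combinations
trigraphs = ngram_counts(sample, 3)   # three-letter combinations
print(digraphs.most_common(3))
print(trigraphs.most_common(3))
```

Run on a long enough English text, the top of each list should start to resemble the common digraphs and trigraphs the lesson talks about; a one-sentence sample like this only hints at them.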

In English, the letter 'e' is by far the most used letter, and 'the' is the most common word. When analyzing messages, it is very useful to know the frequency distributions for many letters and key phrases. E, T, O, A, N, I, R, S, and H are the most common letters (in that order). If you want to remember them, just try saying etoan-irsh a few times and pretty soon you'll have it. The remaining letters, digraphs, and specific frequencies are so useful that eventually you'll memorize them subconsciously without even trying.

The chart below shows the percent frequencies of all the letters in the alphabet. This chart is based on over 4.5 million letters we counted from five famous novels (well, we had a computer count for us): The Adventures of Tom Sawyer, Great Expectations, Moby Dick, A Tale of Two Cities, and War and Peace. If you'd like to see how your data matches up to the novels, type your own text into the box and press enter. The resulting graph will let you compare the frequency distribution of your text with this large average.
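If you'd rather compute the percentages yourself, the counting can be sketched in a few lines of Python. This is not the program behind the chart; the names `letter_percentages` and `print_chart` are invented for this example, and the type hints assume Python 3.9 or later.

```python
from collections import Counter
import string


def letter_percentages(text: str) -> dict[str, float]:
    """Return the percent frequency of each letter a-z in text."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        # No letters at all: report every frequency as zero.
        return {ch: 0.0 for ch in string.ascii_lowercase}
    return {ch: 100.0 * counts[ch] / total for ch in string.ascii_lowercase}


def print_chart(percentages: dict[str, float]) -> None:
    """Print letters from most to least frequent with a rough text bar."""
    ranked = sorted(percentages.items(), key=lambda item: (-item[1], item[0]))
    for letter, pct in ranked:
        print(f"{letter} {pct:5.2f}% {'#' * round(pct)}")


print_chart(letter_percentages("Type in your own text here and compare."))
```

The more text you feed it, the closer the ranking should settle toward a stable distribution; a single sentence can look quite different from the novels.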

Normal frequency distribution



With the addition of probability and statistics, the cryptanalyst can see minor deviations, non-random and random patterns, and other telltale signs of possible encryption techniques through the abnormal (or too normal) distributions of characters and combinations of characters. For now, we'll stick to just the standard frequency distribution table and the contact chart.

In addition to the frequency distribution chart above, a contact chart will also be of great value while solving cryptograms. The first step in making a contact chart is to find the frequency distribution for the message. Then, the letters of the frequency distribution should be written down in a single column, with the most abundant letter at the top and the least abundant at the bottom. To the right of each letter in the column, the entire frequency distribution alphabet (in order, highest to lowest) should be written. Next, the cryptanalyst rereads the message and looks for each instance of the most abundant letter. Whenever that letter is found, the cryptanalyst checks to see which letters come immediately before and after it. If, for example, he finds an 'h' just before an 'e', he puts a mark above the 'h' in the row of the 'e'. And if he finds a 'k' just after an 'e', he puts a mark under the 'k' in the row of the 'e'. The cryptanalyst repeats this procedure for each letter of the column. In the end, he has produced a chart that allows him to quickly see which letters are in 'contact' with each other and in which order. All this may sound very confusing, so take a look at the example contact chart created below from the text of this paragraph.
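The procedure above can be sketched in Python as follows. This is only an approximation of the hand-drawn chart: instead of marks above and below each letter, it keeps two tallies per row. The names `contact_chart` and `print_contact_chart` are invented here, and treating spaces and punctuation as breaks in contact is an assumption.

```python
from collections import Counter, defaultdict


def contact_chart(text: str):
    """Tally, for every letter, which letters precede and follow it.

    Assumption: contacts are counted only inside words, so a space or
    punctuation mark breaks the contact.
    """
    before = defaultdict(Counter)  # before['e']['h']: times 'h' came just before 'e'
    after = defaultdict(Counter)   # after['e']['k']: times 'k' came just after 'e'
    freq = Counter()
    prev = None
    for ch in text.lower():
        if "a" <= ch <= "z":
            freq[ch] += 1
            if prev is not None:
                before[ch][prev] += 1
                after[prev][ch] += 1
            prev = ch
        else:
            prev = None
    return freq, before, after


def print_contact_chart(text: str) -> None:
    """Print one row per letter, most abundant letter first."""
    freq, before, after = contact_chart(text)
    order = [letter for letter, _ in freq.most_common()]
    for letter in order:
        above = " ".join(f"{c}{before[letter][c]}" for c in order if before[letter][c])
        below = " ".join(f"{c}{after[letter][c]}" for c in order if after[letter][c])
        print(f"{letter} ({freq[letter]})")
        print(f"    before: {above}")
        print(f"    after:  {below}")


print_contact_chart("whenever that letter is found, check the letters around it")
```

Each row lists contacts in the same highest-to-lowest order as the column, so a quick scan shows which letters a given letter tends to sit next to and on which side.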

Contact chart



In the contact chart example above, the letter 'e' is strikingly obvious from both its abundance and the way it interacts with other letters. 'e' comes into contact with many consonants easily, and the contact chart does a good job of pointing this out (other vowels are more exclusive). Also, the contact chart allows you to quickly see that 'h' precedes 'e' far more often than other letters do, while 'th' is another of the most popular digraphs. While this analysis may seem a bit silly since all of the letters in the chart represent their true identities, a contact chart becomes very useful when letters don't represent themselves. By knowing the general behavior of letters in the contact chart, you'll be able to spot them easily, even when they are masquerading as something else.
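To see why contact behavior survives a disguise, here is a small Python sketch under stated assumptions. It counts how many different letters each letter touches, then repeats the count after a toy substitution. The helper names `contact_variety` and `shift` are invented for this example, and the Caesar shift stands in for any cipher that replaces each letter with another; it is not taken from this lesson.

```python
from collections import defaultdict


def contact_variety(text: str) -> dict[str, int]:
    """Map each letter to the number of distinct letters it touches.

    Letters that never have a neighbour (one-letter words) are left out.
    """
    neighbours = defaultdict(set)
    prev = None
    for ch in text.lower():
        if "a" <= ch <= "z":
            if prev is not None:
                neighbours[ch].add(prev)
                neighbours[prev].add(ch)
            prev = ch
        else:
            prev = None
    return {letter: len(seen) for letter, seen in neighbours.items()}


def shift(text: str, key: int) -> str:
    """Toy substitution (a Caesar shift), used only to disguise the letters."""
    return "".join(
        chr((ord(ch) - 97 + key) % 26 + 97) if "a" <= ch <= "z" else ch
        for ch in text.lower()
    )


plain = "whenever the letter is seen, check the letters beside it"
cipher = shift(plain, 3)
plain_variety = contact_variety(plain)
cipher_variety = contact_variety(cipher)
# The disguised 'e' keeps the same number of distinct contacts as the real 'e'.
print(plain_variety["e"], cipher_variety[shift("e", 3)])
```

Because the substitution only renames letters, the two printed numbers match: the pattern of contacts belongs to the letter's role in the language, not to the symbol that happens to represent it.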

The frequency distribution chart and contact chart are both fundamental concepts that should be understood clearly. Of course, the best way to learn a concept is to practice it, which means that you should try making these charts on your own. Although the process may seem tedious, the results are worth it. However, in spite of the advantages of making the charts yourself, we realize that this is the '90s and everything is instant. So, we made the programs that you used above to help reduce the time. If you want to do it the old-fashioned way, we salute you! And if you want to take the easy way out and use the programs, well, it's all right, we understand (we do it sometimes, too). So, whenever we call on your cryptanalysis skills on our site, we'll provide links to these programs to help speed your work.

crypt agent challenge
And so that's the gist of cryptanalysis! Now that you've (hopefully) read through our cryptanalysis lesson and tried our cryptanalysis programs, you should be ready for your first CryptAgent Challenge. You'll receive 12 AgentPoints for successfully completing this challenge (and it's pretty easy, too!). Click here or on the button to try the CryptAnalysis Challenge!

If you're not a CryptAgent, you can find out more and register if you like. It's free, and it's fun!

