The Secret World of Codes and Code Breaking

Age 7 to 16
Article by NRICH team

Published 2004 Revised 2011

When you think of spies and secret agents, you might think of lots of things: nifty gadgets, foreign travel, dangerous missiles, fast cars and being shaken but not stirred. You probably wouldn't think of mathematics. But you should.
Cracking codes and unravelling the true meaning of secret messages involves loads of maths, from simple addition and subtraction to data handling and logical thinking. In fact, some of the most famous code breakers in history have been mathematicians who have been able to use quite simple maths to uncover plots, identify traitors and influence battles.

The Roman Geezer
Let me give you an example. More than 2000 years ago, Julius Caesar was busy taking over the world, invading countries to increase the size of the Roman Empire. He needed a way of communicating his battle plans and tactics to everyone on his side without the enemy finding out. So Caesar would write messages to his generals in code. Instead of writing the letter 'A', he would write the letter that comes three places further on in the alphabet, the letter 'D'. Instead of a 'B', he would write an 'E', instead of a 'C', he would write an 'F' and so on. When he got to the end of the alphabet, however, he would have to go right back to the beginning, so instead of an 'X', he would write an 'A', instead of a 'Y', he'd write a 'B' and instead of a 'Z', he'd write a 'C'.

Complete the table to find out how Caesar would encode the following message:

Caesar's message   A  T  T  A  C  K     A  T     D  A  W  N
                   B  U  _  _  _  _     _  _     _  _  _  _
                   C  V  _  _  _  _     _  _     _  _  _  _
Coded message      D  _  _  _  _  _     _  _     _  _  _  _

When Caesar's generals came to decipher the messages, they knew that all they had to do was go back three places in the alphabet. Have a go at trying to work out these messages which could have been sent by Caesar or his generals:

hqhpb dssurdfklqj
wkluwb ghdg
uhwuhdw wr iruhvw

Easy as 1, 2, 3
This all seems very clever, but so far it's all been letters and no numbers. So where's the maths? The maths comes if you think of the letters as numbers from 0 to 25 with A being 0, B being 1, C being 2 etc. Then encoding, shifting the alphabet forward three places, is the same as adding three to your starting number:

A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

For example, encoding the letter 'A' is 0+3=3, which is a 'D'.
Coding 'I' is: 8+3=11, which is 'L'.
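However, you do have to be careful when you get to the end of the alphabet, because there is no letter number 26, so you have to go back to number 0. For example, coding 'X' is 23+3=26, which wraps round to 0, an 'A'. In maths we call this working 'mod 26': whenever a number reaches 26 or more, we subtract 26, so instead of writing 26 we go back to 0.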
Have a go at coding your name by adding 3 to every letter. Then have a go at coding your name by shifting the alphabet forward by more places, adding greater numbers, e.g. adding 5, then adding 10. Then have a go at decoding. If your letters are numbers and encoding is addition, then decoding is subtraction, so if you've coded a message by adding 5, you will have to decode the message by subtracting 5. If the subtraction takes you below 0, add 26 to get back into the alphabet.
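If you'd like a computer to do the adding for you, here is a minimal sketch in Python of the "add a number, mod 26" idea described above. The names ALPHABET, shift_letter and caesar are just made up for this example, and it assumes the message only uses the 26 English letters (anything else, like spaces, is left as it is).

```python
import string

# 'A' is number 0, 'B' is number 1, ... 'Z' is number 25
ALPHABET = string.ascii_uppercase

def shift_letter(letter, shift):
    """Move one capital letter along by `shift` places, wrapping round with mod 26."""
    number = ALPHABET.index(letter)
    return ALPHABET[(number + shift) % 26]

def caesar(message, shift):
    """Code a whole message; characters that are not letters are left alone."""
    result = []
    for ch in message.upper():
        result.append(shift_letter(ch, shift) if ch in ALPHABET else ch)
    return "".join(result)

coded = caesar("HELLO", 3)
print(coded)              # KHOOR
print(caesar(coded, -3))  # HELLO
```

Decoding is just coding with a negative shift, which matches the idea that decoding is subtraction. The `% 26` takes care of both ends of the alphabet, so 'X', 'Y' and 'Z' become 'A', 'B' and 'C' when you add 3.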

If you've got the hang of coding messages by shifting the alphabet forward, then you might have realised that it is actually pretty simple to crack this type of code. It can easily be done just by trial and error. An enemy code breaker would only have to try out 25 different possible shifts before they were able to read your messages, which means that your messages wouldn't be secret for very long.
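Here is a rough sketch of that trial-and-error attack in Python. It simply shifts the coded message back by 1, then 2, and so on up to 25, and prints every attempt so you can spot the one that makes sense. The function names shift_back and all_shifts are invented for this example, and the coded word "KHOOR" is just a test message.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_back(message, shift):
    """Undo a shift by subtracting it from every letter (mod 26)."""
    out = ""
    for ch in message.upper():
        if ch in ALPHABET:
            out += ALPHABET[(ALPHABET.index(ch) - shift) % 26]
        else:
            out += ch  # leave spaces and punctuation alone
    return out

def all_shifts(message):
    """Return every possible decoding, one for each shift from 1 to 25."""
    return [(shift, shift_back(message, shift)) for shift in range(1, 26)]

for shift, guess in all_shifts("KHOOR"):
    print(shift, guess)
```

Only one of the 25 lines will look like a real word, and that tells you which shift was used.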
So, what about coding messages another way? Instead of writing a letter, we could write a symbol, or draw a picture. Instead of an 'A' we could write *, instead of a 'B' write + etc. For a long time, people thought this type of code would be really hard to crack. It would take the enemy far too long to figure out what letter of the alphabet each symbol stood for just by trying all the possible combinations of letters and symbols. There are 400 million billion billion possible combinations!
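To see where a number that big comes from: the first symbol could stand for any of the 26 letters, the second for any of the 25 letters that are left, the third for any of the remaining 24, and so on. Multiplying these together gives 26 x 25 x 24 x ... x 2 x 1, which mathematicians call "26 factorial". A quick check in Python (the function name count_symbol_codes is just made up for this example):

```python
import math

def count_symbol_codes(letters=26):
    """Number of ways to match each letter with a different symbol."""
    return math.factorial(letters)

total = count_symbol_codes()
print(total)  # 403291461126605635584000000
```

That is roughly 4 followed by 26 zeros, or about 400 million billion billion.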
This type of code was used by Mary Queen of Scots when she was plotting against Elizabeth the First. Mary wanted to kill Elizabeth so that she herself could become Queen of England and was sending coded messages of this sort to her co-conspirator Anthony Babington. Unfortunately for Mary, there is a very simple way of cracking this code that doesn't involve trial and error, but which does involve, surprise, surprise, maths.

Letter sent by Mary Queen of Scots to her co-conspirator Anthony Babington. Every symbol stands for a letter of the alphabet.

Letters in a language are not all used equally often: some come up much more than others. An easy experiment you can do to test this out is to get everyone in your class to raise their hand if they have the letter 'E' in their name. Then get all those with a 'Z' to raise their hand, then a 'Q', then an 'A'. You will probably find that 'E' and 'A' are more common than 'Z' and 'Q'. The graph below shows the average frequency of letters in English. To compile the information, people looked through thousands and thousands of books, magazines and newspapers, and counted the number of times each letter came up.

graph showing frequency of letters in English

In English, E is the most commonly used letter. In any piece of writing, we use E about 13% of the time on average. 'T' is the second most common letter and 'A' is the third most commonly used letter.
And it's this information that can help you to crack codes. All Elizabeth the First's Spy-Master had to do to crack Mary's code was to look through the coded message and count the number of times each symbol came up. The symbol that came up the most would probably stand for the letter 'E'. Look at our Ancient Runes problem for another code that could be deciphered by counting how often each symbol appears.
When you crack codes like this, by looking for the most common letter, it's called 'frequency analysis', and it was this clever method of cracking codes that resulted in Mary having her head cut off. CHOP!
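Counting symbols by hand takes ages, so here is a small sketch of how the counting step could be done in Python. The function name symbol_counts and the coded message are made up for this example (the message is not one of Mary's). It uses Counter from Python's standard collections module and ignores spaces.

```python
from collections import Counter

def symbol_counts(coded_message):
    """Count how often each symbol appears, most common first."""
    symbols = [ch for ch in coded_message if not ch.isspace()]
    return Counter(symbols).most_common()

# A made-up coded message, one symbol per letter
coded = "@!!? @! $!%!"

for symbol, count in symbol_counts(coded):
    print(symbol, count)
```

Here the symbol ! comes up five times, far more than any other, so a good first guess is that ! stands for 'E'. With a message this short the guess could easily be wrong; frequency analysis works best on long messages.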

Test your talents
Cracking these coded messages doesn't just involve looking for the most common symbol. You can also look for symbols that are all out on their own in the message, i.e. one-letter words. There are only two one-letter words in English, 'A' and 'I', so a lone symbol would have to stand for an 'A' or an 'I'. Another thing you can look out for is common words. The most common three-letter words in English are 'the' and 'and', so if you see a group of three symbols that comes up quite a lot, it could stand for 'the' or 'and'.
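These two tips can also be sketched in Python. The example below assumes the coded message keeps the spaces between words, and both the function name word_clues and the coded message are invented for illustration.

```python
from collections import Counter

def word_clues(coded_message):
    """Pick out lone symbols and count the three-symbol groups."""
    words = coded_message.split()
    lone = {w for w in words if len(w) == 1}           # probably 'A' or 'I'
    threes = Counter(w for w in words if len(w) == 3)  # maybe 'THE' or 'AND'
    return lone, threes.most_common()

# A made-up coded message with the word spaces left in
coded = "# &*+ ?$! @*? *%= ?$! =~^"

lone, threes = word_clues(coded)
print(lone)       # {'#'}
print(threes[0])  # ('?$!', 2)
```

The lone symbol # is probably 'A' or 'I', and the group ?$! appears twice, so it is worth trying 'THE' or 'AND' for it first.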
If you would like to test out these code breaking tips and your new code breaking talents, have a look at Simon Singh's Black Chamber. It has Caesar shift and frequency analysis puzzles for you to break, and other codes that you can try to unravel.
For more information about other secret codes that have been used throughout history, check out Simon Singh's web site. It's packed full of information about all sorts of codes, including the famous story of Enigma, the code machine used by the Germans during WWII. The Germans thought their code was unbreakable, but incredibly, British mathematicians managed to break the code and read many of the messages sent by the Germans during the war. Some historians think that having this inside information shortened the war by about two years.

After reading this, you might fancy making up some codes of your own, and writing your own secret messages. BE WARNED. Other people have also read this article and they too will be top mathematical codebreakers. Spies are everywhere, so be careful - who's reading your messages?

Claire Ellis, the author of this article, was director of the Enigma Project, which takes codes and code breaking, and a genuine WW2 Enigma machine, into the classroom. For more information contact the new director, Claire Greer, via the Enigma Schools' Project web site.