Cryptopals – Set 1: Basics
This set covers the basic concepts of cryptography. It teaches you some valuable lessons like the need to always work with raw bytes and that XOR’ing is a quite common within encryption algorithms. As the authors already mentioned, these challenges serve as a qualifier for the upcoming challenges.
Since I’m coding in C#, I’m using the Visual Studio Code IDE. In my project, I’m using a dedicated namespace to store all my challenges. My folder looks like the following.
Cryptopals/
├── bin
├── Challenges
├── Cryptopals.csproj
├── obj
└── Program.cs
My Main()
function in my Program.cs
file simply contains code that generates an object of one of my challenge classes inside the Challenges
namespace and runs the solution by calling the Solve()
function.
using System; using Cryptopals.Challenges; namespace Cryptopals { class Program { static void Main(string[] args) { // To run a specific challenge, change the class to "Challenge_#" where // "#" represents the challenge number. // If necessary, provide parameters to the Solve() method. // e.g. to run challenge 3, use "Challenge_3" as the class Challenge_1 challenge = new Challenge_1(); challenge.Solve(); } } }
We can simply change the challenge number in this code to solve another challenge. In the future, we might add parameter values to the Solve()
function.
Challenge 1: Convert hex to base64
Simple challenge: you have received a hex string and need to convert it to base64. Small notice: do NOT directly encode the hex string to base64, as you will receive a completely different result.
It is important to remember that, when working with crypto, we need to work with raw bytes. So, to solve this challenge, first decode the hex string to a byte array and encode the bytes to base64.
using System; namespace Cryptopals.Challenges { public class Challenge_1 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 1 | | Convert hex to base64 | | https://cryptopals.com/sets/1/challenges/1 | ---------------------------------------------- "; Console.WriteLine(info); } public void Solve() { PrintInfo(); string hex = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"; Console.WriteLine($"Hex string: {hex}"); // Always work with raw bytes! byte[] byte_arr = Convert.FromHexString(hex); string result = Convert.ToBase64String(byte_arr); Console.WriteLine($"Result: {result}"); } } }
Running the code with dotnet run
will give us the following output.
Challenge 2: Fixed XOR
In this challenge, we have to XOR the hex string of:
1c0111001f010100061a024b53535009181c
With the following hex string:
686974207468652062756c6c277320657965
And produce the following result:
746865206b696420646f6e277420706c6179
It’s crucial that these two strings have an equal length, as this is required for our XOR function that we will implement. This is because we will be XOR’ing each byte of one hex string with a byte of the other string. We simply cannot XOR bytes with each other if the values do not have the same length, since there wouldn’t be anything to XOR with.
We have written the following function to XOR two byte arrays with each other that have an equal-length buffer:
private byte[] XOR(byte[] ba1, byte[] ba2) { // Check if the length of each string value is the same if (ba1.Length == ba2.Length) { // Create new byte array buffer for our XORed result byte[] xor_value = new byte[ba1.Length]; for (int i = 0; i < ba1.Length; i++) { // Perform an exclusive OR for each byte in both byte arrays and cast to byte xor_value[i] = (byte)(ba1[i] ^ ba2[i]); } return xor_value; } else { Console.WriteLine("Length of values are not the same."); return null; } }
The function expects two byte arrays as arguments and will return a byte
value when both parameters have the same length. Otherwise, a null
value will be returned.
In our Solve()
function, we insert both hex strings into a variable, cast them to byte arrays and finally pass them to the XOR()
function.
public void Solve() { PrintInfo(); string val1 = "1c0111001f010100061a024b53535009181c"; string val2 = "686974207468652062756c6c277320657965"; byte[] val1_bytes = Convert.FromHexString(val1); byte[] val2_bytes = Convert.FromHexString(val2); byte[] xor_value = XOR(val1_bytes, val2_bytes); if (xor_value is not null) { string result = Convert.ToHexString(xor_value); Console.WriteLine($"Result: {result}"); } }
If we now reconfigure our Program.cs
file and enter dotnet run
in our terminal, we can see the following output.
Complete solution:
using System; namespace Cryptopals.Challenges { public class Challenge_2 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 2 | | Fixed XOR | | https://cryptopals.com/sets/1/challenges/2 | ---------------------------------------------- "; Console.WriteLine(info); } private byte[] XOR(byte[] ba1, byte[] ba2) { // Check if the length of each string value is the same if (ba1.Length == ba2.Length) { // Create new byte array buffer for our XORed result byte[] xor_value = new byte[ba1.Length]; for (int i = 0; i < ba1.Length; i++) { // Perform an exclusive OR for each byte in both byte arrays and cast to byte xor_value[i] = (byte)(ba1[i] ^ ba2[i]); } return xor_value; } else { Console.WriteLine("Length of values are not the same."); return null; } } public void Solve() { PrintInfo(); string val1 = "1c0111001f010100061a024b53535009181c"; string val2 = "686974207468652062756c6c277320657965"; byte[] val1_bytes = Convert.FromHexString(val1); byte[] val2_bytes = Convert.FromHexString(val2); byte[] xor_value = XOR(val1_bytes, val2_bytes); if (xor_value is not null) { string result = Convert.ToHexString(xor_value); Console.WriteLine($"Result: {result}"); } } } }
Challenge 3: Single-byte XOR cipher
This is the first interesting challenge of the first set. We have a hex string which has been XOR’ed with a single character, and we have to find the key.
1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736
While this already sounds pretty fun to do, this is not the most interesting part of the challenge. We know that the plaintext is a message written in English. The challenge encourages you to find the correct plaintext based on a “scoring” system. We’ll get into that very soon.
Let’s first focus on decrypting the plaintext and discovering the correct key. Since we’re working with one small hex string and a limited range of possible keys (256 possibilities to be precise), we can simply loop through every character, use it as our key and print the plaintext string.
Let’s start very simple. We have the following code in our Solve()
function.
public void Solve() { PrintInfo(); string hexstr = "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"; byte[] ciphertext = Convert.FromHexString(hexstr); // loop through each possible char on the ASCII table (256 bytes) for (int ch = 0; ch <= 255; ch++) { // Attempt to decrypt ciphertext with provided key byte[] result_ba = Decrypt(ciphertext, ch); string result = System.Text.Encoding.UTF8.GetString(result_ba, 0, result_ba.Length); Console.WriteLine("Key: {0} ({1})\tPlaintext: {2}", (char)ch, ch, result); } }
Notice how we loop through 256 integers (0-255). These are all the possible characters that are listed on the ASCII-table. We will take each character and pass it into our Decrypt()
function, along with our ciphertext and will return us another byte array. Let’s have a look at our Decrypt()
function.
private byte[] Decrypt(byte[] ct, int key) { byte[] result = new byte[ct.Length]; for (int i = 0; i < ct.Length; i++) { result[i] = (byte)(ct[i] ^ key); } return result; }
This code should look familiar to you, if you have solved the 2nd challenge. The only difference here is that we don’t check for equal buffers, as compared with the XOR function from the previous challenge. We simply loop through each byte in our byte array, XOR it with our provided key and save the result in another byte array which has the same buffer length as our byte array containing our ciphertext.
Running this code with dotnet run
provides us the following output.
[...] Key: P (80) Plaintext: Kggcafo(EK/{(dacm(i(xg}fl(gn(jikgf Key: Q (81) Plaintext: Jffb`gn)DJ.z)e`bl)h)yf|gm)fo)khjfg Key: R (82) Plaintext: Ieeacdm*GI-y*fcao*k*zedn*el*hkied Key: S (83) Plaintext: Hdd`bel+FH,x+gb`n+j+{d~eo+dm+ijhde Key: T (84) Plaintext: Occgebk,AO+,`egi,m,|cybh,cj,nmocb Key: U (85) Plaintext: Nbbfdcj-@N*~-adfh-l-}bxci-bk-olnbc Key: V (86) Plaintext: Maaeg`i.CM)}.bgek.o.~a{`j.ah.loma` Key: W (87) Plaintext: L``dfah/BL(|/cfdj/n/`zak/`i/mnl`a Key: X (88) Plaintext: Cooking MC's like a pound of bacon <--- Got 'em!! Key: Y (89) Plaintext: Bnnjhof!LB&r!mhjd!`!qntoe!ng!c`bno Key: Z (90) Plaintext: Ammikle"OA%q"nkig"c"rmwlf"md"`caml Key: [ (91) Plaintext: @llhjmd#N@$p#ojhf#b#slvmg#le#ab`lm Key: \ (92) Plaintext: Gkkomjc$IG#w$hmoa$e$tkqj`$kb$fegkj Key: ] (93) Plaintext: Fjjnlkb%HF"v%iln`%d%ujpka%jc%gdfjk Key: ^ (94) Plaintext: Eiimoha&KE!u&jomc&g&vishb&i`&dgeih [...]
Good, we found the correct plaintext and the key that was used (letter X
, decimal 88
). However, we had to scroll through some output and read every decrypted string manually to find the correct one. This isn’t really optimal and requires some automation. Enter our scoring method.
The scoring method that we implemented makes use of letter frequency in English texts. Each letter has a percentage assigned, based on occurrence in English messages. We will make use of these values in our code.
In our Scoring()
function, we start off by declaring a new dictionary that will hold our average frequency values per letter, and a score
variable which will be set to 1.
double score = 1; var letter_frequency_list = new Dictionary<char, double>(); letter_frequency_list.Add('a', 0.082); letter_frequency_list.Add('b', 0.015); letter_frequency_list.Add('c', 0.028); letter_frequency_list.Add('d', 0.043); letter_frequency_list.Add('e', 0.13); letter_frequency_list.Add('f', 0.022); letter_frequency_list.Add('g', 0.02); letter_frequency_list.Add('h', 0.061); letter_frequency_list.Add('i', 0.07); letter_frequency_list.Add('j', 0.0015); letter_frequency_list.Add('k', 0.0077); letter_frequency_list.Add('l', 0.04); letter_frequency_list.Add('m', 0.024); letter_frequency_list.Add('n', 0.067); letter_frequency_list.Add('o', 0.075); letter_frequency_list.Add('p', 0.019); letter_frequency_list.Add('q', 0.00095); letter_frequency_list.Add('r', 0.06); letter_frequency_list.Add('s', 0.063); letter_frequency_list.Add('t', 0.091); letter_frequency_list.Add('u', 0.028); letter_frequency_list.Add('v', 0.0098); letter_frequency_list.Add('w', 0.024); letter_frequency_list.Add('x', 0.0015); letter_frequency_list.Add('y', 0.02); letter_frequency_list.Add('z', 0.00074);
Now for the logic itself in our scoring function. We do a few calculations in order to gain a percentage on how certain we are that we have a valid English plaintext. In theory, 100% would mean that we are absolutely sure that we have an English message and 0% (or even lower) would mean that we’re absolutely sure that we’re not working with an English text. But our scoring implementation will never be sure (which is normal).
Let’s have a look at the code.
// Iterate over the frequency list foreach(var item in letter_frequency_list) { char letter = item.Key; double avg_letter_freq = item.Value; // Count occurrences of each letter int letter_count = text.Count(ch => (ch == letter)); // If the letter is present, calculate the frequency if (letter_count > 0) { double letter_freq = letter_count / (double)text.Length; if (letter_freq > avg_letter_freq) { score -= letter_freq - avg_letter_freq; } else { score -= avg_letter_freq - letter_freq; } } // Otherwise, subtract letter avg. from total score else { score -= avg_letter_freq; } } // Return percentage accuracy return score * 100; }
We’ll loop through each if the items in our letter_frequency_list
. In our loop, we count the occurrences of each letter in our plaintext. If at least one letter has been found, then we will divide our letter count with the length of the plaintext. This gives us the actual letter frequency for that plaintext.
Now, to calculate the actual score, we will be calculating the offset from the average frequency of the letter. If the calculated letter frequency is higher than the average frequency, then we will deduct the average letter frequency from our own calculated letter frequency. Otherwise, we’ll deduct our calculated frequency from the average. This produces the offset from the average letter frequency. That offset will be deducted from the total score, which starts at one.
The idea is simple: the lower the offset we have, the better. The end score that is closest to 100% is most likely the correct English plaintext.
If no occurrences were found of that letter, then we simply deduct the average letter frequency of the total score for that particular letter. Finally, we return the percentage to our Solve()
function.
If we now print the score as well for each decrypted plaintext, we can see the following output. (the below scores were from a previous algorithm, but are still relatively accurate).
[...] Key: S (83) Plaintext: Hdd`bel+FH,x+gb`n+j+{d~eo+dm+ijhde Score: 58.40453% Key: T (84) Plaintext: Occgebk,AO+,`egi,m,|cybh,cj,nmocb Score: 61.34571% Key: U (85) Plaintext: Nbbfdcj-@N*~-adfh-l-}bxci-bk-olnbc Score: 64.28688% Key: V (86) Plaintext: Maaeg`i.CM)}.bgek.o.~a{`j.ah.loma` Score: 52.52218% Key: W (87) Plaintext: L``dfah/BL(|/cfdj/n/`zak/`i/mnl`a Score: 49.581% Key: X (88) Plaintext: Cooking MC's like a pound of bacon Score: 70.16924% <-- Key: Y (89) Plaintext: Bnnjhof!LB&r!mhjd!`!qntoe!ng!c`bno Score: 64.28688% Key: Z (90) Plaintext: Ammikle"OA%q"nkig"c"rmwlf"md"`caml Score: 67.22806% Key: [ (91) Plaintext: @llhjmd#N@$p#ojhf#b#slvmg#le#ab`lm Score: 67.22806% Key: \ (92) Plaintext: Gkkomjc$IG#w$hmoa$e$tkqj`$kb$fegkj Score: 67.22806% [...]
In our final code, we stored the best score along with it’s respective plaintext inside two dedicated variables. We checked after each iteration if our score was higher than the last one and if so, we would store the new results in these variables until we have completed the loop.
Final code:
using System; using System.Collections.Generic; using System.Linq; namespace Cryptopals.Challenges { public class Challenge_3 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 3 | | Single-byte XOR cipher | | https://cryptopals.com/sets/1/challenges/3 | ---------------------------------------------- "; Console.WriteLine(info); } private double Scoring(string text) { double score = 1; var letter_frequency_list = new Dictionary<char, double>(); letter_frequency_list.Add('a', 0.082); letter_frequency_list.Add('b', 0.015); letter_frequency_list.Add('c', 0.028); letter_frequency_list.Add('d', 0.043); letter_frequency_list.Add('e', 0.13); letter_frequency_list.Add('f', 0.022); letter_frequency_list.Add('g', 0.02); letter_frequency_list.Add('h', 0.061); letter_frequency_list.Add('i', 0.07); letter_frequency_list.Add('j', 0.0015); letter_frequency_list.Add('k', 0.0077); letter_frequency_list.Add('l', 0.04); letter_frequency_list.Add('m', 0.024); letter_frequency_list.Add('n', 0.067); letter_frequency_list.Add('o', 0.075); letter_frequency_list.Add('p', 0.019); letter_frequency_list.Add('q', 0.00095); letter_frequency_list.Add('r', 0.06); letter_frequency_list.Add('s', 0.063); letter_frequency_list.Add('t', 0.091); letter_frequency_list.Add('u', 0.028); letter_frequency_list.Add('v', 0.0098); letter_frequency_list.Add('w', 0.024); letter_frequency_list.Add('x', 0.0015); letter_frequency_list.Add('y', 0.02); letter_frequency_list.Add('z', 0.00074); // Iterate over the frequency list foreach(var item in letter_frequency_list) { char letter = item.Key; double avg_letter_freq = item.Value; // Count occurrences of each letter int letter_count = text.Count(ch => (ch == letter)); // If the letter is present, calculate the frequency if (letter_count > 0) { double letter_freq = letter_count / (double)text.Length; if (letter_freq > avg_letter_freq) { score -= letter_freq - avg_letter_freq; } else { score -= avg_letter_freq - letter_freq; } } // Otherwise, subtract letter avg. from total score else { score -= avg_letter_freq; } } // Return percentage accuracy return score * 100; } private byte[] Decrypt(byte[] ct, int key) { byte[] result = new byte[ct.Length]; for (int i = 0; i < ct.Length; i++) { result[i] = (byte)(ct[i] ^ key); } return result; } public void Solve() { PrintInfo(); string hexstr = "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"; byte[] ciphertext = Convert.FromHexString(hexstr); // Variables to store the best result string plaintext = ""; double best_score = 0.0; int key = 0; // loop through each possible char on the ASCII table (256 bytes) for (int ch = 0; ch <= 255; ch++) { // Attempt to decrypt ciphertext with provided key byte[] result_ba = Decrypt(ciphertext, ch); string result = System.Text.Encoding.UTF8.GetString(result_ba, 0, result_ba.Length); // Feed decrypted string to our scoring function double score = Scoring(result); if (score > best_score) { best_score = score; plaintext = result; key = ch; } } Console.WriteLine("Plaintext:\t{0}\nKey found:\t{1} ({2})\nScore:\t\t{3}%", plaintext, (char)key, key, best_score); } } }
Result:
Challenge 4: Detect single-character XOR
Same exercise as challenge 3, but with more ciphertexts. This time, we need to heavily rely on our code for the scoring system, as this function needs to determine if we’re working with English texts or not. If this doesn’t work as expected, you’re busted.
The file with the hex strings that they provided can be loaded with two options:
- You download the file from within your code and read the lines in memory
- You load the file from your local disk
In my case, I downloaded the file beforehand to disk and saved it in a new directory called Data
. I kept the original name of the file: 4.txt
. My directory structure now looks like this.
Cryptopals/
├── bin
├── Challenges
├── Data <-- NEW
├── Cryptopals.csproj
├── obj
└── Program.cs
This challenge uses 95% of the code of the previous challenge. The only difference now is that we first loop through each line and then loop through each character. The complete solution can be seen below.
using System; using System.Collections.Generic; using System.Linq; namespace Cryptopals.Challenges { public class Challenge_4 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 4 | | Detect single-character XOR | | https://cryptopals.com/sets/1/challenges/4 | ---------------------------------------------- "; Console.WriteLine(info); } private double Scoring(string text) { double score = 1; var letter_frequency_list = new Dictionary<char, double>(); letter_frequency_list.Add('a', 0.082); letter_frequency_list.Add('b', 0.015); letter_frequency_list.Add('c', 0.028); letter_frequency_list.Add('d', 0.043); letter_frequency_list.Add('e', 0.13); letter_frequency_list.Add('f', 0.022); letter_frequency_list.Add('g', 0.02); letter_frequency_list.Add('h', 0.061); letter_frequency_list.Add('i', 0.07); letter_frequency_list.Add('j', 0.0015); letter_frequency_list.Add('k', 0.0077); letter_frequency_list.Add('l', 0.04); letter_frequency_list.Add('m', 0.024); letter_frequency_list.Add('n', 0.067); letter_frequency_list.Add('o', 0.075); letter_frequency_list.Add('p', 0.019); letter_frequency_list.Add('q', 0.00095); letter_frequency_list.Add('r', 0.06); letter_frequency_list.Add('s', 0.063); letter_frequency_list.Add('t', 0.091); letter_frequency_list.Add('u', 0.028); letter_frequency_list.Add('v', 0.0098); letter_frequency_list.Add('w', 0.024); letter_frequency_list.Add('x', 0.0015); letter_frequency_list.Add('y', 0.02); letter_frequency_list.Add('z', 0.00074); // Iterate over the frequency list foreach(var item in letter_frequency_list) { char letter = item.Key; double avg_letter_freq = item.Value; // Count occurrences of each letter int letter_count = text.Count(ch => (ch == letter)); // If the letter is present, calculate the frequency if (letter_count > 0) { double letter_freq = letter_count / (double)text.Length; if (letter_freq > avg_letter_freq) { score -= letter_freq - avg_letter_freq; } else { score -= avg_letter_freq - letter_freq; } } // Otherwise, subtract letter avg. from total score else { score -= avg_letter_freq; } } // Return percentage accuracy return score * 100; } private byte[] Decrypt(byte[] ct, int key) { byte[] result = new byte[ct.Length]; for (int i = 0; i < ct.Length; i++) { result[i] = (byte)(ct[i] ^ key); } return result; } public void Solve() { PrintInfo(); string data_file = "./Data/4.txt"; // Variables to store the best result string plaintext = ""; double best_score = 0.0; int key = 0; foreach (string line in System.IO.File.ReadLines(data_file)) { byte[] ciphertext = Convert.FromHexString(line); // loop through each possible char on the ASCII table (256 bytes) for (int ch = 0; ch <= 255; ch++) { // Attempt to decrypt ciphertext with provided key byte[] result_ba = Decrypt(ciphertext, ch); string result = System.Text.Encoding.UTF8.GetString(result_ba, 0, result_ba.Length); // Feed decrypted string to our scoring function double score = Scoring(result); if (score > best_score) { best_score = score; plaintext = result; key = ch; } } } Console.WriteLine("Plaintext:\t{0}\nKey found:\t{1} ({2})\nScore:\t\t{3}%\n", plaintext, (char)key, key, best_score); } } }
Challenge 5: Implement repeating-key XOR
This one is another short challenge that — in comparison with the last two challenges — doesn’t require a lot of code to solve. The goal here is to encrypt the below message with the key ICE
, using repeating-key XOR.
Burning 'em, if you ain't quick and nimble I go crazy when I hear a cymbal
In our main function, we simply compile the plaintext message and key, and convert them to byte arrays. Then we pass both byte arrays to our XOR()
function and convert the results back to a hex string.
public void Solve() { PrintInfo(); byte[] key_ba = System.Text.Encoding.UTF8.GetBytes("ICE"); string plaintext = ""; plaintext += "Burning 'em, if you ain't quick and nimble\n"; plaintext += "I go crazy when I hear a cymbal"; byte[] pt_ba = System.Text.Encoding.UTF8.GetBytes(plaintext); byte[] result = XOR(pt_ba, key_ba); string hex = BitConverter.ToString(result).Replace("-", ""); Console.WriteLine("Plaintext:\n{0}\n\nEncrypted hex string:\n{1}", plaintext, hex); }
Our XOR()
function is a little bit different than the previous encryption / decryption functions. Now we simply provide the message and key as byte arrays, loop through the key’s bytes and use them one by one to encrypt each byte of the message.
private byte[] XOR(byte[] ct, byte[] key) { byte[] result = new byte[ct.Length]; int key_index = 0; for (int i = 0; i < ct.Length; i++) { // Shorthand if-statement to reset index of key byte array key_index = key_index > key.Length - 1 ? key_index = 0 : key_index; result[i] = (byte)(ct[i] ^ key[key_index]); key_index++; } return result; }
Running dotnet run
provides us with the following output. The hex string that we see on screen is identical to the one from the challenge.
Complete source code:
using System; namespace Cryptopals.Challenges { public class Challenge_5 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 5 | | Implement repeating-key XOR | | https://cryptopals.com/sets/1/challenges/5 | ---------------------------------------------- "; Console.WriteLine(info); } private byte[] XOR(byte[] ct, byte[] key) { byte[] result = new byte[ct.Length]; int key_index = 0; for (int i = 0; i < ct.Length; i++) { // Shorthand if-statement to reset index of key byte array key_index = key_index > key.Length - 1 ? key_index = 0 : key_index; result[i] = (byte)(ct[i] ^ key[key_index]); key_index++; } return result; } public void Solve() { PrintInfo(); byte[] key_ba = System.Text.Encoding.UTF8.GetBytes("ICE"); string plaintext = ""; plaintext += "Burning 'em, if you ain't quick and nimble\n"; plaintext += "I go crazy when I hear a cymbal"; byte[] pt_ba = System.Text.Encoding.UTF8.GetBytes(plaintext); byte[] result = XOR(pt_ba, key_ba); string hex = BitConverter.ToString(result).Replace("-", ""); Console.WriteLine("Plaintext:\n{0}\n\nEncrypted hex string:\n{1}", plaintext, hex); } } }
Challenge 6: Break repeating-key XOR
Now the fun really begins.
Disclaimer: the write-up for this challenge is extremely detailed. It’s quite a long read. You have been warned. If you just want to see the complete code for this challenge and skip all the explanations, scroll down to the bottom.
We start off with a file that contains a base64-encoded ciphertext. We know it has been encrypted with XOR, but we don’t know what the key is and what the size of the key is. The goal is to decrypt it.
The authors provided some instructions on how to tackle this challenge, but it lacked some info in my opinion. It was a lot of trial & error when I was solving this challenge. I will try to provide as much detail as possible on my solution and my thought process.
This challenge can be divided up into three steps:
- Retrieve key size
- Retrieve XOR key
- Decrypt ciphertext
Let’s delve deeper now into each step.
Step 1: Retrieve key size
In our Solve()
function, we start off by storing the key sizes and ciphertext into variables, and save them as byte arrays.
int min_keysize = 2; int max_keysize = 40; // Read and convert data from file string data_file = "./Data/6.txt"; string data = String.Join("", System.IO.File.ReadAllLines(data_file)); byte[] ciphertext = Convert.FromBase64String(data);
Next, we pass our ciphertext, minimum key size and maximum key size to our GetKeySize()
function, which needs us to return the correct key size.
// Retrieve key size int keysize = GetKeySize(ciphertext, min_keysize, max_keysize);
Let’s now talk about our GetKeySize()
function and what it will do, before jumping into the code.
At the start of our Solve()
function, we’ve set a minimum and a maximum key size value. In our case, we’ve set these values to 2 and 40, respectively. We will loop through these values and do calculations on the ciphertext.
Before we do that, we need to be able to calculate the Hamming Distance between two string. The Hamming Distance (or simply edit distance) is the difference in bits between two values. Nothing more. It is also not complex to code. The authors provided an example and said that the Hamming Distance between the strings this is a test
and wokka wokka!!!
is 37
.
Calculating the Hamming Distance is important for detecting the key size, because it measures the minimum amount of substitutions that is required to change one value into another value. The lower this value is for a certain key size, the more likely that that key size is the correct size for the XOR key that was used for the encryption.
Let’s give an example on how the Hamming Distance is calculated. Let’s say we have the following short strings.
"wah"
"run"
To calculate the difference between these strings, we need to break them down into bits.
The bits for wah
are the following:
Characters | “w” | “a” | “h” |
Decimal value | 119 | 97 | 104 |
bits | 0111 0111 | 0110 0001 | 0110 1000 |
The bits for run
are the following:
Characters | “r” | “u” | “n” |
Decimal value | 114 | 117 | 110 |
bits | 0111 0010 | 0111 0111 | 0110 1110 |
If we now XOR the bits of each character with each other and take note of each differing bit, then we simply add up the occurrences of different bits to get the Hamming Distance.
“wah” | 0111 0111 | 0110 0001 | 0110 1000 |
“run” | 0111 0010 | 0111 0111 | 0110 1110 |
XOR | 0000 0101 | 0001 0110 | 0000 0110 |
diifering bits | 2 | 3 | 2 |
If we look at the above table, we can see that there are a total of 7 differing bits. So the Hamming Distance for these two strings is equal to 7. Notice that, when we have XOR’ed both values, we’re actually counting the 1’s. This is because XOR only sets the bit to 1 for each difference.
In our code, we will loop through the minimum and maximum key size values that we’ve set. That means that we’ll start at key size 2 and end at 40. For each looped key size, we will break down the ciphertext into blocks of the looped keysize. Next, we’ll calculate the Hamming Distance for each pair of blocks. So we’ll calculate the H.D. for the first and second block, then for the third and fourth block, and so on. We’ll keep doing this in pairs throughout the whole loop, regardless of the looped key size.
After we’ve looped through all the block for a certain key size, we will add up all the Hamming Distances for all the block pairs and divide them by them amount of loops that we did. This is to get our Average Hamming Distance. Lastly, we need to normalize this average by dividing it with the looped key size. This results in the Normalized Avg. Hamming Distance, which serves as our final result for the looped size key. This result will be stored inside a dictionary, along with the key size that it belongs to.
Ok, we’ve done enough explanations now. Let’s jump into the code for the GetKeySize()
function.
We start off by setting some initial values and by doing a check that the provided max key size does not exceed more than half of the ciphertext. Otherwise, we’ll set the max value to exactly the half of the ciphertext. We do this because we need a minimum of 2 blocks with equal buffer lengths to calculate the Hamming Distance. And no, the division will never provide me with numbers that contains a decimal. In C#, int
/ int
= int
, and not a double
or float
:).
int keysize = 0; double keysize_score = 0.0; var score_list = new Dictionary<int, double>(); // Adjust max keysize if it is larger than half of the ciphertext if (max_ks > (ct.Length / 2)) { max_ks = ct.Length / 2; Console.WriteLine("[!] Max keysize is larger than half of the ciphertext. Keysize adjusted to a max of {0}", max_ks); }
Next, we’re going to loop through our key sizes. We first check if there is a remainder when we divide the ciphertext length with the looped key size. If there is a remainder, we subtract it from the ciphertext length when looping through the ciphertext bytes. We do this because we need equal buffer lengths for our blocks. We cannot calculate the Hamming Distance between say a block of 5 bytes and a block of 3 bytes.
In our loop, we’re going to iterate over the complete ciphertext and increment our index each time with (key size * 2). We multiply by two because we require 2 pair of blocks each iteration. We grab the first and second block on the first iteration, calculate the Hamming Distance for these blocks and add them up to the avg_hd
variable. Finally, we’ll increment our counter
by 1.
After the loop, we calculate our avg_hd
by dividing that value with counter
. Finally, we normalize the avg_hd
value, divide it with our key size (casted to a double
) and save our final result for that key size in norm_avg_hd
. We save this value and it’s key size into the score_list
dictionary.
// Loop through key sizes for (int i = min_ks; i <= max_ks; i++) { int block_remainder = ct.Length % i; double avg_hd = 0.0; int counter = 0; // Loop through ciphertext and create blocks // increment by keysize * amount of blocks used for HD for (int j = 0; j < (ct.Length - block_remainder - i); j += (i * 2)) { // Divide up ciphertext blocks byte[] ct_block1 = ct[j..(j + i)]; byte[] ct_block2 = ct[(j + i)..(j + (i * 2))]; // Calculate Hamming distance double hd = HammingDistance(ct_block1, ct_block2); avg_hd += hd; counter++; } // Get average H.D. and normalize avg_hd = avg_hd / counter; double norm_avg_hd = avg_hd / (double)i; score_list.Add(i, norm_avg_hd); }
Our logic for calculating the Hamming Distance between two byte arrays is as follows:
private int HammingDistance(byte[] ba1, byte[] ba2) { int hamming_distance = 0; // Check if buffer lengths are the same if (ba1.Length == ba2.Length) { // Get bits from both byte arrays BitArray bitArray1 = new BitArray(ba1); BitArray bitArray2 = new BitArray(ba2); // loop through each bit and check if different for (int i = 0; i < bitArray1.Length; i++) { if (bitArray1[i] != bitArray2[i]) { hamming_distance++; } } return hamming_distance; } else { // Return -1 if lengths are not equal Console.WriteLine("Cannot calculate Hamming distance. Invalid buffer lengths."); return -1; } }
After we have completed our outer loop, we sort our score_list
based on the score and store it in a new dictionary called sorted_score_list
. We then grab the key size on top and it’s score. This key size should be the most likely key size.
// Sort list by score var sorted_score_list = score_list.OrderBy(kvp => kvp.Value); keysize = sorted_score_list.ElementAt(0).Key; keysize_score = sorted_score_list.ElementAt(0).Value;
If we debug our program and set a breakpoint right after creating the sorted list, we can then see the following values.
It seems that the key size of 29
is the most probable key size of the XOR key. Spoiler alert: yes, this is the correct key size. But the algorithm is currently not yet fool-proof. It might still generate some false positives in its current state. That’s why we added some additional logic to it.
What I’m about to write is not something that I’ve come up with myself. I took that inspiration from another blog post. It involves calculating the greatest common divisors (GCD) for the top 3 key sizes. You’d calculate the GCD for the 1st and 2nd top key sizes, for the 1st and 3rd top key sizes and for the 2nd and 3rd top key sizes. To get a better understanding of why this is done, I highly recommend you to read this blog in more detail.
We’ll follow the following logic (copy / pasted from the aforementioned blog):
- GCD12 is not one and
- GCD12 is a key size in the top n key sizes and
- GCD12 is equal to GCD13 and GCD23 or
- GCD12 is equal to either the first or second most probable key size
- If the above conditions are not met, the script returns the key size with the smallest NAvgHD as the most probable key size.
I’ve implemented this logic in my code like this.
// Filter out false positives by calculating GCD of top 3 scores int top_keysize_1 = sorted_score_list.ElementAt(0).Key; int top_keysize_2 = sorted_score_list.ElementAt(1).Key; int top_keysize_3 = sorted_score_list.ElementAt(2).Key; int gcd_1_2 = GCD(top_keysize_1, top_keysize_2); int gcd_1_3 = GCD(top_keysize_1, top_keysize_3); int gcd_2_3 = GCD(top_keysize_2, top_keysize_3); // check if GCD_1_2 is not one if (gcd_1_2 != 1) { // check if GCD_1_2 is one of the top key sizes if (gcd_1_2 == top_keysize_1 || gcd_1_2 == top_keysize_2 || gcd_1_2 == top_keysize_3) { if (gcd_1_2 == gcd_1_3 && gcd_1_2 == gcd_2_3) { keysize = gcd_1_2; } else if (gcd_1_2 == top_keysize_1 || gcd_1_2 == top_keysize_2) { keysize = gcd_1_2; } } }
Yes, I’ve set the top key size AGAIN in a new variable while it was already in the keysize
variable. Yes, I know it’s not best practice. Roast me all you want. This is more readable to me :).
The GCD()
function that I’ve used in the above code block looks like this, and is very simply to implement. This GCD function is based on Euclidean’s method.
private int GCD(int a, int b) { return b == 0 ? a : GCD(b, a % b); }
That’s basically it. After that we should have the correct key size for the XOR key, and return the discovered size back to our Solve()
function.
Step 2: Retrieve XOR Key
Now that we have discovered the correct key size, we can start focusing on retrieving the actual key. Conceptually, this isn’t really hard to do. We’re going to do the following:
- Divide the ciphertext into blocks of the discovered key size
- Transpose the blocks: create a block that is the first byte of every block, the second byte of every block, etc.
- Solve each transposed block as if it were single-char XOR
- Give a score to each plaintext for each block (best plaintext = correct char for XOR key)
Say we have the following ciphertext bytes.
And we have a a key size of 5. Then we’re going to divide up the ciphertext in blocks of 5. The last block may contain less bytes than 5 if the ciphertext length is not dividable by 5, which is the case here (we have a remainder of 4).
Next, we’re going to transpose the blocks. We’ll take the first byte of each block and create a separate block out of it.
Then we do the same for the 2nd byte, then the 3rd byte, until we’ve reached the 5th and last byte.
Finally, we’re going to brute force each transposed block as if it were single-character XOR. Let’s have a look at the code now for the GetXORKey()
function.
We start off by setting some initial values that are later on needed in the code. We would like to know the amount of blocks that need to be created and the size of the last block, as it may not be equal to the key size.
var blocks = new List<byte[]>(); int amount_of_blocks = ct.Length / ks; int size_last_block = ct.Length % ks;
Next, we’ll loop through the amount of calculated blocks and add them to our blocks
dictionary. Afterwards, we’ll add the last block to this dictionary as well if the remainder is not equal to zero.
// Loop through amount of calculated blocks for (int i = 0; i < amount_of_blocks; i++) { blocks.Add(ct[(i * ks)..((i * ks) + ks)]); } // Last block can be smaller than keysize if (size_last_block != 0) { blocks.Add(ct[(ct.Length - size_last_block)..ct.Length]); }
We now have divided up our ciphertext into blocks of the length of our key size. Now we are going to transpose the blocks. We will first loop from 0 to key size (29 in this case), create a dedicated list t_block
and then loop through each byte array in blocks
. The index value from our outer loop will be used to get the correct byte from each block and place it in the t_block
list. When we went through all the blocks, we add the t_block
list to the transposed_blocks
list, which contains all the transposed blocks.
The try / catch
is there because the last byte array might not contain an equal buffer length. If we hit an IndexOutOfRangeException
, we know that the last byte array does not contain a byte at the specified index, and we can simply continue with our loop by breaking the current loop.
// Transpose blocks var transposed_blocks = new List<byte[]>(); for (int i = 0; i < ks; i++) { var t_block = new List<byte>(); foreach (byte[] block in blocks) { try { t_block.Add(block[i]); } // Again, last block may be smaller catch (IndexOutOfRangeException) { break; } } transposed_blocks.Add(t_block.ToArray()); }
When the transposed blocks are created, we simply brute-force each block as if it were single-char XOR. We already have the code to do this from previous exercises.
// brute-force each block with single-byte XOR var key = new List<byte>(); foreach (byte[] block in transposed_blocks) { int counter = 0; double best_score = 0.0; int best_key = 0; string best_pt = ""; for (int i = 0; i < 256; i++) { // Add integer as byte to new byte array byte[] xor_key = new byte[] {(byte)i}; byte[] pt_ba = XOR(block, xor_key); string plaintext = System.Text.Encoding.UTF8.GetString(pt_ba, 0, pt_ba.Length); // Get score double score = Scoring(plaintext); if (score > best_score) { best_score = score; best_key = i; best_pt = plaintext; } } key.Add((byte)best_key); counter++; }
Our XOR()
function looks like this:
private byte[] XOR(byte[] ct, byte[] key) { byte[] result = new byte[ct.Length]; int key_index = 0; for (int i = 0; i < ct.Length; i++) { // Shorthand if-statement to reset index of key byte array key_index = key_index > key.Length - 1 ? key_index = 0 : key_index; result[i] = (byte)(ct[i] ^ key[key_index]); key_index++; } return result; }
And, once again, our Scoring()
function (which is exactly the same as the previous exercises).
private double Scoring(string text) { double score = 1; var letter_frequency_list = new Dictionary<char, double>(); letter_frequency_list.Add('a', 0.082); letter_frequency_list.Add('b', 0.015); letter_frequency_list.Add('c', 0.028); letter_frequency_list.Add('d', 0.043); letter_frequency_list.Add('e', 0.13); letter_frequency_list.Add('f', 0.022); letter_frequency_list.Add('g', 0.02); letter_frequency_list.Add('h', 0.061); letter_frequency_list.Add('i', 0.07); letter_frequency_list.Add('j', 0.0015); letter_frequency_list.Add('k', 0.0077); letter_frequency_list.Add('l', 0.04); letter_frequency_list.Add('m', 0.024); letter_frequency_list.Add('n', 0.067); letter_frequency_list.Add('o', 0.075); letter_frequency_list.Add('p', 0.019); letter_frequency_list.Add('q', 0.00095); letter_frequency_list.Add('r', 0.06); letter_frequency_list.Add('s', 0.063); letter_frequency_list.Add('t', 0.091); letter_frequency_list.Add('u', 0.028); letter_frequency_list.Add('v', 0.0098); letter_frequency_list.Add('w', 0.024); letter_frequency_list.Add('x', 0.0015); letter_frequency_list.Add('y', 0.02); letter_frequency_list.Add('z', 0.00074); // Iterate over the frequency list foreach(var item in letter_frequency_list) { char letter = item.Key; double avg_letter_freq = item.Value; // Count occurrences of each letter int letter_count = text.Count(ch => (ch == letter)); // If the letter is present, calculate the frequency if (letter_count > 0) { double letter_freq = letter_count / (double)text.Length; if (letter_freq > avg_letter_freq) { score -= letter_freq - avg_letter_freq; } else { score -= avg_letter_freq - letter_freq; } } // Otherwise, subtract letter avg. from total score else { score -= avg_letter_freq; } } // Return percentage accuracy return score * 100; }
This concludes all the code to retrieve the XOR key. If we now run the code that we currently have, we can see that we retrieve the following XOR key.
Terminator X: Bring the noise
Step 3: Decrypt the ciphertext
Now that we have every piece of the puzzle to solve this challenge, we can finally decrypt the ciphertext. The complete code for this challenge can be seen below.
using System; using System.Collections; using System.Collections.Generic; using System.Linq; namespace Cryptopals.Challenges { public class Challenge_6 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 6 | | Break repeating-key XOR | | https://cryptopals.com/sets/1/challenges/6 | ---------------------------------------------- "; Console.WriteLine(info); } private double Scoring(string text) { double score = 1; var letter_frequency_list = new Dictionary<char, double>(); letter_frequency_list.Add('a', 0.082); letter_frequency_list.Add('b', 0.015); letter_frequency_list.Add('c', 0.028); letter_frequency_list.Add('d', 0.043); letter_frequency_list.Add('e', 0.13); letter_frequency_list.Add('f', 0.022); letter_frequency_list.Add('g', 0.02); letter_frequency_list.Add('h', 0.061); letter_frequency_list.Add('i', 0.07); letter_frequency_list.Add('j', 0.0015); letter_frequency_list.Add('k', 0.0077); letter_frequency_list.Add('l', 0.04); letter_frequency_list.Add('m', 0.024); letter_frequency_list.Add('n', 0.067); letter_frequency_list.Add('o', 0.075); letter_frequency_list.Add('p', 0.019); letter_frequency_list.Add('q', 0.00095); letter_frequency_list.Add('r', 0.06); letter_frequency_list.Add('s', 0.063); letter_frequency_list.Add('t', 0.091); letter_frequency_list.Add('u', 0.028); letter_frequency_list.Add('v', 0.0098); letter_frequency_list.Add('w', 0.024); letter_frequency_list.Add('x', 0.0015); letter_frequency_list.Add('y', 0.02); letter_frequency_list.Add('z', 0.00074); // Iterate over the frequency list foreach(var item in letter_frequency_list) { char letter = item.Key; double avg_letter_freq = item.Value; // Count occurrences of each letter int letter_count = text.Count(ch => (ch == letter)); // If the letter is present, calculate the frequency if (letter_count > 0) { double letter_freq = letter_count / (double)text.Length; if (letter_freq > avg_letter_freq) { score -= letter_freq - avg_letter_freq; } else { score -= avg_letter_freq - letter_freq; } } // Otherwise, subtract letter avg. from total score else { score -= avg_letter_freq; } } // Return percentage accuracy return score * 100; } private byte[] XOR(byte[] ct, byte[] key) { byte[] result = new byte[ct.Length]; int key_index = 0; for (int i = 0; i < ct.Length; i++) { // Shorthand if-statement to reset index of key byte array key_index = key_index > key.Length - 1 ? key_index = 0 : key_index; result[i] = (byte)(ct[i] ^ key[key_index]); key_index++; } return result; } private int HammingDistance(byte[] ba1, byte[] ba2) { int hamming_distance = 0; // Check if buffer lengths are the same if (ba1.Length == ba2.Length) { // Get bits from both byte arrays BitArray bitArray1 = new BitArray(ba1); BitArray bitArray2 = new BitArray(ba2); // loop through each bit and check if different for (int i = 0; i < bitArray1.Length; i++) { if (bitArray1[i] != bitArray2[i]) { hamming_distance++; } } return hamming_distance; } else { // Return -1 if lengths are not equal Console.WriteLine("Cannot calculate Hamming distance. Invalid buffer lengths."); return -1; } } private int GCD(int a, int b) { return b == 0 ? a : GCD(b, a % b); } private int GetKeySize(byte[] ct, int min_ks, int max_ks) { Console.WriteLine("[*] Retrieving key size ..."); int keysize = 0; double keysize_score = 0.0; var score_list = new Dictionary<int, double>(); // Adjust max keysize if it is larger than half of the ciphertext if (max_ks > (ct.Length / 2)) { max_ks = ct.Length / 2; Console.WriteLine("[!] Max keysize is larger than half of the ciphertext. Keysize adjusted to a max of {0}", max_ks); } // Loop through key sizes for (int i = min_ks; i <= max_ks; i++) { int block_remainder = ct.Length % i; double avg_hd = 0.0; int counter = 0; // Loop through ciphertext and create blocks // increment by keysize * amount of blocks used for HD for (int j = 0; j < (ct.Length - block_remainder - i); j += (i * 2)) { // Divide up ciphertext blocks byte[] ct_block1 = ct[j..(j + i)]; byte[] ct_block2 = ct[(j + i)..(j + (i * 2))]; // Calculate Hamming distance double hd = HammingDistance(ct_block1, ct_block2); avg_hd += hd; counter++; } // Get average H.D. and normalize avg_hd = avg_hd / counter; double norm_avg_hd = avg_hd / (double)i; score_list.Add(i, norm_avg_hd); } // Sort list by score var sorted_score_list = score_list.OrderBy(kvp => kvp.Value); keysize = sorted_score_list.ElementAt(0).Key; keysize_score = sorted_score_list.ElementAt(0).Value; // Filter out false positives by calculating GCD of top 3 scores int top_keysize_1 = sorted_score_list.ElementAt(0).Key; int top_keysize_2 = sorted_score_list.ElementAt(1).Key; int top_keysize_3 = sorted_score_list.ElementAt(2).Key; int gcd_1_2 = GCD(top_keysize_1, top_keysize_2); int gcd_1_3 = GCD(top_keysize_1, top_keysize_3); int gcd_2_3 = GCD(top_keysize_2, top_keysize_3); // check if GCD_1_2 is not one if (gcd_1_2 != 1) { // check if GCD_1_2 is one of the top key sizes if (gcd_1_2 == top_keysize_1 || gcd_1_2 == top_keysize_2 || gcd_1_2 == top_keysize_3) { if (gcd_1_2 == gcd_1_3 && gcd_1_2 == gcd_2_3) { keysize = gcd_1_2; } else if (gcd_1_2 == top_keysize_1 || gcd_1_2 == top_keysize_2) { keysize = gcd_1_2; } } } Console.WriteLine("[+] Most likely key size: {0}\n [*] Normalized Avg. Hamming distance: {1}", keysize, keysize_score); return keysize; } private byte[] GetXORKey(byte[] ct, int ks) { Console.WriteLine("[*] Retrieving XOR key ..."); var blocks = new List<byte[]>(); int amount_of_blocks = ct.Length / ks; int size_last_block = ct.Length % ks; // Loop through amount of calculated blocks for (int i = 0; i < amount_of_blocks; i++) { blocks.Add(ct[(i * ks)..((i * ks) + ks)]); } // Last block can be smaller than keysize if (size_last_block != 0) { blocks.Add(ct[(ct.Length - size_last_block)..ct.Length]); } // Transpose blocks var transposed_blocks = new List<byte[]>(); for (int i = 0; i < ks; i++) { var t_block = new List<byte>(); foreach (byte[] block in blocks) { try { t_block.Add(block[i]); } // Again, last block may be smaller catch (IndexOutOfRangeException) { break; } } transposed_blocks.Add(t_block.ToArray()); } // brute-force each block with single-byte XOR var key = new List<byte>(); foreach (byte[] block in transposed_blocks) { int counter = 0; double best_score = 0.0; int best_key = 0; string best_pt = ""; for (int i = 0; i < 256; i++) { // Add integer as byte to new byte array byte[] xor_key = new byte[] {(byte)i}; byte[] pt_ba = XOR(block, xor_key); string plaintext = System.Text.Encoding.UTF8.GetString(pt_ba, 0, pt_ba.Length); // Get score double score = Scoring(plaintext); if (score > best_score) { best_score = score; best_key = i; best_pt = plaintext; } } key.Add((byte)best_key); counter++; } Console.WriteLine("[+] XOR key found!\n [*] \"{0}\"", System.Text.Encoding.UTF8.GetString(key.ToArray())); return key.ToArray(); } public void Solve() { PrintInfo(); int min_keysize = 2; int max_keysize = 40; // Read and convert data from file string data_file = "./Data/6.txt"; string data = String.Join("", System.IO.File.ReadAllLines(data_file)); byte[] ciphertext = Convert.FromBase64String(data); // Retrieve key size int keysize = GetKeySize(ciphertext, min_keysize, max_keysize); // Find XOR key byte[] key = GetXORKey(ciphertext, keysize); // Decrypt ciphertext Console.WriteLine("[*] Decrypting ciphertext ..."); byte[] pt_ba = XOR(ciphertext, key); string plaintext = System.Text.Encoding.UTF8.GetString(pt_ba); Console.WriteLine("[+] Plaintext found:\n----------\n{0}\n----------", plaintext); } } }
If we now execute dotnet run in our terminal, we can see that the ciphertext gets decrypted successfully.
Challenge 7: AES in ECB mode
Luckily, the final two challenges for this set aren’t that extensive as the previous one :).
This one is pretty easy. We have received a base64-encoded ciphertext in this file and need to decrypt it by usng AES-ECB. The 16-byte key that has been used to encrypt the file is "YELLOW SUBMARINE"
.
But first, what is ECB?
The abbreviation for ECB means Electronic Codebook and is a block cipher. Actually, it is a very weak cipher. Let’s have a look at the below diagram, which explains how ECB mode works.
The plaintext gets divided into equal buffer-length blocks (default = blocks of 16 bytes) and each plaintext block is then encrypted with the key, which also has the same amount of bytes as the plaintext block. This is repeated for all the blocks.
Do you see the problem already with this AES mode? We’ll touch on this further in the last challenge.
Anyways, let’s implement ECB now in our code and decrypt the message. For this challenge, we’re going to start to use the System.Security.Cryptography
namespace as well.
In our Solve() function, we convert the key string to bytes and grab the contents of our file and save them as bytes as well. Then we pass both values to our ECB_Decrypt()
function.
public void Solve() { PrintInfo(); byte[] key_ba = System.Text.Encoding.UTF8.GetBytes("YELLOW SUBMARINE"); // Read and convert data from file string data_file = "./Data/7.txt"; string data = String.Join("", System.IO.File.ReadAllLines(data_file)); byte[] ciphertext = Convert.FromBase64String(data); // Decrypt ECB ciphertext string plaintext = ECB_Decrypt(ciphertext, key_ba); Console.WriteLine("Plaintext:\n----------\n{0}\n----------", plaintext); }
Now let’s have a look at our very compact ECB_Decrypt()
function.
private string ECB_Decrypt(byte[] ct, byte[] key) { Aes Ecb = Aes.Create(); Ecb.Mode = System.Security.Cryptography.CipherMode.ECB; Ecb.BlockSize = 128; Ecb.Key = key; return AesDecrypt(ct, Ecb); }
We first create an Aes
object called Ecb
. Then, we set the mode to ECB, set the block size to 128 and provide the key as well. When all of that is set correctly, we pass the ciphertext and the AES object to another function called AesDecrypt()
, which serves as a general purpose function that will be used for other AES methods as well (will come back in set 2).
private string AesDecrypt(byte[] ciphertext, Aes aes) { ICryptoTransform decryptor = aes.CreateDecryptor(); using (MemoryStream ms = new MemoryStream(ciphertext)) { using (CryptoStream cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read)) { using (StreamReader sr = new StreamReader(cs)) { return sr.ReadToEnd(); } } } }
This function returns us the decrypted plaintext, and we pass it then back on to our Solve(
) function. The complete code for this challenge can be seen below.
using System; using System.IO; using System.Security.Cryptography; namespace Cryptopals.Challenges { public class Challenge_7 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 7 | | AES in ECB mode | | https://cryptopals.com/sets/1/challenges/7 | ---------------------------------------------- "; Console.WriteLine(info); } private string AesDecrypt(byte[] ciphertext, Aes aes) { ICryptoTransform decryptor = aes.CreateDecryptor(); using (MemoryStream ms = new MemoryStream(ciphertext)) { using (CryptoStream cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read)) { using (StreamReader sr = new StreamReader(cs)) { return sr.ReadToEnd(); } } } } private string ECB_Decrypt(byte[] ct, byte[] key) { Aes Ecb = Aes.Create(); Ecb.Mode = System.Security.Cryptography.CipherMode.ECB; Ecb.BlockSize = 128; Ecb.Key = key; return AesDecrypt(ct, Ecb); } public void Solve() { PrintInfo(); byte[] key_ba = System.Text.Encoding.UTF8.GetBytes("YELLOW SUBMARINE"); // Read and convert data from file string data_file = "./Data/7.txt"; string data = String.Join("", System.IO.File.ReadAllLines(data_file)); byte[] ciphertext = Convert.FromBase64String(data); // Decrypt ECB ciphertext string plaintext = ECB_Decrypt(ciphertext, key_ba); Console.WriteLine("Plaintext:\n----------\n{0}\n----------", plaintext); } } }
Running dotnet run
provides us with the solution.
Challenge 8: Detect AES in ECB mode
The last challenge is to detect when a ciphertext has been encrypted with ECB mode. We have received this file which contains a bunch of long hex strings and we need to find out which line has been encrypted with ECB.
Let’s come back to our question from our previous challenge: what is the problem with ECB if you have a look at the following diagram?
The issue here is that each plaintext block is only encrypted with the provided key. No other values are being used. This means that the same 16-byte plaintext will generate the exact same 16-byte ciphertext. With this knowledge, ECB ciphertext is easily detectable.
Let’s jump into the code and have a look at our Solve()
function. We simply grab the data from the file and loop through each line. In our for loop, we will convert the line to a byte array and pass it on to our DetectECB()
function, which returns true or false. If ECB is detected, we print out the corresponding line.
We could actually solve this without converting the hex lines to a byte array, but I want to stay true to the concept of “always working with bytes”. This is always the more reliable way.
public void Solve() { PrintInfo(); // Read and convert data from file string data_file = "./Data/8.txt"; foreach (string line in System.IO.File.ReadLines(data_file)) { byte[] ct_ba = Convert.FromHexString(line); bool is_ecb = DetectECB(ct_ba); if (is_ecb) { Console.WriteLine("[+] ECB found for the following line:\n{0}", line); } } }
Let’s look at the DetectECB()
function. We only need to pass on the ciphertext as an argument. An optional argument can be provided for the block size, which now defaults to 8. The challenge mentions that the same 16-byte plaintext will always produce the same 16-byte ciphertext, but remember that we are now working with bytes.
What I mean by that is, a hex value exists out of 2 bytes (e.g. \xC0
, \x08
, etc.) because it is represented as a string, but a hex value actually represents 1 byte (e.g. C0 = 192, 08 = 8). Thus, when we convert the hex string to bytes, we need to work with blocks of 8 bytes instead of the mentioned 16 bytes by the challenge.
If we look at the code below, we start off with a small check to see if the ciphertext can be divided into equal-length block sizes. Then, we simply loop through the ciphertext, create block of 8 bytes and add them to the block_list
list.
If the block_list
has been populated with all the blocks, we will loop through that list. In the outer loop, we loop until the total count of blocks in our list and use that value as a first index. We do the same in our inner for loop, and this value is used as well as an index to get the second byte array. With this method, we can iterate over each block and compare it with the other blocks. If both blocks are sequentially the same AND both index values are not equal to each other, then we know that ciphertext has been encrypted with ECB.
private bool DetectECB(byte[] ciphertext, int block_size = 8) { if (ciphertext.Length % block_size != 0) { Console.WriteLine("[-] Ciphertext cannot be divided into equal blocks of {0}. Length of ciphertext: {1}. Exiting.", block_size, ciphertext.Length); Environment.Exit(-1); } // Create blocks of specified block size (default = 8) var block_list = new List<byte[]>(); for (int i = 0; i < ciphertext.Length; i += block_size) { byte[] block = ciphertext[i..(i + block_size)]; block_list.Add(block); } // Loop through each block in block_list for (int index = 0; index < block_list.Count; index++) { // loop through blocks in block for (int i = 0; i < block_list.Count; i++) { byte[] block1 = block_list[index]; byte[] block2 = block_list[i]; // Check if byte arrays are the same if (index != i && block1.SequenceEqual(block2)) { return true; } } } return false; }
The complete code can be seen below.
using System; using System.Linq; using System.Collections.Generic; namespace Cryptopals.Challenges { public class Challenge_8 { private void PrintInfo() { string info = @" ---------------------------------------------- | Set 1 - Challenge 8 | | Detect AES in ECB mode | | https://cryptopals.com/sets/1/challenges/8 | ---------------------------------------------- "; Console.WriteLine(info); } private bool DetectECB(byte[] ciphertext, int block_size = 8) { if (ciphertext.Length % block_size != 0) { Console.WriteLine("[-] Ciphertext cannot be divided into equal blocks of {0}. Length of ciphertext: {1}. Exiting.", block_size, ciphertext.Length); Environment.Exit(-1); } // Create blocks of specified block size (default = 8) var block_list = new List<byte[]>(); for (int i = 0; i < ciphertext.Length; i += block_size) { byte[] block = ciphertext[i..(i + block_size)]; block_list.Add(block); } // Loop through each block in block_list for (int index = 0; index < block_list.Count; index++) { // loop through blocks in block for (int i = 0; i < block_list.Count; i++) { byte[] block1 = block_list[index]; byte[] block2 = block_list[i]; // Check if byte arrays are the same if (index != i && block1.SequenceEqual(block2)) { return true; } } } return false; } public void Solve() { PrintInfo(); // Read and convert data from file string data_file = "./Data/8.txt"; foreach (string line in System.IO.File.ReadLines(data_file)) { byte[] ct_ba = Convert.FromHexString(line); bool is_ecb = DetectECB(ct_ba); if (is_ecb) { Console.WriteLine("[+] ECB found for the following line:\n{0}", line); } } } } }
Running our code with dotnet run
results in printing the line that has been encrypted with ECB.
The line that has been encrypted with ECB is:
d880619740a8a19b7840a8a31c810a3d08649af70dc06f4fd5d2d69c744cd283e2dd052f6b641dbf9d11b0348542bb5708649af70dc06f4fd5d2d69c744cd2839475c9dfdbc1d46597949d9c7e82bf5a08649af70dc06f4fd5d2d69c744cd28397a93eab8d6aecd566489154789a6b0308649af70dc06f4fd5d2d69c744cd283d403180c98c8f6db1f2a3f9c4040deb0ab51b29933f2c123c58386b06fba186a
If we divide this ciphertext into blocks of 16 bytes (because this is a hex string again), we can actually see that multiple lines are the same.
d880619740a8a19b
7840a8a31c810a3d
08649af70dc06f4f
d5d2d69c744cd283
e2dd052f6b641dbf
9d11b0348542bb57
08649af70dc06f4f
d5d2d69c744cd283
9475c9dfdbc1d465
97949d9c7e82bf5a
08649af70dc06f4f
d5d2d69c744cd283
97a93eab8d6aecd5
66489154789a6b03
08649af70dc06f4f
d5d2d69c744cd283
d403180c98c8f6db
1f2a3f9c4040deb0
ab51b29933f2c123
c58386b06fba186a