Symmetric Key Cryptography

Symmetric key cryptography involves scrambling plaintext into ciphertext using a cryptographic algorithm and a key in such a way that the entire plaintext is recoverable via decryption using the same algorithm (in reverse) and the same key. Here are some examples of real world symmetric key algorithms: DES, IDEA, 3DES, AES, Blowfish, Twofish, and Serpent.

A simple historical symmetric key algorithm (the Caesar Cipher) substitutes characters with ones further down the alphabet by a given number of positions, e.g. A -> E, B -> F, C -> G, up to Z -> D. You can do this mechanically with two concentric disks, each of which has an alphabet written around it. If you position the A of the outer disk against the E of the inner disk, then this "cipher wheel" can encrypt characters by mapping from the outer disk to the inner disk (e.g. A -> E), and decrypt characters by mapping them from the inner disk to the outer disk (e.g. E -> A). In this case the letter on the inner disk that you line up with the "A" on the outer disk up (e.g. "E"), is the key.

cipherwheel

Actual historic Caesar Cipher wheel from the U.S. Civil War. 

 

The following two rows of letters are equivalent to a cipher wheel with the key set to "E":

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

E F G H I J K L M N O P Q R S T U V W X Y Z A B C D

Let's say we want to encrypt the message "HELLO". You find each character of "plaintext" (e.g. "H") in the top row, then map it to the character directly below it in the bottom row (e.g. "L"). Do this for each character of plaintext, one by one. The resulting ciphertext is "LIPPS".

To decrypt the ciphertext, you would use the same two rows but find each character of the ciphertext (e.g. "L") in the bottom row, and map it onto the character directly above it (e.g. "H"). The ciphertext "LIPPS" would be mapped into the recovered text "HELLO".

Decryption "reverses" or "undoes" the transformation from plaintext into ciphertext.

What if I used a different key? This week the key is "Q" instead of "E". The two rows would now look like this:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Q R S T U V W X Y Z A B C D E F G H I J K L M N O P

Now "HELLO" would produce the ciphertext "XUBBE". The same encryption algorithm (shift by N), but a different key ("Q"), results in different ciphertext. Using these two new rows, decrypting "XUBBE" again recovers "HELLO". Valid encryption (using key = "Q") and correct key during decryption ("Q") results in correct recovered text.

But what if I didn't know that key had changed to "Q", and was still using "E"? I would use the first two rows to decrypt "XUBBE" and wind up with "BYFFI". Valid encryption (using key = "Q") but wrong key during decryption ("E" instead of "Q") results in gibberish. The algorithm (shift by N) was correct, but the key was wrong (I used E when I should have used Q). Actually the plaintext is still recoverable from the "gibberish". I would need to encrypt "BYFFI" with key E (resulting in "XUBBE") then decrypt the result with key Q (resulting in "HELLO"). Note that this successfully reverses the two transformations.

There are actually 26 possible keys here (A to Z). But what happens when we use the key "A"? The two rows now look like this:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

We encrypt the plaintext "HELLO" into the ciphertext "HELLO". We can still recover the message "HELLO" by decrypting that ciphertext with key = A. But the message is visible in the ciphertext! That is because A (shift by 0) is a "weak key" for this algorithm. Likewise shift by 26, by 52, etc. Any multiple of 26 (including 0) is a weak key. Shift by 27 produces the same result as shift by 1. There are only 25 distinct "useful" keys. Even some real world algorithms (like DES) have a few weak keys that cause the information to be sliced and diced in a zillion ways, but the whole plaintext appears in the ciphertext.

For fun, try downloading this simple Windows Caesar Cipher Machine app. Note - this app adds the "blank" character on both wheels, so you can enter plaintext that includes blanks. Because of this, the above examples may not work right.

Caesar Cipeher Machine for Windows

You can set the "inner wheel" by selecting the letter to put next to "A" on the outer wheel. Anything you type (or paste) into the Plaintext box will be encrypted using the current key when you click "Encrypt". The contents of the "CipherText" box will be decrypted (into the Recovered Text box) using the current key when you click "Decrypt". To crack a ciphertext with an unknown key, paste it into the CipherText box then keep clicking "Next Key" until readable text shows up in the Recovered Text box. Try cracking the following ciphertext:

IXUPSQUHQGPSYEXUGPYHPCDIPKUGNPHIGDCW

 

Key Strength

The strength of a symmetric key is determined by the number of possible keys that must be tried. With this cipher, if I don't know this week's key, it only takes a few minutes to try every possible key until readable plaintext appears. Recovering the plaintext without prior knowledge of the key is the inverse of cryptography and is called cryptanalysis. Cryptanalysing an encrypted message by trying all possible keys is called a brute force attack. There is no defense against a brute force other than by having too many keys to try in a reasonable amount of time.

There are faster ways of decrypting messages (without knowledge of the key) for some algorithms. A strong algorithm has no known ways to attack it faster than brute force. If your key is a string of N bits, then the number of keys is equal to 2 to the Nth. If your key is 8 bits in length, then there are 256 possible keys (2 to the 8th). If your key is 56 bits in length (e.g. with DES), then there about 7.2E16 (7.2 times 10 to the 16th, or 72 quadrillion) possible keys. When DES was introduced in 1975, this sounded like infinity. The EFF created a specialized DES cracker ("Deep Crack") for about US$ 250,000 that could find a DES key (in certain cases) on the average in 4.5 days. As of 2008, a commercially available machine (for much less money) can try DES keys at a speed of about 65 billion per second, which takes 12.8 days to try every possible key, or 6.4 days on average. Because this is possible, DES is no longer considered strong enough. A temporary fix involved using DES three times in a row (a.k.a. 3DES), with three different keys. This made an effective key length of 168 bits (3 x 56), but in reality, there were ways to crack 3DES much faster than a real 168 bit algorithm would take. It was definitely stronger than single DES.

No problem, newer symmetric key algorithms such as AES use 128 or even 256 bit keys. With 128 bits keys, there are 3.4E+38 possible keys. At 65 billion keys per second, that would still take 1.66E+20 years to try all possible keys. And if that isn't strong enough for you, 256 bit keys would take 5.65E+58 years. In comparison the current age of the universe is only about 1.4E+10 years old. Even 128 bit keys are strong enough for now, even taking into account vast improvements in how fast we can try keys. 256 bit keys are essentially uncrackable (by brute force) without fundamental breakthroughs in physics we can't currently imagine.

The first use of computers in cryptography were to cryptanalyse military messages created with manual and mechanical devices, such as Engima. Initially dedicated code cracking machines such as the "Bombe" were used at Bletchley Park in the U.K. to break Engima. Later general purpose computers were used to do this. Computers were so effective at breaking manual and mechanical encryptors, that where possible, people shifted to using computers to also do the encryption, taking the complexity of ciphers (encryption algorithms) to entirely new levels. Today, personal computers (and even smartphones) are more than powerful enough to encrypt information at military grade. This technology is now widely used to protect e-commerce transactions as well as sensitive data and communications.

Symmetric key cryptography uses the same key to encrypt and decrypt. This is intuitive. I use a key to lock my padlock, and use the same key to unlock it. There is symmetry between the encryption and decryption processes. If I encrypt something with a specific key (K1), you need a copy of that same key (K1) to decrypt it. This key must be kept secret from everyone except the communicating parties. But of necessity, both of those parties must have that key. This is the key management problem.

Here is symmetric key cryptography in mathematical notation:

PT = the plaintext (message), CT =  the ciphertext (scrambled message)

AES = symmetric encryption algorithm, AES-1 (AES "inverse") = symmetric decryption algorithm

K1 = the symmetric key

PT' = the recovered plaintext

Encryption:

CT = AES(PT, K1)

Decryption:

PT' = AES-1(CT, K1)

Current cryptographic algorithms mostly process binary (not textual) data. A file with only text (e.g. created with notepad) is a special case of a binary file (one with only 7 bit characters, grouped into lines, each of which is followed by CR,LF). But binary files cover every possible file type, including image (e.g. .jpg), audio (e.g. .mp3), Word document (e.g. .docx), etc. Even if the input happens to be a simple text file, the output of current algorithms is always binary data. If it must go through a text-only channel (e.g. SMTP email), it must be encoded into hexadecimal or base64, which increases its size. If this encoding is not needed, then the output file is usually just a bit larger than the input file. If text encoding is required, the based64 encoded ciphertext size usually is about 1.3 times the size of the binary output file for base64, or 2 times for hexadecimal.

Current cryptographic algorithms are mostly block oriented. This means they process chunks of plaintext at a time - typically 64 to 128 bits (8 to 16 bytes). The scrambling within each block is quite thorough, but in the simplest case (ECB, or Electronic CodeBook), each block is treated as an independent problem. A long string of zeros can result in repeating groups in the output file (this is a big help in cryptanalysis). There are several schemes to prevent this, such as CBC (Cipher Block Chaining) and CFB (Cipher FeedBack), where each block has some of the previous block (or an Initialization Vector for the first block) mixed in. This "mixing" is removed during the decryption process. These schemes eliminate any such repeating groups and significantly increase the difficulty of cracking an encrypted message with very little additional effort. Often, symmetric encryption is identified by the symmetric algorithm name, the key length, and the block processing mode, e.g. AES128-CBC. That designation would mean AES using 128 bit keys, processed with Cipher Block Chaining.

Symmetric key cryptography is fast (in terms of amount of data encrypted per second), can provide high security, and is simple to use in an application. However, systems based on only symmetric key algorithms typically have serious problems (and weaknesses) in their key management. If Alice wants to send Bob an encrypted file, they must somehow securely exchange a shared key. If either party accidentally compromises the key it is compromised for both parties. If Alice uses the same key to send files securely to Charlie, then Bob can also read those files (which may not be desirable). In the general case each communicating pair needs a unique key. The longer Alice uses a given key (or the more data that is encrypted with it), the higher the probability that it will be compromised. Ideally, every message should have a unique key. In practice that is very difficult to accomplish if you only have symmetric key cryptography available.

Because of these problems with key management, asymmetric key cryptography was developed.