APC Australia

Learning Python: The secrets of cryptography

Darren Yates continues his Learning Python coding series with an introduction to cryptography, the Enigma cipher machine and the Lorenz SZ40.


By June 1940, Hitler’s armed forces had secured victory over France and World War II was now on Britain’s doorstep. Later that same year, prominent mathematician G.H. Hardy wrote in his essay ‘A Mathematician’s Apology’ that “real mathematics has no effect on war”. Hardy apparently didn’t think much of applied maths, but it’s unlikely, given the secrecy of the time, that he knew the extent to which maths was playing a major part in hacking the new top-secret technological wonders being developed.

The exploits of Bletchley Park are today still shrouded in mystery, with reports that some of the work carried out is still restricted more than 70 years after the event. But the lessons learned continue to teach us about cryptography, and this month, we’ll attempt to code not just the Enigma machine, but also the more-advanced Lorenz SZ40, the machine whose breaking led to the development of the world’s first all-electronic computer.

CRYPTOGRAPHY RULES THE WORLD

Without data encryption, today’s online economy would simply collapse: anyone could see your bank account details, read your email and hack your devices. But while early cryptography efforts involved alphabet substitution and relied primarily on literary knowledge, World War II marked a turning point, where the art of cryptography became a modern science based on mathematics.

ENIGMA CIPHER MACHINE

The Enigma machine is an electromechanical symmetric-key cipher machine that encodes plaintext into a stream of apparently random characters. That stream is repeatable, depending on how you set up the machine. Set it up a certain way and encode your plaintext message; set it up the same way again and typing in the encoded ciphertext reveals the initial plaintext message. The symmetry comes from the fact the same key settings are used to encode and decode the message. Electrically, Enigma is nothing more than wires, switches, plugs and lamps, but the continually stepped rotors, combined with the front plugboard for swapping letters, provided over 150 million million different setup options.
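As a rough check on the size of that number, the usual counting argument for a three-rotor Enigma can be worked through in Python. This is only a sketch: it assumes three rotors chosen in order from five, 26 starting positions per rotor and exactly ten plugboard cables, it ignores the ring settings, and the variable names are ours.

```python
from math import factorial

# Ways to pick 3 rotors from 5, where the order in the machine matters
rotor_orders = 5 * 4 * 3

# Each of the three rotors can start on any of 26 letters
start_positions = 26 ** 3

# Ways to join 20 of the 26 letters into 10 unordered pairs:
# 26! / (6! * 10! * 2**10)
plugboard = factorial(26) // (factorial(6) * factorial(10) * 2 ** 10)

total = rotor_orders * start_positions * plugboard

print(f"Plugboard alone: {plugboard:,}")
print(f"With rotors:     {total:,}")
```

Under these assumptions the plugboard alone accounts for 150,738,274,937,250 combinations, just over 150 million million, and the rotor choices and starting positions multiply that by a further 1,054,560.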

We showed you a few years ago (August 2013, tinyurl.com/lzuwhkl) how to build a complete Enigma M3 cipher machine using an Arduino microcontroller board, a 16x2-character LCD panel and a PS/2 keyboard. It was a standalone device coded in a form of the C programming language. Can it be done in Python?

CODING ENIGMA IN PYTHON

If you can code a function in one language, the chances are pretty decent you can code it in another. Grab this month’s Python source code pack from our website (www.apcmag.com/magstuff), plus Python 3.6 (www.python.org/downloads), and open up the ‘enigma.py’ source code in the bundled IDLE editor. The code emulates the Enigma M3 machine used by the German army during the war, including the rotor wiring used in the actual machine. It’s also much more efficient than our original Arduino code.

Enigma.py has six main sections. First, there’s the list of arrays that emulate the rotor wheel wiring and rotor stepping notches. That’s followed by the five functions that carry out the main tasks in encoding/decoding an Enigma message. The first of these is rotateRotors(). Each time a key is pressed, the rotors advance in a manner similar to a car odometer. This function also takes into account the notches that make the rotors advance in a more unusual manner. Next is alphanum(), a simple function that converts an alphabet letter into a number between 0 and 25.
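To make the odometer idea concrete, here is a minimal sketch of the two helpers. It is not the code from enigma.py: the function bodies, the snake_case name rotate_rotors and the notch letters (E for rotor II, V for rotor III) are our assumptions, and ring settings are left out.

```python
def alphanum(letter):
    """Convert a letter A-Z (any case) to a number 0-25."""
    return ord(letter.upper()) - ord("A")


def rotate_rotors(positions, notches):
    """Advance (left, middle, right) rotor positions by one key press.

    notches holds, for the middle and right rotors, the position at
    which the next key press carries over to the rotor on its left.
    """
    left, middle, right = positions
    middle_notch, right_notch = notches

    if middle == middle_notch:
        # The unusual part: the middle rotor steps a second time
        # and takes the left rotor with it (the 'double step').
        left = (left + 1) % 26
        middle = (middle + 1) % 26
    elif right == right_notch:
        # Ordinary odometer-style carry from right to middle
        middle = (middle + 1) % 26

    # The right-hand rotor moves on every key press
    right = (right + 1) % 26
    return left, middle, right


# Assumed setup: rotors I, II, III from left to right, starting at ADU
positions = tuple(alphanum(c) for c in "ADU")
notches = (alphanum("E"), alphanum("V"))
for _ in range(4):
    positions = rotate_rotors(positions, notches)
    print("".join(chr(p + ord("A")) for p in positions))
```

With these assumed notch positions, the four key presses step the rotors through ADV, AEW, BFX and BFY: the middle rotor moves on two consecutive key presses, which a real odometer never does.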

After that is the encodeLetter() function, which does the heavy lifting. It emulates the electrical wiring of the Enigma through eight steps. First, the message character is fed into Rotor 3, the output of Rotor 3 goes into the input of Rotor 2, the output of Rotor 2 goes into Rotor 1 and the output of Rotor 1 is fed into the fixed ‘reflector’ rotor. Then comes the return journey: from the reflector rotor to Rotor 1, Rotor 1 to Rotor 2 and Rotor 2 to Rotor 3, with the output of Rotor 3 being the encoded/decoded character.

The viaPlugBoard() function is used twice (once on the ‘outbound’ trip and again on the ‘return’) to swap specified letters, greatly increasing the encryption possibilities. Finally, the main() function is the app’s top-level code, asking the user for the input settings and message, plus delivering the result back to the user.
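A plugboard of this kind can be sketched as a simple two-way lookup table. The helper names make_plugboard() and via_plugboard() below, and the comma-separated input format, are our assumptions for illustration rather than the actual code in enigma.py.

```python
import string


def make_plugboard(pairs_text):
    """Build a letter-swap table from text such as 'AB,CD,EF'."""
    swaps = {}
    if not pairs_text.strip():
        return swaps  # no cables plugged in
    for pair in pairs_text.upper().split(","):
        pair = pair.strip()
        if len(pair) != 2 or any(c not in string.ascii_uppercase for c in pair):
            raise ValueError(f"bad pair: {pair!r}")
        a, b = pair
        if a == b or a in swaps or b in swaps:
            raise ValueError(f"letter used more than once: {pair!r}")
        # Store the swap in both directions so it is its own inverse
        swaps[a] = b
        swaps[b] = a
    return swaps


def via_plugboard(letter, swaps):
    """Swap the letter if it is plugged, otherwise pass it through."""
    return swaps.get(letter, letter)


swaps = make_plugboard("AB,CD")
print(via_plugboard("A", swaps), via_plugboard("B", swaps), via_plugboard("Z", swaps))
```

Because every swap is stored in both directions, running a letter through the table on the way in and again on the way out needs no separate ‘reverse’ plugboard.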

The most complex function is encodeLetter(), which uses the position or ‘index’ of a character in each array to determine the next step in the encryption. The rotors on the Enigma don’t just set how the machine works, they also visually represent how we count through the array lists in our Python code. Each rotor is a ring of 26 characters, and we likewise count in a round of 26 using Python’s modulo (%) operator (we’ll talk more about modulo mathematics in a moment). It’s the reason why just about every code line in that function finishes in ‘% 26’: it ensures the answer is between 0 and 25, matching the array index range (and the corresponding characters) we’re looking for.
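The signal path and the ‘% 26’ wrap-around can be sketched as follows. This is a simplified illustration, not the code from enigma.py: it assumes rotors I, II and III in slots 1, 2 and 3 with reflector B, takes the rotor positions as given (no stepping), and leaves out ring settings and the plugboard. The helper names forward() and backward() are ours.

```python
# Assumed wiring: historical rotors I, II, III and reflector B
ROTOR_I = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"
ROTOR_II = "AJDKSIRUXBLHWTMCQGZNPYFVOE"
ROTOR_III = "BDFHJLCPRTXVZNYEIWGAKMUSQO"
REFLECTOR_B = "YRUHQSLDPXNGOKMIEBFZCWVJAT"
A = ord("A")


def forward(wiring, offset, index):
    """Pass a contact index through a rotor on the outbound trip."""
    entry = (index + offset) % 26      # rotor has turned by 'offset'
    exit_ = ord(wiring[entry]) - A     # follow the internal wire
    return (exit_ - offset) % 26       # back to fixed contact numbering


def backward(wiring, offset, index):
    """Pass a contact index through a rotor on the return trip."""
    entry = (index + offset) % 26
    exit_ = wiring.index(chr(entry + A))  # same wire, opposite direction
    return (exit_ - offset) % 26


def encode_letter(letter, positions):
    """Encode one letter for fixed (rotor1, rotor2, rotor3) positions."""
    rotor1, rotor2, rotor3 = positions
    i = ord(letter) - A
    i = forward(ROTOR_III, rotor3, i)   # into Rotor 3
    i = forward(ROTOR_II, rotor2, i)    # Rotor 3 to Rotor 2
    i = forward(ROTOR_I, rotor1, i)     # Rotor 2 to Rotor 1
    i = ord(REFLECTOR_B[i]) - A         # reflector turns the signal around
    i = backward(ROTOR_I, rotor1, i)    # back through Rotor 1
    i = backward(ROTOR_II, rotor2, i)   # Rotor 1 to Rotor 2
    i = backward(ROTOR_III, rotor3, i)  # Rotor 2 to Rotor 3
    return chr(i + A)


print(encode_letter("A", (0, 0, 1)))
```

With the rotors at positions (0, 0, 1) this prints B, and feeding B back in at the same positions returns A, which is the symmetry that lets one machine both encode and decode.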

LORENZ SZ40

But even before World War II, Enigma was obsolete tech. What’s more, it required six support staff to send one message: three on the sending side and three on the receiving side. Hitler also didn’t want his top-secret messages going out along the common Enigma channels. He wanted his own ‘Geheimschreiber’ or ‘secret writer’.

The Lorenz SZ40 was the result, an automated radio-teleprinter cipher machine many times more secure than Enigma. You typed in your message, and the SZ40 automatically encoded it and transmitted it via radio waves. At the other end, another SZ40 received the transmission, automatically decoded it and printed out the plain text.

The story of how British mathematician Bill Tutte and engineer Tommy Flowers defeated the Lorenz without seeing one still mostly plays out behind Alan Turing’s Enigma story, but the SZ40 is incredibly interesting for featuring cryptographic techniques still in use today.

Being a radio-teleprinter, the SZ40 relied on the International Telegraphy Alphabet No. 2 (ITA2). Radio-teleprinters were developed in the 1930s and used to transmit everything from weather reports to diplomatic messages. They converted the alphanumeric message characters and control codes into a five-bit data stream, which was then transmitted (ITA2 was the precursor to the American Standard Code for Information Interchange, or ASCII, the code computers still use today). Even before the war, ITA2 was common knowledge, having been around since the mid-1920s, so any messages transmitted ‘in the clear’ would’ve been like posting them to Twitter. Hitler required a hack-proof encryption system that was fast and reliable. The solution was something similar to Enigma but with an extra twist: a simple form of addition called ‘Modulo-2’, which electrical engineers call ‘exclusive-OR’ (XOR).

The SZ40 took each character of the message and combined it with a seemingly random character using ‘Modulo-2’ addition on their respective five-bit codes. Each pair of bits from the two characters was added together, resulting in a new five-bit code that corresponded to another ITA2 character. That new character was transmitted.

But the trick of Modulo-2 came at the other end, where the encrypted character was received and again added to the same second ‘random’ character as before using Modulo-2. The result, magically, was the original character of the plaintext message.

Another fun fact: this technique wasn’t even new in 1941. It’s commonly called the ‘Vernam Cipher’, named after its inventor, US electrical engineer Gilbert Vernam, who received a patent for it in 1919. The US National Security Agency (NSA) has reportedly called the Vernam patent one of the most important in cryptographic history. The key for the SZ40, however, was the random character stream. Whereas the Enigma machine had at most four active rotors, the SZ40 created the random stream through 12 Enigma-like rotors. Combining the sheer number of possible stream options with Modulo-2 made Enigma look like a toy.

ADDING MODULO-2

Performing modulo-2 addition by hand is easy: start with two numbers represented in binary form and add bits in pairs, like you would in ordinary addition, except you ignore the ‘carry’. If the two bits being added are the same, the result is a dot (• or ‘0’) and if they’re different, the result is a cross (x or ‘1’). In electronics, it’s often remembered as “one or the other, but not both”.
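The by-hand rule translates almost word for word into Python. In this sketch the function names add_mod2() and dots_and_crosses() and the two sample five-bit codes are made up for illustration.

```python
def add_mod2(bits_a, bits_b):
    """Add two equal-length bit strings pair by pair, ignoring the carry."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit strings must be the same length")
    # Same bits give '0', different bits give '1'
    return "".join("0" if a == b else "1" for a, b in zip(bits_a, bits_b))


def dots_and_crosses(bits):
    """Show a bit string in dot (0) and cross (1) notation."""
    return bits.replace("0", "•").replace("1", "x")


plain = "10110"    # made-up five-bit code for a message character
stream = "01100"   # made-up five-bit code for a stream character

coded = add_mod2(plain, stream)
print(coded, dots_and_crosses(coded))

# Adding the same stream character again undoes the first addition
print(add_mod2(coded, stream))
```

The second print shows the original ‘10110’ again, and Python’s built-in ^ operator performs the same calculation on whole numbers in one step: 22 ^ 12 gives 26, which is ‘11010’ in binary.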

We’ve broken down a simple example, showing the full encryption and decryption of the message text ‘CODE’. Each letter represents a five-bit code from our special ‘APC Cipher Alphabet Code’ and is added, modulo-2, to the corresponding character from the random stream, so ‘C’ plus the first stream letter ‘D’ gives the result ‘B’. Likewise, ‘O’ and ‘F’ give ‘L’ and so on until you get the coded message ‘BLHD’, which is then transmitted. This is received at the other end and the same stream characters are again added modulo-2 to the coded message, resulting in the characters ‘C’, ‘O’, ‘D’ and ‘E’, our original message.

When you consider this was 1941, with no computers and no digital electronics, it was a clever, simple and effective system. Only the mistake of a German radio operator in August 1941, who retransmitted the same message with slight variations but the same encoding settings, gave the codebreakers at Bletchley Park their way in.

PYTHON’S MODULO-2

Creating an accurate replica of the Lorenz SZ40 is a little out of our league, but we can create a reasonable substitute in Python that gets the idea across. It uses the main section of the actual ITA2 code (alphabet characters only) and adds, modulo-2, a repeatable stream of random characters to the original message text to create the encrypted text, much as the SZ40 did. The difference here is that instead of using the SZ40’s complex array of 12 rotors to generate the random character stream, we cheat, using Python’s rather less secure but simpler random module instead. More importantly, our code displays how the modulo-2 function works on the ITA2 code letters as it goes through encoding/decoding the plain text/encrypted message. Grab the source code pack again and load up ‘lorenz.py’. Now just to be clear, the Python team specifically says not to use the random module for security applications because its output is only pseudo-random, so we wouldn’t trust coded messages from here to trouble more than a dead parrot. Consider the code for educational purposes only.

We start by importing the math and random modules, then creating the ITA2 alphabet code array, where the index of each character is its five-bit value (array and list indexes start from zero, so ‘E’ here has the value ‘1’ and so on).
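The index-as-value idea can be sketched with a 32-character string, one entry for each possible five-bit code. The ordering below is our reconstruction of the ITA2 letter codes and may not match the table in lorenz.py; the six non-letter codes are filled with placeholder symbols, and the helper name five_bits() is hypothetical.

```python
# Assumed table: the index of each symbol is its five-bit value.
# Letters follow ITA2; the symbols / 4 9 3 + 8 are placeholders
# standing in for the six non-letter control codes.
ITA2 = "/E4A9SIU3DRJNFCKTZLWHYPQOBG+MXV8"


def five_bits(letter):
    """Return the five-bit code of a letter as a string of 0s and 1s."""
    return format(ITA2.index(letter), "05b")


print(five_bits("E"))
print(five_bits("A"))
```

This prints 00001 for ‘E’ and 00011 for ‘A’. Keeping all 32 positions filled matters because adding two letter codes modulo-2 can land on any value from 0 to 31, not just on the 26 letters.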

The first function, char2bin(), does the tricky job of converting the character’s ITA2 index into a five-bit binary number. For example, ‘E’ becomes ‘00001’, ‘A’ returns ‘00011’ and so on. Following that, the main() function does the rest. We ask the user to enter a message ‘key’ of up to six letters, followed by the short message to encode. We use the key letters to seed the random number generator, so that each time that key string is used, the random numbers generated by random.randint() are the same. The line:

codedLetter = ITA2.index(letter) ^ ITA2.index(randChar)

...does the hard work of modulo-2/exclusive-ORing each original message character and stream character together, in turn, with the result stored in the variable ‘codedLetter’. When the message loop of all characters is complete, the result is printed to the console/screen.
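Putting the seeded stream and the XOR line together gives something like the sketch below. It is not the lorenz.py listing: the function name crypt(), the 32-symbol table (our reconstruction of ITA2 with placeholder symbols for the non-letter codes) and the sample key are assumptions, and it uses a private random.Random instance rather than the module-level generator.

```python
import random

# Assumed table: index = five-bit value, placeholders for non-letter codes
ITA2 = "/E4A9SIU3DRJNFCKTZLWHYPQOBG+MXV8"


def crypt(key, text):
    """Encode or decode text by XORing it with a key-seeded stream."""
    rng = random.Random(key)  # same key text gives the same stream
    result = []
    for letter in text.upper():
        rand_char = ITA2[rng.randint(0, 31)]
        coded_letter = ITA2.index(letter) ^ ITA2.index(rand_char)
        result.append(ITA2[coded_letter])
    return "".join(result)


secret = crypt("APCMAG", "CODE")
print(secret)
print(crypt("APCMAG", secret))
```

The second call prints CODE again, because XORing with the same stream twice cancels out. Characters outside the table, such as spaces, would raise a ValueError in this sketch, and the coded text can contain the placeholder symbols as well as letters.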

HOW TO USE THE APPS

The Lorenz SZ40 and Enigma are examples of ‘symmetric key encryption’, where the same key is used to encrypt and decrypt the original plaintext message. Using the Lorenz Python app is pretty straightforward: to send a message, you and a mate both need a copy of the app running in Python and a single shared key text of up to six letters. Type in the key text, then your message, and the coded text comes out for you to send to your mate. Your mate types in the same key text, then the coded message, and out comes the original plaintext.

Encoding messages on the Enigma app is a little more complex, but not much. Start by choosing three of the five rotors (such as 235), set the rotor ‘starting’ or ‘ground’ positions as a three-letter group (like APC), followed by the internal rotor ring offset settings, again as a three-letter group (MAG). Finally (and optionally), you can add up to 10 character ‘pairs’, each separated by a comma. These mimic the plugboard on the original Enigma for swapping letters to further diffuse the cipher. When entering the character pairs, each letter can only ever appear once. Next, type in your plaintext message. As soon as you press Enter, the coded text will appear. Send that to your mate. At the other end, your mate sets up their Enigma Python app with exactly the same settings, types in the encrypted text, presses Enter and out comes the plaintext.

GIVE IT A GO

In World War I, codebreakers were mostly drawn from linguists and language experts. In World War II, they were increasingly mathematicians. Today, our world relies on secure communications built on very complex mathematical cryptography, such as the SHA (Secure Hash Algorithm) family of hash functions. But there’s still plenty that the original ‘symmetric key’ encryption systems like Enigma and the Lorenz SZ40 can teach us.

The complete Lorenz.py code is just 33 lines long.
The Enigma rotors’ rotating pin-pad contacts complete an electric circuit.
The forward and reverse rotor connections are stored in Python arrays.
Much of the Enigma.py code boils down to these few lines.
Simple example of Vernam Cipher/Modulo-2 maths used by the Lorenz SZ40.
The same method is used to decrypt the coded message, revealing the original.
The Lorenz.py code combines an XOR function with a random character stream.
Our enigma.py app replicates the three-rotor Enigma machine.
The Enigma cipher machine was famously hacked (with help) by Alan Turing.
Hitler used the Lorenz SZ40 to send encrypted messages via radio.
The 12 rotors inside the SZ40 create a pseudo-random stream of characters.
