CS35 Lab 3: The Enigma Machine

Due 11:59pm Sunday, September 21st, 2015

The goal of this lab is to introduce you to basic concepts in the C++ program language. Concepts you will be familiar with after this lab include:

A skeleton version of the program can be cloned from a project named for you and your partner; it will appear as lab3-<username1>-<username2> on Github.



Encryption and Ciphers

Encryption is the process of encoding messages in such a way that only authorized parties can read the message. The idea of encrypting messages is thousands of years old and has been particularly useful in the military, where it is important to exchange messages (e.g. "Attack at dawn!" vs "Hold your ground") without enemies being able to understand them in the event a message is intercepted. A cipher is a kind of encryption scheme where each letter is replaced by another. Ciphers have been used since at least Roman times. Julius Caesar used an cipher where each letter of the message was replaced with one three positions earlier in the alphabet. So, for instance, 'A' is encrypted as 'X', 'B' becomes 'Y', 'C' is encrypted as 'Z', 'D' is encrypted as 'A', and so on. An encrypted message is called the ciphertext, and if an enemy intercepted a message, it would just look like a jumble of letters. On the other hand, Caesar's subordinates, knowing the encryption scheme, could decrypt the message by employing doing the opposite of the encryption scheme: 'A' is decrypted as 'D', 'B' is decrypted as 'E', etc. This cipher was used so effectively that it became known as the Caesar cipher. In this lab, you'll complete an implementation of the following ciphers:

The Allies made a huge effort to break the Engima code during World War II. This task was led in part by Alan Turing, and it was said that breaking the enigma machine shortened the war in Europe by two years. The effort to break German ciphers also resulted in the first electrical digital computers. You may have seen an interpretation of this if you saw the movie The Imitation Game.
Introduction

In this lab, you will create several classes representing the different ciphers mentioned above. All of these classes will be subclasses of an abstract Cipher class used to represent an interface for ciphers. Your subclasses will implement functions (such as encrypt and decrypt) whose behavior differs with each different cipher.

In addition, you will fill in the skeleton of a main program which allows the user to encrypt and decrypt files using one of a set of cipher objects. Below is an overview of the files required for submitting this week's lab. Those highlighted in blue will require implementation on your part. Those in black are complete and should not be modified except for comments.


Program Requirements

I recommend implementing your lab assignment in 4 phases:

  1. Review and understand the Cipher and Caesar classes.
  2. Complete implementation and tests of the Rotation and Substitution ciphers.
  3. Fill in the parts of main needed to use Rotation and Subsitution ciphers from the main program.
  4. Enigma Cipher - complete the lab by implementing the Engima cipher and its corresponding main components.
  5. Play around with some mystery files and adding the handling for an array of Cipher objects to your main.
Note: The code for the Engima cipher is more complex than the rest of this lab. You should definitely get the rest of the lab working before tackling this final piece. I have instructed the ninjas to not help students with the Enigma cipher unless students have demonstrated working implementations for the rest of the lab.

I. Cipher and Caesar classes

The Cipher class serves as the base class for all of the cipher objects. It's sole job is to describe a set of public methods that define the methods each type of Cipher needs to implement: ---------------------------
Key Concepts: virtual functions, pure virtual functions, abstract classes. The encrypt and decrypt methods are pure virtual methods. There are no implementations for them in the Cipher class. Instead, each subclass must implement the functions themselves. In a class declaration, pure virtual functions are declared as follows: virtual string encrypt(string s) = 0; The virtual keyword means encrypt is a virtual function. When a function is virtual, subclasses can provide their own implementations. By adding the = 0 the method becomes pure virtual. Pure virtual functions must be implemented by subclasses. A class that declares a pure virtual function is called an abstract class. It cannot be instantiated. Instead, subclasses are created and instantated.

Use pointers to create these objects as follows.
Cipher* mycipher = new Caesar();

This creates a pointer to a Cipher object called mycipher, and instantiates it as a Caesar object. The object itself is a Caesar object, but mycipher is a pointer to a Cipher object—its type is Cipher*. This is often just what we'd like in C++. The user of this pointer does not need to know what kind of cipher myCipher is; they just want to encrypt and decrypt.

The Caesar class is our first cipher. Its declaration and implementation are given to you, but you are still responsible for understanding its code. This class contains implements the virtual functions from Cipher. In particular, it implements encrypt and decrypt functions. To encrypt, it iterates over the characters in the input string, and builds up a new string one character at a time by applying the Caesar cipher's rules (note that it explicitly skips characters that aren't in the A-Z range).

Managing the encryption and decryption. So you read in a letter 'K' and are encrypting using the Caesar cipher. How does your lab figure out that 'H' is the encrypted character? More generally, with an arbitrary cipher, how should the data be structured to manage the encryption? There are several different possibilities, but the most straightforward is to use arrays. In the case of the Caesar cipher, it uses a character array:

    char alphabet[] = {'A','B','C','D','E','F','G', 'H','I', 'J',
    'K','L','M','N','O','P','Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
    'Y', 'Z'};
  
When encrypting a character, the Caesar object searches through this array (using a for loop) until finding the index of the character to encrypt. Then it subtracts 3 from the index and outputs that character (careful to “wrap around” if going past Z or A). You will need to do similar (but substantially more complex) calculations for the other ciphers.

II. Implementing the User Interface

For your main program main.cpp, you should create a library of cipher objects (stored as an array of Cipher *), then allow users to encrypt or decrypt using one of the ciphers you've created. You should also give the user a third option: to decrypt a mystery file. This third feature will take a ciphertext from a (presumably) unknown cipher and decrypt it using every possible cipher in your library. (This allows a human user to go through the decrypted files and determine which cipher was used to do the original encrypting)

The overall design of the skeleton is as follows:

Greet user, then repeat the following until the user chooses to exit:

Ask user if they want to exit, encrypt, decrypt, or decrypt a mystery file.

If the user wants to encrypt, (1) ask the user which of the
  available ciphers they want to use to do the encrypting, along with any
  necessary settings files, (2) ask for the name of the plaintext
  file they want to encrypt, and (3) ask for the name of the ciphertext
  file they want to write to. Then do the encrypting.

If the user wants to decrypt, (1) ask the user which of the available ciphers
  they want to use to do the decrypting, (2) ask for the name of the
  ciphertext file they wish to decrypt, and (3) ask for the name of the
  plaintext file they want to write to. Then do the decrypting.

If the user wants to decrypt a mystery file, ask for the name of the
  file they wish to decrypt.  Then, decrypt this file using each
  available cipher.  Choose different names for the decrypted files,
  so a human can read the files after running your program and
  determine which cipher was used to do the encryption.  For example,
  if you choose "hw.enc" for your mystery file, you might choose
  "hw0.dec", "hw1.dec", ..., where "hw<i>.dec" denotes the output of
  decrypting using the ith cipher.

Sample output is included at the end of this writeup. Much of this has been provided for you; your job is to fill in the construction of ciphers, input validation, and the final step with mystery files.

III. Implementing Rotation and Substitution Ciphers

I recommend first implementing the Rotation class.

Like the Caesar cipher, the Rotation specializes the Cipher class. You should define your header and implementation file in rotation.h and rotation.cpp respectively. In addition to what it inherits from Cipher, your Rotation class should comply with the following:

As you implement the Rotation class, add tests for Rotation objects encrypt and decrypt methods (using different values for rotation). Does it encrypt and decrypt correctly?

Once you have implemented and tested your Rotation cipher, you are ready to create the Substitution class in substitution.h and substitution.cpp. This should also specialize the Cipher class. It should comply with the following:

The Substitution cipher is more involved than previous ciphers. As a result, you'll need to perform an extra validation step. You need to validate that substitution contains all 26 capital letters in some order. How do you validate this? Have a for loop which goes through alphabet. For each letter of the alphabet, scan substitution and make sure it is there. If it is not, print an error message and return NULL in main rather than constructing a Substitution cipher.

As you work on the Substitution Cipher, test your implementation by creating and testing Substitution objects in the test file. Then run your main program and try out some encryption and decryption on these objects.


IV. Implementing the Enigma Cipher

Next you'll implement a simplified version of the Enigma Machine cipher.

Warning: this is more involved than previous ciphers, and I strongly recommend you get the rest of your lab working first. I've instructed ninjas not to assist you until you've demonstrated that the rest of your lab is complete. An Enigma machine uses three rotors. Each rotor (call them the inner, middle, and outer rotors) contains some fixed permutation of the letters. To encrypt a character c, perform the following:

  1. Locate c on the inner rotor.
  2. Find the character on the outer rotor aligned with c. Call this character c'.
  3. Locate c' on the middle rotor.
  4. Find the character on the outer rotor aligned with c'. Output this character.
After encrypting a letter but before encrypting the next letter, you must move the rotors in the following way:
  1. Move the inner rotor clockwise one position.
  2. If the inner rotor is now back in the start position, also move the middle rotor.

Initial Enigma Settings. Like the substitution cipher, there are extra settings that are needed by the constructor. Your constructor should take two ints representing the initial offset for the inner and middle rotors, and three strings representing the fixed permutation for each rotor. For the user interface, the main program already handles reading these in from a file; you just need to validate them. The format for this file is as follows. The first line contains integers denoting the inner rotor offset and middle rotor offset respectively. The second, third, and fourth lines contain the information for the inner, middle, and outer rotors respectively. There is an example enigma settings file in settings/enigma1.txt.

Validating Enigma Settings. As with the Substitution cipher, you must validate the settings for your Enigma cipher. Make sure that the initial offset for each rotor is between 0 and 25. Also make sure that the settings for each rotor permute the alphabet (so each letter appears exactly once).

If the enigma files fail to load, or if the settings are invalid, refuse to create a Cipher object by returning NULL.

Hint: to implement the enigma cipher, I recommend using character arrays for each rotor. Use integer variables (say innerOffset and middleOffset) to maintain the offsets. In this way, the state of the inner rotor would be implemented with (i) an array of chars and (ii) an offset. If 'D' is the first element of the inner rotor and the offset=3, then this character would align with the fourth character in the outer array.

Hint 2: decrypting the enigma cipher can seem complicated, but you still want to do essentially the opposite of the encryption. Specifically, perform the following:

Summary of requirements for Enigma. You are not required to implement the Enigma cipher using Hint 1, but assuming you do, the requirements are as follows:

V. Implementing the Mystery Decryption

In order to decrypt a mystery file, you should fill in the do_all_decrypts function with the following:

The rest of main has good examples of using ifstream (input file stream) and ofstream (output file stream) to create and read files with an interface almost exactly like cin and cout, but for reading from files. Feel free to also use the helper fileToString to read an entire file to a string at once. If you open many files (which you likely will), be sure to close them using ofstream.close() when you are done writing to them.

Review of Requirements and Sample Results
Here is a sample run (and a potentially useful test case!)
$ make main
Choose an option: 
  (1) Exit
  (2) Encrypt a file with a specific cipher
  (3) Decrypt a file with a specific cipher
  (4) Decrypt a mystery file with a list of ciphers
2

Choose an encryption scheme: 
  (1) Caesar
  (2) Rotation
  (3) Substitution
  (4) Enigma
4
Enter the name of the settings file (for enigma): settings/enigma1.txt
Enter name of plaintext file.
input/test1.txt
Enter name of file to write ciphertext to.
input/test1.enigma1
Successfully wrote encrypted file to input/test1.enigma1
Choose an option: 
  (1) Exit
  (2) Encrypt a file with a specific cipher
  (3) Decrypt a file with a specific cipher
  (4) Decrypt a mystery file with a list of ciphers
1
$ cat input/test1.txt
HELLO WORLD

$ cat input/test1.enigma1
BDUKJ HNYZL



Other hints

Handling nonalphabetic characters. In this lab you can assume that you only need to encrypt capital letters, and you can ignore all other characters. What happens if the text contains characters which are not capital letters, such as lower case letters, numbers, or punctuation? For this lab, feel free to not encrypt these characters -- if one appears, just leave it as-is in encrypted string and continue on and read in the next character.


Submit

Once you are satisfied with your code, hand it in by pushing to Github. You may run this as many times as you like, and only the most recent submission will be recorded.


Submit

Once you are satisfied with your code, hand it in by pushing to Github. You may run this as many times as you like, and only the most recent submission will be recorded.


Acknowledgments

The Enigma Machine cipher part of the lab is based on David Reed's NIFTY 2009 submission entitled "Enigma Encryption".