CS35 Lab 2: The Enigma Machine

Due 11:59pm Sunday, September 21, 2014

The goal of this lab is to introduce you to basic concepts in the C++ program language. Concepts you will be familiar with after this lab include:

A skeleton version the program will appear in your cs35/labs/02 directory when you run update35. The program handin35 will only submit files in this directory. You may choose one partner for this lab. Both partners should be present and working on the code together. You will both be responsible for understanding all concepts, so dividing and conquering is not an option. The academic integrity policy applies to the entire pair; you cannot work or share code with anyone outside of your partner. Also, be sure to indicate who your partner is by including both names in the header and by also selecting the 'p' option when using handin35. Only one student in the pair should submit the lab.

See this page for tips on sharing files with your lab partner.


Encryption and Ciphers

Encryption is the process of encoding messages in such a way that only authorized parties can read the message. The idea of encrypting messages is thousands of years old and has been particularly useful in the military, where it is important to exchange messages (e.g. "Attack at dawn!" vs "Hold your ground") without enemies being able to understand them in the event a message is intercepted. A cipher is a kind of encryption scheme where each letter is replaced by another. Ciphers have been used since at least Roman times. Julius Caesar used an cipher where each letter of the message was replaced with one three positions earlier in the alphabet. So, for instance, 'A' is encrypted as 'X', 'B' becomes 'Y', 'C' is encrypted as 'Z', 'D' is encrypted as 'A', and so on. An encrypted message is called the ciphertext, and if an enemy intercepted a message, it would just look like a jumble of letters. On the other hand, Caesar's subordinates, knowing the encryption scheme, could decrypt the message by employing doing the opposite of the encryption scheme: 'A' is decrypted as 'D', 'B' is decrypted as 'E', etc. This cipher was used so effectively that it became known as the Caesar cipher. In this lab, you will implement the following ciphers:

The Allies made a huge effort to break the Engima code during World War II. This task was led in part by Alan Turing, and it was said that breaking the enigma machine shortened the war in Europe by two years. The effort to break German ciphers also resulted in the first electrical digital computers.
Introduction

In this lab, you will create several classes representing the different ciphers (Caesar, rotation, substitution, engima) mentioned above. All of these classes will be subclasses of an abstract Cipher class. This Cipher class will encapsulate all information and functionality that is common to all cipher classes. Your subclasses will implement functions (such as encrypt and decrypt) whose behavior differs with each different cipher.

In addition, you will create a main program which allow the user to encrypt and decrypt files using one of a set of cipher objects. Below is an overview of the files required for submitting this week's lab. Those highlighted in blue will require implementation on your part. Those in black are complete and should not be modified except for comments.


Program Requirements

I recommend implementing your lab assignment in 4 phases:

  1. Review and understand the Cipher and Caesar classes.
  2. Complete implementation of a menu-based interface.
  3. Complete implementation of the Rotation and Substitution ciphers.
  4. Enigma Cipher - complete the lab by implementing the Engima cipher.
Note: The code for the Engima cipher is more complex than the rest of this lab. You should definitely get the rest of the lab working before tackling this final piece. I have instructed the ninjas to not help students with the Enigma cipher unless students have demonstrated working implementations for the rest of the lab.

Your program should use proper design. That is, your program will be tested on your use of modular functions and understanding of how to call, define, and declare functions in C++.


I. Cipher and Caesar classes

The Cipher class serves as the base class for all of the cipher objects. It contains three data members: Additionally, it contains several protected methods, including: Aside from getName, these methods are all accessor methods for inFile and outFile. They have been implemented for you. These methods are protected to allow each subclass access to them without giving an outside class user access. You should use them to handle your file I/O.

The most important methods of the Cipher class are:

---------------------------
Key Concepts: virtual functions, pure virtual functions, abstract classes. The encrypt and decrypt functions are pure virtual functions. There are no implementations for these functions in the Cipher class. Instead, each subclass must implement the functions themselves. In a class declaration, pure virtual functions are declared as follows: virtual bool encrypt(string plainTextFile, string cipherTextFile) = 0; The virtual keyword means encrypt is a virtual function. When a function is virtual, subclasses can provide their own implementations. By adding the = 0 the method becomes pure virtual. Pure virtual functions must be implemented by subclasses. A class that declares a pure virtual function is called an abstract class. It cannot be instantiated. Instead, subclasses are created and instantated.

Use pointers to create these objects as follows.
Cipher *mycipher = new Caesar();

This creates a pointer to a Cipher object called mycipher, and instantiates it as a Caesar object. The object itself is a Caesar object, but mycipher is a pointer to a Cipher object. This is often just what we'd like in C++. The user of this pointer does not need to know what kind of cipher myCipher is; they just want to encrypt and decrypt.

The Caesar class is our first cipher. Its declaration and implemenation are given to you, but you are still responsible for understanding its code. This class contains implements the virtual functions from Cipher. In particular, it implements encrypt and decrypt functions. To encrypt, it reads in one character at a time from the plaintext file, applies the Caesar cipher to each, and writes it to the output file.

Managing the encryption and decryption. So you read in a letter 'K' and are encrypting using the Caesar cipher. How does your lab figure out that 'H' is the encrypted character? More generally, with an arbitrary cipher, how should the data be structured to manage the encryption? There are several different possibilities, but the most straightforward is to use arrays. In the case of the Caesar cipher, it uses a character array:

    char alphabet[] = {'A','B','C','D','E','F','G', 'H','I', 'J',
    'K','L','M','N','O','P','Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
    'Y', 'Z'};
  
When encrypting a character, the Caesar object searches through this array (using a for loop) until finding the index of the character to encrypt. Then it subtracts 3 from the index and outputs that character. You will need to do similar calculations for the other ciphers.

II. Implementing the User Interface

For your main program cipher_engine.cpp, you should create a library of cipher objects (stored as an array of Cipher *), then allow users to encrypt or decrypt using one of the ciphers you've created. You should also give the user a third option: to decrypt a mystery file. This third feature will take a ciphertext from a (presumably) unknown cipher and decrypt it using every possible cipher in your library. (This allows a human user to go through the decrypted files and determine which cipher was used to do the original encrypting)

The overall design of your program should be as follows:

Greet user, then repeat the following until the user chooses to exit:

Ask user if they want to exit, encrypt, decrypt, or decrypt a mystery file.

If the user wants to encrypt, (1) ask for the name of the plaintext
  file they want to encrypt, (2) ask for the name of the ciphertext
  file they want to write to, and (3) ask the user which of the
  available ciphers they want to use to do the encrypting. Give the
  user information about each cipher using the printInfo
  method.  (Then do the encrypting).

If the user wants to decrypt, (1) ask for the name of the ciphertext
  file they wish to decrypt, (2) ask for hte name of the plaintext
  file they want to write to, and (3) ask the user which of the
  available ciphers they want to use to do the decrypting.  Again,
  give the user a list of options along with information about each
  available cipher object.  (Then do the decrypting)

If the user wants to decrypt the mystery file, ask for the name of the
  file they wish to decrypt.  Then, decrypt this file using each
  available cipher.  Choose different names for the decrypted files,
  so a human can read the files after running your program and
  determine which cipher was used to do the encryption.  For example,
  if you choose "hw.enc" for your mystery file, you might choose
  "hw0.dec", "hw1.dec", ..., where "hw<i>.dec" denotes the output of
  decrypting using the ith cipher.

Sample output is included at the end of this writeup.

At this point, you should have a full user interface, although your library of cipher objects might just consist of a single Caesar cipher. As you implement the remaining ciphers, incrementally populate your library with more cipher objects.


III. Implementing Rotation and Substitution Ciphers

Once you have created the user interface, you should go on to implementing the new cipher subclasses. First, implement the Rotation class.

Like the Caesar cipher, the Rotation specializes the Cipher class. You should define your header and implementation file in rotation.h and rotation.cpp respectively. In addition to what it inherits from Cipher, your Rotation class should comply with the following:

Once you have finished implementing the Rotation cipher, test it by adding a couple of rotation cipher objects (each with a different rotation value) to your cipher library. Does each object encrypt and decrypt correctly?

Note: make sure to add rotation.o and rotation.h to the appropriate places in the Makefile.

Once you have implemented and tested your Rotation cipher, you are ready to create the Substitution class in substitution.h and substitution.cpp. This should also specialize the Cipher class. It should comply with the following:

The Substitution cipher is more involved than previous ciphers. As a result, you'll need to perform two extra validation steps: Once you finish implementing the Substitution Cipher, test your implementation by adding a few Substitution objects to your cipher library. Then run your main program and try out some encryption and decryption on these objects.

Note: you'll need to change lines 4 and 6 in the makefile to compile this class.


IV. Implementing the Enigma Cipher

For the final part of your lab, you'll implement a simplified version of the Enigma Machine cipher.

Warning: this is more involved than previous ciphers, and I strongly recommend you get the rest of your lab working first. I've instructed ninjas not to assist you until you've demonstrated that the rest of your lab is complete. An Enigma machine uses three rotors. Each rotor (call them the inner, middle, and outer rotors) contains some fixed permutation of the letters. To encrypt a character c, perform the following:

  1. Locate c on the inner rotor.
  2. Find the character on the outer rotor aligned with c. Call this character c'.
  3. Locate c' on the middle rotor.
  4. Find the character on the outer rotor aligned with c'. Output this character.
After encrypting a letter but before encrypting the next letter, you must move the rotors in the following way:
  1. Move the inner rotor clockwise one position.
  2. If the inner rotor is now back in the start position, also move the middle rotor.

Initial Enigma Settings. Like the substitution cipher, you should read in from a file both (i) the initial offset for the inner and middle rotors and (ii) the fixed permutation for each rotor. The format for this file should be as follows. The first line contains integers denoting the inner rotor offset and middle rotor offset respectively. The second, third, and fourth lines contain the information for the inner, middle, and outer rotors respectively. There is an example enigma settings file in settings/enigma1.txt.

Validating Enigma Settings. As with the Substitution cipher, you must validate the settings for your Enigma cipher. Make sure that the initial offset for each rotor is between 0 and 25. Also make sure that the settings for each rotor permute the alphabet (so each letter appears exactly once).

If the enigma files fail to load, or if the settings are invalid, load some default settings: each initial rotor position should be set to 0, and each rotor should map each letter to itself ('A' to 'A', 'B' to 'B', ...)

Hint: to implement the enigma cipher, I recommend using character arrays for each rotor. Use integer variables (say inner_offset and middle_offset) to maintain the offsets. In this way, the state of the inner rotor would be implemented with (i) an array of chars and (ii) an offset. If 'D' is the first element of the inner rotor and the offset=3, then this character would align with the fourth character in the outer array.

Hint 2: decrypting the enigma cipher can seem complicated, but you still want to do essentially the opposite of the encryption. Specifically, perform the following:

Summary of requirements for Enigma. You are not required to implement the Enigma cipher using Hint 1, but assuming you do, the requirements are as follows:

Review of Requirements and Sample Results
Instead of printing out sample results here, we're publishing a sample executable. To see a sample solution to this program, run
  $ ~brody/public/cs35/enigma/cipher_engine

Other hints

Handling nonalphabetic characters. In this lab you can assume that the text consists of all capital letters, along with possibly some whitespace, which you can ignore. What happens if the text contains characters which are not capital letters, such as lower case letters, numbers, or punctuation? For this lab, feel free to ignore these characters -- if one appears, just leave it out of the encryption and read in the next character. In the real world, these characters might be (i) ignored, (ii) written to the output file without encrypting them, or (iii) encrypting them using some other means. Again, for this lab, you should ignore these characters.

Converting ints to strings. It might be helpful to convert integer to string values here. In the version of C++ we're using, this isn't as straightforward as you'd like to think. You need to include a string stream class. Here is some code for a simple helper function that does the conversion:

#include <sstream>

string intToString(int z) {
  ostringstream convert;
  convert << z;
  return convert.str();
}

Submit

Once you are satisfied with your code, hand it in by typing handin35. This will copy the code from your cs35/labs/02 to my grading directory. You may run handin35 as many times as you like, and only the most recent submission will be recorded.


Acknowledgments

The Enigma Machine cipher part of the lab is based on David Reed's NIFTY 2009 submission entitled "Enigma Encryption".