CS 31 Lab 8: Shell Program

Part 1, Command Line Parser: due 11:59 PM, Tuesday, November 20

Part 2, Full Shell: due 11:59 PM, Tuesday, November 27


Lab 8 Goals:

Lab Description

You will implement a shell, which is the program you interact with on the command line of a terminal window. A shell operates by doing the following things:

  1. Print a prompt and wait for the user to type in a command.
  2. Read in the command string entered by the user.
  3. Parse the command line string into an argv list.
  4. If the command (first item in the parsed argv list) is a built-in shell command, the shell will handle it on its own (without forking a child process).
  5. Otherwise, if it's not a built-in command, fork a child process to execute the command and wait for it to finish (unless the command is to be run in the background, then the shell doesn't wait for the child to exit).
  6. Repeat until the user enters the built-in command exit to exit the shell program.

Part 1: Command line parsing library.

For the first stage of your shell, you'll implement a parsecmd library that contains functions to parse a command line string into its individual command line arguments, and construct an argv list of strings from the command line args. Your library functions can then be used by other programs by #including your parsecmd.h file and linking in your parsecmd.o binary on the gcc command line:

$ gcc -g -o tester tester.c parsemd.o

Your library will export one function, parse_cmd_dynamic, though you're encouraged to create helper functions to aid your implementation of it. Helper functions should not be exported via the parsecmd.h header file.

The parse_cmd_dynamic function will take in a command line string and turn it into an argv list (an array of strings, one per command line argument) that's suitable for passing to a newly-starting program. The function will also test for the presence of an ampersand (&), which indicates that the command should be run in the background. For example, if the user enters the command line string:

$ cat foo.tex

The string "cat foo.tex" will be passed to parse_cmd_dynamic, which will then produce an argv array that looks like:

argv [0] ---->"cat"
argv [1] ---->"foo.tex"
argv [2] ----|  (NULL)

Note that there will be a newline character after the foo.tex and before the final null-terminator character as a result of the user hitting the [enter] key.

This operation is known as tokenizing the string into individual tokens, each of which is separated by white space. Given a command line string as input, your implementation of the parse_cmd_dynamic function should dynamically allocate and return the argv array of strings, one for each command line token. It takes the following parameters:

/*
 * parse_cmd_dynamic - Parse the passed command line into an argv array.
 *
 *    cmdline: The command line string entered at the shell prompt
 *             (const means that this function cannot modify cmdline).
 *             Your code should NOT attempt to modify these characters!
 *
 *         bg: A pointer to an integer, whose value your code should set.
 *             The value pointed to by bg should be 1 if the command is to be
 *             run in the background, otherwise set it to 0.
 *
 *    returns: A dynamically allocated array of strings, each element
 *             stores a string corresponding to a command line argument.
 *             (Note: the caller is responsible for freeing the returned
 *             argv list, not your parse_cmd_dynamic function).
 */
char **parse_cmd_dynamic(const char *cmdline, int *bg);

To produce a dynamic argv array, your implementation must first determine how many tokens are in the command line string. After that, it should malloc EXACTLY the right number of (char *) argv buckets, one for each of the tokens plus one extra bucket at the end for NULL, which indicates the end of the tokens. Then, for each token, it should malloc exactly enough space to store the string corresponding to the that token (don't forget one extra byte for the null-terminator '\0' character).

For example, if the command line string is:

"   ls   -l   -a   "

The function should allocate space for an array of four character pointers (char *'s). The first should have three bytes (characters) allocated to it for storing 'l', 's', and '\0'. The second should allocate three bytes to store "-l\0", and the third should hold "-a\0". The final pointer should be NULL.

   // Declare a local var to store dynamically allocated args array of strings.
   char **args;

   // After allocating memory and tokenizing, it should look like:

      args --------->[0]-----> "ls"
                     [1]-----> "-l"
                     [2]-----> "-a"
                     [3]-----|  (NULL)

To help test parse_cmd_dynamic's behavior and convince yourself of its correctness, you'll write a few test cases in the provided tester.c file.

Requirements

Tips

Part 2: Building a shell.

Now that we can tokenize command line strings, let's put together the rest of the pieces for executing user commands. Your shell should support the following features:

  1. Running commands in the foreground.

    When a command is run in the foreground, for example:

    cs31shell> ./sleeper 2  
    

    Your shell program should fork() a child process to execute sleeper and then wait until the child process exits before proceeding. You can accomplish this by calling waitpid in the parent (your shell) by passing in the pid of the child process (the return value of fork()).

  2. Running commands in the background.

    When a command is run in the background, for example:

    cs31shell> ./sleeper 3 & 
    

    Your shell program should fork() a child process to execute sleeper, but it should NOT wait for the child to exit. Instead, after forking the child process, it should immediately return to step 1 (print out the prompt and read in the next command line). The child process will execute the command concurrently while the parent shell handles other command(s).

    Your shell must still reap background processes after they exit, so you can't just forget about them! When a child that was run in the background exits, your shell program will receive a SIGCHLD signal. You should install a SIGCHLD handler that will call waitpid() to reap the exited child process(es).

    Your shell should be able to run any number of processes in the background, so if you type in quick succession:

    cs31shell> ./sleeper & 
    cs31shell> ./sleeper & 
    cs31shell> ./sleeper & 
    cs31shell> ps 
    

    The ps program output should list all three sleeper child processes.

  3. Built-in commands.

    Your shell should respond to the following three built-in commands on its own. It should not fork off a child to handle these!

    1. exit: Terminate the shell program. You can print out a goodbye message, if you'd like.
    2. history: Print a list of the user's 10 more recently entered command lines. (Note: blank lines should not be added to the history.)
    3. !num (where num is an actual number, e.g., !5): Re-execute a previous command from the history. The previous command could be a run-in-the-foreground, run-in-the-background, or a built-in command that your shell should execute appropriately. The command line should be added into the history list (i.e. executing !5 should not put !5 in the history list, instead a copy of the command line associated with command ID 5 from the history list should be added to the history list). See the sample output below for some examples of history and !num commands.

  4. History.

    Your shell program should keep a list of the 10 most recently entered command lines by the user (use a #define constant named MAXHIST whose value is 10). The built-in history command should print out the history list in order from first (oldest) to last (most recently entered) command. For each element in the list, it should print the command ID and its corresponding command line string. The command ID is an ever-increasing number, and each command should increment the command ID by one. I suggest using an unsigned int to store this value (don't worry, I'm not going to try executing 4 billion commands in an attempt to overflow this value).

    Your history list should be implemented as circular queue of history structs: a circular array of MAXHIST buckets where new commands are added to one end and old ones removed from the other end.

    Users can request the execution of a previous command, as stored in the history list, by executing the built-in !num, where "num" is a number corresponding to a history command ID. Upon receiving such a command:

    1. Search your history list for a command with a matching command ID. Remember that the command ID is not the position in the history list, it is the unique number of the command in your shell's execution history (i.e. !5 is the 5th command run by your shell, !34 is the 34th command run by your shell).
    2. If a matching command ID is not found, print out an error message.
    3. Otherwise, use the command line from the matching history command ID, and re-execute it (this command now also becomes the most recent command to add to your history list).
    4. If the command from the history list is not a built-in command, then your shell should run it just like it does any foreground or background program (parse its command line into argv list, fork-execvp and waitpid or not).

      If the command from the history list is a built-in command (which could only be the history command), then it should execute it directly just like any built-in command.

Requirements

Example Output

It may be helpful for you to take a look at Tia's sample output, particularly to see how the built-in history features work.

Tips

Submitting

Please remove any debugging output prior to submitting.

To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory. Only one partner needs to run the final push, but make sure both partners have pulled and merged each others changes. See the section on Using a shared repo on the git help page.