CS31 Lab 8: Shell Program - Part 1: Command Line Parser

1. Due Date

Due 11:59 PM, Tuesday, April 14th

Your partner for this lab is: Lab 8 Partners

CS31 Partner Etiquette and Expectations.

2. Goals

Practice using the C string library.
Parse a command line string into an argv list stored in dynamic memory.
Practice writing your own library function for a command line parser.
Gain more expertise with gdb and valgrind for debugging C programs.

3. Lab Description

This is the first stage of your shell lab, in which you will implement a command line parser library named parsecmd. This library is the first step toward building your own shell. A shell is a program you interact with on the command line of a terminal window.

Your command line library should have functions that provide the following functionality:

Read in the command string entered by the user.
Parse a command line string into its individual command line arguments, and construct an argv list of strings from those arguments.
Dynamically allocate and free the argv array.

4. Handy References

Lab Video Links:

5. `parsecmd` library

A library is a collection of functions that can be used by other programs. Once you write your library, other programs can use its functions by including the line #include "parsecmd.h".

For the linker to find your library's code, we need to link your parsecmd.o object file on the gcc command line:
```
$ gcc -g -o tester tester.c parsecmd.o
```
Your library will export one function, parse_cmd_dynamic. You are, however, encouraged to create helper functions to aid your implementation. Helper functions should not be exported via the parsecmd.h header file.

5.1. `parse_cmd_dynamic` function

The parse_cmd_dynamic function will take:
- input: a command line string
- output: an argv list (an array of strings, one per command line argument)
The argv list should be suitable for passing to a newly-starting program.
The parse_cmd_dynamic function should also test for the presence of an ampersand (&), which indicates that the command should be run in the background.

/*
 * parse_cmd_dynamic - Parse the passed command line into an argv array.
 *
 *    cmdline: The command line string entered at the shell prompt
 *             (const means that this function cannot modify cmdline).
 *             Your code should NOT attempt to modify these characters!
 *
 *         bg: A pointer to an integer, whose value your code should set.
 *             The value pointed to by bg should be 1 if the command is to be
 *             run in the background, otherwise set it to 0.
 *
 *    returns: A dynamically allocated array of strings, each element
 *             stores a string corresponding to a command line argument.
 *             (Note: the caller is responsible for freeing the returned
 *             argv list, not your parse_cmd_dynamic function).
 */
char **parse_cmd_dynamic(const char *cmdline, int *bg);

5.2. Example output of `parse_cmd_dynamic` function

For example, suppose the user enters the command line string:

$ cat foo.tex

Note that there will be a newline character after foo.tex and before the final null-terminator character, as a result of the user hitting the [enter] key. The string "cat foo.tex\n" will be passed to parse_cmd_dynamic, which will then produce an argv array that looks like:

argv [0] ---->"cat"
argv [1] ---->"foo.tex"
argv [2] ----|  (NULL)

This operation is known as tokenizing the string into individual tokens, each of which is separated by white space. Given a command line string as input, your implementation of the parse_cmd_dynamic function should dynamically allocate and return the argv array of strings, one for each command line token.
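Because the argv list ends with a NULL pointer, code that receives it can walk the array without knowing its length in advance. The sketch below illustrates this; argv_length and print_argv are hypothetical helper names used only for illustration and are not part of the lab's starter code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not in the starter code): counts the entries in a
 * NULL-terminated argv list, not including the terminating NULL. */
size_t argv_length(char *const argv[]) {
    size_t n = 0;
    while (argv[n] != NULL) {
        n++;
    }
    return n;
}

/* Hypothetical helper: prints each argument on its own line, stopping
 * at the terminating NULL pointer. */
void print_argv(char *const argv[]) {
    for (size_t i = 0; argv[i] != NULL; i++) {
        printf("argv[%zu] = \"%s\"\n", i, argv[i]);
    }
}
```

A helper like print_argv can be handy in ./tester for checking by eye what the parser returned.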

5.2.1. Implementation Logic

To produce a dynamic argv array, your implementation must first determine how many tokens are in the command line string.
After that, it should malloc EXACTLY the right number of (char *) argv buckets, one for each of the tokens plus one extra bucket at the end for NULL, which indicates the end of the tokens.
Then, for each token, it should malloc exactly enough space to store the string corresponding to that token (don’t forget one extra byte for the null-terminator \0 character).
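The first step, counting tokens, might be sketched as follows. This is one possible approach rather than the required one: count_tokens is a hypothetical helper name, and treating & as a separator is an assumption made here so that an ampersand is never counted as a token or part of one.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the whitespace-separated tokens in s
 * without modifying it.
 * Assumption: '&' is treated like whitespace, so it never starts or
 * extends a token. */
size_t count_tokens(const char *s) {
    size_t count = 0;
    int in_token = 0;   /* 1 while the scan is inside a token */

    for (; *s != '\0'; s++) {
        if (isspace((unsigned char)*s) || *s == '&') {
            in_token = 0;           /* a separator ends any current token */
        } else if (!in_token) {
            in_token = 1;           /* first character of a new token */
            count++;
        }
    }
    return count;
}
```

With a count in hand, the argv array needs count + 1 buckets of type char *, the last one reserved for NULL.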

For example, if the command line string is:

" ls -l -a "

The function should allocate space for an array of four character pointers (char *'s). The first pointer should point to three bytes (characters) allocated for storing l, s, and \0. The second should point to three bytes storing -l\0, and the third to three bytes storing -a\0. The final pointer should be NULL.

// Declare a local var to store dynamically allocated args array of strings.
char **args;

// After allocating memory and tokenizing, it should look like:
args --------->[0]-----> "ls"
               [1]-----> "-l"
               [2]-----> "-a"
               [3]-----|  (NULL)
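Allocating exactly enough space for one token could look like the sketch below. It assumes the caller has already found where the token starts and how many characters it contains; copy_token is a hypothetical helper name, not something the lab provides.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd string holding the len
 * characters that begin at start, followed by a '\0' terminator.
 * Returns NULL if malloc fails. The source characters are only read. */
char *copy_token(const char *start, size_t len) {
    char *tok = malloc(len + 1);    /* +1 for the null terminator */
    if (tok == NULL) {
        return NULL;
    }
    memcpy(tok, start, len);
    tok[len] = '\0';
    return tok;
}
```

For the string " ls -l -a ", calling copy_token with a pointer to the l and a length of 2 would allocate three bytes and fill them with l, s, and \0.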

To help test parse_cmd_dynamic's behavior and convince yourself of its correctness, you’ll write a few test cases as input to your executable ./tester file.

6. Requirements

Your implementation of parse_cmd_dynamic should behave according to the specifications described above.

You may not change the prototype (argument types or return type) of parse_cmd_dynamic, and your solution should NOT use strtok.

You may assume that if an ampersand (&) appears in the command line, indicating that the command is to be run in the background, it will appear at the end of the line.
For this lab, an ampersand is never considered to be a token or part of a token; it’s just a way for the user to express that the program should be run in the background.
- Ampersands should NOT appear in the argv array that parse_cmd_dynamic produces.
Your implementation should make no attempt to modify the input command line argument string. It should be treated as a constant. You ARE allowed to make a copy of the string (e.g. with strdup) and modify that however you’d like.
Your function should work for command line strings entered with one whitespace character between command line tokens. For example, these inputs should yield identical argv lists returned by your function:

cat [space] foo.txt [space] blah.txt [space] &
cat [space] foo.txt [space] blah.txt&
[space] cat [space] foo.txt [space] blah.txt [space] & [space]

should all yield:

args --------->[0]-----> "cat"
               [1]-----> "foo.txt"
               [2]-----> "blah.txt"
               [3]-----|  (NULL)
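Setting the background flag can be done without modifying the input string. The sketch below is one way to do it, assuming (as the requirements allow) that an ampersand only ever appears as the background marker; detect_background is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: sets *bg to 1 if cmdline contains an ampersand,
 * and to 0 otherwise. cmdline is only read, never written. */
void detect_background(const char *cmdline, int *bg) {
    *bg = (strchr(cmdline, '&') != NULL) ? 1 : 0;
}
```

All three inputs above contain an ampersand, so each would set the flag to 1, while a line such as "cat foo.txt" would set it to 0.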

You need only TEST that your code works for command lines with one whitespace character between arguments.

You should fill in the missing TODO’s in tester.c such that, if we run your ./tester, it works correctly and there are no valgrind issues (like memory leaks, since there are no calls to free in the starter code).
For full credit, your solution should be well-commented, it should not use global variables, it should demonstrate good design, and it should be free of valgrind errors.

7. Tips

Test your code in small increments. It’s much easier to localize a bug when you’ve only changed a few lines.
In implementing parse_cmd_dynamic, you will likely want to make more than one pass through the characters in the input string.
When working with pointers, and in particular double pointers, it’s often very useful to draw some pictures about what’s pointing to what, noting the types of variables. Don’t shy away from using the whiteboard or scratch paper!
Remember that if you dynamically allocate space for a string (using malloc), you need to allocate one extra byte at the end for the terminating null character (\0).
Since your parsing library is allocating memory for the argv list, it’s up to your ./tester to free that memory when it’s done with it.
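Freeing the argv list means freeing each token string first and the array of pointers last. A minimal sketch, assuming every non-NULL entry and the array itself were allocated with malloc; free_argv is a hypothetical helper name that you might place in tester.c:

```c
#include <stdlib.h>

/* Hypothetical helper: frees every string in a NULL-terminated argv
 * list, then frees the array of pointers itself.
 * Assumption: each entry and the array came from malloc. */
void free_argv(char **argv) {
    if (argv == NULL) {
        return;                     /* nothing was allocated */
    }
    for (size_t i = 0; argv[i] != NULL; i++) {
        free(argv[i]);              /* free each token string */
    }
    free(argv);                     /* then free the pointer array */
}
```

Freeing the array before its strings would lose the only pointers to those strings, which valgrind would report as a leak.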

8. Submitting

Please remove any debugging output prior to submitting.

To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory. Only one partner needs to run the final push, but make sure both partners have pulled and merged each other's changes. See the section on Using a shared repo on the git help page.
