1. Due Date
Due by 11:59 pm, Tuesday, Nov 25, 2025
This lab should be done with your Lab 9 partner, listed here: Lab 9 partners
Please review our guidelines for working with partners, etiquette and expectations.
2. Lab Goals
-
Learn how to implement a C library (.h and .c files), and then use your library in a program.
-
Learn more about C strings and char types, and use C ctype and string library functions in programs.
-
Gain more expertise with dynamic memory allocation and pointers in C.
-
Further your mastery of using gdb and valgrind for debugging C programs.
3. Lab Overview
In this assignment you and your partner will implement a C library and complete
test code to test the functions in your library. Specifically, you will
implement two functions from the parsecmd library. Both take a command
line string that they parse into an argv array that can be
passed to an exec function. Both also take a pass-by-pointer parameter (bg)
that they set to indicate if the command is run in the background or not.
-
The first version of the function is the one you used in your shell program to construct the argv array from a passed command line string:
int parse_cmd(const char *cmdline, char *argv[], int *bg);
The parse_cmd function takes the command line string and an argv array of char *, and initializes the argv bucket values to the value of individual tokens from the cmdline string. The caller passes an argv array of char * (of max size MAXARGS). This version doesn’t do dynamic memory allocation, so it does not require freeing by the caller after use.
-
The second version is a function with similar functionality, but it dynamically allocates the argv array (and dynamically allocates each C string pointed to by each argv[i] element) that it returns to the caller.
char **parse_cmd_dynamic(const char *cmdline, int *bg);
The parse_cmd_dynamic function takes the command line string and returns a dynamically allocated argv array of strings (an array whose buckets store char * values, so its return type is a char **). The function initializes the returned array’s bucket values to individual tokens from the cmdline string. Each token string is dynamically allocated. *The caller is responsible for freeing all the memory space of the returned array.*
You will use a test program that includes and links in your implementation
of the parsecmd library to test its functionality and correctness.
In addition, you could link your parsecmd library into your shell program
(by modifying the LIBDIR variable in its Makefile) and your shell
should work identically to when it linked in my implementation of the
parsecmd library! You are not required to do this step, but you
could try it out.
4. Starting Point Code
4.1. Getting Your Lab Repo
Both you and your partner should clone your Lab 9 repo into
your cs31/Labs subdirectory:
-
Get your Lab 9 ssh-URL from the CS31 git org. The repository to clone is named Lab9-userID1-userID2, where the two user names match those of you and your Lab 9 partner.
-
cd into your cs31/Labs subdirectory:
$ cd ~/cs31/Labs
$ pwd
-
Clone your repo:
$ git clone [the ssh url to your repo]
$ cd Lab9-userID1-userID2
There are more detailed instructions about getting your lab repo from the "Getting Lab Starting Point Code" section of the Using Git for CS31 Labs page.
As you start each lab assignment, it is good to first test that you and your partner have both successfully cloned your shared repo, and that you can share code by pushing a small change and by pulling a small change made by your partner. Follow the directions in the "Sharing Code with your Lab Partner" section of the Using Git for CS31 Labs page.
4.2. Starting Point files
$ ls
design_worksheet Makefile parsecmd.c parsecmd.h README.adoc tester.c
-
Makefile: builds the library .o file and a tester program executable file
-
README.adoc: some notes to you
-
parsecmd.h: the header file for the parsecmd library. It contains the interface to the library (the function prototypes and any other definitions the library exports, like #define constants). You do not need to modify this file.
-
parsecmd.c: the implementation part of the parsecmd library. Your implementations of the two library functions, parse_cmd and parse_cmd_dynamic, go in here.
-
tester.c: a test program for your parsecmd library. Add code here to test the functionality of your library. Make sure to test both library functions. Again, add helper functions to make this code manageable. There is not a lot of code that you need to add to this file. But, note the TODO comments in this file for some places where you will need to add code to clean up dynamically allocated memory.
-
design_worksheet: start your lab assignment by filling in this worksheet with the design of each parse_cmd library function before you implement it. You should bring this filled out worksheet with you to the ninja session, and to office hours and other times you are seeking help on your lab assignment. To print a copy of this worksheet to a CS lab printer, use the lpr command:
lpr design_worksheet
See the "Getting Started" tips in Section 9 for information about how and when to use this worksheet.
5. Compiling and Running
You will implement the functions in parsecmd.c and then use the tester
program (source in tester.c) to test your implementation.
The tester program, in a loop:
-
reads in a command line string (or quit to exit)
-
calls one or both of your parsecmd functions to parse the command line (see the code in tester.c for where it makes calls to the parsecmd library functions)
-
prints out the resulting argv string values, each string between # chars so you can see if you have any whitespace characters you didn’t remove
-
prints out if the command line has a & at the end, signifying to run in the background.
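The printing step in the loop above could be sketched as follows. This is only an illustration, not the code in tester.c: the helper name print_argv and its return value are assumptions made for this example.

```c
#include <stdio.h>

/*
 * print_argv (hypothetical helper, not part of the starter code):
 * prints each string of a NULL-terminated argv array between '#'
 * characters so that stray whitespace inside a token is easy to spot.
 * Returns the number of strings printed.
 */
int print_argv(char *argv[]) {
  int i = 0;

  if (argv == NULL) {
    return 0;
  }
  while (argv[i] != NULL) {
    printf("%d: #%s#\n", i, argv[i]);
    i++;
  }
  return i;
}
```

For an argv array holding "ls", "-l", and NULL, this prints `0: #ls#` and `1: #-l#`. A token that still had a trailing space would show up as `#ls #`.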
To run:
$ ./tester
6. Sample Output
Here is output from a run of my program: Sample Output
7. Lab Details
You will implement the parsecmd library that contains two functions
(parse_cmd and parse_cmd_dynamic) that
parse a command line string into its individual command line arguments,
and that construct an argv array of strings from the command line args.
The functions from the parsecmd library can be used by other programs that include its header file:
#include "parsecmd.h"
and that link a binary version of the library’s implementation
(parsecmd.o) into their executable. For example:
gcc -g -o tester tester.c parsecmd.o
The Makefile included with the lab assignment builds the parsecmd.o
binary from parsecmd.c and links it into the tester program.
You should use the tester program to test the correctness of your
library functions. You will need to add more testing code to tester.c
to fully test your library functions.
7.1. The interface to the parsecmd library (parsecmd.h)
The parsecmd.h file contains the interface to the library you will
implement.
You do not need to modify this file but you should use some of the constants it defines in your library implementation code.
Also, read the comments in parsecmd.h to ensure that you
implement the two library functions correctly (e.g., your
functions do what the interface says that they do).
7.2. Implementing the parsecmd library (parsecmd.c)
You will implement two functions that parse a command line string into an argv array (an array of strings, one per command line argument). They also both test for an ampersand in the command line, which, when present, indicates a background command. For example, if the user enters the following command line string:
cat foo.tex
These functions will be passed the string:
"cat foo.tex\n"
And will parse the command line string into the argv array:
argv [0] ---->"cat"
argv [1] ---->"foo.tex"
argv [2] ----| (NULL)
The main difference between the two functions is that the first
uses a single statically declared char array, while the second
dynamically allocates space for both the argv array and
the strings (arrays of chars) that each of the argv buckets
points to.
7.2.1. The parse_cmd function
This is the first of the two functions you should implement and test.
/*
* parse_cmd - Parse the command line and build the argv array.
*
* cmdline: the command line string entered at the shell prompt
* argv: an array of size MAXARGS of char *
* parse_cmd will initialize its contents from the passed
* cmdline string.
* bg: a pointer to an int that will be set to 1 if a
* '&' is in the command line string, indicating that
* the command is to be run in the background, and to 0 otherwise.
*
* returns: -2 if cmdline is NULL
* -1 if the command line is empty
* 0 to indicate success
*
* The caller should pass in a variable declared as:
* char *argv[MAXARGS];
* int bg;
* For example, a call might look like:
* (ex) ret = parse_cmd(commandLine, argv, &bg);
*
* argv will be filled with one string per command line
* argument. The first argv[i] value following the last
* command line string will be NULL. For example, for
* the command line string "ls -l":
* argv[0] will be "ls"
* argv[1] will be "-l"
* argv[2] will be NULL
* For an empty command line, argv[0] will be set to NULL
*/
int parse_cmd(const char *cmdline, char *argv[], int *bg);
This function sets the buckets of the argv array parameter to point to
substrings of the cmdline string that it stores in a char array global
variable.
This global char array variable is already declared for you in parsecmd.c:
static char cmdline_copy[MAXLINE];
The parse_cmd function will:
-
make a copy of the cmdline string in its cmdline_copy array
-
process its copy of the string to find tokens, modifying cmdline_copy to create substrings for each token. A token is a sequence of non-whitespace chars, separated from other tokens by at least one whitespace character (or by &). Tokens should not include &, which has special meaning in command lines.
-
assign each argv[i] bucket to point to its corresponding substring token in cmdline_copy. Remember that a NULL value in an argv[i] bucket is used to signify the end of the list of argv strings.
For example, if the command line entered by a user is the following (note
the user entered a few extra spaces in this command line, and
that $ is the shell prompt):
$   ls  -l  -a &
Then its command line string is the following (note the spaces in the
string and that the end of line character \n is included in the command
line string):
"  ls  -l  -a &\n"
And a copy of it in cmdline_copy looks like:
cmdline_copy  0 | ' ' |
              1 | ' ' |
              2 | 'l' |
              3 | 's' |
              4 | ' ' |
              5 | ' ' |
              6 | '-' |
              7 | 'l' |
              8 | ' ' |
              9 | ' ' |
             10 | '-' |
             11 | 'a' |
             12 | ' ' |
             13 | '&' |
             14 | '\n'|
             15 | '\0'|
Once the parse_cmd function has created a copy of the command line,
it then tokenizes the cmdline_copy string and sets each argv array
bucket to point to the start of its associated token string in
cmdline_copy. In memory this would look something like:
                              0     1     2     3
                           -------------------------
                      argv |  *  |  *  |  *  |  *  |
                           ---|-----|-----|-----|---
cmdline_copy  0 | ' ' |       |     |     |     |
              1 | ' ' |       |     |     |     |
              2 | 'l' |<------+     |     |     +--- (NULL)
              3 | 's' |             |     |
              4 | '\0'|             |     |
              5 | ' ' |             |     |
              6 | '-' |<------------+     |
              7 | 'l' |                   |
              8 | '\0'|                   |
              9 | ' ' |                   |
             10 | '-' |<------------------+
             11 | 'a' |
             12 | '\0'|
             13 | '&' |
             14 | '\n'|
             15 | '\0'|
Note the changes to the cmdline_copy string contents and the assignment of
argv bucket values into different starting points in the char array.
In parsing the cmdline_copy string, the parse_cmd function should also
identify that & appears in the command line, and it should set what bg
points to to 1 to indicate this, otherwise it should set it to 0.
Note that & is not one of the substrings pointed to by an element in the
argv array; & is not a command line argument, but specifies to run
the command in the background.
It is fine to stop parsing arguments in the command line string as soon
as an & is found. For example, if the user enters the string:
ls -l -a & hello there
a call to parse_cmd would create an argv array with three command
line arguments ("ls", "-l", and "-a"), set what bg points to
to 1, and stop parsing the string (not including "hello" and "there" in
the argv array).
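The in-place technique described in this section can be sketched as follows. This is a sketch under stated assumptions, not the required parse_cmd implementation: the function name tokenize_in_place, the buffer name line_copy, and the two SKETCH_ constants are hypothetical (the real sizes are the MAXLINE and MAXARGS constants in parsecmd.h), and this version returns the token count rather than the return codes documented for parse_cmd.

```c
#include <ctype.h>
#include <string.h>

#define SKETCH_MAXLINE 1024 /* assumed size; the lab uses MAXLINE */
#define SKETCH_MAXARGS 128  /* assumed size; the lab uses MAXARGS */

/* single static buffer that the argv pointers will point into */
static char line_copy[SKETCH_MAXLINE];

/*
 * tokenize_in_place (hypothetical name): copies line into line_copy,
 * overwrites the char after each token with '\0', and points each
 * argv bucket at the start of its token. Sets *bg to 1 if '&' is seen.
 * Returns the number of tokens, or -1 on a bad argument.
 */
int tokenize_in_place(const char *line, char *argv[], int *bg) {
  int i = 0, argc = 0;

  if (line == NULL || argv == NULL || bg == NULL
      || strlen(line) >= SKETCH_MAXLINE) {
    return -1;
  }
  strcpy(line_copy, line);
  *bg = 0;

  while (line_copy[i] != '\0' && argc < SKETCH_MAXARGS - 1) {
    /* skip whitespace in front of the next token */
    while (isspace((unsigned char)line_copy[i])) {
      i++;
    }
    if (line_copy[i] == '\0') {
      break;
    }
    if (line_copy[i] == '&') { /* stop parsing at the first '&' */
      *bg = 1;
      break;
    }
    argv[argc] = &line_copy[i]; /* token starts here */
    argc++;
    /* advance to the first char that is not part of the token */
    while (line_copy[i] != '\0' && line_copy[i] != '&'
           && !isspace((unsigned char)line_copy[i])) {
      i++;
    }
    if (line_copy[i] == '&') { /* '&' directly after a token */
      *bg = 1;
      line_copy[i] = '\0';
      break;
    }
    if (line_copy[i] != '\0') {
      line_copy[i] = '\0'; /* terminate this token's substring */
      i++;
    }
  }
  argv[argc] = NULL; /* marks the end of the argv list */
  return argc;
}
```

Run on the example string from this section, the sketch writes '\0' at indices 4, 8, and 12 of the copy and leaves argv[0], argv[1], and argv[2] pointing at indices 2, 6, and 10, with argv[3] set to NULL.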
7.2.2. The parse_cmd_dynamic function
There are two main problems with the previous function:
-
The user is limited to command line strings that are at most MAXLINE characters long and have at most MAXARGS arguments.
-
The function uses a single global character array. This means that the caller has to use the argv return strings before another call to the parse_cmd function is made (since it will overwrite cmdline_copy with the new command line string that it tokenizes). For use by the shell program, this version is okay (think about why this is the case), but it limits the "general purpose-ness" of this function.
The parse_cmd_dynamic function solves these two problems by dynamically
allocating and returning an argv array of strings, one string for each
command line argument.
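The second problem can be seen in a small stand-alone sketch. The two functions below are hypothetical and are not part of the parsecmd library; they only contrast returning a pointer into one shared static buffer with returning freshly allocated memory.

```c
#include <stdlib.h>
#include <string.h>

static char shared_buf[64]; /* one buffer reused by every call */

/* Returns a pointer into the single static buffer (hypothetical). */
char *copy_to_shared(const char *s) {
  strncpy(shared_buf, s, sizeof(shared_buf) - 1);
  shared_buf[sizeof(shared_buf) - 1] = '\0';
  return shared_buf;
}

/*
 * Returns a new heap copy of s, or NULL if malloc fails (hypothetical).
 * The caller must free the returned string.
 */
char *copy_to_heap(const char *s) {
  char *p = malloc(strlen(s) + 1); /* +1 for the terminating '\0' */

  if (p != NULL) {
    strcpy(p, s);
  }
  return p;
}
```

If a caller saves the result of copy_to_shared("ls") and then calls copy_to_shared("cat"), the saved pointer now reads "cat", because both results are the same address. Two results from copy_to_heap stay independent, at the cost of the caller having to free each one.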
/*
* parse_cmd_dynamic - parse the passed command line into an argv array
*
* cmdline: the command line string entered at the shell prompt
* bg: will set value pointed to 1 if command line is run in
* background, 0 otherwise (a pass-by-reference parameter)
*
* returns: a dynamically allocated array of strings, exactly one
* bucket value per argument in the passed command line
* the last bucket value is set to NULL to signify the
* end of the list of argument values.
* or NULL on an error
*
* The caller is responsible for freeing the returned argv array.
*/
char **parse_cmd_dynamic(const char *cmdline, int *bg);
This function finds tokens in the command line string much like the
previous version does. However, it must first determine how many tokens are in
the cmdline string and malloc EXACTLY the right number of argv buckets for
the particular cmdline string (and remember it needs an extra bucket at the
end for NULL).
For each token, it will malloc exactly enough space for a char array to
store the token as a string (remember an extra bucket for the terminating
'\0' character).
For example, if the cmdline string is:
" ls -l -a \n"
This function will malloc an argv array of 4 char * values,
and then malloc three arrays of char values, one for each
command line string, the base address of each stored in
a bucket of the argv array. For the example command above,
the returned array would look something like this:
// local var to store dynamically allocated args array of strings
char **argv;
argv --------->[0]-----> "ls"
[1]-----> "-l"
[2]-----> "-a"
[3]-----| (NULL)
The function CANNOT modify the cmdline string that is passed in to it.
However, it may malloc space for a temporary local copy of the cmdline
string and tokenize this copy. If you do this, it may allow you to reuse
some of the code you wrote for the parse_cmd function in this function.
Also, if you use this approach, be sure that your function frees this
copy before it returns; the returned args list should not point into this
copy like with parse_cmd, instead each command line argument should be
malloced separately as a distinct string of exactly the correct size.
parse_cmd_dynamic is more complicated to implement than parse_cmd and
will likely require more than a single pass through the chars of the
command line string. Start with the design of it using the design_worksheet
(see Section 4).
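One possible shape for the multi-pass approach is sketched below, under stated assumptions. The names tokens_dynamic, free_tokens, count_tokens, and ends_token are hypothetical, the sketch reads the const string directly instead of making a temporary copy, and it returns an array holding only NULL for an empty command line (a case the interface comment does not spell out). It is an illustration of the counting and exact-size allocation steps, not the reference solution.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if c cannot be part of a token. */
static int ends_token(char c) {
  return c == '\0' || c == '&' || isspace((unsigned char)c);
}

/* First pass: count the tokens that appear before any '&'. */
static int count_tokens(const char *s) {
  int count = 0, i = 0;

  while (s[i] != '\0' && s[i] != '&') {
    if (isspace((unsigned char)s[i])) {
      i++;
    } else {
      count++;
      while (!ends_token(s[i])) {
        i++;
      }
    }
  }
  return count;
}

/* Frees every string in a NULL-terminated array, then the array. */
void free_tokens(char **argv) {
  int i;

  if (argv == NULL) {
    return;
  }
  for (i = 0; argv[i] != NULL; i++) {
    free(argv[i]);
  }
  free(argv);
}

/* Second pass: allocate exactly n+1 buckets and one string per token. */
char **tokens_dynamic(const char *s, int *bg) {
  int n, i = 0, k = 0, len;
  char **argv;

  if (s == NULL || bg == NULL) {
    return NULL;
  }
  *bg = (strchr(s, '&') != NULL);
  n = count_tokens(s);
  argv = malloc(sizeof(char *) * (n + 1)); /* +1 for the NULL bucket */
  if (argv == NULL) {
    return NULL;
  }
  while (k < n) {
    while (isspace((unsigned char)s[i])) {
      i++;
    }
    len = 0;
    while (!ends_token(s[i + len])) {
      len++;
    }
    argv[k] = malloc(len + 1); /* +1 for the terminating '\0' */
    if (argv[k] == NULL) {
      free_tokens(argv); /* argv[k] is NULL, so this frees 0..k-1 */
      return NULL;
    }
    memcpy(argv[k], &s[i], len);
    argv[k][len] = '\0';
    i += len;
    k++;
  }
  argv[n] = NULL;
  return argv;
}
```

A caller pairs each successful tokens_dynamic call with one free_tokens call, which is the same clean-up pattern the tester needs for the array returned by parse_cmd_dynamic.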
8. Lab Requirements
-
Your two functions should meet the specifications described above.
-
You may only use the single global variable that is already defined for you in parsecmd.c. All other variables should be local, and values should be passed to functions.
-
You may not change any of the function prototypes in the parsecmd library (and don’t change the .h file). Your library code must work with our test code that makes calls to these functions as they are defined above.
-
You should use good modular code. The two library functions should not be static, but you can add helper functions that are private to the .c file, and thus should be declared static (they are only in scope inside the parsecmd.c file; they are private functions in parsecmd.c).
-
All system and function calls that return values should have code that detects and handles error return values.
-
Your functions should work for command lines entered with any amount of whitespace between command line options (but there should be at least one whitespace char between each). For example, all these should yield identical argv lists:
cat foo.txt blah.txt &
cat foo.txt blah.txt&
cat    foo.txt      blah.txt    &
TEST that your code works for command lines with any amount of whitespace between command line arguments.
-
Your code should be well commented. See my C style guide for examples of what this means.
-
Your code should be free of valgrind errors. You will need to add code to tester.c to free the space allocated and returned by the dynamic version of the function. Any other space you malloc internally in your library functions (that it does not explicitly return to the caller) should be freed before the function returns. Still Reachable is okay.
-
You may not use the string library strtok or strtok_r functions in your solution.
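To check the whitespace requirement above, a tester could compare the argv lists produced for differently spaced command lines. The helper below is a minimal sketch with a hypothetical name (argv_equal); it is not part of the starter code.

```c
#include <string.h>

/*
 * argv_equal (hypothetical test helper): returns 1 if two
 * NULL-terminated argv arrays hold the same strings in the same
 * order, and 0 otherwise.
 */
int argv_equal(char *a[], char *b[]) {
  int i = 0;

  if (a == NULL || b == NULL) {
    return a == b; /* equal only if both are NULL */
  }
  while (a[i] != NULL && b[i] != NULL) {
    if (strcmp(a[i], b[i]) != 0) {
      return 0;
    }
    i++;
  }
  /* equal only if both lists ended at the same index */
  return a[i] == NULL && b[i] == NULL;
}
```

With parse_cmd, each result points into the same static buffer, so the first list has to be saved or checked before the next call; with parse_cmd_dynamic, two returned arrays can be compared directly and then freed.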
9. Tips and Hints
-
A suggested order in which to implement this lab:
  -
  Start by reviewing Section 2.9.6 of the textbook (see the link in Handy References) to understand what the .h and .c parts of the library are and how a user of your library uses its .o binary file and .h file.
  -
  Next, print out the design_worksheet from your repo and follow the directions for outlining the design of your solution to the parse_cmd function first. See Section 4 for more details and directions.
  -
  After filling out the worksheet with your design for parse_cmd, implement and test it. Run in valgrind to find and fix any memory errors.
  -
  Next, fill out the design_worksheet to design the parse_cmd_dynamic function.
  -
  Then implement and test it. Run in valgrind.
-
Implement and test incrementally! Break up the functionality of a function into parts that you implement and test incrementally. Use valgrind as you go to catch memory access errors as you make them.
-
Review strings, char, and pointers in C. Chapter 2 of the textbook contains sections on this material. And see other references listed in Section 11.
-
Remember if you dynamically allocate space for a string (using malloc), you need to allocate a space at the end for the terminating null character ('\0'), and that you need to explicitly free the space when you are done using it (call free).
-
Use string library and ctype functions. (For more info see the textbook, my string and char documentation off my help pages, and the man pages of individual functions.) Some that may be useful include:
strlen, strcpy, strchr, strstr, isspace
Here is an example of using strstr and modifying a string to create a substring:
int i;
char *ptr, *str;

str = malloc(sizeof(char)*64);
if(!str) { exit(1); }
ptr = strcpy(str, "hello there, how are you?");
if(!ptr) { exit(1); }
ptr = strstr(str, "how");
if(ptr) {
  printf("%s\n", ptr);  // prints: how are you?
  ptr[3] = '\0';
  printf("%s\n", ptr);  // prints: how
} else {
  printf("no how in str\n");
}
strstr may or may not be useful in this assignment, but you will need to create token strings in a way that has some similarities to this example.
-
Remember, you cannot directly compare two strings using == or other similar operators. Instead, you need to use strcmp.
-
You can directly compare the value of two char values using == and other relational operators. For example:
char *str;
...
if(str[i] == 'a') { ... }
-
Command lines with ampersands in the middle can be handled like bash handles them (bash ignores everything after the &):
"hello there & how are you?"
gets parsed into an argv list that looks like this:
argv[0]---->"hello"
argv[1]---->"there"
argv[2]----| (NULL)
-
Use gdb and valgrind as you incrementally implement and test.
-
Writing string processing code can often be tricky. Use the debugger to help you see what your code is doing. It may be helpful to step through each C statement using next. If you do this and want to see the results of instructions on program variables, you can use the display command to get gdb to automatically print out values every time it gains control. Here is an example of displaying two variables (ptr and i):
(gdb) display ptr
(gdb) display i
-
Think very carefully about type. Draw pictures to help you figure out what values you need to access and what their types are.
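The difference between == and strcmp from the tips above can be made concrete with a short sketch. Both function names are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* Compares the two pointer values: true only for the same address. */
int same_address(const char *a, const char *b) {
  return a == b;
}

/* Compares the characters the two pointers refer to. */
int same_contents(const char *a, const char *b) {
  return strcmp(a, b) == 0;
}
```

Given two separate arrays declared as `char x[] = "ls";` and `char y[] = "ls";`, same_address(x, y) is 0 because the arrays live at different addresses, while same_contents(x, y) is 1 because the characters match. Comparing single chars such as x[0] == y[0] with == is fine, since those are values rather than addresses.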
10. Submitting your Lab
Please remove any debugging output prior to submitting.
To submit your code, commit your changes locally using git add and
git commit. Then run git push while in your lab directory.
Only one partner needs to run the final git push, but make sure both
partners have pulled and merged each other’s changes.
Also, it is good practice to run make clean before doing a git add and
commit: you do not want to add to the repo any files that are built by gcc
(e.g. executable files). Included in your lab git repo is a .gitignore file
telling git to ignore these files, so you likely won’t add these types of files
by accident. However, if you have other gcc generated binaries in your repo,
please be careful about this.
Here are the commands to submit your solution files (from
either your or your partner’s ~/cs31/Labs/Lab9-userID1-userID2 subdirectory):
$ make clean
$ git add *.c
$ git commit -m "correct and well commented Lab9 solution"
$ git push
Verify that the results appear (e.g., by viewing the repository on CS31-f25). You will receive deductions for submitting code that does not run or repos with merge conflicts. Also note that the time stamp of your final submission is used to verify you submitted by the due date, or by the number of late days that you used on this lab, so please do not update your repo after you submit your final version for grading.
If you have difficulty pushing your changes, see the "Troubleshooting" and "can’t push" sections at the end of the Using Git for CS31 Labs page. And for more information and help with using git, see the git help page.
Lab 9 Survey
After submitting your complete Lab 9 solution, you should submit the required Lab 9 Questionnaire (each lab partner must do this). Note this link is typically live a day or two before the due date through several days after the due date. Please submit within one week of the lab due date.
11. Handy References
C resources specific to this assignment
-
Dive into Systems Chapter 2.9.6 Writing C libraries
-
Dive into Systems Chapter 2.5 2D arrays Method 2: The Programmer Friendly Way
-
Dive into Systems Chapter 2.6 Strings and the string library
-
Dive into Systems Chapter 2.9.2 Command line arguments
-
Week 11 Weekly Lab Examples: writing a C library, C strings, and an example program that uses the readline library (more here about: the readline library)
-
Week 8 Weekly Lab Examples: argv and command line args
General C Programming Resources
-
Week 5 with gdb and valgrind examples
-
make and Makefiles from Appendix 2 of the textbook.
General Lab Resources
-
Class EdSTEM page for questions and answers about lab assignment
General Unix Resources
-
Appendix 2: Using Unix, useful unix commands, editors, history, process control, make, man, etc.