This and all other programming assignments in this class should be done with a partner. See the "Safe File Sharing" link under the Collaboration Tools section of my help pages for information about how you and your partner can safely share code.
You must implement your shell in C (no C++). See C language links for information about C programming style, debugging tools, etc. You should use valgrind to help you remove all memory access errors from your program; I will check that your code is free of memory access errors and memory leaks when I grade it.
You and your partner should start by sketching out the design of your solution (use top-down design and think good modular design). Implement and test your code incrementally, and test individual functions in isolation as much as possible. For example, start with exec'ing a simple command given an absolute path name to the command. Once this works, move on to adding the next piece of functionality, test and debug it, then move on to adding the next piece, and so on. Use assert statements to test pre and post conditions of functions, and use gdb and valgrind to help you find and fix bugs.
In addition to being correct and robust to bad user input, your shell code should use good modular design, be efficient, and be well commented.
myshell> ls -la                 # long listing of curr directory
-rw------- 1 newhall users 628 Aug 14 11:25 Makefile
-rw------- 1 newhall users 34 Aug 14 11:21 foo.txt
-rw------- 1 newhall users 16499 Aug 14 11:26 main.c
myshell> cat foo 1> foo.out     # cat foo's contents to file foo.out
myshell> pwd                    # print current working directory
/home/newhall/public
myshell> cd                     # cd to HOME directory
myshell> pwd
/home/newhall
myshell> firefox &              # run firefox in the background
myshell>

Here cd is a built-in command, and ls, pwd and cat are commands executed by a child of the shell process.
In general a shell command is in the form:
commandname arg1 arg2 arg3 ...

To execute this command, the shell first checks if it is one of its built-ins, and if so invokes a function to execute it. If it is not a built-in, the shell parses the command line into the command_argv_list, creates a child process to execute the command, and waits for the child to complete the execution of the command.
Creating a new child process is done using the fork system call, and waiting for the child process to exit is done using the wait system call. fork creates a new process whose address space is a copy of its parent's (both the child and parent process continue at the instruction immediately following the call to fork). In the child process, fork returns 0; in the parent process, fork returns the pid of the child. The child process will call execv to execute the command. For example:
int child_pid = fork();
if(child_pid == -1) {
  // fork failed...handle this error
} else if(child_pid == 0) {
  // child process will execute code here to exec the command
  ...
  execv(full_path_name, command_argv_list);
} else {
  // parent process will execute this code
  ...
}

The parent can call wait or waitpid to block until a child exits:
// block until one of my child processes exits (any one):
pid = wait(&status); // block until child process exits
// OR to wait for a specific child process to exit:
pid = waitpid(childpid, &status, 0);

The execv system call overlays the calling process's image with a new process image and begins execution at the starting point in the new process image (e.g. the main function in a C program). As a result, exec does not return unless there is an error (do you understand why this is the case?).
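The fork, execv and waitpid steps described above can be combined into one helper. This is only a minimal sketch under stated assumptions: the name run_command is hypothetical, and exiting the child with status 127 when execv fails is a convention borrowed from other shells, not something this assignment requires.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, exec full_path_name in it, and wait for it to exit.
 * Returns the child's exit status (0-255), or -1 if fork or waitpid
 * failed or the child was killed by a signal.
 * (run_command is a hypothetical name used only for illustration.)
 */
int run_command(const char *full_path_name, char *const command_argv_list[])
{
    assert(full_path_name != NULL && command_argv_list != NULL);

    pid_t child_pid = fork();
    if (child_pid == -1) {
        perror("fork");
        return -1;
    }
    if (child_pid == 0) {
        /* child: replace this process image with the command */
        execv(full_path_name, command_argv_list);
        /* only reached if execv failed */
        perror(full_path_name);
        _exit(127);  /* assumed convention: 127 means "could not exec" */
    }

    /* parent: block until this specific child exits */
    int status;
    if (waitpid(child_pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than returning, so a failed exec cannot fall back into a second copy of the shell's main loop.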
See the "File I/O" and "strings" parts of my C help pages for some basic information about C strings and input functions. In addition, a couple functions that may be useful are readline and strtok. See their man pages for more information about how to use these (be careful about who is responsible for allocating and freeing memory used by these routines). If you use readline, you need to link with the readline library:
gcc -g -o myshell myshell.c -lreadline
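As one possible way to build the command_argv_list with strtok, the sketch below splits a command line in place on whitespace. The names parse_command and MAX_ARGS are hypothetical, and the sketch deliberately ignores the special characters (such as |, <, 1> and &) that a complete shell has to recognize.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 64  /* hypothetical limit, not from the assignment */

/*
 * Split line in place on spaces, tabs and newlines.
 * Fills argv with pointers into line and NULL-terminates the list.
 * max_args is the number of slots in argv (including the NULL slot).
 * Returns the number of tokens, or -1 if there are too many.
 */
int parse_command(char *line, char *argv[], int max_args)
{
    assert(line != NULL && argv != NULL && max_args >= 1);

    int argc = 0;
    for (char *tok = strtok(line, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
        if (argc == max_args - 1) {
            return -1;  /* no room left for the terminating NULL */
        }
        argv[argc++] = tok;
    }
    argv[argc] = NULL;  /* execv expects a NULL-terminated list */
    return argc;
}
```

Because strtok writes '\0' bytes into line and the argv entries point into that same buffer, a line returned by readline should be freed only after the command that uses argv has been executed.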
For a command such as:

% cat foo.txt

the shell program needs to locate the cat executable file in the user's path.
Use the getenv library function to get the value of the PATH environment variable. PATH is an ordered list of paths in which you should search for the command. It is in the form:

first_path:second_path:third_path: ...

For example, if the user's path is:

/usr/swat/bin:/usr/local/bin:/usr/bin:/usr/sbin

the shell should first look for the cat command file in /usr/swat/bin/cat. If it is not found there, then it should try /usr/local/bin/cat next, and so on.
To see your path: echo $PATH. To list the value of all your environment variables: env.
Note: There are versions of exec that do the shell path search for you. I do not want you to use these. Instead, your shell code should get the PATH environment variable and search for the command, and then construct the full path name of the command that it will pass to execv.
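The path search described above might be sketched as follows. The function name find_in_path and its signature are hypothetical. The sketch walks the colon-separated list without modifying it (the string returned by getenv should not be written to), skips empty entries, and uses access with X_OK as its only test, so it would also accept a directory that happens to have the command's name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Search each directory in path (a colon-separated list such as the
 * value of PATH) for an executable named cmd. On success, writes the
 * full path name into out and returns 0; otherwise returns -1.
 * (find_in_path is a hypothetical helper, not part of the assignment.)
 */
int find_in_path(const char *cmd, const char *path, char *out, size_t outlen)
{
    assert(cmd != NULL && path != NULL && out != NULL && outlen > 0);

    const char *dir = path;
    while (*dir != '\0') {
        size_t dirlen = strcspn(dir, ":");  /* length up to ':' or end */
        if (dirlen > 0) {
            int n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, dir, cmd);
            if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0) {
                return 0;  /* first match wins, as PATH is ordered */
            }
        }
        dir += dirlen;
        if (*dir == ':') {
            dir++;  /* step over the separator */
        }
    }
    out[0] = '\0';
    return -1;
}
```

A caller would pass getenv("PATH") as the second argument after checking that it is not NULL. Commands that already contain a '/' (such as /usr/bin/ps or ./hello) presumably should bypass the search and be handed to execv as given.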
Your shell should implement two built-in commands: cd and exit.
For more information on shell built-in functions, look at the man page for
builtins. Shell built-in functions are not executed
by forking and exec'ing an executable. Instead, the shell
process executes them itself.
To implement the cd command, your shell should get the value of its current working directory (cwd) by calling getcwd() on start-up. When the user enters the cd command, you must change the current working directory by calling chdir(). Subsequent calls to pwd or ls should reflect the change in the cwd as a result of executing cd.
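A minimal sketch of the cd built-in, assuming the hypothetical name builtin_cd and assuming that cd with no argument goes to the directory named by the HOME environment variable (as in the example session):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Change the shell's own working directory. A NULL dir means
 * "cd with no argument", which goes to $HOME.
 * Returns 0 on success, -1 on failure (after printing a message).
 * (builtin_cd is a hypothetical name used only for illustration.)
 */
int builtin_cd(const char *dir)
{
    if (dir == NULL) {
        dir = getenv("HOME");
        if (dir == NULL) {
            fprintf(stderr, "cd: HOME is not set\n");
            return -1;
        }
    }
    if (chdir(dir) == -1) {
        perror(dir);  /* e.g. "No such file or directory" */
        return -1;
    }
    return 0;
}
```

Since chdir is called in the shell process itself and not in a forked child, every child forked afterwards inherits the new working directory, which is why a later pwd or ls reflects the change.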
myshell> foo 1> foo.out                 # re-direct foo's stdout to file foo.out
myshell> foo 2> foo.err                 # re-direct foo's stderr to file foo.err
myshell> foo 1> foo.out2 2> foo.out2    # re-direct foo's stdout & stderr to foo.out2
myshell> foo < foo.in                   # re-direct foo's stdin from file foo.in
myshell> foo < foo.in 1> foo.out        # re-direct foo's stdin and stdout

I/O re-direction using '>' or '>&' need not be supported. For example, the following command can be an error in your shell even though it is a valid Unix command:
myshell> foo < foo.in >& foo.out2
Each process that is created (forked) gets a copy of its parent's file descriptor table. Every process has three default file identifiers in its file descriptor table: stdin (file descriptor 0), stdout (file descriptor 1), and stderr (file descriptor 2). The default values for each are the keyboard for stdin, and the terminal display for stdout and stderr. A shell re-directs its child process's I/O by manipulating the child's file identifiers (think carefully about at which point in the fork-exec process this needs to be done). You will need to use the open, close and dup system calls to redirect I/O. For example, to re-direct a process's stdout to a file named foo.out, I'd do the following:
fid = open("foo.out", O_WRONLY | O_CREAT, 0666); // open output file
close(1);   // close stdout
dup(fid);   // duplicate file descriptor fid, the duplicate
            // will go in the 1st free slot in the process's
            // file descriptor table (i.e. slot 1, the one
            // just closed, which is the file descriptor
            // to stdout)
close(fid); // we don't need fid file descriptor for this file

Now when the process writes to stdout (file descriptor 1), the output will go to the file foo.out instead of to the terminal.
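One way the open, close and dup sequence might fit into the fork-exec process is sketched below. The name run_with_stdout_to is hypothetical, the O_TRUNC flag is an added assumption (so an existing file is overwritten instead of partially rewritten), and the exit codes 126 and 127 are conventions borrowed from other shells.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run full_path_name with its stdout re-directed to out_file.
 * Returns the child's exit status, or -1 on fork/waitpid failure
 * or if the child was killed by a signal.
 * (run_with_stdout_to is a hypothetical name for illustration.)
 */
int run_with_stdout_to(const char *out_file, const char *full_path_name,
                       char *const command_argv_list[])
{
    assert(out_file != NULL && full_path_name != NULL);

    pid_t child_pid = fork();
    if (child_pid == -1) {
        perror("fork");
        return -1;
    }
    if (child_pid == 0) {
        /* child: rewire fd 1 after fork and before exec, so the
         * shell's own stdout is left untouched */
        int fid = open(out_file, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fid == -1) {
            perror(out_file);
            _exit(126);
        }
        close(1);
        if (dup(fid) != 1) {  /* lowest free slot should be 1 */
            _exit(126);
        }
        close(fid);
        execv(full_path_name, command_argv_list);
        perror(full_path_name);  /* only reached if execv failed */
        _exit(127);
    }

    int status;
    if (waitpid(child_pid, &status, 0) == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The same pattern, with descriptor 0 or 2 in place of 1 and suitable open flags, would presumably cover the stdin and stderr cases.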
When your shell program executes a command with a single pipe like the following:
myshell> cat foo.txt | grep -i blah

cat's output will be piped to grep's input. The shell process will fork two processes (one that will exec cat, the other that will exec grep), and will wait for both to finish before printing the next shell prompt. Use pipe and I/O redirection to set up communication between the two child processes.
The second process knows to exit when it reads EOF on its input; any process blocked on a read will unblock when the file is closed. However, if multiple processes have the same file open, only the last close to the file will unblock processes blocked reading the file; only the last close really closes the file. For a pipe, this means every copy of the write end must be closed before the reader sees EOF. Any time a process exits, all its open files are closed.
When you write programs that create pipes (or open files) and that fork
processes, you need to be very careful about how many processes have the
files open, so that EOF conditions can be detected by processes.
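The bookkeeping described above can be sketched for a single pipe as follows. The names exec_with_fd and run_pipeline are hypothetical, and the sketch assumes descriptors 0 and 1 are open in the shell when it runs. The point to notice is that the parent and both children each close every pipe descriptor they do not need, so that the reader can detect EOF.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: make fd `from` become fd `to`, then exec. Never returns. */
static void exec_with_fd(int from, int to, int pipe_id[2],
                         const char *path, char *const argv[])
{
    close(to);
    if (dup(from) != to) {  /* duplicate should land in slot `to` */
        _exit(126);
    }
    close(pipe_id[0]);      /* drop both original pipe descriptors */
    close(pipe_id[1]);
    execv(path, argv);
    perror(path);           /* only reached if execv failed */
    _exit(127);
}

/*
 * Run "left | right": left's stdout feeds right's stdin.
 * Returns right's exit status, or -1 on failure.
 */
int run_pipeline(const char *left_path, char *const left_argv[],
                 const char *right_path, char *const right_argv[])
{
    assert(left_path != NULL && right_path != NULL);

    int pipe_id[2];
    if (pipe(pipe_id) == -1) {
        perror("pipe");
        return -1;
    }
    pid_t left = fork();
    if (left == 0) {
        exec_with_fd(pipe_id[1], 1, pipe_id, left_path, left_argv);
    }
    pid_t right = (left == -1) ? -1 : fork();
    if (right == 0) {
        exec_with_fd(pipe_id[0], 0, pipe_id, right_path, right_argv);
    }

    /* parent: close both ends, or right never sees EOF */
    close(pipe_id[0]);
    close(pipe_id[1]);

    int status = 0, right_status = -1;
    if (left > 0) {
        waitpid(left, &status, 0);
    }
    if (right > 0 && waitpid(right, &status, 0) == right
        && WIFEXITED(status)) {
        right_status = WEXITSTATUS(status);
    }
    return right_status;
}
```

If the parent skipped its two close calls, the write end would stay open in the shell and a reader such as grep would block forever waiting for more input.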
Here are some examples of commands that your shell should handle.
I will likely test your shell programs using a more complete test suite,
so I recommend testing your shell with more than these commands.
Here are some additional features to try adding if you have time
(some are much more difficult than others):
Useful Unix System Calls
path = getenv("PATH");
cwd = getenv("PWD");
int cpid = fork();
if (cpid == 0) { // the child process
} else { // the parent process
pid = waitpid(cpid, &status, 0);
}
execv( full_path_name, command_argv_list);
access(full_path_name_of_file, X_OK | F_OK);
// to re-direct stdout to file foo
int fid = open("foo", O_WRONLY|O_CREAT, 0666);
close(1);
dup(fid);
close(fid);
int pipe_id[2];
pipe(pipe_id);
read(pipe_id[0], in_buf, len);
write(pipe_id[1], out_buf, len);
Test Commands
In testing your shell, if you are ever unsure about the output of a
command line, try running the same command line in bash or tcsh and
see what it does.
myshell>
myshell> ls
myshell> ls -al
myshell> cat foo.txt
myshell> cd /usr/bin
myshell> ls
myshell> cd ../
myshell> pwd
myshell> cd
myshell> find . -name foo.txt
myshell> wc foo.txt
myshell> wc blah.txt
myshell> /usr/bin/ps
myshell> /usr/bin/../bin/ps
myshell> firefox
myshell> exit
myshell> cat foo.txt | more
myshell> cat foo.txt | grep blah
myshell> cat foo.txt blah.txt 1> out.txt 2> out.txt
myshell> wc out.txt
myshell> cat < foo.txt 1> out2.txt
myshell> diff out.txt out2.txt
myshell> ls -la yeeha.txt 2> errorout.txt
myshell> exit
## test some error conditions
## your shell should gracefully handle
## errors by printing a useful error message and not crash or exit (it
## should just restart its main loop: print shell prompt, ...)
myshell> |
myshell> ./hello # assuming there is no hello executable
myshell> hello
myshell> cat foo1> out
myshell> 1> < 2>
Extra Credit
Try these mostly for fun and for a few extra credit points.
However, do not try these until your basic shell program is complete, correct,
robust, and bug free; an incomplete program with extra credit features
will be worth much less than a complete program with no extra credit features.
myshell> history # list the n most recent commands (10 in this example)
4 14:56 ls
5 14:56 cd texts/
6 14:57 ls
7 14:57 ls
8 14:57 cat hamlet.txt
9 14:57 cat hamlet.txt | grep off
10 14:57 pwd
11 14:57 whoami
12 14:57 ls
13 14:57 history
myshell> !8 # will execute command 8 from the history
Implement history for a reasonably sized, but smallish, number of
previous commands (50 would be good). And note that the command number
is always increasing. Don't use the readline functionality to do this.
Instead, implement a data structure for storing a command history, and
use it when implementing the built-in history command and !num syntax
to execute previous commands.
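One possible data structure for this is a fixed-size circular buffer indexed by command number. The sketch below is only an illustration: all of its names (struct history, history_init, history_add, history_get, MAX_CMD_LEN) are hypothetical, it truncates overly long commands, and it leaves out the timestamps shown in the example listing.

```c
#include <assert.h>
#include <string.h>

#define HISTORY_SIZE 50   /* size suggested in the text */
#define MAX_CMD_LEN 256   /* hypothetical limit on command length */

struct history {
    char cmds[HISTORY_SIZE][MAX_CMD_LEN];
    unsigned int next_num;  /* number the next command will get */
};

void history_init(struct history *h)
{
    assert(h != NULL);
    h->next_num = 1;  /* command numbers start at 1 and only increase */
}

/* Store a copy of cmd (truncated if too long) and return its number. */
unsigned int history_add(struct history *h, const char *cmd)
{
    assert(h != NULL && cmd != NULL);
    char *slot = h->cmds[h->next_num % HISTORY_SIZE];
    strncpy(slot, cmd, MAX_CMD_LEN - 1);
    slot[MAX_CMD_LEN - 1] = '\0';
    return h->next_num++;
}

/*
 * Return command number num, or NULL if that number was never
 * stored or has already been overwritten by a newer command.
 */
const char *history_get(const struct history *h, unsigned int num)
{
    assert(h != NULL);
    if (num == 0 || num >= h->next_num
        || h->next_num - num > HISTORY_SIZE) {
        return NULL;
    }
    return h->cmds[num % HISTORY_SIZE];
}
```

With this layout, !num reduces to a history_get lookup followed by running the returned string as if it had been typed, and the history built-in walks the most recent command numbers in increasing order.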
gcc -g -o myshell myshell.c -lncurses -lreadline
I have no idea how difficult it may be to add this feature.
myshell> cat foo | grep blah | grep grrr | grep yee_ha
What to Hand in
Submit a single tar file with the following contents using cs45handin
(see Unix Tools for more information on script, dos2unix, make, and tar):
script
(script takes an optional filename arg. Without it, script's output will
go to a file named typescript)

> script
Script started, file is typescript
% ./myshell
myshell> ls
foo.txt myshell
myshell> exit
good bye
% exit
exit
Script done, file is typescript

Then clean up the typescript file by running dos2unix:

> dos2unix typescript   # you may need to run dos2unix more than
                        # one time to remove all control chars
> mv typescript outputfile

Finally, edit the output file by inserting comments around the specific parts that you tested. The comments should be easy for me to find and should explain what you are testing. The idea is for you to ensure that if your shell correctly implements some feature, I can test it for that feature. By showing me an example of how you tested a feature, and making sure that I can easily find your test, you make it more likely that I am able to verify that the feature works in your shell program. For example, you could put special characters before your comments so that they are easy for me to find (like using '#' in the following example):
#
# Here I am showing how my shell handles the built-in commands
#
myshell> cd
...
#
# Here I am showing how my shell handles commands with pipes
#
myshell> cat foo.txt | grep blah
...