Lab 2: Implementing System Calls
Due: Wed, Oct. 8 before 1am (very late Tuesday night)

Problem Introduction
Implementation Details
What to Hand in
Preparing Your Demo

QEMU

This is the first of several projects that involve modifying the Linux kernel. You will use QEMU running on one of the CS Lab machines for your Linux kernel development. First come see me or one of the sysadmins to get the root password for the qemu machine and see your machine assignment below (it is important for getting networking to work correctly that you do not run your qemu machine on the same physical machine as any other group). Then, follow the "QEMU Start-up" instructions from the QEMU GUIDE available here to set up QEMU so that you and your partner can share the same virtual machine.

Machine assignments:

anise      Phyo and Cyrus 
chervil    Colin and Joel 
cream      Doug and Eric 
orange     Racheal and Geoffery

Once you have set up your QEMU environment, try running QEMU, then booting and shutting down Linux inside it. Next, try building the kernel from source, then installing and booting it.

You and your partner should run QEMU from the PC that you have been assigned to so that we can distribute the QEMU load across the PCs. You can remotely login to your PC from a Sun Lab machine or from one of the PCs in the robot lab.

Starting Point Code

Copy over lab2 starting point code from my public/cs45 subdirectory:

$ cp /home/newhall/public/cs45/lab2/* .

This contains examples that show what you need to #include and how to compile user-level test programs that call your new system calls.

Introduction

For this project you will add two new system calls to the Linux kernel and write user-level programs that test your system calls. First, you will implement the getcurrenttime system call as described below. Next, you will implement a system call named procinfo that returns information about a running process given a process identifier argument.

A system call is a function that is exported by the kernel to user-level programs. User-level programs invoke a system call stub, which contains a TRAP instruction to trap to the kernel. Along with the trap instruction, the user-level program "passes" the kernel the system call number, which the kernel uses as an index into its system call table to determine which system call routine to invoke. The details of how the system call number is passed depend on the particular architecture. The kernel then executes the system call on behalf of the user-level process; system calls are the user-level interface to the kernel. Some system calls may be blocking, meaning that the kernel may de-schedule the calling process to wait for some event that is triggered by the system call, and schedule other processes to run in the meantime. When the event occurs, the kernel is interrupted to handle the event. The fast part of handling the event is done immediately by the kernel with interrupts disabled; if there is a slow part of handling the interrupt, it is done with interrupts enabled, and it may be done later (this is what the "bottom half" of an interrupt handler does). The kernel completes the system call within the blocked process's context and returns from the system call. An example of a blocking system call is a read from a file on disk.
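To make the table lookup concrete, here is a minimal user-space sketch of the dispatch step. It is not kernel code: the names demo_call_table, demo_dispatch, and the two handlers are made up for illustration, and real kernel handlers take different arguments. It only shows the idea of using the system call number as an index into a table of function pointers, with a bounds check for unknown numbers.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: each handler takes one long and returns a long. */
typedef long (*syscall_fn)(long arg);

/* Two made-up handlers standing in for real system call routines. */
static long demo_getpid(long arg) { (void)arg; return 42; }  /* made-up value */
static long demo_double(long arg) { return 2 * arg; }

/* The table: the array index plays the role of the system call number. */
static const syscall_fn demo_call_table[] = {
    demo_getpid,  /* 0 */
    demo_double,  /* 1 */
};

#define DEMO_NR_SYSCALLS (sizeof(demo_call_table) / sizeof(demo_call_table[0]))

/* Conceptually what happens after the trap: check the number,
 * index the table, and call the routine found there. */
long demo_dispatch(unsigned long nr, long arg)
{
    if (nr >= DEMO_NR_SYSCALLS)
        return -ENOSYS;  /* no such system call */
    return demo_call_table[nr](arg);
}
```

Calling demo_dispatch(1, 21) runs demo_double and returns 42, while demo_dispatch(2, 0) falls outside the table and returns -ENOSYS.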

There are three parts to a system call: (1) the system call entry, which does a context switch from user space to kernel space, possibly copying system call arguments from user to kernel space; (2) the execution of the system call in kernel space; and (3) the return from the system call, which may require copying a return value from kernel to user space and does a context switch from kernel space back to user space.

Implementation Details

I suggest first implementing Part 1 together and, after it is debugged and working, implementing Part 2 together. Also, it is better to have one of the two system calls working than to submit two non-working partial implementations.

Part 1: getcurrenttime system call

You will add your new system call in a separate file in the kernel. The getcurrenttime system call returns the value of the kernel xtime variable to the user-level program that calls it.
NOTE: as you modify existing kernel modules, first make a copy of the .c or .h or .S file as orig_filename.c, then modify filename.c. This way, if you really break something, you can easily go back to the original kernel source.

Steps for implementing a system call:

Modify the kernel's system call table so it can call your system call: Define a new system call number and add an entry for your new system call to the kernel's system call table (sys_call_table) located in arch/i386/kernel/syscall_table.S. This file is in i386 assembly code and it looks something like:
```
  
ENTRY(sys_call_table) 
  .long sys_restart_syscall       /* 0 - old "setup()" system call, used for restarting */ 
  .long sys_exit 
  .long sys_fork 
  .long sys_read

  ...  
  .long sys_splice 
  .long sys_sync_file_range 
  .long sys_tee                   /* 315 */
  .long sys_vmsplice 
  .long sys_move_pages
```
To this table you will add .long sys_yoursystemcall entries. For example, if I wanted to add a new system call named blah, I would modify the table in syscall_table.S as follows:

  ...
  .long sys_tee                   /* 315 */
  .long sys_vmsplice
  .long sys_move_pages
  .long sys_blah                  /* my new system call is number 318 in the table */

(syscall_table.S is included in entry.S, where the size of the system call table is determined.)

Next, you need to add a definition for your new system call to /local/you_and_pal/qemu/linux-source-2.6.18/include/asm-i386/unistd.h (or /local/you_and_pal/qemu/linux-source-2.6.18/include/asm/unistd.h, which should be the same file). For example, I'd add the following for my new system call "blah", which is system call number 318:

  #define __NR_blah 318
  ...
  #define NR_syscalls 319   /* make sure this accounts for ones you add */
  ...

 Implement your system call: add a new file to the kernel
	that contains your system call code.  You should add the
	file in /local/you_and_pal/linux-source-2.6.18/kernel.  Your
	system call function must have "asmlinkage" prepended to
	its header and a "sys_" prefix on its name.
	For example:
/* file: blah.c */
// only include kernel.h in kernel-level code (like this)
#include <linux/kernel.h>
#include <asm/uaccess.h>
#include <linux/time.h>

/* blah system call: returns the current system time through
 *  		     thetime argument
 *  
 *    thetime: address of user-level timeval struct into which
 *		the kernel will copy the value of its xtime variable
 *    flag: if set (non-zero), will print time to stdout
 * 
 *    returns: 0 on success, non-zero on error    
 */	
asmlinkage long sys_blah(int flag, struct timeval *thetime){

	/* to print a debug message use printk, which is like printf;
	 * its output goes to the kernel log (console or dmesg), not to
	 * the calling program's stdout
	 */
	printk("Inside system call blah\n");

	/* copy arguments from user space to kernel space
	 * to copy arguments (passed by reference) from user 
	 * space to kernel space:
	 * (1) first call access_ok() to check if the space
	 *     pointed to by thetime is valid
	 * (2) then call copy_from_user() to copy to kernel space
	 *    
	 * note: the value of the argument 'flag' is passed on the stack
	 *       so it does not need to be explicitly copied from user 
	 *       space 
	 */
			
	/* if we access any kernel variables that could be
	 * modified by interrupt handlers that interrupt our syscall,
	 * then we better put some synchronization around their access
	 * (for fast accesses use spinlocks).  Reading or Writing 
	 * a 4-byte value is atomic.
	 */

	/* copy pass by reference "return" values to user-space
	 *  (1) first call access_ok() to check if the space
	 *      pointed to by thetime is valid (if we have not already done so)
	 *  (2) then call copy_to_user() to copy the value to user space
 	 */

	/* a successful return from the system call */
	return 0;
}


Modify the Makefile in /local/you_and_pal/linux-source-2.6.18/kernel
to include your new file, so that your system call function will be built
and included in your kernel executable file.  To the obj-y definition in
the Makefile, add your_new_file_name.o:
obj-y     = sched.o fork.o exec_domain.o panic.o printk.o profile.o \
            exit.o itimer.o time.o softirq.o resource.o \
            sysctl.o capability.o ptrace.o timer.o user.o \
            signal.o sys.o kmod.o workqueue.o pid.o \
            rcupdate.o extable.o params.o posix-timers.o \
            kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o mutex.o \
            hrtimer.o rwsem.o your_new_file_name.o

Then, rebuild the Linux kernel.

Write a user-level program that calls your system call. You should add a user account and use it to run the user-level programs used to test your system calls. 'adduser username' will create a new user account for a user named username, with a home directory in /home/username.

There are two ways to call a system call. One way is to generate a system call stub, and then invoke the stub (the generated stub takes care of setting up the parameters to pass to the system call and then traps to the kernel). The other way is to make a call to the syscall system call, passing your system call's number and its args. In either case, include:

  #include <sys/time.h>
  #include <errno.h>
  #include <linux/unistd.h>

To generate a stub for your system call, invoke one of these macros in your test program:

  _syscalln(return_type, name, type1, arg1, type2, arg2, ...)

For example, in a user program, #include <linux/unistd.h> and add the following to generate a system call stub routine for a system call that returns an int and takes two parameters (a char * and an int):

  _syscall2(int, my_new_system_call, char *, str, int, len);

After this, you can directly call my_new_system_call in your code. For example:

  int x = my_new_system_call("blah", 4);

TO COMPILE and get your modified header files (i.e. your unistd.h) included in your executable, use a -I gcc command line option that specifies the path to these include files (-proj2 should be the -whatever flag you added to your kernel version when you built it):

  gcc -I/usr/src/linux-headers-2.6.18-proj2/include mytestprog.c

Use a simple makefile so that you don't have to type this in every time.
Write a test program that calls your new system call two different ways:

 (1) Generate stub routines for your new system call as shown above and then make a call to your new system call just as you would any other Unix system call.
 (2) Use your new system call's number and the syscall system call to invoke your system call (see the man page for syscall for more information):

       int syscall(number, arg, ...);
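As a rough sketch of the second way (calling by number), the helper below wraps syscall(). Because your new call only exists in your modified kernel, the usage note uses SYS_getpid, a standard call, as a stand-in. The name call_by_number is hypothetical; your own test would pass your call's __NR_ number and its arguments instead. The _syscalln macros come from the 2.6.18-era headers and may not exist on newer systems, so this sketch relies only on syscall().

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke a system call that takes no arguments, by number.
 * Returns the call's result, or -1 with errno set on failure. */
long call_by_number(long nr)
{
    errno = 0;            /* clear any stale error before the call */
    return syscall(nr);   /* traps to the kernel with number nr */
}
```

For example, call_by_number(SYS_getpid) returns the same value as getpid(). When a call fails, syscall() returns -1 and the error code is left in errno.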


Specific implementation hints:

 Start by implementing a simple system call that just prints out an
"inside my new system call" message using printk, then 
incrementally add more functionality and test.

 Make sure to check that arguments are okay (passing bad values
to a system call should not crash the kernel).

 If your system call needs a new type definition (e.g. it takes a
struct argument whose type is not already known by the kernel), then
you need to add a header file to the kernel that contains the new type's
definition.  You should add the header file in
/local/you_and_pal/linux-source-2.6.18/include/linux/.
Header file contents (prototypes and definitions) that should be
visible only inside the kernel (not at user level) should be placed between
#ifdef __KERNEL__ and #endif preprocessor directives
(for a new type passed from a user-level program to a system call, this is
not the case, but for later assignments you may need this).

 Often the best way to answer questions about how to implement something
in the kernel is to look at examples of similar kernel code.  For example,
to learn more about system calls you may want to search the Linux kernel
for existing system calls to see how they are implemented (getpid and
gettimeofday might be useful starting points).
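The hint about checking arguments, together with the copy_to_user() comments in the skeleton above, can be pictured with a small user-space mock. This is only a sketch under stated assumptions: mock_copy_to_user is a stand-in that treats NULL as the only bad address (the real kernel checks the whole address range), and demo_time, demo_sys_gettime, and the snapshot values are invented for illustration. What it does borrow from the real interface is the convention that the copy routine returns the number of bytes it could not copy.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's time value. */
struct demo_time {
    long sec;
    long usec;
};

/* Mock of copy_to_user(): returns the number of bytes NOT copied,
 * so 0 means success. Only NULL is treated as an invalid address here. */
static unsigned long mock_copy_to_user(void *to, const void *from,
                                       unsigned long n)
{
    if (to == NULL)
        return n;
    memcpy(to, from, n);
    return 0;
}

/* Shape of a call that hands a struct back through a pointer argument:
 * take a local snapshot, copy it out, and report -EFAULT if the copy fails. */
long demo_sys_gettime(struct demo_time *user_out)
{
    struct demo_time snapshot = { 1000, 250 };  /* made-up values */

    if (mock_copy_to_user(user_out, &snapshot, sizeof(snapshot)) != 0)
        return -EFAULT;
    return 0;
}
```

With a valid pointer the call returns 0 and fills in the struct; with a NULL pointer it returns -EFAULT instead of crashing.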


Part 2: procinfo system call

After you have the getcurrenttime system call implemented and tested, next
implement a system call named procinfo.  This system call
will take a process id and a pointer to a proc_info_struct as
arguments.  It will fill in the values of the proc_info_struct argument
for the specified process.  Your system call should return 0 if 
the proc_info_struct was successfully filled in.  Otherwise, it should
return one of the following error values: 


 ESRCH  if a process with the given pid does not exist
 EINVAL if there are errors with the  proc_info_struct or pid arguments 
 EFAULT if there is an error writing to user space
 
(ESRCH, EINVAL, ... , are defined in linux/errno.h).  Take a look at
how other system calls return error values to figure out how your
code should do this.    
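As a hint about what the user-level side sees: Linux system calls report an error by returning the negated error number from the kernel, and the user-level stub turns that into a return value of -1 with errno set. The sketch below models only the user-level half. The name demo_stub_return is hypothetical, and the -4095 bound is an assumption borrowed from common C library practice; the 2.6.18 stub macros may use a different bound.

```c
#include <errno.h>

/* Model of what a user-level stub does with the raw value that comes
 * back from the kernel: small negative values are error numbers. */
long demo_stub_return(long raw)
{
    if (raw < 0 && raw >= -4095) {  /* assumed error range */
        errno = (int)-raw;          /* e.g. -ESRCH becomes errno = ESRCH */
        return -1;
    }
    return raw;                     /* success: pass the value through */
}
```

So a kernel routine that returns -ESRCH shows up in the test program as a return value of -1 with errno equal to ESRCH, which perror() can then print.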

current is a kernel global variable that points to the currently running
process's task_struct.  Use this variable to get access to the caller's
task_struct.

Start by defining the proc_info_struct in a new header file that you
create in include/linux/.  The struct should have the
following fields:
int pid;                   /* pid of process */
int parn_pid;              /* pid of its parent process */
long user_time;            /* total CPU time in user mode */
long sys_time;             /* total CPU time in system mode */
long state;                /* its current state */
unsigned long rt_priority; /* its real-time scheduling priority */
int time_slice;            /* its scheduling time slice (amt. left) */
unsigned policy;           /* its scheduling policy */
unsigned long num_cxs;     /* number of context switches it has had (sum of voluntary and involuntary cxs) */
int uid;                   /* its user id */
int gid;                   /* its group id */
int num_children;          /* the number of child processes it has */
char prog[16];             /* its exec'ed file name (e.g. a.out) */

You can fill in these values by accessing a process' task_struct
that is defined in include/linux/sched.h.  Field values in the
proc_info_struct that correspond to null pointer values in
the process' task_struct should be set to -1.  Not all field
values in your struct match the names of fields in the task_struct, so you
may need to read through some code and/or try some things out before you get
the right values for these fields.  Types are defined in types.h files in
architecture-neutral and architecture-specific subdirectories of include.
For example:
/local/you_and_pal/linux-source-2.6.18/include/linux/types.h.
Errors are
defined in /local/you_and_pal/linux-source-2.6.18/include/asm-generic/errno.h
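The rule about null pointers can be illustrated with a tiny user-space sketch. The structs demo_task and demo_info and the function demo_fill_ids are invented for this example and are much simpler than task_struct and proc_info_struct; the point is only the pattern of testing a pointer before following it and storing -1 when it is null.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a process record. */
struct demo_task {
    int pid;
    struct demo_task *parent;  /* may be NULL */
};

/* Hypothetical, simplified stand-in for the info struct. */
struct demo_info {
    int pid;
    int parn_pid;
};

/* Fill in the ids, using -1 when there is no parent to read from. */
void demo_fill_ids(struct demo_info *info, const struct demo_task *task)
{
    info->pid = task->pid;
    info->parn_pid = (task->parent != NULL) ? task->parent->pid : -1;
}
```

A record whose parent pointer is set gets the parent's pid; a record whose parent pointer is NULL gets -1 in parn_pid.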


A few things to help you determine what to do and if your system call returns
correct information:


 The macro find_task_by_pid might be useful.

 Looking at the list interface defined in include/linux/list.h might be useful.  Also, look at some examples of kernel code that uses these macros to figure out how to use them.

 The Linux Source Code Browser is useful for quickly finding things in the source code.  Note that we are using version 2.6.18 of the kernel, so you should make sure you are browsing that version too.

 top, ps, and files in /proc may give you some information about running processes that can help you check whether you are getting the right values for some of the fields.  In /proc are subdirectories for each running process in the system (the directory name is its pid value), with files you can cat out to get information about the process.

 Also, you may want to add new users to your machine so that you can test processes owned by someone other than root.  Run the adduser command to add new users.
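To get a feel for the style of list that include/linux/list.h provides before reading the real macros, here is a small user-space imitation. Everything with a demo_ prefix is made up for this sketch, and the field names children and sibling are chosen for illustration, so check sched.h for the real ones. The idea it shows is the intrusive circular list: the list node lives inside the record, an empty list points at itself, and the enclosing record is recovered from the node's address.

```c
#include <stddef.h>

/* Imitation of a circular doubly linked list node. */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define demo_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void demo_list_init(struct demo_list_head *head)
{
    head->next = head;  /* an empty list points at itself */
    head->prev = head;
}

static void demo_list_add_tail(struct demo_list_head *node,
                               struct demo_list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Hypothetical process record: "children" heads the list of children,
 * "sibling" links this record into its parent's list. */
struct demo_proc {
    int pid;
    struct demo_list_head children;
    struct demo_list_head sibling;
};

void demo_proc_init(struct demo_proc *p, int pid)
{
    p->pid = pid;
    demo_list_init(&p->children);
    demo_list_init(&p->sibling);
}

void demo_proc_add_child(struct demo_proc *parent, struct demo_proc *child)
{
    demo_list_add_tail(&child->sibling, &parent->children);
}

/* Walk the circular list until we are back at the head. */
int demo_count_children(const struct demo_proc *parent)
{
    int n = 0;
    const struct demo_list_head *pos;

    for (pos = parent->children.next; pos != &parent->children; pos = pos->next)
        n++;
    return n;
}

/* Return the pid of the first child, or -1 if there are none. */
int demo_first_child_pid(struct demo_proc *parent)
{
    if (parent->children.next == &parent->children)
        return -1;
    return demo_list_entry(parent->children.next, struct demo_proc, sibling)->pid;
}
```

After adding two children with pids 10 and 11 to a parent, demo_count_children returns 2 and demo_first_child_pid returns 10. Note that real kernel code must also think about synchronization while walking such lists, which this sketch ignores.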




Hand In

Submit the following via cs45handin before the due date:


 A README file with the following information:

   Your and your partner's names
   The total number of late days that you have used so far
   A list of the Linux source files you modified (list the complete path name of each file)
   A list of any features that you do not have fully implemented

 Copies of the kernel source and header files that you modified/added for this project (I only want the files you modified or added; do not submit a tar file containing all the kernel code).

 Your test programs, which should contain multiple calls to your new system calls that demonstrate how they handle valid and invalid input, and should show both calls made using syscall and calls made by first using _syscall2 to generate the system call stub.

In addition, you and your partner should sign up for a 15 minute demo time where you will run your test program for me and show me that it works. A demo sign-up sheet will be outside my office door.

Preparing for a Demo

During your demo slot you and your partner will demonstrate that your solution works. It is up to you to determine how to demonstrate this to me. When I meet with you, your kernel should be up and running on QEMU (unless there is something that happens during the boot process that you want to show me), and your demo should be ready to run; if you spend your entire demo slot setting up QEMU, then I can only conclude that your solution does not work.

A demo is something that you and your partner should practice before you give it; you want to make sure that it runs correctly and that it demonstrates that your solution is correct and complete. Make sure that you are demo'ing both how your system call works under normal conditions and how you are handling error conditions. Also, be prepared to answer questions during your demo about your implementation and about your test programs.

Often, demonstrating that your solution works means that you will need a way to run a version of your kernel with debugging output enabled, and you may need to show via Unix commands or /proc information that your system calls obtain the correct information or do the right thing. In addition, for most demos, you will want to write one or more interactive demo applications (menu-driven programs), where you choose from a menu of options for invoking your system call(s), execute a system call, examine system state or kernel output to verify that it did the right thing, then choose the next system call to execute, and so on; you want to be prepared to discuss and to demonstrate the effects after any single system call.
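One possible way to cross-check values during a demo is to read a process's /proc entry from a user-level program and compare it with what your system call reports. The sketch below reads /proc/<pid>/stat, whose line begins with the pid, the program name in parentheses, a one-letter state, and the parent's pid. The function name proc_stat_lookup is hypothetical, and the field layout is taken from the proc man page rather than from this assignment, so verify it on your own machine.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Read the state letter and parent pid of a process from /proc/<pid>/stat.
 * Returns 0 on success, -1 if the file cannot be read or parsed. */
int proc_stat_lookup(pid_t pid, char *state, int *ppid)
{
    char path[64];
    char line[1024];
    char *close_paren;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    /* The program name may itself contain spaces or parentheses,
     * so parse from the LAST ')' on the line. */
    close_paren = strrchr(line, ')');
    if (close_paren == NULL)
        return -1;
    if (sscanf(close_paren + 1, " %c %d", state, ppid) != 2)
        return -1;
    return 0;
}
```

A demo program could call this for the same pid it passes to procinfo and print both results side by side, so that any mismatch in the parent pid or state is easy to spot.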
