CS 45 — Lab 2: Implementing System Calls

Checkpoint: Friday, February 14 (3-minute demos during lab)

Due: Thursday, February 20 @ 11:59 PM

1. Overview

For this lab, you’ll be adding two new system calls to the Linux kernel. The first will retrieve the current time and return it by copying memory into the calling userspace process. The second will modify the PCB (in Linux, the task_struct) of a chosen process so that it no longer appears in the process list.

1.1. Goals

  • Gain an understanding of the OS side of system calls and practice writing your own system calls.

  • Learn to read the Linux kernel’s source code (a little!) and interact with important data structures (e.g., task_struct).

  • Produce userspace test cases to evaluate your kernel code implementation.

  • Practice building and deploying the Linux kernel.

1.3. Lab Recordings

Week 1

Week 2

2. Requirements

For this assignment, you will add two new system calls to the Linux kernel. Please make sure that you:

  1. Provide all of the files I need to build a kernel with your new system calls. That includes all headers you modified, any C files you modified, and any files you added.

  2. Define the system calls using the expected system call numbers. If you don’t, you’ll break my grading programs. Nobody benefits from an unhappy grader.

2.1. System call: getcurrenttime (define as syscall # 333)

You should implement a getcurrenttime system call that takes two parameters: a flag that specifies whether or not to print when executing and a pointer to a struct timespec. Your implementation should print a simple debugging status message using printk() if the flag is non-zero. It should then verify that the timespec pointer is valid, and if so, look up the current time and copy it into the pointer’s destination.

To a userspace application, the call might look like:

/*
 *  pflag: if non-zero print inside the system call (using printk)
 *  tspec: pointer passed to your system call, the current time
 *         will be "returned" through this parameter
 *  returns: 0 on success, -1 on error
 */
long getcurrenttime(int pflag, struct timespec *tspec);

On success, your system call should return 0. If it fails, it should return an error code that describes what went wrong. You are responsible for detecting and reporting the following conditions:

  • Verify that the struct timespec pointer is writable userspace memory using access_ok. If this fails, you should return the constant -EINVAL (invalid argument). On success, access_ok returns true (non-zero), and on failure, it returns false (zero).

The access_ok function verifies that a pointer that claims to be from the user truly belongs to a range of addresses that correspond to a userspace virtual address space. It’s not checking "can I write to this location?"; it’s checking "is the user-supplied pointer really pointing to an area of memory the user owns?". The function is a security feature: it prevents a malicious user from supplying a pointer that points into kernel memory space and tricking you into writing into (or reading from) kernel data structures.

If you want to make it fail, you need to provide a pointer whose value couldn’t possibly be in the range of your process’s virtual address space. One easy value to test with is the largest possible pointer you can generate. The x86-64 architecture we’re using currently uses 48-bit virtual addresses, which makes the largest pointer 0xFFFFFFFFFFFF (281474976710655).

  • Copy the time value to the userspace timespec struct using copy_to_user. If this fails (e.g., the pointer is NULL or otherwise not writable memory), you should return the constant -EFAULT (bad address). The copy_to_user function returns the number of bytes that couldn’t be copied, so you should interpret zero as success and non-zero as failure.

Note that in userspace, you will not see the error value directly in the return value of your system call. Instead, the call returns -1 and the errno variable is set to the positive form of the error code your system call returned (e.g., EINVAL if the kernel code returned -EINVAL). This behavior allows you to use perror() to print a description of the error.
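To make that translation concrete, here is a simplified model of what the C library's syscall() wrapper does with the kernel's return value. The function name model_syscall_return is made up for illustration, and this is only a sketch of the convention, not the library's actual code:

```c
#include <errno.h>

/* Hypothetical model of the C library's syscall() wrapper, for illustration
 * only. The kernel hands back a single long; values in [-4095, -1] are
 * treated as negated error codes. */
long model_syscall_return(long kernel_ret)
{
    if (kernel_ret < 0 && kernel_ret >= -4095) {
        errno = (int)(-kernel_ret);   /* e.g. -EINVAL becomes errno == EINVAL */
        return -1;
    }
    return kernel_ret;                /* success: value passed through as-is */
}
```

In a test program, this is why you check for a return value of -1 and then compare errno against EINVAL or EFAULT, rather than comparing the return value against -EINVAL.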

2.1.1. Getting the time

The Linux kernel has many functions for dealing with time. A good place to start looking at the available functions is in kernel/time/, which has files like timekeeping.c and time.c. The corresponding headers, which you’ll need to include in your system call’s .c file to use the time functionality, are in include/linux. The ktime.h header seems to include most of the other headers that may be of interest.

Your ultimate goal is to get the current time as a struct timespec and then copy the value of that struct into the location provided by the user using copy_to_user. Functions that deal with timespecs should be helpful. Note that you do NOT need to write much code to get the time. If you find yourself writing a lot of code to wrangle time formats, there’s probably a simpler way.

You should avoid using any functions that start with a sys_ prefix. Those functions are themselves system calls, and they’re generally not intended to be called by other system calls.

2.2. System call: stealth (define as syscall # 334)

The ps ax command will print a list of all processes on the system. You can use grep to help narrow the list (e.g., ps ax | grep vim to show only processes with 'vim' in the name). The leftmost column of the output is the process id (PID), and the rightmost column is the process’s name. On Linux, ps reads the information about all processes from a special pseudo-file system in /proc. If you execute ls /proc, you’ll see lots of directories named with a number, each of which corresponds to a PID. The files in those directories contain information about the corresponding processes. Note that these files are not stored permanently on any disk. The info is all stored in the OS’s data structures, and the /proc file system is simply the interface it uses to make that information available to users.
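A test program can inspect /proc the same way ps does. The following is a minimal sketch of that idea; the helper names is_pid_dirname and pid_listed_in_proc are invented for this example, and it assumes that PID directories are exactly the entries in /proc whose names are all digits:

```c
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns 1 if name consists only of decimal digits (a PID directory), else 0. */
int is_pid_dirname(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++) {
        if (!isdigit((unsigned char)*name))
            return 0;
    }
    return 1;
}

/* Returns 1 if pid is listed in /proc, 0 if not, -1 if /proc can't be opened. */
int pid_listed_in_proc(pid_t pid)
{
    DIR *dir = opendir("/proc");
    struct dirent *entry;
    int found = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (is_pid_dirname(entry->d_name) && atol(entry->d_name) == (long)pid) {
            found = 1;
            break;
        }
    }
    closedir(dir);
    return found;
}
```

Calling pid_listed_in_proc(getpid()) should return 1 for any ordinary running process, which gives you a baseline to compare against when testing process visibility.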

You should implement a stealth system call that will hide (or unhide, if called a second time) a process from being listed in the /proc file system. By hiding the process in /proc, it will no longer show up in ps’s output list. To a userspace application, the call might look like:

/*
 *  pid: the process id of the process whose stealth status should be toggled
 *  returns: 0 on success, -1 on error
 */
long stealth(pid_t pid);

On success, your system call should return 0. If it fails, it should return an error code that describes what went wrong. You are responsible for detecting and reporting the following conditions:

  • Verify that the provided PID matches a real process by looking for the corresponding process’s task_struct. If this fails, you should return the constant -ESRCH (no such process).

2.2.1. Task structs

To implement your stealth system call, you’ll need to add a flag representing a process’s stealth status to the Linux task_struct, which can be found in include/linux/sched.h. You’ll also need to update the INIT_TASK macro in include/linux/init_task.h to initialize your new flag when a new task_struct gets created. Your stealth system call should toggle this flag’s value.

When accessing and manipulating instances of the task_struct, you’re expected to be holding locks. I would suggest looking around at other places where task_structs are used to see how the locking is used. The Elixir cross reference tool will help you to find examples.

2.2.2. Process hiding

To hide a process, you’ll need to edit the implementation of the /proc file system so that it skips over any processes whose stealth flag is set. The /proc implementation lives in the fs/proc directory, with fs/proc/root.c being the "main" file (this is unlikely to be the file where you make your changes, but it should help you to get started with discovering how the file system works).

Once you find the right place, your changes to the code are likely to be trivial (e.g., "if stealth flag is set, skip over entry"). The goal here is to get you familiar with looking around in the kernel’s code.

2.3. Userspace test applications

Having implemented your system calls, you need to test them! You should write two small userspace test applications, one for each of your two system calls, to test that each works as intended. Your test programs should attempt to test as many cases as possible (e.g., calls that will generate errors in addition to correct runs).

In the kernel, each system call is given a unique integer, and you’ll need to know the integer assigned to your system calls (see "Defining System Calls" below). To invoke your new system call, you can use the syscall() function. The first parameter is the system call number, and then you can pass as many additional arguments as your system call needs. For example, to call getcurrenttime(), you could invoke:

int pflag = 1;
struct timespec ts;

long result = syscall(333, pflag, &ts);  // syscall() returns a long; 333 should be the number you assign to getcurrenttime()

Since the above method of making system calls is ugly, I would suggest defining a simple macro for each call to give it a more reasonable name:

#define getcurrenttime(arg1, arg2) syscall(333, (arg1), (arg2))

/* Now you can call getcurrenttime() normally: */
long result = getcurrenttime(pflag, &ts);
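As one possible shape for a test program, the sketch below runs a valid call and a call with an out-of-range pointer, then compares each outcome to the expected errno. The helper names check_result and run_getcurrenttime_tests are hypothetical, and the expected EINVAL for the bad pointer follows the requirements in section 2.1. It is only meaningful on a kernel where syscall 333 is your getcurrenttime; on a stock kernel that number may belong to a different call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define getcurrenttime(arg1, arg2) syscall(333, (arg1), (arg2))

/* Hypothetical helper: compare a call's outcome to what the test expected.
 * Pass expected_errno == 0 for calls that should succeed.
 * Returns 1 if the outcome matches, 0 otherwise. */
int check_result(const char *label, long ret, int actual_errno, int expected_errno)
{
    int ok;

    if (expected_errno == 0)
        ok = (ret == 0);
    else
        ok = (ret == -1 && actual_errno == expected_errno);
    printf("%s: %s (ret=%ld, errno=%d)\n",
           label, ok ? "PASS" : "FAIL", ret, actual_errno);
    return ok;
}

/* Runs one valid call and one call with an out-of-range pointer.
 * Returns the number of cases that behaved as expected (2 if both did). */
int run_getcurrenttime_tests(void)
{
    struct timespec ts;
    long ret;
    int passed = 0;

    errno = 0;
    ret = getcurrenttime(0, &ts);
    passed += check_result("valid pointer", ret, errno, 0);

    /* Largest 48-bit address: should fall outside the user address range. */
    errno = 0;
    ret = getcurrenttime(0, (struct timespec *)0xFFFFFFFFFFFFUL);
    passed += check_result("out-of-range pointer", ret, errno, EINVAL);

    return passed;
}
```

Capturing the return value before reading errno keeps the two values paired with the same call. Further cases (a NULL pointer, a non-zero pflag) can be added in the same pattern.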

3. Checkpoint

For the checkpoint, you should be able to demonstrate that:

  • You have added a system call to the Linux kernel (e.g., one that prints with printk).

  • You can invoke your system call using a small userspace test program.

  • You have made non-trivial progress toward implementing one of the two required system calls outlined above. Your job is to convince me that you’ve worked on it. I don’t have any particular milestone that I’m looking for.

4. Defining System Calls

To add a new system call, you’ll need to define it in a few places:

4.1. Add a new system call number to the kernel.

The Linux kernel associates a unique integer with each system call, so you’ll need to assign a new value for each of your two system calls. For the x86 architecture, the values are stored in a table in the file arch/x86/entry/syscalls/syscall_64.tbl. You should assign to your system calls the numbers outlined in the requirements above. You will only be operating your kernel in 64-bit mode, so the 32-bit/64-bit distinction doesn’t have much impact. You should follow the same naming format for the other columns (e.g., name the system call normally and then again with sys_ prepended to the front of it).

4.2. Add your system call prototype to the syscalls header file.

Next, you’ll need to declare your system call in the file include/linux/syscalls.h. The function names should be prefixed with sys_, and they should return the same type as all the others (asmlinkage long). For pointers coming from userspace in parameters (e.g., the struct timespec in getcurrenttime), you need to add __user to the type declaration. Take a look at the other system calls for an example.

4.3. Add the code for your system call.

Next, you’ll need to implement your system call! You should add a new .c file to the kernel directory (e.g., kernel/stealth.c). In that file, you’ll need to #include kernel headers to get access to helper functions, like printk(). I would suggest always including:

#include <linux/errno.h>        // For error constants.
#include <linux/kernel.h>       // For printk().
#include <linux/syscalls.h>     // For syscall macros.

Each of your system calls may also need other headers that are specific to their purpose (e.g., getcurrenttime will need linux/uaccess.h for the access_ok and copy_to_user functions).

With your headers in place, you can define your system call using the SYSCALL_DEFINEX macro, where X is the number of parameters your system call takes. For example, suppose you were adding a system call named testcall that takes two arguments, an integer and a pointer to a userspace string. Your definition might look something like:

/*** NOTE: This is pseudocode, DO NOT copy this and expect it to do anything
 *** useful as-is. */

SYSCALL_DEFINE2(testcall, int, int_param, char __user *, string_param) {
  /* Body of system call implementation goes here.*/

  /* If an error occurs, return the negated form of an error constant
   * that starts with E (e.g., -EINVAL or -ESRCH). */
  if (error) {
    return -EINVAL;
  }

  /* On success, return zero. */
  return 0;
}

4.4. Building and linking your system call.

Finally, after implementing the system call, you need to make sure it gets compiled and used. Since you added a .c file, you’ll need to tell the kernel’s build system to build and incorporate the corresponding .o file. The easiest way is to add file.o to kernel/Makefile in the obj-y section at the top. For example:

# Makefile for the linux kernel.
#

obj-y     = fork.o exec_domain.o panic.o \
	    cpu.o exit.o softirq.o resource.o \
	    sysctl.o sysctl_binary.o capability.o ptrace.o user.o \
	    signal.o sys.o umh.o workqueue.o pid.o task_work.o \
	    extable.o params.o \
	    kthread.o sys_ni.o nsproxy.o \
	    notifier.o ksysfs.o cred.o reboot.o \
	    async.o range.o smpboot.o ucount.o stealth.o
# Note the 'stealth.o' added to the line above.
# That addition will compile stealth.c and link in the resulting .o.

From here, you should build and install your kernel. If you’re building it for the first time, use the "from scratch" instructions to give yourself an installable .deb package. Otherwise, the build process will likely be much faster with the "incremental" instructions.

5. Tips & FAQ

  • When making changes to existing Linux source files, you may want to make a backup copy of the original file first. That way, if your changes break the build, you can easily revert to the original. I would also strongly suggest that you keep track of which files you’ve added/modified as you add/modify them. Having a map of where you’ve been and what you’ve changed will make it easier for you to navigate the Linux source tree, especially as you’re first learning it. You’re also going to need to submit a list of added/modified files anyway, so you might as well update it as you go.

  • If you need to execute commands listed on a web page inside the VM, you can open a browser in the VM and copy from that. I would suggest writing some small scripts to help with copying kernels from your host machine and installing them.

  • The Linux task_struct has a field named "comm" that contains a string with the process’s program name. It may be helpful to print that value using printk() when you’re debugging your stealth call to make sure you’re finding the correct process.

  • You can see the kernel’s printk() output by executing the dmesg command.

  • If your attempt to test a system call tells you "Function not implemented", it means the OS doesn’t know of any system call that matches the number you provided. Either you haven’t linked your system call’s implementation into the kernel, or you haven’t updated the kernel running on your VM to one that contains the new call.

6. Submitting

Please remove any excessive debugging output prior to submitting.

To submit your code, commit your changes locally using git add and git commit. Then run git push while in your lab directory.

Please ONLY submit any Linux source files that you have modified or added along with a README.md file containing the paths of those files within the source tree. DO NOT submit the entire Linux kernel source tree. Please add any userspace testing code to the "userspace" directory.

For example, suppose you:

  • modify include/linux/sched.h

  • add a new file kernel/newfile.c

  • write a userspace test program named test-feature.c

You should submit test-feature.c in the provided userspace directory. You should submit sched.h and newfile.c, along with a README.md, in the root of the repository. The README.md should contain the path of each file (relative to your kernel’s base directory), e.g.:

sched.h:   include/linux/sched.h
newfile.c: kernel/newfile.c

If you don’t give me all of the files you modified, your submission will not build, and I will not be able to grade it. Please make sure you get all of them. I would strongly suggest recording which files you’ve changed immediately as you modify them.