CS 45 — Lab 3: Building a Synchronization Primitive

Checkpoint: Friday, February 28 (demo during lab).

Due: Thursday, March 26 @ 11:59 PM

1. Overview

For this lab, you’ll be adding a new synchronization construct to the Linux kernel: an event. An event is a primitive that any number of processes can open independently. After opening an event, a process may wait or signal on the event. If a process calls wait, it will become blocked until the event is signaled by some other process. If any process calls signal, all waiting process should be immediately unblocked. Calling signal with zero processes waiting does nothing. When a process is done using an event, it should close it, and if all processes close the event, the event gets destroyed.

To implement the event construct, you’ll add a suite of five system calls to the Linux kernel. Unlike the previous lab, where you added two system calls that had no relationship to one another, you’ll implement all of these calls in just one new file (kernel/evnt.c). The system call interface should look like:

evnt_open(int create, int id): open an event, optionally creating a new one.
evnt_close(int id): close an event, destroying it if no processes have it open any longer.
evnt_wait(int id): block the calling process until the specified event is signaled.
evnt_signal(int id): wake up all process blocked on the specified event, if there are any.
evnt_print(void): print the status of the system’s events using printk for debugging.

Each of these calls is specified in more detail below in the requirements.

When starting this lab (or any future Linux kernel lab), you should begin with a fresh copy of the Linux kernel rather than continuing with the previous lab’s kernel directory. That way, you can be sure that nothing you added in a prior lab interferes with this lab’s implementation. The first time you build your fresh kernel, give it a new, unique name and build and install the .deb files. After that, you can use the faster incremental build method.

1.1. Goals

Design a synchronization primitive that meets an interface specification.
Implement your design in the kernel and export it to userspace via system calls.
Employ semaphores to enforce mutual exclusion in kernel code.
Interact with Linux’s mechanisms for changing a process’s state (blocking and unblocking).
Practice reading and writing Linux kernel code.

1.2. Handy References

1.3. Lab Recordings

Week 1

Week 2

2. Requirements

Aside from the usual mechanics of creating a system call (registering the numbers in the syscall table, declaring the syscall prototypes, and adding an entry in a Makefile), all of the kernel code for your event implementation should go in one file: kernel/evnt.c.

2.1. Tracking Event Usage

To implement your event primitive, you need to maintain some data structures in the kernel to track which processes have an event open and which processes are blocked on an event at any given time. To keep such structures from growing unbounded, the Linux kernel typically uses a fixed-size data structure (e.g., the file descriptor table has a maximum size). In a similar fashion, you should implement an event table data structure as a static global variable of a fixed capacity. That is, define a struct that has all the properties you need to track for one event (one row in the table) and then allocate a global, fixed-size array of those structs to represent your table. To make testing (and grading) easier, set your array to be a #defined constant size of 5.

Your table needs to track which events (numbered with indices 0-4) are currently in use. For those that are, it should track which processes have them open and which processes are waiting on them. For tracking which processes have an event open, you should use Linux’s standard linked list implementation to store a list of the processes' PIDs, and for tracking waiting processes, you should use Linux’s wait queue implementation. Both mechanisms are described in more detail below. You’re welcome to maintain extra state variables in each row or other properties about the table too, if that helps (e.g., the number of event rows currently in use).

Because multiple process might be concurrently attempting to make event-related calls, code that accesses your table or other related state variables must be considered a critical section. For this lab, you should use a semaphore to provide mutual exclusion to your table and other state from race conditions. To be clear, the semaphore should only be used to protect your kernel state structures to ensure that only one process may interact with it at a time. For blocking processes that call evnt_wait, use Linux’s wait queue. The interfaces for both are described below.

When the kernel boots, it needs to initialize all of its data structures, so you’ll need to tell it how to initialize your table and state too. For this task, you should implement an init function and then register it to be invoked on start-up, using the pure_initcall macro:

/* NOTE: this is NOT a system call it is a function called by
         the kernel during its boot sequence to initialize the
         state you need to maintain for your event implementation */
static int  __init evnt_init(void) {
    // code to initialize your event table and any other state you intend to keep
}

/* To register your evnt_init function to be automatically called during the boot sequence: */
pure_initcall(evnt_init);

2.1.1. Dynamic Memory

While the global state table is static and has a constant number of event struct entires (rows), you’ll need to dynamically allocate memory to keep track of processes as they open the event. Over time, the list of processes that have the event open will change, and we don’t know how large it will be in advance.

When a process opens an event, you need to dynamically allocate a struct to store the information about the new process that just opened it (e.g, it’s PID) and then link that newly-allocated struct into the event struct’s linked list that tracks who has the event open.

To request and release dynamic memory in the kernel, you can use kmalloc() and kfree(). They work similarly to the userspace versions that you’re used to, except that kmalloc takes a second flags parameter to specify where the memory should come from. You should pass the constant GFP_KERNEL as the flags argument. Memory leaks in the kernel are very, very bad, so make sure that if you add a call kmalloc you have a corresponding call to kfree!

2.2. System Call Interface and Specification

You’ll define five system calls for your event implementation. Please use the system call numbers and error codes described here so that I can grade more easily.

Most of these system calls share several common error conditions. I would strongly suggest making the checks for common errors (e.g., EINVAL, ENOENT, and EACCES) in helper functions rather than repeating code all over the place.

2.2.1. evnt_open (# 333)

evnt_open(int create, int id): This system call is used by a process to open (and possibly first create) an event.

If the create argument evaluates to true, the id is ignored, and the process is asking to create and open a new event (initialize and use a new row in the state table). When creating a new event, you should always assign it the lowest-numbered ID that is available (e.g., don’t use table row 3 if row 0 is currently available). After the event gets created, the process that created it should be added to the list of those that have it open.

If the create flag is false, the process is asking to open an existing event at index id in the table. A process should only be allowed to open the same event once, but it can open multiple events that have different IDs.

For this and all the other evnt_ functions, you will have a much better time if you directly edit the event table rather than trying to edit a local copy. For example, if you are trying to initialize the event at row i, you should access it’s fields directly like this:

evnt_table[i].opener_count = 0;
// initialize other fields of event_table[i]
...

Based on previous semesters, I know that many of you will be tempted to do something like this (THIS DOES NOT WORK THE WAY YOU WANT):

evnt e;

e.opener_count = 0;
// initialize other fields of e
...

evnt_table[i] = e;  // copy the struct fields

This breaks because any linked lists you initialize will point to e, which lives on the stack. After copying the struct into position i, the linked list pointers still point to the stack, and that stack frame will disappear when the function returns.

By always directly modifying the table directly, you won’t need to worry about errant pointers to the stack!

Return Value & Errors

On success, evnt_open should return the ID of the event that was opened. If you’re asked to create a new event, the return value allows the calling process to learn of the resulting event ID (much like opening a new file descriptor). When opening an existing event successfully, the return value should match the provided id value to confirm to the process that it successfully opened that ID.

evnt_open should return error codes for the following conditions:

-EINVAL: the process asked to open an invalid id number (outside the range [0-4]). (invalid argument)
-ENOENT: the process asked to open an event that has not been created yet (one that was not returned by an evnt_open call with create set). (no such file or directory)
-EEXIST: the process asked to open an event that it already has open. (file exists)
-EMFILE: the process requested the creation of a new event, but the table is already full. (too many open files)
-ENOMEM: the system call was unable to allocate memory (kmalloc returned NULL). You’re unlikely to see this one happen, but you should always check for it. (out of memory)
-EINTR: the system call was interrupted by a signal. For you, this should only be happening when you attempt to wait on a semaphore to protect your event implementation’s functions, but the semaphore wait is interrupted due to your process receiving a termination signal (e.g., you hit CTRL-C while waiting). This situation will manifest as a call to down_killable failing, so you should check its return value! (interrupted system call)

2.2.2. evnt_close (# 334)

evnt_close(int id): This system call is used by a process to close the event identified by id. The calling process must have the event open to close it. If any processes are waiting on the specified event, calls to close that event should fail (to prevent deadlocks).

Calling evnt_close should only close the event for the process that calls it. Any other processes that still have the event open should be unaffected and might continue using it.

Any event that is closed by all the processes that were using it (e.g., nobody has it open anymore) should be destroyed. Afterward, its row in the table should be eligible for re-use by a subsequent evnt_open with create set.

Return Value & Errors

On success, evnt_close should return 0.

evnt_close should return error codes for the following conditions:

-EINVAL: the process asked to close an invalid id number (outside the range [0-4]). (invalid argument)
-ENOENT: the process asked to close an event that has not been created yet (one that was not returned by an evnt_open call with create set). (no such file or directory)
-EACCES: the process asked to close an event that it doesn’t have open. (permission denied)
-EBUSY: the process asked to close an event that currently has other processes waiting on it. (device or resource busy)
-EINTR: the system call was interrupted by a signal. For you, this should only be happening when you attempt to wait on a semaphore to protect your event implementation’s functions, but the semaphore wait is interrupted due to your process receiving a termination signal (e.g., you hit CTRL-C while waiting). This situation will manifest as a call to down_killable failing, so you should check its return value! (interrupted system call)

2.2.3. evnt_wait (# 335)

evnt_wait(int id): This system call is used by a process to block itself on the specified event until that event gets signaled. The calling process must have the event open to wait on it. Your implementation should use the functions for blocking and unblocking processes described below in "relevant Linux kernel interfaces."

It is NOT your responsibility to prevent users from doing stupid things when waiting on events. In other words, if two processes open the same event and then both wait on it, it’s the user’s fault that they are deadlocked (assuming a third process doesn’t come along), not your implementation’s fault.

Return Value & Errors

On success, evnt_wait should return 0.

evnt_wait should return error codes for the following conditions:

-EINVAL: the process asked to wait on an invalid id number (outside the range [0-4]). (invalid argument)
-ENOENT: the process asked to wait on an event that has not been created yet (one that was not returned by an evnt_open call with create set). (no such file or directory)
-EACCES: the process asked to wait on an event that it doesn’t have open. (permission denied)
-EINTR: the system call was interrupted by a signal. For you, this should only be happening when you attempt to wait on a semaphore to protect your event implementation’s functions, but the semaphore wait is interrupted due to your process receiving a termination signal (e.g., you hit CTRL-C while waiting). This situation will manifest as a call to down_killable failing, so you should check its return value! (interrupted system call)

2.2.4. evnt_signal (# 336)

evnt_signal(int id): This system call is used by a process to unblock all processes that are waiting on the specified event. If no processes are waiting, the call should succeed, but the call will have no effect. The calling process must have the event open to signal it.

Return Value & Errors

On success, evnt_signal should return 0.

evnt_signal should return error codes for the following conditions:

-EINVAL: the process asked to signal an invalid id number (outside the range [0-4]). (invalid argument)
-ENOENT: the process asked to signal an event that has not been created yet (one that was not returned by an evnt_open call with create set). (no such file or directory)
-EACCES: the process asked to signal an event that it doesn’t have open. (permission denied)
-EINTR: the system call was interrupted by a signal. For you, this should only be happening when you attempt to wait on a semaphore to protect your event implementation’s functions, but the semaphore wait is interrupted due to your process receiving a termination signal (e.g., you hit CTRL-C while waiting). This situation will manifest as a call to down_killable failing, so you should check its return value! (interrupted system call)

2.2.5. evnt_print (# 337)

evnt_print(void): This system call should trigger the kernel to print debugging information about the state of your events using printk(). For open events, the output should include the PID of every process that has the event open and the PID of every process that is currently waiting on the event.

Return Value & Errors

On success, evnt_print should return 0.

evnt_print should return error codes for the following conditions:

-EINTR: the system call was interrupted by a signal. For you, this should only be happening when you attempt to wait on a semaphore to protect your event implementation’s functions, but the semaphore wait is interrupted due to your process receiving a termination signal (e.g., you hit CTRL-C while waiting). This situation will manifest as a call to down_killable failing, so you should check its return value! (interrupted system call)

2.3. Userspace Test Application

Like the previous lab, you’ll need to write a small application that can invoke your system calls and test their behavior. I would strongly suggest that your test program be interactive (i.e., like your shell, it accepts commands as input and then invokes the corresponding system call). That way, you don’t have to worry about orchestrating multiple processes and their timing when you test. You can simply fire up multiple copies of your interactive tester and purposefully cause any combination of event system calls to be made in the order you want.

2.4. File Submission

You must submit ALL the files that you modified or added to the Linux kernel sources. Please make sure that all necessary files are present and that they compile without errors.

3. Checkpoint

For the checkpoint, you should be able to demonstrate that:

The kernel initializes your event table data structure and state variables at start up.
User processes can create and open events with evnt_open.
Your kernel can print the event table with evnt_print and for each event, list the processes that have the event open. You don’t need to print the list of waiting processes yet.
You’ve written a small userspace application to help you test the other checkpoint requirements.

4. Relevant Linux Kernel Interfaces

This lab will require you to interact with a few of Linux’s mechanisms for storing/retrieving information and interacting with process states.

4.1. Getting the Calling Process

In several cases, you’ll need to get information about the process that is calling one of your system calls. Since that’s a common operation, there’s a helpful function, get_current, that will return a pointer to the task_struct of the current process:

struct task_struct *this_process = get_current();

4.2. Semaphores

For this lab, your event implementation should use exactly one semaphore to protect its event table and associated global state from concurrent access races. The semaphore interface is defined in include/linux/semaphore.h. The main items of interest for this lab are:

To declare a semaphore variable, you should use the DEFINE_SEMAPHORE(name) macro, where name is the name you’d like to give the new semaphore variable. It looks a bit weird to declare a variable this way (nobody says DEFINE_INT(i)), but this allows the kernel developers to change semaphore types without breaking all the code that uses them.
You can initialize the semaphore with a call to sema_init. Since you’ll be using the semaphore to enforce mutual exclusion, you should give it an initial value of 1. Your sema_init call should be happening once in your init function.
To make a process wait on a semaphore (acquire the lock), the Linux kernel has a few functions with slightly different behaviors, but we’ll only be using one in this lab: down_killable. Assuming you initialized your semaphore’s value to 1, this variant will allow only one process to acquire the semaphore. Any process that attempts to call down_killable while the semaphore is already in use will be blocked in such a way that it cannot be interrupted by anything other than a signal that will terminate the waiting process. This variant is what we want, since we don’t typically want someone who is waiting on your event to be interrupted, but we do still want test programs to terminate if you hit CTRL-C.
To signal a semaphore (release the lock), use the up function.

4.3. Linked Lists

Each entry in your event table will need to keep track of a variable number of processes that have it open. Rather than implementing your own linked lists, you should use the Linux kernel’s built-in doubly linked list implementation, which is defined in include/linux/list.h.

The kernel’s linked list interface is a little different from the lists you’re probably used to seeing, but it really is implementing the same doubly linked list structure you’re familiar with, I promise! Instead of defining a list node as a struct with a data field and pointers to the next/prev structs, the expectation in Linux is that you embed a struct list_head inside of the struct that you’d like to store a list of. That list head keeps track of the links to the struct list_head in the other structs in the list.

For your implementation, each row in your event table should contain a struct list_head to track the processes that have it open. You’ll also need to define a struct, containing just a pid_t and a struct list_head to represent a record of one process having an event open. Then, whenever a process opens an event, you should: kmalloc one of those structs, fill in the current process’s PID, and then add the struct to the list. When closing, you should do the opposite: remove from the list and then kfree.

The functions you’ll probably want to use are: list_add and list_del to add and remove items from a list. You can also initialize a list with the INIT_LIST_HEAD macro, which you’ll need when creating a new event. Finally, the list interface has convenience macros for iterating over the contents of a list: list_for_each and list_entry. To use it, you’ll need to give it a pointer to a struct list_head for tracking the iteration position and a pointer to the type you expect to find in the list. Here’s an example (not a suggestion that you should indiscriminately copy/paste):

/* NOTE: This code is just an example sketch of how you might work with lists,
 * for reference.  I'm NOT suggesting you use this as-is in your submission.
 * DO NOT copy this and expect it to do anything useful. */

/* Suppose this is the data type you're keeping a list of. */
struct element {
    int data1;
    int data2;
    struct list_head element_list;  // This tracks the links to other elements.
}

struct list_head *iter;
struct element *e;

list_for_each(iter, ...) {
    e = list_entry(iter, struct element, element_list);
    /* e now refers to one element in the list. */
}

/* The missing ... parameter should be the address of the `struct list_head` that starts off your list.
 * In your case, that will be the `struct list_head` stored in an event table row. */

For more information about linked lists in the Linux kernel, try these resources:

https://medium.com/@414apache/kernel-data-structures-linkedlist-b13e4f8de4bf

https://algorithmsandme.com/linked-list-implementation-in-linux-kernel/

https://kernelnewbies.org/FAQ/LinkedLists

4.4. Wait Queues

Wait queues track a list of task_structs that are blocked on a condition. They’re built from Linux linked lists, but the structure is already developed for you, since making processes block is a common kernel responsibility that appears all over the place. The associated functions are defined in include/linux/wait.h.

To declare the start of a wait queue (e.g., in each row of your table), you can declare a variable of type struct wait_queue_head, which can be later initialized with init_waitqueue_head. With the start of a queue declared and initialized, you can create a node containing a task_struct using the DECLARE_WAITQUEUE macro. You can then add the node to a queue / remove it from a queue with add_wait_queue and remove_wait_queue.

The struct wait_queue_head type has two fields: a lock (ignore the lock) and a struct list_head named head. You can iterate over it like you iterate over your other linked lists (e.g., with list_for_each). The elements in the list are of type struct wait_queue_entry and they have a field named private which stores a pointer to a task_struct. It’s stored as a void *, but you can cast it to be a struct task_struct * and then access the pid:

// 'proc' is a struct wait_queue_entry
struct task_struct *t = (struct task_struct *) proc->private;

4.5. Blocking and Unblocking a Process

To block a process on an event, you need to:

Add the process to the wait queue associated with the event.
Set the process’s state (in its task_struct) to one that indicates blocking using set_current_state. Like the semaphore wait functions, there are a few variants. For the same reason described above, you’ll want to set them to the TASK_KILLABLE state.
Call schedule() to cause a context switch.

Later, when the process is resumed, be sure to remove it from the wait queue!

To unblock a process that is waiting on an event, you can use one of the wake_up family of macros defined in include/linux/wait.h. Given the semantics we’ve defined for events, you probably want wake_up_all.

Make sure that processes release the semaphore before blocking themselves. They probably also need to reacquire the semaphore upon waking.

5. Tips

I strongly suggest that you sketch out a logical design of what you want to have happen first, before doing any coding. In other words, map out what should happen in each system call in English text first and then implement it in code later. That way, you decouple the high-level tasks like reasoning about the evnt_ functions behavior and concurrency from the lower-level task of expressing it in kernel speak.

The set of header files I #included in my implementation of evnt.c is:

#include <linux/errno.h>      // error constants
#include <linux/init.h>       // pure_initcall()
#include <linux/kernel.h>     // printk() and linked list functions
#include <linux/sched.h>      // set_current_state() and schedule()
#include <linux/semaphore.h>  // semaphore functions
#include <linux/slab.h>       // kmalloc() and kfree()
#include <linux/string.h>     // memset()
#include <linux/syscalls.h>   // SYSCALL_DEFINE macros
#include <linux/wait.h>       // wait queue functions

#include <asm/current.h>      // get_current()

If you place a process in a wait queue (e.g., in evnt_wait) and block it, it will eventually wake up later when a process signals it. When it wakes up, it will resume from where it left off in the wait function. You can take advantage of this observation by having processes remove themselves from the wait queue after waking up instead of trying to find and remove all the processes in the queue within evnt_signal.
Make sure you don’t block a waiting process while it’s holding an important resource (e.g., the shared semaphore)!
Rather than typing dmesg repeatedly to see your debugging output from evnt_print, you can "follow" the log output in real time by executing: tail -f /var/log/syslog. Just don’t forget to put newlines (\n) at the end of your printk() calls!
Write helper functions for tasks that show up repeatedly (e.g., checking for common error conditions in the event system calls).

6. Submitting

Please remove any excessive debugging output prior to submitting.

To submit your code, commit your changes locally using git add and git commit. Then run git push while in your lab directory.

Please ONLY submit any Linux source files that you have modified or added along with a README.md file containing the paths of those files within the source tree. DO NOT submit the entire Linux kernel source tree. Please add any userspace testing code to the "userspace" directory.

For example, suppose you:

modify include/linux/sched.h
add a new file kernel/newfile.c
write a userspace test program named test-feature.c

You should submit test-feature.c in the provided userspace directory. You should submit sched.h and newfile.c, along with a README.md, in the root of the repository. The README.md should contain the path of each file (relative to your kernel’s base directory), e.g.:

sched.h:   include/linux/sched.h
newfile.c: kernel/newfile.c

If you don’t give me all of the files you modified, your submission will not build, and I will not be able to grade it. Please make sure you get all of them. I would strongly suggest recording which files you’ve changed immediately as you modify them.