Compilers

Lorikeet

Due on Tuesday, April 15th at 11:59 PM. This is a compiler lab. If you have a partner, the two of you will complete this lab as a team. If you do not have a partner, you are effectively in a team by yourself.

If you are working with a partner, please be familiar with the Partner Etiquette guidelines. You and your partner share a single repository and, barring unusual circumstances, will receive the same grade. If you experience any difficulties in your partnership, please alert your instructor as soon as possible.

If you are working alone, please don’t hesitate to seek help if you get stuck. Since there are no ninjas for this course and you don’t have a partner, your instructor is the only interactive resource available to you and is happy to help you through any problems you might encounter.

In either case, be familiar with the academic integrity policy! You are permitted to discuss high-level details of your compiler project with other students, but you should never see or share anything that will be turned in for a grade. If you need help, please post on the course forum or, if necessary, contact the instructor directly. Make sure to post privately if you are sharing code. If you have any doubts about what is okay and what is not, it’s much safer to ask than to risk violating the Academic Integrity Policy.

Overview

In this lab, we will extend your compiler to a language called Lorikeet. You may start with your Falcon compiler or any compiler following it. Lorikeet introduces multi-process parallelism, allowing multiple computations to run simultaneously. This is a form of parallelism similar to that offered by the Python multiprocessing package (which also uses processes for parallelism) and different from that of the C pthreads library (which uses threads for parallelism).

Migrating to Lorikeet

As usual, you’ll run some Git commands to prepare for Lorikeet:

git fetch
git switch lorikeet
git merge ...

Here, you should git merge whichever branch you are starting from. This may be falcon, finch, gull, etc. The only new content in Lorikeet is the addition of four new expression forms to the parser and AST. There are no new resources files or other changes.

After merging, you should be able to make clean, make tests, and run ./tests. You’ll get some warnings about the new AST constructors as we’ve had in previous labs, but all of your previous tests should compile.

The Lorikeet Language

Lorikeet extends our bird languages with four new expression forms to support parallel execution. We will first present the syntax and then discuss the semantics.

Syntax

The new syntactic forms of Lorikeet are as follows:

<expression> ::=
  ...
  | sleep(<expression>)
  | parallel(<expression>)
  | send(<expression>, <expression>)
  | receive(<expression>)

All other syntax is the same as the base language you chose when you merged. As usual, you can see the abstract syntax extensions for Lorikeet in src/language/asts.ml. There are new constructors corresponding to each of the syntax forms shown above.

Semantics

sleep

The first new form of expression in Lorikeet is sleep(...), which takes as its only argument an integer indicating how many seconds to sleep. If a non-integer is provided to this expression, the Lorikeet program will exit using error code 1 (as in all other cases where an integer is expected but not received). This sleep operation simply pauses execution of the program for that amount of time and always evaluates to zero. We will discuss how to sleep using a system call below.

parallel

The next new form of expression is parallel(...), which accepts as its argument the closure of a function to call in parallel with the current process. That function will run independently of the current process; the value it returns is ignored. For instance, consider the following Lorikeet code:

def f n =
  print(4)
end

let child = parallel(f) in
print(5)

This program runs the f function and the body of the main expression’s let at the same time in different processes. Since neither process is waiting for the other, this program may print 4 first and then 5 or it may print 5 first and then 4. (In practice, it will usually do the second of these things as the process for the main program is already running.)

Launching a parallel subprocess is the side-effect of the parallel built-in, but we haven’t discussed the value to which it evaluates. (The print built-in, for instance, evaluates to the value that it printed.) The parallel built-in evaluates to a new kind of bird value, a channel, that facilitates communication between these processes as we discuss below.

send and receive

While the parallel feature allows us to run multiple computations simultaneously, racing to print our results is unsatisfying: we can’t guarantee an order in which to print values and we can’t coordinate between our different processes. The send and receive built-in operations allow different processes in a Lorikeet program to communicate.

When a child process is executed, it receives a bird channel as its sole argument (e.g. into the n parameter of f in the earlier example). The parallel expression also evaluates to a channel. These two channel values are connected: information sent to one of these channels will be received by the other. Channels are bidirectional: any channel can send or receive data.

We send information using send(channel, data), where channel is a bird channel value and data is the data to send. The send built-in evaluates to the data value which was sent (similar to how print evaluates to the value which was printed). On the other side, that information can be received by using receive(channel), which evaluates to the next datum which was sent to the other side of the channel.

For example, consider the following program:

def double n parent =
  send(parent, n*2)
end

let child1 = parallel(double 4) in
let child2 = parallel(double 5) in
let _ = print(receive(child2)) in
let _ = print(receive(child1)) in
print(2)

This program will always print

10
8
2

Let’s break down why.

  1. First: the main expression runs double 4 and double 5 in parallel as child processes. Here, double is a two-parameter function, but double 4 is a closure which is waiting for a single argument (a channel). Those two parallel child processes run simultaneously with the main expression.

  2. Each child process sends to its parent channel a value which is twice its n parameter value. The send calls do not block: the value is stored by the channel and the double call continues. The two child processes then terminate.

  3. The main expression reads the value that was sent to it by the second child process and prints it. Since the second child process was running double 5, it sent a 10 to its channel, so that’s what is printed.

  4. The main expression then reads and prints the value that was sent to it by the first child process, which is an 8 by the same logic above.

  5. The main process finally prints a 2 and then terminates.

Note that the narrative above may vary slightly: the first child process might terminate after the main process reads the 10. In general, parallel processes may execute their steps in any order except in cases where they are made to wait for each other (such as via send and receive in Lorikeet). But regardless of these small variations, the Lorikeet program above will always print the same results.

Note that the introduction of channels and multiprocess communication introduces a new kind of bug that our bird programs did not have before: it is now possible for a process to wait forever. For instance, consider the following buggy Lorikeet program:

def subproc parent =
  0
end

let child = parallel(subproc) in
receive(child)

The program above will launch a child subprocess and then attempt to receive a value from it, but the child subprocess terminates and never sends anything. As a result, the parent process will wait forever for a value to arrive. It is the Lorikeet programmer’s responsibility to write programs that communicate productively; you are not required to do anything about these cases (although you’ll want to avoid writing them into your unit tests!).

For ease of implementation, we will prohibit sending any values other than integers and booleans through a channel. We could imagine sending more complex values (tuples, closures, etc.) through channels, but doing so would require considerably more effort.

Runtime Errors

Lorikeet has added behavior that can lead to new kinds of errors. We’ll extend our runtime errors as follows:

Other errors are handled as in previous labs. For instance, the expression parallel(0) would produce error code 5 because the parallel built-in expects to receive a closure but 0 is not a closure. Child subprocesses, if they execute successfully, should exit with code 0 once they are finished running their code.

The Lorikeet Runtime

In order to support these new constructs, we will need to add channels to the Lorikeet language. We will also need to make use of several system-level features through system calls.

System Calls

As described in lecture, Linux system calls on the x86-64 are made with the syscall instruction. You’ll need to add that instruction to your instruction data type. The syscall instruction takes no arguments directly; instead, it relies upon the values of registers. System calls are similar to normal function calls, but they are hardware supported, run operating system code, and follow a different calling convention. General instructions for using system calls can be found by reading the appropriate manual page (man syscall) but, for the purposes of your lab, we present a targeted (and somewhat simplified) description here.

The syscall instruction allows access to a variety of OS-provided operations: setting timers, accessing files, launching or terminating processes, and so on. When the syscall instruction is invoked, the value stored in the rax register determines which system call will be performed. The mapping from system call ID to its behavior is arbitrary: different operating systems provide different system calls and use entirely different ways of identifying them, so a compiler needs to know the operating system for which it is generating code. Our lab computers are running Linux on the x86-64 architecture, so, for our purposes, the table of system calls appears in the Linux kernel source code at the path arch/x86/entry/syscalls/syscall_64.tbl. This section describes the interfaces of the system calls relevant to the Lorikeet lab.

On x86-64 Linux, the system calls we will be using accept arguments in order in the registers rdi, rsi, and rdx. Return values will appear in the register rax once the system call is complete. The registers rax, rcx, and r11 are caller-saved (volatile); all other registers are callee-saved (non-volatile). Several system calls will require pointers to memory; we discuss strategy for allocating that memory below. We will be using the following system calls:

| System call | Summary | syscall ID | First argument | Second argument | Third argument | Return value |
|---|---|---|---|---|---|---|
| nanosleep | Pause program execution | 35 | Pointer to a time structure | 0 (for NULL) | (none) | (unused) |
| fork | Create a child subprocess | 57 | (none) | (none) | (none) | Child PID (for the parent process) or 0 (for the child process) |
| exit | Terminate the process | 60 | Process exit code | (none) | (none) | (none) |
| pipe | Create a pipe | 22 | Pointer to result memory | (none) | (none) | 0 on success; non-zero on error |
| read | Read from a file descriptor | 0 | File descriptor | Pointer to buffer | Number of bytes to read | Number of bytes successfully read |
| write | Write to a file descriptor | 1 | File descriptor | Pointer to buffer | Number of bytes to write | Number of bytes successfully written |

We will describe the behavior of these system calls in more detail below.

nanosleep

This system call pauses a process for a period of time. This allows the operating system to schedule other programs to run during that time. After the prescribed time, the operating system will allow the process to run again.

The nanosleep call takes two arguments, both of which are pointers to structures describing an amount of time. For our purposes, we will always set the second argument, a pointer to a structure describing time remaining in case the sleep is interrupted, to 0 (for NULL). We will set the first argument to a pointer to a location where we have reserved 16 bytes of memory. The first 8-byte value is the number of seconds to sleep. The second 8-byte value is the number of additional nanoseconds to sleep; since we won’t be using that level of precision, your Lorikeet programs will set that second 8-byte value to 0.

nanosleep ordinarily returns 0 for success or -1 if it is interrupted or some other error occurs. We will ignore interruptions to sleeping in Lorikeet, so we can ignore the return value of the nanosleep system call.

fork

The fork system call creates a copy of your process. The new process inherits a copy of your process’s memory (both heap and stack), register values, and all other state. Both the original process and the new process continue running after the point at which the fork system call returns.

The fork system call returns (in rax) different values to the parent and child processes. This is generally the only way that a given process can know whether it is the parent or the child. The child process is returned a 0. The parent process is returned the non-zero process identification number (PID) of the child. We won’t use this PID in the Lorikeet runtime, but the fact that it is non-zero tells the process that it is the parent.

fork takes no arguments.

exit

The exit system call terminates your process. The sole argument to this system call is the exit code that the process should produce. The C function exit that our stopWithError function uses is just a thin wrapper around this system call.

pipe

The pipe system call creates a pipe with an OS-backed storage buffer. We can think of the pipe almost like a queue data structure: information sent into the pipe comes back out of the pipe in the same order. We send information into the pipe and take it back out of the pipe using the read and write system calls described below.

The pipe system call expects as its only argument a pointer to eight bytes of memory. The system call will create the pipe and return 0 on success. If anything goes wrong, the system call will return a non-zero value. For simplicity of implementation, we will ignore this case.

On the x86-64, the highest 32 bits of the memory to which the argument points are set to the file descriptor of the write end of the pipe. The lowest 32 bits of this memory is set to the file descriptor of the read end of the pipe.

read

The read system call reads data from a file descriptor. The arguments to this call are the file descriptor, a pointer to the memory where data will be stored, and the number of bytes of memory to read. The return value is the number of bytes which were successfully read. This will normally equal the number of bytes requested, but may be a smaller number when the file descriptor refers to e.g. a file which does not have enough bytes left to satisfy the request. Because we are using this system call to read from in-memory pipes, we will generally assume that each eight byte read we perform is fully satisfied.

The read system call will wait until at least one byte of information is available before proceeding. If no bytes are available to read, the system call will simply wait.

write

The write system call writes data to a file descriptor. The arguments to this call are similar to read: the file descriptor, a pointer to the memory containing the data to write, and the number of bytes of memory to be written. Also similar to read, this system call returns the number of bytes successfully written. While write might return a number less than the requested number of bytes in some cases, it will wait until it can write at least one byte. Since all of our reads and writes will be eight bytes and because the size of the operating system’s pipe buffer is a multiple of eight bytes, we don’t need to worry about partial writes.

System Calls, Memory, and Pointers

Several of the system calls described above require that you provide a pointer to memory containing a buffer, value, or data structure. The nanosleep call, for instance, expects a pointer to two 64-bit machine words. There are a couple choices you have when interacting with a system call that requires memory and you can even mix and match these approaches in your compiler if you prefer.

  1. You could set aside some memory in your data section to use when interacting with system calls. This would be similar to how we set aside memory to store the zero-closures for functions in Falcon or how we set aside a single 64-bit value to store the heap cursor in Eagle. You can use whatever labels you want to refer to this memory as it will only be used by the parallelism features of your language runtime.

  2. You could reserve some stack memory by decreasing rsp and using that newly-reserved space for your data. As long as you move rsp back after the system call is finished, this won’t cause any problems.

Note that, for system calls that manipulate data in memory, there is no way to convince them to interact with registers. You will have to set aside memory somewhere for those system calls to interact with.

Channels

The parallel built-in in Lorikeet produces a new kind of value, the channel, which represents a way of communicating with the created process. The created process also receives a channel as an argument to the function it runs. To represent these channels, we will introduce a new binary representation for them. A channel is stored in the form 0xRRRRRRRWWWWWWW03, where 0xRRRRRRR is the file descriptor for the read side of the channel and 0xWWWWWWW is the file descriptor for the write side of the channel. Note that this assumes that file descriptors never exceed 268435455 (that is, \(2^{28}-1\), the largest value that fits in each 28-bit field). This assumption is acceptable on the CS network lab computers because, like most similar systems, they permit at most 1,048,576 simultaneous file descriptors per process and use the lowest available number when allocating a new descriptor.

Note that these channels are primitive bird values similar to pointers, integers, and booleans. Make sure that you have considered the impact that channels may have on your typechecking logic: channels end in the bits 011, for instance, so your dynamic type checks may need to be updated to account for this new tag.

In order to create a channel, you will need to invoke the pipe system call twice. One pipe will be used to send messages from the parent process to the child process. Another pipe will be used to send messages from the child process to the parent process. A channel is one process’s view of both of these pipes. For instance, imagine that the first pipe call produces the value 0x0000000700000006 and the second pipe call produces the value 0x0000000900000008. This indicates that the first pipe has 6 as its read descriptor and 7 as its write descriptor, while the second pipe has 8 as its read descriptor and 9 as its write descriptor. We could give the parent process the machine value 0x0000006000000903 as its channel. This would tell the parent to write to descriptor 9 (the second pipe) when sending messages to the child and to read from descriptor 6 (the first pipe) when receiving messages. In that case, we would give the child the machine value 0x0000008000000703 as its channel, so that it will write messages to the first pipe and read messages from the second pipe.

In order for this to work, both processes must be able to see both pipes. This means that you must create the pipes before you invoke the fork system call. You can then create the appropriate channel based upon whether you are running in the parent or child subprocess (as determined by the return value from fork).

Implementing the Lorikeet Compiler

The biggest challenge in implementing the Lorikeet compiler lies in interacting successfully with the underlying operating system. One way to simplify this process is to add features as incrementally as possible. You can begin with sleep(...), as its behavior is relatively innocuous. You can then move on to parallel(...) and establish that you’ve implemented it correctly by using racing programs like those shown above to illustrate that multiple processes are running. Finally, you can implement send(...) and receive(...) and test that processes can communicate with each other. In detail:

  1. Begin with compiling sleep expressions. This should be as simple as setting up the time structure parameter for sleep and then making the appropriate system call. Make sure to test that the sleep behavior actually works the way you would expect. If you want to validate that your program is sleeping, you can use the time shell command as follows:

    $ cat test_code/sleepy.bird       # show program source
    sleep(1)
    $ ./hatch test_code/sleepy.bird   # compile the program
    $ time output/sleepy.run          # run the program, timing its execution
    0
       
    real    0m1.005s
    user    0m0.003s
    sys     0m0.001s
    

    time, similar to valgrind, takes as its arguments the program invocation you actually want to run. It monitors the resource usage of the program and prints information about how long it took to run. The “real” category is how much actual time the program was running. The “user” category is how much time the program spent executing its own instructions, while “sys” is how much time the operating system spent executing system calls for the program. If you have slept successfully, then your program should take at least as long as your sleep took and there should be an appropriate gap between the real and user time of the process.

  2. Next, implement the behavior of parallel but without channels. Everywhere a channel would normally appear, use a stand-in value instead (such as true or 0). This way, you don’t need to worry about the pipe, read, or write syscalls while you are debugging your use of fork. Note that, once you invoke the fork syscall, both the parent and the child will run the same code afterward. One of the first steps you’ll take will be to jump to a different label based upon whether the parent or child is currently running.

    As you implement parallel, implement the appropriate type checks (e.g. the argument should be a closure) and calls. You already have logic for function application. You should not copy this code. Instead, move it to a helper function that generates function application instructions and then call it from both the EAppl and EParallel cases. This will make your work a lot easier in the future if you have to modify the EAppl case again. (If you have implemented Hoopoe, you will want to ensure that the child process’s invocation of its closure does not attempt to perform a tail call.)

    Once you have implemented parallel, you can determine that you are correctly launching child subprocesses by running the following program multiple times and verifying that it occasionally produces different values. This program launches four different children, each of which prints four tuples.

    def printloop n k parent =
      if k = 0 then 0 else
      let junk = print((n,k)) in
      let _ = sleep(1) in
      printloop n (k-1) parent
    end
       
    def runloop i =
      if i = 0 then 0 else
      let junk = parallel(printloop i 4) in
      runloop (i-1)
    end
       
    let junk = runloop 4 in
    sleep(4)
    

    Each tuple contains a unique number identifying its child process (in the first position) and a number indicating its iteration (in the second position). So on some runs you might see

    (4, 4)
    (3, 4)
    (2, 4)
    (1, 4)
    (4, 3)
    (3, 3)
    (2, 3)
    (1, 3)
    (4, 2)
    (3, 2)
    (2, 2)
    (1, 2)
    (4, 1)
    (3, 1)
    (2, 1)
    (1, 1)
    0
    

    but other executions might transpose some of the tuples, such as

    (4, 4)
    (3, 4)
    (2, 4)
    (1, 4)
    (4, 3)
    (3, 3)
    (2, 3)
    (1, 3)
    (3, 2)
    (4, 2)
    (2, 2)
    (1, 2)
    (3, 1)
    (4, 1)
    (1, 1)
    (2, 1)
    0
    

    The variation is due to the different execution orders of the child subprocesses, a property which is amplified by the use of sleep.

  3. Last but not least, implement the send and receive built-ins. This starts with returning to your parallel code, adding fork system calls, and creating the channels that you will use in the parent and child subprocesses. Once you have done so, you can write logic for the send and receive operations (including type checks). Remember to reject sending values other than integers or booleans.

    If you want to make your Lorikeet compiler particularly robust, you can also perform system calls to close the unused file descriptors in each process after you fork. We create the pipes before forking because we need part of each pipe in both the parent and the child to create an appropriate channel. The operating system doesn’t know that we’re using the pipes in this way, though, and records that both processes have access to them. By using the close system call (syscall ID 3, with one argument: the file descriptor to close) on the file descriptors that aren’t in your process’s channel, you signal to the operating system that you won’t be using those descriptors. You can also close the descriptors used by the child process before it terminates. None of this is mandatory, but it means that buggy programs are more likely to terminate than to sit around and wait forever.

Testing Lorikeet

To verify that your Lorikeet compiler is working correctly, we need to solve a parallelizable problem and compare the performance of the parallel and serial versions. Because Lorikeet only includes basic arithmetic and doesn’t necessarily include tail call optimization or a garbage collector, we must be careful to avoid using too many resources to solve this problem; otherwise, we may run out of heap or stack memory. Here, we construct a toy example using a family of number sequences we will call the “\(k\)-fauxbinacci” numbers.

The Fibonacci sequence is a sequence of integer values defined by the function \(F(n)\). The first two numbers, \(F(0) = 0\) and \(F(1) = 1\), are defined directly. The remaining numbers are defined in terms of the previous two numbers: \(F(n) = F(n-1) + F(n-2)\) for all \(n \geq 2\). The Fibonacci sequence is then 0, 1, 1, 2, 3, 5, 8, and so on.

We define here the “\(k\)-fauxbinacci” numbers. The term “fauxbinacci” is a play on “Fibonacci” (an Italian mathematician’s name) and the French word “faux”, meaning “false” or “fake”. Similar to the Fibonacci numbers, the \(k\)-fauxbinacci numbers are generated by a function \(f_k(n)\). The first three values are fixed: \(f_k(0) = 0\), \(f_k(1) = 1\), and \(f_k(2) = k\). The rest of the \(k\)-fauxbinacci numbers are defined by \(f_k(n) = f_k(n-1) + f_k(n-2)\). This means that the \(1\)-fauxbinacci numbers are exactly the same as the Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, etc. The other fauxbinacci sequences progress similarly, but at different values.

| \(k\)-fauxbinacci | \(n=0\) | \(n=1\) | \(n=2\) | \(n=3\) | \(n=4\) | \(n=5\) | \(n=6\) |
|---|---|---|---|---|---|---|---|
| \(1\)-fauxbinacci | \(0\) | \(1\) | \(1\) | \(2\) | \(3\) | \(5\) | \(8\) |
| \(2\)-fauxbinacci | \(0\) | \(1\) | \(2\) | \(3\) | \(5\) | \(8\) | \(13\) |
| \(3\)-fauxbinacci | \(0\) | \(1\) | \(3\) | \(4\) | \(7\) | \(11\) | \(18\) |
| \(4\)-fauxbinacci | \(0\) | \(1\) | \(4\) | \(5\) | \(9\) | \(14\) | \(23\) |

With this definition in mind, we can test a Lorikeet compiler by using parallel computation to calculate \(f_1(i) + f_2(i) + f_3(i) + f_4(i)\) for some sufficiently large \(i\).

Here is a Falcon program which can calculate the sum of the 32-index fauxbinacci numbers for sequences 1, 2, 3, and 4.

def fauxbinacci_1 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 1 else
  fauxbinacci_1 (n-1) + fauxbinacci_1 (n-2)
end

def fauxbinacci_2 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 2 else
  fauxbinacci_2 (n-1) + fauxbinacci_2 (n-2)
end

def fauxbinacci_3 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 3 else
  fauxbinacci_3 (n-1) + fauxbinacci_3 (n-2)
end

def fauxbinacci_4 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 4 else
  fauxbinacci_4 (n-1) + fauxbinacci_4 (n-2)
end

fauxbinacci_1 32 +
fauxbinacci_2 32 +
fauxbinacci_3 32 +
fauxbinacci_4 32

Note that we copy and paste the definition of the fauxbinacci function here so that we are calling functions with only one argument. Multi-argument functions induce the creation of new closures which, in turn, might exhaust our heap memory without some form of garbage collection.

We can write a similar program in Lorikeet, but we can calculate the different fauxbinacci values simultaneously by running those subexpressions in parallel. Here is a Lorikeet program which is capable of doing so:

def fauxbinacci_1 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 1 else
  fauxbinacci_1 (n-1) + fauxbinacci_1 (n-2)
end

def fauxbinacci_2 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 2 else
  fauxbinacci_2 (n-1) + fauxbinacci_2 (n-2)
end

def fauxbinacci_3 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 3 else
  fauxbinacci_3 (n-1) + fauxbinacci_3 (n-2)
end

def fauxbinacci_4 n =
  if n = 0 then 0 else
  if n = 1 then 1 else
  if n = 2 then 4 else
  fauxbinacci_4 (n-1) + fauxbinacci_4 (n-2)
end

def run_task func arg parent =
  let result = func arg in
  send(parent, result)
end

let child_1 = parallel(run_task fauxbinacci_1 32) in
let child_2 = parallel(run_task fauxbinacci_2 32) in
let child_3 = parallel(run_task fauxbinacci_3 32) in
let child_4 = parallel(run_task fauxbinacci_4 32) in
receive(child_1) +
receive(child_2) +
receive(child_3) +
receive(child_4)

Here, the run_task function is a generic tool that takes a function, an argument, and a channel to a parent process. It calls the function by passing it the argument and then sends the result through the provided channel. The main expression of the program uses this run_task function four times, once to start the computation of each fauxbinacci subexpression, and then waits to receive the values that they have produced.

To test parallelism in Lorikeet, you should save both of these sources to .bird files and then ./hatch and run them. When you run them, use the time shell command to see how long each of them takes to run. The parallel version should run considerably faster than the serial version if Lorikeet is working correctly. For reference, \(f_1(32) + f_2(32) + f_3(32) + f_4(32) = 16,790,850\).

Debugging Lorikeet

When debugging the Lorikeet compiler, you will need to examine the behavior of the programs it has compiled. The techniques already described in the course’s general GDB guide will be useful. You’ll especially want to be prepared to examine the registers and memory that are affected by the system calls you perform.

The fork system call presents a particular challenge. Upon executing a fork, your program creates an additional process. The default behavior for GDB is to allow the forked process to run unimpeded. This is useful if you want to watch the parent process. You may, however, prefer to observe the behavior of the child process instead. To make GDB follow the child process when a fork occurs, you can run the command set follow-fork-mode child. Running set follow-fork-mode parent reverts this change, meaning that future forks will follow the parent process instead.

GDB also has the ability to debug multiple processes simultaneously. By default, GDB always detaches from one process and follows the other. The GDB command set detach-on-fork off will instead cause GDB to follow both processes, meaning that you will need to step both of them in order for your overall Lorikeet program to run. The processes launched under GDB are termed “inferiors” and you can view them using the command info inferiors. You can switch between the processes being debugged by running inferior N where N is the GDB-assigned index for the process (e.g. inferior 3). Multi-process debugging can be especially useful when observing how different processes are using system calls to communicate.

As always: if you have any trouble using GDB, please reach out to ask questions! Having a good toolkit is crucial to interacting with complex interfaces like the system calls used in this lab.

Summary

To complete this assignment, you’ll need to

Much of the work in this lab will be in reasoning about what your compiler is doing, since the debugging tools we have will allow us only a narrow view of what’s happening. Make sure to ask if you have trouble divining what your compiler is doing!

Submitting

It is not enough to push your work. In your repository, you will find a Python script called submit.py. In order to submit your lab assignment, it should be sufficient to run that script by typing python3 submit.py -a lorikeet in your terminal. For this script to work correctly, you must have committed all uncommitted files in your repository and pushed your work so that your repository is in sync with your GitHub remote repository. Note that, because your compiler repository will be used for multiple different assignments, you must specify which assignment you are submitting when you run the submit script.

The aforementioned Python script is designed to create a Git tag: a name for a specific commit in your repository. (The script also performs a series of checks to help ensure that the tag you create contains what you likely think it contains.) The tag will be named using the words “submission”, the name of your assignment, and a version number (such as submission_lorikeet_01). Your instructor will know that you have submitted your work because this tag will appear in your repository. If you need to resubmit your work to correct changes after receiving a grade, you can simply create new commits and then create another tag (preferably with submit.py). At any given time, only your most recent outstanding submission will be graded.

In addition to submitting your work, you should fill out a brief questionnaire found here. In most cases, the questionnaire should take less than a minute.

If You Have Trouble…

…then please contact your instructor! The course forum is the preferred method, but you can reach out via e-mail as well. Good luck!