Compilers

Debugging x86-64 Assembly with GDB

The majority of lab assignments in this course focus on developing a compiler which generates executables from source code. While you can use a number of techniques to find bugs in your compiler, such as debug printing or examining the assembly files that the compiler produces, it is often quite illustrative to watch the compiled executables run as well. This guide focuses on how to use GDB to inspect the behavior of your compiled programs so that you can use that information to adjust and correct your compiler as necessary.

Configuring GDB

Before you debug your compiled programs with GDB, you’ll want to configure GDB to give you legible feedback. GDB reads a file from your home directory called ~/.gdbinit and runs all commands appearing in that file every time it starts. You probably don’t already have a file of that name, so create one with your preferred text editor and include the following lines:

set disassembly-flavor intel
tui new-layout asmreg regs 1 asm 1 status 0 cmd 1

define 75
  layout asmreg
  focus cmd
end

The first line will ensure that GDB shows assembly in Intel syntax (which we are using in the class) rather than AT&T syntax (which we are not). The next line defines a layout which displays the contents of registers at the top, the disassembled program in the middle, and the command window at the bottom. When you are debugging, you can run layout asmreg to switch to this layout. Layouts like this one are part of GDB’s “TUI mode”, which is short for “text user interface”. There is plenty of documentation regarding GDB’s TUI mode; this guide attempts to highlight the elements most useful for this course.

The define stanza allows us to define our own custom GDB commands. Here, we have created a command called 75 (for this course’s number) which switches to our new layout and places the window focus on the command window (so that you can use the arrow keys to navigate your command history). You can run this command on the GDB prompt when you start a debugging session rather than having to switch layouts each time. (Due to how GDB interacts with the assembly window, you can’t just run these commands in .gdbinit directly.)

Note that the ~/.gdbinit file supports comments via lines that start with #. If you want to comment out configuration elements (or leave notes for yourself), you can use these comments.
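
As a small illustration, a ~/.gdbinit that uses comments might look like the sketch below. The pagination setting is not part of this guide’s recommended configuration; it appears only as an example of a line you might leave commented out.

```
# Show disassembly in Intel syntax, as used in this class.
set disassembly-flavor intel

# Optional and disabled here: uncommenting the next line stops GDB
# from pausing long output to wait for a keypress.
# set pagination off
```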

Getting Started

To debug one of your assembly programs, you can simply run GDB with the executable as an argument. For instance, you might run

  gdb output/example.run

at the prompt on a CS lab computer. This will start a GDB session with the output/example.run binary loaded and ready. (This is a good time to follow up with the 75 command from above if you configured it.) GDB will not run the program until you tell it to do so, however, since you probably want to configure breakpoints and other settings before you begin debugging.

You’ll usually want to start debugging at bird_main, the label where your generated assembly code starts. To do this, you can enter the command

  break bird_main

This will set a GDB breakpoint. A breakpoint will pause execution of your program when the instruction pointer reaches the provided memory address. In this case, bird_main is the memory address of your first generated instruction. Once you have set this breakpoint, you can run your program just by typing

  run

This tells GDB to run your program. Upon reaching bird_main, your program will stop and GDB will wait for instructions. If you are in a layout which shows assembly code or disassembled machine code (such as the one selected by the 75 command from above), the highlighted line is code which has not yet run but is about to run.

In order to execute the very next assembly instruction, you can use the GDB command si. This stands for “step instruction” and will cause GDB to run a single machine instruction of your program before waiting for more commands. Note that giving GDB an empty command (just pressing Enter) will re-execute your previous GDB command, so you don’t have to type si for every instruction if you want to watch your program run several steps one at a time.

Note that the ni command (standing for “next instruction”) will also step like si over most instructions. The distinction is that, for call instructions, the ni command will attempt to perform the entire call and only stop once you’ve returned to the instruction after the call. The si command, on the other hand, will stop immediately after jumping to the location named by the call. Otherwise, these commands are basically the same.
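
For reference, a stepping session might use commands like those in the following sketch. The count given to si and the x/i command are standard GDB features that this guide does not otherwise cover; they are shown only as optional conveniences.

```
# Execute exactly one machine instruction, following calls.
si

# Execute one instruction, treating a call and everything it does
# as a single step.
ni

# Execute five instructions in a row (the count is arbitrary).
si 5

# Show the instruction that will execute next, which is useful
# when no assembly window is visible.
x/i $pc
```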

If you want to restart your program, you can just issue another run command. You can also use the break command to set additional breakpoints (e.g. if you know that bad behavior starts at a particular label in your code) and use the delete command to clear breakpoints. GDB has built-in help for these commands which you can access with the help command (e.g. help delete).
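
A breakpoint-management session might look like the following sketch. The label my_label and the breakpoint number 2 are hypothetical; substitute a label from your own generated assembly and a number reported by info breakpoints.

```
# Stop whenever execution reaches a (hypothetical) label.
break my_label

# List all breakpoints along with the numbers GDB assigned them.
info breakpoints

# Remove only the breakpoint numbered 2.
delete 2

# Remove every breakpoint (GDB asks for confirmation first).
delete

# Show GDB's built-in documentation for a command.
help break
```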

Window Focus and Command History

Because GDB’s TUI mode is entirely text-based, you cannot switch window focus using the mouse. While you can type GDB commands to switch focus, this is pretty tedious. Instead, you can use GDB’s multi-stroke key bindings for navigating the TUI. GDB has several of these bindings, but the one that is meaningful to us is “Ctrl+X o”: that is, hold Ctrl and press X, release both keys, and then press o.

Each use of this key sequence will switch focus to the next window. Focusing the assembly window allows you to scroll through the disassembled memory. Focusing the command window allows you to scroll through your command history.
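
If you prefer typing to key bindings, the focus command used in the 75 definition can do the same job. The sketch below assumes the asmreg layout from the configuration section, whose windows are named regs, asm, and cmd.

```
# Move focus to the next window, like Ctrl+X o.
focus next

# Focus the assembly window so the arrow keys scroll the disassembly.
focus asm

# Focus the command window so the arrow keys browse command history.
focus cmd
```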

Examining Memory

As you debug your compiled programs, you will often need to examine the memory of a running program to understand what is going on. The most useful and general way to do this is to use GDB’s “examine” command (abbreviated as x). We will use the command primarily in the format x/#xg PTR where # is replaced by a word count and PTR is replaced by a pointer to the memory we want to examine. (The x after the slash requests hexadecimal output and the g requests 8-byte “giant” words.) For instance, the command x/4xg $rsp will show four words of memory starting with the memory address stored in the rsp register and then moving forward in memory (toward higher addresses, which on the stack means toward older entries) to show the next three words after that.
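
A few variations on the examine command are sketched below. The rbp example and the decimal format are illustrative choices rather than commands this guide relies on; all of them use GDB’s standard count, format, and size letters after the slash.

```
# Four 8-byte words in hexadecimal, starting at the address in rsp.
x/4xg $rsp

# Two 8-byte words starting at the address in rbp.
x/2xg $rbp

# The same four stack words as signed decimal numbers.
x/4dg $rsp

# Print the value of rsp itself (the address, not the memory there).
p/x $rsp
```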

The x command is useful for inspecting the state of memory as you debug. If r8 contains a bird pointer to a tuple, for instance, you might write x/8xg $r8-1 to look at the eight words starting with that pointer. Here, eight is an arbitrary number: as human debuggers, we can elect to ignore the extra words, so we can round up to make sure we’re not missing anything important. GDB can perform simple arithmetic (such as the $r8-1 here), allowing us to examine the machine pointer corresponding to the bird pointer stored in the r8 register.
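
The pointer arithmetic above can be checked step by step, as in the sketch below. It assumes, as in the example, that r8 currently holds a bird pointer to a tuple.

```
# Show the raw bird pointer stored in r8.
p/x $r8

# Show the machine address that the bird pointer corresponds to.
p/x $r8-1

# Examine eight 8-byte words beginning at that machine address.
x/8xg $r8-1
```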

If you have questions…

…then please let your instructor know! It’s important to learn how to use GDB as you work to debug more advanced programs. Feel free to post on the course forum or reach out to your instructor in some other way if you prefer.