CS31 Weekly Lab: Week 5

Assembly, C pointers, and valgrind


Create a week05 subdirectory in your weeklylab subdirectory and copy over some files (note the -r flag passed to cp):

    cd cs31/weeklylab
    pwd
    mkdir week05
    ls
    cd week05
    pwd
    cp -r ~kwebb/public/cs31/week05/* .  # Note the -r flag passed to cp!
    ls
    Makefile  gdb_examples/  memparts.c  pointers.c  valgrind_examples/  while-loop-asm.s  while-loop.c

Week 5 lab goals:

  1. Practice writing assembly code.
  2. Introduction to the leal instruction.
  3. Example of accessing C pointer variables in IA32.
  4. See an example of where different parts of program memory live.
  5. Learn valgrind to find memory access errors.

Writing a loop in assembly

For this exercise, refer to the x86 instruction reference sheet.

Next, we'll try writing a while loop in IA32 assembly. Take a look at the while-loop.c file. You'll see that it creates an array with five buckets. It sets three of them to 5, 10, and 2, and it prompts the user to fill in the other two. It then calls a sum function. Right now, the sum function, which should return the sum of all five buckets in the array, is empty! We need to fill it in, only rather than doing it in C, let's write it in IA32 assembly.

Open the file while-loop-asm.s in a text editor. There's lots of stuff in here that was generated by gcc that isn't easily human-readable, but I've added comments in the section that you need to edit. At the beginning, the base of the array is stored in the %ecx register. You want to put your result in %eax when you're done. Right now, the constant 10 is being put into that register, which is no good. Work with your partner to write a loop that sums up the five array values.

At any time, you can run make and then execute while-loop-asm to test your implementation. Don't be surprised if you see some crazy values the first few times you execute it.

Comparison with gcc

After we've written a correct while loop in assembly, let's edit while-loop.c with an equivalent C implementation and use gcc -S to see what the compiler generated:

gcc -m32 -S while-loop.c
gcc -m32 -Wall -g -o while-loop while-loop.c
gvim while-loop.s
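
For comparison, here is a minimal sketch of what an equivalent C loop might look like. The name sum_array and its parameters are assumptions made for illustration; the sum function in while-loop.c may have a different name and signature.

```c
/* Hypothetical helper: sum the first `count` ints starting at `values`.
 * The name and signature are assumptions, not taken from while-loop.c. */
int sum_array(const int *values, int count) {
    int sum = 0;
    int i = 0;
    while (i < count) {      /* loop test: compare the index to the length */
        sum += values[i];    /* read the int at address values + 4*i */
        i++;
    }
    return sum;              /* on IA32, the return value is left in %eax */
}
```

Calling sum_array on the lab's five-bucket array with a count of 5 would add up all five buckets, which is the same work your assembly loop does with the base address in %ecx.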

The leal instruction

Load effective address: leal S,D # D<--&S, where D must be a register, and S is a Memory operand. It's often used to implement C's address of (&) operator.

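leal looks like a mov instruction, but it does not access memory. Instead, it uses the addressing circuitry to compute an address and stores that address in the destination register. Because the address computation is just an addition (plus an optional scaled index), the compiler can also use leal to do arithmetic in one instruction instead of several. For example: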

if eax holds the value of x:
 leal (%eax),%ecx  # R[%ecx]<--&(M[R[%eax]])
 # this copies the value stored in %eax to %ecx

The key is that the address of (M[ at address x ]) is x, so this is moving the value stored in %eax to %ecx; there is no memory access in this instruction's execution.

Examples:

Assume:   %eax: x    %edx: y

leal (%eax), %ecx               # R[%ecx] <-- x
leal 6(%eax), %ecx              # R[%ecx] <-- x+6


Assume y is a variable on the stack, at address %ebp - 4.

leal -4(%ebp), %ecx             # R[%ecx] <-- &(M[R[%ebp]-4]): R[%ecx] <-- &y

leal appears a lot in compiler-generated code. The compiler sometimes abuses leal to perform basic arithmetic.
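
One way to think about leal is as a plain arithmetic function on register values. The sketch below models the address computation in C; the function lea_model and its parameter names are hypothetical, and the scaled-index part covers the general IA32 form D(Rb,Ri,S), which goes a bit beyond the examples above.

```c
#include <stdint.h>

/* Hypothetical model of the value that leal D(Rb,Ri,S), Rd puts in Rd:
 * Rb + Ri*S + D. Only arithmetic happens; memory is never read. */
uint32_t lea_model(uint32_t base, uint32_t index, uint32_t scale, int32_t disp) {
    /* Unsigned arithmetic wraps modulo 2^32, like a 32-bit register. */
    return base + index * scale + (uint32_t)disp;
}
```

With base = x, index = 0, and disp = 6, this gives x+6, matching leal 6(%eax), %ecx. With disp = -4 and base set to the value of %ebp, it gives the address of y from the stack example.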

C pointer variables in IA32

pointers.c is a simple program that uses a pointer variable.

$ cat pointers.c
 int pointers() {
   int x, y, *ptr;
   x = 8;
   ptr = &y;
   *ptr = 30;
   x = *ptr + 20;
 }

Let's compile a simple program using pointers and see what its assembly code looks like (or just type make):

$ gcc -m32 -S pointers.c

Let's cat out the .s file and look at some of the instructions. The thing to note is that when *ptr is used (ptr is dereferenced), first the value of the ptr variable is obtained (its value is the address of y), and then the value at that address is accessed: a level of indirection.

Here is the code with some annotations around what it is doing (note the use of leal instruction):

pointers:
  pushl   %ebp            # stack setup
  movl    %esp, %ebp      # stack setup
  subl    $24, %esp       # stack setup
  ...
  movl    $8, -20(%ebp)   # x = 8
  leal    -24(%ebp), %eax # put the address of y (the value %ebp - 24) in %eax
  movl    %eax, -16(%ebp) # copy the address of y into the variable named 'ptr'
  movl    -16(%ebp), %eax # load ptr back into %eax (redundant here; without optimization, gcc reloads variables for each statement)
  movl    $30, (%eax)     # dereference ptr and store 30 at the location it points to
  movl    -16(%ebp), %eax # load that same address, again... (This also does nothing, the value was already in %eax)
  movl    (%eax), %eax    # overwrite %eax with the value that it pointed to
  addl    $20, %eax       # add 20
  movl    %eax, -20(%ebp) # store the resulting value in x's memory location (%ebp-20)
  ...
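
To check your reading of the assembly, it can help to trace the same steps in C and confirm the final value of x. The sketch below repeats the body of pointers() but returns x; the name pointer_demo is hypothetical and is not part of the lab files.

```c
/* Hypothetical variant of pointers() that returns x so the result
 * can be checked. The statements mirror the ones in pointers.c. */
int pointer_demo(void) {
    int x, y, *ptr;
    x = 8;           /* movl $8 into x's stack slot */
    ptr = &y;        /* leal computes y's address; it is stored in ptr */
    *ptr = 30;       /* load ptr, then store 30 at the address it holds */
    x = *ptr + 20;   /* load ptr, load the value it points to, add 20 */
    return x;        /* y is 30 at this point, so x is 30 + 20 */
}
```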

Parts of Memory

Let's look at memparts.c. This program prints out the memory address of different parts of the program: global variables, local variables on the stack, instructions, and heap memory locations for malloc'ed space.

Let's just run this and see where some things are:

./memparts

The thing to note now is that heap memory locations (malloc'ed space) and local variable locations (on the stack) are at very different addresses. We will revisit this program later in the semester when we talk about other parts of program memory.
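
If you want to experiment on your own, a stripped-down version of the same idea might look like the sketch below. This is not the code in memparts.c; the names global_counter and print_regions are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

int global_counter = 0;   /* lives with the program's global data */

/* Hypothetical helper: print one address from each part of memory.
 * Returns 0 on success, or -1 if malloc fails. */
int print_regions(void) {
    int local = 0;                      /* lives on the stack */
    int *heap = malloc(sizeof(int));    /* points to heap space */
    if (heap == NULL) {
        return -1;
    }
    /* Casting a function pointer to void * is not strictly ISO C,
     * but gcc on Linux accepts it. */
    printf("instructions: %p\n", (void *)print_regions);
    printf("global:       %p\n", (void *)&global_counter);
    printf("heap:         %p\n", (void *)heap);
    printf("stack:        %p\n", (void *)&local);
    free(heap);
    return 0;
}
```

The exact addresses vary from run to run and from machine to machine, so compare how far apart the stack and heap values are rather than the specific numbers.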

gdb, valgrind, and memory bugs

In the code you copied over are two subdirectories with test files for gdb and valgrind. We are going to go over just a couple of these, but see our departmental gdb and valgrind documentation (both linked to below) for more information about using gdb and valgrind and try out some more examples.

GDB for C program debugging

Today, we are going to look at some features of gdb for debugging C programs: looking at a stack trace, moving between frames to examine parameter and argument values, and examining the runtime state of a segfaulting program. Again, you can use ddd, but I'm going to show you the commands in gdb itself.

cd into the gdb_examples subdirectory.

First, run make to build the executables (note they are all compiled with -g).

Let's look through a couple of the example programs in gdb, following along in Tia's GDB Guide.

We are going to look at "segfaulter" in gdb. It's listed, among other examples, in the "Sample gdb sessions" part of the gdb guide under run 2: debugging segfaulter.

Up the page on this guide are lists of common gdb commands and some examples of how to use them.

Valgrind

Valgrind is a tool for finding heap memory access errors and memory leaks in C and C++ programs. Memory access errors are often very difficult bugs to find, and valgrind helps you easily find errors like reads or writes beyond the bounds of a malloc'ed array, accessing free'ed memory, reading uninitialized memory, and memory leaks (not freeing malloc'ed space before all variables referring to it go out of scope).

To use valgrind, just compile with -g, and run valgrind on your program:

make
valgrind ./program

The output at first seems a bit cryptic, but once you see the basics of how to interpret it, it is extremely helpful for finding and fixing memory access errors. Let's look at the Valgrind Guide to see how to interpret some of this valgrind output. This guide contains links to other valgrind resources, and the README file in the code you copied over lists some command line options for running valgrind. More information on debugging tools for C is available at C programming tools: gdb and valgrind.
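
As a small illustration of the kind of code valgrind checks, here is a sketch of a function that fills a malloc'ed array correctly, with comments marking where the typical mistakes would creep in. The function make_squares is hypothetical and is not one of the files in valgrind_examples.

```c
#include <stdlib.h>

/* Hypothetical example: return a malloc'ed array holding 0*0, 1*1, ...,
 * (n-1)*(n-1). The caller is responsible for calling free on the result. */
int *make_squares(int n) {
    if (n <= 0) {
        return NULL;
    }
    int *squares = malloc((size_t)n * sizeof(int));
    if (squares == NULL) {
        return NULL;
    }
    /* Writing the test as i <= n would store one int past the end of
     * the array; valgrind reports that as an invalid write. */
    for (int i = 0; i < n; i++) {
        squares[i] = i * i;
    }
    /* If the caller never frees this pointer, valgrind's leak summary
     * counts the block as lost when the program exits. */
    return squares;
}
```

Try changing the loop test to i <= n in a copy of this code, compile with -g, and run it under valgrind to see how the report points back to the offending line.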

Lab 4 Overview

Let's look at lab assignment #4, and then you can use the remaining time to get started. Start with Part 1, which is a C programming assignment using pointers, and remember this page with information on using gdb and valgrind to debug your C programs.