CS31 Weekly Lab: Week 6

Examining binary files


Create a week06 subdirectory in your weeklylab subdirectory and copy over some files:

    cd cs31/weeklylab		
    pwd
    mkdir week06
    ls
    cd week06
    pwd
    cp ~kwebb/public/cs31/week06/* .
    ls

The leal instruction

Load effective address: leal S,D # D<--&S, where D must be a register and S is a memory operand. It's often used to implement C's address-of (&) operator.

leal looks like a mov instruction, but it does not access memory. Instead, it uses the address-computation circuitry to do arithmetic in a single instruction, rather than the several arithmetic instructions that would otherwise be needed. For example:

if %eax holds the value of x:
 leal (%eax),%ecx  # R[%ecx]<--&(M[R[%eax]])
  # this moves the value stored in %eax to %ecx
  

The key is that the address of M[x] (the memory location at address x) is x, so this instruction just copies the value stored in %eax to %ecx; executing it involves no memory access.

Examples:

  Assume:   %eax: x    %edx: y

  leal (%eax), %ecx               # R[%ecx] <-- x
  leal 6(%eax), %ecx              # R[%ecx] <-- x+6

Assume y is a variable on the stack, at address %ebp - 4.

  leal -4(%ebp), %ecx             # R[%ecx] <-- &(M[R[%ebp]-4]): R[%ecx] <-- &y

leal appears a lot in compiler-generated code. The compiler often uses leal to perform basic arithmetic that has nothing to do with addresses.
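
To make the address arithmetic concrete, here is a minimal C sketch of the computation leal performs. The names effective_address and address_of_element are made up for illustration and are not part of the lab files. The first function mirrors the general IA32 memory-operand form disp(base, index, scale); the second shows the same idea with C's & operator. Neither one reads the memory at the computed address.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the lab files): computes the value
 * that  leal disp(base, index, scale), dest  would store in dest.
 * Only arithmetic happens here; memory is never read. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp) {
    return base + index * scale + (uint32_t)disp;
}

/* The same idea in C: taking the address of a[i] computes
 * a + i * sizeof(int) without loading the value of a[i]. */
int *address_of_element(int *a, int i) {
    return &a[i];
}
```

For instance, effective_address(x, 0, 1, 6) corresponds to leal 6(%eax), %ecx when %eax holds x.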

C pointer variables in IA32

pointers.c is a simple program that uses a pointer variable.

$ cat pointers.c
 int pointers() {
   int x, y, *ptr;
   x = 8;
   ptr = &y;
   *ptr = 30;
   x = *ptr + 20;
 }

Let's compile this program and see what its assembly code looks like (or just type make):

$ gcc-4.4 -m32 -S pointers.c

Let's cat out the .s file and look at some of the instructions. The thing to note is that when *ptr is used (ptr is dereferenced), first the value of the ptr variable is obtained (its value is the address of y), and then the value at that address is accessed: a level of indirection.

Here is the code with some annotations around what it is doing (note the use of leal instruction):

$ cat pointers.s
  pointers:
  pushl   %ebp
  movl    %esp, %ebp
  subl    $16, %esp
  movl    $8, -4(%ebp)     # x = 8
  leal    -8(%ebp), %eax   # R[%eax] <-- &(M[R[%ebp]-8]): R[%eax] <-- &y
  movl    %eax, -12(%ebp)  # ptr = &y;
  movl    -12(%ebp), %eax  # R[%eax] <-- ptr: R[%eax] <-- &y
  movl    $30, (%eax)      # what ptr points to gets 30
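  movl    -12(%ebp), %eax  # R[%eax] <-- ptr: R[%eax] <-- &y
  movl    (%eax), %eax     # R[%eax] <-- *ptr: R[%eax] <-- y (30)
  addl    $20, %eax        # R[%eax] <-- *ptr + 20
  movl    %eax, -4(%ebp)   # x = *ptr + 20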
  leave
  ret
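
As a rough check on the annotations, here is a sketch of the same statements with the matching instructions noted beside them. pointers_trace and its out_x and out_y parameters are hypothetical additions (not in pointers.c) that hand the final values back so they can be inspected; the stack offsets in the comments are the ones from the listing above and may differ with another compiler or other flags.

```c
/* Illustrative variant of pointers(): same statements, but the final
 * values are returned through out_x and out_y (hypothetical
 * parameters) so the result can be checked. */
void pointers_trace(int *out_x, int *out_y) {
    int x, y, *ptr;
    x = 8;           /* movl $8, -4(%ebp) */
    ptr = &y;        /* leal -8(%ebp), %eax ; movl %eax, -12(%ebp) */
    *ptr = 30;       /* movl -12(%ebp), %eax ; movl $30, (%eax) */
    x = *ptr + 20;   /* load ptr, load *ptr, add 20, store into x */
    *out_x = x;
    *out_y = y;
}
```

After a call, the value written through out_x is 50 and the value written through out_y is 30, because the store through ptr changed y.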

Compiling Phases and Assembly

First, let's open up simplefuncs.c in a text editor.

We are going to look again at how to use gcc to create an assembly version of this file, how to create an object (.o) file, and how to examine its contents.

If you open up the Makefile you can see the rules for building .s, .o and executable files from simplefuncs.c. We will be compiling the 32-bit version of instructions, so we will use the -m32 flag to gcc:

gcc-4.4 -m32 -S simplefuncs.c   # just runs the compiler (no assembling) to create a .s assembly text file
gcc-4.4 -m32 -c simplefuncs.c   # compiles and assembles to a relocatable object binary file (.o)
gcc-4.4 -m32 -o simplefuncs simplefuncs.o  # links to create a 32-bit executable file

To see the machine code and assembly code mappings in the .o file:

objdump -d simplefuncs.o

You can compare this to the assembly file:

cat simplefuncs.s

Tools for examining binary files

Some tools for examining binary files:
  1. strings dumps all the strings in a binary file:
      strings simplefuncs 
    
  2. nm (or objdump -t) to list symbol table contents:
      objdump -t  simplefuncs   # list symbol table in the executable file
      nm --format sysv  simplefuncs  # list symbol table in the executable file
    
  3. gdb and ddd, using the disass command to disassemble code.
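
To get a feel for what these tools report, it can help to run them on a tiny file whose contents you already know. The sketch below is a hypothetical file (call it symbols_demo.c); it is not one of the lab files, and every name in it is made up for illustration.

```c
/* symbols_demo.c: hypothetical file, not part of the lab files */

int global_counter = 0;      /* global data symbol */
static int file_local = 7;   /* file-scope (local) data symbol */

/* The string literal below is stored in the binary as plain bytes,
 * which is the kind of thing the strings tool looks for. */
const char *greeting = "hello from symbols_demo";

/* A function: its name shows up as a symbol in the text section. */
int bump(int amount) {
    global_counter += amount + file_local;
    return global_counter;
}
```

Assuming the file is compiled to an object file with the same -m32 -c flags used above, nm on the result should list bump, global_counter, file_local and greeting, and strings should print the greeting text. The exact symbol type letters and any extra output depend on the toolchain.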

gdb and ddd to debug at the assembly code level

We covered using gdb to step through IA32 assembly in the week 4 lab, but let's try it out again with the simplefuncs program.

First, let's open up simplefuncs.c in vim. Then, let's try some things out in gdb:

gdb simplefuncs
(gdb) break main
(gdb) break func1
(gdb) run
In gdb you can disassemble code using the disass command:
(gdb) disass main
You can set a breakpoint at a specific instruction:
(gdb) break *0x08048477   # set breakpoint at specified address 
And you can step at the instruction level using ni or si (si steps into function calls, ni steps over them):
(gdb) ni	  # execute the next instruction then gdb gets control again 
(gdb) ni
(gdb) ni
(gdb) ni
(gdb) ni
(gdb) disass
(gdb) cont      # continue to next break point
Now we are at the call to func1. Let's step into this function using si (we also have a breakpoint at this function, so let's see when it is hit):
(gdb) si
(gdb) disass
(gdb) ni 
(gdb) where
(gdb) disass
(gdb) cont
You can print out the values of individual registers like this:
(gdb) print $eax
Or the memory contents at a given address, specified either as an absolute numeric address or as an expression using register values:
(gdb) p *(int *)($ebp + 8)
(gdb) x     $ebp + 8
(gdb) x/d     $ebp + 8   # x/d display as decimal value
You can also view all register values:
(gdb) info registers
You can also use the display command to automatically display values each time a breakpoint is reached:
(gdb) display $eax
(gdb) display $edx
You can use the examine command (x) to display the contents of a memory location, given either as an address or via a register value (x is shorthand for examine, and p is shorthand for the print command):
x $esp-0x8  # see what p and x display for the same value
p $esp-0x8    

p *(int *)($esp-0x8)    # here is how to get what x gives you using print

x $esp + 0x1c    # here is examining the contents at a memory location
x 0xffffd2fc     # specifying the address in two different ways
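
The difference between p and x on an address expression can be sketched in C. The helper names address_value and int_at are made up for illustration; they are only an analogy for what gdb does, not how gdb is implemented. The first does arithmetic on an address and returns the address itself, as p does with a register expression; the second goes on to read the int stored at that address, as x/d or p *(int *)(...) does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: like "p $esp-0x8", this only does arithmetic
 * on an address and returns the resulting address. */
uintptr_t address_value(const void *base, long offset) {
    return (uintptr_t)base + (uintptr_t)offset;
}

/* Hypothetical helper: like "x/d $esp-0x8" or "p *(int *)($esp-0x8)",
 * this computes the address and then reads the int stored there. */
int int_at(const void *base, long offset) {
    int value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

Given int a[3] = {11, 22, 33}, calling int_at(&a[2], -2 * (long)sizeof(int)) reads a[0] and returns 11, while address_value with the same arguments returns the address of a[0] rather than its contents.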

ddd

We are going to try running this in ddd instead of gdb, because ddd has a nicer interface for viewing assembly, registers, and stepping through program execution:
ddd simplefuncs
The gdb prompt is in the bottom window. There are also menu options and buttons for gdb commands, but I find the gdb prompt at the bottom easier to use.

You can view the register values as the program runs (choose Status->Registers to open the register window).

More Info

Quick summary of some useful gdb commands for debugging at the assembly code level (this is a made-up code example):
  ddd a.out
  (gdb) break main
  (gdb) run  6              # run with the command line argument 6
  (gdb) disass main         # disassemble the main function
  (gdb) break sum           # set a break point at the beginning of a function
  (gdb) cont                # continue execution of the program
  (gdb) break *0x0804851a   # set a break point at memory address 0x0804851a
  (gdb) ni                  # execute the next instruction
  (gdb) si                  # step into a function call (step instruction)
  (gdb) info registers      # list the register contents
  (gdb) p $eax              # print the value stored in register %eax
  (gdb) p  *(int *)($ebp+8) # print out value of an int at addr (%ebp+8)
  (gdb) x/d $ebp+8          # examine the contents of memory at the given
                            # address (/d: prints the value as an int)
                            # display type in x is sticky: subsequent x commands
                            # will display values in decimal until another type
                            # is specified (e.g. x/x $ebp+8   # in hex)

Lab 5, it's aMAZEing.

Next, let's put these tools to use in solving the puzzles of lab 5.