Problem Introduction
For this project you will add the first part of the code generator to
your C-- compiler. You will add code generation for parts of the C--
language that do not involve variables or functions; C-- code with a
single main function that deals directly with numbers.
The code generator part of your compiler will take as input the AST
produced by the parser, and will traverse the AST to generate MIPS
assembly code.
Your compiler executable should take two command line arguments, the first
is the C-- code source file and the second is the name of the MIPS assembly
code output file:
$ ./mycc foo.c-- foo.mips
You can then run your MIPS assembly program using spim or xspim:
$ spim foo.mips
$ xspim foo.mips
Getting Started
Grab the starting point code and add it to your svn repository
One of you or your partner, should do the following:
- From you project subdirectory, copy over the setup3 file from my
directory containing the project 3 starting point code:
$ cd cs75/CS75_repositoryname/project
$ cp /home/newhall/public/cs75/proj3_startingpt/setup3 .
- run setup3:
$ ./setup3
The setup3 script will copy over project 3 starting point files into
the symtab, parser, and includes subdirectories, call svn add to add these
files to your repository, and then call svn to commit the new files.
- At this point your partner should be able to do an "svn update" to grab
the project3 starting point files from your repository.
Starting Point Code Details
There is not much starting point code for this assignment.
Most of the files are empty, and are just provided as a starting point that
can be built with the supplied Makefile. You may want to add additional
.c and .h files as you implement code generation functionality. If you do
so, make sure to edit the Makefile to include your .c files in building
the compiler executable.
include/[codegen.h, symtab.h]: shell of header file for code generator and symbol table
codegen/codegen.c: starting point for code generator module, the codegen function
should take the AST and generate MIPS code to a codetable
codegen/main.c: the main program for the full C-- compiler, you may need to change
some of the function calls in here, but it is basically complete
codegen/Makefile: builds the mycc C-- compiler, modify this if you add more .c files
symtab/symtab.c: starting point for the symbol table module, you won't need
this until the next assignment
Project Details:
As a first step, read over some of the MIPS references available off
the "On-line Resources" part of the CS75 home page.
For this part of the code generator, you will ignore all variables and
function calls. Assume that all C-- code will simply be a main program
that deals directly with numbers.
Things to keep in mind as you add code generation
-
Your code generator will traverse the AST produced by the parser to generate
MIPS code. Structure the code generator so that it has functions that
roughly match the different node types in the AST.
- keep in mind the difference between what happens at compile time and
what happens at runtime (you will be confused by this at some point).
at compile time:
- creating, modifying, and using the symbol table
- determining where program data are located in memory or registers
- generating instructions to create/tear-down stack frames, to access
memory values, etc.
- there is no stack
- code is not being executed (e.g. functions are not being called)
at run-time:
- the stack exits (as does the rest of the address space)
- there is no symbol table
- instructions are executed (e.g. function calls are made)
- I strongly suggest that you do a little bit of code generation at a
time and test thoroughly after each new phase.
- As you test, start with simple cases, then build up to stress testing
functionality (do you handle all typical and special cases?)
Suggestions for order in which to implement code generation
The following is a suggested order of implementation and testing:
- Implement a codetable data structure and functions for adding
instructions to the code table, for modifying instructions in the code
table, and for writing the codetable to the resulting MIPS assembly code
output file.
The code table contains instruction entries. You should define a
struct that can store all types of MIPS instructions, labels, and directives.
Make sure to use enum or #defines for field values to make your code readable.
- Add a function to generate the start of every MIPS assembly code file
to the code table. The beginning of every MIPS assembly
file should look like this:
.data
_newline_:
.asciiz "\n"
.text
.globl main
_newline_ is used for writeln and the only global symbol is main.
- Add functions to generate all the generic parts of function preamble
and function epilog code to the code table.
Assume that the callee will always save all callee saved registers in
the prolog. As an extra credit part of the next assignment, you can
think about saving only the ones the callee actually needs to use.
Because you are not handling code generation for functions in this
assignment, you should hard-code in calls to generate main's prolog
before you handle generating code for the parts of C-- that you are
handling, and then hard-code in a call to generate main's epilog after you
are done generation code for the parts of C-- you are handling.
In the next assignment you will remove these calls and handle generating
code for main's prolog and epilog just like you handle for any other
function definition in the AST.
- Start adding code to support simple C-- programs with a single main
function with expressions involving numbers only and write and writeln
statements. Here is a suggestion for an order in which to add and test
functionality:
- writeln;
- write num;
- write expression;
- if else statements
- while loops (at this point you can only test loops that never execute
or loops that are infinite, since you cannot yet handle assignment
statements)
An example C-- program that you should be able to compile is the following
(you do not need to handle the variable declaration list part of a block
for this assignment):
int main() {
if(4 != 5 || 6 == 8) {
write 1;
writeln;
write 2 + 6 * 3 - 'a';
writeln;
} else {
write 0;
writeln;
}
while(8 < 3) {
write 8;
}
}
write and writeln
write and writeln are should be implemented using SPIM's support for
system calls (print_int and print_string). Registers $a0 and $v0 are
used for system calls.
Expressions
To generate code for expressions, start with C-- programs that
contain simple arithmetic expressions like:
int main() {
write 9;
write 9 + 10;
write 10 - 3;
}
Then try more and more complicated and longer arithmetic expressions
and test that your generated code is correct.
To generate code for expressions you need to use registers to store temporary
results. For example, if the AST for a write stmt looks like:
write
|
+
/ \
9 *
/ \
10 3
The value of 10 needs to be loaded into a register, the value of 3 needs to be
loaded into a register, they need to be multiplied together and the result
stored in a register, then the value of 9 needs to be stored in a register
and added to the register storing 10*3 and that result needs to be stored
in a result register.
You should use the caller saved registers $t0-$t7 for evaluating expressions.
Your code generator will need to implement a register allocation scheme for
these registers so that when a temporary register is needed to store an
operand or result, one can be found and when a temporary register
is no longer needed it can be made available for generating other instructions.
Make sure that you can handle expressions that have more operands then there
are temporary registers:
write 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10;
Once you have stress tested arithmetic expressions, implement relational
expressions:
write 9 > 13; // evaluates to 0 (false)
write 9 <= 13; // evaluates to 1 (true)
write 9 == 13; // evaluates to 0
Once you have thoroughly tested relational operators, then implement code
generation for expressions with logical operators:
write !2; // evaluates to 0
write 0 || 1; // evaluates to 1
write 8 < 9 && 7 != 6 && 2 > 10; // evaluates to 0
You must implement
short-circuiting for
(expr || expr) and ( expr && expr)
To do this you need to implement unique labels to which the short-circuiting
code will branch to skip over evaluating the rest of the expression once the
final answer is known. You can use a simple label notation, like
.Li where
i increases for each unique label generated.
sprintf may be a useful function for producing the label string.
if-else and while
if-else and while are C-- constructs that alter the control flow of
a program. You need to generate branch instructions to unique labels
to implement these correctly. Your code generator should implement
somewhat meaningful unique labels to make your assembly code
for if-else and while control flow more readable. For example, here is
what my labels for an if-else might look like:
... # instructions for the if part
...
.L0_else: # a unique label for the start of the else part
...
...
.L1_ifdone: # a unique label for the end of the if-else stmt
What to hand in
A tar file to be submitted using cs75handin containing:
- a tar file of your project directory (i.e. all the source, header,
and Makefile files I need to build and run your parser). You should do a
'make clean' before taring up your project directory, and make sure that
all debug output is turned off (i.e. your compiler should have no
output to stdout).
- A README file with:
- Your name and your partner's name
- The number of late days you have used on this assignment
- The total number of late days you have used so far
Project Demo
You and your partner will sign-up for a 20 minute demo of your compiler.
Prior to your demo, you should come up with some example C-- programs
that demonstrate both the correctness of your solution and that demonstrate
any parts of your code generation that are particularly interesting, tricky,
etc. You should also show me some error handling.