Interpreter Project (Part 2)

Quick link: Summary page

Wrapping up Part 1: Test and handin37 and update37

Each step of the interpreter assumes that you have successfully completed the previous step. By writing comprehensive tests (in i-tests.rkt), you can be sure that the code you have written so far works properly. Although I will usually provide a few sample tests, the responsibility for extensively testing each section is yours. If your code passes all of my tests and all of your tests, you should turn in your completed version of Part 1 using handin37.

To get the starting-point code and sample tests for Part 2, you will first need to run the update37 command which will create the directory cs37/labs/i-2 and copy the files interpreter-p2.rkt and i-tests-p2.rkt. Next, you must join your old code and tests (from cs37/labs/i-1) with the new code and tests. To do so, simply follow these steps. (Note: These steps assume that your interpreter stored in a file called "interpreter.rkt", that your tests are stored in a file called "i-tests.rkt", and that both files are located in your labs/i-1 directory. If these conditions are not met and you don't know how to remedy the problem, please ask for help before proceeding.)

Change to the labs/i-2 directory: cd ~/cs37/labs/i-2
Run the merge script to join your old files and the new files together: ./merge
- Assuming you successfully merged your old and new files together, the above script will leave you with a new interpreter.rkt and i-tests.rkt file.
- In addition, to be sure you don't accidentally run the merge program a second time, the files of the now-unnecessary interpreter-p2.rkt and i-tests-p2.rkt will be set to be empty. (Do not remove these files as they'll just get re-copied once you run update37.)
Do not go back and edit the files in labs/i-1, as they are just copies of the current version you'll be working with in the labs/i-2 directory.

(Temporarily) completing the environment

In order to continue with our interpreter, we need to be able to complete two pieces of the environment abstraction which we omitted in Part 1: the function setup-env and the variable global-env which stores the variables in the global environment.

Racket predefines many bindings in its global environment: null, null?, car, cdr, list, and cons to name just a few. At this point, we will start with a very small global environment with just a single binding: the symbol null will be bound to the empty list ().

This is a temporary solution. In part 3, we will greatly expand our global environment. For now, we will use the following as our definition for setup-env:

(define setup-env
  (lambda ()
    (extend-env '(null) '(()) empty-env)))

We will then set the global environment to the environment returned by setup-env:

(define global-env (setup-env))

Introduction to Part 2

The central component of our interpreter is the evaluator. This is the component that takes a Racket expression and an environment and evaluates the expression in the context of that particular environment. We will do this evaluation by first determining what type of expression we are given, and then dealing with each type on a case-by-case basis, much like we did with the F♭ evaluator we built in Haskell.

The most basic types of expressions are the self-evaluating expressions. Self-evaluating expressions are expressions that evaluate to be themselves regardless of the environment. In F♭, these were Boolean, Integer, and Function. In our evaluator, booleans (#t and #f), numbers (e.g. 5, -56, 3.1415) and strings (e.g. "hello", "this is a string") will be self-evaulating. Functions will not be self-evaluating in our interpreter.

We will deal with other types of expressions, such as the special forms define and set! later in Part 2. We will postpone implementing the remaining special forms (if, begin, cond, lambda, let, etc.) until Parts 4 and 5. Our interpreter will begin to be able to evaluate procedure applications in Part 3, and we will complete the evaluation of procedure applications in Part 5.

As always, you can refer to the the summary page to get an overview of the functions you are working on in this part of the project.

Creating i-eval

Since the name eval has already been taken by the real Racket interpreter, we will call our evaluator i‑eval (short for interpreter‑evaluator).

We will start with self-evaluating expressions. Racket has primitive functions that can test whether or not an expression is one of the self-evaluating types: boolean?, number?, and string?. Since these are the only types of self-evaluating expressions in our language, we can write an evaluator to handle these fairly easily.

Read through the implementation I give below (which should now be in your interpreter.rkt file) and test it out on a few simple expressions to make sure you understand how it works.

(define i-eval
  (lambda (exp env)
    (cond
     ((boolean? exp) exp)
     ((number? exp) exp)
     ((string? exp) exp)
     (else (error "i-eval::unknown expression type" exp)))))

Now, we can create the (read‑eval‑print‑loop) which will read input from the user, evaluate the input in the global‑env, print the result, then loop. For example:

> (read-eval-print-loop)
INTERPRETER> 5
5
INTERPRETER> -3.14
-3.14
INTERPRETER> #f
#f
INTERPRETER> "a short string"
"a short string"
INTERPRETER> exit
INTERPRETER done.

An initial version of the read‑eval‑print‑loop function, shown below, is included in your interpreter.rkt file. You need to add the ability to allow the user to exit the interpreter by typing the command exit.

(define read-eval-print-loop
  (lambda ()
    (display "INTERPRETER> ")
    (let ((user-input (read)))
      (display (i-eval user-input global-env))
      (newline)
      (read-eval-print-loop))))

You will get tired of typing (read‑eval‑print‑loop), so you can use the shortcut (repl).

Implementing quote and quoted expressions

Your interpreter should now handle the self-evaluating boolean, number and string expressions. Now you will extend the range of expressions your interpreter can handle by allowing for quoted expressions. Recall that in Racket, you can create a quoted expression either by using the quote special form, or by using it's short-hand notation, the single quote:

> (quote this-is-quoted)
this-is-quoted
> 'this-is-also-quoted
this-is-also-quoted
> (quote (4 5 6))
(4 5 6)
> '(3.14 #t "hello")
(3.14 #t "hello")

Remember that a single-quote placed immediately before an expression is equivalent to using the full quote notation. The read function automatically expands an apostrophe to the quote notation, so our interpreter doesn't need to worry about expressions that start with single-quotes: handling expressions which use the full quote notation is sufficient.

You should write the procedure quoted? to test to see if an expression is quoted. Notice from the examples above that when a user enters a quoted expression, they are simply entering a list whose first element is the symbol quote. To test out, your function, try the following line, which will read one line from you and then return whether or not the line you entered was a quoted expression.

(quoted? (read))   ;See important note below

This will allow you test if the expressions you type are quoted. You should try a number of different inputs to make sure it works. The following should return #t: 'x, (quote x), '(quote x). The following should return #f: 7, "hello", #t

(Note: You can not simply say (quoted? 'x). In evaluating the expression (quoted? 'x), Racket evaluates 'x to be the symbol x and so the procedure quoted? is called with the symbol x -- which is no longer quoted; therefore, (quoted? 'x) ; ==> #f. To get the desired effect, you'd have to put your test case in a second quote: (quoted? ''x). When evaluating (quoted? ''x), Racket evaluates ''x, or (quote (quote x)), to be the list (quote x), which will return #t when passed to quoted?.

You should also write the procedure text-of-quotation which returns the text of a quoted expression. For example, the text of the quotation (quote x) is x. You can test this as follows:

(text-of-quotation (read)) ; try typing (quote x) or '(1 2 3) at the prompt

Extend i-eval to evaluate quoted expressions. You should use the functions quoted? and text‑of‑quotation which you just wrote.

(See the summary page for more details.)

Implementing define and variables

Your interpreter can now evaluate self-evaluating expressions as well as quoted expressions. You will now add the ability to define (and, in the next section, mutate) the environment by extending your interpreter to handle variable definitions. Notice that your interpreter's define special form returns the name of the variable you are defining:

> (read-eval-print-loop)
INTERPRETER> (define pi 3.1415)
pi
INTERPRETER> pi
3.1415
INTERPRETER> (define a (quote apple))
a
INTERPRETER> a
apple
INTERPRETER> (define j (quote jacks))
j
INTERPRETER> j
jacks

Fortunately, your work in Part 1 has already laid the necessary groundwork for implementing define. To begin the implementation, we will need to add a case to our cond statement in i‑eval to identify the expression type. You will write the tester function definition? for identifying define expressions.

The definition? tester function should end up looking a lot like the quoted? tester you just wrote. In fact, you will need to write many more tester functions which will have the same format. So, we will add a helper function, (tagged-list? exp tag), to capture this abstraction. The tagged-list? function will return #t whenever:

exp is a list, and
the list has a first element, and
the first element of exp is tag.

For example:

(tagged-list? '(define x 8) 'define) ;=> #t
(tagged-list? "some string" 'define) ;=> #f  (not a list)
(tagged-list? '(set! x 6) 'define)   ;=> #f  
(tagged-list? '(define x) 'define)   ;=> #t  ; See note below

Notice in the third case that tagged-list? returns true even though the expression is not a well-formed Racket definition. The procedure tagged-list? is only checking to see if the expression is a list and that it begins with the tag you specify. You will address this issue in the next section.

Using tagged-list?, write definition? (and go back and rewrite quoted?) in terms of tagged-list?.

(definition? '(define y 3)) ;=> #t
(definition? 5)             ;=> #f

We also need to build the selector functions for definitions, definition-variable and definition-value.

> (definition-variable '(define x 6))
'x
> (definition-value '(define x 6))
6

After recognizing that an expression is a definition, we need to evaluate the expression, and then update the environment to reflect the new definition. The special form define can be only be used to create a new binding in the most recently added frame of an environment: define cannot mutate a binding in the most recently added frame, but it can define a variable even if the variable is defined in another frame. If the variable binding already exists in the most recently added frame, you should report an error (e.g. duplicate definition for identifier).

However, before you place a binding for a variable into the environment, you will need to evaluate the third element of the define expression. For example, when you say (define a (quote (1 2 3))), a is not bound to (quote (1 2 3)); rather, (quote (1 2 3)) is first evaluated, and its result, (1 2 3), is stored as the binding for a.

Write the function (eval-definition exp env) which extracts the definition-value of the define statement from exp and evaluates it using your evaluator, i-eval, and then calls define-variable! with the definition-variable, the result of evaluating the definition-value, and the environment, env.

You will also need to write the above-mentioned function (define-variable! var val env) which checks to see if a binding already exists for var in the first frame of env. If so, report an error. Otherwise, if no binding exists for var in the first frame, we add the new binding to the first frame of the environment. (Notice that the parameter is val, not exp. This is because val has already been evaluated as part of eval-definition.)

To complete the implementation for define, you will need to make two changes to i-eval. First, i-eval must recognize and properly handle define expressions (by adding a case to your cond statement). Second, i-eval must recognize and deal with variables (also by adding a case to your cond statement). Below is the tester variable? needed to recognize variables.

(define variable?
   (lambda (exp)
     (symbol? exp)))

When you evaluate (using i-eval) an expression that is a variable, it should evaluate as its value in the environment. In Part 1 you wrote a procedure that finds the value of a variable in an environment, so you should use that to extend i-eval so that it properly evaluates variables.

Be sure to thoroughly test your code. Below are some tests you can use, but you should also add your own to your i-tests.rkt file:

>  (read-eval-print-loop)
INTERPRETER> (define x 3)
x
INTERPRETER> (define x 23)
error: duplicate definition for identifier
> (read-eval-print-loop)   ; See final section for explanation
INTERPRETER> x
3
INTERPRETER> (define q x)
q
INTERPRETER> q
3
INTERPRETER> (define m 'q)
m
INTERPRETER> m
q

Notice that in the expression (define q x), if you hadn't called i-eval on the value component of the define, q would have been bound to the symbol x rather than the value 3.

(See the summary page for more details.)

Implementing a simple syntax checker

To implement a simple syntax checker, you will write two new functions.

(tagged-list-length-n? exp tag n) will return true only if exp is a tagged-list with tag tag and exp is a list of exactly n elements.
(tagged-list-min-length-n? exp tag n) will return true only if exp is a tagged-list with tag tag and exp is a list of at least n elements.
For both functions, if exp is a tagged list with tag tag but does not meet the proper length requirements, you should return an approriate error. For some ideas about what your error message should look like, try things like (quote 3 5) or (define x 3 5) in Racket.

By replacing your calls to tagged‑list? with calls to tagged‑list‑length‑n? or tagged‑list‑min‑length‑n? (depending on which is necessary for a given expression type), your interpreter will be able to catch many simple syntax errors. For example, since define always requires exactly 3 parameters, you should replace (tagged‑list? expr 'define) with (tagged‑list‑length‑n? expr 'define 3).

Remember that as you complete future parts of the interpreter, you will want to continue to use these length-verifying tagged-list? variants instead of the original tagged-list? procedure.

Implementing set!

Congratulations! You have now completed implementing define. You should find that implementing set! is relatively straightforward now. In Racket, set! can only be used on a variable that has a previous binding in the environment. We search the environment (starting with the most recently added frame) until we find such a binding. Once we find a previous binding, we mutate it to reflect the new value. If we do not find any binding, we return an error.

You will need to implement each of these functions (and update i-eval appropriately) to handle set!. You will find most of these to be similar to the ones that you previously wrote for define:

assignment?
eval-assignment
assignment-variable
assignment-value
set-variable-value!

Here are some simple tests which behave exactly as they do in Racket with one exception: our versions of define and set! display the name of the variable that was defined/mutated.

> (read-eval-print-loop)
INTERPRETER> (define x 5)
x
INTERPRETER> (define y x)
y
INTERPRETER> (set! y 10)
y
INTERPRETER> y
10
INTERPRETER> x
5
INTERPRETER> (set! z 5)
set! cannot set undefined identifier:  z
> (read-eval-print-loop)   ; See final section for explanation
INTERPRETER> x
5

Note that for each new feature we incorporate into our interpreter, we will go through the same steps: creating testers, selectors, controllers, and sometimes helpers, and then updating i-eval.

(See the summary page for more details.)

Exiting the interpreter

As you saw from the final two lines of the example interactions of set! listed above, when we exit the read-eval-print-loop, either by explicity typing exit or causing a crash, the global environment still contains all of the bindings we created. This means that we can just run read-eval-print-loop again, and all of our bindings will be in place. This is a nice feature for testing, because if read-eval-print-loop crashes, you can resume with the same environment in place. However, it may also confuse you if this is not what you are expecting after cleanly exiting the read-eval-print-loop.

Modify read-eval-print-loop so that it allows users the option of resetting the global environment back to its initial state. If the user types exit, stop the read-eval-print-loop as you are currently doing -- without resetting the global-env. However, if the user types exit! (that's exit with an exclamation point), stop the read-eval-print-loop and also set! the global-env back to the environment returned by setup-env.

Interpreter Project

Part 2: Evaluation and Environment Modification quote, define, set!

Quick link: Summary page

Part 2: Evaluation and Environment Modification
`quote, define, set!`