Homework #12: Turbo Toy

Administrative

The language for this homework is:

#lang pl 12

In this homework you will apply the changes that were discussed in class to the Toy language implementation, and then extend it with two new features. To begin your work, get the Toy source code which should be used as the basis for your work. You will implement a recursive binder called bindrec, a special form for variable assignment, multiple body expressions, and functions that receive their arguments by reference. The next homework will follow-up on this one.

The grade value of each part is indicated in the following sections. If you did not manage to do some parts, indicate it clearly at the top of your submission file.

5%Boxes

As discussed in class, the first step toward implementing recursive environments is changing the implementation of environments so they hold boxes of values instead of plain values. Begin by changing the type — you need to change the FRAME type definition so that instead of VAL it uses (Boxof VAL), do this and fix the rest of the code so it works with the new implementation.

As we have seen, this is a pretty easy task, but to make it even easier, here’s a summary of things you should do after the above change:

Change extend: it still receives VALs, so it should wrap them in boxes when it creates the new frame. To do this, rename extend to raw-extend which should have exactly the same code but slightly different type (make sure you rename the values input argument, and the symbol used in the error message). Then make a new one-liner extend that simply wraps the input values in boxes and sends them to raw-extend. Note: to keep it a one-liner, you need to use (inst box VAL) — this makes Typed Racket treat box as value that has a type of (VAL -> (Boxof VAL)), otherwise it cannot infer the right type.
lookup’s definition should not change, so it now returns a different type: fix its type declaration to indicate this. (This will be useful later on when we implement assignment.)
The global-environment should also be fixed so it is initialized with boxes of values. (It will be cleaner to make racket-func->prim-val create a box, instead of doing it yourself for each use.)
Finally, the type of lookup changed, so you should change the single place that uses it accordingly.

Completing this part is needed for the rest of the homework, which will build up on it.

Note: the tests that are provided with the Toy implementation provide complete coverage, make sure that this is still the case. Also, as you continue extending the interpreter, make sure that you maintain complete coverage.

20%Variable Assignment

Now that we have an environment where names are associated with mutable boxes, we can use that to implement variable mutation. To do this, you need to extend the language with a new special form: set!. The syntax of this form is: {set! <id> <TOY>}. Add this to the BNF, to the TOY type definition (use Set for the name of the new variant), and to the parse code. To implement the semantics of the new form, add a Set case to the eval function which will use lookup to retrieve the box that holds the value of the named identifier, then evaluate the expression and use set-box! to store that in the box.

As discussed in class, when you implement the semantics of set!, you will have to decide what value to return as a result of evaluating these forms. The three choices that we have seen in class are:

Return the new value, making it possible to “chain” assignments, and use assignment expressions for their values, for example

{set! x {set! y 0}}

will set both x and y to zero, and

{if {> {set! x {- x 1}} 0} ... ...}

decrements x and then checks if it’s bigger than zero.
Return the old value, which makes it possible to do something with the value that is now overwritten, for example

{set! x {set! y x}}

will swap x and y, and

{display {car {set! l {rest l}}}}

“pops” a value from l and prints it.
Return some bogus value that cannot really be used for anything, which will make sure that expressions that are used for side-effect (only assignments for now) are not confused with other expressions.

The last option is the most strict, but it helps keeping things clear so we will use it.

In class, we have used a random number value for that, but that was a kludge. To do this properly, you need a new kind of value that is absolutely useless: add a new BogusV variant to the VAL type definition (it should not consume any inputs), and use that as the return value for evaluating Set syntaxes. As a minor optimization, define the-bogus-value as a BogusV value and use that instead of creating new bogosities every time one is needed.

Be sure to add some test cases to check that your modification works. This will be tricky at this point, since bind and fun bodies have just one expression. Hint: you can bind a name to an expression, and never use it. For example, in Racket:

(let ([x (display "foo")]) 2)

Use a similar approach for testing assignments.

25%Recursive Binders

The next extension is bindrec: a recursive binder form, which will be implemented as shown in class. First, add it to the BNF, it looks just like bind:

<TOY> ::= ...
| { bind {{ <id> <TOY> } ... } <TOY>}
| { bindrec {{ <id> <TOY> } ... } <TOY>}
| ...

and continue by extending the TOY type definition with a new BindRec variant.

The next step is to extend the parser: do not follow the bad cut-and-paste habits that were used in class, instead use the fact that parsing a bindrec is exactly the same as parsing a bind. The new code in parse-sexpr can use an or pattern to match two symbols, and it can use an and pattern to get a pattern variable bound to the symbol. To make this easy, here is a template for the resulting code:

[(cons (and binder (or 'bind 'bindrec)) more)
(match sexpr
[(list _ (list (list (symbol: names) (sexpr: nameds)) ...) body)
(if (unique-list? names)
(Bind names (map parse-sexpr nameds) (parse-sexpr body))
(error 'parse-sexpr "duplicate `bind' names: ~s"
names))]
[else (error 'parse-sexpr "bad `bind' syntax in ~s"
sexpr)])]

where the three places that need to be modified are marked in red.

Now to the important part: the change to eval which should implement the semantics of the new form. To make sure that your implementation is correct, add this test case:

(test (run "{bindrec {{fact {fun {n}
{if {= 0 n}
1
{* n {fact {- n 1}}}}}}}
{fact 5}}")
=> 120)

As shown in class, you will need to write a new function, extend-rec,

(: extend-rec : (Listof Symbol) (Listof TOY) ENV -> ENV)
;; extends an environment with a new recursive frame
(define (extend-rec names exprs env)
fill-in)

that consumes unevaluated expressions, creates the new environment with the identifiers bound to temporary values, then evaluates the expressions in the new environment and modifies the bindings appropriately. Instead of using some random number value for the temporary bindings, use the above the-bogus-value that is bound to an instance of BogusV.

In general, there are various approaches that you can use to correctly implement extend-rec — the actual implementation is up to you. A few Racket features that we have seen in class and can be useful for your code are:

let* is a form that has the same syntax as let, but each new binding is in the scope of the previous one. For example:

(let* ([a 2]
[b (+ a 3)]
[c (* a b)])
(+ b c))

is valid, since it is just a short-hand form of

(let ([a 2])
(let ([b (+ a 3)])
(let ([c (* a b)])
(+ b c))))
for-each is a function that is similar to map, except that it does not construct a list of results. You should use it to iterate some side-effect over some list(s).
Note that you can also create the new frame as usual (using extend) and then set-box! the values in place. This is a little slower since it will require looking up the names during this process, but this is a minor (practically constant) cost.

In any case, the code should use extend to make it shorter. Also, make sure that your implementation handles mutually-recursive functions too.

25%Multiple Body Expressions

Given that our language has side-effects now, it makes sense to have multiple expressions in places where a body expression is expected. There are three such places now: bind, bindrec, fun.

To implement this:

Extend the BNF and the TOY type definition, then adjust the parse-sexpr function accordingly. Remember that a body of expressions requires at least one expression (it doesn’t make much sense to have no body). It easiest to use a single Listof in the type and make the parser reject a bad expression as a syntax error, so in the rest of the code you can rely on the list being non-empty. Some trickiness is required to deal with the match patterns, so as an example, here is how the fun parse code should look like:

(match sexpr
[(list 'fun (list (symbol: names) ...)
body0 body ...)
(if (unique-list? names)
(Fun names (map parse-sexpr (cons body0 body)))
(error 'parse-sexpr "duplicate `fun' names: ~s" names))]
[else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])

Use this in the binders case too.
You will also need to extend the FunV value type since a closure should now hold a list of expressions.
Write a helper function called eval-body that simply uses eval to evaluate a list of expressions. There is a decision to make here: what should the value of evaluating a body of multiple expressions return. Like in Racket, we will choose to return the last value.

The declaration of eval-body should look like this:

(: eval-body : (Listof TOY) ENV -> VAL)
;; evaluates a list of expressions, returns the last value
(define (eval-body exprs env)
fill-in)

Be careful not to collect a list of all results in your implementation.
Finally, use eval-body in the right places (again, you’ll have three of these).

To test your code for both this and for assignments, you can change your earlier assignment test (the one with the redundant binding just to be able to test the side-effect), and you should also use these two tests:

(test (run "{bind {{make-counter
{fun {}
{bind {{c 0}}
{fun {}
{set! c {+ 1 c}}
c}}}}}
{bind {{c1 {make-counter}}
{c2 {make-counter}}}
{* {c1} {c1} {c2} {c1}}}}")
=> 6)
(test (run "{bindrec {{foo {fun {}
{set! foo {fun {} 2}}
1}}}
{+ {foo} {* 10 {foo}}}}")
=> 21)

25%By-Reference Function Arguments

Finally, you will extend the language so that it is possible to pass variables by reference, instead of the usual “by value”. In some languages, it is possible to mark some variables as “by reference” with some keyword (& in C++, var in Pascal, and ByRef in Visual Basic). C++ programmers that know C too will tell you that if you look under the hood, a by-reference argument is really just a pointer: we will use a similar approach here, only we will use boxes instead. (Note for C programmers: you can view a box as a pointer to its value.)

To make this a little easier we will not use a special mark for arguments. Instead, we will add a new kind of “by reference” functions that take all their arguments by reference: rfun — begin by adding this to the BNF (same as fun), and to the TOY type definition (as an RFun variant). Then modify the parser so it generates this new syntax type when it sees an rfun keyword in the input: do not copy the code, generalize the existing code for fun in a similar way that you did for the bind case above. You will also need to deal with evaluating RFun syntaxes — make them evaluate just like Funs, but the resulting closure should be distinguishable from plain closures: so add a new byref? boolean field to your FunV definition.

The last step is the fun part (“fun”, not “fun”): applying the new kind of functions. To apply a function, the code should still evaluate the body in a new environment — the new thing here is that the environment should be created in a different way if this is a by-reference function. You just need to add a new case for calling a FunV closure with a true byref? flag on, which is going to be simply this:

[(FunV names body fun-env by-ref?)
(if by-ref?
(eval-body body (raw-extend names
(get-boxes arg-exprs env)
fun-env))
same as before)]

Your job is to define get-boxes as a function that consumes the expressions and returns a suitable list of boxes. This requires the argument expressions to all be identifiers (otherwise it doesn’t make sense to send them to an rfun — in C terms, they must be lvalues, so you can take their address). get-boxes should check for this and throw a “non-identifier” error otherwise; the error should be phrased as an rfun error since get-boxes is just its utility.

Also, make the code avoid evaluating the function arguments if the function is an rfun, since there is no need to redundantly evaluate the argument in this case, or in case the where the applied value is not a function. Here are two test cases for this:

(test (run "{{rfun {x} x} {/ 4 0}}") =error> "non-identifier")
(test (run "{5 {/ 6 0}}") =error> "non-function")

Using this new functionality, you can play with some examples and make them into tests. Here’s one example that defines a swap function:

(test (run "{bind {{swap! {rfun {x y}
{bind {{tmp x}}
{set! x y}
{set! y tmp}}}}
{a 1}
{b 2}}
{swap! a b}
{+ a {* 10 b}}}")
=> 12)