The homework is for the same pairs by default. Email me if you have any problems or issues.
The language for this homework is:
In this homework you will implement a Flang-like language by introducing a new preprocessor step into the interpreter. This step will do two things: it will be similar to a compiler pass in that it will translate one language to another (a lower-level one); and it will be used to add a new feature to the language, where the feature is also implemented by translating it to the lower-level language. More specifically, the preprocessor will translate bindings in the source program to a different representation that uses de-Bruijn indexes (see the class notes for reference), and then you will extend it to translate functions and function calls with more than one argument.
As is commonly done by compiler passes, you will need to keep track of
the lexical scope of the code that is preprocessed. This will be done
by adding an environment argument to the preprocessor which will keep
track of the bindings. Note that this is not the same as the
runtime environment argument in
eval, it only keeps track of the
names that are currently in scope — there are no runtime values at
There is also a small twist here: the program processor will use a functional environment for this — implemented using real Racket mapping functions instead of association lists.
Begin your work with the Flang interpreter that uses environments.
First of all, rename the language to avoid possible confusion. Since this is going to be a Flang variation that uses de-Bruijn indexes, we name this interpreter “Brang”. Edit the file accordingly.
As mentioned above, the transformation from source to de-Bruijn indexes
that we will implement is essentially a new compilation step: it
translates one language (Brang) to a different language, which in this
case is quite similar except for identifier references. This language
will be our “core” language in that it represents what we execute at the
low level. We need a new type definition for this language — use
CORE for its name. Create this language as a copy of
that all of its variant names are prefixed by a “
The two languages are identical at this point, but they will drift apart in several important ways, and will end up being quite different. Using the fact that one language represents user input code, and the other represents what our core evaluator knows, we will get the flexibility of being able to develop each part independently of the other. Generally speaking, the BNF will be updated for the actual language, and on the other hand, the formal evaluation rules will be updated to follow the core language (so they’d end up incomplete since they don’t show how the user language runs). Ideally, both would get updated, but we skip it to make things easier.
To complete the new type definition, you need to modify it to fit its
first purpose of using de-Bruijn indexes instead of identifier names.
There are several things to change: first, instead of
CId, it should
CRef for identifier references — and it will have a
non-negative integer index, and for that you will need to use the
Natural type. Continue by eliminating all other mentions of
identifier names in the AST (for example,
CFun is going to be a
one-argument constructor). Note that “AST” in this context is
technically incorrect, since it’s no longer an abstract representation
of the input syntax, but an internal language.
The preprocessed programs that we will be running will have no
identifiers, as they will all be translated into de-Bruijn references.
You can therefore remove the code that deals with run-time environments:
ENV type definition, and the
You still need some definition for an
ENV type because closures still
hold environments, but the only thing we need in an environment is a
list of values — when we encounter a reference, it will tell us what
position in this list is needed. Define
ENV as the appropriate type.
Before we change
eval, it is worthwhile to go over our evaluation
rules, which can guide us through changing
eval. There are two
changes that are required:
First, we do not have identifiers, and we do not use
Instead, when we encounter a
CRef, we use it as an index to pull out
a value from the environment, which is now a simple list of values.
In other words, we now have this rule:
or write this more conveniently as:
(Note that the evaluation rules are slightly bogus, since they still use concrete syntax for an internal language that has no concrete syntax. But we still keep this for convenience.)
The second change is in the way environments are extended.
Previously, we used
extend in the formal rules, which was a function
of three arguments. Replace this with something appropriate.
You can now reflect these changes in the
eval code. First, change all
BRANG variant names to the
CORE variants (in types too). Note that
there is no
CId — you have to use
Next, adjust the rest of the
eval code as you did with the formal
rules: specifically, you will need to replace
Extend according to the
ENV type. The new
eval code should be a little simpler than the
original. There will be a few other parts of the code that will need to
The basic idea of the preprocessor is to scan the input program, and
convert identifiers into
CRef values. This would be a mostly-routine
function that translates
BRANG values to
CORE values. But there is
one important thing: when we encounter an identifier, we need to know
the “shape” of the current scope — a mapping from identifiers to their
de-Bruijn indexes. In other words, a kind of a compile-time
You need to implement this new environment type — and the twist is
that you should use functional values for this. These environments map
identifiers to non-negative integers (
Natural), so we get this type
definition (call it
DE-ENV; fill in the “?“s):
To complete the implementation, you need to implement
the empty environment, and
de-extend that consumes a DE-ENV and a
symbol, and returns the extended environment. Here are a few
observations about the behavior of these
de-empty-envis a vacuous mapping — it always throws an error.
(de-extend e x)should map x to
(de-extend e y)should map x to n+1.
For example, the following tests should all work:
(These tests can be useful while you implement this functionality, but
your submission should be tested only via
run, as usual.)
Using the above, implementing the preprocessor should be easy. As said
above, it is a simple recursive scan of its input that translates a
BRANG value (and a
DE-ENV mapping) to the corresponding
value. The only interesting cases are ones that deal with scope: the
two cases where a new scope is introduced, and dealing with identifiers.
Finally, you need to connect the preprocessor and the evaluator in
run: it should now use
preprocess over the parsed input (and an
empty static environment), and the result is sent to
eval with an
empty run-time environment (which is now just an empty list).
You should have a working evaluator at this point. This would be a good time to add enough tests to cover your code.
Now that everything is working, we can work on the two parts
independently. Begin with a change to the
CORE type and the
evaluator: there is no need for a
CWith at the core — in class,
we’ve seen that we can get local bindings with a simple function
application. So for this part, begin by removing
CWith variant from
CORE definition, and remove the corresponding
eval case. To get
back to a working evaluator, you need to extend the preprocessor code
Withs to the appropriate combination of
Notice how having a preprocessor allows changing the core evaluator implementation, with no changes to the language that users see.
We can also do the other side of this: extend the language, but deal with the extension through the preprocessor, so the core language doesn’t need to change.
To demonstrate this, extend the language with
fun forms of more than
one argument and a corresponding extension of
call forms with more
than one argument. Do this by translating them to curried functions and
curried function calls. You will need to begin by changing the BNF, the
BRANG type definition, and the parser.
<foo> <foo> ... means one or more
this points to one convenient representation: have one field hold a
“thing”, and another field hold a list of “more things”. It avoids the
need for a list that can never be empty. However, for the actual type
definition you may prefer to have a single field with a “non-empty list
of things” — it works fine even without some “non-empty-list-of” type
because assuming that the list is non-empty is risking only runtime
errors (and you will know that the list is never empty).
Finally, you need to extend the
preprocess function so it deals with
translating away functions and applications of more than a single
argument. When you do this, avoid using
match over the lists (it
would be convenient, but there are some technical problems with it).
Also, note that the translation can be done in two ways: either
translate the input to a different
BRANG and continue, or translate to
CORE. One of these will usually be more convenient.
Remember to add tests for the new feature.