Lecture #11, Tuesday, October 11th
==================================
 Implementing Lexical Scope: Closures and Environments
 Fixing an Overlooked Bug

# Implementing Lexical Scope: Closures and Environments
So how do we fix this?
Lets go back to the root of the problem: the new evaluator does not
behave in the same way as the substituting evaluator. In the old
evaluator, it was easy to see how functions can behave as objects that
remember values. For example, when we do this:
{with {x 1}
{fun {y}
{+ x y}}}
the result was a function value, which actually was the syntax object
for this:
{fun {y} {+ 1 y}}
Now if we call this function from someplace else like:
{with {f {with {x 1} {fun {y} {+ x y}}}}
{with {x 2}
{call f 3}}}
it is clear what the result will be: f is bound to a function that adds
1 to its input, so in the above the later binding for `x` has no effect
at all.
But with the caching evaluator, the value of
{with {x 1}
{fun {y}
{+ x y}}}
is simply:
{fun {y} {+ x y}}
and there is no place where we save the 1  *that's* the root of our
problem. (That's also what makes people suspect that using `lambda` in
Racket and any other functional language involves some inefficient
coderecompiling magic.) In fact, we can verify that by inspecting the
returned value, and see that it does contain a free identifier.
Clearly, we need to create an object that contains the body and the
argument list, like the function syntax object  but we don't do any
substitution, so in addition to the body an argument name(s) we need to
remember that we still need to substitute `x` by `1` . This means that
the pieces of information we need to know are:
 formal argument(s): {y}
 body: {+ x y}
 pending substitutions: [1/x]
and that last bit has the missing `1`. The resulting object is called a
`closure` because it closes the function body over the substitutions
that are still pending (its environment).
So, the first change is in the value of functions which now need all
these pieces, unlike the `Fun` case for the syntax object.
A second place that needs changing is the when functions are called.
When we're done evaluating the `call` arguments (the function value and
the argument value) but before we apply the function we have two
*values*  there is no more use for the current substitution cache at
this point: we have finished dealing with all substitutions that were
necessary over the current expression  we now continue with
evaluating the body of the function, with the new substitutions for the
formal arguments and actual values given. But the body itself is the
same one we had before  which is the previous body with its suspended
substitutions that we *still* did not do.
Rewrite the evaluation rules  all are the same except for evaluating
a `fun` form and a `call` form:
eval(N,sc) = N
eval({+ E1 E2},sc) = eval(E1,sc) + eval(E2,sc)
eval({ E1 E2},sc) = eval(E1,sc)  eval(E2,sc)
eval({* E1 E2},sc) = eval(E1,sc) * eval(E2,sc)
eval({/ E1 E2},sc) = eval(E1,sc) / eval(E2,sc)
eval(x,sc) = lookup(x,sc)
eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
eval({fun {x} E},sc) = <{fun {x} E}, sc>
eval({call E1 E2},sc1)
= eval(Ef,extend(x,eval(E2,sc1),sc2))
if eval(E1,sc1) = <{fun {x} Ef}, sc2>
= error! otherwise
As a side note, these substitution caches are a little more than "just a
cache" now  they actually hold an *environment* of substitutions in
which expression should be evaluated. So we will switch to the common
*environment* name now.:
eval(N,env) = N
eval({+ E1 E2},env) = eval(E1,env) + eval(E2,env)
eval({ E1 E2},env) = eval(E1,env)  eval(E2,env)
eval({* E1 E2},env) = eval(E1,env) * eval(E2,env)
eval({/ E1 E2},env) = eval(E1,env) / eval(E2,env)
eval(x,env) = lookup(x,env)
eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
eval({fun {x} E},env) = <{fun {x} E}, env>
eval({call E1 E2},env1)
= eval(Ef,extend(x,eval(E2, env1),env2))
if eval(E1,env1) = <{fun {x} Ef}, env2>
= error! otherwise
In case you find this easier to follow, the "flat algorithm" for
evaluating a `call` is:
1. f := evaluate E1 in env1
2. if f is not a <{fun ...},...> closure then error!
3. x := evaluate E2 in env1
4. new_env := extend env_of(f) by mapping arg_of(f) to x
5. evaluate (and return) body_of(f) in new_env
Note how the scoping rules that are implied by this definition match the
scoping rules that were implied by the substitutionbased rules. (It
should be possible to prove that they are the same.)
The changes to the code are almost trivial, except that we need a way to
represent `<{fun {x} Ef}, env>` pairs.

The implication of this change is that we now cannot use the same type
for function syntax and function values since function values have more
than just syntax. There is a simple solution to this  we never do any
substitutions now, so we don't need to translate values into expressions
 we can come up with a new type for values, separate from the type of
abstract syntax trees.
When we do this, we will also fix our hack of using FLANG as the type of
values: this was merely a convenience since the AST type had cases for
all kinds of values that we needed. (In fact, you should have noticed
that Racket does this too: numbers, strings, booleans, etc are all used
by both programs and syntax representation (sexpressions)  but note
that function values are *not* used in syntax.) We will now implement a
separate `VAL` type for runtime values.
First, we need now a type for such environments  we can use `Listof`
for this:
;; a type for environments:
(definetype ENV = (Listof (List Symbol VAL)))
but we can just as well define a new type for environment values:
(definetype ENV
[EmptyEnv]
[Extend Symbol VAL ENV])
Reimplementing `lookup` is now simple:
(: lookup : Symbol ENV > VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
(cases env
[(EmptyEnv) (error 'lookup "no binding for ~s" name)]
[(Extend id val restenv)
(if (eq? id name) val (lookup name restenv))]))
... we don't need `extend` because we get `Extend` from the type
definition, and we also get `(EmptyEnv)` instead of `emptysubst`.
We now use this with the new type for values  two variants of these:
(definetype VAL
[NumV Number]
[FunV Symbol FLANG ENV]) ; argname, body, scope
And now the new implementation of `eval` which uses the new type and
implements lexical scope:
(: eval : FLANG ENV > VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
(cases expr
[(Num n) (NumV n)]
[(Add l r) (arithop + (eval l env) (eval r env))]
[(Sub l r) (arithop  (eval l env) (eval r env))]
[(Mul l r) (arithop * (eval l env) (eval r env))]
[(Div l r) (arithop / (eval l env) (eval r env))]
[(With boundid namedexpr boundbody)
(eval boundbody
(Extend boundid (eval namedexpr env) env))]
[(Id name) (lookup name env)]
[(Fun boundid boundbody)
(FunV boundid boundbody env)]
[(Call funexpr argexpr)
(let ([fval (eval funexpr env)])
(cases fval
[(FunV boundid boundbody fenv)
(eval boundbody
(Extend boundid (eval argexpr env) fenv))]
[else (error 'eval "`call' expects a function, got: ~s"
fval)]))]))
We also need to update `arithop` to use `VAL` objects. The full code
follows  it now passes all tests, including the example that we used
to find the problem.
;;; <<>>
;; The Flang interpreter, using environments
#lang pl
#
The grammar:
::=
 { + }
 {  }
 { * }
 { / }
 { with { } }

 { fun { } }
 { call }
Evaluation rules:
eval(N,env) = N
eval({+ E1 E2},env) = eval(E1,env) + eval(E2,env)
eval({ E1 E2},env) = eval(E1,env)  eval(E2,env)
eval({* E1 E2},env) = eval(E1,env) * eval(E2,env)
eval({/ E1 E2},env) = eval(E1,env) / eval(E2,env)
eval(x,env) = lookup(x,env)
eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
eval({fun {x} E},env) = <{fun {x} E}, env>
eval({call E1 E2},env1)
= eval(Ef,extend(x,eval(E2,env1),env2))
if eval(E1,env1) = <{fun {x} Ef}, env2>
= error! otherwise
#
(definetype FLANG
[Num Number]
[Add FLANG FLANG]
[Sub FLANG FLANG]
[Mul FLANG FLANG]
[Div FLANG FLANG]
[Id Symbol]
[With Symbol FLANG FLANG]
[Fun Symbol FLANG]
[Call FLANG FLANG])
(: parsesexpr : Sexpr > FLANG)
;; parses sexpressions into FLANGs
(define (parsesexpr sexpr)
(match sexpr
[(number: n) (Num n)]
[(symbol: name) (Id name)]
[(cons 'with more)
(match sexpr
[(list 'with (list (symbol: name) named) body)
(With name (parsesexpr named) (parsesexpr body))]
[else (error 'parsesexpr "bad `with' syntax in ~s" sexpr)])]
[(cons 'fun more)
(match sexpr
[(list 'fun (list (symbol: name)) body)
(Fun name (parsesexpr body))]
[else (error 'parsesexpr "bad `fun' syntax in ~s" sexpr)])]
[(list '+ lhs rhs) (Add (parsesexpr lhs) (parsesexpr rhs))]
[(list ' lhs rhs) (Sub (parsesexpr lhs) (parsesexpr rhs))]
[(list '* lhs rhs) (Mul (parsesexpr lhs) (parsesexpr rhs))]
[(list '/ lhs rhs) (Div (parsesexpr lhs) (parsesexpr rhs))]
[(list 'call fun arg)
(Call (parsesexpr fun) (parsesexpr arg))]
[else (error 'parsesexpr "bad syntax in ~s" sexpr)]))
(: parse : String > FLANG)
;; parses a string containing a FLANG expression to a FLANG AST
(define (parse str)
(parsesexpr (string>sexpr str)))
;; Types for environments, values, and a lookup function
(definetype ENV
[EmptyEnv]
[Extend Symbol VAL ENV])
(definetype VAL
[NumV Number]
[FunV Symbol FLANG ENV])
(: lookup : Symbol ENV > VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
(cases env
[(EmptyEnv) (error 'lookup "no binding for ~s" name)]
[(Extend id val restenv)
(if (eq? id name) val (lookup name restenv))]))
(: NumV>number : VAL > Number)
;; convert a FLANG runtime numeric value to a Racket one
(define (NumV>number val)
(cases val
[(NumV n) n]
[else (error 'arithop "expected a number, got: ~s" val)]))
(: arithop : (Number Number > Number) VAL VAL > VAL)
;; gets a Racket numeric binary operator, and uses it within a NumV
;; wrapper
(define (arithop op val1 val2)
(NumV (op (NumV>number val1) (NumV>number val2))))
(: eval : FLANG ENV > VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
(cases expr
[(Num n) (NumV n)]
[(Add l r) (arithop + (eval l env) (eval r env))]
[(Sub l r) (arithop  (eval l env) (eval r env))]
[(Mul l r) (arithop * (eval l env) (eval r env))]
[(Div l r) (arithop / (eval l env) (eval r env))]
[(With boundid namedexpr boundbody)
(eval boundbody
(Extend boundid (eval namedexpr env) env))]
[(Id name) (lookup name env)]
[(Fun boundid boundbody)
(FunV boundid boundbody env)]
[(Call funexpr argexpr)
(let ([fval (eval funexpr env)])
(cases fval
[(FunV boundid boundbody fenv)
(eval boundbody
(Extend boundid (eval argexpr env) fenv))]
[else (error 'eval "`call' expects a function, got: ~s"
fval)]))]))
(: run : String > Number)
;; evaluate a FLANG program contained in a string
(define (run str)
(let ([result (eval (parse str) (EmptyEnv))])
(cases result
[(NumV n) n]
[else (error 'run "evaluation returned a nonnumber: ~s"
result)])))
;; tests
(test (run "{call {fun {x} {+ x 1}} 4}")
=> 5)
(test (run "{with {add3 {fun {x} {+ x 3}}}
{call add3 1}}")
=> 4)
(test (run "{with {add3 {fun {x} {+ x 3}}}
{with {add1 {fun {x} {+ x 1}}}
{with {x 3}
{call add1 {call add3 x}}}}}")
=> 7)
(test (run "{with {identity {fun {x} x}}
{with {foo {fun {x} {+ x 1}}}
{call {call identity foo} 123}}}")
=> 124)
(test (run "{with {x 3}
{with {f {fun {y} {+ x y}}}
{with {x 5}
{call f 4}}}}")
=> 7)
(test (run "{call {with {x 3}
{fun {y} {+ x y}}}
4}")
=> 7)
(test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
{with {x 100}
{call f 4}}}")
=> 7)
(test (run "{call {call {fun {x} {call x 1}}
{fun {x} {fun {y} {+ x y}}}}
123}")
=> 124)

# Fixing an Overlooked Bug
Incidentally, this version fixes a bug we had previously in the
substitution version of FLANG:
(run "{with {f {fun {y} {+ x y}}}
{with {x 7}
{call f 1}}}")
This bug was due to our naive `subst`, which doesn't avoid capturing
renames. But note that since that version of the evaluator makes its way
from the outside in, there is no difference in semantics for *valid*
programs  ones that don't have free identifiers.
(Reminder: This was *not* a dynamically scoped language, just a bug that
happened when `x` wasn't substituted away before `f` was replaced with
something that refers to `x`.)