Implementing Lexical Scope: Closures and Environments
So how do we fix this?
Lets go back to the root of the problem: the new evaluator does not
behave in the same way as the substituting evaluator. In the old
evaluator, it was easy to see how functions can behave as objects that
remember values. For example, when we do this:
{with {x 1}
{fun {y}
{+ x y}}}
the result was a function value, which actually was the syntax object
for this:
{fun {y} {+ 1 y}}
Now if we call this function from someplace else like:
{with {f {with {x 1} {fun {y} {+ x y}}}}
{with {x 2}
{call f 3}}}
it is clear what the result will be: f is bound to a function that adds
1 to its input, so in the above the later binding for x
has no effect
at all.
But with the caching evaluator, the value of
{with {x 1}
{fun {y}
{+ x y}}}
is simply:
{fun {y} {+ x y}}
and there is no place where we save the 1 — that’s the root of our
problem. (That’s also what makes people suspect that using lambda
in
Racket and any other functional language involves some inefficient
code-recompiling magic.) In fact, we can verify that by inspecting the
returned value, and see that it does contain a free identifier.
Clearly, we need to create an object that contains the body and the
argument list, like the function syntax object — but we don’t do any
substitution, so in addition to the body an argument name(s) we need to
remember that we still need to substitute x
by 1
. This means that
the pieces of information we need to know are:
- formal argument(s): {y}
- body: {+ x y}
- pending substitutions: [1/x]
and that last bit has the missing 1
. The resulting object is called a
closure
because it closes the function body over the substitutions
that are still pending (its environment).
So, the first change is in the value of functions which now need all
these pieces, unlike the Fun
case for the syntax object.
A second place that needs changing is the when functions are called.
When we’re done evaluating the call
arguments (the function value and
the argument value) but before we apply the function we have two
values — there is no more use for the current substitution cache at
this point: we have finished dealing with all substitutions that were
necessary over the current expression — we now continue with
evaluating the body of the function, with the new substitutions for the
formal arguments and actual values given. But the body itself is the
same one we had before — which is the previous body with its suspended
substitutions that we still did not do.
Rewrite the evaluation rules — all are the same except for evaluating
a fun
form and a call
form:
eval(N,sc) = N
eval({+ E1 E2},sc) = eval(E1,sc) + eval(E2,sc)
eval({- E1 E2},sc) = eval(E1,sc) - eval(E2,sc)
eval({* E1 E2},sc) = eval(E1,sc) * eval(E2,sc)
eval({/ E1 E2},sc) = eval(E1,sc) / eval(E2,sc)
eval(x,sc) = lookup(x,sc)
eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
eval({fun {x} E},sc) = <{fun {x} E}, sc>
eval({call E1 E2},sc1)
= eval(Ef,extend(x,eval(E2,sc1),sc2))
if eval(E1,sc1) = <{fun {x} Ef}, sc2>
= error! otherwise
As a side note, these substitution caches are a little more than “just a
cache” now — they actually hold an environment of substitutions in
which expression should be evaluated. So we will switch to the common
environment name now.:
eval(N,env) = N
eval({+ E1 E2},env) = eval(E1,env) + eval(E2,env)
eval({- E1 E2},env) = eval(E1,env) - eval(E2,env)
eval({* E1 E2},env) = eval(E1,env) * eval(E2,env)
eval({/ E1 E2},env) = eval(E1,env) / eval(E2,env)
eval(x,env) = lookup(x,env)
eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
eval({fun {x} E},env) = <{fun {x} E}, env>
eval({call E1 E2},env1)
= eval(Ef,extend(x,eval(E2, env1),env2))
if eval(E1,env1) = <{fun {x} Ef}, env2>
= error! otherwise
In case you find this easier to follow, the “flat algorithm” for
evaluating a call
is:
1. f := evaluate E1 in env1
2. if f is not a <{fun ...},...> closure then error!
3. x := evaluate E2 in env1
4. new_env := extend env_of(f) by mapping arg_of(f) to x
5. evaluate (and return) body_of(f) in new_env
Note how the scoping rules that are implied by this definition match the
scoping rules that were implied by the substitution-based rules. (It
should be possible to prove that they are the same.)
The changes to the code are almost trivial, except that we need a way to
represent <{fun {x} Ef}, env>
pairs.
The implication of this change is that we now cannot use the same type
for function syntax and function values since function values have more
than just syntax. There is a simple solution to this — we never do
any substitutions now, so we don’t need to translate values into
expressions — we can come up with a new type for values, separate from
the type of abstract syntax trees.
When we do this, we will also fix our hack of using FLANG as the type of
values: this was merely a convenience since the AST type had cases for
all kinds of values that we needed. (In fact, you should have noticed
that Racket does this too: numbers, strings, booleans, etc are all used
by both programs and syntax representation (s-expressions) — but note
that function values are not used in syntax.) We will now implement a
separate VAL
type for runtime values.
First, we need now a type for such environments — we can use Listof
for this:
;; a type for environments:
(define-type ENV = (Listof (List Symbol VAL)))
but we can just as well define a new type for environment values:
(define-type ENV
[EmptyEnv]
[Extend Symbol VAL ENV])
Reimplementing lookup
is now simple:
(: lookup : Symbol ENV -> VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
(cases env
[(EmptyEnv) (error 'lookup "no binding for ~s" name)]
[(Extend id val rest-env)
(if (eq? id name) val (lookup name rest-env))]))
… we don’t need extend
because we get Extend
from the type
definition, and we also get (EmptyEnv)
instead of empty-subst
.
We now use this with the new type for values — two variants of these:
(define-type VAL
[NumV Number]
[FunV Symbol FLANG ENV]) ; arg-name, body, scope
And now the new implementation of eval
which uses the new type and
implements lexical scope:
(: eval : FLANG ENV -> VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
(cases expr
[(Num n) (NumV n)]
[(Add l r) (arith-op + (eval l env) (eval r env))]
[(Sub l r) (arith-op - (eval l env) (eval r env))]
[(Mul l r) (arith-op * (eval l env) (eval r env))]
[(Div l r) (arith-op / (eval l env) (eval r env))]
[(With bound-id named-expr bound-body)
(eval bound-body
(Extend bound-id (eval named-expr env) env))]
[(Id name) (lookup name env)]
[(Fun bound-id bound-body)
(FunV bound-id bound-body env)]
[(Call fun-expr arg-expr)
(let ([fval (eval fun-expr env)])
(cases fval
[(FunV bound-id bound-body f-env)
(eval bound-body
(Extend bound-id (eval arg-expr env) f-env))]
[else (error 'eval "`call' expects a function, got: ~s"
fval)]))]))
We also need to update arith-op
to use VAL
objects. The full code
follows — it now passes all tests, including the example that we used
to find the problem.
▶;; The Flang interpreter, using environments
#lang pl
#|
The grammar:
<FLANG> ::= <num>
| { + <FLANG> <FLANG> }
| { - <FLANG> <FLANG> }
| { * <FLANG> <FLANG> }
| { / <FLANG> <FLANG> }
| { with { <id> <FLANG> } <FLANG> }
| <id>
| { fun { <id> } <FLANG> }
| { call <FLANG> <FLANG> }
Evaluation rules:
eval(N,env) = N
eval({+ E1 E2},env) = eval(E1,env) + eval(E2,env)
eval({- E1 E2},env) = eval(E1,env) - eval(E2,env)
eval({* E1 E2},env) = eval(E1,env) * eval(E2,env)
eval({/ E1 E2},env) = eval(E1,env) / eval(E2,env)
eval(x,env) = lookup(x,env)
eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
eval({fun {x} E},env) = <{fun {x} E}, env>
eval({call E1 E2},env1)
= eval(Ef,extend(x,eval(E2,env1),env2))
if eval(E1,env1) = <{fun {x} Ef}, env2>
= error! otherwise
|#
(define-type FLANG
[Num Number]
[Add FLANG FLANG]
[Sub FLANG FLANG]
[Mul FLANG FLANG]
[Div FLANG FLANG]
[Id Symbol]
[With Symbol FLANG FLANG]
[Fun Symbol FLANG]
[Call FLANG FLANG])
(: parse-sexpr : Sexpr -> FLANG)
;; parses s-expressions into FLANGs
(define (parse-sexpr sexpr)
(match sexpr
[(number: n) (Num n)]
[(symbol: name) (Id name)]
[(cons 'with more)
(match sexpr
[(list 'with (list (symbol: name) named) body)
(With name (parse-sexpr named) (parse-sexpr body))]
[else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
[(cons 'fun more)
(match sexpr
[(list 'fun (list (symbol: name)) body)
(Fun name (parse-sexpr body))]
[else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
[(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
[(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
[(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
[(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
[(list 'call fun arg)
(Call (parse-sexpr fun) (parse-sexpr arg))]
[else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))
(: parse : String -> FLANG)
;; parses a string containing a FLANG expression to a FLANG AST
(define (parse str)
(parse-sexpr (string->sexpr str)))
;; Types for environments, values, and a lookup function
(define-type ENV
[EmptyEnv]
[Extend Symbol VAL ENV])
(define-type VAL
[NumV Number]
[FunV Symbol FLANG ENV])
(: lookup : Symbol ENV -> VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
(cases env
[(EmptyEnv) (error 'lookup "no binding for ~s" name)]
[(Extend id val rest-env)
(if (eq? id name) val (lookup name rest-env))]))
(: NumV->number : VAL -> Number)
;; convert a FLANG runtime numeric value to a Racket one
(define (NumV->number val)
(cases val
[(NumV n) n]
[else (error 'arith-op "expected a number, got: ~s" val)]))
(: arith-op : (Number Number -> Number) VAL VAL -> VAL)
;; gets a Racket numeric binary operator, and uses it within a NumV
;; wrapper
(define (arith-op op val1 val2)
(NumV (op (NumV->number val1) (NumV->number val2))))
(: eval : FLANG ENV -> VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
(cases expr
[(Num n) (NumV n)]
[(Add l r) (arith-op + (eval l env) (eval r env))]
[(Sub l r) (arith-op - (eval l env) (eval r env))]
[(Mul l r) (arith-op * (eval l env) (eval r env))]
[(Div l r) (arith-op / (eval l env) (eval r env))]
[(With bound-id named-expr bound-body)
(eval bound-body
(Extend bound-id (eval named-expr env) env))]
[(Id name) (lookup name env)]
[(Fun bound-id bound-body)
(FunV bound-id bound-body env)]
[(Call fun-expr arg-expr)
(let ([fval (eval fun-expr env)])
(cases fval
[(FunV bound-id bound-body f-env)
(eval bound-body
(Extend bound-id (eval arg-expr env) f-env))]
[else (error 'eval "`call' expects a function, got: ~s"
fval)]))]))
(: run : String -> Number)
;; evaluate a FLANG program contained in a string
(define (run str)
(let ([result (eval (parse str) (EmptyEnv))])
(cases result
[(NumV n) n]
[else (error 'run "evaluation returned a non-number: ~s"
result)])))
;; tests
(test (run "{call {fun {x} {+ x 1}} 4}")
=> 5)
(test (run "{with {add3 {fun {x} {+ x 3}}}
{call add3 1}}")
=> 4)
(test (run "{with {add3 {fun {x} {+ x 3}}}
{with {add1 {fun {x} {+ x 1}}}
{with {x 3}
{call add1 {call add3 x}}}}}")
=> 7)
(test (run "{with {identity {fun {x} x}}
{with {foo {fun {x} {+ x 1}}}
{call {call identity foo} 123}}}")
=> 124)
(test (run "{with {x 3}
{with {f {fun {y} {+ x y}}}
{with {x 5}
{call f 4}}}}")
=> 7)
(test (run "{call {with {x 3}
{fun {y} {+ x y}}}
4}")
=> 7)
(test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
{with {x 100}
{call f 4}}}")
=> 7)
(test (run "{call {call {fun {x} {call x 1}}
{fun {x} {fun {y} {+ x y}}}}
123}")
=> 124)