PL: Lecture #10  Tuesday, February 5th
(text file)

Implementing Lexical Scope: Closures and Environments

So how do we fix this?

Lets go back to the root of the problem: the new evaluator does not behave in the same way as the substituting evaluator. In the old evaluator, it was easy to see how functions can behave as objects that remember values. For example, when we do this:

{with {x 1}
  {fun {y}
    {+ x y}}}

the result was a function value, which actually was the syntax object for this:

{fun {y} {+ 1 y}}

Now if we call this function from someplace else like:

{with {f {with {x 1} {fun {y} {+ x y}}}}
  {with {x 2}
    {call f 3}}}

it is clear what the result will be: f is bound to a function that adds 1 to its input, so in the above the later binding for x has no effect at all.

But with the caching evaluator, the value of

{with {x 1}
  {fun {y}
    {+ x y}}}

is simply:

{fun {y} {+ x y}}

and there is no place where we save the 1 — that’s the root of our problem. (That’s also what makes people suspect that using lambda in Racket and any other functional language involves some inefficient code-recompiling magic.) In fact, we can verify that by inspecting the returned value, and see that it does contain a free identifier.

Clearly, we need to create an object that contains the body and the argument list, like the function syntax object — but we don’t do any substitution, so in addition to the body an argument name(s) we need to remember that we still need to substitute x by 1 . This means that the pieces of information we need to know are:

- formal argument(s):    {y}
- body:                  {+ x y}
- pending substitutions: [1/x]

and that last bit has the missing 1. The resulting object is called a closure because it closes the function body over the substitutions that are still pending (its environment).

So, the first change is in the value of functions which now need all these pieces, unlike the Fun case for the syntax object.

A second place that needs changing is the when functions are called. When we’re done evaluating the call arguments (the function value and the argument value) but before we apply the function we have two values — there is no more use for the current substitution cache at this point: we have finished dealing with all substitutions that were necessary over the current expression — we now continue with evaluating the body of the function, with the new substitutions for the formal arguments and actual values given. But the body itself is the same one we had before — which is the previous body with its suspended substitutions that we still did not do.

Rewrite the evaluation rules — all are the same except for evaluating a fun form and a call form:

eval(N,sc)                = N
eval({+ E1 E2},sc)        = eval(E1,sc) + eval(E2,sc)
eval({- E1 E2},sc)        = eval(E1,sc) - eval(E2,sc)
eval({* E1 E2},sc)        = eval(E1,sc) * eval(E2,sc)
eval({/ E1 E2},sc)        = eval(E1,sc) / eval(E2,sc)
eval(x,sc)                = lookup(x,sc)
eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
eval({fun {x} E},sc)      = <{fun {x} E}, sc>
eval({call E1 E2},sc1)
        = eval(Ef,extend(x,eval(E2,sc1),sc2))
                          if eval(E1,sc1) = <{fun {x} Ef}, sc2>
        = error!          otherwise

As a side note, these substitution caches are a little more than “just a cache” now — they actually hold an environment of substitutions in which expression should be evaluated. So we will switch to the common environment name now.:

eval(N,env)                = N
eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
eval(x,env)                = lookup(x,env)
eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
eval({fun {x} E},env)      = <{fun {x} E}, env>
eval({call E1 E2},env1)
        = eval(Ef,extend(x,eval(E2, env1),env2))
                          if eval(E1,env1) = <{fun {x} Ef}, env2>
        = error!          otherwise

In case you find this easier to follow, the “flat algorithm” for evaluating a call is:

1. f := evaluate E1 in env1
2. if f is not a <{fun ...},...> closure then error!
3. x := evaluate E2 in env1
4. new_env := extend env_of(f) by mapping arg_of(f) to x
5. evaluate (and return) body_of(f) in new_env

Note how the scoping rules that are implied by this definition match the scoping rules that were implied by the substitution-based rules. (It should be possible to prove that they are the same.)

The changes to the code are almost trivial, except that we need a way to represent <{fun {x} Ef}, env> pairs.


The implication of this change is that we now cannot use the same type for function syntax and function values since function values have more than just syntax. There is a simple solution to this — we never do any substitutions now, so we don’t need to translate values into expressions — we can come up with a new type for values, separate from the type of abstract syntax trees.

When we do this, we will also fix our hack of using FLANG as the type of values: this was merely a convenience since the AST type had cases for all kinds of values that we needed. (In fact, you should have noticed that Racket does this too: numbers, strings, booleans, etc are all used by both programs and syntax representation (s-expressions) — but note that function values are not used in syntax.) We will now implement a separate VAL type for runtime values.

First, we need now a type for such environments — we can use Listof for this:

;; a type for environments:
(define-type ENV = (Listof (List Symbol VAL)))

but we can just as well define a new type for environment values:

(define-type ENV
  [EmptyEnv]
  [Extend Symbol VAL ENV])

Reimplementing lookup is now simple:

(: lookup : Symbol ENV -> VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
  (cases env
    [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
    [(Extend id val rest-env)
    (if (eq? id name) val (lookup name rest-env))]))

… we don’t need extend because we get Extend from the type definition, and we also get (EmptyEnv) instead of empty-subst.

We now use this with the new type for values — two variants of these:

(define-type VAL
  [NumV Number]
  [FunV Symbol FLANG ENV]) ; arg-name, body, scope

And now the new implementation of eval which uses the new type and implements lexical scope:

(: eval : FLANG ENV -> VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
  (cases expr
    [(Num n) (NumV n)]
    [(Add l r) (arith-op + (eval l env) (eval r env))]
    [(Sub l r) (arith-op - (eval l env) (eval r env))]
    [(Mul l r) (arith-op * (eval l env) (eval r env))]
    [(Div l r) (arith-op / (eval l env) (eval r env))]
    [(With bound-id named-expr bound-body)
    (eval bound-body
          (Extend bound-id (eval named-expr env) env))]
    [(Id name) (lookup name env)]
    [(Fun bound-id bound-body)
    (FunV bound-id bound-body env)]
    [(Call fun-expr arg-expr)
    (let ([fval (eval fun-expr env)])
      (cases fval
        [(FunV bound-id bound-body f-env)
          (eval bound-body
                (Extend bound-id (eval arg-expr env) f-env))]
        [else (error 'eval "`call' expects a function, got: ~s"
                            fval)]))]))

We also need to update arith-op to use VAL objects. The full code follows — it now passes all tests, including the example that we used to find the problem.

;; The Flang interpreter, using environments

#lang pl

#|
The grammar:
  <FLANG> ::= <num>
            | { + <FLANG> <FLANG> }
            | { - <FLANG> <FLANG> }
            | { * <FLANG> <FLANG> }
            | { / <FLANG> <FLANG> }
            | { with { <id> <FLANG> } <FLANG> }
            | <id>
            | { fun { <id> } <FLANG> }
            | { call <FLANG> <FLANG> }

Evaluation rules:
  eval(N,env)                = N
  eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
  eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
  eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
  eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
  eval(x,env)                = lookup(x,env)
  eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
  eval({fun {x} E},env)      = <{fun {x} E}, env>
  eval({call E1 E2},env1)
          = eval(Ef,extend(x,eval(E2,env1),env2))
                            if eval(E1,env1) = <{fun {x} Ef}, env2>
          = error!          otherwise
|#

(define-type FLANG
  [Num  Number]
  [Add  FLANG FLANG]
  [Sub  FLANG FLANG]
  [Mul  FLANG FLANG]
  [Div  FLANG FLANG]
  [Id  Symbol]
  [With Symbol FLANG FLANG]
  [Fun  Symbol FLANG]
  [Call FLANG FLANG])

(: parse-sexpr : Sexpr -> FLANG)
;; parses s-expressions into FLANGs
(define (parse-sexpr sexpr)
  (match sexpr
    [(number: n)    (Num n)]
    [(symbol: name) (Id name)]
    [(cons 'with more)
    (match sexpr
      [(list 'with (list (symbol: name) named) body)
        (With name (parse-sexpr named) (parse-sexpr body))]
      [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
    [(cons 'fun more)
    (match sexpr
      [(list 'fun (list (symbol: name)) body)
        (Fun name (parse-sexpr body))]
      [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
    [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
    [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
    [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
    [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
    [(list 'call fun arg)
                      (Call (parse-sexpr fun) (parse-sexpr arg))]
    [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

(: parse : String -> FLANG)
;; parses a string containing a FLANG expression to a FLANG AST
(define (parse str)
  (parse-sexpr (string->sexpr str)))

;; Types for environments, values, and a lookup function

(define-type ENV
  [EmptyEnv]
  [Extend Symbol VAL ENV])

(define-type VAL
  [NumV Number]
  [FunV Symbol FLANG ENV])

(: lookup : Symbol ENV -> VAL)
;; lookup a symbol in an environment, return its value or throw an
;; error if it isn't bound
(define (lookup name env)
  (cases env
    [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
    [(Extend id val rest-env)
    (if (eq? id name) val (lookup name rest-env))]))

(: NumV->number : VAL -> Number)
;; convert a FLANG runtime numeric value to a Racket one
(define (NumV->number val)
  (cases val
    [(NumV n) n]
    [else (error 'arith-op "expected a number, got: ~s" val)]))

(: arith-op : (Number Number -> Number) VAL VAL -> VAL)
;; gets a Racket numeric binary operator, and uses it within a NumV
;; wrapper
(define (arith-op op val1 val2)
  (NumV (op (NumV->number val1) (NumV->number val2))))

(: eval : FLANG ENV -> VAL)
;; evaluates FLANG expressions by reducing them to values
(define (eval expr env)
  (cases expr
    [(Num n) (NumV n)]
    [(Add l r) (arith-op + (eval l env) (eval r env))]
    [(Sub l r) (arith-op - (eval l env) (eval r env))]
    [(Mul l r) (arith-op * (eval l env) (eval r env))]
    [(Div l r) (arith-op / (eval l env) (eval r env))]
    [(With bound-id named-expr bound-body)
    (eval bound-body
          (Extend bound-id (eval named-expr env) env))]
    [(Id name) (lookup name env)]
    [(Fun bound-id bound-body)
    (FunV bound-id bound-body env)]
    [(Call fun-expr arg-expr)
    (let ([fval (eval fun-expr env)])
      (cases fval
        [(FunV bound-id bound-body f-env)
          (eval bound-body
                (Extend bound-id (eval arg-expr env) f-env))]
        [else (error 'eval "`call' expects a function, got: ~s"
                            fval)]))]))

(: run : String -> Number)
;; evaluate a FLANG program contained in a string
(define (run str)
  (let ([result (eval (parse str) (EmptyEnv))])
    (cases result
      [(NumV n) n]
      [else (error 'run "evaluation returned a non-number: ~s"
                  result)])))

;; tests
(test (run "{call {fun {x} {+ x 1}} 4}")
      => 5)
(test (run "{with {add3 {fun {x} {+ x 3}}}
              {call add3 1}}")
      => 4)
(test (run "{with {add3 {fun {x} {+ x 3}}}
              {with {add1 {fun {x} {+ x 1}}}
                {with {x 3}
                  {call add1 {call add3 x}}}}}")
      => 7)
(test (run "{with {identity {fun {x} x}}
              {with {foo {fun {x} {+ x 1}}}
                {call {call identity foo} 123}}}")
      => 124)
(test (run "{with {x 3}
              {with {f {fun {y} {+ x y}}}
                {with {x 5}
                  {call f 4}}}}")
      => 7)
(test (run "{call {with {x 3}
                    {fun {y} {+ x y}}}
                  4}")
      => 7)
(test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
              {with {x 100}
                {call f 4}}}")
      => 7)
(test (run "{call {call {fun {x} {call x 1}}
                        {fun {x} {fun {y} {+ x y}}}}
                  123}")
      => 124)