PL: Lecture #6  Tuesday, January 21st

Evaluation of with (contd.)

Reminder:

Formal Specs

Note the formal definitions that were included in the WAE code. They are ways of describing pieces of our language that are more formal than plain English, but still not as formal (and as verbose) as the actual code.

A formal definition of subst:

(N is a <num>, E1, E2 are <WAE>s, x is some <id>, y is a different <id>)

N[v/x]                = N

{+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}

{- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}

{* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}

{/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}

y[v/x]                = y
x[v/x]                = v

{with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]}
{with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}

And a formal definition of eval:

eval(N)        = N

eval({+ E1 E2}) = eval(E1) + eval(E2)

eval({- E1 E2}) = eval(E1) - eval(E2)

eval({* E1 E2}) = eval(E1) * eval(E2)

eval({/ E1 E2}) = eval(E1) / eval(E2)

eval(id)        = error!

eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

Lazy vs Eager Evaluation

As we have previously seen, there are two basic approaches for evaluation: either eager or lazy. In lazy evaluation, bindings are used for sort of textual references — it is only for avoiding writing an expression twice, but the associated computation is done twice anyway. In eager evaluation, we eliminate not only the textual redundancy, but also the computation.

Which evaluation method did our evaluator use? The relevant piece of formalism is the treatment of with:

eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

And the matching piece of code is:

[(With bound-id named-expr bound-body)
(eval (subst bound-body
              bound-id
              (Num (eval named-expr))))]

How do we make this lazy?

In the formal equation:

eval({with {x E1} E2}) = eval(E2[E1/x])

and in the code:

(: eval : WAE -> Number)
;; evaluates WAE expressions by reducing them to numbers
(define (eval expr)
  (cases expr
    [(Num n) n]
    [(Add l r) (+ (eval l) (eval r))]
    [(Sub l r) (- (eval l) (eval r))]
    [(Mul l r) (* (eval l) (eval r))]
    [(With bound-id named-expr bound-body)
    (eval (subst bound-body
                  bound-id
                  named-expr))] ;*** no eval and no Num wrapping
    [(Id name) (error 'eval "free identifier: ~s" name)]))

We can verify the way this works by tracing eval (compare the trace you get for the two versions):

> (trace eval) ; (put this in the definitions window)
> (run "{with {x {+ 1 2}} {* x x}}")

Ignoring the traces for now, the modified WAE interpreter works as before, specifically, all tests pass. So the question is whether the language we get is actually different than the one we had before. One difference is in execution speed, but we can’t really notice a difference, and we care more about meaning. Is there any program that will run differently in the two languages?

The main feature of the lazy evaluator is that it is not evaluating the named expression until it is actually needed. As we have seen, this leads to duplicating computations if the bound identifier is used more than once — meaning that it does not eliminate the dynamic redundancy. But what if the bound identifier is not used at all? In that case the named expression simply evaporates. This is a good hint at an expression that behaves differently in the two languages — if we add division to both languages, we get a different result when we try running:

{with {x {/ 8 0}} 7}

The eager evaluator stops with an error when it tries evaluating the division — and the lazy evaluator simply ignores it.

Even without division, we get a similar behavior for

{with {x y} 7}

but it is questionable whether the fact that this evaluates to 7 is correct behavior — we really want to forbid program that use free variable.

Furthermore, there is an issue with name capturing — we don’t want to substitute an expression into a context that captures some of its free variables. But our substitution allows just that, which is usually not a problem because by the time we do the substitution, the named expression should not have free variables that need to be replaced. However, consider evaluating this program:

{with {y x}
  {with {x 2}
    {+ x y}}}

under the two evaluation regimens: the eager version stops with an error, and the lazy version succeed. This points at a bug in our substitution, or rather not dealing with an issue that we do not encounter.

So the summary is: as long as the initial program is correct, both evaluation regimens produce the same results. If a program contains free variables, they might get captured in a naive lazy evaluator implementation (but this is a bug that should be fixed). Also, there are some cases where eager evaluation runs into a run-time problem which does not happen in a lazy evaluator because the expression is not used. It is possible to prove that when you evaluate an expression, if there is an error that can be avoided, lazy evaluation will always avoid it, whereas an eager evaluator will always run into it. On the other hand, lazy evaluators are usually slower than eager evaluator, so it’s a speed vs. robustness trade-off.

Note that with lazy evaluation we say that an identifier is bound to an expression rather than a value. (Again, this is why the eager version needed to wrap eval’s result in a Num and this one doesn’t.)

(It is possible to change things and get a more well behaved substitution, we basically will need to find if a capture might happen, and rename things to avoid it. For example,

{with {y E1} E2}[v/x]
  if `x' and `y' are equal
    = {with {y E1[v/x]} E2} = {with {x E1[v/x]} E2}
  if `y' has a free occurrence in `v'
    = {with {y1 E1[v/x]} E2[y1/y][v/x]} ; `y1' is "fresh"
  otherwise
    = {with {y E1[v/x]} E2[v/x]}

With this, we might have gone through this path in evaluating the above:

{with {y x} {with {x 2} {+ x y}}}
{with {x₁ 2} {+ x₁ x}} ; note that x₁ is a fresh name, not x
{+ 2 x}
error: free `x`

But you can see that this is much more complicated (more code: requires a free-in predicate, being able to invent new fresh names, etc). And it’s not even the end of that story…)

de Bruijn Indexes

This whole story revolves around names, specifically, name capture is a problem that should always be avoided (it is one major source of PL headaches).

But are names the only way we can use bindings?

There is a least one alternative way: note that the only thing we used names for are for references. We don’t really care what the name is, which is pretty obvious when we consider the two WAE expressions:

{with {x 5} {+ x x}}
{with {y 5} {+ y y}}

or the two Racket function definitions:

(define (foo x) (list x x))
(define (foo y) (list y y))

Both of these show a pair of expressions that we should consider as equal in some sense (this is called “alpha-equality”). The only thing we care about is what variable points where: the binding structure is the only thing that matters. In other words, as long as DrRacket produces the same arrows when we use Check Syntax, we consider the program to be the same, regardless of name choices (for argument names and local names, not for global names like foo in the above).

The alternative idea uses this principle: if all we care about is where the arrows go, then simply get rid of the names… Instead of referencing a binding through its name, just specify which of the surrounding scopes we want to refer to. For example, instead of:

{with {x 5} {with {y 6} {+ x y}}}

we can use a new “reference” syntax — [N] — and use this instead of the above:

{with 5 {with 6 {+ [1] [0]}}}

So the rules for [N] are — [0] is the value bound in the current scope, [1] is the value from the next one up etc.

Of course, to do this translation, we have to know the precise scope rules. Two more complicated examples:

{with {x 5} {+ x {with {y 6} {+ x y}}}}

is translated to:

{with 5 {+ [0] {with 6 {+ [1] [0]}}}}

(note how x appears as a different reference based on where it appeared in the original code.) Even more subtle:

{with {x 5} {with {y {+ x 1}} {+ x y}}}

is translated to:

{with 5 {with {+ [0] 1} {+ [1] [0]}}}

because the inner with does not have its own named expression in its scope, so the named expression is immediately in the scope of the outer with.

This is called “de Bruijn Indexes”: instead of referencing identifiers by their name, we use an index into the surrounding binding context. The major disadvantage, as can be seen in the above examples, is that it is not convenient for humans to work with. Specifically, the same identifier is referenced using different numbers, which makes it hard to understand what some code is doing. After all, abstractions are the main thing we deal with when we write programs, and having labels make the bindings structure much easier to understand than scope counts.

However, practically all compilers use this for compiled code (think about stack pointers). For example, GCC compiles this code:

{
  int x = 5;
  {
    int y = x + 1;
    return x + y;
  }
}

to:

subl $8, %esp
movl $5, -4(%ebp)  ; int x = 5
movl -4(%ebp), %eax
incl %eax
movl %eax, -8(%ebp) ; int y = %eax
movl -8(%ebp), %eax
addl -4(%ebp), %eax