
Using infinite lists is similar to using channels (a tool for
synchronizing threads and passing values between them; see Rob Pike’s
talk) and generators (as they exist in Python). Here are examples of
both. Note how similar they are to each other, and how similar they are
to the above definition of `primes`. (But note that there is an
important difference. Can you see it? It has to do with whether a
stream is reusable or not.)

First, the threads & channels version:

#lang racket

(define-syntax-rule (bg expr ...) (thread (lambda () expr ...)))

(define nats
  (let ([out (make-channel)])
    (define (loop i) (channel-put out i) (loop (add1 i)))
    (bg (loop 1))
    out))

(define (divides? n m)
  (zero? (modulo m n)))

(define (filter pred c)
  (define out (make-channel))
  (define (loop)
    (let ([x (channel-get c)])
      (when (pred x) (channel-put out x))
      (loop)))
  (bg (loop))
  out)

(define (sift n c)
  (filter (lambda (x) (not (divides? n x))) c))

(define (sieve c)
  (define out (make-channel))
  (define (loop c)
    (define first (channel-get c))
    (channel-put out first)
    (loop (sift first c)))
  (bg (loop c))
  out)

(define primes
  (begin (channel-get nats) (sieve nats)))

(define (take n c)
  (if (zero? n) null (cons (channel-get c) (take (sub1 n) c))))

(take 10 primes)

And here is the generator version:

#lang racket

(require racket/generator)

(define nats
  (generator ()
    (define (loop i)
      (yield i)
      (loop (add1 i)))
    (loop 1)))

(define (divides? n m)
  (zero? (modulo m n)))

(define (filter pred g)
  (generator ()
    (define (loop)
      (let ([x (g)])
        (when (pred x) (yield x))
        (loop)))
    (loop)))

(define (sift n g)
  (filter (lambda (x) (not (divides? n x))) g))

(define (sieve g)
  (define (loop g)
    (define first (g))
    (yield first)
    (loop (sift first g)))
  (generator () (loop g)))

(define primes
  (begin (nats) (sieve nats)))

(define (take n g)
  (if (zero? n) null (cons (g) (take (sub1 n) g))))

(take 10 primes)
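
One way to see the reusability difference hinted at earlier is to read the same prefix twice from a lazy stream and from a generator. The following is only a small sketch: the names `nats-stream`, `nats-gen`, `stream-prefix`, and `gen-prefix` are made up for this illustration and are not part of the code above.

```racket
#lang racket
(require racket/generator)

;; A lazy stream of naturals: each cell is computed once and then kept,
;; so the same stream value can be traversed again from its start.
(define nats-stream
  (let loop ([i 1])
    (stream-cons i (loop (add1 i)))))

;; A generator of naturals: every call consumes the next value.
(define nats-gen
  (generator ()
    (let loop ([i 1])
      (yield i)
      (loop (add1 i)))))

;; Hypothetical helpers: collect the first n items from each kind of source.
(define (stream-prefix s n)
  (for/list ([i (in-range n)]) (stream-ref s i)))

(define (gen-prefix g n)
  (for/list ([i (in-range n)]) (g)))

(define stream-pass-1 (stream-prefix nats-stream 3)) ; '(1 2 3)
(define stream-pass-2 (stream-prefix nats-stream 3)) ; '(1 2 3) again
(define gen-pass-1    (gen-prefix nats-gen 3))       ; '(1 2 3)
(define gen-pass-2    (gen-prefix nats-gen 3))       ; '(4 5 6): state moved on
```

The channel version behaves like the generator here: a value that was read from a channel is gone, so two readers of the same channel split the values between them rather than each seeing all of them.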

Finally, note that when we request different parts of `primes`, the
same calls are not repeated. This indicates that our language
implements “call by need” rather than “call by name”: once an expression
is forced, its value is remembered, so subsequent uses of this value
do not require further computation.

Using “call by name” means that we actually pass around expressions rather than values, which can lead to confusing code. An old programming language that used this is Algol. A confusing example that demonstrates this evaluation strategy is:

#lang algol60
begin
  integer procedure SIGMA(x, i, n);
    value n;
    integer x, i, n;
  begin
    integer sum;
    sum := 0;
    for i := 1 step 1 until n do
      sum := sum + x;
    SIGMA := sum;
  end;
  integer q;
  printnln(SIGMA(q*2-1, q, 7));
end

`x` and `i` are arguments that are passed by name, which means that they
can use the same memory location. This is called *aliasing*, a problem
that happens when pointers are involved (e.g., pointers in C and
`reference` arguments in C++). The code, BTW, is called “Jensen’s
device”.
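
To see why the Algol program sums the odd numbers 1 through 13, here is a rough simulation of call by name in eager Racket. It is only a sketch: `sigma`, the thunk for `x`, and the setter for `i` are stand-ins invented for this illustration, since Racket itself has no by-name parameters.

```racket
#lang racket

;; Call by name simulated with functions: `x` is a thunk that re-evaluates
;; the argument expression on every use, and `set-i!` is a setter that
;; stands in for assigning to the by-name loop variable.
(define (sigma x set-i! n)
  (for/fold ([sum 0]) ([k (in-range 1 (add1 n))])
    (set-i! k)      ; assigning to `i` really assigns to the caller's `q`
    (+ sum (x))))   ; so each use of `x` sees the current `q`

(define q 0)

(define result
  (sigma (lambda () (- (* q 2) 1)) ; the unevaluated expression q*2-1
         (lambda (v) (set! q v))   ; by-name access to q
         7))
;; result is 1 + 3 + 5 + 7 + 9 + 11 + 13 = 49
```

The loop variable and `q` are the same location, so every iteration changes the value that the `x` expression computes.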

Another interesting behavior that we can now observe is that the TOY
evaluation rule for `with`:

eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

is specifying an eager evaluator *only if* the language that this rule
is written in is itself eager. Indeed, if we run the TOY interpreter in
Lazy Racket (or other interpreters we have implemented), we can verify
that running:

(run "{bind {{x {/ 1 0}}} 1}")

is perfectly fine: the call to Racket’s division is done in the evaluation of the TOY division expression, but since Lazy Racket is lazy, if this value is never used then we never get to do the division! On the other hand, if we evaluate

(run "{bind {{x {/ 1 0}}} {+ x 1}}")

we do get an error when DrRacket tries to display the result, which
forces strictness. Note how the arrows in DrRacket that show where the
computation is are quite confusing: the computation seems to go directly
to the point of the arithmetic operations (`arith-op`) since the rest of
the evaluation that the evaluator performed was already done, and
succeeded. The actual failure happens when we try to force the
resulting promise, which contains only the strict points in our code.

Generally, we know how lazy evaluation works when we use the substitution model. We even know that if we have:

{bind {{x y}}
  {bind {{y 2}}
    {+ x y}}}

then the result should be an error because we cannot substitute the `y`
expression in: doing so would capture the `y`, changing the binding
structure. As an indication, the original expression contains a free
reference to `y`, which is exactly why we shouldn’t substitute it. But
what about:

{bind {{x {+ 4 5}}}
  {bind {{y {+ x x}}}
    {bind {{z y}}
      {bind {{x 4}}
        z}}}}

Evaluating this eagerly returns 18. We therefore expect any other
evaluation (eager or lazy, using substitutions or environments) to
return 18 too, because none of these options should change the
meaning of numbers, of addition, *or* of the scoping rules. (And we
know that no matter what evaluation strategy we choose, if we get to a
value (no infinite loop or exception) then it’ll always be the same
value.) For example, try using lazy evaluation with substitutions:

{bind {{x {+ 4 5}}}
  {bind {{y {+ x x}}}
    {bind {{z y}}
      {bind {{x 4}}
        z}}}}
-->
{bind {{y {+ {+ 4 5} {+ 4 5}}}}
  {bind {{z y}}
    {bind {{x 4}}
      z}}}
-->
{bind {{z {+ {+ 4 5} {+ 4 5}}}}
  {bind {{x 4}}
    z}}
-->
{bind {{x 4}}
  {+ {+ 4 5} {+ 4 5}}}
-->
{+ {+ 4 5} {+ 4 5}}
-->
{+ 9 9}
-->
18

And what about lazy evaluation using environments:

{bind {{x {+ 4 5}}}
  {bind {{y {+ x x}}}
    {bind {{z y}}
      {bind {{x 4}}
        z}}}}           []
-->
{bind {{y {+ x x}}}
  {bind {{z y}}
    {bind {{x 4}}
      z}}}              [x:={+ 4 5}]
-->
{bind {{z y}}
  {bind {{x 4}}
    z}}                 [x:={+ 4 5}, y:={+ x x}]
-->
{bind {{x 4}}
  z}                    [x:={+ 4 5}, y:={+ x x}, z:=y]
-->
z                       [x:=4, y:={+ x x}, z:=y]
-->
y                       [x:=4, y:={+ x x}, z:=y]
-->
{+ x x}                 [x:=4, y:={+ x x}, z:=y]
-->
{+ 4 4}                 [x:=4, y:={+ x x}, z:=y]
-->
8                       [x:=4, y:={+ x x}, z:=y]

We have a problem! This problem should be familiar by now: it is very
similar to the problem that led us down the mistaken path of dynamic
scoping when we tried to have first-class functions. In both cases,
substitution always worked, and it looks like in both cases the problem
is that we don’t remember the environment of an expression: in the case
of functions, it is the environment at the time of creating the closure
that we want to capture and use when we go back later to evaluate the
body of the function. Here we have a similar situation, except that we
don’t need a function to defer computation: *most* expressions get
evaluated at some time in the future, so every time we defer such a
computation we need to remember the lexical environment of the
expression.

This is the major point that will make things work again: every
expression creates something like a closure — an object that closes
over an expression and an environment at the (lexical) place where that
expression was used, and when we actually want to evaluate it later, we
need to do it in the right lexical context. So it is like a closure
except it doesn’t need to be applied, and there are no arguments. In
fact it is also a form of a closure — instead of closing over a
function body and an environment, it closes over any expression and an
environment. (As we shall see, lazy evaluation is tightly related to
using nullary functions: *thunks*.)
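
The difference between the two results (8 and 18) can be reproduced with a very small evaluator. This is a sketch under simplifying assumptions, not the Sloth interpreter: expressions are quoted S-expressions, `bind` takes a single name, and `delayed`, `lookup`, `ev`, and the `capture?` flag are names invented for the illustration.

```racket
#lang racket

;; A delayed expression, optionally used together with the environment
;; that was active where the expression was written.
(struct delayed (expr env))

(define (lookup x env)
  (cdr (or (assq x env) (error 'lookup "free identifier: ~s" x))))

;; capture? = #f: evaluate delayed expressions in the *current* environment
;; capture? = #t: evaluate them in the environment they closed over
(define (ev expr env capture?)
  (match expr
    [(? number? n) n]
    [(? symbol? x)
     (define d (lookup x env))
     (ev (delayed-expr d) (if capture? (delayed-env d) env) capture?)]
    [`(+ ,a ,b) (+ (ev a env capture?) (ev b env capture?))]
    [`(bind ,x ,e ,body)
     (ev body (cons (cons x (delayed e env)) env) capture?)]))

(define example
  '(bind x (+ 4 5)
     (bind y (+ x x)
       (bind z y
         (bind x 4
           z)))))

(define naive-result (ev example '() #f)) ; 8: the inner x leaks into y
(define fixed-result (ev example '() #t)) ; 18: y sees the x it was written with
```

The only change between the two modes is which environment is used when a delayed expression is finally evaluated, which is exactly the information that a closure keeps.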

So we implement this by creating such closure values for all expressions that are not evaluated right now. We begin with the Toy language, and rename it to “Sloth”. We then add one more case to the data type of values which implements the new kind of expression closures, which contains the expression and its environment:

(define-type VAL
  [RktV  Any]
  [FunV  (Listof Symbol) SLOTH ENV]
  [ExprV SLOTH ENV] ;*** new: expression and scope
  [PrimV ((Listof VAL) -> VAL)])

(Intuition#1: `ExprV` is a delayed evaluation and therefore it has the
two values that are ultimately passed to `eval`. Intuition#2: laziness
can be implemented with thunks, so we hold the same information as a
`FunV` does, only there’s no need for the argument names.)

Where should we use the new `ExprV`? At any place where we want to
be lazy and defer evaluating an expression for later. The two places in
the interpreter where we want to delay evaluation are the named
expressions in a bind form and the argument expressions in a function
application. Both of these cases use the helper `eval*` function to do
their evaluations, for example:

[(Bind names exprs bound-body)
 (eval bound-body (extend names (map eval* exprs) env))]

To delay these evaluations, we need to change `eval*` so it returns an
expression closure instead of actually doing the evaluation. Change:

(: eval* : SLOTH -> VAL)
(define (eval* expr) (eval expr env))

to:

(: eval* : SLOTH -> VAL)
(define (eval* expr) (ExprV expr env))

Note how simple this change is: instead of an `eval` function call,
we create a value that contains the parts that would have been used in
the `eval` function call. This value serves as a promise to do this
evaluation (the `eval` call) later, if needed. (This is exactly why
running in Lazy Racket would make this a lazy evaluator: in it, *all*
function calls are promises.)

Side note: this can be used in any case when you’re using an eager language, and you want to delay some function call — all you need to do is replace (using a C-ish syntax)

int foo(int x, str y) {
  ...do some work...
}

with

// rename `foo':
int real_foo(int x, str y) {
  ...same work...
}

// `foo' is a delayed constructor, instead of a plain function
struct delayed_foo {
  int x;
  str y;
}
delayed_foo foo(int x, str y) {
  return new delayed_foo(x, y);
}

Now all calls to `foo` return a `delayed_foo` instance instead of an
integer. Whenever we want to force the delayed promise, we can use this
function:

int force_foo(delayed_foo promise) {
  return real_foo(promise.x, promise.y);
}

You might even want to make sure that each such promise is evaluated exactly once — this is simple to achieve by adding a cache field to the struct:

int real_foo(int x, str y) {
  ...same work...
}

struct delayed_foo {
  int  x;
  str  y;
  bool is_computed;
  int  result;
}
delayed_foo foo(int x, str y) {
  return new delayed_foo(x, y, false, 0);
}

int force_foo(delayed_foo promise) {
  if (!promise.is_computed) {
    promise.result = real_foo(promise.x, promise.y);
    promise.is_computed = true;
  }
  return promise.result;
}

As we will see shortly, this corresponds to switching from a call-by-name lazy language to a call-by-need one.
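
The same idea can be written as runnable Racket, which makes the difference in the amount of work visible. This is only a sketch that mirrors the C-ish pseudocode: the single-argument `real-foo`, the `calls` counter, and the two `force-foo/...` names are assumptions made for this illustration.

```racket
#lang racket

(define calls 0) ; counts how many times the real work runs

(define (real-foo x)
  (set! calls (add1 calls))
  (* x x))

;; the delayed call: the argument plus a cache for the result
(struct delayed-foo (x [computed? #:mutable] [result #:mutable]))

(define (foo x) (delayed-foo x #f 0))

;; no cache: every force repeats the work (call by name)
(define (force-foo/name p)
  (real-foo (delayed-foo-x p)))

;; with a cache: the work happens at most once (call by need)
(define (force-foo/need p)
  (unless (delayed-foo-computed? p)
    (set-delayed-foo-result! p (real-foo (delayed-foo-x p)))
    (set-delayed-foo-computed?! p #t))
  (delayed-foo-result p))

(define p (foo 7))

(define name-values (list (force-foo/name p) (force-foo/name p)))
(define calls-by-name calls) ; 2: the work ran once per force

(set! calls 0)
(define need-values (list (force-foo/need p) (force-foo/need p)))
(define calls-by-need calls) ; 1: the second force used the cache
```

Both strategies produce the same values as long as `real-foo` has no side effects other than the counter; they differ only in how often the work is repeated.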

Back to our Sloth interpreter: given the `eval*` change, we expect
that `eval`-uating:

{bind {{x 1}} x}

will return:

(ExprV (Num 1) ...the-global-environment...)

and the same goes for `eval`-uating

{{fun {x} x} 1}

Similarly, evaluating

{bind {{x {+ 1 2}}} x}

should return

(ExprV (Call (Id +) (Num 1) (Num 2)) ...the-global-environment...)

But what about evaluating an expression like this one?

{bind {{x 2}}
  {+ x x}}

Using what we have so far, we will get to evaluate the body, which is a
`(Call …)` expression, but when we evaluate the arguments for this
function call, we will get `ExprV` values, so we will not be able to
perform the addition. Instead, we will get an error from the function
that `racket-func->prim-val` creates, due to the value being an `ExprV`
instead of a `RktV`.

What we really want is to actually add two *values*, not promises. So
maybe distinguish the two applications, and treat `PrimV` differently
from `FunV` closures?

(: eval* : SLOTH -> VAL)
(define (eval* expr) (ExprV expr env))
(: real-eval* : SLOTH -> VAL)
(define (real-eval* expr) (eval expr env))
(cases expr
  ...
  [(Call fun-expr arg-exprs)
   (let ([fval (eval fun-expr env)]
         ;; move: [arg-vals (map eval* arg-exprs)]
         )
     (cases fval
       [(PrimV proc) (proc (map real-eval* arg-exprs))] ; change
       [(FunV names body fun-env)
        (eval body (extend names (map eval* arg-exprs) fun-env))]
       ...))]
  ...)

This still doesn’t work: the problem is that the function now gets a
bunch of values, where some of these can still be `ExprV`s because the
evaluation itself can return such values… Another way to see this
problem is to consider the code for evaluating an `If` conditional
expression:

[(If cond-expr then-expr else-expr)
 (eval* (if (cases (real-eval* cond-expr)
              [(RktV v) v] ; Racket value => use as boolean
              [else #t])   ; other values are always true
          then-expr
          else-expr))]

…we need to take care of a possible `ExprV` here. What should we do?
The obvious solution is to use `eval` if we get an `ExprV` value:

[(If cond-expr then-expr else-expr)
 (eval* (if (cases (real-eval* cond-expr)
              [(RktV v) v] ; Racket value => use as boolean
              [(ExprV expr env) (eval expr env)] ; force a promise
              [else #t])   ; other values are always true
          then-expr
          else-expr))]

Note how this translates the data structure that represents a
delayed `eval` promise back into a real `eval` call…

Going back to our code for `Call`, there is a problem with it: the

(define (real-eval* expr) (eval expr env))

will indeed evaluate the expression instead of lazily deferring this to the future, but this evaluation might itself return such lazy values. So we need to inspect the resulting value again, forcing the promise if needed:

(define (real-eval* expr)
  (let ([val (eval expr env)])
    (cases val
      [(ExprV expr env) (eval expr env)]
      [else val])))

But we *still* have a problem: programs can produce arbitrarily long
chains of `ExprV`s that get forced to other `ExprV`s, for example:

{bind {{x true}}
  {bind {{y x}}
    {bind {{z y}}
      {if z
        {foo}
        {bar}}}}}

What we really need is to write a loop that keeps forcing promises over
and over until it gets a proper non-`ExprV` value.

(: strict : VAL -> VAL)
;; forces a (possibly nested) ExprV promise,
;; returns a VAL that is not an ExprV
(define (strict val)
  (cases val
    [(ExprV expr env) (strict (eval expr env))] ; loop back
    [else val]))

Note that it’s close to `real-eval*`, but there’s no need to mix it with
`eval`. The recursive call is important: we can never be sure that
`eval` didn’t return an `ExprV` promise, so we have to keep looping
until we get a “real” value.
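
The need for the loop can be checked on a stripped-down model of the chain `x`, `y`, `z` from the example above. This is a sketch with stand-in definitions, not the Sloth code: here an expression is just a number or an identifier, environments are immutable hash tables, and `exprv`, `ev`, and `force-once` are names assumed for the illustration.

```racket
#lang racket

;; A delayed expression together with its environment.
(struct exprv (expr env))

;; Evaluating an identifier returns whatever the environment holds,
;; which may itself be a promise; numbers evaluate to themselves.
(define (ev expr env)
  (if (symbol? expr) (hash-ref env expr) expr))

;; Force a single level, as in the second version of `real-eval*`.
(define (force-once val)
  (if (exprv? val)
      (ev (exprv-expr val) (exprv-env val))
      val))

;; Keep forcing until the result is not a promise.
(define (strict val)
  (if (exprv? val)
      (strict (ev (exprv-expr val) (exprv-env val)))
      val))

;; x is bound to 1, y to x, and z to y, each delayed in its own scope
(define env1 (hash 'x (exprv 1 (hash))))
(define env2 (hash-set env1 'y (exprv 'x env1)))
(define env3 (hash-set env2 'z (exprv 'y env2)))

(define once (force-once (ev 'z env3))) ; still a promise (for x)
(define full (strict (ev 'z env3)))     ; 1
```

Forcing a fixed number of levels always leaves some longer chain unresolved, while the recursive version stops exactly when a non-promise value appears.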

Now we can change the evaluation of function calls to something more manageable:

[(Call fun-expr arg-exprs)
 (let ([fval (strict (eval* fun-expr))] ;*** strict!
       [arg-vals (map eval* arg-exprs)])
   (cases fval
     [(PrimV proc) (proc (map strict arg-vals))] ;*** strict!
     [(FunV names body fun-env)
      (eval body (extend names arg-vals fun-env))]
     [else (error 'eval "function call with a non-function: ~s"
                  fval)]))]

The code is fairly similar to what we had previously. The only
difference is that we wrap a `strict` call where a proper value is
needed: the function value itself, and arguments to primitive
functions.

The `If` case is similar (note that it doesn’t matter if `strict` is
used with the result of `eval` or `eval*` (which returns an `ExprV`)):

[(If cond-expr then-expr else-expr)
 (eval* (if (cases (strict (eval* cond-expr))
              [(RktV v) v] ; Racket value => use as boolean
              [else #t])   ; other values are always true
          then-expr
          else-expr))]

Note that, like before, we always return `#t` for non-`RktV` values.
This is fine because we know that the value there is never an `ExprV`.
All we need now to get a working evaluator is one more strictness point:
the outermost point that starts our evaluation, `run`, needs to
use `strict` to get a proper result value.

(: run : String -> Any)
;; evaluate a SLOTH program contained in a string
(define (run str)
  (let ([result (strict (eval (parse str) global-environment))])
    (cases result
      [(RktV v) v]
      [else (error 'run "evaluation returned a bad value: ~s"
                   result)])))

With this, all of the tests that we took from the Toy evaluator run successfully. To make sure that the interpreter is lazy, we can add a test that will fail if the language is strict:

;; Test laziness
(test (run "{{fun {x} 1} {/ 9 0}}") => 1)
(test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1)
(test (run "{bind {{x {{fun {x} {x x}} {fun {x} {x x}}}}} 1}") => 1)

[In fact, we can continue and replace all `eval` calls with `ExprV`,
leaving only the one call in `strict`. This doesn’t make any
difference, because the resulting promises will eventually be forced by
`strict` anyway.]