2010-03-23 - Implementing Laziness (in plain Scheme) - Sloth: A Lazy Evaluator - Implementing Call by Need - Side Effects in a Lazy Language ======================================================================== >>> Implementing Laziness (in plain Scheme) Generally, we know how lazy evaluation works when we use the substitution model. We even know that if we have: {bind {{x y}} {bind {{y 2}} {+ x y}}} then the result should be an error because we cannot substitute the `y' expression in because it will capture the `y' -- changing the binding structure. As an indication, the original expression contains a free reference to `y', which is exactly why we shouldn't substitute it. But what about: {bind {{x {+ 4 5}}} {bind {{y {+ x x}}} {bind {{z y}} {bind {{x 4}} z}}}} Evaluating this eagerly returns 18, we therefore expect any other evaluation (eager or lazy, using substitutions or environments) to return 18, because any of these options should not change the meaning of numbers, of addition, *or* of the scoping rules. For example, try using lazy evaluation with substitutions: {bind {{x {+ 4 5}}} {bind {{y {+ x x}}} {bind {{z y}} {bind {{x 4}} z}}}} --> {bind {{y {+ {+ 4 5} {+ 4 5}}}} {bind {{z y}} {bind {{x 4}} z}}} --> {bind {{z {+ {+ 4 5} {+ 4 5}}}} {bind {{x 4}} z}} --> {bind {{x 4}} {+ {+ 4 5} {+ 4 5}}} --> {+ {+ 4 5} {+ 4 5}} --> {+ 9 9} --> 18 And what about lazy evaluation using environments: {bind {{x {+ 4 5}}} {bind {{y {+ x x}}} {bind {{z y}} {bind {{x 4}} z}}}} --> {bind {{y {+ x x}}} {bind {{z y}} {bind {{x 4}} z}}} [x:={+ 4 5}] --> {bind {{z y}} {bind {{x 4}} z}} [x:={+ 4 5}, y:={+ x x}] --> {bind {{x 4}} z} [x:={+ 4 5}, y:={+ x x}, z:=y] --> z [x:=4, y:={+ x x}, z:=y] --> y [x:=4, y:={+ x x}, z:=y] --> {+ x x} [x:=4, y:={+ x x}, z:=y] --> {+ 4 4} [x:=4, y:={+ x x}, z:=y] --> 8 [x:=4, y:={+ x x}, z:=y] We have a problem! This problem should be familiar now, it is very similar to the problem that led us down the mistaken path of dynamic scoping when we tried to have first-class functions. In both cases, substitution always worked, and it looks like in both cases the problem is that we don't remember the environment of an expression: in the case of functions, it is the environment at the time of creating the closure that we want to capture and use when we go back later to evaluate the body of the function. Here we have a similar situation, except that we don't need a function to defer computation: *most* expressions get evaluated at some time in the future, so every time we defer such a computation we need to remember the lexical environment of the expression. This is the major point that will make things work again: every expression creates something like a closure -- an object that closes over an expression and an environment at the (lexical) place where that expression was used, and when we actually want to evaluate it later, we need to do it in the right lexical context. So it is like a closure except it doesn't need to be applied, and there are no arguments. In fact it is also a form of a closure -- instead of closing over a function body and an environment, it closes over any expression and an environment. (As we shall see, lazy evaluation is tightly related to using nullary functions: "thunks".) ======================================================================== >>> Sloth: A Lazy Evaluator So we implement this by creating such `closure' values for all expressions that are not evaluated right now. We begin with the Toy language, and rename it to "Sloth". We then add one more case to the data type of values which implements the new kind of expression closures, which contains the expression and its environment: (define-type VAL [ScmV (x Any)] [FunV (names (Listof Symbol)) (body SLOTH) (env ENV)] [ExprV (expr SLOTH) (env ENV)] ;*** new [PrimV (prim ((Listof VAL) -> VAL))]) Where should we use the new `ExprV'? -- At any place where we want to be lazy and defer evaluating an expression for later. The two places in the interpreter where we want to delay evaluation is the named expressions in a bind form and the argument expressions in a function application. Both of these cases use the helper `eval*' function to do their evaluations, for example: [(Bind names exprs bound-body) (eval bound-body (extend names (map eval* exprs) env))] To delay these evaluations, we need to change `eval*' so it returns an expression closure instead of actually doing the evaluation -- change: (: eval* : SLOTH -> VAL) (define (eval* expr) (eval expr env)) to: (: eval* : SLOTH -> VAL) (define (eval* expr) (ExprV expr env)) Note how simple this change is -- instead of an `eval' function call, we create a value that contains the parts that would have been used in the `eval' function call. This value serves as a promise to do this evaluation (the `eval' call) later, if needed. (This is exactly why a Lazy Scheme would make this a lazy evaluator: in it, *all* function calls are promises.) ======================================================================== Side note: this can be used in any case when you're using an eager language, and you want to delay some function call -- all you need to do is replace (using a C-ish syntax) int foo(int x, str y) { ...do some work... } with // rename `foo': int real_foo(int x, str y) { ...same work... } // `foo' is a constuctor, instead of a function, holding the arguments struct delayed_foo { int x; str y; } delayed_foo foo(int x, str y) { return new delayed_foo(x, y); } now all calls to `foo' return a `delayed_foo' instance instead of an integer. Whenever we want to force the delayed promise, we can use this function: int force_foo(delayed_foo promise) { return real_foo(promise.x, promise.y); } You might even want to make sure that each such promise is evaluated exactly once -- this is simple to achieve by adding a cache field to the struct: int real_foo(int x, str y) { ...same work... } struct delayed_foo { int x; str y; bool is_computed; int result; } delayed_foo foo(int x, str y) { return new delayed_foo(x, y, false, 0); } int force_foo(delayed_foo promise) { if (!promise.is_computed) { promise.result = real_foo(promise.x, promise.y); promise.is_computed = true; } return promise.result; } As we will see shortly, this corresponds to switching from a call-by-name lazy language to a call-by-need one. ======================================================================== Back to our Sloth interpreter -- given the `eval*' change, we expect that `eval'-uating: {bind {{x 1}} x} will return: (ExprV (Num 1) ...the-global-environment...) and the same goes for `eval'-uating {{fun {x} x} 1} Similarly, evaluating {bind {{x {+ 1 2}}} x} should return (ExprV (Call (Id +) (Num 1) (Num 2)) ...the-global-environment...) But what about evaluating an expression like this one: {bind {{x 2}} {+ x x}} ? Using what we have so far, we will get to evaluate the body, which is a (Call ...) expression, but when we evaluate the arguments for this function call, we will get `ExprV' values -- so we will not be able to perform the addition. Instead, we will get an error from the function that `scheme-func->prim-val' creates, due to the value being an `ExprV' instead of a `ScmV'. What we really want is to actually add two *values*, not promises. So maybe distinguish the two applications -- treat PrimV differently from FunV closures? (: eval* : SLOTH -> VAL) (define (eval* expr) (ExprV expr env)) (: real-eval* : SLOTH -> VAL) (define (real-eval* expr) (eval expr env)) (cases expr ... [(Call fun-expr arg-exprs) (let ([fval (eval fun-expr env)] ;; move: [arg-vals (map eval* arg-exprs)] ) (cases fval [(PrimV proc) (proc (map real-eval* arg-exprs))] ; change [(FunV names body fun-env) (eval body (extend names (map eval* arg-exprs) fun-env))] ...))] ...) This still doesn't work -- the problem is that the function now gets a bunch of values, where some of these can still be `ExprV's because the evaluation itself can return such values... Another way to see this problem is to consider the code for evaluating an `If' conditional expression: [(If cond-expr then-expr else-expr) (eval (if (cases (real-eval* cond-expr) [(ScmV v) v] ; Scheme value => use as boolean [else #t]) ; other values are always true then-expr else-expr) env)] ...we need to take care of a possible ExprV here. What should we do? The obvious solution is to use `eval' if we get an `ExprV' value: [(If cond-expr then-expr else-expr) (eval (if (cases (real-eval* cond-expr) [(ScmV v) v] ; Scheme value => use as boolean [(ExprV expr env) (eval expr env)] ; force a promise [else #t]) ; other values are always true then-expr else-expr) env)] Note how this translates back the data structure that represents a delayed `eval' promise back into a real `eval' call... Going back to our code for `Call', there is a problem with it -- the (define (real-eval* expr) (eval expr env)) will indeed evaluate the expression instead of lazily deferring this to the future, but this evaluation might itself return such lazy values. So we need to inspect the resulting value again, forcing the promise if needed: (define (real-eval* expr) (let ([v (eval expr env)]) (cases v [(ExprV expr env) (eval expr env)] [else v]))) But we *still* have a problem -- programs can get an arbitrarily long nested chains of `ExprV's that get forced to other `ExprV's. What we really need is to write a loop that keeps forcing promises over and over until it gets a proper non-`ExprV' value. An obvious solution is to write a function that will do just that, basically extend `real-eval*' to loop over and over as long as we have an `ExprV', and move it outside of `eval' (changing it to consume a value on the way): (: strict : VAL -> VAL) ;; forces a (possibly nested) ExprV promise, returns a VAL that is ;; not an ExprV (define (strict v) (cases v [(ExprV expr env) (strict (eval expr env))] ; loop back [else v])) Note the recursive call: we can never be sure that `eval' didn't return an ExprV promise, so we have to keep trying until we get a "real" value. Now we can change the function call to something more manageable: [(Call fun-expr arg-exprs) (let ([fval (strict (eval* fun-expr))] ;*** strict! [arg-vals (map eval* arg-exprs)]) (cases fval [(PrimV proc) (proc (map strict arg-vals))] ;*** strict! [(FunV names body fun-env) (eval body (extend names arg-vals fun-env))] [else (error 'eval "function call with a non-function: ~s" fval)]))] Note that the code is fairly similar to what we had previously -- the only difference is that we wrap a `strict' call where a proper value is needed -- the function value itself, and arguments to primitive functions. The `If' case is similar (note that it doesn't matter if `strict' is used with the result of `eval' or `ExprV'): [(If cond-expr then-expr else-expr) (eval (if (cases (strict (eval cond-expr env)) [(ScmV v) v] ; Scheme value => use as boolean [else #t]) ; other values are always true then-expr else-expr) env)] Note that, like before, we always return #t for non-`ScmV' values -- this is because we know that the value there is never an `ExprV'. All we need now to get a working evaluator, is one more strictness point: the outermost point that starts our evaluation -- `run' -- needs to use `strict' to get a proper result value. (: run : String -> Any) ;; evaluate a SLOTH program contained in a string (define (run str) (let ([result (strict (eval (parse str) global-environment))]) (cases result [(ScmV v) v] [else (error 'run "evaluation returned a bad value: ~s" result)]))) With this, all of the tests that we took from the Toy evaluator run successfully. To make sure that the interpreter is lazy, we can add a test that will fail if the language is strict: ;; Test laziness (test (run "{{fun {x} 1} {/ 9 0}}") => 1) (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1) [In fact, we can continue and replace all `eval' calls with `ExprV', leaving only the one call in `strict'. This doesn't make any difference, because the resulting promises will eventually be forced by `strict' anyway.] ======================================================================== >> Getting More from Sloth As we've seen, using `strict' in places where we need an actual value rather than a delayed promise is enough to get a working lazy evaluator. Our current implementation assumes that all primitive functions need strict values, therefore the argument values are all passed through the `strict' function -- but this is not always the case. Specifically, if we have constructor functions, then we don't need (and usually don't want) to force the promises. This is basically what allows us to use infinite lists in Lazy Scheme: the fact that `list' and `cons' do not require forcing their arguments. To allow some primitive functions to consume strict values and some to leave them as is, we're going to change `scheme-func->prim-val' and add a flag that indicates whether the primitive function is strict or not. Obviously, we also need to move the `strict' call around arguments to a primitive function application into the `scheme-func->prim-val' generated function -- which simplifies the `Call' case in `eval' (we go from (proc (map strict arg-vals)) back to (proc arg-vals)). The new code for `scheme-func->prim-val' is: (: scheme-func->prim-val : Function Boolean -> VAL) ;; converts a scheme function to a primitive evaluator function ... (define (scheme-func->prim-val scheme-func strict?) (let ([list-func (make-untyped-list-function scheme-func)]) (PrimV (lambda: ([args : (Listof VAL)]) (let ([args (if strict? (map (lambda: ([a : VAL]) (let ([v (strict a)]) (cases v [(ScmV x) x] [else (error 'scheme-func "bad input: ~s" v)]))) args) args)]) ;*** use values as is! (ScmV (list-func args))))))) We now need to annotate the primitives in the global environment, as well as add a few constructors: ;; The global environment has a few primitives: (: global-environment : ENV) (define global-environment (FrameEnv (list (list '+ (scheme-func->prim-val + #t)) (list '- (scheme-func->prim-val - #t)) (list '* (scheme-func->prim-val * #t)) (list '/ (scheme-func->prim-val / #t)) (list '< (scheme-func->prim-val < #t)) (list '> (scheme-func->prim-val > #t)) (list '= (scheme-func->prim-val = #t)) ;; note flags: (list 'cons (scheme-func->prim-val cons #f)) (list 'list (scheme-func->prim-val list #f)) (list 'first (scheme-func->prim-val first #t)) (list 'rest (scheme-func->prim-val rest #t)) (list 'null? (scheme-func->prim-val null? #t)) ;; values (list 'true (ScmV #t)) (list 'false (ScmV #f)) (list 'null (ScmV null))) (EmptyEnv))) Note that this last change raises a subtle type issue: we're actually abusing the Scheme `list' and `cons' constructors to hold Sloth values. One way in which this becomes a problem is the current assumption that a primitive function always returns a Scheme value (it is always wrapped in a `ScmV') -- but this is no longer the case for `first' and `rest': when we use {cons 1 null} in Sloth, the resulting value will be (ScmV (cons (ExprV (Num 1) ...) (ExprV (Id null) ...))) so if we try and grab the first value of this {first {cons 1 null}} we will eventually get back the `ExprV' and wrap it in a `ScmV': (ScmV (ExprV (Num 1) ...)) and finally `run' will strip off the `ScmV' and return the `ExprV'. A solution to this is to make our `first' and `rest' functions return a value *without* wrapping it in a `ScmV' -- we can identify this situation by the fact that the returned value is already a VAL instead of some other Scheme value. We can identify such values with the `VAL?' predicate that gets defined by our `define-type': (: scheme-func->prim-val : Function Boolean -> VAL) ;; converts a scheme function to a primitive evaluator function ... (define (scheme-func->prim-val scheme-func strict?) (let ([list-func (make-untyped-list-function scheme-func)]) (PrimV (lambda: ([args : (Listof VAL)]) (let* ([args (if strict? (map (lambda: ([a : VAL]) (let ([v (strict a)]) (cases v [(ScmV x) x] [else (error 'scheme-func "bad input: ~s" v)]))) args) args)] [result (list-func args)]) ;; Because there are non-strict constructors, ;; primitives like `car' might be returning promises ;; which are already VAL objects. (if (VAL? result) result (ScmV result))))))) Note that we don't need to worry about the result being an `ExprV' -- that will eventually be taken care of by `strict'. The complete Sloth code follows. It can be used to do the same fun things we did with Lazy Scheme. ---<<>>-------------------------------------------------------- #lang pl ;;; ================================================================== ;;; Syntax #| The BNF: ::= | | { bind {{ } ... } } | { fun { ... } } | { if } | { ... } |# ;; A matching abstract syntax tree datatype: (define-type SLOTH [Num (n Number)] [Id (name Symbol)] [Bind (names (Listof Symbol)) (exprs (Listof SLOTH)) (body SLOTH)] [Fun (names (Listof Symbol)) (body SLOTH)] [Call (fun-expr SLOTH) (arg-exprs (Listof SLOTH))] [If (cond-expr SLOTH) (then-expr SLOTH) (else-expr SLOTH)]) (: unique-list? : (Listof Any) -> Boolean) ;; Tests whether a list is unique, used to guard Bind and Fun values. (define (unique-list? xs) (or (null? xs) (and (not (member (first xs) (rest xs))) (unique-list? (rest xs))))) (: parse-sexpr : Sexpr -> SLOTH) ;; to convert s-expressions into SLOTHs (define (parse-sexpr sexpr) (match sexpr [(number: n) (Num n)] [(symbol: name) (Id name)] [(cons 'bind more) (match sexpr [(list 'bind (list (list (symbol: names) (sexpr: nameds)) ...) body) (if (unique-list? names) (Bind names (map parse-sexpr nameds) (parse-sexpr body)) (error 'parse-sexpr "`bind' got duplicate names: ~s" names))] [else (error 'parse-sexpr "bad `bind' syntax in ~s" sexpr)])] [(cons 'fun more) (match sexpr [(list 'fun (list (symbol: names) ...) body) (if (unique-list? names) (Fun names (parse-sexpr body)) (error 'parse-sexpr "`fun' got duplicate names: ~s" names))] [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])] [(cons 'if more) (match sexpr [(list 'if cond then else) (If (parse-sexpr cond) (parse-sexpr then) (parse-sexpr else))] [else (error 'parse-sexpr "bad `if' syntax in ~s" sexpr)])] [(list fun (sexpr: args) ...) ; other lists are applications (Call (parse-sexpr fun) (map parse-sexpr args))] [else (error 'parse-sexpr "bad syntax in ~s" sexpr)])) (: parse : String -> SLOTH) ;; Parses a string containing an SLOTH expression to a SLOTH AST. (define (parse str) (parse-sexpr (string->sexpr str))) ;;; ================================================================== ;;; Values and environments (define-type ENV [EmptyEnv] [FrameEnv (frame FRAME) (rest ENV)]) (define-type VAL [ScmV (x Any)] [FunV (names (Listof Symbol)) (body SLOTH) (env ENV)] [ExprV (expr SLOTH) (env ENV)] [PrimV (prim ((Listof VAL) -> VAL))]) ;; a frame is an association list of names and values. (define-type FRAME = (Listof (List Symbol VAL))) (: extend : (Listof Symbol) (Listof VAL) ENV -> ENV) ;; extends an environment with a new frame. (define (extend names values env) (if (= (length names) (length values)) (FrameEnv (map (lambda: ([name : Symbol] [val : VAL]) (list name val)) names values) env) (error 'extend "arity mismatch for names: ~s" names))) (: lookup : Symbol ENV -> VAL) ;; looks a name in an environment, searching through each frame. (define (lookup name env) (cases env [(EmptyEnv) (error 'lookup "no binding for ~s" name)] [(FrameEnv frame rest) (let ([cell (assq name frame)]) (if cell (second cell) (lookup name rest)))])) (: scheme-func->prim-val : Function Boolean -> VAL) ;; converts a scheme function to a primitive evaluator function which ;; is a PrimV holding a ((Listof VAL) -> VAL) procedure. (the ;; resulting procedure will use the list function as is, and it is the ;; list function's responsibility to throw an error if it's given a ;; bad number of arguments or bad input types.) (define (scheme-func->prim-val scheme-func strict?) (let ([list-func (make-untyped-list-function scheme-func)]) (PrimV (lambda: ([args : (Listof VAL)]) (let* ([args (if strict? (map (lambda: ([a : VAL]) (let ([v (strict a)]) (cases v [(ScmV x) x] [else (error 'scheme-func "bad input: ~s" v)]))) args) args)] [result (list-func args)]) ;; Because there are non-strict constructors, ;; primitives like `car' might be returning promises ;; which are already VAL objects. (if (VAL? result) result (ScmV result))))))) ;; The global environment has a few primitives: (: global-environment : ENV) (define global-environment (FrameEnv (list (list '+ (scheme-func->prim-val + #t)) (list '- (scheme-func->prim-val - #t)) (list '* (scheme-func->prim-val * #t)) (list '/ (scheme-func->prim-val / #t)) (list '< (scheme-func->prim-val < #t)) (list '> (scheme-func->prim-val > #t)) (list '= (scheme-func->prim-val = #t)) ;; note flags: (list 'cons (scheme-func->prim-val cons #f)) (list 'list (scheme-func->prim-val list #f)) (list 'first (scheme-func->prim-val first #t)) (list 'rest (scheme-func->prim-val rest #t)) (list 'null? (scheme-func->prim-val null? #t)) ;; values (list 'true (ScmV #t)) (list 'false (ScmV #f)) (list 'null (ScmV null))) (EmptyEnv))) ;;; ================================================================== ;;; Evaluation (: strict : VAL -> VAL) ;; forces a (possibly nested) ExprV promise, returns a VAL that is ;; not an ExprV (define (strict v) (cases v [(ExprV expr env) (strict (eval expr env))] [else v])) (: eval : SLOTH ENV -> VAL) ;; evaluates SLOTH expressions. (define (eval expr env) ;; convenient helper (: eval* : SLOTH -> VAL) (define (eval* expr) (ExprV expr env)) (cases expr [(Num n) (ScmV n)] [(Id name) (lookup name env)] [(Bind names exprs bound-body) (eval bound-body (extend names (map eval* exprs) env))] [(Fun names bound-body) (FunV names bound-body env)] [(Call fun-expr arg-exprs) (let ([fval (strict (eval* fun-expr))] [arg-vals (map eval* arg-exprs)]) (cases fval [(PrimV proc) (proc arg-vals)] [(FunV names body fun-env) (eval body (extend names arg-vals fun-env))] [else (error 'eval "function call with a non-function: ~s" fval)]))] [(If cond-expr then-expr else-expr) (eval (if (cases (strict (eval cond-expr env)) [(ScmV v) v] ; Scheme value => use as boolean [else #t]) ; other values are always true then-expr else-expr) env)])) (: run : String -> Any) ;; evaluate a SLOTH program contained in a string (define (run str) (let ([result (strict (eval (parse str) global-environment))]) (cases result [(ScmV v) v] [else (error 'run "evaluation returned a bad value: ~s" result)]))) ;;; ================================================================== ;;; Tests (test (run "{{fun {x} {+ x 1}} 4}") => 5) (test (run "{bind {{add3 {fun {x} {+ x 3}}}} {add3 1}}") => 4) (test (run "{bind {{add3 {fun {x} {+ x 3}}} {add1 {fun {x} {+ x 1}}}} {bind {{x 3}} {add1 {add3 x}}}}") => 7) (test (run "{bind {{identity {fun {x} x}} {foo {fun {x} {+ x 1}}}} {{identity foo} 123}}") => 124) (test (run "{bind {{x 3}} {bind {{f {fun {y} {+ x y}}}} {bind {{x 5}} {f 4}}}}") => 7) (test (run "{{{fun {x} {x 1}} {fun {x} {fun {y} {+ x y}}}} 123}") => 124) ;; More tests for complete coverage (test (run "{bind x 5 x}") =error> "bad `bind' syntax") (test (run "{fun x x}") =error> "bad `fun' syntax") (test (run "{if x}") =error> "bad `if' syntax") (test (run "{}") =error> "bad syntax") (test (run "{bind {{x 5} {x 5}} x}") =error> "bind* duplicate names") (test (run "{fun {x x} x}") =error> "fun* duplicate names") (test (run "{+ x 1}") =error> "no binding for") (test (run "{+ 1 {fun {x} x}}") =error> "bad input") (test (run "{+ 1 {fun {x} x}}") =error> "bad input") (test (run "{1 2}") =error> "with a non-function") (test (run "{{fun {x} x}}") =error> "arity mismatch") (test (run "{if {< 4 5} 6 7}") => 6) (test (run "{if {< 5 4} 6 7}") => 7) (test (run "{if + 6 7}") => 6) (test (run "{fun {x} x}") =error> "returned a bad value") ;; Test laziness (test (run "{{fun {x} 1} {/ 9 0}}") => 1) (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1) ;; Test lazy constructors (test (run "{bind {{l {list 1 {/ 9 0} 3}}} {+ {first l} {first {rest {rest l}}}}}") => 4) ;;; ================================================================== ---------------------------------------------------------------------- ======================================================================== >>> Implementing Call by Need As we have seen, there are a number of advantages for lazy evaluation, but its main disadvantage is the fact that it is extremely inefficient, to the point of rendering lots of programs impractical, for example, in: {bind {{x {+ 4 5}}} {bind {{y {+ x x}}} y}} we end up adding 4 and 5 twice. In other words, we don't suffer from textual redundancy (each expression is written once), but we don't avoid dynamic redundancy. We can get it back by simply caching evaluation results, using a box that will be used to remember the results. The box will initially hold #f, and it will change to hold the VAL that results from evaluation: (define-type VAL [ScmV (x Any)] [FunV (names (Listof Symbol)) (body SLOTH) (env ENV)] [ExprV (expr SLOTH) (env ENV) (cache (Boxof (U #f VAL)))] [PrimV (prim ((Listof VAL) -> VAL))]) We need a utility function to create an evaluation promise, because when an ExprV is created, its initial cache box needs to be initialized. (: eval-promise : SLOTH ENV -> VAL) ;; used instead of `eval' to create an evaluation promise (define (eval-promise expr env) (ExprV expr env (box #f))) (And note that Typed Scheme needs to figure out that the `#f' in this definition has a type of (U #f VAL) and not just `#f'.) This `eval-promise' is used instead of `ExprV' in eval. Finally, whenever we force such an ExprV promise, we need to check if it was already evaluated, otherwise force it and cache the result. This is simple to do since there is a single field that is used both as a flag and a cached value: (: strict : VAL -> VAL) ;; forces a (possibly nested) ExprV promise, returns a VAL that is ;; not an ExprV (define (strict v) (cases v [(ExprV expr env cache) (or (unbox cache) (let ([val (strict (eval expr env))]) (set-box! cache val) val))] [else v])) But note that this makes using side-effects in our interpreter even more confusing. (It was true with call-by-name too.) The resulting code follows. ---<<>>------------------------------------------------- ;; A call-by-need version of the SLOTH interpreter #lang pl ;;; ================================================================== ;;; Syntax #| The BNF: ::= | | { bind {{ } ... } } | { fun { ... } } | { if } | { ... } |# ;; A matching abstract syntax tree datatype: (define-type SLOTH [Num (n Number)] [Id (name Symbol)] [Bind (names (Listof Symbol)) (exprs (Listof SLOTH)) (body SLOTH)] [Fun (names (Listof Symbol)) (body SLOTH)] [Call (fun-expr SLOTH) (arg-exprs (Listof SLOTH))] [If (cond-expr SLOTH) (then-expr SLOTH) (else-expr SLOTH)]) (: unique-list? : (Listof Any) -> Boolean) ;; Tests whether a list is unique, used to guard Bind and Fun values. (define (unique-list? xs) (or (null? xs) (and (not (member (first xs) (rest xs))) (unique-list? (rest xs))))) (: parse-sexpr : Sexpr -> SLOTH) ;; to convert s-expressions into SLOTHs (define (parse-sexpr sexpr) (match sexpr [(number: n) (Num n)] [(symbol: name) (Id name)] [(cons 'bind more) (match sexpr [(list 'bind (list (list (symbol: names) (sexpr: nameds)) ...) body) (if (unique-list? names) (Bind names (map parse-sexpr nameds) (parse-sexpr body)) (error 'parse-sexpr "`bind' got duplicate names: ~s" names))] [else (error 'parse-sexpr "bad `bind' syntax in ~s" sexpr)])] [(cons 'fun more) (match sexpr [(list 'fun (list (symbol: names) ...) body) (if (unique-list? names) (Fun names (parse-sexpr body)) (error 'parse-sexpr "`fun' got duplicate names: ~s" names))] [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])] [(cons 'if more) (match sexpr [(list 'if cond then else) (If (parse-sexpr cond) (parse-sexpr then) (parse-sexpr else))] [else (error 'parse-sexpr "bad `if' syntax in ~s" sexpr)])] [(list fun (sexpr: args) ...) ; other lists are applications (Call (parse-sexpr fun) (map parse-sexpr args))] [else (error 'parse-sexpr "bad syntax in ~s" sexpr)])) (: parse : String -> SLOTH) ;; Parses a string containing an SLOTH expression to a SLOTH AST. (define (parse str) (parse-sexpr (string->sexpr str))) ;;; ================================================================== ;;; Values and environments (define-type ENV [EmptyEnv] [FrameEnv (frame FRAME) (rest ENV)]) (define-type VAL [ScmV (x Any)] [FunV (names (Listof Symbol)) (body SLOTH) (env ENV)] [ExprV (expr SLOTH) (env ENV) (cache (Boxof (U #f VAL)))] [PrimV (prim ((Listof VAL) -> VAL))]) ;; a frame is an association list of names and values. (define-type FRAME = (Listof (List Symbol VAL))) (: extend : (Listof Symbol) (Listof VAL) ENV -> ENV) ;; extends an environment with a new frame. (define (extend names values env) (if (= (length names) (length values)) (FrameEnv (map (lambda: ([name : Symbol] [val : VAL]) (list name val)) names values) env) (error 'extend "arity mismatch for names: ~s" names))) (: lookup : Symbol ENV -> VAL) ;; looks a name in an environment, searching through each frame. (define (lookup name env) (cases env [(EmptyEnv) (error 'lookup "no binding for ~s" name)] [(FrameEnv frame rest) (let ([cell (assq name frame)]) (if cell (second cell) (lookup name rest)))])) (: scheme-func->prim-val : Function Boolean -> VAL) ;; converts a scheme function to a primitive evaluator function which ;; is a PrimV holding a ((Listof VAL) -> VAL) procedure. (the ;; resulting procedure will use the list function as is, and it is the ;; list function's responsibility to throw an error if it's given a ;; bad number of arguments or bad input types.) (define (scheme-func->prim-val scheme-func strict?) (let ([list-func (make-untyped-list-function scheme-func)]) (PrimV (lambda: ([args : (Listof VAL)]) (let* ([args (if strict? (map (lambda: ([a : VAL]) (let ([v (strict a)]) (cases v [(ScmV x) x] [else (error 'scheme-func "bad input: ~s" v)]))) args) args)] [result (list-func args)]) ;; Because there are non-strict constructors, ;; primitives like `car' might be returning promises ;; which are already VAL objects. (if (VAL? result) result (ScmV result))))))) ;; The global environment has a few primitives: (: global-environment : ENV) (define global-environment (FrameEnv (list (list '+ (scheme-func->prim-val + #t)) (list '- (scheme-func->prim-val - #t)) (list '* (scheme-func->prim-val * #t)) (list '/ (scheme-func->prim-val / #t)) (list '< (scheme-func->prim-val < #t)) (list '> (scheme-func->prim-val > #t)) (list '= (scheme-func->prim-val = #t)) ;; note flags: (list 'cons (scheme-func->prim-val cons #f)) (list 'list (scheme-func->prim-val list #f)) (list 'first (scheme-func->prim-val first #t)) (list 'rest (scheme-func->prim-val rest #t)) (list 'null? (scheme-func->prim-val null? #t)) ;; values (list 'true (ScmV #t)) (list 'false (ScmV #f)) (list 'null (ScmV null))) (EmptyEnv))) ;;; ================================================================== ;;; Evaluation (: eval-promise : SLOTH ENV -> VAL) ;; used instead of `eval' to create an evaluation promise (define (eval-promise expr env) (ExprV expr env (box #f))) (: strict : VAL -> VAL) ;; forces a (possibly nested) ExprV promise, returns a VAL that is ;; not an ExprV (define (strict v) (cases v [(ExprV expr env cache) (or (unbox cache) (let ([val (strict (eval expr env))]) (set-box! cache val) val))] [else v])) (: eval : SLOTH ENV -> VAL) ;; evaluates SLOTH expressions. (define (eval expr env) ;; convenient helper (: eval* : SLOTH -> VAL) (define (eval* expr) (eval-promise expr env)) (cases expr [(Num n) (ScmV n)] [(Id name) (lookup name env)] [(Bind names exprs bound-body) (eval bound-body (extend names (map eval* exprs) env))] [(Fun names bound-body) (FunV names bound-body env)] [(Call fun-expr arg-exprs) (let ([fval (strict (eval* fun-expr))] [arg-vals (map eval* arg-exprs)]) (cases fval [(PrimV proc) (proc arg-vals)] [(FunV names body fun-env) (eval body (extend names arg-vals fun-env))] [else (error 'eval "function call with a non-function: ~s" fval)]))] [(If cond-expr then-expr else-expr) (eval (if (cases (strict (eval cond-expr env)) [(ScmV v) v] ; Scheme value => use as boolean [else #t]) ; other values are always true then-expr else-expr) env)])) (: run : String -> Any) ;; evaluate a SLOTH program contained in a string (define (run str) (let ([result (strict (eval (parse str) global-environment))]) (cases result [(ScmV v) v] [else (error 'run "evaluation returned a bad value: ~s" result)]))) ;;; ================================================================== ;;; Tests (test (run "{{fun {x} {+ x 1}} 4}") => 5) (test (run "{bind {{add3 {fun {x} {+ x 3}}}} {add3 1}}") => 4) (test (run "{bind {{add3 {fun {x} {+ x 3}}} {add1 {fun {x} {+ x 1}}}} {bind {{x 3}} {add1 {add3 x}}}}") => 7) (test (run "{bind {{identity {fun {x} x}} {foo {fun {x} {+ x 1}}}} {{identity foo} 123}}") => 124) (test (run "{bind {{x 3}} {bind {{f {fun {y} {+ x y}}}} {bind {{x 5}} {f 4}}}}") => 7) (test (run "{{{fun {x} {x 1}} {fun {x} {fun {y} {+ x y}}}} 123}") => 124) ;; More tests for complete coverage (test (run "{bind x 5 x}") =error> "bad `bind' syntax") (test (run "{fun x x}") =error> "bad `fun' syntax") (test (run "{if x}") =error> "bad `if' syntax") (test (run "{}") =error> "bad syntax") (test (run "{bind {{x 5} {x 5}} x}") =error> "bind* duplicate names") (test (run "{fun {x x} x}") =error> "fun* duplicate names") (test (run "{+ x 1}") =error> "no binding for") (test (run "{+ 1 {fun {x} x}}") =error> "bad input") (test (run "{+ 1 {fun {x} x}}") =error> "bad input") (test (run "{1 2}") =error> "with a non-function") (test (run "{{fun {x} x}}") =error> "arity mismatch") (test (run "{if {< 4 5} 6 7}") => 6) (test (run "{if {< 5 4} 6 7}") => 7) (test (run "{if + 6 7}") => 6) (test (run "{fun {x} x}") =error> "returned a bad value") ;; Test laziness (test (run "{{fun {x} 1} {/ 9 0}}") => 1) (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1) ;; Test lazy constructors (test (run "{bind {{l {list 1 {/ 9 0} 3}}} {+ {first l} {first {rest {rest l}}}}}") => 4) ;;; ================================================================== ---------------------------------------------------------------------- ======================================================================== >>> Side Effects in a Lazy Language We've seen that a lazy language without the call-by-need optimization is too slow to be practical, but the optimization makes using side-effects extremely confusing. Specifically, when we deal with side-effects (I/O, mutation, errors, etc) the order of evaluation matters, but in our interpreter expressions are getting evaluated as needed. (Remember tracing the prime-numbers code in lazy scheme -- numbers are tested as needed, not in order.) If we can't do these things, the question is whether there is any point in using a purely functional lazy language at all -- since computer programs often interact with an imperative world. There is a solution for this: the lazy language does not have any (sane) facilities for *doing* things (like `printf' that prints something in plain Scheme), but it can use a data structure that *describes* such operations. For example, in lazy Scheme we cannot print stuff sanely using `printf', but we can construct a string using `format' (which is just like `printf', except that it returns the formatted string instead of printing it). So (assuming Scheme syntax for simplicity), instead of: (define (foo n) (printf "~s + 1 = ~s\n" n (+ n 1))) we will write: (define (foo n) (format "~s + 1 = ~s\n" n (+ n 1))) and get back a string. We can now change the way that our interpreter deals with the output value that it receives after evaluating a lazy expression: if it receives a string, then it can take that string as denoting a request for printout, and simply print it. Such an evaluator will do the printout when the lazy evaluation is done, and everything works fine because we don't try to use any side-effects in the lazy language -- we just describe the desired side-effects, and constructing such a description does not require *performing* side-effects. But this only solves printing a single string, and nothing else. If we want to print two strings, then the only thing we can do is concatenate the two strings -- but that is not only inefficient, it cannot describe infinite output (since we will not be able to construct the infinite string in memory). So we need a better way to chain several printout representations. One way to do so is to use a list of strings, but to make things a little easier to manage, we will create a type for I/O descriptions -- and populate it with one variant holding a string (for plain printout) and one for holding a chain of two descriptions (which can be used to construct an arbitrarily long sequence of descriptions): (define-type IO [Print (str String)] [Begin2 (1st IO) (2nd IO)]) Now we can use this to chain any number of printout representations by turning them into a single `Begin2' request, which is very similar to simply using a loop to print the list. For example, the eager printout code: (: print-list : (Listof A) -> Void) (define (print-list l) (if (null? l) (printf "\n") (begin (printf "~s " (car l)) (print-list (cdr l))))) turns to the following code: (: print-list : (Listof A) -> IO) (define (print-list l) (if (null? l) (Print "\n") (Begin2 (Print (format "~s " (car l))) (print-list (cdr l))))) This will basically scan an input list like the eager version, but instead of printing the list, it will convert it into a single output request that forms a recipe for this printout. Note that within the lazy world, the result of `print-list' is just a value, there are no side effects involved. Turning this value into the actual printout is something that needs to be done on the eager side, which must be part of the implementation. In the case of Lazy Scheme, we have no access to the implementation, but we can do so in our Sloth implementation: again, `run' will inspect the result and either print a given string (if it gets a `Print' value), or print two things recursively (if it gets a `Begin2' value). (To implement this, we will add an `IOV' variant to the `VAL' type definition, and have it contain an `IO' description of the above type.) Because the sequence is constructed in the lazy world, it will not require allocating the whole sequence in memory -- it can be forced bits by bits (using `strict') as the imperative backend (the `run' part of the implementation) follows the instructions in the resulting IO description. More concretely, it will also work on an infinite list: the translation of an infinite-loop printout function will be one that returns an infinite IO description tree of `Begin2' values. This loop will also force only what it needs to print and will go on recursively printing the whole sequence (possibly not terminating). For example (again, using Scheme syntax), the infinite printout loop (: print-loop : -> Void) (define (print-loop) (printf "foo\n") (print-loop)) is translated into a function that returns an infinite tree of print operations: (: print-loop : -> IO) (define (print-loop) (Begin2 (Print "foo\n") (print-loop))) When this tree is converted to actions, it will result in an infinite loop that produces the same output -- it is essentially the same infinite loop, only now it's derived by an infinite description rather than an infinite process. Finally, how should we deal with inputs? We can add another variant to our type definition that represents a `read-line' operation, assuming that like `read-line' it does not require any arguments: (define-type IO [Print (str String)] [ReadLine] [Begin2 (1st IO) (2nd IO)]) Now the eager implementation can invoke `read-line' when it encounters a `ReadLine' value -- but what should it do with the resulting string? The solution is to use a `receiver' procedure as part of the `ReadLine' operation description. This receiver value is a kind of a "continuation" of the computation, provided as a callback value -- it will get the string that was read on the terminal, and will return a new description of side-effects that represents the rest of the process: (define-type IO [Print (str String)] [ReadLine (receiver (String -> IO))] [Begin2 (1st IO) (2nd IO)]) Now, when the eager side sees a `ReadLine' value, it will read a line, and invoke the callback function with the string that it has read. By doing this, the control goes back to the lazy world to process the value and get back another IO value to continue the processing. This results in a process where the lazy code generates some IO descriptions, then the imperative side will execute it and control goes back to the lazy code, then back to the imperative side, etc. For example, this silly loop: (: silly-loop : -> Void) (define (silly-loop) (printf "What is your name? ") (let ([name (read-line)]) (if (equal? name "quit") (printf "bye\n") (begin (printf "Your name is ~s\n" name) (silly-loop))))) is translated to: (: silly-loop : -> IO) (define (silly-loop) (Begin2 (Print "What is your name? ") (ReadLine (lambda (name) (if (equal? name "quit") (Print "bye\n") (Begin2 (Print (format "Your name is ~s\n" name)) (silly-loop))))))) Using this strategy to implement side-effects is possible, and you will do that in the homework -- some technical details are going to be different but the principle is the same as discussed above. The last problem is that the above code is difficult to work with -- in the homework you will see how to use syntactic abstractions to make thing much simpler. ========================================================================