All notes, Tuesday, April 15th
==============================

- Intro to CS4400/CS5400
- Intro to Programming Languages
- Intro to Racket
- Side-note: "Goto Statement Considered Harmful"
- Quick Intro to Racket
- Lists & Recursion
- Some Style
- Tail calls
- Sidenote on Types
- Side-note: Names are important
- BNF, Grammars, the AE Language
- Simple Parsing
- The `match` Form
- The `define-type` Form
- The `cases` Form
- Semantics (= Evaluation)
- Side-note: Compositionality
- Implementing an Evaluator
- Implementing The AE Language
- Intro to Typed Racket
- Bindings & Substitution
- WAE: Adding Bindings to AE
- Evaluation of `with`
- Formal Specs
- Lazy vs Eager Evaluation
- de Bruijn Indexes
- Functions & Function Values
- Implementing First Class Functions
- Side-note: how important is it to have *anonymous* functions?
- The FLANG Language
- Introducing Racket's `lambda`
- Using Functions as Objects
- Using `define-type` for new "type aliases"
- Currying
- Using Higher-Order & Anonymous Functions
- Side-note: "Point-Free" combinators
- This is not Runtime Code Generation
- Substitution Caches
- Implementation of Cache Functionality
- Formal Rules for Cached Substitutions
- Evaluating with Substitution Caches
- Dynamic and Lexical Scopes
- Dynamic versus Lexical Scope
- Implementing Lexical Scope: Closures and Environments
- Fixing an Overlooked Bug
- Lexical Scope using Racket Closures
- More Closures (on both levels)
- Types of Evaluators
- Feature Embedding
- Recursion, Recursion, Recursion
- Recursion without the Magic
- The Core of `make-recursive`
- Denotational Explanation of Recursion
- The Y Combinator
- The main property of Y
- Yet another explanation for Y
- Typing the Y Combinator [extra]
- Lambda Calculus --- Schlac
- Church Numerals
- More Encodings
- Alternative Church Encoding
- Implementing `define-type` & `cases` in Schlac
- Recursive Environments
- Recursion: Racket's `letrec`
- Implementing Recursion using `letrec`
- Implementing `rec` Using Cyclic Structures
- Boxes and Mutation
- Types for Boxes
- `Boxof`'s Lack of Subtyping
- Implementing a Circular Environment
- Variable Mutation
- State and Environments
- Implementing Objects with State
- The Toy Language
- Compilation
- Is this really a compiler?
- Lazy Evaluation: Using a Lazy Racket
- Lazy Evaluation: Some Issues
- Lazy Evaluation: Shell Examples
- Lazy Evaluation: Programming Examples
- Side Note: Similarity to Generators and Channels
- Call by Need vs Call by Name
- Example of Feature Embedding
- Implementing Laziness (in plain Racket)
- Sloth: A Lazy Evaluator
- Getting more from Sloth
- The Sloth Implementation
- Shouldn't there be more `ExprV` promises?
- Implementing Call by Need
- Side Effects in a Lazy Language
- Designing Domain Specific Languages (DSLs)
- Syntax Transformations: Macros
- Macro Problems
- Complexity of S-expression transformations
- Scoping problems
- Scheme (and Racket) Macros
- Meta Macros
- Lazy Constructions in an Eager Language
- Recursive Macros
- Another example: a simple loop
- Yet Another: List Comprehension
- Problems of `syntax-rules` Macros
- Racket's "Low-Level" Macros
- Solving the `syntax-rules` problems [extra]
- Breaking Hygiene, How Bad is it? [extra]
- Macros in Racket's Module System [extra]
- Defining Languages in Racket [extra]
- Macro Conclusions
- Side-note: macros in mainstream languages
- Types
- What is a Type?
- Our Types --- The Picky Language
- Typing control
- Extending Picky
- Implementing Picky
- Improving Picky
- Typing Recursion
- Extending Picky with recursion
- Typing Data
- Judgments for recursive types
- "Runaway" instances [extra]
- Type soundness
- Explicit polymorphism [extra]
- Polymorphism in the type description language [extra]
- Type judgments for explicit polymorphism and execution [extra]
- Explicit polymorphism conclusions [extra]
- Web Programming
- Basic web programming
- Continuations: Web Programming
- Simulating web reading
- More Web Transformations
- Transforming a recursive function
- Using `sum/k`
- Converting stateful code
- Converting higher order functions
- Highlevel Overview on Continuations
- An Automatic Continuation Converter
- Continuations as a Language Feature
- Continuations in Racket
- Playing with Continuations [extra]
- The Ambiguous Operator: `amb` [extra]
- Generators (and Producers) [extra]
- Delimited Continuations [extra]
- Continuation Conclusions
- Sidenote: JavaScript: Continuations, Promises, and Async/Await

------------------------------------------------------------------------
# Intro to CS4400/CS5400 [Tuesday, January 7th]

* General plan for how the course will go.

* Administrative stuff. (Mostly going over the web pages.)

  ___`https://pl.barzilay.org/`___


------------------------------------------------------------------------
# Intro to Programming Languages [Tuesday, January 7th]

> [PLAI §1]

* Why should we care about programming languages? (Any examples of big
  projects *without* a little language?)

* What defines a language?
  - syntax
  - semantics
  - libraries (runtime)
  - idioms (community)

* How important is each of these?

  ** libraries give you the run-time support, not an important part of
     the language itself. (BTW, the line between "a library" and "part
     of the language" is less obvious than it seems.)

  ** idioms originate from both language design and culture. They are
     often misleading. For example, JavaScript programmers will often
     write:

         function explorer_move() { doThis(); }
         function mozilla_move() { doThat(); }
         if (isExplorer)
           document.onmousemove = explorer_move;
         else
           document.onmousemove = mozilla_move;

     or

         if (isExplorer)
           document.onmousemove = function() { doThis(); };
         else
           document.onmousemove = function() { doThat(); };

     or

         document.onmousemove =
           isExplorer ? function() { ... }
                      : function() { ... };

     or

         document.onmousemove =
           isExplorer ? () => { doThis(); } : () => { doThat(); };

     or

         document.onmousemove = isExplorer ? doThis : doThat;

     How many JavaScript programmers will know what this does:

         function foo(n) {
           return function(m) { return m + n; };
         }

     or this:

         n => m => m+n;
         (x,y) => s => s(x,y);

     or, what seems fishy in this? ---

         const foo = (x,y) => bar(x,y)

     Yet another example:

         let x = "";
         while (foo())
           x += whatever();

     How would you *expect* this code perform? How do you think it does
     in the reality of many uses of JS by people who are not really
     programmers?

  ** Compare:
     ** `a[25]+5`                   (Java: exception)
     ** `(+ (vector-ref a 25) 5)`   (Racket: exception)
     ** `a[25]+5`                   (JavaScript: exception (or NaN))
     ** `a[25]+5`                   (Python: exception)
     ** `$a[25]+5`                  (Perl: 5)
     ** `a[25]+5`                   (C: ___BOOM___)
     => syntax is mostly in the cosmetics department; semantics is the
        real thing.

  ** Another example:
     ** `a + 1 > a`                  (Python: always true)
     ** `(> (+ a 1) a)`              (Racket: always true)
     ** `a + 1 > a`                  (C: sometimes true)

* How should we talk about semantics?

  - A few well-known formalisms for semantics.

  - We will use programs to explain semantics: the best explanation *is*
    a program.

  - Ignore possible philosophical issues with circularity (but be aware
    of them). (Actually, they are solved: Scheme has a formal
    explanation that can be taken as a translation from Scheme to logic,
    which means that things that we write can be translated to logic.)

  - We will use Racket for many reasons (syntax, functional, practical,
    simple, formal, statically typed, environment).


------------------------------------------------------------------------
# Intro to Racket [Tuesday, January 7th]

* General layout of the parts of Racket:

  - The Racket language is (mostly) in the Scheme family, or more
    generally in the Lisp family;

  - Racket: the core language implementation (language and runtime),
    written in C;

  - The actual language(s) that are available in Racket have lots of
    additional parts that are implemented in Racket itself;

  - GRacket: a portable Racket GUI extension, written in Racket too;

  - DrRacket: a GRacket application (also written in Racket);

  - Our language(s)...

* Documentation: the Racket documentation is your friend (But beware
  that some things are provided in different forms from different
  places).


------------------------------------------------------------------------
## Side-note: "Goto Statement Considered Harmful" [Tuesday, January 7th]

> A review of "Goto Statement Considered Harmful", by E.W. DIJKSTRA
>
> This paper tries to convince us that the well-known goto statement
> should be eliminated from our programming languages or, at least
> (since I don't think that it will ever be eliminated), that
> programmers should not use it. It is not clear what should replace it.
> The paper doesn't explain to us what would be the use of the `if`
> statement without a `goto` to redirect the flow of execution: Should
> all our postconditions consist of a single statement, or should we
> only use the arithmetic `if`, which doesn't contain the offensive
> `goto`?
>
> And how will one deal with the case in which, having reached the end
> of an alternative, the program needs to continue the execution
> somewhere else?
>
> The author is a proponent of the so-called "structured programming"
> style, in which, if I get it right, gotos are replaced by indentation.
> Structured programming is a nice academic exercise, which works well
> for small examples, but I doubt that any real-world program will ever
> be written in such a style. More than 10 years of industrial
> experience with Fortran have proved conclusively to everybody
> concerned that, in the real world, the goto is useful and necessary:
> its presence might cause some inconveniences in debugging, but it is a
> de facto standard and we must live with it. It will take more than the
> academic elucubrations of a purist to remove it from our languages.
>
> Publishing this would waste valuable paper: Should it be published, I
> am as sure it will go uncited and unnoticed as I am confident that, 30
> years from now, the goto will still be alive and well and used as
> widely as it is today.
>
> Confidential comments to the editor: The author should withdraw the
> paper and submit it someplace where it will not be peer reviewed. A
> letter to the editor would be a perfect choice: Nobody will notice it
> there!


------------------------------------------------------------------------
# Quick Intro to Racket [Tuesday, January 7th]

Racket syntax: similar to other Sexpr-based languages.

Reminder: the parens can be compared to C/etc function call parens ---
they always mean that some function is applied. This is the reason why
`(+ (1) (2))` won't work: if you use C syntax that is `+(1(), 2())` but
`1` isn't a function so `1()` is an error.

> An important difference between *syntax* and *semantics*: A good way
> to think about this is the difference between the *string* `42` stored
> in a file somewhere (two ASCII values), and the *number* `42` stored
> in memory (in some representation). You could also continue with the
> above example: there is nothing wrong with *"murder"* --- it's just a
> word, but *murder* is something you'll go to jail for.

The evaluation function that Racket uses is actually a function that
takes a piece of syntax and returns (or executes) its semantics.

------------------------------------------------------------------------

`define` expressions are used for creating new bindings, do not try to
use them to change values. For example, you should not try to write
something like `(define x (+ x 1))` in an attempt to mimic `x = x+1`. It
will not work.

------------------------------------------------------------------------

There are two boolean values built in to Racket: `#t` (true) and `#f`
(false). They can be used in `if` statements, for example:

    (if (< 2 3) 10 20)  -->  10

because `(< 2 3)` evaluates to `#t`. As a matter of fact, *any* value
except for `#f` is considered to be true, so:

    (if 0 1 2)        -->  1  ; all of these are "truthy"
    (if "false" 1 2)  -->  1
    (if "" 1 2)       -->  1
    (if null 1 2)     -->  1
    (if #t 1 2)       -->  1  ; including the true value
    (if #f 1 2)       -->  2  ; the only false value
    (if #false 1 2)   -->  2  ; another way to write it
    (if false 1 2)    -->  2  ; also false since it's bound to #f

------------------------------------------------------------------------

Note: Racket is a *functional* language --- so *everything* has a value.

This means that the expression

    (if test consequent)

has no meaning when `test` evaluates to `#f`. This is unlike Pascal/C
where statements *do* something (side effect) like printing or an
assignment --- here an `if` statement with no alternate part will just
*do nothing* if the test is false... Racket, however, must return some
value --- it could decide on simply returning `#f` (or some unspecified
value) as the value of

    (if #f something)

as some implementations do, but Racket just declares it a syntax error.
(As we will see in the future, Racket has a more convenient `when` with
a clearer intention.)

------------------------------------------------------------------------

Well, *almost* everything is a value...

There are certain things that are part of Racket's syntax --- for
example `if` and `define` are special forms, they do not have a value!
More about this shortly.

(Bottom line: much more things do have a value, compared with other
languages.)

------------------------------------------------------------------------

`cond` is used for a `if` … `else if` … `else if` … `else` … sequence.
The problem is that nested `if`s are inconvenient. For example,

    (define (digit-num n)
      (if (<= n 9)
        1
        (if (<= n 99)
          2
          (if (<= n 999)
            3
            (if (<= n 9999)
              4
              "a lot")))))

In C/Java/Whatever, you'd write:

    function digit_num(n) {
      if (n <= 9)         return 1;
      else if (n <= 99)   return 2;
      else if (n <= 999)  return 3;
      else if (n <= 9999) return 4;
      else return "a lot";
    }

(Side question: why isn't there a `return` statement in Racket?)

But trying to force Racket code to look similar:

    (define (digit-num n)
      (if (<= n 9)
        1
      (if (<= n 99)
        2
      (if (<= n 999)
        3
      (if (<= n 9999)
        4
        "a lot")))))

is more than just bad taste --- the indentation rules are there for a
reason, the main one is that you can see the structure of your program
at a quick glance, and this is no longer true in the above code. (Such
code will be penalized!)

So, instead of this, we can use Racket's `cond` statement, like this:

    (define (digit-num n)
      (cond [(<= n 9)    1]
            [(<= n 99)   2]
            [(<= n 999)  3]
            [(<= n 9999) 4]
            [else        "a lot"]))

Note that `else` is a keyword that is used by the `cond` form --- you
should always use an `else` clause (for similar reasons as an `if`, to
avoid an extra expression evaluation there, and we will need it when we
use a typed language). Also note that square brackets are read by
DrRacket like round parens, it will only make sure that the paren pairs
match. We use this to make code more readable --- specifically, there is
a major difference between the above use of `[]` from the conventional
use of `()`. Can you see what it is?

The general structure of a `cond`:

    (cond [test-1 expr-1]
          [test-2 expr-2]
          ...
          [test-n expr-n]
          [else   else-expr])

------------------------------------------------------------------------

Example for using an `if` expression, and a recursive function:

    (define (fact n)
      (if (zero? n)
        1
        (* n (fact (- n 1)))))

Use this to show the different tools, especially:
* special objects that *cannot* be used
* syntax-checker
* stepper
* submission tool (installing, registering and submitting)

An example of converting it to tail recursive form:

    (define (helper n acc)
      (if (zero? n)
        acc
        (helper (- n 1) (* acc n))))

    (define (fact n)
      (helper n 1))

------------------------------------------------------------------------

Additional notes about homework submissions:

* Begin every function with clear documentation: a type followed by a
  purpose statement.

* Document the function when needed, and according to the guidelines
  above and in the style guide.

* After the function, always have a few test cases --- they should cover
  your complete code (make sure to include possible corner cases). Later
  on, we will switch to testing the whole file through it's "public
  interface", instead of testing each function.


------------------------------------------------------------------------
# Lists & Recursion [Tuesday, January 7th]

Lists are a fundamental Racket data type.

A list is defined as either:

1. the empty list (`null`, `empty`, or `'()`),

2. a pair (`cons` cell) of anything and a list.

As simple as this may seem, it gives us precise *formal* rules to prove
that something is a list.

* Why is there a "the" in the first rule?

Examples:

    null
    (cons 1 null)
    (cons 1 (cons 2 (cons 3 null)))
    (list 1 2 3) ; a more convenient function to get the above

List operations --- predicates:

    null? ; true only for the empty list
    pair? ; true for any cons cell
    list? ; this can be defined using the above

We can derive `list?` from the above rules:

    (define (list? x)
      (if (null? x)
        #t
        (and (pair? x) (list? (rest x)))))

or better:

    (define (list? x)
      (or (null? x)
          (and (pair? x) (list? (rest x)))))

But why can't we define `list?` more simply as

    (define (list? x)
      (or (null? x) (pair? x)))

The difference between the above definition and the proper one can be
observed in the full Racket language, not in the student languages
(where there are no pairs with non-list values in their tails).

List operations --- destructors for pairs (`cons` cells):

    first
    rest

Traditionally called `car`, `cdr`.

Also, any `c<x>r` combination for `<x>` that is made of up to four `a`s
and/or `d`s --- we will probably not use much more than `cadr`, `caddr`
etc.

------------------------------------------------------------------------

Example for recursive function involving lists:

    (define (list-length list)
      (if (null? list)
        0
        (+ 1 (list-length (rest list)))))

Use different tools, esp:
* syntax-checker
* stepper

How come we could use `list` as an argument --- use the syntax checker

    (define (list-length-helper list len)
      (if (null? list)
        len
        (list-length-helper (rest list) (+ len 1))))

    (define (list-length list)
      (list-length-helper list 0))

Main idea: lists are a recursive structure, so functions that operate on
lists should be recursive functions that follow the recursive definition
of lists.

Another example for list function --- summing a list of numbers

    (define (sum-list l)
      (if (null? l)
        0
        (+ (first l) (sum-list (rest l)))))

Also show how to implement `rcons`, using this guideline.

------------------------------------------------------------------------

More examples:

Define `reverse` --- solve the problem using `rcons`.

`rcons` can be generalized into something very useful: `append`.

* How would we use `append` instead of `rcons`?

* How much time will this take? Does it matter if we use `append` or
  `rcons`?

Redefine `reverse` using tail recursion.

* Is the result more complex? (Yes, but not too bad because it collects
  the elements in reverse.)


------------------------------------------------------------------------
# Some Style [Tuesday, January 7th]

When you have some common value that you need to use in several places,
it is bad to duplicate it. For example:

    (define (how-many a b c)
      (cond [(> (* b b) (* 4 a c)) 2]
            [(= (* b b) (* 4 a c)) 1]
            [(< (* b b) (* 4 a c)) 0]))

What's bad about it?

* It's longer than necessary, which will eventually make your code less
  readable.

* It's slower --- by the time you reach the last case, you have
  evaluated the two sequences three times.

* It's more prone to bugs --- the above code is short enough, but what
  if it was longer so you don't see the three occurrences on the same
  page? Will you remember to fix all places when you debug the code
  months after it was written?

In general, the ability to use names is probably the most fundamental
concept in computer science --- the fact that makes computer programs
what they are.

We already have a facility to name values: function arguments. We could
split the above function into two like this:

    (define (how-many-helper b^2 4ac) ; note: valid names!
      (cond [(> b^2 4ac) 2]
            [(= b^2 4ac) 1]
            [else        0]))

    (define (how-many a b c)
      (how-many-helper (* b b) (* 4 a c)))

But instead of the awkward solution of coming up with a new function
just for its names, we have a facility to bind local names --- `let`. In
general, the syntax for a `let` special form is

    (let ([id expr] ...) expr)

For example,

    (let ([x 1] [y 2]) (+ x y))

But note that the bindings are done "in parallel", for example, try
this:

    (let ([x 1] [y 2])
      (let ([x y] [y x])
        (list x y)))

(Note that "in parallel" is quoted here because it's not really
parallelism, but just a matter of scopes: the RHSs are all evaluated in
the surrounding scope!)

Using this for the above problem:

    (define (how-many a b c)
      (let ([b^2 (* b b)]
            [4ac (* 4 a c)])
        (cond [(> b^2 4ac) 2]
              [(= b^2 4ac) 1]
              [else        0])))

------------------------------------------------------------------------

* Some notes on writing code (also see the style-guide in the handouts
  section)

* ***Code quality will be graded to in this course!***

* Use abstractions whenever possible, as said above. This is bad:

      (define (how-many a b c)
        (cond
          [(> (* b b) (* 4 a c)) 2]
          [(= (* b b) (* 4 a c)) 1]
          [(< (* b b) (* 4 a c)) 0]))

      (define (what-kind a b c)
        (cond
          [(= a 0) 'degenerate]
          [(> (* b b) (* 4 a c)) 'two]
          [(= (* b b) (* 4 a c)) 'one]
          [(< (* b b) (* 4 a c)) 'none]))

* But don't over abstract: `(define one 1)` or `(define two "two")`

* Always do test cases, you might want to comment them, but you should
  always make sure your code works. Use DrRacket's covergae features to
  ensure complete coverage.

* Do not under-document, but also don't over-document.

* ***INDENTATION!***  (Let DrRacket decide; get used to its rules)
  --> This is part of the culture that was mentioned last time, but it's
      done this way for good reason: decades of programming experience
      have shown this to be the most readable format. It's also
      extremely important to keep good indentation since programmers in
      all Lisps don't count parens --- they look at the structure.

* As a general rule, `if` should be either all on one line, or the
  condition on the first and each consequent on a separate line.
  Similarly for `define` --- either all on one line or a newline after
  the object that is being define (either an identifier or a an
  identifier with arguments).

* Another general rule: you should never have white space after an
  open-paren, or before a close paren (white space includes newlines).
  Also, before an open paren there should be either another open paren
  or white space, and the same goes for after a closing paren.

* Use the tools that are available to you: for example, use `cond`
  instead of nested `if`s (definitely do not force the indentation to
  make a nested `if` look like its C counterpart --- remember to let
  DrRacket indent for you).

  Another example --- do not use `(+ 1 (+ 2 3))` instead of `(+ 1 2 3)`
  (this might be needed in *extremely* rare situations, only when you
  know your calculus and have extensive knowledge about round-off
  errors).

  Another example --- do not use `(cons 1 (cons 2 (cons 3 null)))`
  instead of `(list 1 2 3)`.

  Also --- don't write things like:

      (if (< x 100) #t #f)

  since it's the same as just

      (< x 100)

  A few more of these:

      (if x #t y)   --same-as-->  (or x y)             ; (almost)
      (if x y #f)   --same-as-->  (and x y)            ; (exacly same)
      (if x #f #t)  --same-as-->  (not x)              ; (almost)

  (Actually the first two are almost the same, for example, `(and 1 2)`
  will return `2`, not `#t`.)

* Use these as examples for many of these issues:

      (define (interest x)
        (* x (cond
          [(and (> x 0) (<= x 1000)) 0.04]
          [(and (> x 1000) (<= x 5000)) 0.045]
          [else 0.05])))

      (define (how-many a b c)
        (cond ((> (* b b) (* (* 4 a) c))
               2)
              ((< (* b b) (* (* 4 a) c))
               0)
              (else
               1)))

      (define (what-kind a b c)
        (if (equal? a 0) 'degenerate
            (if (equal? (how-many a b c) 0) 'zero
                (if (equal? (how-many a b c) 1) 'one
                    'two)
                )
            )
        )

      (define (interest deposit)
        (cond
        [(< deposit 0) "invalid deposit"]
        [(and (>= deposit 0) (<= deposit 1000)) (* deposit 1.04) ]
        [(and (> deposit 1000) (<= deposit 5000)) (* deposit 1.045)]
        [(> deposit 5000) (* deposit 1.05)]))

      (define (interest deposit)
        (if (< deposit 1001) (* 0.04 deposit)
          (if (< deposit 5001) (* 0.045 deposit)
          (* 0.05 deposit))))

      (define (what-kind a b c) (cond ((= 0 a) 'degenerate)
                                (else (cond ((> (* b b)(*(* 4 a) c)) 'two)
                                     (else (cond ((= (* b b)(*(* 4 a) c)) 'one)
                                     (else 'none)))))));


------------------------------------------------------------------------
# Tail calls [Tuesday, January 7th]

You should generally know what tail calls are, but here's a quick review
of the subject. A function call is said to be in tail position if there
is no context to "remember" when you're calling it. Very roughly, this
means that function calls that are not nested as argument expressions of
another *call* are tail calls. Pay attention that we're talking about
*function calls*, not, for example, being nested in an `if` expression
since that's not a function. (The same holds for `cond`, `and`, `or`.)

This definition is something that depends on the context, for example,
in an expression like

    (if (huh?)
      (foo (add1 (* x 3)))
      (foo (/ x 2)))

both calls to `foo` are tail calls, but they're tail calls of *this*
expression and therefore apply to *this* context. It might be that this
code is inside another call, as in

    (blah (if (huh?)
            (foo (add1 (* x 3)))
            (foo (/ x 2)))
          something-else)

and the `foo` calls are now *not* in tail position. The main feature of
all Scheme implementations including Racket (and including Javascript)
WRT tail calls is that calls that are in tail position of a function are
said to be "eliminated". That means that if we're in an `f` function,
and we're about to call `g` in tail position and therefore whatever `g`
returns would be the result of `f` too, then when Racket does the call
to `g` it doesn't bother keeping the `f` context --- it won't remember
that it needs to "return" to `f` and will instead return straight to its
caller. In other words, when you think about a conventional
implementation of function calls as frames on a stack, Racket will get
rid of a stack frame when it can.

You can also try this with any code in DrRacket: hovering over the paren
that starts a function call will show a faint pinkish arrow showing the
tail-call chain from there for call that are actually tail calls. This
is a simple feature since tail calls are easily identifiable by just
looking at the syntax of a function.

Another way to see this is to use DrRacket's stepper to step through a
function call. The stepper is generally an alternative debugger, where
instead of visualizing stack frames it assembles an expression that
represents these frames. Now, in the case of tail calls, there is no
room in such a representation to keep the call --- and the thing is that
in Racket that's perfectly fine since these calls are not kept on the
call stack.

Note that there are several names for this feature:

* "Tail recursion". This is a common way to refer to the more limited
  optimization of *only* tail-recursive functions into loops. In
  languages that have tail calls as a feature, this is too limited,
  since they also optimize cases of mutual recursion, or any case of a
  tail call.

* "Tail call optimization". In some languages, or more specifically in
  some compilers, you'll hear this term. This is fine when tail calls
  are considered only an "optimization" --- but in Racket's case (as
  well as Scheme), it's more than just an optimization: it's a *language
  feature* that you can rely on. For example, a tail-recursive function
  like `(define (loop) (loop))` *must* run as an infinite loop, not just
  optimized to one when the compiler feels like it.

* "Tail call elimination". This is the so far the most common proper
  name for the feature: it's not just recursion, and it's not an
  optimization.

### When should you use tail calls? ####################################

Often, people who are aware of tail calls will try to use them *always*.
That's not always a good idea. You should generally be aware of the
tradeoffs when you consider what style to use. The main thing to
remember is that tail-call elimination is a property that helps reducing
*space* use (stack space) --- often reducing it from linear space to
constant space. This can obviously make things faster, but usually the
speedup is just a constant factor since you need to do the same number
of iterations anyway, so you just reduce the time spent on space
allocation.

Here is one such example that we've seen:

    (define (list-length-1 list)
      (if (null? list)
        0
        (+ 1 (list-length-1 (rest list)))))

    ;; versus

    (define (list-length-helper list len)
      (if (null? list)
        len
        (list-length-helper (rest list) (+ len 1))))
    (define (list-length-2 list)
      (list-length-helper list 0))

In this case the first (recursive) version version consumes space linear
to the length of the list, whereas the second version needs only
constant space. But if you consider only the asymptotic runtime, they
are both *O(length(`l`))*.

A second example is a simple implementation of `map`:

    (define (map-1 f l)
      (if (null? l) l (cons (f (first l)) (map-1 f (rest l)))))

    ;; versus

    (define (map-helper f l acc)
      (if (null? l)
        (reverse acc)
        (map-helper f (rest l) (cons (f (first l)) acc))))
    (define (map-2 f l)
      (map-helper f l '()))

In this case, both the asymptotic space and the runtime consumption are
the same. In the recursive case we have a constant factor for the stack
space, and in the iterative one (the tail-call version) we also have a
similar factor for accumulating the reversed list. In this case, it is
probably better to keep the first version since the code is simpler. In
fact, Racket's stack space management can make the first version run
faster than the second --- so optimizing it into the second version is
useless.


------------------------------------------------------------------------
## Sidenote on Types [Tuesday, January 7th]

> Note: this is all just a side note for a particularly hairy example.
> You don't need to follow all of this to write code in this class!
> Consider this section a kind of an extra type-related puzzle to read
> trough, and maybe get back to it much later, after we cover
> typechecking.

Types can become interestingly complicated when dealing with
higher-order functions. Specifically, the nature of the type system used
by Typed Racket makes it have one important weakness: it often fails to
infer types when there are higher-order functions that operate on
polymorphic functions.

For example, consider how `map` receives a function and a list of some
type, and applies the function over this list to accumulate its output,
so it's a polymorphic function with the following type:

    map : (A -> B) (Listof A) -> (Listof B)

But Racket's `map` is actually more flexible that that: it can take more
than a single list input, in which case it will apply the function on
the first element in all lists, then the second and so on. Narrowing our
vision to the two-input-lists case, the type of `map` then becomes:

    map : (A B -> C) (Listof A) (Listof B) -> (Listof C)

Now, here's a hairy example --- what is the type of this function:

    (define (foo x y)
      (map map x y))

Begin by what we know --- both `map`s, call them `map1` and `map2`, have
the double- and single-list types of `map` respectively, here they are,
with different names for types:

    ;; the first `map', consumes a function and two lists
    map1 : (A B -> C) (Listof A) (Listof B) -> (Listof C)

    ;; the second `map', consumes a function and one list
    map2 : (X -> Y) (Listof X) -> (Listof Y)

Now, we know that `map2` is the first argument to `map1`, so the type of
`map1`s first argument should be the type of `map2`:

    (A B -> C) = (X -> Y) (Listof X) -> (Listof Y)

From here we can conclude that

    A = (X -> Y)

    B = (Listof X)

    C = (Listof Y)

If we use these equations in `map1`'s type, we get:

    map1 : ((X -> Y) (Listof X) -> (Listof Y))
           (Listof (X -> Y))
           (Listof (Listof X))
           -> (Listof (Listof Y))

Now, `foo`'s two arguments are the 2nd and 3rd arguments of `map1`, and
its result is `map1`s result, so we can now write our "estimated" type
of `foo`:

    (: foo : (Listof (X -> Y))
             (Listof (Listof X))
             -> (Listof (Listof Y)))
    (define (foo x y)
      (map map x y))

This should help you understand why, for example, this will cause a type
error:

    (foo (list add1 sub1 add1) (list 1 2 3))

and why this is valid:

    (foo (list add1 sub1 add1) (map list (list 1 2 3)))

***But...!***

There's a big "but" here which is that weakness of Typed Racket that was
mentioned. If you try to actually write such a defninition in `#lang pl`
(which is based on Typed Racket), you will first find that you need to
explicitly list the type variable that are needed to make it into a
generic type. So the above becomes:

    (: foo : (All (X Y)
               (Listof (X -> Y))
               (Listof (Listof X))
               -> (Listof (Listof Y))))
    (define (foo x y)
      (map map x y))

But not only does that not work --- it throws an obscure type error.
That error is actually due to TR's weakness: it's a result of not being
able to infer the proper types. In such cases, TR has two mechanisms to
"guide it" in the right direction. The first one is `inst`, which is
used to instantiate a generic (= polymorphic) type some actual type. The
problem here is with the second `map` since that's the polymorphic
function that is given to a higher-order function (the first `map`). If
we provide the types to instantiate this, it will work fine:

    (: foo : (All (X Y)
               (Listof (X -> Y))
               (Listof (Listof X))
               -> (Listof (Listof Y))))
    (define (foo x y)
      (map (inst map Y X) x y))

Now, you can use this definition to run the above example:

    (foo (list add1 sub1 add1) (list (list 1) (list 2) (list 3)))

This example works fine, but that's because we wrote the list argument
explicitly. If you try to use the exact example above,

    (foo (list add1 sub1 add1) (map list (list 1 2 3)))

you'd run into the same problem again, since this also uses a
polymorphic function (`list`) with a higher-order one (`map`). Indeed,
an `inst` can make this work for this too:

    (foo (list add1 sub1 add1) (map (inst list Number) (list 1 2 3)))

The second facility is `ann`, which can be used to annotate an
expression with the type that you expect it to have.

    (define (foo x y)
      (map (ann map ((X -> Y) (Listof X) -> (Listof Y)))
           x y))

(Note: this is not type casting! It's using a different type which is
also applicable for the given expression, and having the type checker
validate that this is true. TR does have a similar `cast` form, which is
used for a related but different cases.)

This tends to be more verbose than `inst`, but is sometimes easier to
follow, since the expected type is given explicitly. The thing about
`inst` is that it's kind of "applying" a polymorphic `(All (A B) ...)`
type, so you need to know the order of the `A B` arguments, which is why
in the above we use `(inst map Y X)` rather than `(inst map X Y)`.

> Again, remember that this is all not something that you need to know.
> We will have a few (very rare) cases where we'll need to use `inst`,
> and in each of these, you'll be told where and how to use it.


------------------------------------------------------------------------
## Side-note: Names are important [Tuesday, January 7th]

An important "discovery" in computer science is that we *don't* need
names for every intermediate sub-expression --- for example, in almost
any language we can write something like:

    s = (-b + sqrt(b^2 - 4*a*c)) / (2*a)

instead of

    x₁ = b * b
    y₁ = 4 * a
    y₂ = y * c
    x₂ = x - y
    x₃ = sqrt(x)
    y₃ = -b
    x₄ = y + x
    y₄ = 2 * a
    s  = x / y

Such languages are put in contrast to assembly languages, and were all
put under the generic label of "high level languages".

(Here's an interesting idea --- why not do the same for function
values?)


------------------------------------------------------------------------
# BNF, Grammars, the AE Language [Tuesday, January 14th]

Getting back to the theme of the course: we want to investigate
programming languages, and we want to do that *using* a programming
language.

The first thing when we design a language is to specify the language.
For this we use BNF (Backus-Naur Form). For example, here is the
definition of a simple arithmetic language:

    <AE> ::= <num>
           | <AE> + <AE>
           | <AE> - <AE>

Explain the different parts. Specifically, this is a mixture of
low-level (concrete) syntax definition with parsing.

We use this to derive expressions in some language. We start with
`<AE>`, which should be one of these:
* a number `<num>`
* an `<AE>`, the text "`+`", and another `<AE>`
* the same but with "`-`"

`<num>` is a terminal: when we reach it in the derivation, we're done.
`<AE>` is a non-terminal: when we reach it, we have to continue with one
of the options. It should be clear that the `+` and the `-` are things
we expect to find in the input --- because they are not wrapped in
`<>`s.

We could specify what `<num>` is (turning it into a `<NUM>`
non-terminal):

    <AE> ::= <NUM>
           | <AE> + <AE>
           | <AE> - <AE>

    <NUM> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
            | <NUM> <NUM>

But we don't --- why? Because in Racket we have numbers as primitives
and we want to use Racket to implement our languages. This makes life a
lot easier, and we get free stuff like floats, rationals etc.

To use a BNF formally, for example, to prove that `1-2+3` is a valid
`<AE>` expression, we first label the rules:

    <AE> ::= <num>        (1)
           | <AE> + <AE>  (2)
           | <AE> - <AE>  (3)

and then we can use them as formal justifications for each derivation
step:

    <AE>
    <AE> + <AE>         ; (2)
    <AE> + <num>        ; (1)
    <AE> - <AE> + <num> ; (3)
    <AE> - <AE> + 3     ; (num)
    <num> - <AE> + 3    ; (1)
    <num> - <num> + 3   ; (1)
    1 - <num> + 3       ; (num)
    1 - 2 + 3           ; (num)

This would be one way of doing this. Alternatively, we can can visualize
the derivation using a tree, with the rules used at the nodes.

These specifications suffer from being ambiguous: an expression can be
derived in multiple ways. Even the little syntax for a number is
ambiguous --- a number like `123` can be derived in two ways that result
in trees that look different. This ambiguity is not a "real" problem
now, but it will become one very soon. We want to get rid of this
ambiguity, so that there is a single (= deterministic) way to derive all
expressions.

There is a standard way to resolve that --- we add another non-terminal
to the definition, and make it so that each rule can continue to exactly
one of its alternatives. For example, this is what we can do with
numbers:

    <NUM>   ::= <DIGIT> | <DIGIT> <NUM>

    <DIGIT> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Similar solutions can be applied to the `<AE>` BNF --- we either
restrict the way derivations can happen or we come up with new
non-terminals to force a deterministic derivation trees.

As an example of restricting derivations, we look at the current
grammar:

    <AE> ::= <num>
           | <AE> + <AE>
           | <AE> - <AE>

and instead of allowing an `<AE>` on both sides of the operation, we
force one to be a number:

    <AE> ::= <num>
           | <num> + <AE>
           | <num> - <AE>

Now there is a single way to derive any expression, and it is always
associating operations to the right: an expression like `1+2+3` can only
be derived as `1+(2+3)`. To change this to left-association, we would
use this:

    <AE> ::= <num>
           | <AE> + <num>
           | <AE> - <num>

But what if we want to force precedence? Say that our AE syntax has
addition and multiplication:

    <AE> ::= <num>
           | <AE> + <AE>
           | <AE> * <AE>

We can do that same thing as above and add new non-terminals --- say one
for "products":

    <AE>   ::= <num>
             | <AE> + <AE>
             | <PROD>

    <PROD> ::= <num>
             | <PROD> * <PROD>

Now we must parse any AE expression as additions of multiplications (or
numbers). First, note that if `<AE>` goes to `<PROD>` and that goes to
`<num>`, then there is no need for an `<AE>` to go to a `<num>`, so this
is the same syntax:

    <AE>   ::= <AE> + <AE>
             | <PROD>

    <PROD> ::= <num>
             | <PROD> * <PROD>

Now, if we want to still be able to multiply additions, we can force
them to appear in parentheses:

    <AE>   ::= <AE> + <AE>
             | <PROD>

    <PROD> ::= <num>
             | <PROD> * <PROD>
             | ( <AE> )

Next, note that `<AE>` is still ambiguous about additions, which can be
fixed by forcing the left hand side of an addition to be a factor:

    <AE>   ::= <PROD> + <AE>
             | <PROD>

    <PROD> ::= <num>
             | <PROD> * <PROD>
             | ( <AE> )

We still have an ambiguity for multiplications, so we do the same thing
and add another non-terminal for "atoms":

    <AE>   ::= <PROD> + <AE>
             | <PROD>

    <PROD> ::= <ATOM> * <PROD>
             | <ATOM>

    <ATOM> ::= <num>
             | ( <AE> )

And you can try to derive several expressions to be convinced that
derivation is always deterministic now.

But as you can see, this is exactly the cosmetics that we want to avoid
--- it will lead us to things that might be interesting, but unrelated
to the principles behind programming languages. It will also become much
much worse when we have a real language rather such a tiny one.

Is there a good solution? --- It is right in our face: do what Racket
does --- always use fully parenthesized expressions:

    <AE> ::= <num>
           | ( <AE> + <AE> )
           | ( <AE> - <AE> )

To prevent confusing Racket code with code in our language(s), we also
change the parentheses to curly ones:

    <AE> ::= <num>
           | { <AE> + <AE> }
           | { <AE> - <AE> }

But in Racket *everything* has a value --- including those `+`s and
`-`s, which makes this extremely convenient with future operations that
might have either more or less arguments than 2 as well as treating
these arithmetic operators as plain functions. In our toy language we
will not do this initially (that is, `+` and `-` are second order
operators: they cannot be used as values). But since we will get to it
later, we'll adopt the Racket solution and use a fully-parenthesized
prefix notation:

    <AE> ::= <num>
           | { + <AE> <AE> }
           | { - <AE> <AE> }

(Remember that in a sense, Racket code is written in a form of
already-parsed syntax...)


------------------------------------------------------------------------
# Simple Parsing [Tuesday, January 14th]

On to an implementation of a "parser":

Unrelated to what the syntax actually looks like, we want to parse it as
soon as possible --- converting the concrete syntax to an abstract
syntax tree.

No matter how we write our syntax:
- `3+4`     (infix),
- `3 4 +`   (postfix),
- `+(3,4)`  (prefix with args in parens),
- `(+ 3 4)` (parenthesized prefix),

we always mean the same abstract thing --- adding the number `3` and the
number `4`. The essence of this is basically a tree structure with an
addition operation as the root and two leaves holding the two numerals.

With the right data definition, we can describe this in Racket as the
expression `(Add (Num 3) (Num 4))` where `Add` and `Num` are
constructors of a tree type for syntax, or in a C-like language, it
could be something like `Add(Num(3),Num(4))`.

Similarly, the expression `(3-4)+7` will be described in Racket as the
expression:

    (Add (Sub (Num 3) (Num 4)) (Num 7))

Important note: "expression" was used in two *different* ways in the
above --- each way corresponds to a different language, and the result
of evaluating the second "expression" is a Racket value that
*represents* the first expression.

To define the data type and the necessary constructors we will use this:

    (define-type AE
      [Num Number]
      [Add AE AE]
      [Sub AE AE])

* Note --- Racket follows the tradition of Lisp which makes syntax
  issues almost negligible --- the language we use is almost as if we
  are using the parse tree directly. Actually, it is a very simple
  syntax for parse trees, one that makes parsing extremely easy.

  [This has an interesting historical reason... Some Lisp history ---
  *M-expressions* vs. *S-expressions*, and the fact that we write code
  that is isomorphic to an AST. Later we will see some of the advantages
  that we get by doing this. See also "*The Evolution of Lisp*", section
  3.5.1. Especially the last sentence:

  > Therefore we expect future generations of Lisp programmers to
  > continue to reinvent Algol-style syntax for Lisp, over and over and
  > over again, and we are equally confident that they will continue,
  > after an initial period of infatuation, to reject it. (Perhaps this
  > process should be regarded as a rite of passage for Lisp hackers.)

  And an interesting & modern *counter*-example of this [here](
  https://ts-ast-viewer.com/#code/DYUwLgBAghC8EEYDcAoFYCeAHE06NSA).]

To make things very simple, we will use the above fact through a
double-level approach:

* we first "parse" our language into an intermediate representation ---
  a Racket list --- this is mostly done by a modified version of
  Racket's `read` function that uses curly `{}` braces instead of round
  `()` parens,

* then we write our own `parse` function that will parse the resulting
  list into an instance of the `AE` type --- an abstract syntax tree
  (AST).

This is achieved by the following simple recursive function:

    (: parse-sexpr : Sexpr -> AE)
    ;; parses s-expressions into AEs
    (define (parse-sexpr sexpr)
      (cond [(number? sexpr) (Num sexpr)]
            [(and (list? sexpr) (= 3 (length sexpr)))
             (let ([make-node
                    (match (first sexpr)
                      ['+ Add]
                      ['- Sub]
                      [else (error 'parse-sexpr "unknown op: ~s"
                                   (first sexpr))])
                    #| the above is the same as:
                    (cond [(equal? '+ (first sexpr)) Add]
                          [(equal? '- (first sexpr)) Sub]
                          [else (error 'parse-sexpr "unknown op: ~s"
                                       (first sexpr))])
                    |#])
               (make-node (parse-sexpr (second sexpr))
                          (parse-sexpr (third sexpr))))]
            [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

This function is pretty simple, but as our languages grow, they will
become more verbose and more difficult to write. So, instead, we use a
new special form: `match`, which is matching a value and binds new
identifiers to different parts (try it with "Check Syntax"). Re-writing
the above code using `match`:

    (: parse-sexpr : Sexpr -> AE)
    ;; parses s-expressions into AEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n) (Num n)]
        [(list '+ left right)
         (Add (parse-sexpr left) (parse-sexpr right))]
        [(list '- left right)
         (Sub (parse-sexpr left) (parse-sexpr right))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

And finally, to make it more uniform, we will combine this with the
function that parses a string into a sexpr so we can use strings to
represent our programs:

    (: parse : String -> AE)
    ;; parses a string containing an AE expression to an AE
    (define (parse str)
      (parse-sexpr (string->sexpr str)))


------------------------------------------------------------------------
# The `match` Form [Tuesday, January 14th]

The syntax for `match` is

    (match value
      [pattern result-expr]
      ...)

The value is matched against each pattern, possibly binding names in the
process, and if a pattern matches it evaluates the result expression.
The simplest form of a pattern is simply an identifier --- it always
matches and binds that identifier to the value:

    (match (list 1 2 3)
      [x x]) ; evaluates to the list

Another simple pattern is a quoted symbol, which matches that symbol.
For example:

    (match foo
      ['x "yes"]
      [else "no"])

will evaluate to `"yes"` if `foo` is the symbol `x`, and to `"no"`
otherwise. Note that `else` is not a keyword here --- it happens to be a
pattern that always succeeds, so it behaves like an else clause except
that it binds `else` to the unmatched-so-far value.

Many patterns look like function application --- but don't confuse them
with applications. A `(list x y z)` pattern matches a list of exactly
three items and binds the three identifiers; or if the "arguments" are
themselves patterns, `match` will descend into the values and match them
too. More specifically, this means that patterns can be nested:

    (match (list 1 2 3)
      [(list x y z) (+ x y z)]) ; evaluates to 6
    (match (list 1 2 3)
      [(cons x (list y z)) (+ x y z)]) ; matches the same shape (also 6)
    (match '((1) (2) 3)
      [(list (list x) (list y) z) (+ x y z)]) ; also 6

As seen above, there is also a `cons` pattern that matches a non-empty
list and then matches the first part against the head for the list and
the second part against the tail of the list.

In a `list` pattern, you can use `...` to specify that the previous
pattern is repeated zero or more times, and bound names get bound to the
list of respective matching. One simple consequent is that the `(list hd
tl ...)` pattern is exactly the same as `(cons hd tl)`, but being able
to repeat an arbitrary pattern is very useful:

    > (match '((1 2) (3 4) (5 6) (7 8))
        [(list (list x y) ...) (list x y)])
    '((1 3 5 7) (2 4 6 8))

A few more useful patterns:

    id              -- matches anything, binds `id' to it
    _               -- matches anything, but does not bind
    (number: n)     -- matches any number and binds it to `n'
    (symbol: s)     -- same for symbols
    (string: s)     -- strings
    (sexpr: s)      -- S-expressions (needed sometimes for Typed Racket)
    (and pat1 pat2) -- matches both patterns
    (or pat1 pat2)  -- matches either pattern (careful with bindings)

Note that the `foo:` patterns are all specific to our `#lang pl`, they
are not part of `#lang racket` or `#lang typed/racket`.

The patterns are tried one by one *in-order*, and if no pattern matches
the value, an error is raised.

Note that `...` in a `list` pattern can follow *any* pattern, including
all of the above, and including nested list patterns.

Here are a few examples --- you can try them out with `#lang pl untyped`
at the top of the definitions window. This:

    (match x
      [(list (symbol: syms) ...) syms])

matches `x` against a pattern that accepts only a list of symbols, and
binds `syms` to those symbols. If you want to match only a list of, say,
one or more symbols, then just add one before the `...`-ed pattern
variable:

    (match x
      [(list (symbol: sym) (symbol: syms) ...) syms])
    ;; same as:
    (match x
      [(cons (symbol: sym) (list (symbol: syms) ...)) syms])

which will match such a non-empty list, where the whole list (on the
right hand side) is `(cons sym syms)`.

Here's another example that matches a list of any number of lists, where
each of the sub-lists begins with a symbol and then has any number of
numbers. Note how the `n` and `s` bindings get values for a list of all
symbols and a list of lists of the numbers:

    > (define (foo x)
        (match x
          [(list (list (symbol: s) (number: n) ...) ...)
           (list 'symbols: s 'numbers: n)]))
    > (foo (list (list 'x 1 2 3) (list 'y 4 5)))
    '(symbols: (x y) numbers: ((1 2 3) (4 5)))

Here is a quick example for how `or` is used with two literal
alternatives, how `and` is used to name a specific piece of data, and
how `or` is used with a binding:

    > (define (foo x)
        (match x
          [(list (or 1 2 3)) 'single]
          [(list (and x (list 1 _)) 2) x]
          [(or (list 1 x) (list 2 x)) x]))
    > (foo (list 3))
    'single
    > (foo (list (list 1 99) 2))
    '(1 99)
    > (foo (list 1 10))
    10
    > (foo (list 2 10))
    10


------------------------------------------------------------------------
# The `define-type` Form [Tuesday, January 14th]

The class language that we're using, `#lang pl`, is based on *Typed
Racket*: a statically-typed dialect of Racket. It is not exactly the
same as Typed Racket --- it is restricted in many ways, and extended in
a few ways. (You should therefore try to avoid looking at the Typed
Racket documentation and expect things to be the same in `#lang pl`.)

The most important extension is `define-type`, which is the construct we
will be using to create new user-defined types. In general, such
definitions looks like what we just used:

    (define-type AE
      [Num Number]
      [Add AE AE]
      [Sub AE AE])

This defines a *new type* called `AE`, an `AE?` predicate for this type,
and a few *variants* for this type: `Num`, `Add`, and `Sub` in this
case. Each of these variant names is a constructor, taking in arguments
with the listed types, where these types can include the newly defined
type itself in (the very common) case we're defining a recursive type.
The return type is always the newly defined type, `AE` here.

To summarize, this definition gives us a new `AE` type, and three
constructors, as if we wrote the following type declarations:

* `(: Num : Number -> AE)`
* `(: Add : AE AE -> AE)`
* `(: Sub : AE AE -> AE)`

The newly defined types are known as *"disjoint unions"*, since values
in these types are disjoint --- there is no overlap between the
different variants. As we will see, this is what makes this such a
useful construct for our needs: the compiler knows about the variants of
each newly defined type, which will make it possible for it to complain
if we extend a type with more variants but not update all uses of the
type.

Furthermore, since the return types of these constructors are all the
new type itself, there is *no way* for us to write code that expects
just *one* of these variants. We will use a second form, `cases`, to
handle these values.


------------------------------------------------------------------------
# The `cases` Form [Tuesday, January 14th]

A `define-type` declaration defines *only* what was described above: one
new type name and a matching predicate, and a few variants as
constructor functions. Unlike HtDP, we don't get predicates for each of
the variants, and we don't get accessor functions for the fields of the
variants.

The way that we handle the new kind of values is with `cases`: this is a
form that is very similar to `match`, but is specific to instances of
the user-defined type.

> Many students find it confusing to distinguish `match` and `cases`
> since they are so similar. Try to remember that `match` is for
> primitive Racket values (we'll mainly use them for S-expression
> values), while `cases` is for user-defined values. The distinction
> between the two forms is unfortunate, and doesn't serve any purpose.
> It is just technically difficult to unify the two.

For example, code that handles `AE` values (as defined above) can look
as follows:

    (cases some-ae-value
      [(Num n)   "a number"]
      [(Add l r) "an addition"]
      [(Sub l r) "a subtraction"])

As you can see, we need to have patterns for each of the listed variants
(and the compiler will throw an error if some are missing), and each of
these patterns specifies bindings that will get the field values
contained in a given variant object.

We can also use nested patterns:

    (cases some-ae-value
      [(Num n)               "a number"]
      [(Add (Num m) (Num n)) "a simple addition"]
      [(Add l r)             "an addition"]
      [(Sub (Num m) (Num n)) "a simple subtraction"]
      [(Sub l r)             "a subtraction"])

but this is a feature that we will not use too often.

The final clause in a `cases` form can be an `else` clause, which serves
as a fallback in case none of the previous clauses matched the input
value. However, using an `else` like this is ***strongly discouraged!***
The problem with using it is that it effectively eliminates the
advantage in getting the type-checker to complain when a type definition
is extended with new variants. Using these `else` clauses, we can
actually mimic all of the functionality that you expect in HtDP-style
code, which demonstrates that this is equivalent to HtDP-style
definitions. For example:

    (: Add? : AE -> Boolean)
    ;; identifies instances of the `Add` variant
    (define (Add? ae)
      (cases ae
        [(Add l r) #t]
        [else #f]))

    (: Add-left : AE -> AE)
    ;; get the left-hand subexpression of an addition
    (define (Add-left ae)
      (cases ae
        [(Add l r) l]
        [else (error 'Add-left "expecting an Add value, got ~s" ae)]))

    ...

***Important reminder:*** this is code that ***you should not write!***
Doing so will lead to code that is more fragile than just using `cases`,
since you'd be losing the protection the compiler gives you in the form
of type errors on occurrences of `cases` that need to be updated when a
type is extended with new variants. You would therefore end up writing a
bunch of boiler-plate code only to end up with lower-quality code. The
core of the problem is in the prevalent use of `else` which gives up
that protection.

In these examples the `else` clause is justified because even if `AE` is
extended with new variants, functions like `Add?` and `Add-left` should
not be affected and treat the new variants as they treat all other
non-`Add` instances. (And since `else` is inherent to these functions,
using them in our code is inherently a bad idea.) We will, however, have
a few (*very few!*) places where we'll need to use `else` --- but this
will always be done only on some specific functionality rather than a
wholesale approach of defining a different interface for user-defined
types.


------------------------------------------------------------------------
# Semantics (= Evaluation) [Tuesday, January 14th]

> [PLAI §2]

Back to BNF --- now, meaning.

An important feature of these BNF specifications: we can use the
derivations to specify *meaning* (and meaning in our context is
"running" a program (or "interpreting", "compiling", but we will use
"evaluating")). For example:

    <AE> ::= <num>         ; <AE> evaluates to the number
           | <AE1> + <AE2> ; <AE> evaluates to the sum of evaluating
                           ;      <AE1> and <AE2>
           | <AE1> - <AE2> ; ... the subtraction of <AE2> from <AE1>
                                   (... roughly!)

To do this a little more formally:

    a. eval(<num>) = <num> ;*** special rule: translate syntax to value
    b. eval(<AE1> + <AE2>) = eval(<AE1>) + eval(<AE2>)
    c. eval(<AE1> - <AE2>) = eval(<AE1>) - eval(<AE2>)

Note the completely different roles of the two `+`s and `-`s. In fact,
it might have been more correct to write:

    a. eval("<num>") = <num>
    b. eval("<AE1> + <AE2>") = eval("<AE1>") + eval("<AE2>")
    c. eval("<AE1> - <AE2>") = eval("<AE1>") - eval("<AE2>")

or even using a marker to denote meta-holes in these strings:

    a. eval("$<num>") = <num>
    b. eval("$<AE1> + $<AE2>") = eval("$<AE1>") + eval("$<AE2>")
    c. eval("$<AE1> - $<AE2>") = eval("$<AE1>") - eval("$<AE2>")

but we will avoid pretending that we're doing that kind of string
manipulation. (For example, it will require specifying what does it mean
to return `<num>` for `$<num>` (involves `string->number`), and the
fragments on the right side mean that we need to specify these as
substring operations.)

Note that there's a similar kind of informality in our BNF
specifications, where we assume that `<foo>` refers to some terminal or
non-terminal. In texts that require more formal specifications (for
example, in RFC specifications), each literal part of the BNF is usually
double-quoted, so we'd get

    <AE> ::= <num> | <AE1> "+" <AE2> | <AE1> "-" <AE2>

An alternative popular notation for `eval(X)` is `⟦X⟧`:

    a. [[<num>]] = <num>
    b. [[<AE1> + <AE2>]] = [[<AE1>]] + [[<AE2>]]
    c. [[<AE1> - <AE2>]] = [[<AE1>]] - [[<AE2>]]

Is there a problem with this definition? Ambiguity:

    eval(1 - 2 + 3) = ?

Depending on the way the expression is parsed, we can get either a
result of `2` or `-4`:

    eval(1 - 2 + 3) = eval(1 - 2) + eval(3)          [b]
                    = eval(1) - eval(2) + eval(3)    [c]
                    = 1 - 2 + 3                      [a,a,a]
                    = 2

    eval(1 - 2 + 3) = eval(1) - eval(2 + 3)          [c]
                    = eval(1) - (eval(2) + eval(3))  [a]
                    = 1 - (2 + 3)                    [a,a,a]
                    = -4

Again, be very aware of confusing subtleties which are extremely
important: We need parens around a sub-expression only in one side, why?
--- When we write:

    eval(1 - 2 + 3) = ... = 1 - 2 + 3

we have two expressions, but one stands for an *input syntax*, and one
stands for a real mathematical expression.

In a case of a computer implementation, the syntax on the left is (as
always) an AE syntax, and the real expression on the right is an
expression in whatever language we use to implement our AE language.

Like we said earlier, ambiguity is not a real problem until the actual
parse tree matters. With `eval` it definitely matters, so we must not
make it possible to derive any syntax in multiple ways or our evaluation
will be non-deterministic.

------------------------------------------------------------------------

Quick exercise:

We can define a meaning for `<digit>`s and then `<num>`s in a similar
way:

    <NUM> ::= <digit> | <digit> <NUM>

    eval(0) = 0
    eval(1) = 1
    eval(2) = 2
    ...
    eval(9) = 9

    eval(<digit>) = <digit>
    eval(<digit> <NUM>) = 10*eval(<digit>) + eval(<NUM>)

Is this exactly what we want? --- Depends on what we actually want...

* First, there's a bug in this code --- having a BNF derivation like

      <NUM> ::= <digit> | <digit> <NUM>

  is unambiguous, but makes it hard to parse a number. We get:

       eval(123) = 10*eval(1) + eval(23)
                 = 10*1 + 10*eval(2) + eval(3)
                 = 10*1 + 10*2 + 3
                 = 33

  Changing the order of the last rule works much better:

      <NUM> ::= <digit> | <NUM> <digit>

  and then:

      eval(<NUM> <digit>) = 10*eval(<NUM>) + eval(<digit>)

* As a concrete example see how you would make it work with `107`, which
  demonstrates why compositionality is important.

* Example for free stuff that looks trivial: if we were to define the
  meaning of numbers this way, would it always work? Think an average
  language that does not give you bignums, making the above rules fail
  when the numbers are too big. In Racket, we happen to be using an
  integer representation for the syntax of integers, and both are
  unlimited. But what if we wanted to write a Racket compiler in C or a
  C compiler in Racket? What about a C compiler in C, where the compiler
  runs on a 64 bit machine, and the result needs to run on a 32 bit
  machine?


------------------------------------------------------------------------
## Side-note: Compositionality [Tuesday, January 14th]

The example of

    <NUM> ::= <digit> | <NUM> <digit>

being a language that is easier to write an evaluator for leads us to an
important concept --- compositionality. This definition is easier to
write an evaluator for, since the resulting language is compositional:
the meaning of an expression --- for example `123` --- is composed out
of the meaning of its two parts, which in this BNF are `12` and `3`.
Specifically, the evaluation of `<NUM> <digit>` is `10 *` the evaluation
of the first, plus the evaluation of the second. In the `<digit> <NUM>`
case this is more difficult --- the meaning of such a number depends not
only on the *meaning* of the two parts, but also on the `<NUM>`
*syntax*:

    eval(<digit> <NUM>) =
      eval(<digit>) * 10^length(<NUM>) + eval(<NUM>)

This this case this can be tolerable, since the meaning of the
expression is still made out of its parts --- but imperative programming
(when you use side effects) is much more problematic since it is not
compositional (at least not in the obvious sense). This is compared to
functional programming, where the meaning of an expression is a
combination of the meanings of its subexpressions. For example, every
sub-expression in a functional program has some known meaning, and these
all make up the meaning of the expression that contains them --- but in
an imperative program we can have a part of the code be `x++` --- and
that doesn't have a meaning by itself, at least not one that contributes
to the meaning of the whole program in a direct way.

(Actually, we can have a well-defined meaning for such an expression:
the meaning is going from a world where `x` is a container of some value
N, to a world where the same container has a different value N+1. You
can probably see now how this can make things more complicated. On an
intuitive level --- if we look at a random part of a functional program
we can tell its meaning, so building up the meaning of the whole code is
easy, but in an imperative program, the meaning of a random part is
pretty much useless.)


------------------------------------------------------------------------
# Implementing an Evaluator [Tuesday, January 14th]

Now continue to implement the semantics of our syntax --- we express
that through an `eval` function that evaluates an expression.

We use a basic programming principle --- splitting the code into two
layers, one for parsing the input, and one for doing the evaluation.
Doing this avoids the mess we'd get into otherwise, for example:

    (define (eval sexpr)
      (match sexpr
        [(number: n) n]
        [(list '+ left right) (+ (eval left) (eval right))]
        [(list '- left right) (- (eval left) (eval right))]
        [else (error 'eval "bad syntax in ~s" sexpr)]))

This is messy because it combines two very different things --- syntax
and semantics --- into a single lump of code. For this particular kind
of evaluator it looks simple enough, but this is only because it's
simple enough that all we do is replace constructors by arithmetic
operations. Later on things will get more complex, and bundling the
evaluator with the parser will be more problematic. (Note: the fact that
we can replace constructors with the run-time operators mean that we
have a very simple, calculator-like language, and that we can, in fact,
"compile" all programs down to a number.)

If we split the code, we can easily include decisions like making

    {+ 1 {- 3 "a"}}

syntactically invalid. (Which is not, BTW, what Racket does...) (Also,
this is like the distinction between XML syntax and well-formed XML
syntax.)

An additional advantage is that by using two separate components, it is
simple to replace each one, making it possible to change the input
syntax, and the semantics independently --- we only need to keep the
same interface data (the AST) and things will work fine.

Our `parse` function converts an input syntax to an abstract syntax tree
(AST). It is abstract exactly because it is independent of any actual
concrete syntax that you type in, print out etc.


------------------------------------------------------------------------
# Implementing The AE Language [Tuesday, January 14th]

Back to our `eval` --- this will be its (obvious) type:

    (: eval : AE -> Number)
    ;; consumes an AE and computes
    ;; the corresponding number

which leads to some obvious test cases:

    (equal? 3 (eval (parse "3")))
    (equal? 7 (eval (parse "{+ 3 4}")))
    (equal? 6 (eval (parse "{+ {- 3 4} 7}")))

which from now on we will write using the new `test` form that the
`#lang pl` language provides:

    (test (eval (parse "3"))             => 3)
    (test (eval (parse "{+ 3 4}"))       => 7)
    (test (eval (parse "{+ {- 3 4} 7}")) => 6)

Note that we're testing *only* at the interface level --- only running
whole functions. For example, you could think about a test like:

    (test (parse "{+ {- 3 4} 7}")
          => (Add (Sub (Num 3) (Num 4)) (Num 7)))

but the details of parsing and of the constructor names are things that
nobody outside of our evaluator cares about --- so we're not testing
them. In fact, we shouldn't even mention `parse` in these tests, since
it is not part of the public interface of our users; they only care
about using it as a compiler-like black box. (This is sometimes called
"integration tests".) We'll address this shortly.

Like everything else, the structure of the recursive `eval` code follows
the recursive structure of its input. In HtDP terms, our template is:

    (: eval : AE -> Number)
    (define (eval expr)
      (cases expr
        [(Num n)   ... n ...]
        [(Add l r) ... (eval l) ... (eval r) ...]
        [(Sub l r) ... (eval l) ... (eval r) ...]))

In this case, filling in the gaps is very simple

    (: eval : AE -> Number)
    (define (eval expr)
      (cases expr
        [(Num n)   n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]))

We now further combine `eval` and `parse` into a single `run` function
that evaluates an AE string.

    (: run : String -> Number)
    ;; evaluate an AE program contained in a string
    (define (run str)
      (eval (parse str)))

This function becomes the single public entry point into our code, and
the only thing that should be used in tests that verify our interface:

    (test (run "3")             => 3)
    (test (run "{+ 3 4}")       => 7)
    (test (run "{+ {- 3 4} 7}") => 6)

The resulting *full* code is:

    ;;; ---<<<AE>>>-----------------------------------------------------
    #lang pl

    #| BNF for the AE language:
       <AE> ::= <num>
              | { + <AE> <AE> }
              | { - <AE> <AE> }
              | { * <AE> <AE> }
              | { / <AE> <AE> }
    |#

    ;; AE abstract syntax trees
    (define-type AE
      [Num Number]
      [Add AE AE]
      [Sub AE AE]
      [Mul AE AE]
      [Div AE AE])

    (: parse-sexpr : Sexpr -> AE)
    ;; parses s-expressions into AEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n) (Num n)]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> AE)
    ;; parses a string containing an AE expression to an AE AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    (: eval : AE -> Number)
    ;; consumes an AE and computes the corresponding number
    (define (eval expr)
      (cases expr
        [(Num n)   n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]
        [(Mul l r) (* (eval l) (eval r))]
        [(Div l r) (/ (eval l) (eval r))]))

    (: run : String -> Number)
    ;; evaluate an AE program contained in a string
    (define (run str)
      (eval (parse str)))

    ;; tests
    (test (run "3") => 3)
    (test (run "{+ 3 4}") => 7)
    (test (run "{+ {- 3 4} 7}") => 6)

(Note that the tests are done with a `test` form, which we mentioned
above.)

For anyone who thinks that Racket is a bad choice, this is a good point
to think how much code would be needed in some other language to do the
same as above.


------------------------------------------------------------------------
# Intro to Typed Racket [Tuesday, January 14th]

The plan:

* Why Types?

* Why Typed Racket?

* What's Different about Typed Racket?

* Some Examples of Typed Racket for Course Programs

### Types ##############################################################

- Who has used a (statically) typed language?

- Who has used a typed language that's not Java?

Typed Racket will be both similar to and very different from anything
you've seen before.

### Why types? #########################################################

- Types help structure programs.

- Types provide enforced and mandatory documentation.

- Types help catch errors.

Types ***will*** help you. A *lot*.

### Structuring programs ###############################################

- Data definitions

      ;; An AE is one of:   ; \
      ;;  (make-Num Number) ;  > HtDP
      ;;  (make-Add AE AE)  ; /

      (define-type AE       ; \
        [Num number?]       ;  > Predicates =~= contracts (PLAI)
        [Add AE? AE?])      ; /    (has names of defined types too)

      (define-type AE       ; \
        [Num Number]        ;  > Typed Racket (our PL)
        [Add AE AE])        ; /

- Data-first

  The structure of your program is derived from the structure of your
  data.

  You have seen this in Fundamentals with the design recipe and with
  templates. In this class, we will see it extensively with type
  definitions and the (cases ...) form. Types make this pervasive --- we
  have to think about our data before our code.

- A language for describing data

  Instead of having an informal language for describing types in
  contract lines, and a more formal description of predicates in a
  `define-type` form, we will have a single, unified language for both
  of these. Having such a language means that we get to be more precise
  and more expressive (since the typed language covers cases that you
  would otherwise dismiss with some hand waving, like "a function").

### Why Typed Racket? ##################################################

Racket is the language we all know, and it has the benefits that we
discussed earlier. Mainly, it is an excellent language for experimenting
with programming languages.

- Typed Racket allows us to take our Racket programs and typecheck them,
  so we get the benefits of a statically typed language.

- Types are an important programming language feature; Typed Racket will
  help us understand them.

[Also: the development of Typed Racket is happening here in
Northeastern, and will benefit from your feedback.]

### How is Typed Racket different from Racket ##########################

- Typed Racket will reject your program if there are type errors! This
  means that it does that at compile-time, *before* any code gets to
  run.

- Typed Racket files start like this:

      #lang typed/racket
      ;; Program goes here.

  but we will use a variant of the Typed Racket language, which has a
  few additional constructs:

      #lang pl
      ;; Program goes here.

- Typed Racket requires you to write the contracts on your functions.

  Racket:

      ;; f : Number -> Number
      (define (f x)
        (* x (+ x 1)))

  Typed Racket:

      #lang pl
      (: f : Number -> Number)
      (define (f x)
        (* x (+ x 1)))

  [In the "real" Typed Racket the preferred style is with prefix arrows:

      #lang typed/racket
      (: f (-> Number Number))
      (define (f x) : Number
        (* x (+ x 1)))

  and you can also have the type annotations appear inside the
  definition:

      #lang typed/racket
      (define (f [x : Number]) : Number
        (* x (+ x 1)))

  but we will not use these form.]

- As we've seen, Typed Racket uses types, not predicates, in
  `define-type`.

      (define-type AE
        [Num Number]
        [Add AE AE])

  versus

      (define-type AE
        [Num number?]
        [Add AE? AE?])

- There are other differences, but these will suffice for now.

### Examples ###########################################################

    (: digit-num : Number -> (U Number String))
    (define (digit-num n)
      (cond [(<= n 9)    1]
            [(<= n 99)   2]
            [(<= n 999)  3]
            [(<= n 9999) 4]
            [else        "a lot"]))

    (: fact : Number -> Number)
    (define (fact n)
      (if (zero? n)
        1
        (* n (fact (- n 1)))))

    (: helper : Number Number -> Number)
    (define (helper n acc)
      (if (zero? n)
        acc
        (helper (- n 1) (* acc n))))
    (: fact : Number -> Number)
    (define (fact n)
      (helper n 1))

    (: fact : Number -> Number)
    (define (fact n)
      (: helper : Number Number -> Number)
      (define (helper n acc)
        (if (zero? n)
          acc
          (helper (- n 1) (* acc n))))
      (helper n 1))

    (: every? : (All (A) (A -> Boolean) (Listof A) -> Boolean))
    ;; Returns false if any element of lst fails the given pred,
    ;; true if all pass pred.
    (define (every? pred lst)
      (or (null? lst)
          (and (pred (first lst))
               (every? pred (rest lst)))))

    (define-type AE
      [Num Number]
      [Add AE AE]
      [Sub AE AE])

    ;; the only difference in the following definition is
    ;; using (: <name> : <type>) instead of ";; <name> : <type>"

    (: parse-sexpr : Sexpr -> AE)
    ;; parses s-expressions into AEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n) (Num n)]
        [(list '+ left right)
         (Add (parse-sexpr left) (parse-sexpr right))]
        [(list '- left right)
         (Sub (parse-sexpr left) (parse-sexpr right))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

### More interesting examples ##########################################

* Typed Racket is designed to be a language that is friendly to the kind
  of programs that people write in Racket. For example, it has unions:

      (: foo : (U String Number) -> Number)
      (define (foo x)
        (if (string? x)
          (string-length x)
          ;; at this point it knows that `x' is not a
          ;; string, therefore it must be a number
          (+ 1 x)))

  This is not common in statically typed languages, which are usually
  limited to only *disjoint* unions. For example, in OCaml you'd write
  this definition:

      type string_or_number = Str of string | Int of int ;;
      let foo x = match x with Str s -> String.length s
                             | Int i -> i+1 ;;

  And use it with an explicit constructor:

      foo (Str "bar") ;;
      foo (Int 3) ;;

* Note that in the Typed Racket case, the language keeps track of
  information that is gathered via predicates --- which is why it knows
  that one `x` is a String, and the other is a Number.

* Typed Racket has a concept of subtypes --- which is also something
  that most statically typed languages lack. In fact, the fact that it
  has (arbitrary) unions means that it must have subtypes too, since a
  type is always a subtype of a union that contains this type.

* Another result of this feature is that there is an `Any` type that is
  the union of all other types. Note that you can always use this type
  since everything is in it --- but it gives you the *least* information
  about a value. In other words, Typed Racket gives you a choice: *you*
  decide which type to use, one that is very restricted but has a lot of
  information about its values to a type that is very permissive but has
  almost no useful information. This is in contrast to other type system
  (HM systems) where there is always exactly one correct type.

  To demonstrate, consider the identity function:

      (define (id x) x)

  You could use a type of `(: id : Integer -> Integer)` which is very
  restricted, but you know that the function always returns an integer
  value.

  Or you can make it very permissive with a `(: id : Any -> Any)`, but
  then you know nothing about the result --- in fact, `(+ 1 (id 2))`
  will throw a type error. It *does* return `2`, as expected, but the
  type checker doesn't know the type of that `2`. If you wanted to use
  this type, you'd need to check that the result is a number, eg:

      (let ([x (id 123)]) (if (number? x) (+ x 10) 999))

  This means that for this particular function there is no good
  *specific* type that we can choose --- but there are *polymorphic*
  types. These types allow propagating their input type(s) to their
  output type. In this case, it's a simple "my output type is the same
  as my input type":

      (: id : (All (A) A -> A))

  This makes the output preserve the same level of information that you
  had on its input.

* Another interesting thing to look at is the type of `error`: it's a
  function that returns a type of `Nothing` --- a type that is the same
  as an *empty* union: `(U)`. It's a type that has no values in it ---
  it fits `error` because it *is* a function that doesn't return any
  value, in fact, it doesn't return at all. In addition, it means that
  an `error` expression can be used anywhere you want because it is a
  subtype of anything at all.

* An `else` clause in a `cond` expression is almost always needed, for
  example:

      (: digit-num : Number -> (U Number String))
      (define (digit-num n)
        (cond [(<= n 9)    1]
              [(<= n 99)   2]
              [(<= n 999)  3]
              [(<= n 9999) 4]
              [(>  n 9999) "a lot"]))

  (and if you think that the type checker should know what this is
  doing, then how about

      (> (* n 10) (/ (* (- 10000 1) 20) 2))

  or

      (>= n 10000)

  for the last test?)

* In some rare cases you will run into one limitation of Typed Racket:
  it is difficult (that is: a generic solution is not known at the
  moment) to do the right inference when polymorphic functions are
  passed around to higher-order functions. For example:

      (: call : (All (A B) (A -> B) A -> B))
      (define (call f x)
        (f x))
      (call rest (list 4))

  In such cases, we can use `inst` to *instantiate* a function with a
  polymorphic type to a given type --- in this case, we can use it to
  make it treat `rest` as a function that is specific for numeric lists:

      (call (inst rest Number) (list 4))

  In other rare cases, Typed Racket will infer a type that is not
  suitable for us --- there is another form, `ann`, that allows us to
  specify a certain type. Using this in the `call` example is more
  verbose:

      (call (ann rest : ((Listof Number) -> (Listof Number))) (list 4))

  However, these are going to be rare and will be mentioned explicitly
  whenever they're needed.


------------------------------------------------------------------------
# Bindings & Substitution [Tuesday, January 14th]

We now get to an important concept: substitution.

Even in our simple language, we encounter repeated expressions. For
example, if we want to compute the square of some expression:

    {* {+ 4 2} {+ 4 2}}

Why would we want to get rid of the repeated sub-expression?

* It introduces a redundant computation. In this example, we want to
  avoid computing the same sub-expression a second time.

* It makes the computation more complicated than it could be without the
  repetition. Compare the above with:

      with x = {+ 4 2},
        {* x x}

* This is related to a basic fact in programming that we have already
  discussed: duplicating information is always a bad thing. Among other
  bad consequences, it can even lead to bugs that could not happen if we
  wouldn't duplicate code. A toy example is "fixing" one of the numbers
  in one expression and forgetting to fix the other:

      {* {+ 4 2} {+ 4 1}}

  Real world examples involve much more code, which make such bugs very
  difficult to find, but they still follow the same principle.

* This gives us more expressive power --- we don't just say that we want
  to multiply two expressions that both happen to be `{+ 4 2}`, we say
  that we multiply the `{+ 4 2}` expression by *itself*. It allows us to
  express identity of two values as well as using two values that happen
  to be the same.

So, the normal way to avoid redundancy is to introduce an identifier.
Even when we speak, we might say: "let x be 4 plus 2, multiply x by x".

(These are often called "variables", but we will try to avoid this name:
what if the identifier does not change (vary)?)

To get this, we introduce a new form into our language:

    {with {x {+ 4 2}}
      {* x x}}

We expect to be able to reduce this to:

    {* 6 6}

by substituting 6 for `x` in the body sub-expression of `with`.

A little more complicated example:

    {with {x {+ 4 2}}
      {with {y {* x x}}
        {+ y y}}}
    [add]  = {with {x 6} {with {y {* x x}} {+ y y}}}
    [subst]= {with {y {* 6 6}} {+ y y}}
    [mul]  = {with {y 36} {+ y y}}
    [subst]= {+ 36 36}
    [add]  = 72


------------------------------------------------------------------------
# WAE: Adding Bindings to AE [Tuesday, January 21st]

> [PLAI §3]

To add this to our language, we start with the BNF. We now call our
language "WAE" (With+AE):

    <WAE> ::= <num>
            | { + <WAE> <WAE> }
            | { - <WAE> <WAE> }
            | { * <WAE> <WAE> }
            | { / <WAE> <WAE> }
            | { with { <id> <WAE> } <WAE> }
            | <id>

Note that we had to introduce *two* new rules: one for introducing an
identifier, and one for using it. This is common in many language
specifications, for example `define-type` introduces a new type, and it
comes with `cases` that allows us to destruct its instances.

For `<id>` we need to use some form of identifiers, the natural choice
in Racket is to use symbols. We can therefore write the corresponding
type definition:

    (define-type WAE
      [Num  Number]
      [Add  WAE WAE]
      [Sub  WAE WAE]
      [Mul  WAE WAE]
      [Div  WAE WAE]
      [Id   Symbol]
      [With Symbol WAE WAE])

The parser is easily extended to produce these syntax objects:

    (: parse-sexpr : Sexpr -> WAE)
    ;; parses s-expressions into WAEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(list 'with (list (symbol: name) named) body)
         (With name (parse-sexpr named) (parse-sexpr body))]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

But note that this parser is inconvenient --- if any of these
expressions:

    {* 1 2 3}
    {foo 5 6}
    {with x 5 {* x 8}}
    {with {5 x} {* x 8}}

would result in a "bad syntax" error, which is not very helpful. To make
things better, we can add another case for `with` expressions that are
malformed, and give a more specific message in that case:

    (: parse-sexpr : Sexpr -> WAE)
    ;; parses s-expressions into WAEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(list 'with (list (symbol: name) named) body)
         (With name (parse-sexpr named) (parse-sexpr body))]
        [(cons 'with more)
         (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

and finally, to group all of the parsing code that deals with `with`
expressions (both valid and invalid ones), we can use a single case for
both of them:

    (: parse-sexpr : Sexpr -> WAE)
    ;; parses s-expressions into WAEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         ;; go in here for all sexpr that begin with a 'with
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

And now we're done with the syntactic part of the `with` extension.

> Quick note --- why would we indent `With` like a normal function in
> code like this
>
>     (With 'x
>           (Num 2)
>           (Add (Id 'x) (Num 4)))
>
> instead of an indentation that looks like a `let`
>
>     (With 'x (Num 2)
>       (Add (Id 'x) (Num 4)))
>
> ?
>
> The reason for this is that the second indentation looks like a
> binding construct (eg, the indentation used in a `let` expression),
> but `With` is *not* a binding form --- it's a *plain function* because
> it's at the Racket level. You should therefore keep in mind the huge
> difference between that `With` and the `with` that appears in WAE
> programs:
>
>     {with {x 2}
>       {+ x 4}}
>
> Another way to look at it: imagine that we intend for the language to
> be used by Spanish/Chinese/German/French speakers. In this case we
> would translate "`with`":
>
>     {con  {x 2} {+ x 4}}
>     {he   {x 2} {+ x 4}}
>     {mit  {x 2} {+ x 4}}
>     {avec {x 2} {+ x 4}}
>     {c    {x 2} {+ x 4}}
>
> but we will *not* do the same for `With` if we (the language
> implementors) are English speakers.


------------------------------------------------------------------------
# Evaluation of `with` [Tuesday, January 21st]

Now, to make this work, we will need to do some substitutions.

We basically want to say that to evaluate:

    {with {id WAE1} WAE2}

we need to evaluate `WAE2` with id substituted by `WAE1`. Formally:

    eval( {with {id WAE1} WAE2} )
      = eval( subst(WAE2,id,WAE1) )

There is a more common syntax for substitution (quick: what do I mean by
this use of "syntax"?):

    eval( {with {id WAE1} WAE2} )
      = eval( WAE2[WAE1/id] )

> Side-note: this syntax originates with logicians who used `[x/v]e`,
> and later there was a convention that mimicked the more natural order
> of arguments to a function with `e[x->v]`, and eventually both of
> these got combined into `e[v/x]` which is a little confusing in that
> the left-to-right order of the arguments is not the same as for the
> `subst` function.

Now all we need is an exact definition of substitution.

> Note that substitution is not the same as evaluation, it's only a part
> of the evaluation process. In the previous examples, when we evaluated
> the expression we did substitutions as well as the usual arithmetic
> operations that were already part of the AE evaluator. In this last
> definition there is still a missing evaluation step, see if you can
> find it.

So let us try to define substitution now:

> Substitution (take 1):  `e[v/i]` \
>   To substitute an identifier `i` in an expression `e` with an
>   expression `v`, replace all identifiers in `e` that have the same
>   name `i` by the expression `v`.

This seems to work with simple expressions, for example:

    {with {x 5} {+ x x}}   -->  {+ 5 5}
    {with {x 5} {+ 10 4}}  -->  {+ 10 4}

however, we crash with an invalid syntax if we try:

    {with {x 5} {+ x {with {x 3} 10}}}
      -->  {+ 5 {with {5 3} 10}} ???

--- we got to an invalid expression.

To fix this, we need to distinguish normal occurrences of identifiers,
and ones that are used as new bindings. We need a few new terms for
this:

1. Binding Instance: a binding instance of an identifier is one that is
   used to name it in a new binding. In our `<WAE>` syntax, binding
   instances are only the `<id>` position of the `with` form.

2. Scope: the scope of a binding instance is the region of program text
   in which instances of the identifier refer to the value bound in the
   binding instance. (Note that this definition actually relies on a
   definition of substitution, because that is what is used to specify
   how identifiers refer to values.)

3. Bound Instance (or Bound Occurrence / Identifier): an instance of an
   identifier is bound if it is contained within the scope of a binding
   instance of its name.

4. Free Instance (or Free Occurrence / Identifier): An identifier that
   is not contained in any binding instance of its name is said to be
   free.

Using this we can say that the problem with the previous definition of
substitution is that it failed to distinguish between bound instances
(which should be substituted) and binding instances (which should not).
So we try to fix this:

> Substitution (take 2):  `e[v/i]` \
>   To substitute an identifier `i` in an expression `e` with an
>   expression `v`, replace all instances of `i` that are not themselves
>   binding instances with the expression `v`.

First of all, check the previous examples:

    {with {x 5} {+ x x}}   -->  {+ 5 5}
    {with {x 5} {+ 10 4}}  -->  {+ 10 4}

still work, and

    {with {x 5} {+ x {with {x 3} 10}}}
      -->  {+ 5 {with {x 3} 10}}
      -->  {+ 5 10}

also works. However, if we try this:

    {with {x 5}
      {+ x {with {x 3}
             x}}}

we get:

    -->  {+ 5 {with {x 3} 5}}
    -->  {+ 5 5}
    -->  10

but we want that to be `8`: the inner `x` should be bound by the closest
`with` that binds it.

The problem is that the new definition of substitution that we have
respects binding instances, but it fails to deal with their scope. In
the above example, we want the inner `with` to *shadow* the outer
`with`'s binding for `x`.

> Substitution (take 3):  `e[v/i]` \
>   To substitute an identifier `i` in an expression `e` with an
>   expression `v`, replace all instances of `i` that are not themselves
>   binding instances, and that are not in any nested scope, with the
>   expression `v`.

This avoids bad substitution above, but it is now doing things too
carefully:

    {with {x 5} {+ x {with {y 3} x}}}

becomes

    -->  {+ 5 {with {y 3} x}}
    -->  {+ 5 x}

which is an error because `x` is unbound (and there is reasonable no
rule that we can specify to evaluate it).

The problem is that our substitution halts at every new scope, in this
case, it stopped at the new `y` scope, but it shouldn't have because it
uses a different name. In fact, that last definition of substitution
cannot handle any nested scope.

Revise again:

> Substitution (take 4):  `e[v/i]` \
>   To substitute an identifier `i` in an expression `e` with an
>   expression `v`, replace all instances of `i` that are not themselves
>   binding instances, and that are not in any nested scope of `i`, with
>   the expression `v`.

which, finally, is a good definition. This is just a little too
mechanical. Notice that we actually refer to all instances of `i` that
are not in a scope of a binding instance of `i`, which simply means all
*free occurrences* of `i` --- free in `e` (why? --- remember the
definition of "free"?):

> Substitution (take 4b):  `e[v/i]` \
>   To substitute an identifier `i` in an expression `e` with an
>   expression `v`, replace all instances of `i` that are free in `e`
>   with the expression `v`.

Based on this we can finally write the code for it:

    (: subst : WAE Symbol WAE -> WAE)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to) ; returns expr[to/from]
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (if (eq? bound-id from)
           expr           ;*** don't go in!
           (With bound-id
                 named-expr
                 (subst bound-body from to)))]))

... and this is just the same as writing a formal "paper version" of the
substitution rule.

We still have bugs: but we'll need some more work to get to them.

------------------------------------------------------------------------

Before we find the bugs, we need to see when and how substitution is
used in the evaluation process.

To modify our evaluator, we will need rules to deal with the new syntax
pieces --- `with` expressions and identifiers.

When we see an expression that looks like:

    {with {x E1} E2}

we continue by *evaluating* `E1` to get a value `V1`, we then substitute
the identifier `x` with the expression `V1` in `E2`, and continue by
evaluating this new expression. In other words, we have the following
evaluation rule:

    eval( {with {x E1} E2} )
      = eval( E2[eval(E1)/x] )

So we know what to do with `with` expressions. How about identifiers?
The main feature of `subst`, as said in the purpose statement, is that
it leaves no free instances of the substituted variable around. This
means that if the initial expression is valid (did not contain any free
variables), then when we go from

    {with {x E1} E2}

to

    E2[E1/x]

the result is an expression that has *no* free instances of `x`. So we
don't need to handle identifiers in the evaluator --- substitutions make
them all go away.

We can now extend the formal definition of AE to that of WAE:

    eval(...) = ... same as the AE rules ...
    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])
    eval(id) = error!

If you're paying close attention, you might catch a potential problem in
this definition: we're substituting `eval(E1)` for `x` in `E2` --- an
operation that requires a WAE expression, but `eval(E1)` is a number.
(Look at the type of the `eval` definition we had for AE, then look at
the above definition of `subst`.) This seems like being overly pedantic,
but we it will require some resolution when we get to the code. The
above rules are easily coded as follows:

    (: eval : WAE -> Number)
    ;; evaluates WAE expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]
        [(Mul l r) (* (eval l) (eval r))]
        [(Div l r) (/ (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (Num (eval named-expr))))] ;***
        [(Id name) (error 'eval "free identifier: ~s" name)]))

Note the `Num` expression in the marked line: evaluating the named
expression gives us back a number --- we need to convert this number
into a syntax to be able to use it with `subst`. The solution is to use
`Num` to convert the resulting number into a numeral (the syntax of a
number). It's not an elegant solution, but it will do for now.

Finally, here are a few test cases. We use a new `test` special form
which is part of the course plugin. The way to use `test` is with two
expressions and an `=>` arrow --- DrRacket evaluates both, and nothing
will happen if the results are equal. If the results are different, you
will get a warning line, but evaluation will continue so you can try
additional tests. You can also use an `=error>` arrow to test an error
message --- use it with some text from the expected error, `?` stands
for any single character, and `*` is a sequence of zero or more
characters. (When you use `test` in your homework, the handin server
will abort when tests fail.) We expect these tests to succeed (make sure
that you understand *why* they should succeed).

    ;; tests
    (test (run "5") => 5)
    (test (run "{+ 5 5}") => 10)
    (test (run "{with {x {+ 5 5}} {+ x x}}") => 20)
    (test (run "{with {x 5} {+ x x}}") => 10)
    (test (run "{with {x {+ 5 5}} {with {y {- x 3}} {+ y y}}}") => 14)
    (test (run "{with {x 5} {with {y {- x 3}} {+ y y}}}") => 4)
    (test (run "{with {x 5} {+ x {with {x 3} 10}}}") => 15)
    (test (run "{with {x 5} {+ x {with {x 3} x}}}") => 8)
    (test (run "{with {x 5} {+ x {with {y 3} x}}}") => 10)
    (test (run "{with {x 5} {with {y x} y}}") => 5)
    (test (run "{with {x 5} {with {x x} x}}") => 5)
    (test (run "{with {x 1} y}") =error> "free identifier")

Putting this all together, we get the following code; trying to run this
code will raise an unexpected error...

    #lang pl

    #| BNF for the WAE language:
         <WAE> ::= <num>
                 | { + <WAE> <WAE> }
                 | { - <WAE> <WAE> }
                 | { * <WAE> <WAE> }
                 | { / <WAE> <WAE> }
                 | { with { <id> <WAE> } <WAE> }
                 | <id>
    |#

    ;; WAE abstract syntax trees
    (define-type WAE
      [Num  Number]
      [Add  WAE WAE]
      [Sub  WAE WAE]
      [Mul  WAE WAE]
      [Div  WAE WAE]
      [Id   Symbol]
      [With Symbol WAE WAE])

    (: parse-sexpr : Sexpr -> WAE)
    ;; parses s-expressions into WAEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> WAE)
    ;; parses a string containing a WAE expression to a WAE AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    (: subst : WAE Symbol WAE -> WAE)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (if (eq? bound-id from)
           expr
           (With bound-id
                 named-expr
                 (subst bound-body from to)))]))

    (: eval : WAE -> Number)
    ;; evaluates WAE expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]
        [(Mul l r) (* (eval l) (eval r))]
        [(Div l r) (/ (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (Num (eval named-expr))))]
        [(Id name) (error 'eval "free identifier: ~s" name)]))

    (: run : String -> Number)
    ;; evaluate a WAE program contained in a string
    (define (run str)
      (eval (parse str)))

    ;; tests
    (test (run "5") => 5)
    (test (run "{+ 5 5}") => 10)
    (test (run "{with {x {+ 5 5}} {+ x x}}") => 20)
    (test (run "{with {x 5} {+ x x}}") => 10)
    (test (run "{with {x {+ 5 5}} {with {y {- x 3}} {+ y y}}}") => 14)
    (test (run "{with {x 5} {with {y {- x 3}} {+ y y}}}") => 4)
    (test (run "{with {x 5} {+ x {with {x 3} 10}}}") => 15)
    (test (run "{with {x 5} {+ x {with {x 3} x}}}") => 8)
    (test (run "{with {x 5} {+ x {with {y 3} x}}}") => 10)
    (test (run "{with {x 5} {with {y x} y}}") => 5)
    (test (run "{with {x 5} {with {x x} x}}") => 5)
    (test (run "{with {x 1} y}") =error> "free identifier")

------------------------------------------------------------------------

Oops, this program still has problems that were caught by the tests ---
we encounter unexpected free identifier errors. What's the problem now?
In expressions like:

    {with {x 5}
      {with {y x}
        y}}

we forgot to substitute `x` in the expression that `y` is bound to. We
need to the recursive substitute in both the with's body expression as
well as its named expression:

    (: subst : WAE Symbol WAE -> WAE)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (if (eq? bound-id from)
           expr
           (With bound-id
                 (subst named-expr from to)       ;*** new
                 (subst bound-body from to)))]))

And *still* we have a problem... Now it's

    {with {x 5}
      {with {x x}
        x}}

that halts with an error, but we want it to evaluate to `5`! Carefully
trying out our substitution code reveals the problem: when we substitute
`5` for the outer `x`, we don't go inside the inner `with` because it
has the same name --- but we *do* need to go into its named expression.
We need to substitute in the named expression even if the identifier is
the *same* one we're substituting:

    (: subst : WAE Symbol WAE -> WAE)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (With bound-id
               (subst named-expr from to)
               (if (eq? bound-id from)
                 bound-body
                 (subst bound-body from to)))]))

The complete (and, finally, correct) version of the code is now:

    ;;; ---<<<WAE>>>----------------------------------------------------
    #lang pl

    #| BNF for the WAE language:
         <WAE> ::= <num>
                 | { + <WAE> <WAE> }
                 | { - <WAE> <WAE> }
                 | { * <WAE> <WAE> }
                 | { / <WAE> <WAE> }
                 | { with { <id> <WAE> } <WAE> }
                 | <id>
    |#

    ;; WAE abstract syntax trees
    (define-type WAE
      [Num  Number]
      [Add  WAE WAE]
      [Sub  WAE WAE]
      [Mul  WAE WAE]
      [Div  WAE WAE]
      [Id   Symbol]
      [With Symbol WAE WAE])

    (: parse-sexpr : Sexpr -> WAE)
    ;; parses s-expressions into WAEs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> WAE)
    ;; parses a string containing a WAE expression to a WAE AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    #| Formal specs for `subst':
       (`N' is a <num>, `E1', `E2' are <WAE>s, `x' is some <id>,
       `y' is a *different* <id>)
          N[v/x]                = N
          {+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}
          {- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}
          {* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}
          {/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}
          y[v/x]                = y
          x[v/x]                = v
          {with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]}
          {with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}
    |#

    (: subst : WAE Symbol WAE -> WAE)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (With bound-id
               (subst named-expr from to)
               (if (eq? bound-id from)
                 bound-body
                 (subst bound-body from to)))]))

    #| Formal specs for `eval':
         eval(N)         = N
         eval({+ E1 E2}) = eval(E1) + eval(E2)
         eval({- E1 E2}) = eval(E1) - eval(E2)
         eval({* E1 E2}) = eval(E1) * eval(E2)
         eval({/ E1 E2}) = eval(E1) / eval(E2)
         eval(id)        = error!
         eval({with {x E1} E2}) = eval(E2[eval(E1)/x])
    |#

    (: eval : WAE -> Number)
    ;; evaluates WAE expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]
        [(Mul l r) (* (eval l) (eval r))]
        [(Div l r) (/ (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (Num (eval named-expr))))]
        [(Id name) (error 'eval "free identifier: ~s" name)]))

    (: run : String -> Number)
    ;; evaluate a WAE program contained in a string
    (define (run str)
      (eval (parse str)))

    ;; tests
    (test (run "5") => 5)
    (test (run "{+ 5 5}") => 10)
    (test (run "{with {x 5} {+ x x}}") => 10)
    (test (run "{with {x {+ 5 5}} {+ x x}}") => 20)
    (test (run "{with {x 5} {with {y {- x 3}} {+ y y}}}") => 4)
    (test (run "{with {x {+ 5 5}} {with {y {- x 3}} {+ y y}}}") => 14)
    (test (run "{with {x 5} {+ x {with {x 3} 10}}}") => 15)
    (test (run "{with {x 5} {+ x {with {x 3} x}}}") => 8)
    (test (run "{with {x 5} {+ x {with {y 3} x}}}") => 10)
    (test (run "{with {x 5} {with {y x} y}}") => 5)
    (test (run "{with {x 5} {with {x x} x}}") => 5)
    (test (run "{with {x 1} y}") =error> "free identifier")


------------------------------------------------------------------------
# Evaluation of `with` (contd.) [Tuesday, January 21st]

Reminder:

* We started doing substitution, with a `let`-like form: `with`.

* Reasons for using bindings:
  - Avoid writing expressions twice.
    * More expressive language (can express identity).
    * Duplicating is bad! ("DRY": *Don't Repeat Yourself*.)
    * Avoids *static* redundancy.
  - Avoid redundant computations.
    * More than *just* an optimization when it avoids exponential
      resources.
    * Avoids *dynamic* redundancy.

* BNF:

      <WAE> ::= <num>
              | { + <WAE> <WAE> }
              | { - <WAE> <WAE> }
              | { * <WAE> <WAE> }
              | { / <WAE> <WAE> }
              | { with { <id> <WAE> } <WAE> }
              | <id>

  Note that we had to introduce two new rules: one for introducing an
  identifier, and one for using it.

* Type definition:

      (define-type WAE
        [Num  Number]
        [Add  WAE WAE]
        [Sub  WAE WAE]
        [Mul  WAE WAE]
        [Div  WAE WAE]
        [Id   Symbol]
        [With Symbol WAE WAE])

* Parser:

      (: parse-sexpr : Sexpr -> WAE)
      ;; parses s-expressions into WAEs
      (define (parse-sexpr sexpr)
        (match sexpr
          [(number: n)    (Num n)]
          [(symbol: name) (Id name)]
          [(cons 'with more)
           (match sexpr
             [(list 'with (list (symbol: name) named) body)
              (With name (parse-sexpr named) (parse-sexpr body))]
             [else (error 'parse-sexpr "bad `with' syntax in ~s"
                          sexpr)])]
          [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
          [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
          [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
          [(list '/ lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
          [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

* We need to define substitution. Terms:
  1. Binding Instance.
  2. Scope.
  3. Bound Instance.
  4. Free Instance.

* After lots of attempts:

  > e[v/i] --- To substitute an identifier `i` in an expression `e`
  >   with an expression `v`, replace all instances of `i` that are free
  >   in `e` with the expression `v`.

* Implemented the code, and again, needed to fix a few bugs:

      (: subst : WAE Symbol WAE -> WAE)
      ;; substitutes the second argument with the third argument in the
      ;; first argument, as per the rules of substitution; the resulting
      ;; expression contains no free instances of the second argument
      (define (subst expr from to)
        (cases expr
          [(Num n) expr]
          [(Add l r) (Add (subst l from to) (subst r from to))]
          [(Sub l r) (Sub (subst l from to) (subst r from to))]
          [(Mul l r) (Mul (subst l from to) (subst r from to))]
          [(Div l r) (Div (subst l from to) (subst r from to))]
          [(Id name) (if (eq? name from) to expr)]
          [(With bound-id named-expr bound-body)
           (With bound-id
                 (subst named-expr from to)
                 (if (eq? bound-id from)
                   bound-body
                   (subst bound-body from to)))]))

  (Note that the bugs that we fixed clarify the exact way that our
  scopes work: in `{with {x 2} {with {x {+ x 2}} x}}`, the scope of the
  first `x` is the `{+ x 2}` expression.)

* We then extended the AE evaluation rules:

      eval(...) = ... same as the AE rules ...
      eval({with {x E1} E2}) = eval(E2[eval(E1)/x])
      eval(id) = error!

  and noted the possible type problem.

* The above translated into a Racket definition for an `eval` function
  (with a hack to avoid the type issue):

      (: eval : WAE -> Number)
      ;; evaluates WAE expressions by reducing them to numbers
      (define (eval expr)
        (cases expr
          [(Num n) n]
          [(Add l r) (+ (eval l) (eval r))]
          [(Sub l r) (- (eval l) (eval r))]
          [(Mul l r) (* (eval l) (eval r))]
          [(Div l r) (/ (eval l) (eval r))]
          [(With bound-id named-expr bound-body)
           (eval (subst bound-body
                        bound-id
                        (Num (eval named-expr))))]
          [(Id name) (error 'eval "free identifier: ~s" name)]))


------------------------------------------------------------------------
# Formal Specs [Tuesday, January 21st]

Note the formal definitions that were included in the WAE code. They are
ways of describing pieces of our language that are more formal than
plain English, but still not as formal (and as verbose) as the actual
code.

A formal definition of `subst`:

(`N` is a `<num>`, `E1`, `E2` are `<WAE>`s, `x` is some `<id>`, `y` is a
*different* `<id>`)

    N[v/x]                = N

    {+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}

    {- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}

    {* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}

    {/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}

    y[v/x]                = y
    x[v/x]                = v

    {with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]}
    {with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}

And a formal definition of `eval`:

    eval(N)         = N

    eval({+ E1 E2}) = eval(E1) + eval(E2)

    eval({- E1 E2}) = eval(E1) - eval(E2)

    eval({* E1 E2}) = eval(E1) * eval(E2)

    eval({/ E1 E2}) = eval(E1) / eval(E2)

    eval(id)        = error!

    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])


------------------------------------------------------------------------
# Lazy vs Eager Evaluation [Tuesday, January 21st]

As we have previously seen, there are two basic approaches for
evaluation: either eager or lazy. In lazy evaluation, bindings are used
for sort of textual references --- it is only for avoiding writing an
expression twice, but the associated computation is done twice anyway.
In eager evaluation, we eliminate not only the textual redundancy, but
also the computation.

Which evaluation method did our evaluator use? The relevant piece of
formalism is the treatment of `with`:

    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

And the matching piece of code is:

    [(With bound-id named-expr bound-body)
     (eval (subst bound-body
                  bound-id
                  (Num (eval named-expr))))]

How do we make this lazy?

In the formal equation:

    eval({with {x E1} E2}) = eval(E2[E1/x])

and in the code:

    (: eval : WAE -> Number)
    ;; evaluates WAE expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (eval l) (eval r))]
        [(Sub l r) (- (eval l) (eval r))]
        [(Mul l r) (* (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      named-expr))] ;*** no eval and no Num wrapping
        [(Id name) (error 'eval "free identifier: ~s" name)]))

We can verify the way this works by tracing `eval` (compare the trace
you get for the two versions):

    > (trace eval) ; (put this in the definitions window)
    > (run "{with {x {+ 1 2}} {* x x}}")

Ignoring the traces for now, the modified WAE interpreter works as
before, specifically, all tests pass. So the question is whether the
language we get is actually different than the one we had before. One
difference is in execution speed, but we can't really notice a
difference, and we care more about meaning. Is there any program that
will run differently in the two languages?

The main feature of the lazy evaluator is that it is not evaluating the
named expression until it is actually needed. As we have seen, this
leads to duplicating computations if the bound identifier is used more
than once --- meaning that it does not eliminate the dynamic redundancy.
But what if the bound identifier is not used at all? In that case the
named expression simply evaporates. This is a good hint at an expression
that behaves differently in the two languages --- if we add division to
both languages, we get a different result when we try running:

    {with {x {/ 8 0}} 7}

The eager evaluator stops with an error when it tries evaluating the
division --- and the lazy evaluator simply ignores it.

Even without division, we get a similar behavior for

    {with {x y} 7}

but it is questionable whether the fact that this evaluates to 7 is
correct behavior --- we really want to forbid program that use free
variable.

Furthermore, there is an issue with name capturing --- we don't want to
substitute an expression into a context that captures some of its free
variables. But our substitution allows just that, which is usually not a
problem because by the time we do the substitution, the named expression
should not have free variables that need to be replaced. However,
consider evaluating this program:

    {with {y x}
      {with {x 2}
        {+ x y}}}

under the two evaluation regimens: the eager version stops with an
error, and the lazy version succeed. This points at a bug in our
substitution, or rather not dealing with an issue that we do not
encounter.

So the summary is: as long as the initial program is correct, both
evaluation regimens produce the same results. If a program contains free
variables, they might get captured in a naive lazy evaluator
implementation (but this is a bug that should be fixed). Also, there are
some cases where eager evaluation runs into a run-time problem which
does not happen in a lazy evaluator because the expression is not used.
It is possible to prove that when you evaluate an expression, if there
is an error that can be avoided, lazy evaluation will always avoid it,
whereas an eager evaluator will always run into it. On the other hand,
lazy evaluators are usually slower than eager evaluator, so it's a speed
vs. robustness trade-off.

Note that with lazy evaluation we say that an identifier is bound to an
expression rather than a value. (Again, this is why the eager version
needed to wrap `eval`'s result in a `Num` and this one doesn't.)

(It is possible to change things and get a more well behaved
substitution, we basically will need to find if a capture might happen,
and rename things to avoid it. For example,

    {with {y E1} E2}[v/x]
      if `x' and `y' are equal
        = {with {y E1[v/x]} E2} = {with {x E1[v/x]} E2}
      if `y' has a free occurrence in `v'
        = {with {y1 E1[v/x]} E2[y1/y][v/x]} ; `y1' is "fresh"
      otherwise
        = {with {y E1[v/x]} E2[v/x]}

With this, we might have gone through this path in evaluating the above:

    {with {y x} {with {x 2} {+ x y}}}
    {with {x₁ 2} {+ x₁ x}} ; note that x₁ is a fresh name, not x
    {+ 2 x}
    error: free `x`

But you can see that this is much more complicated (more code: requires
a `free-in` predicate, being able to invent new *fresh* names, etc). And
it's not even the end of that story...)


------------------------------------------------------------------------
# de Bruijn Indexes [Tuesday, January 21st]

This whole story revolves around names, specifically, name capture is a
problem that should always be avoided (it is one major source of PL
headaches).

But are names the only way we can use bindings?

There is a least one alternative way: note that the only thing we used
names for are for references. We don't really care what the name is,
which is pretty obvious when we consider the two WAE expressions:

    {with {x 5} {+ x x}}
    {with {y 5} {+ y y}}

or the two Racket function definitions:

    (define (foo x) (list x x))
    (define (foo y) (list y y))

Both of these show a pair of expressions that we should consider as
equal in some sense (this is called "alpha-equality"). The only thing we
care about is what variable points where: the binding structure is the
only thing that matters. In other words, as long as DrRacket produces
the same arrows when we use Check Syntax, we consider the program to be
the same, regardless of name choices (for argument names and local
names, not for global names like `foo` in the above).

The alternative idea uses this principle: if all we care about is where
the arrows go, then simply get rid of the names... Instead of
referencing a binding through its name, just specify which of the
surrounding scopes we want to refer to. For example, instead of:

    {with {x 5} {with {y 6} {+ x y}}}

we can use a new "reference" syntax --- `[N]` --- and use this instead
of the above:

    {with 5 {with 6 {+ [1] [0]}}}

So the rules for `[N]` are --- `[0]` is the value bound in the current
scope, `[1]` is the value from the next one up etc.

Of course, to do this translation, we have to know the precise scope
rules. Two more complicated examples:

    {with {x 5} {+ x {with {y 6} {+ x y}}}}

is translated to:

    {with 5 {+ [0] {with 6 {+ [1] [0]}}}}

(note how `x` appears as a different reference based on where it
appeared in the original code.) Even more subtle:

    {with {x 5} {with {y {+ x 1}} {+ x y}}}

is translated to:

    {with 5 {with {+ [0] 1} {+ [1] [0]}}}

because the inner `with` does not have its own named expression in its
scope, so the named expression is immediately in the scope of the outer
`with`.

This is called "de Bruijn Indexes": instead of referencing identifiers
by their name, we use an index into the surrounding binding context. The
major disadvantage, as can be seen in the above examples, is that it is
not convenient for humans to work with. Specifically, the same
identifier is referenced using different numbers, which makes it hard to
understand what some code is doing. After all, *abstractions* are the
main thing we deal with when we write programs, and having labels make
the bindings structure much easier to understand than scope counts.

However, practically all compilers use this for compiled code (think
about stack pointers). For example, GCC compiles this code:

    {
      int x = 5;
      {
        int y = x + 1;
        return x + y;
      }
    }

to:

    subl $8, %esp
    movl $5, -4(%ebp)   ; int x = 5
    movl -4(%ebp), %eax
    incl %eax
    movl %eax, -8(%ebp) ; int y = %eax
    movl -8(%ebp), %eax
    addl -4(%ebp), %eax


------------------------------------------------------------------------
# Functions & Function Values [Tuesday, January 28th]

> [PLAI §4]

Now that we have a form for local bindings, which forced us to deal with
proper substitutions and everything that is related, we can get to
functions. The concept of a function is itself very close to
substitution, and to our `with` form. For example, when we write:

    {with {x 5}
      {* x x}}

then the `{* x x}` body is itself parametrized over some value for `x`.
If we take this expression and take out the `5`, we're left with
something that has all of the necessary ingredients of a function --- a
bunch of code that is parameterized over some input identifier:

    {with {x}
      {* x x}}

We only need to replace `with` and use a proper name that indicates that
it's a function:

    {fun {x}
      {* x x}}

Now we have a new form in our language, one that should have a function
as its meaning. As we have seen in the case of `with` expressions, we
also need a new form to *use* these functions. We will use `call` for
this, so that

    {call {fun {x} {* x x}}
          5}

will be the same as the original `with` expression that we started with
--- the `fun` expression is like the `with` expression with no value,
and applying it on `5` is providing that value back:

    {with {x 5}
      {* x x}}

Of course, this does not help much --- all we get is a way to use local
bindings that is more verbose from what we started with. What we're
really missing is a way to *name* these functions. If we get the right
evaluation rules, we can evaluate a `fun` expression to some value ---
which will allow us to bind it to a variable using `with`. Something
like this:

    {with {sqr {fun {x} {* x x}}}
      {+ {call sqr 5}
         {call sqr 6}}}

In this expression, we say that `x` is the formal parameter (or
argument), and the `5` and `6` are actual parameters (sometimes
abbreviated as formals and actuals). Note that naming functions often
helps, but many times there are small functions that are fine to specify
without a name --- for example, consider a two-stage addition function,
where there is no apparent good name for the returned function:

    {with {add {fun {x}
                 {fun {y}
                   {+ x y}}}}
      {call {call add 8} 9}}


------------------------------------------------------------------------
# Implementing First Class Functions [Tuesday, January 28th]

> [PLAI §6] (uses some stuff from [PLAI §5], which we do later)

This is a simple plan, but it is directly related to how functions are
going to be used in our language. We know that `{call {fun {x} E1} E2}`
is equivalent to a `with` expression, but the new thing here is that we
do allow writing just the `{fun ...}` expression by itself, and
therefore we need to have some meaning for it. The meaning, or the value
of this expression, should roughly be "an expression that needs a value
to be plugged in for `x`". In other words, our language will have these
new kinds of values that contain an expression to be evaluated later on.

There are three basic approaches that classify programming languages in
relation to how the deal with functions:

1. First order: functions are not real values. They cannot be used or
   returned as values by other functions. This means that they cannot be
   stored in data structures. This is what most "conventional" languages
   used to have in the past. (You will be implementing such a language
   in homework 4.)

   An example of such a language is the Beginner Student language that
   is used in HtDP, where the language is intentionally first-order to
   help students write correct code (at the early stages where using a
   function as a value is usually an error). It's hard to find practical
   modern languages that fall in this category.

2. Higher order: functions can receive and return other functions as
   values. This is what you get with C and modern Fortran.

3. First class: functions are values with all the rights of other
   values. In particular, they can be supplied to other functions,
   returned from functions, stored in data structures, and new functions
   can be created at run-time. (And most modern languages have first
   class functions.)

The last category is the most interesting one. Back in the old days,
complex expressions were not first-class in that they could not be
freely composed. This is still the case in machine-code: as we've seen
earlier, to compute an expression such as

    (-b + sqrt(b^2 - 4*a*c)) / 2a

you have to do something like this:

    x = b * b
    y = 4 * a
    y = y * c
    x = x - y
    x = sqrt(x)
    y = -b
    x = y + x
    y = 2 * a
    s = x / y

In other words, every intermediate value needs to have its own name. But
with proper ("high-level") programming languages (at least most of
them...) you can just write the original expression, with no names for
these values.

With first-class functions something similar happens --- it is possible
to have complex expressions that consume and return functions, and they
do not need to be named.

What we get with our `fun` expression (if we can make it work) is
exactly this: it generates a function, and you can choose to either bind
it to a name, or not. The important thing is that the value exists
independently of a name.

This has a major effect on the "personality" of a programming language
as we will see. In fact, just adding this feature will make our language
much more advanced than languages with just higher-order or first-order
functions.

------------------------------------------------------------------------

Quick Example: the following is working JavaScript code, that uses first
class functions.

    function foo(x) {
      function bar(y) { return x + y; }
      return bar;
    }
    function main() {
      var f = foo(1);
      var g = foo(10);
      return [f(2), g(2)];
    }

Note that the above definition of `foo` does *not* use an anonymous
"lambda expression" --- in Racket terms, it's translated to

    (define (foo x)
      (define (bar y) (+ x y))
      bar)

The returned function is not anonymous, but it's not really named
either: the `bar` name is bound only inside the body of `foo`, and
outside of it that name no longer exists since it's not its scope. It
gets used in the printed form if the function value is displayed, but
this is merely a debugging aid. The anonymous `lambda` version that is
common in Racket can be used in JavaScript too:

    function foo(x) {
      return function(y) { return x + y; }
    }

> Side-note: GCC includes extensions that allow internal function
> definitions, but it still does not have first class functions ---
> trying to do the above is broken:
>
>     #include <stdio.h>
>     typedef int(*int2int)(int);
>     int2int foo(int x) {
>       int bar(int y) { return x + y; }
>       return bar;
>     }
>     int main() {
>       int2int f = foo(1);
>       int2int g = foo(10);
>       printf(">> %d, %d\n", f(2), g(2));
>     }


------------------------------------------------------------------------
## Side-note: how important is it to have *anonymous* functions? [Tuesday, January 28th]

You'll see many places where people refer to the feature of first-class
functions as the ability to create *anonymous* functions, but this is a
confusion and it's not accurate. Whether a function has a name or not is
not the important question --- instead, the important question is
whether functions can exist with no *bindings* that refers to them.

As a quick example in Racket:

    (define (foo x)
      (define (bar y) (+ x y))
      bar)

in Javascript:

    function foo(x) {
      function bar(y) {
        return x + y;
      }
      return bar;
    }

and in Python:

    def foo(x):
      def bar(y):
        return x + y
      return bar

In all three of these, we have a `foo` function that returns a function
*named* `bar` --- but the `bar` name, is only available in the scope of
`foo`. The fact that the name is displayed as part of the textual
rendering of the function value is merely a debugging feature.


------------------------------------------------------------------------
# The FLANG Language [Tuesday, January 28th]

Now for the implementation --- we call this new language FLANG.

First, the BNF:

    <FLANG> ::= <num>
              | { + <FLANG> <FLANG> }
              | { - <FLANG> <FLANG> }
              | { * <FLANG> <FLANG> }
              | { / <FLANG> <FLANG> }
              | { with { <id> <FLANG> } <FLANG> }
              | <id>
              | { fun { <id> } <FLANG> }
              | { call <FLANG> <FLANG> }

And the matching type definition:

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol       FLANG] ; No named-expression
      [Call FLANG FLANG])

The parser for this grammar is, as usual, straightforward:

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

We also need to patch up the substitution function to deal with these
things. The scoping rule for the new function form is, unsurprisingly,
similar to the rule of `with`, except that there is no extra expression
now, and the scoping rule for `call` is the same as for the arithmetic
operators:

    N[v/x]                = N

    {+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}

    {- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}

    {* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}

    {/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}

    y[v/x]                = y
    x[v/x]                = v

    {with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]}
    {with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}

    {call E1 E2}[v/x]     = {call E1[v/x] E2[v/x]}

    {fun {y} E}[v/x]      = {fun {y} E[v/x]}
    {fun {x} E}[v/x]      = {fun {x} E}

And the matching code:

    (: subst : FLANG Symbol FLANG -> FLANG)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (With bound-id
               (subst named-expr from to)
               (if (eq? bound-id from)
                 bound-body
                 (subst bound-body from to)))]
        [(Call l r) (Call (subst l from to) (subst r from to))]
        [(Fun bound-id bound-body)
         (if (eq? bound-id from)
           expr
           (Fun bound-id (subst bound-body from to)))]))

------------------------------------------------------------------------

Now, before we start working on an evaluator, we need to decide on what
exactly do we use to represent values of this language. Before we had
functions, we had only number values and we used Racket numbers to
represent them. Now we have two kinds of values --- numbers and
functions. It seems easy enough to continue using Racket numbers to
represent numbers, but what about functions? What should be the result
of evaluating

    {fun {x} {+ x 1}}

? Well, this is the new toy we have: it should be a function value,
which is something that can be used just like numbers, but instead of
arithmetic operations, we can `call` these things. What we need is a way
to avoid evaluating the body expression of the function --- *delay* it
--- and instead use some value that will contain this delayed expression
in a way that can be used later.

To accommodate this, we will change our implementation strategy a
little: we will use our syntax objects for numbers (`(Num n)` instead of
just `n`), which will be a little inconvenient when we do the arithmetic
operations, but it will simplify life by making it possible to evaluate
functions in a similar way: simply return their own syntax object as
their values. The syntax object has what we need: the body expression
that needs to be evaluated later when the function is called, and it
also has the identifier name that should be replaced with the actual
input to the function call. This means that evaluating:

    (Add (Num 1) (Num 2))

now yields

    (Num 3)

and a number `(Num 5)` evaluates to `(Num 5)`.

In a similar way, `(Fun 'x (Num 2))` evaluates to `(Fun 'x (Num 2))`.

Why would this work? Well, because `call` will be very similar to `with`
--- the only difference is that its arguments are ordered a little
differently, being retrieved from the function that is applied and the
argument.

The formal evaluation rules are therefore treating functions like
numbers, and use the syntax object to represent both values:

    eval(N)         = N

    eval({+ E1 E2}) = eval(E1) + eval(E2)

    eval({- E1 E2}) = eval(E1) - eval(E2)

    eval({* E1 E2}) = eval(E1) * eval(E2)

    eval({/ E1 E2}) = eval(E1) / eval(E2)

    eval(id)        = error!

    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

    eval(FUN)       = FUN ; assuming FUN is a function expression

    eval({call E1 E2})
                    = eval(B[eval(E2)/x])   if eval(E1) = {fun {x} B}
                    = error!                otherwise

Note that the last rule could be written using a translation to a `with`
expression:

    eval({call E1 E2})
                    = eval({with {x E2} B}) if eval(E1) = {fun {x} B}
                    = error!                otherwise

And alternatively, we could specify `with` using `call` and `fun`:

    eval({with {x E1} E2}) = eval({call {fun {x} E2} E1})

There is a small problem in these rules which is intuitively seen by the
fact that the evaluation rule for a `call` is expected to be very
similar to the one for arithmetic operations. We now have two kinds of
values, so we need to check the arithmetic operation's arguments too:

    eval({+ E1 E2}) = N1 + N2
                        if eval(E1), eval(E2) evaluate to numbers N1, N2
                        otherwise error!
    ...

The corresponding code is:

    (: eval : FLANG -> FLANG)                      ;*** note return type
    ;; evaluates FLANG expressions by reducing them to *expressions* but
    ;; only expressions that stand for values: only `Fun`s and `Num`s
    (define (eval expr)
      (cases expr
        [(Num n) expr]                             ;*** change here
        [(Add l r) (arith-op + (eval l) (eval r))]
        [(Sub l r) (arith-op - (eval l) (eval r))]
        [(Mul l r) (arith-op * (eval l) (eval r))]
        [(Div l r) (arith-op / (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (eval named-expr)))]         ;*** no `(Num ...)'
        [(Id name) (error 'eval "free identifier: ~s" name)]
        [(Fun bound-id bound-body) expr]           ;*** similar to `Num'
        [(Call (Fun bound-id bound-body) arg-expr) ;*** nested pattern
         (eval (subst bound-body                   ;*** just like `with'
                      bound-id
                      (eval arg-expr)))]
        [(Call something arg-expr)
         (error 'eval "`call' expects a function, got: ~s" something)]))

Note that the `Call` case is doing the same thing we do in the `With`
case. In fact, we could have just *generated* a `With` expression and
evaluate that instead:

        ...
        [(Call (Fun bound-id bound-body) arg-expr)
         (eval (With bound-id arg-expr bound-body))]
        ...

The `arith-op` function is in charge of checking that the input values
are numbers (represented as FLANG numbers), translating them to plain
numbers, performing the Racket operation, then re-wrapping the result in
a `Num`. Note how its type indicates that it is a higher-order function.

    (: arith-op : (Number Number -> Number) FLANG FLANG -> FLANG)
    ;; gets a Racket numeric binary operator, and uses it within a FLANG
    ;; `Num' wrapper (note the H.O. type, and note the hack of the `val`
    ;; name which is actually an AST that represents a runtime value)
    (define (arith-op op val1 val2)
      (Num (op (Num->number val1) (Num->number val2))))

It uses the following function to convert FLANG numbers to Racket
numbers. (Note that `else` is almost always a bad idea since it can
prevent the compiler from showing you places to edit code --- but this
case is an exception since we never want to deal with anything other
than `Num`s.) The reason that this function is relatively trivial is
that we chose the easy way and represented numbers using Racket numbers,
but we could have used strings or anything else.

    (: Num->number : FLANG -> Number)
    ;; convert a FLANG number to a Racket one
    (define (Num->number e)
      (cases e
        [(Num n) n]
        [else (error 'arith-op "expected a number, got: ~s" e)]))

We can also make things a little easier to use if we make `run` convert
the result to a number:

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str))])
        (cases result
          [(Num n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

Adding few simple tests we get:

    ;; The Flang interpreter

    #lang pl

    #|
    The grammar:
      <FLANG> ::= <num>
                | { + <FLANG> <FLANG> }
                | { - <FLANG> <FLANG> }
                | { * <FLANG> <FLANG> }
                | { / <FLANG> <FLANG> }
                | { with { <id> <FLANG> } <FLANG> }
                | <id>
                | { fun { <id> } <FLANG> }
                | { call <FLANG> <FLANG> }

    Evaluation rules:

      subst:
        N[v/x]                = N
        {+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}
        {- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}
        {* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}
        {/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}
        y[v/x]                = y
        x[v/x]                = v
        {with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]} ; if y =/= x
        {with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}
        {call E1 E2}[v/x]     = {call E1[v/x] E2[v/x]}
        {fun {y} E}[v/x]      = {fun {y} E[v/x]}           ; if y =/= x
        {fun {x} E}[v/x]      = {fun {x} E}

      eval:
        eval(N)            = N
        eval({+ E1 E2})    = eval(E1) + eval(E2)  \ if both E1 and E2
        eval({- E1 E2})    = eval(E1) - eval(E2)   \ evaluate to numbers
        eval({* E1 E2})    = eval(E1) * eval(E2)   / otherwise error!
        eval({/ E1 E2})    = eval(E1) / eval(E2)  /
        eval(id)           = error!
        eval({with {x E1} E2}) = eval(E2[eval(E1)/x])
        eval(FUN)          = FUN ; assuming FUN is a function expression
        eval({call E1 E2}) = eval(B[eval(E2)/x])
                               if eval(E1)={fun {x} B}, otherwise error!
    |#

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    (: subst : FLANG Symbol FLANG -> FLANG)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (With bound-id
               (subst named-expr from to)
               (if (eq? bound-id from)
                 bound-body
                 (subst bound-body from to)))]
        [(Call l r) (Call (subst l from to) (subst r from to))]
        [(Fun bound-id bound-body)
         (if (eq? bound-id from)
           expr
           (Fun bound-id (subst bound-body from to)))]))

    (: Num->number : FLANG -> Number)
    ;; convert a FLANG number to a Racket one
    (define (Num->number e)
      (cases e
        [(Num n) n]
        [else (error 'arith-op "expected a number, got: ~s" e)]))

    (: arith-op : (Number Number -> Number) FLANG FLANG -> FLANG)
    ;; gets a Racket numeric binary operator, and uses it within a FLANG
    ;; `Num' wrapper
    (define (arith-op op val1 val2)
      (Num (op (Num->number val1) (Num->number val2))))

    (: eval : FLANG -> FLANG)
    ;; evaluates FLANG expressions by reducing them to *expressions* but
    ;; only expressions that stand for values: only `Fun`s and `Num`s
    (define (eval expr)
      (cases expr
        [(Num n) expr]
        [(Add l r) (arith-op + (eval l) (eval r))]
        [(Sub l r) (arith-op - (eval l) (eval r))]
        [(Mul l r) (arith-op * (eval l) (eval r))]
        [(Div l r) (arith-op / (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (eval named-expr)))]
        [(Id name) (error 'eval "free identifier: ~s" name)]
        [(Fun bound-id bound-body) expr]
        [(Call (Fun bound-id bound-body) arg-expr)
         (eval (subst bound-body
                      bound-id
                      (eval arg-expr)))]
        [(Call something arg-expr)
         (error 'eval "`call' expects a function, got: ~s" something)]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str))])
        (cases result
          [(Num n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)


------------------------------------------------------------------------
# The FLANG Language (contd.) [Tuesday, January 28th]

There is still a problem with this version. First a question --- if
`call` is similar to arithmetic operations (and to `with` in what it
actually does), then how come the code is different enough that it
doesn't even need an auxiliary function?

Second question: what *should* happen if we evaluate these code
snippets:

    (run "{with {add {fun {x}
                       {fun {y}
                         {+ x y}}}}
           {call {call add 8} 9}}")
    (run "{with {identity {fun {x} x}}
            {with {foo {fun {x} {+ x 1}}}
              {call {call identity foo} 123}}}")
    (run "{call {call {fun {x} {call x 1}}
                      {fun {x} {fun {y} {+ x y}}}}
                123}")

Third question, what *will* happen if we do the above?

What we're missing is an evaluation of the function expression, in case
it's not a literal `fun` form. The following fixes this:

    (: eval : FLANG -> FLANG)
    ;; evaluates FLANG expressions by reducing them to *expressions* but
    ;; only expressions that stand for values: only `Fun`s and `Num`s
    (define (eval expr)
      (cases expr
        [(Num n) expr]
        [(Add l r) (arith-op + (eval l) (eval r))]
        [(Sub l r) (arith-op - (eval l) (eval r))]
        [(Mul l r) (arith-op * (eval l) (eval r))]
        [(Div l r) (arith-op / (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (eval named-expr)))]
        [(Id name) (error 'eval "free identifier: ~s" name)]
        [(Fun bound-id bound-body) expr]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr))       ;*** need to evaluate this!
         (cases fval
           [(Fun bound-id bound-body)
            (eval (subst bound-body bound-id (eval arg-expr)))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

The complete code is:

    ;;; ---<<<FLANG>>>--------------------------------------------------
    ;; The Flang interpreter

    #lang pl

    #|
    The grammar:
      <FLANG> ::= <num>
                | { + <FLANG> <FLANG> }
                | { - <FLANG> <FLANG> }
                | { * <FLANG> <FLANG> }
                | { / <FLANG> <FLANG> }
                | { with { <id> <FLANG> } <FLANG> }
                | <id>
                | { fun { <id> } <FLANG> }
                | { call <FLANG> <FLANG> }

    Evaluation rules:

      subst:
        N[v/x]                = N
        {+ E1 E2}[v/x]        = {+ E1[v/x] E2[v/x]}
        {- E1 E2}[v/x]        = {- E1[v/x] E2[v/x]}
        {* E1 E2}[v/x]        = {* E1[v/x] E2[v/x]}
        {/ E1 E2}[v/x]        = {/ E1[v/x] E2[v/x]}
        y[v/x]                = y
        x[v/x]                = v
        {with {y E1} E2}[v/x] = {with {y E1[v/x]} E2[v/x]} ; if y =/= x
        {with {x E1} E2}[v/x] = {with {x E1[v/x]} E2}
        {call E1 E2}[v/x]     = {call E1[v/x] E2[v/x]}
        {fun {y} E}[v/x]      = {fun {y} E[v/x]}           ; if y =/= x
        {fun {x} E}[v/x]      = {fun {x} E}

      eval:
        eval(N)            = N
        eval({+ E1 E2})    = eval(E1) + eval(E2)  \ if both E1 and E2
        eval({- E1 E2})    = eval(E1) - eval(E2)   \ evaluate to numbers
        eval({* E1 E2})    = eval(E1) * eval(E2)   / otherwise error!
        eval({/ E1 E2})    = eval(E1) / eval(E2)  /
        eval(id)           = error!
        eval({with {x E1} E2}) = eval(E2[eval(E1)/x])
        eval(FUN)          = FUN ; assuming FUN is a function expression
        eval({call E1 E2}) = eval(B[eval(E2)/x])
                               if eval(E1)={fun {x} B}, otherwise error!
    |#

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    (: subst : FLANG Symbol FLANG -> FLANG)
    ;; substitutes the second argument with the third argument in the
    ;; first argument, as per the rules of substitution; the resulting
    ;; expression contains no free instances of the second argument
    (define (subst expr from to)
      (cases expr
        [(Num n) expr]
        [(Add l r) (Add (subst l from to) (subst r from to))]
        [(Sub l r) (Sub (subst l from to) (subst r from to))]
        [(Mul l r) (Mul (subst l from to) (subst r from to))]
        [(Div l r) (Div (subst l from to) (subst r from to))]
        [(Id name) (if (eq? name from) to expr)]
        [(With bound-id named-expr bound-body)
         (With bound-id
               (subst named-expr from to)
               (if (eq? bound-id from)
                 bound-body
                 (subst bound-body from to)))]
        [(Call l r) (Call (subst l from to) (subst r from to))]
        [(Fun bound-id bound-body)
         (if (eq? bound-id from)
           expr
           (Fun bound-id (subst bound-body from to)))]))

    (: Num->number : FLANG -> Number)
    ;; convert a FLANG number to a Racket one
    (define (Num->number e)
      (cases e
        [(Num n) n]
        [else (error 'arith-op "expected a number, got: ~s" e)]))

    (: arith-op : (Number Number -> Number) FLANG FLANG -> FLANG)
    ;; gets a Racket numeric binary operator, and uses it within a FLANG
    ;; `Num' wrapper
    (define (arith-op op val1 val2)
      (Num (op (Num->number val1) (Num->number val2))))

    (: eval : FLANG -> FLANG)
    ;; evaluates FLANG expressions by reducing them to *expressions* but
    ;; only expressions that stand for values: only `Fun`s and `Num`s
    (define (eval expr)
      (cases expr
        [(Num n) expr]
        [(Add l r) (arith-op + (eval l) (eval r))]
        [(Sub l r) (arith-op - (eval l) (eval r))]
        [(Mul l r) (arith-op * (eval l) (eval r))]
        [(Div l r) (arith-op / (eval l) (eval r))]
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (eval named-expr)))]
        [(Id name) (error 'eval "free identifier: ~s" name)]
        [(Fun bound-id bound-body) expr]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr))
         (cases fval
           [(Fun bound-id bound-body)
            (eval (subst bound-body bound-id (eval arg-expr)))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str))])
        (cases result
          [(Num n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {add {fun {x}
                             {fun {y}
                               {+ x y}}}}
                  {call {call add 8} 9}}")
          => 17)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Introducing Racket's `lambda` [Tuesday, January 28th]

> Quick laundry list of things to go over:
> - `fun` & `lambda`
> - difference between lambda and simple values
> - not being able to do recursive functions with `let`
> - let* as a derived form
> - let with lambda in Racket --> can be a derived form
> - how `if` can be used to implement `and` and `or` as derived forms
> - Newtonian syntax vs. a lambda expression.

Almost all modern languages have this capability. For example, this:

    (define (f g) (g 2 3))
    (f +) ==> 5
    (f *) ==> 6
    (f (lambda (x y) (+ (square x) (square y)))) ==> 13

Can be written in JavaScript like this:

    function f(g) { return g(2,3); }
    function square(x) { return x*x; }
    console.log(f(function (x,y) { return square(x) + square(y); }));

or in ES6 JavaScript:

    let f = (g) => g(2,3);
    let square = (x) => x*x;
    console.log(f((x,y) => square(x) + square(y)));

In Perl:

    sub f { my ($g) = @_; return $g->(2,3); }
    sub square { my ($x) = @_; return $x * $x; }
    print f(sub { my ($x, $y) = @_; return square($x) + square($y); });

In Ruby:

    def f(g) g.call(2,3) end
    def square(x) x*x end
    puts f(lambda{|x,y| square(x) + square(y)})

etc. Even [Java has lambda expressions], and "recently" [C++ added them
too].

[Java has lambda expressions]:
  http://www.drdobbs.com/jvm/lambda-expressions-in-java-8/240166764
[C++ added them too]:
  http://www.cprogramming.com/c++11/c++11-lambda-closures.html


------------------------------------------------------------------------
# Using Functions as Objects [Tuesday, January 28th]

A very important aspect of Racket --- using "higher order" functions ---
functions that get and return functions. Here is a very simple example:

    (define (f x) (lambda () x))
    (define a (f 2))
    (a)  -->  2
    (define b (f 3))
    (b)  -->  3

Note: what we get is actually an object that remembers (by the
substitution we're doing) a number. How about:

    (define aa (f a))
    (aa)    -->  #<procedure> (this is a)
    ((aa))  -->  2

Take this idea to the next level:

    (define (kons x y)
      (lambda (b)
        (if b x y)))
    (define (kar p) (p #t))
    (define (kdr p) (p #f))
    (define a (kons 1 2))
    (define b (kons 3 4))
    (list (kar a) (kdr a))
    (list (kar b) (kdr b))

Or, with types:

    (: kons : (All (A B) A B -> (Boolean -> (U A B))))
    (define (kons x y)
      (lambda (b)
        (if b x y)))
    (: kar : (All (T) (Boolean -> T) -> T))
    (define (kar p) (p #t))
    (: kdr : (All (T) (Boolean -> T) -> T))
    (define (kdr p) (p #f))
    (define a (kons 1 2))
    (define b (kons 3 4))
    (list (kar a) (kdr a))
    (list (kar b) (kdr b))

Even more --- why should the internal function expect a boolean and
choose what to return? We can simply expect a function that will take
the two values and return one:

    (define (kons x y) (lambda (s) (s x y)))
    (define (kar p) (p (lambda (x y) x)))
    (define (kdr p) (p (lambda (x y) y)))
    (define a (kons 1 2))
    (define b (kons 3 4))
    (list (kar a) (kdr a))
    (list (kar b) (kdr b))

And a typed version, using our own constructor to make it a little less
painful:

    (define-type (Kons A B) = ((A B -> (U A B)) -> (U A B)))

    (: kons : (All (A B) A B -> (Kons A B)))
    (define (kons x y) (lambda (s) (s x y)))
    (: kar : (All (A B) (Kons A B) -> (U A B)))
    (define (kar p) (p (lambda (x y) x)))
    (: kdr : (All (A B) (Kons A B) -> (U A B)))
    (define (kdr p) (p (lambda (x y) y)))
    (define a (kons 1 2))
    (define b (kons 3 4))
    (list (kar a) (kdr a))
    (list (kar b) (kdr b))

Note that the `Kons` type definition is the same as:

    (define-type Kons = (All (A B) (A B -> (U A B)) -> (U A B)))

so `All` is to polymorphic type definitions what `lambda` is for
function definitions.

Finally, in JavaScript:

    function kons(x,y) { return function(s) { return s(x, y); } }
    function kar(p) { return p(function(x,y){ return x; }); }
    function kdr(p) { return p(function(x,y){ return y; }); }
    a = kons(1,2);
    b = kons(3,4);
    console.log('a = <' + kar(a) + ',' + kdr(a) + '>' );
    console.log('b = <' + kar(b) + ',' + kdr(b) + '>' );

or with ES6 *arrow functions*, the function definitions become:

    const kons = (x,y) => s => s(x,y);
    const kar  = p => p((x,y) => x);
    const kdr  = p => p((x,y) => y);

and using Typescript to add types:

    type Kons<A,B> = (s: (x:A, y:B) => A|B) => A | B
    const kons = <A,B>(x:A,y:B) => (s: ((x:A, y:B) => A|B)) => s(x,y);
    const kar  = <A,B>(p: Kons<A,B>) => p((x,y) => x);
    const kdr  = <A,B>(p: Kons<A,B>) => p((x,y) => y);


------------------------------------------------------------------------
# Using `define-type` for new "type aliases" [Tuesday, January 28th]

As seen in these examples, there is another way to use `define-type`,
using a `=` to create a new type name "alias" for an *existing* type.
For example:

    (define-type Strings = (Listof String))

These uses of `define-type` do not define any new kind of type, they are
essentially a convenience tool for making code shorter and more
readable.

    (define-type NumericFunction = Number -> Number)

    (: square : NumericFunction)
    (define (square n) (* n n))

Note in particular that this can also be used to define "alias type
constructors" too: somewhat similar to creating new "type functions".
For example:

    (define-type (BinaryFun In Out) = In In -> Out)

    (: diagonal : (BinaryFun Natural Number))
    (define (diagonal width height)
      (sqrt (+ (* width width) (* height height))))

This is something that we will only need in a few rare cases.


------------------------------------------------------------------------
# Currying [Tuesday, January 28th]

A *curried* function is a function that, instead of accepting two (or
more) arguments, accepts only one and returns a function that accepts
the rest. For example:

    (: plus : Number -> (Number -> Number))
    (define (plus x)
      (lambda (y)
        (+ x y)))

It's easy to write functions for translating between normal and curried
versions.

    (define (currify f)
      (lambda (x)
        (lambda (y)
          (f x y))))

Typed version of that, with examples:

    (: currify : (All (A B C) (A B -> C) -> (A -> (B -> C))))
    ;; convert a double-argument function to a curried one
    (define (currify f)
      (lambda (x) (lambda (y) (f x y))))

    (: add : Number Number -> Number)
    (define (add x y) (+ x y))

    (: plus : Number -> (Number -> Number))
    (define plus (currify add))

    (test ((plus 1) 2) => 3)
    (test (((currify add) 1) 2) => 3)
    (test (map (plus 1) '(1 2 3)) => '(2 3 4))
    (test (map ((currify add) 1) '(1 2 3)) => '(2 3 4))
    (test (map ((currify +) 1) '(1 2 3)) => '(2 3 4))

Usages --- common with H.O. functions like map, where we want to *fix*
one argument.

When dealing with such higher-order code, the types are very helpful,
since every arrow corresponds to a function:

    (: currify : (All (A B C) (A B -> C) -> (A -> (B -> C))))

It is common to make the `->` function type associate to the right, so
you can find this type written as:

    currify : (A B -> C) -> (A -> B -> C)

or even as

    currify : (A B -> C) -> A -> B -> C

but that can be a little confusing...


------------------------------------------------------------------------
# Using Higher-Order & Anonymous Functions [Tuesday, January 28th]

Say that we have a function for estimating derivatives of a function at
a specific point:

    (define dx 0.01)

    (: deriv : (Number -> Number) Number -> Number)
    ;; compute the derivative of `f' at the given point `x'
    (define (deriv f x)
      (/ (- (f (+ x dx)) (f x)) dx))

    (: integrate : (Number -> Number) Number -> Number)
    ;; compute an integral of `f' at the given point `x'
    (define (integrate f x)
      (: loop : Number Number -> Number)
      (define (loop y acc)
        (if (> y x)
          (* acc dx)
          (loop (+ y dx) (+ acc (f y)))))
      (loop 0 0))

And say that we want to try out various functions given some `plot`
function that draws graphs of numeric functions, for example:

    (plot sin)

The problem is that `plot` expects a single `(Number -> Number)`
function --- if we want to try it with a derivative, we can do this:

    (: sin-deriv : Number -> Number)
    ;; the derivative of sin
    (define sin-deriv (lambda (x) (deriv sin x)))
    (plot sin-deriv)

But this will get very tedious very fast --- it is much simpler to use
an anonymous function:

    (plot (lambda (x) (deriv sin x)))

we can even verify that our derivative is correct by comparing a known
function to its derivative

    (plot (lambda (x) (- (deriv sin x) (cos x))))

But it's still not completely natural to do these things --- you need to
explicitly combine functions, which is not too convenient. Instead of
doing this, we can write H.O. functions that will work with functional
inputs and outputs. For example, we can write a function to subtract
functions:

    (: fsub : (Number -> Number) (Number -> Number)
              -> (Number -> Number))
    ;; subtracts two numeric 1-argument functions
    (define (fsub f g)
      (lambda (x) (- (f x) (g x))))

and the same for the derivative:

    (: fderiv : (Number -> Number) -> (Number -> Number))
    ;; compute the derivative function of `f'
    (define (fderiv f)
      (lambda (x) (deriv f x)))

Now we can try the same in a much easier way:

    (plot (fsub (fderiv sin) cos))

More than that --- our `fderiv` could be created from `deriv`
automatically:

    (: currify : (All (A B C) (A B -> C) -> (A -> B -> C)))
    ;; convert a double-argument function to a curried one
    (define (currify f)
      (lambda (x) (lambda (y) (f x y))))

    (: fderiv : (Number -> Number) -> (Number -> Number))
    ;; compute the derivative function of `f'
    (define fderiv (currify deriv))

Same principle with `fsub`: we can write a function that converts a
binary arithmetical function into a function that operates on unary
numeric function. But to make things more readable we can define new
types for unary and binary numeric functions:

    (define-type UnaryFun  = (Number -> Number))
    (define-type BinaryFun = (Number Number -> Number))

    (: binop->fbinop : BinaryFun -> (UnaryFun UnaryFun -> UnaryFun))
    ;; turns an arithmetic binary operator to a function operator
    (define (binop->fbinop op)
      (lambda (f g)
        (lambda (x) (op (f x) (g x)))))

    (: fsub : UnaryFun UnaryFun -> UnaryFun)
    ;; functional pointwise subtraction
    (define fsub (binop->fbinop -))

We can do this with anything --- developing a rich library of functions
and functionals (functions over functions) is extremely easy... Here's a
pretty extensive yet very short library of functions:

    #lang pl untyped

    (define (currify f)
      (lambda (x) (lambda (y) (f x y))))
    (define (binop->fbinop op)
      (lambda (f g)
        (lambda (x) (op (f x) (g x)))))
    (define (compose f g)
      (lambda (x) (f (g x))))

    (define dx 0.01)
    (define (deriv f x)
      (/ (- (f (+ x dx)) (f x)) dx))
    (define (integrate f x)
      (define over? (if (< x 0) < >))
      (define step  (if (< x 0) - +))
      (define add   (if (< x 0) - +))
      (define (loop y acc)
        (if (over? y x)
          (* acc dx)
          (loop (step y dx) (add acc (f y)))))
      (loop 0 0))

    (define fadd (binop->fbinop +))
    (define fsub (binop->fbinop -))
    (define fmul (binop->fbinop *))
    (define fdiv (binop->fbinop /))
    (define fderiv     (currify deriv))
    (define fintegrate (currify integrate))
    ;; ...

This is written in the "untyped dialect" of the class language, but it
should be easy now to add the types.

Examples:

    ;; want to verify that `integrate' is the opposite of `deriv':
    ;;   take a function, subtract it from its derivative's integral
    (plot (fsub sin (fintegrate (fderiv sin))))

    ;; want to magnify the errors? -- here's how you magnify:
    (plot (compose ((currify *) 5) sin))

    ;; so:
    (plot (compose ((currify *) 20)
                   (fsub sin (fintegrate (fderiv sin)))))


------------------------------------------------------------------------
## Side-note: "Point-Free" combinators [Tuesday, January 28th]

> Forming functions without using `lambda` (or an implicit `lambda`
> using a `define` syntactic sugar) is called *point-free style*. It's
> especially popular in Haskell, where it is easier to form functions
> this way because of implicit currying and a large number of higher
> level function combinators. If used too much, it can easily lead to
> obfuscated code.


------------------------------------------------------------------------
## This is not Runtime Code Generation [Tuesday, January 28th]

All of this is similar to run-time code generation, but not really. The
only thing that `fderiv` does is take a function and store it somewhere
in the returned function, then when that function receives a number, it
uses the stored function and send it to deriv with the number. We could
simply write deriv as what `fderiv` is --- which is the *real*
derivative function:

    (define (deriv f)
      (lambda (x)
        (/ (- (f (+ x dx)) (f x)) dx)))

but again, this is not faster or slower than the plain `deriv`. However,
there are some situations where we can do some of the computation on the
first-stage argument, saving work from the second stage. Here is a
cooked-to-exaggeration example --- we want a function that receives two
inputs `x`, `y` and returns `fib(x)*y`, but we must use a stupid `fib`:

    (define (fib n)
      (if (<= n 1)
        n
        (+ (fib (- n 1)) (fib (- n 2)))))

The function we want is:

    (define (bogus x y)
      (* (fib x) y))

If we currify it as usual (or just use `currify`), we get:

    (define (bogus x)
      (lambda (y)
        (* (fib x) y)))

And try this several times:

    (define bogus36 (bogus 36))
    (map bogus36 '(1 2 3 4 5))

But in the definition of `bogus`, notice that `(fib x)` does not depend
on `y` --- so we can rewrite it a little differently:

    (define (bogus x)
      (let ([fibx (fib x)])
        (lambda (y)
          (* fibx y))))

and trying the above again is much faster now:

    (define bogus36 (bogus 36))
    (map bogus36 '(1 2 3 4 5))

This is therefore not doing any kind of runtime code generation, but it
*enables* doing similar optimizations in our code. A proper RTCG
facility would recompile the curried function for a given first input,
and (hopefully) automatically achieve the optimization that we did in a
manual way.


------------------------------------------------------------------------
# Substitution Caches [Tuesday, February 4th]

> [PLAI §5] (called "deferred substitutions" there)

Evaluating using substitutions is very inefficient --- at each scope, we
copy a piece of the program AST. This includes all function calls which
implies an impractical cost (function calls should be *cheap*!).

To get over this, we want to use a cache of substitutions.

Basic idea: we begin evaluating with no cached substitutions, then
collect them as we encounter bindings.

[Implies another change for our evaluator: we don't really substitute
identifiers until we get to them; when we reach an identifier, it is no
longer an error --- we must consult the substitution cache.]


------------------------------------------------------------------------
# Implementation of Cache Functionality [Tuesday, February 4th]

First, we need a type for a substitution cache. For this we will use a
list of lists of two elements each --- a name and its value FLANG:

    ;; a type for substitution caches:
    (define-type SubstCache = (Listof (List Symbol FLANG)))

We need to have an empty substitution cache, a way to extend it, and a
way to look things up:

    (: empty-subst : SubstCache)
    (define empty-subst null)

    (: extend : Symbol FLANG SubstCache -> SubstCache)
    ;; extend a given substitution cache with a new mapping
    (define (extend id expr sc)
      (cons (list id expr) sc))

    (: lookup : Symbol SubstCache -> FLANG)
    ;; lookup a symbol in a substitution cache, return the value it is
    ;; bound to (or throw an error if it isn't bound)
    (define (lookup name sc)
      (cond [(null? sc) (error 'lookup "no binding for ~s" name)]
            [(eq? name (first (first sc))) (second (first sc))]
            [else (lookup name (rest sc))]))

Actually, the reason to use such list of lists is that Racket has a
built-in function called `assq` that will do this kind of search (`assq`
is a search in an association list using `eq?` for the key comparison).
This is a version of `lookup` that uses `assq`:

    (define (lookup name sc)
      (let ([cell (assq name sc)])
        (if cell
          (second cell)
          (error 'lookup "no binding for ~s" name))))


------------------------------------------------------------------------
# Formal Rules for Cached Substitutions [Tuesday, February 4th]

The formal evaluation rules are now different. Evaluation carries along
a *substitution cache* that begins its life as empty: so `eval` needs an
extra argument. We begin by writing the rules that deal with the cache,
and use the above function names for simplicity --- the behavior of the
three definitions can be summed up in a single rule for `lookup`:

    lookup(x,empty-subst)     = error!
    lookup(x,extend(x,E,sc))  = E
    lookup(x,extend(y,E,sc))  = lookup(x,sc)  if `x` is not `y`

And now we can write the new rules for `eval`

    eval(N,sc)                = N
    eval({+ E1 E2},sc)        = eval(E1,sc) + eval(E2,sc)
    eval({- E1 E2},sc)        = eval(E1,sc) - eval(E2,sc)
    eval({* E1 E2},sc)        = eval(E1,sc) * eval(E2,sc)
    eval({/ E1 E2},sc)        = eval(E1,sc) / eval(E2,sc)
    eval(x,sc)                = lookup(x,sc)
    eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
    eval({fun {x} E},sc)      = {fun {x} E}
    eval({call E1 E2},sc)     = eval(B,extend(x,eval(E2,sc),sc))
                                  if eval(E1,sc) = {fun {x} B}
                              = error!  otherwise

Note that there is no mention of `subst` --- the whole point is that we
don't really do substitution, but use the cache instead. The `lookup`
rules, and the places where `extend` is used replaces `subst`, and
therefore specifies our scoping rules.

Also note that the rule for `call` is still very similar to the rule for
`with`, but it looks like we have lost something --- the interesting bit
with substituting into `fun` expressions.


------------------------------------------------------------------------
# Evaluating with Substitution Caches [Tuesday, February 4th]

Implementing the new `eval` is easy now --- it is extended in the same
way that the formal `eval` rule is extended:

    (: eval : FLANG SubstCache -> FLANG)
    ;; evaluates FLANG expressions by reducing them to expressions
    (define (eval expr sc)
      (cases expr
        [(Num n) expr]
        [(Add l r) (arith-op + (eval l sc) (eval r sc))]
        [(Sub l r) (arith-op - (eval l sc) (eval r sc))]
        [(Mul l r) (arith-op * (eval l sc) (eval r sc))]
        [(Div l r) (arith-op / (eval l sc) (eval r sc))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (extend bound-id (eval named-expr sc) sc))]
        [(Id name) (lookup name sc)]
        [(Fun bound-id bound-body) expr]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr sc))
         (cases fval
           [(Fun bound-id bound-body)
            (eval bound-body (extend bound-id (eval arg-expr sc) sc))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

Again, note that we don't need `subst` anymore, but the rest of the code
(the data type definition, parsing, and `arith-op`) is exactly the same.

Finally, we need to make sure that `eval` is initially called with an
empty cache. This is easy to change in our main `run` entry point:

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) empty-subst)])
        (cases result
          [(Num n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

The full code (including the same tests, but not including formal rules
for now) follows. Note that one test does not pass.

    #lang pl

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; a type for substitution caches:
    (define-type SubstCache = (Listof (List Symbol FLANG)))

    (: empty-subst : SubstCache)
    (define empty-subst null)

    (: extend : Symbol FLANG SubstCache -> SubstCache)
    ;; extend a given substitution cache with a new mapping
    (define (extend name val sc)
      (cons (list name val) sc))

    (: lookup : Symbol SubstCache -> FLANG)
    ;; lookup a symbol in a substitution cache, return the value it is
    ;; bound to (or throw an error if it isn't bound)
    (define (lookup name sc)
      (let ([cell (assq name sc)])
        (if cell
          (second cell)
          (error 'lookup "no binding for ~s" name))))

    (: Num->number : FLANG -> Number)
    ;; convert a FLANG number to a Racket one
    (define (Num->number e)
      (cases e
        [(Num n) n]
        [else (error 'arith-op "expected a number, got: ~s" e)]))

    (: arith-op : (Number Number -> Number) FLANG FLANG -> FLANG)
    ;; gets a Racket numeric binary operator, and uses it within a FLANG
    ;; `Num' wrapper
    (define (arith-op op val1 val2)
      (Num (op (Num->number val1) (Num->number val2))))

    (: eval : FLANG SubstCache -> FLANG)
    ;; evaluates FLANG expressions by reducing them to expressions
    (define (eval expr sc)
      (cases expr
        [(Num n) expr]
        [(Add l r) (arith-op + (eval l sc) (eval r sc))]
        [(Sub l r) (arith-op - (eval l sc) (eval r sc))]
        [(Mul l r) (arith-op * (eval l sc) (eval r sc))]
        [(Div l r) (arith-op / (eval l sc) (eval r sc))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (extend bound-id (eval named-expr sc) sc))]
        [(Id name) (lookup name sc)]
        [(Fun bound-id bound-body) expr]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr sc))
         (cases fval
           [(Fun bound-id bound-body)
            (eval bound-body (extend bound-id (eval arg-expr sc) sc))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) empty-subst)])
        (cases result
          [(Num n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => "???")
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Dynamic and Lexical Scopes [Tuesday, February 4th]

This seems like it should work, and it even worked on a few examples,
except for one which was hard to follow. Seems like we have a bug...

Now we get to a tricky issue that managed to be a problem for *lots* of
language implementors, including the first version of Lisp. Lets try to
run the following expression --- try to figure out what it will evaluate
to:

    (run "{with {x 3}
            {with {f {fun {y} {+ x y}}}
              {with {x 5}
                {call f 4}}}}")

We expect it to return `7` (at least I do!), but we get `9` instead...
The question is --- *should* it return `9`?

What we have arrived to is called *dynamic scope*. Scope is determined
by the dynamic run-time environment (which is represented by our
substitution cache). This is *almost always* undesirable, as I hope to
convince you.

Before we start, we define two scope options for a programming language:

* Static Scope (also called Lexical Scope): In a language with static
  scope, each identifier gets its value from the scope of its
  definition, not its use.

* Dynamic Scope: In a language with dynamic scope, each identifier gets
  its value from the scope of its use, not its definition.

Racket uses lexical scope, our new evaluator uses dynamic, the old
substitution-based evaluator was static etc.

As a side-remark, Lisp began its life as a dynamically-scoped language.
The artifacts of this were (sort-of) dismissed as an implementation bug.
When Scheme was introduced, it was the first Lisp dialect that used
strictly lexical scoping, and Racket is obviously doing the same. (Some
Lisp implementations used dynamic scope for interpreted code and lexical
scope for compiled code!) In fact, Emacs Lisp is the only *live*
dialects of Lisp that is still dynamically scoped by default. To see
this, compare a version of the above code in Racket:

    (let ((x 3))
      (let ((f (lambda (y) (+ x y))))
        (let ((x 5))
          (f 4))))

and the Emacs Lisp version (which looks almost the same):

    (let ((x 3))
      (let ((f (lambda (y) (+ x y))))
        (let ((x 5))
          (funcall f 4))))

which also happens when we use another function on the way:

    (defun blah (func val)
      (funcall func val))

    (let ((x 3))
      (let ((f (lambda (y) (+ x y))))
        (let ((x 5))
          (blah f 4))))

and note that renaming identifiers can lead to different code --- change
that `val` to `x`:

    (defun blah (func x)
      (funcall func x))

    (let ((x 3))
      (let ((f (lambda (y) (+ x y))))
        (let ((x 5))
          (blah f 4))))

and you get `8` because the argument name changed the `x` that the
internal function sees!

Consider also this Emacs Lisp function:

    (defun return-x ()
      x)

which has no meaning by itself (`x` is unbound),

    (return-x)

but can be given a dynamic meaning using a `let`:

    (let ((x 5)) (return-x))

or a function application:

    (defun foo (x)
      (return-x))

    (foo 5)

There is also a dynamically-scoped language in the course languages:

    #lang pl dynamic

    (define x 123)

    (define (getx) x)

    (define (bar1 x) (getx))
    (define (bar2 y) (getx))

    (test (getx) => 123)
    (test (let ([x 456]) (getx)) => 456)
    (test (getx) => 123)
    (test (bar1 999) => 999)
    (test (bar2 999) => 123)

    (define (foo x) (define (helper) (+ x 1)) helper)
    (test ((foo 0)) => 124)

    ;; and *much* worse:
    (define (add x y) (+ x y))
    (test (let ([+ *]) (add 6 7)) => 42)

Note how bad the last example gets: you basically cannot call any
function and know in advance what it will do.

There are some cases where dynamic scope can be useful in that it allows
you to "remotely" customize any piece of code. A good example of where
this is taken to an extreme is Emacs: originally, it was based on an
ancient Lisp dialect that was still dynamically scoped, but it retained
this feature even when practically all Lisp dialects moved on to having
lexical scope by default. The reason for this is that the danger of
dynamic scope is also a way to make a very open system where almost
anything can be customized by changing it "remotely". Here's a concrete
example for a similar kind of dynamic scope usage that makes a very
hackable and open system:

    #lang pl dynamic

    (define tax% 6.25)
    (define (with-tax n)
      (+ n (* n (/ tax% 100))))

    (with-tax 10) ; how much do we pay?
    (let ([tax% 17.0]) (with-tax 10)) ; how much would we pay in Israel?

    ;; make that into a function
    (define il-tax% 17.0)
    (define (ma-over-il-saving n)
      (- (let ([tax% il-tax%]) (with-tax n))
         (with-tax n)))

    (ma-over-il-saving 10)
    ;; can even control that: how much would we save if
    ;; the tax in israel went down one percent?
    (let ([il-tax% (- il-tax% 1)]) (ma-over-il-saving 10))

    ;; or change both: how much savings in NH instead of MA?
    (let ((tax% 0.0) (il-tax% tax%))
      (ma-over-il-saving 1000))

Obviously, this power to customize everything is also the main source of
problems with getting no guarantees for code. A common way to get the
best of both worlds is to have *controllable* dynamic scope. For
example, Common Lisp also has lexical scope everywhere by default, but
some variables can be declared as *special*, which means that they are
dynamically scoped. The main problem with that is that you can't tell
when a variable is special by just looking at the code that uses it, so
a more popular approach is the one that is used in Racket: all bindings
are always lexically scoped, but there are *parameters* which are a kind
of dynamically scoped value containers --- but they are bound to plain
(lexically scoped) identifiers. Here's the same code as above,
translated to Racket with parameters:

    #lang racket

    (define tax% (make-parameter 6.5))  ; create the dynamic container
    (define (with-tax n)
      (+ n (* n (/ (tax%) 100))))       ; note how its value is accessed

    (with-tax 10) ; how much do we pay?
    (parameterize ([tax% 17.0]) (with-tax 10)) ; not a `let'

    ;; make that into a function
    (define il-tax% (make-parameter 17.0))
    (define (ma-over-il-saving n)
      (- (parameterize ([tax% (il-tax%)]) (with-tax n))
         (with-tax n)))

    (ma-over-il-saving 10)
    (parameterize ([il-tax% (- (il-tax%) 1)]) (ma-over-il-saving 10))

The main point here is that the points where a dynamically scoped value
is used are under the programmer's control --- you cannot "customize"
what `-` is doing, for example. This gives us back the guarantees that
we like to have (= that code works), but of course these points are
pre-determined, unlike an environment where everything can be customized
including things that are unexpectedly useful.

> As a side-note, after many decades of debating this, Emacs has finally
> added lexical scope in its core language, but this is still determined
> by a flag --- a global `lexical-binding` variable.


------------------------------------------------------------------------
# Dynamic versus Lexical Scope [Tuesday, February 4th]

And back to the discussion of whether we should use dynamic or lexical
scope:

* The most important fact is that we want to view programs as executed
  by the normal substituting evaluator. Our original motivation was to
  optimize evaluation only --- not to *change* the semantics! It follows
  that we want the result of this optimization to behave in the same
  way. All we need is to evaluate:

      (run "{with {x 3}
              {with {f {fun {y} {+ x y}}}
                {with {x 5}
                  {call f 4}}}}")

  in the original evaluator to get convinced that `7` should be the
  correct result (note also that the same code, when translated into
  Racket, evaluates to `7`).

  (Yet, this is a very important optimization, which without it lots of
  programs become too slow to be feasible, so you might claim that
  you're fine with the modified semantics...)

* It does not allow using functions as objects, for example, we have
  seen that we have a functional representation for pairs:

      (define (kons x y)
        (lambda (n)
          (match n
            ['first  x]
            ['second y]
            [else (error ...)])))

      (define my-pair (kons 1 2))

  If this is evaluated in a dynamically-scoped language, we do get a
  function as a result, but the values bound to `x` and `y` are now
  gone! Using the substitution model we substituted these values in, but
  now they were only held in a cache which no has no entries for them...

  In the same way, currying would not work, our nice `deriv` function
  would not work etc etc etc.

* Makes reasoning impossible, because any piece of code behaves in a way
  that *cannot* be predicted until run-time. For example, if dynamic
  scoping was used in Racket, then you wouldn't be able to know what
  this function is doing:

      (define (foo)
        x)

  As it is, it will cause a run-time error, but if you call it like
  this:

      (let ([x 1])
        (foo))

  then it will return `1`, and if you later do this:

      (define (bar x)
        (foo))

      (let ([x 1])
        (bar 2))

  then you would get `2`!

  These problems can be demonstrated in Emacs Lisp too, but Racket goes
  one step further --- it uses the same rule for evaluating a function
  as well as its values (Lisp uses a different name-space for
  functions). Because of this, you cannot even rely on the following
  function:

      (define (add x y)
        (+ x y))

  to always add `x` and `y`! --- A similar example to the above:

      (let ([+ -])
        (add 1 2))

  would return `-1`!

* Many so-called "scripting" languages begin their lives with dynamic
  scoping. The main reason, as we've seen, is that implementing it is
  extremely simple (no, *nobody* does substitution in the real world!
  (Well, *almost* nobody...)).

  Another reason is that these problems make life impossible if you want
  to use functions as object like you do in Racket, so you notice them
  very fast --- but in a `normal` language without first-class
  functions, problems are not as obvious.

* For example, bash has `local` variables, but they have dynamic scope:

      x="the global x"
      print_x() { echo "The current value of x is \"$x\""; }
      foo() { local x="x from foo"; print_x; }
      print_x; foo; print_x

  Perl began its life with dynamic scope for variables that are declared
  `local`:

      $x="the global x";
      sub print_x { print "The current value of x is \"$x\"\n"; }
      sub foo { local($x); $x="x from foo"; print_x; }
      print_x; foo; print_x;

  When faced with this problem, "the Perl way" was, obviously, not to
  remove or fix features, but to pile them up --- so `local` *still*
  behaves in this way, and now there is a `my` declaration which
  achieves proper lexical scope (and every serious Perl programmer knows
  that you should always use `my`)...

  There are other examples of languages that changed, and languages that
  want to change (e.g, nobody likes dynamic scope in Emacs Lisp, but
  there's just too much code now).

* This is still a tricky issue, like any other issue with bindings. For
  example, googling got me quickly to [a Python blog post] which is
  confused about what "dynamic scoping" is... It claims that Python uses
  dynamic scope (Search for "Python uses dynamic as opposed to lexical
  scoping"), yet python always used lexical scope rules, as can be seen
  by translating their code to Racket (ignore side-effects in this
  computation):

      (define (orange-juice)
        (* x 2))
      (define x 3)
      (define y (orange-juice)) ; y is now 6
      (define x 1)
      (define y (orange-juice)) ; y is now 2

  or by trying this in Python:

      def orange_juice():
        return x*2
      def foo(x):
        return orange_juice()
      foo(2)

  The real problem of python (pre 2.1, and pre 2.2 without the funny

      from __future__ import nested_scope

  line) is that it didn't create closures, which we will talk about
  shortly.

  [a Python blog post]:
    https://folk.idi.ntnu.no/mlh/hetland_org/writing/instant-python.html

* Another example, which is an indicator of how easy it is to mess up
  your scope is the following Ruby bug --- running in `irb`:

      % irb
      irb(main):001:0> x = 0
      => 0
      irb(main):002:0> lambda{|x| x}.call(5)
      => 5
      irb(main):003:0> x
      => 5

  (This is a bug due to weird scoping rules for variables, which was
  fixed in newer versions of Ruby. See [this Ruby rant] for details, or
  read about [Ruby and the principle of unwelcome surprise] for
  additional gems (the latter is gone, so you'll need the [web
  archive](https://archive.org/) to read it).)

  [this Ruby rant]:
    http://innig.net/software/ruby/closures-in-ruby
  [Ruby and the principle of unwelcome surprise]:
    http://ceaude.twoticketsplease.de/articles/ruby-and-the-principle-of-unwelcome-surprise.html

* Another thing to consider is the fact that compilation is something
  that you do based only on the lexical structure of programs, since
  compilers never actually run code. This means that dynamic scope makes
  compilation close to impossible.

* There are some advantages for dynamic scope too. Two notable ones are:

  - Dynamic scope makes it easy to have a "configuration variable"
    easily change for the extent of a calling piece of code (this is
    used extensively in Emacs, for example). The thing is that usually
    we want to control which variables are "configurable" in this way,
    statically scoped languages like Racket often choose a separate
    facility for these. To rephrase the problem of dynamic scoping, it's
    that *all* variables are modifiable.

    The same can be said about functions: it is sometimes desirable to
    change a function dynamically (for example, see "Aspect Oriented
    Programming"), but if there is no control and all functions can
    change, we get a world where no code can every be reliable.

  - It makes recursion immediately available --- for example,

        {with {f {fun {x} {call f x}}}
          {call f 0}}

    is an infinite loop with a dynamically scoped language. But in a
    lexically scoped language we will need to do some more work to get
    recursion going.


------------------------------------------------------------------------
# Implementing Lexical Scope: Closures and Environments [Tuesday, February 11th]

So how do we fix this?

Lets go back to the root of the problem: the new evaluator does not
behave in the same way as the substituting evaluator. In the old
evaluator, it was easy to see how functions can behave as objects that
remember values. For example, when we do this:

    {with {x 1}
      {fun {y}
        {+ x y}}}

the result was a function value, which actually was the syntax object
for this:

    {fun {y} {+ 1 y}}

Now if we call this function from someplace else like:

    {with {f {with {x 1} {fun {y} {+ x y}}}}
      {with {x 2}
        {call f 3}}}

it is clear what the result will be: f is bound to a function that adds
1 to its input, so in the above the later binding for `x` has no effect
at all.

But with the caching evaluator, the value of

    {with {x 1}
      {fun {y}
        {+ x y}}}

is simply:

    {fun {y} {+ x y}}

and there is no place where we save the 1 --- *that's* the root of our
problem. (That's also what makes people suspect that using `lambda` in
Racket and any other functional language involves some inefficient
code-recompiling magic.) In fact, we can verify that by inspecting the
returned value, and see that it does contain a free identifier.

Clearly, we need to create an object that contains the body and the
argument list, like the function syntax object --- but we don't do any
substitution, so in addition to the body an argument name(s) we need to
remember that we still need to substitute `x` by `1` . This means that
the pieces of information we need to know are:

    - formal argument(s):    {y}
    - body:                  {+ x y}
    - pending substitutions: [1/x]

and that last bit has the missing `1`. The resulting object is called a
`closure` because it closes the function body over the substitutions
that are still pending (its environment).

So, the first change is in the value of functions which now need all
these pieces, unlike the `Fun` case for the syntax object.

A second place that needs changing is the when functions are called.
When we're done evaluating the `call` arguments (the function value and
the argument value) but before we apply the function we have two
*values* --- there is no more use for the current substitution cache at
this point: we have finished dealing with all substitutions that were
necessary over the current expression --- we now continue with
evaluating the body of the function, with the new substitutions for the
formal arguments and actual values given. But the body itself is the
same one we had before --- which is the previous body with its suspended
substitutions that we *still* did not do.

Rewrite the evaluation rules --- all are the same except for evaluating
a `fun` form and a `call` form:

    eval(N,sc)                = N
    eval({+ E1 E2},sc)        = eval(E1,sc) + eval(E2,sc)
    eval({- E1 E2},sc)        = eval(E1,sc) - eval(E2,sc)
    eval({* E1 E2},sc)        = eval(E1,sc) * eval(E2,sc)
    eval({/ E1 E2},sc)        = eval(E1,sc) / eval(E2,sc)
    eval(x,sc)                = lookup(x,sc)
    eval({with {x E1} E2},sc) = eval(E2,extend(x,eval(E1,sc),sc))
    eval({fun {x} E},sc)      = <{fun {x} E}, sc>
    eval({call E1 E2},sc1)    = eval(B,extend(x,eval(E2,sc1),sc2))
                                  if eval(E1,sc1) = <{fun {x} B}, sc2>
                              = error!  otherwise

As a side note, these substitution caches are a little more than "just a
cache" now --- they actually hold an *environment* of substitutions in
which expression should be evaluated. So we will switch to the common
*environment* name now.:

    eval(N,env)                = N
    eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
    eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
    eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
    eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
    eval(x,env)                = lookup(x,env)
    eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
    eval({fun {x} E},env)      = <{fun {x} E}, env>
    eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2, env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                               = error!  otherwise

In case you find this easier to follow, the "flat algorithm" for
evaluating a `call` is:

    1. f := evaluate E1 in env1
    2. if f is not a <{fun ...},...> closure then error!
    3. x := evaluate E2 in env1
    4. new_env := extend env_of(f) by mapping arg_of(f) to x
    5. evaluate (and return) body_of(f) in new_env

Note how the scoping rules that are implied by this definition match the
scoping rules that were implied by the substitution-based rules. (It
should be possible to prove that they are the same.)

The changes to the code are almost trivial, except that we need a way to
represent `<{fun {x} B}, env>` pairs.

------------------------------------------------------------------------

The implication of this change is that we now cannot use the same type
for function syntax and function values since function values have more
than just syntax. There is a simple solution to this --- we never do any
substitutions now, so we don't need to translate values into expressions
--- we can come up with a new type for values, separate from the type of
abstract syntax trees.

When we do this, we will also fix our hack of using FLANG as the type of
values: this was merely a convenience since the AST type had cases for
all kinds of values that we needed. (In fact, you should have noticed
that Racket does this too: numbers, strings, booleans, etc are all used
by both programs and syntax representation (s-expressions) --- but note
that function values are *not* used in syntax.) We will now implement a
separate `VAL` type for runtime values.

First, we need now a type for such environments --- we can use `Listof`
for this:

    ;; a type for environments:
    (define-type ENV = (Listof (List Symbol VAL)))

but we can just as well define a new type for environment values:

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol VAL ENV])

Reimplementing `lookup` is now simple:

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

... we don't need `extend` because we get `Extend` from the type
definition, and we also get `(EmptyEnv)` instead of `empty-subst`.

We now use this with the new type for values --- two variants of these:

    (define-type VAL
      [NumV Number]
      [FunV Symbol FLANG ENV]) ; arg-name, body, scope

And now the new implementation of `eval` which uses the new type and
implements lexical scope:

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

We also need to update `arith-op` to use `VAL` objects. The full code
follows --- it now passes all tests, including the example that we used
to find the problem.

    ;;; ---<<<FLANG-ENV>>>----------------------------------------------
    ;; The Flang interpreter, using environments

    #lang pl

    #|
    The grammar:
      <FLANG> ::= <num>
                | { + <FLANG> <FLANG> }
                | { - <FLANG> <FLANG> }
                | { * <FLANG> <FLANG> }
                | { / <FLANG> <FLANG> }
                | { with { <id> <FLANG> } <FLANG> }
                | <id>
                | { fun { <id> } <FLANG> }
                | { call <FLANG> <FLANG> }

    Evaluation rules:
      eval(N,env)                = N
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
      eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
      eval(x,env)                = lookup(x,env)
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise
    |#

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol VAL ENV])

    (define-type VAL
      [NumV Number]
      [FunV Symbol FLANG ENV])

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

    (: NumV->number : VAL -> Number)
    ;; convert a FLANG runtime numeric value to a Racket one
    (define (NumV->number val)
      (cases val
        [(NumV n) n]
        [else (error 'arith-op "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (NumV->number val1) (NumV->number val2))))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) (EmptyEnv))])
        (cases result
          [(NumV n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Fixing an Overlooked Bug [Tuesday, February 11th]

Incidentally, this version fixes a bug we had previously in the
substitution version of FLANG:

    (run "{with {f {fun {y} {+ x y}}}
            {with {x 7}
              {call f 1}}}")

This bug was due to our naive `subst`, which doesn't avoid capturing
renames. But note that since that version of the evaluator makes its way
from the outside in, there is no difference in semantics for *valid*
programs --- ones that don't have free identifiers.

(Reminder: This was *not* a dynamically scoped language, just a bug that
happened when `x` wasn't substituted away before `f` was replaced with
something that refers to `x`.)


------------------------------------------------------------------------
# Lexical Scope using Racket Closures [Tuesday, February 11th]

> [PLAI §11] (without the last part about recursion)

An alternative representation for an environment.

We've already seen how first-class functions can be used to implement
"objects" that contain some information. We can use the same idea to
represent an environment. The basic intuition is --- an environment is a
*mapping* (a function) between an identifier and some value. For
example, we can represent the environment that maps `'a` to `1` and `'b`
to `2` (using just numbers for simplicity) using this function:

    (: my-map : Symbol -> Number)
    (define (my-map id)
      (cond [(eq? 'a id) 1]
            [(eq? 'b id) 2]
            [else (error ...)]))

An empty mapping that is implemented in this way has the same type:

    (: empty-mapping : Symbol -> Number)
    (define (empty-mapping id)
      (error ...))

We can use this idea to implement our environments: we only need to
define three things --- `EmptyEnv`, `Extend`, and `lookup`. If we manage
to keep the contract to these functions intact, we will be able to
simply plug it into the same evaluator code with no other changes. It
will also be more convenient to define `ENV` as the appropriate function
type for use in the `VAL` type definition instead of using the actual
type:

    ;; Define a type for functional environments
    (define-type ENV = Symbol -> VAL)

Now we get to `EmptyEnv` --- this is expected to be a function that
expects no arguments and creates an empty environment, one that behaves
like the `empty-mapping` function defined above. We could define it like
this (changing the `empty-mapping` type to return a `VAL`):

    (define (EmptyEnv) empty-mapping)

but we can skip the need for an extra definition and simply return an
empty mapping function:

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error ...)))

(The un-Rackety name is to avoid replacing previous code that used the
`EmptyEnv` name for the constructor that was created by the type
definition.)

The next thing we tackle is `lookup`. The previous definition that was
used is:

    (: lookup : Symbol ENV -> VAL)
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

How should it be modified now? Easy --- an environment is a mapping: a
Racket function that will do the searching job itself. We don't need to
modify the contract since we're still using `ENV`, except a different
implementation for it. The new definition is:

    (: lookup : Symbol ENV -> VAL)
    (define (lookup name env)
      (env name))

Note that `lookup` does almost nothing --- it simply delegates the real
work to the `env` argument. This is a good hint for the error message
that empty mappings should throw ---

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

Finally, `Extend` --- this was previously created by the variant case of
the ENV type definition:

    [Extend Symbol VAL ENV]

keeping the same type that is implied by this variant means that the new
`Extend` should look like this:

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      ...)

The question is --- how do we extend a given environment? Well, first,
we know that the result should be mapping --- a `symbol -> VAL` function
that expects an identifier to look for:

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      (lambda (name)
        ...))

Next, we know that in the generated mapping, if we look for `id` then
the result should be `val`:

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          ...)))

If the `name` that we're looking for is not the same as `id`, then we
need to search through the previous environment:

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (lookup name rest-env))))

But we know what `lookup` does --- it simply delegates back to the
mapping function (which is our `rest` argument), so we can take a direct
route instead:

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name)))) ; same as (lookup name rest-env)

To see how all this works, try out extending an empty environment a few
times and examine the result. For example, the environment that we began
with:

    (define (my-map id)
      (cond [(eq? 'a id) 1]
            [(eq? 'b id) 2]
            [else (error ...)]))

behaves in the same way (if the type of values is numbers) as

    (Extend 'a 1 (Extend 'b 2 (EmptyEnv)))

The new code is now the same, except for the environment code:

    #lang pl

    #|
    The grammar:
      <FLANG> ::= <num>
                | { + <FLANG> <FLANG> }
                | { - <FLANG> <FLANG> }
                | { * <FLANG> <FLANG> }
                | { / <FLANG> <FLANG> }
                | { with { <id> <FLANG> } <FLANG> }
                | <id>
                | { fun { <id> } <FLANG> }
                | { call <FLANG> <FLANG> }

    Evaluation rules:
      eval(N,env)                = N
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
      eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
      eval(x,env)                = lookup(x,env)
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise
    |#

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    (define-type VAL
      [NumV Number]
      [FunV Symbol FLANG ENV])

    ;; Define a type for functional environments
    (define-type ENV = Symbol -> VAL)

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

    (: Extend : Symbol VAL ENV -> ENV)
    ;; extend a given environment cache with a new binding
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name))))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (env name))

    (: NumV->number : VAL -> Number)
    ;; convert a FLANG runtime numeric value to a Racket one
    (define (NumV->number val)
      (cases val
        [(NumV n) n]
        [else (error 'arith-op "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (NumV->number val1) (NumV->number val2))))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) (EmptyEnv))])
        (cases result
          [(NumV n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# More Closures (on both levels) [Tuesday, February 11th]

Racket closures (= functions) can be used in other places too, and as we
have seen, they can do more than encapsulate various values --- they can
also hold the behavior that is expected of these values.

To demonstrate this we will deal with closures in our language. We
currently use a variant that holds the three pieces of relevant
information:

    [FunV Symbol FLANG ENV]

We can replace this by a functional object, which will hold the three
values. First, change the `VAL` type to hold functions for `FunV`
values:

    (define-type VAL
      [NumV Number]
      [FunV (? -> ?)])

And note that the function should somehow encapsulate the same
information that was there previously, the question is *how* this
information is going to be done, and this will determine the actual
type. This information plays a role in two places in our evaluator ---
generating a closure in the `Fun` case, and using it in the `Call` case:

    [(Fun bound-id bound-body)
     (FunV bound-id bound-body env)]
    [(Call fun-expr arg-expr)
     (define fval (eval fun-expr env))
     (cases fval
       [(FunV bound-id bound-body f-env)
        (eval bound-body                   ;***
              (Extend bound-id             ;***
                      (eval arg-expr env)  ;***
                      f-env))]             ;***
       [else (error 'eval "`call' expects a function, got: ~s" fval)])]

we can simply fold the marked functionality bit of `Call` into a Racket
function that will be stored in a `FunV` object --- this piece of
functionality takes an argument value, extends the closure's environment
with its value and the function's name, and continues to evaluate the
function body. Folding all of this into a function gives us:

    (lambda (arg-val)
      (eval bound-body (Extend bound-id arg-val env)))

where the values of `bound-body`, `bound-id`, and `val` are known at the
time that the `FunV` is *constructed*. Doing this gives us the following
code for the two cases:

    [(Fun bound-id bound-body)
     (FunV (lambda (arg-val)
             (eval bound-body (Extend bound-id arg-val env))))]
    [(Call fun-expr arg-expr)
     (define fval (eval fun-expr env))
     (cases fval
       [(FunV proc) (proc (eval arg-expr env))]
       [else (error 'eval "`call' expects a function, got: ~s" fval)])]

And now the type of the function is clear:

    (define-type VAL
      [NumV Number]
      [FunV (VAL -> VAL)])

And again, the rest of the code is unmodified:

    #lang pl

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    (define-type VAL
      [NumV Number]
      [FunV (VAL -> VAL)])

    ;; Define a type for functional environments
    (define-type ENV = Symbol -> VAL)

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

    (: Extend : Symbol VAL ENV -> ENV)
    ;; extend a given environment cache with a new binding
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name))))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (env name))

    (: NumV->number : VAL -> Number)
    ;; convert a FLANG runtime numeric value to a Racket one
    (define (NumV->number val)
      (cases val
        [(NumV n) n]
        [else (error 'arith-op "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (NumV->number val1) (NumV->number val2))))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (FunV (lambda (arg-val)
                 (eval bound-body (Extend bound-id arg-val env))))]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV proc) (proc (eval arg-expr env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) (EmptyEnv))])
        (cases result
          [(NumV n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Types of Evaluators [Tuesday, February 11th]

What we did just now is implement lexical environments and closures in
the language we implement using lexical environments and closures in our
own language (Racket)!

This is another example of embedding a feature of the host language in
the implemented language, an issue that we have already discussed.

There are many examples of this, even when the two languages involved
are different. For example, if we have this bit in the C implementation
of Racket:

    // Disclaimer: not real Racket code
    Racket_Object *eval_and(int argc, Racket_Object *argv[]) {
      Racket_Object *tmp;
      if ( argc != 2 )
        signal_racket_error("bad number of arguments");
      else if ( racket_eval(argv[0]) != racket_false
                &&
                (tmp = racket_eval(argv[1])) != racket_false )
        return tmp;
      else
        return racket_false;
    }

then the special semantics of evaluating a Racket `and` form is being
inherited from C's special treatment of `&&`. You can see this by the
fact that if there is a bug in the C compiler, then it will propagate to
the resulting Racket implementation too. A different solution is to not
use `&&` at all:

    // Disclaimer: not real Racket code
    Racket_Object *eval_and(int argc, Racket_Object *argv[]) {
      Racket_Object *tmp;
      if ( argc != 2 )
        signal_racket_error("bad number of arguments");
      else if ( racket_eval(argv[0]) != racket_false )
        return racket_eval(argv[1]);
      else
        return racket_false;
    }

and we can say that this is even better since it evaluates the second
expression in tail position. But in this case we don't really get that
benefit, since C itself is not doing tail-call optimization as a
standard feature (though some compilers do so under some circumstances).

We have seen a few different implementations of evaluators that are
quite different in flavor. They suggest the following taxonomy.

* A ___syntactic evaluator___ is one that uses its own language to
  represent expressions and semantic runtime values of the evaluated
  language, implementing all the corresponding behavior explicitly.

* A ___meta evaluator___ is an evaluator that uses language features of
  its own language to directly implement behavior of the evaluated
  language.

While our substitution-based FLANG evaluator was close to being a
syntactic evaluator, we haven't written any purely syntactic evaluators
so far: we still relied on things like Racket arithmetics etc. The most
recent evaluator that we have studied, is even more of a *meta*
evaluator than the preceding ones: it doesn't even implement closures
and lexical scope, and instead, it uses the fact that Racket itself has
them.

With a good match between the evaluated language and the implementation
language, writing a meta evaluator can be very easy. With a bad match,
though, it can be very hard. With a syntactic evaluator, implementing
each semantic feature will be somewhat hard, but in return you don't
have to worry as much about how well the implementation and the
evaluated languages match up. In particular, if there is a particularly
strong mismatch between the implementation and the evaluated language,
it may take less effort to write a syntactic evaluator than a meta
evaluator.

As an exercise, we can build upon our latest evaluator to remove the
encapsulation of the evaluator's response in the VAL type. The resulting
evaluator is shown below. This is a true meta evaluator: it uses Racket
closures to implement FLANG closures, Racket function application for
FLANG function application, Racket numbers for FLANG numbers, and Racket
arithmetic for FLANG arithmetic. In fact, ignoring some small syntactic
differences between Racket and FLANG, this latest evaluator can be
classified as something more specific than a meta evaluator:

* A ___meta-circular evaluator___ is a meta evaluator in which the
  implementation and the evaluated languages are the same.

  This is essentially the concept of a "universal" evaluator, as in a
  "universal turing machine".

(Put differently, the trivial nature of the evaluator clues us in to the
deep connection between the two languages, whatever their syntactic
differences may be.)


------------------------------------------------------------------------
# Feature Embedding [Tuesday, February 11th]

We saw that the difference between lazy evaluation and eager evaluation
is in the evaluation rules for `with` forms, function applications, etc:

    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

is eager, and

    eval({with {x E1} E2}) = eval(E2[E1/x])

is lazy. But is the first rule *really* eager? The fact is that the only
thing that makes it eager is the fact that our understanding of the
mathematical notation is eager --- if we were to take math as lazy, then
the description of the rule becomes a description of lazy evaluation.

Another way to look at this is --- take the piece of code that
implements this evaluation:

    (: eval : FLANG -> Number)
    ;; evaluates FLANG expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        ...
        [(With bound-id named-expr bound-body)
         (eval (subst bound-body
                      bound-id
                      (Num (eval named-expr))))]
        ...))

and the same question applies: is this really implementing eager
evaluation? We know that this is indeed eager --- we can simply try it
and check that it is, but it is only eager because we are using an eager
language for the implementation! If our own language was lazy, then the
evaluator's implementation would run lazily, which means that the above
applications of the `eval` and the `subst` functions would also be lazy,
making our evaluator lazy as well.

This is a general phenomena where some of the semantic features of the
language we use (math in the formal description, Racket in our code)
gets *embedded* into the language we implement.

Here's another example --- consider the code that implements
arithmetics:

    (: eval : FLANG -> Number)
    ;; evaluates FLANG expressions by reducing them to numbers
    (define (eval expr)
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (eval l) (eval r))]
        ...))

what if it was written like this:

    FLANG eval(FLANG expr) {
      if (is_Num(expr))
        return num_of_Num(expr);
      else if (is_Add(expr))
        return eval(lhs_of_Add(expr)) + eval(rhs_of_Add(expr));
      else if ...
      ...
    }

Would it still implement unlimited integers and exact fractions? That
depends on the language that was used to implement it: the above syntax
suggests C, C++, Java, or some other relative, which usually come with
limited integers and no exact fractions. But this really depends on the
language --- even our own code has unlimited integers and exact
rationals only because Racket has them. If we were using a language that
didn't have such features (there are such Scheme implementations), then
our implemented language would absorb these (lack of) features too, and
its own numbers would be limited in just the same way. (And this
includes the syntax for numbers, which we embedded intentionally, like
the syntax for identifiers).

The bottom line is that we should be aware of such issues, and be very
careful when we talk about semantics. Even the language that we use to
communicate (semi-formal logic) can mean different things.

------------------------------------------------------------------------

Aside: read "Reflections on Trusting Trust" by Ken Thompson (You can
skip to the "Stage II" part to get to the interesting stuff.)

(And when you're done, look for "XcodeGhost" to see a relevant example,
and don't miss the leaked document on the wikipedia page...)

------------------------------------------------------------------------

Here is yet another variation of our evaluator that is even closer to a
meta-circular evaluator. It uses Racket values directly to implement
values, so arithmetic operations become straightforward. Note especially
how the case for function application is similar to arithmetics: a FLANG
function application translates to a Racket function application. In
both cases (applications and arithmetics) we don't even check the
objects since they are simple Racket objects --- if our language happens
to have some meaning for arithmetics with functions, or for applying
numbers, then we will inherit the same semantics in our language. This
means that we now specify less behavior and fall back more often on what
Racket does.

We use Racket values with this type definition:

    (define-type VAL = (U Number (VAL -> VAL)))

And the evaluation function can now be:

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) n]                ;*** return the actual number
        [(Add l r) (+ (eval l env) (eval r env))]
        [(Sub l r) (- (eval l env) (eval r env))]
        [(Mul l r) (* (eval l env) (eval r env))]
        [(Div l r) (/ (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (lambda ([arg-val : VAL]) ;*** return the racket function
           ;; note that this requires input type specifications since
           ;; typed racket can't guess the right one
           (eval bound-body (Extend bound-id arg-val env)))]
        [(Call fun-expr arg-expr)
         ((eval fun-expr env)      ;*** trivial like the arithmetics!
          (eval arg-expr env))]))

Note how the arithmetics implementation is simple --- it's a direct
translation of the FLANG syntax to Racket operations, and since we don't
check the inputs to the Racket operations, we let Racket throw type
errors for us. Note also how function application is just like the
arithmetic operations: a FLANG application is directly translated to a
Racket application.

However, this does not work quite as simply in Typed Racket. The whole
point of typechecking is that we never run into type errors --- so we
cannot throw back on Racket errors since code that might produce them is
forbidden! A way around this is to perform explicit checks that
guarantee that Racket cannot run into type errors. We do this with the
following two helpers that are defined inside `eval`:

      (: evalN : FLANG -> Number)
      (define (evalN e)
        (let ([n (eval e env)])
          (if (number? n)
            n
            (error 'eval "got a non-number: ~s" n))))
      (: evalF : FLANG -> (VAL -> VAL))
      (define (evalF e)
        (let ([f (eval e env)])
          (if (function? f)
            f
            (error 'eval "got a non-function: ~s" f))))

Note that Typed Racket is "smart enough" to figure out that in `evalF`
the result of the recursive evaluation has to be either `Number` or
`(VAL -> VAL)`; and since the `if` throws out on numbers, we're left
with `(VAL -> VAL)` functions, not just any function.

    #lang pl

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    ;; Values are plain Racket values, no new VAL wrapper;
    ;; (but note that this is a recursive definition)
    (define-type VAL = (U Number (VAL -> VAL)))

    ;; Define a type for functional environments
    (define-type ENV = (Symbol -> VAL))

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

    (: Extend : Symbol VAL ENV -> ENV)
    ;; extend a given environment cache with a new binding
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name))))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (env name))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (: evalN : FLANG -> Number)
      (define (evalN e)
        (let ([n (eval e env)])
          (if (number? n)
            n
            (error 'eval "got a non-number: ~s" n))))
      (: evalF : FLANG -> (VAL -> VAL))
      (define (evalF e)
        (let ([f (eval e env)])
          (if (function? f)
            f
            (error 'eval "got a non-function: ~s" f))))
      (cases expr
        [(Num n) n]
        [(Add l r) (+ (evalN l) (evalN r))]
        [(Sub l r) (- (evalN l) (evalN r))]
        [(Mul l r) (* (evalN l) (evalN r))]
        [(Div l r) (/ (evalN l) (evalN r))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (lambda ([arg-val : VAL])
           (eval bound-body (Extend bound-id arg-val env)))]
        [(Call fun-expr arg-expr)
         ((evalF fun-expr) (eval arg-expr env))]))

    (: run : String -> VAL) ; no need to convert VALs to numbers
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (eval (parse str) (EmptyEnv)))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Recursion, Recursion, Recursion [Tuesday, February 11th]

> [PLAI §9]

There is one major feature that is still missing from our language: we
have no way to perform recursion (therefore no kind of loops). So far,
we could only use recursion when we had *names*. In FLANG, the only way
we can have names is through `with` which not good enough for recursion.

To discuss the issue of recursion, we switch to a "broken" version of
(untyped) Racket --- one where a `define` has a different scoping rules:
the scope of the defined name does *not* cover the defined expression.
Specifically, in this language, this doesn't work:

    #lang pl broken
    (define (fact n)
      (if (zero? n) 1 (* n (fact (- n 1)))))
    (fact 5)

In our language, this translation would also not work (assuming we have
`if` etc):

    {with {fact {fun {n}
                  {if {= n 0} 1 {* n {call fact {- n 1}}}}}}
      {call fact 5}}

And similarly, in plain Racket this won't work if `let` is the only tool
you use to create bindings:

    (let ([fact (lambda (n)
                  (if (zero? n) 1 (* n (fact (- n 1)))))])
      (fact 5))

In the broken-scope language, the `define` form is more similar to a
mathematical definition. For example, when we write:

    (define (F x) x)
    (define (G y) (F y))
    (G F)

it is actually shorthand for

    (define F (lambda (x) x))
    (define G (lambda (y) (F y)))
    (G F)

we can then replace defined names with their definitions:

    (define F (lambda (x) x))
    (define G (lambda (y) (F y)))
    ((lambda (y) (F y)) (lambda (x) x))

and this can go on, until we get to the actual code that we wrote:

    ((lambda (y) ((lambda (x) x) y)) (lambda (x) x))

This means that the above `fact` definition is similar to writing:

    fact := (lambda (n)
              (if (zero? n) 1 (* n (fact (- n 1)))))
    (fact 5)

which is not a well-formed definition --- it is *meaningless* (this is a
formal use of the word "meaningless"). What we'd really want, is to take
the *equation* (using `=` instead of `:=`)

    fact = (lambda (n)
             (if (zero? n) 1 (* n (fact (- n 1)))))

and find a solution which will be a value for `fact` that makes this
true.

If you look at the Racket evaluation rules handout on the web page, you
will see that this problem is related to the way that we introduced the
Racket `define`: there is a hand-wavy explanation that talks about
*knowing* things.

The big question is: can we define recursive functions without Racket's
magical `define` form?

> Note: This question is a little different than the question of
> implementing recursion in our language --- in the Racket case we have
> no control over the implementation of the language. As it will
> eventually turn out, implementing recursion in our own language will
> be quite easy when we use mutation in a specific way. So the question
> that we're now facing can be phrased as either "can we get recursion
> in Racket without Racket's magical definition forms?" or "can we get
> recursion in our interpreter without mutation?".


------------------------------------------------------------------------
# Recursion without the Magic [Tuesday, February 11th]

> [PLAI §22.4] (we go much deeper)

> Note: This explanation is similar to the one you can find in "The Why
> of Y", by Richard Gabriel. (It is an "operational semantics"
> explanation, focusing on *how* things should work.)

To implement recursion without the `define` magic, we first make an
observation: this problem does *not* come up in a dynamically-scoped
language. Consider the `let`-version of the problem:

    #lang pl dynamic
    (let ([fact (lambda (n)
                  (if (zero? n) 1 (* n (fact (- n 1)))))])
      (fact 5))

This works fine --- because by the time we get to evaluate the body of
the function, `fact` is already bound to itself in the current dynamic
scope. (This is another reason why dynamic scope is perceived as a
convenient approach in new languages.)

Regardless, the problem that we have with lexical scope is still there,
but the way things work in a dynamic scope suggest a solution that we
can use now. Just like in the dynamic scope case, when `fact` is called,
it does have a value --- the only problem is that this value is
inaccessible in the lexical scope of its body.

Instead of trying to get the value in via lexical scope, we can imitate
what happens in the dynamically scoped language by passing the `fact`
value to itself so it can call itself (going back to the original code
in the broken-scope language):

    (define (fact self n)  ;***
      (if (zero? n) 1 (* n (self (- n 1)))))
    (fact fact 5)          ;***

except that now the recursive call should still send itself along:

    (define (fact self n)
      (if (zero? n) 1 (* n (self self (- n 1))))) ;***
    (fact fact 5)

The problem is that this required rewriting calls to `fact` --- both
outside and recursive calls inside. To make this an acceptable solution,
calls from both places should not change. Eventually, we should be able
to get a working `fact` definition that uses just

    (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))

The first step in resolving this problem is to curry the `fact`
definition.

    (define (fact self)                  ;***
      (lambda (n)                        ;***
        (if (zero? n)
          1
          (* n ((self self) (- n 1)))))) ;***
    ((fact fact) 5)                      ;***

Now `fact` is no longer our factorial function --- it's a function that
constructs it. So call it `make-fact`, and bind `fact` to the actual
factorial function.

    (define (make-fact self)             ;***
      (lambda (n)
        (if (zero? n) 1 (* n ((self self) (- n 1))))))
    (define fact (make-fact make-fact))  ;***
    (fact 5)                             ;***

We can try to do the same thing in the body of the factorial function:
instead of calling `(self self)`, just bind `fact` to it:

    (define (make-fact self)
      (lambda (n)
        (let ([fact (self self)])       ;***
          (if (zero? n)
            1
            (* n (fact (- n 1)))))))    ;***
    (define fact (make-fact make-fact))
    (fact 5)

This works fine, but if we consider our original goal, we need to get
that local `fact` binding outside of the `(lambda (n) ...)` --- so we're
left with a definition that uses the factorial expression as is. So,
swap the two lines:

    (define (make-fact self)
      (let ([fact (self self)])  ;***
        (lambda (n)              ;***
          (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact (make-fact make-fact))
    (fact 5)

But the problem is that this gets us into an infinite loop because we're
trying to evaluate `(self self)` too ea(ge)rly. In fact, if we ignore
the body of the `let` and other details, we basically do this:

    (define (make-fact self) (self self)) (make-fact make-fact)
    --reduce-sugar-->
    (define make-fact (lambda (self) (self self))) (make-fact make-fact)
    --replace-definition-->
    ((lambda (self) (self self)) (lambda (self) (self self)))
    --rename-identifiers-->
    ((lambda (x) (x x)) (lambda (x) (x x)))

And this expression has an interesting property: it reduces to itself,
so evaluating it gets stuck in an infinite loop.

So how do we solve this? Well, we know that `(self self)` *should* be
the same value that is the factorial function itself --- so it must be a
one-argument function. If it's such a function, we can use a value that
is equivalent, except that it will not get evaluated until it is needed,
when the function is called. The trick here is the observation that
`(lambda (n) (add1 n))` is really the same as `add1` (provided that
`add1` is a one-argument function), except that the `add1` part doesn't
get evaluated until the function is called. Applying this trick to our
code produces a version that does not get stuck in the same infinite
loop:

    (define (make-fact self)
      (let ([fact (lambda (n) ((self self) n))]) ;***
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact (make-fact make-fact))
    (fact 5)

Continuing from here --- we know that

    (let ([x v]) e)  is the same as  ((lambda (x) e) v)

(remember how we derived `fun` from a `with`), so we can turn that `let`
into the equivalent function application form:

    (define (make-fact self)
      ((lambda (fact)                    ;***
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))
       (lambda (n) ((self self) n))))    ;***
    (define fact (make-fact make-fact))
    (fact 5)

And note now that the (lambda (fact) ...) expression is everything that
we need for a recursive definition of `fact` --- it has the proper
factorial body with a plain recursive call. It's almost like the usual
value that we'd want to define `fact` as, except that we still have to
abstract on the recursive value itself. So lets move this code into a
separate definition for `fact-step`:

    (define fact-step                   ;***
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-fact self)
      (fact-step                        ;***
       (lambda (n) ((self self) n))))
    (define fact (make-fact make-fact))
    (fact 5)

We can now proceed by moving the `(make-fact make-fact)` self
application into its own function which is what creates the real
factorial:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-fact self)
      (fact-step
       (lambda (n) ((self self) n))))
    (define (make-real-fact) (make-fact make-fact)) ;***
    (define fact (make-real-fact))                  ;***
    (fact 5)

Rewrite the `make-fact` definition using an explicit `lambda`:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define make-fact                     ;***
      (lambda (self)                      ;***
        (fact-step
         (lambda (n) ((self self) n)))))
    (define (make-real-fact) (make-fact make-fact))
    (define fact (make-real-fact))
    (fact 5)

and fold the functionality of `make-fact` and `make-real-fact` into a
single `make-fact` function by just using the value of `make-fact`
explicitly instead of through a definition:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-real-fact)
      (let ([make (lambda (self)                      ;***
                    (fact-step                        ;***
                     (lambda (n) ((self self) n))))]) ;***
        (make make)))
    (define fact (make-real-fact))
    (fact 5)

We can now observe that `make-real-fact` has nothing that is specific to
factorial --- we can make it take a "core function" as an argument:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-real-fact core)                     ;***
      (let ([make (lambda (self)
                    (core                             ;***
                     (lambda (n) ((self self) n))))])
        (make make)))
    (define fact (make-real-fact fact-step))          ;***
    (fact 5)

and call it `make-recursive`:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-recursive core)             ;***
      (let ([make (lambda (self)
                    (core
                     (lambda (n) ((self self) n))))])
        (make make)))
    (define fact (make-recursive fact-step))  ;***
    (fact 5)

We're almost done now --- there's no real need for a separate
`fact-step` definition, just use the value for the definition of `fact`:

    (define (make-recursive core)
      (let ([make (lambda (self)
                    (core
                     (lambda (n) ((self self) n))))])
        (make make)))
    (define fact
      (make-recursive
       (lambda (fact)                                          ;***
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))) ;***
    (fact 5)

turn the `let` into a function form:

    (define (make-recursive core)
      ((lambda (make) (make make))              ;***
       (lambda (self)                           ;***
         (core (lambda (n) ((self self) n)))))) ;***
    (define fact
      (make-recursive
       (lambda (fact)
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))
    (fact 5)

do some renamings to make things simpler --- `make` and `self` turn to
`x`, and `core` to `f`:

    (define (make-recursive f)                   ;***
      ((lambda (x) (x x))                        ;***
       (lambda (x) (f (lambda (n) ((x x) n)))))) ;***
    (define fact
      (make-recursive
       (lambda (fact)
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))
    (fact 5)

or we can manually expand that first (lambda (x) (x x)) application to
make the symmetry more obvious (not really surprising because it started
with a `let` whose purpose was to do a self-application):

    (define (make-recursive f)
      ((lambda (x) (f (lambda (n) ((x x) n))))   ;***
       (lambda (x) (f (lambda (n) ((x x) n)))))) ;***
    (define fact
      (make-recursive
       (lambda (fact)
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))
    (fact 5)

And we finally got what we were looking for: a general way to define
*any* recursive function without any magical `define` tricks. This also
work for other recursive functions:

    #lang pl broken
    (define (make-recursive f)
      ((lambda (x) (f (lambda (n) ((x x) n))))
       (lambda (x) (f (lambda (n) ((x x) n))))))
    (define fact
      (make-recursive
       (lambda (fact)
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))
    (fact 5)
    (define fib
      (make-recursive
       (lambda (fib)
         (lambda (n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2))))))))
    (fib 8)
    (define length
      (make-recursive
       (lambda (length)
         (lambda (l) (if (null? l) 0 (+ (length (rest l)) 1))))))
    (length '(x y z))

A convenient tool that people often use on paper is to perform a kind of
a syntactic abstraction: "assume that whenever I write (twice foo) I
really meant to write (foo foo)". This can often be done as plain
abstractions (that is, using functions), but in some cases --- for
example, if we want to abstract over definitions --- we just want such a
rewrite rule. (More on this towards the end of the course.) The
broken-scope language does provide such a tool --- `rewrite` extends the
language with a rewrite rule. Using this, and our `make-recursive`, we
can make up a recursive definition form:

    (rewrite (define/rec (f x) E)
          => (define f (make-recursive (lambda (f) (lambda (x) E)))))

In other words, we've created our own "magical definition" form. The
above code can now be written in almost the same way it is written in
plain Racket:

    #lang pl broken
    (define (make-recursive f)
      ((lambda (x) (f (lambda (n) ((x x) n))))
       (lambda (x) (f (lambda (n) ((x x) n))))))
    (rewrite (define/rec (f x) E)
          => (define f (make-recursive (lambda (f) (lambda (x) E)))))
    ;; examples
    (define/rec (fact n) (if (zero? n) 1 (* n (fact (- n 1)))))
    (fact 5)
    (define/rec (fib n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))
    (fib 8)
    (define/rec (length l) (if (null? l) 0 (+ (length (rest l)) 1)))
    (length '(x y z))

Finally, note that make-recursive is limited to 1-argument functions
only because of the protection from eager evaluation. In any case, it
can be used in any way you want, for example,

    (make-recursive (lambda (f) (lambda (x) f)))

is a function that *returns itself* rather than calling itself. Using
the rewrite rule, this would be:

    (define/rec (f x) f)

which is the same as:

    (define (f x) f)

in plain Racket.


------------------------------------------------------------------------
# The Core of `make-recursive` [Tuesday, February 11th]

As in Racket, being able to express recursive functions is a fundamental
property of the language. It means that we can have loops in our
language, and that's the essence of making a language powerful enough to
be TM-equivalent --- able to express undecidable problems, where we
don't know whether there is an answer or not.

The core of what makes this possible is the expression that we have seen
in our derivation:

    ((lambda (x) (x x)) (lambda (x) (x x)))

which reduces to itself, and therefore has no value: trying to evaluate
it gets stuck in an infinite loop. (This expression is often called
"Omega".)

This is the key for creating a loop --- we use it to make recursion
possible. Looking at our final `make-recursive` definition and ignoring
for a moment the "protection" that we need against being stuck
prematurely in an infinite loop:

    (define (make-recursive f)
      ((lambda (x) (x x))
       (lambda (x) (f (x x)))))

we can see that this is almost the same as the Omega expression --- the
only difference is that application of `f`. Indeed, this expression (the
result of (make-recursive F) for some `F`) reduces in a similar way to
Omega:

    ((lambda (x) (x x)) (lambda (x) (F (x x))))
    ((lambda (x) (F (x x))) (lambda (x) (F (x x))))
    (F ((lambda (x) (F (x x))) (lambda (x) (F (x x)))))
    (F (F ((lambda (x) (F (x x))) (lambda (x) (F (x x))))))
    (F (F (F ((lambda (x) (F (x x))) (lambda (x) (F (x x)))))))
    ...

which means that the actual value of this expression is:

    (F (F (F ...forever...)))

This definition would be sufficient if we had a lazy language, but to
get things working in a strict one we need to bring back the protection.
This makes things a little different --- if we use `(protect f)` to be a
shorthand for the protection trick,

    (rewrite (protect f) => (lambda (x) (f x)))

then we have:

    (define (make-recursive f)
      ((lambda (x) (x x)) (lambda (x) (f (protect (x x))))))

which makes the (make-recursive F) evaluation reduce to

    (F (protect (F (protect (F (protect (...forever...)))))))

and this is still the same result (as long as `F` is a single-argument
function).

(Note that `protect` cannot be implemented as a plain function!)


------------------------------------------------------------------------
# Denotational Explanation of Recursion [Tuesday, February 11th]

> Note: This explanation is similar to the one you can find in "The
> Little Schemer" called "(Y Y) Works!", by Dan Friedman and Matthias
> Felleisen. (It focuses on *what* it should be.)

The explanation that we have now for how to derive the `make-recursive`
definition is fine --- after all, we did manage to get it working. But
this explanation was done from a kind of an operational point of view:
we knew a certain trick that can make things work and we pushed things
around until we got it working like we wanted. Instead of doing this, we
can re-approach the problem from a more declarative point of view.

So, start again from the same broken code that we had (using the
broken-scope language):

    (define fact
      (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))

This is as broken as it was when we started: the occurrence of `fact` in
the body of the function is free, which means that this code is
meaningless. To avoid the compilation error that we get when we run this
code, we can substitute *anything* for that `fact` --- it's even better
to use a replacement that will lead to a runtime error:

    (define fact
      (lambda (n) (if (zero? n) 1 (* n (777 (- n 1)))))) ;***

This function will not work in a similar way to the original one --- but
there is one case where it *does* work: when the input value is `0`
(since then we do not reach the bogus application). We note this by
calling this function `fact0`:

    (define fact0                ;***
      (lambda (n) (if (zero? n) 1 (* n (777 (- n 1))))))

Now that we have this function defined, we can use it to write `fact1`
which is the factorial function for arguments of `0` or `1`:

    (define fact0
      (lambda (n) (if (zero? n) 1 (* n (777 (- n 1))))))
    (define fact1
      (lambda (n) (if (zero? n) 1 (* n (fact0 (- n 1))))))

And remember that this is actually just shorthand for:

    (define fact1
      (lambda (n)
        (if (zero? n)
          1
          (* n ((lambda (n)
                  (if (zero? n)
                    1
                    (* n (777 (- n 1)))))
                (- n 1))))))

We can continue in this way and write `fact2` that will work for n<=2:

    (define fact2
      (lambda (n) (if (zero? n) 1 (* n (fact1 (- n 1))))))

or, in full form:

    (define fact2
      (lambda (n)
        (if (zero? n)
          1
          (* n ((lambda (n)
                  (if (zero? n)
                    1
                    (* n ((lambda (n)
                            (if (zero? n)
                              1
                              (* n (777 (- n 1)))))
                          (- n 1)))))
                (- n 1))))))

If we continue this way, we *will* get the true factorial function, but
the problem is that to handle *any* possible integer argument, it will
have to be an infinite definition! Here is what it is supposed to look
like:

    (define fact0 (lambda (n) (if (zero? n) 1 (* n (777 (- n 1))))))
    (define fact1 (lambda (n) (if (zero? n) 1 (* n (fact0 (- n 1))))))
    (define fact2 (lambda (n) (if (zero? n) 1 (* n (fact1 (- n 1))))))
    (define fact3 (lambda (n) (if (zero? n) 1 (* n (fact2 (- n 1))))))
    ...

The true factorial function is `fact-infinity`, with an infinite size.
So, we're back at the original problem...

To help make things more concise, we can observe the repeated pattern in
the above, and extract a function that abstracts this pattern. This
function is the same as the `fact-step` that we have seen previously:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact0 (fact-step 777))
    (define fact1 (fact-step fact0))
    (define fact2 (fact-step fact1))
    (define fact3 (fact-step fact2))
    ...

which is actually:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact0 (fact-step 777))
    (define fact1 (fact-step (fact-step 777)))
    (define fact2 (fact-step (fact-step (fact-step 777))))
    ...
    (define fact
      (fact-step (fact-step (fact-step (... (fact-step 777) ...)))))

Do this a little differently --- rewrite `fact0` as:

    (define fact0
      ((lambda (mk) (mk 777))
       fact-step))

Similarly, `fact1` is written as:

    (define fact1
      ((lambda (mk) (mk (mk 777)))
       fact-step))

and so on, until the real factorial, which is still infinite at this
stage:

    (define fact
      ((lambda (mk) (mk (mk (... (mk 777) ...))))
       fact-step))

Now, look at that `(lambda (mk) ...)` --- it is an infinite expression,
but for every actual application of the resulting factorial function we
only need a finite number of `mk` applications. We can guess how many,
and as soon as we hit an application of `777` we know that our guess is
too small. So instead of `777`, we can try to use the maker function to
create and use the next.

To make things more explicit, here is the expression that is our
`fact0`, without the definition form:

    ((lambda (mk) (mk 777))
     fact-step)

This function has a very low guess --- it works for 0, but with 1 it
will run into the `777` application. At this point, we want to somehow
invoke `mk` again to get the next level --- and since `777` *does* get
applied, we can just replace it with `mk`:

    ((lambda (mk) (mk mk))
     fact-step)

The resulting function works just the same for an input of `0` because
it does not attempt a recursive call --- but if we give it `1`, then
instead of running into the error of applying `777`:

    (* n (777 (- n 1)))

we get to apply `fact-step` there:

    (* n (fact-step (- n 1)))

and this is still wrong, because `fact-step` expects a function as an
input. To see what happens more clearly, write `fact-step` explicitly:

    ((lambda (mk) (mk mk))
     (lambda (fact)
       (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))

The problem is in what we're going to pass into `fact-step` --- its
`fact` argument will not be the factorial function, but the `mk`
function constructor. Renaming the `fact` argument as `mk` will make
this more obvious (but not change the meaning):

    ((lambda (mk) (mk mk))
     (lambda (mk)
       (lambda (n) (if (zero? n) 1 (* n (mk (- n 1)))))))

It should now be obvious that this application of `mk` will not work,
instead, we need to apply it on some function and *then* apply the
result on `(- n 1)`. To get what we had before, we can use `777` as a
bogus function:

    ((lambda (mk) (mk mk))
     (lambda (mk)
       (lambda (n) (if (zero? n) 1 (* n ((mk 777) (- n 1)))))))

This will allow one recursive call --- so the definition works for both
inputs of `0` and `1` --- but not more. But that `777` is used as a
maker function now, so instead, we can just use `mk` itself again:

    ((lambda (mk) (mk mk))
     (lambda (mk)
       (lambda (n) (if (zero? n) 1 (* n ((mk mk) (- n 1)))))))

And this is a *working* version of the real factorial function, so make
it into a (non-magical) definition:

    (define fact
      ((lambda (mk) (mk mk))
       (lambda (mk)
         (lambda (n) (if (zero? n) 1 (* n ((mk mk) (- n 1))))))))

But we're not done --- we "broke" into the factorial code to insert that
`(mk mk)` application --- that's why we dragged in the actual value of
`fact-step`. We now need to fix this. The expression on that last line

    (lambda (n) (if (zero? n) 1 (* n ((mk mk) (- n 1)))))

is close enough --- it is `(fact-step (mk mk))`. So we can now try to
rewrite our `fact` as:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact
      ((lambda (mk) (mk mk))
       (lambda (mk) (fact-step (mk mk)))))

... and would fail in a familiar way! If it's not familiar enough, just
rename all those `mk`s as `x`s:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact
      ((lambda (x) (x x))
       (lambda (x) (fact-step (x x)))))

We've run into the eagerness of our language again, as we did before.
The solution is the same --- the `(x x)` is the factorial function, so
protect it as we did before, and we have a working version:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define fact
      ((lambda (x) (x x))
       (lambda (x) (fact-step (lambda (n) ((x x) n))))))

The rest should not be surprising now... Abstract the recursive making
bit in a new `make-recursive` function:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))
    (define (make-recursive f)
      ((lambda (x) (x x))
       (lambda (x) (f (lambda (n) ((x x) n))))))
    (define fact (make-recursive fact-step))

and now we can do the first reduction inside `make-recursive` and write
the `fact-step` expression explicitly:

    #lang pl broken
    (define (make-recursive f)
      ((lambda (x) (f (lambda (n) ((x x) n))))
       (lambda (x) (f (lambda (n) ((x x) n))))))
    (define fact
      (make-recursive
       (lambda (fact)
         (lambda (n) (if (zero? n) 1 (* n (fact (- n 1))))))))

and this is the same code we had before.


------------------------------------------------------------------------
# The Y Combinator [Tuesday, February 11th]

Our `make-recursive` function is usually called the *fixpoint operator*
or the *Y combinator*.

It looks really simple when using the lazy version (remember: our
version is the eager one):

    (define Y
      (lambda (f)
        ((lambda (x) (f (x x)))
         (lambda (x) (f (x x))))))

> Note that if we *do* allow a recursive definition for Y itself, then
> the definition can follow the definition that we've seen:
>
>     (define (Y f) (f (Y f)))

And this all comes from the loop generated by:

    ((lambda (x) (x x)) (lambda (x) (x x)))

This expression, which is also called *Omega* (the `(lambda (x) (x x))`
part by itself is usually called *omega* and then `(omega omega)` is
*Omega*), is also the idea behind many deep mathematical facts. As an
example for what it does, follow the next rule:

    I will say the next sentence twice:
      "I will say the next sentence twice".

(Note the usage of colon for the first and quotes for the second ---
what is the equivalent of that in the lambda expression?)

By itself, this just gets you stuck in an infinite loop, as Omega does,
and the Y combinator adds `F` to that to get an infinite chain of
applications --- which is similar to:

    I will say the next sentence twice:
      "I will hop on one foot and then say the next sentence twice".

Sidenote:
  [see this SO question](https://stackoverflow.com/q/25228394/128595)
  and my answer, which came from the PLQ implementation.


------------------------------------------------------------------------
## The main property of Y [Tuesday, February 11th]

`fact-step` is a function that given any limited factorial, will
generate a factorial that is good for one more integer input. Start with
`777`, which is a factorial that is good for nothing (because it's not a
function), and you can get `fact0` as

    fact0 == (fact-step 777)

and that's a good factorial function only for an input of `0`. Use that
with `fact-step` again, and you get

    fact1 == (fact-step fact0) == (fact-step (fact-step 777))

which is the factorial function when you only look at input values of
`0` or `1`. In a similar way

    fact2 == (fact-step fact1)

is good for `0`...`2` --- and we can continue as much as we want, except
that we need to have an infinite number of applications --- in the
general case, we have:

    fact-n == (fact-step (fact-step (fact-step ... 777)))

which is good for `0`...`n`. The *real* factorial would be the result of
running `fact-step` on itself infinitely, it *is* `fact-infinity`. In
other words (here `fact` is the *real* factorial):

    fact = fact-infinity == (fact-step (fact-step ...infinitely...))

but note that since this is really infinity, then

    fact = (fact-step (fact-step ...infinitely...))
         = (fact-step fact)

so we get an equation:

    fact = (fact-step fact)

and a solution for this is going to be the real factorial. The solution
is the *fixed-point* of the `fact-step` function, in the same sense that
`0` is the fixed point of the `sin` function because

    0 = (sin 0)

And the Y combinator does just that --- it has this property:

    (make-recursive f) = (f (make-recursive f))

or, using the more common name:

    (Y f) = (f (Y f))

This property encapsulates the real magical power of Y. You can see how
it works: since `(Y f) = (f (Y f))`, we can add an `f` application to
both sides, giving us `(f (Y f)) = (f (f (Y f)))`, so we get:

    (Y f) = (f (Y f)) = (f (f (Y f))) = (f (f (f (Y f)))) = ...
          = (f (f (f ...)))

and we can conclude that

    (Y fact-step) = (fact-step (fact-step ...infinitely...))
                  = fact


------------------------------------------------------------------------
# Yet another explanation for Y [Tuesday, February 11th]

Here's another explanation of how the Y combinator works. Remember that
our `fact-step` function was actually a function that generates a
factorial function based on some input, which is supposed to be the
factorial function:

    (define fact-step
      (lambda (fact)
        (lambda (n) (if (zero? n) 1 (* n (fact (- n 1)))))))

As we've seen, you can apply this function on a version of factorial
that is good for inputs up to some n, and the result will be a factorial
that is good for those values up to n+1. The question is *what is the
fixpoint of `fact-step`*? And the answer is that if it maps factₙ
factorial to factₙ₊₁, then the input will be equal to the output on the
*infinitieth* `fact`, which is the *actual* factorial. Since Y is a
fixpoint combinator, it gives us exactly that answer:

    (define the-real-factorial (Y fact-step))


------------------------------------------------------------------------
# Typing the Y Combinator [extra] [Tuesday, February 11th]

Typing the Y combinator is a tricky issue. For example, in standard ML
you must write a new type definition to do this:

    datatype 'a t = T of 'a t -> 'a
    val y = fn f => (fn (T x) => (f (fn a => x (T x) a)))
                      (T (fn (T x) => (f (fn a => x (T x) a))))

> Can you find a pattern in the places where `T` is used?
> --- Roughly speaking, that type definition is
>
>     ;; `t' is the type name, `T' is the constructor (aka the variant)
>     (define-type (RecTypeOf t)
>       [T ((RecTypeOf t) -> t)])
>
> First note that the two `fn a => ...` parts are the same as our
> protection, so ignoring that we get:
>
>     val y = fn f => (fn (T x) => (f (x (T x))))
>                       (T (fn (T x) => (f (x (T x)))))
>
> if you now replace `T` with `Quote`, things make more sense:
>
>     val y = fn f => (fn (Quote x) => (f (x (Quote x))))
>                       (Quote (fn (Quote x) => (f (x (Quote x)))))
>
> and with our syntax, this would be:
>
>     (define (Y f)
>       ((lambda (qx)
>          (cases qx
>            [(Quote x) (f (x qx))]))
>        (Quote
>         (lambda (qx)
>           (cases qx
>             [(Quote x) (f (x qx))])))))
>
> it's not really quotation --- but the analogy should help: it uses
> `Quote` to distinguish functions as values that are applied (the `x`s)
> from functions that are passed as arguments.

In OCaml, this looks a little different:

    # type 'a t = T of ('a t -> 'a) ;;
    type 'a t = T of ('a t -> 'a)
    # let y f = (fun (T x) -> x (T x))
                (T (fun (T x) -> fun z -> f (x (T x)) z)) ;;
    val y : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
    # let fact = y (fun fact n -> if n < 1 then 1 else n * fact(n-1)) ;;
    val fact : int -> int = <fun>
    # fact 5 ;;
    - : int = 120

but OCaml has also a `-rectypes` command line argument, which will make
it infer the type by itself:

    # let y f = (fun x -> x x) (fun x -> fun z -> f (x x) z) ;;
    val y : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>
    # let fact = y (fun fact n -> if n < 1 then 1 else n * fact(n-1)) ;;
    val fact : int -> int = <fun>
    # fact 5 ;;
    - : int = 120

The translation of this to `#lang pl` is a little verbose because we
don't have auto-currying, and because we need to declare input types to
functions, but it's essentially a direct translation of the above:

    (define-type (RecTypeOf t)
      [T ((RecTypeOf t) -> t)])
    (: Y : (All (A B) ((A -> B) -> (A -> B)) -> (A -> B)))
    (define (Y f)
      ((lambda ([x : (RecTypeOf (A -> B))])
         (cases x
           [(T x) (x (T x))]))
       (T (lambda ([x : (RecTypeOf (A -> B))])
            (cases x
              [(T x) (lambda ([z : A])
                       ((f (x (T x))) z))])))))
    (define fact
      (Y (lambda ([fact : (Integer -> Integer)])
           (lambda ([n : Integer])
             (if (< n 1) 1 (* n (fact (sub1 n))))))))
    (fact 5)

It is also possible to write this expression in "plain" Typed Racket,
without a user-defined type --- and we need to start with a proper type
definition. First of all, the type of Y should be straightforward: it is
a fixpoint operation, so it takes a `T -> T` function and produces its
fixpoint. The fixpoint itself is some `T` (such that applying the
function on it results in itself). So this gives us:

    (: make-recursive : (T -> T) -> T)

However, in our case `make-recursive` computes a *functional* fixpoint,
for unary `S -> T` functions, so we should narrow down the type

    (: make-recursive : ((S -> T) -> (S -> T)) -> (S -> T))

Now, in the body of `make-recursive` we need to add a type for the `x`
argument which is behaving in a weird way: it is used both as a function
and as its own argument. (Remember --- I will say the next sentence
twice: "I will say the next sentence twice".) We need a recursive type
definition helper (not a new type) for that:

    (define-type (Tau S T) = (Rec this (this -> (S -> T))))

This type is tailored for our use of `x`: it is a type for a function
that will *consume itself* (hence the `Rec`) and spit out the value that
the `f` argument consumes --- an `S -> T` function.

The resulting full version of the code:

    (: make-recursive : (All (S T) ((S -> T) -> (S -> T)) -> (S -> T)))
    (define-type (Tau S T) = (Rec this (this -> (S -> T))))
    (define (make-recursive f)
      ((lambda ([x : (Tau S T)]) (f (lambda (z) ((x x) z))))
       (lambda ([x : (Tau S T)]) (f (lambda (z) ((x x) z))))))

    (: fact : Number -> Number)
    (define fact (make-recursive
                  (lambda ([fact : (Number -> Number)])
                    (lambda ([n : Number])
                      (if (zero? n)
                        1
                        (* n (fact (- n 1))))))))

    (fact 5)


------------------------------------------------------------------------
# Lambda Calculus --- Schlac [Tuesday, February 18th]

> [PLAI §22] (we do much more)

We know that many constructs that are usually thought of as primitives
are not really needed --- we can implement them ourselves given enough
tools. The question is how far can we go?

The answer: pretty far... For example:

    (define foo((lambda(f)((lambda(x)(x x))(lambda(x)(f(x x)))))(lambda(
    f)(lambda(x)(((x(lambda(x)(lambda(x y)y))(lambda(x y)x))(x(lambda(x)
    (lambda(x y)y))(lambda(x y)x))(((x(lambda (p)(lambda(s)(s(p(lambda(x
    y)y))(lambda(f x)(f((p(lambda(x y)y))f x))))))(lambda(s) (s(lambda(f
    x)x)(lambda(f x)x))))(lambda(x y)x))(lambda(x)(lambda(x y)y))(lambda
    (x y)x)))(lambda(f x)(f x))((f((x(lambda(p)(lambda(s)(s(p(lambda(x y
    )y))(lambda(f x)(f((p(lambda(x y)y))f x))))))(lambda(y s)(s(lambda(f
    x)x)(lambda(f x)x))))(lambda(x y)x)))(lambda(n)(lambda(f x)(f(n f x)
    )))(f((((x(lambda(p)(lambda(s)(s(p (lambda(x y)y))(lambda(f x)(f((p(
    lambda(x y)y))f x))))))(lambda(s)(s(lambda(f x) x)(lambda(f x)x))))(
    lambda(x y)x))(lambda(p)(lambda(s)(s(p(lambda(x y)y))(lambda(f x)(f(
    (p(lambda(x y)y))f x))))))(lambda(s)(s(lambda(f x)x)(lambda(f x)x)))
    )(lambda(x y)x)))))))))

We begin with a very minimal language, which is based on the Lambda
Calculus. In this language we get a very minimal set of constructs and
values.

In DrRacket, this we will use the Schlac language level (stands for
"<del>Scheme</del>Racket as Lambda Calculus"). This language has a
Racket-like syntax, but don't be confused --- it is *very* different
from Racket. The only constructs that are available in this language
are: lambda expressions of at least one argument, function application
(again, at least one argument), and simple definition forms which are
similar to the ones in the "Broken define" language --- definitions are
used as shorthand, and cannot be used for recursive function definition.
They're also only allowed at the toplevel --- no local helpers, and a
definition is not an expression that can appear anywhere. The BNF is
therefore:

    <SCHLAC>      ::= <SCHLAC-TOP> ...

    <SCHLAC-TOP>  ::= <SCHLAC-EXPR>
                    | (define <id> <SCHLAC-EXPR>)

    <SCHLAC-EXPR> ::= <id>
                    | (lambda (<id> <id> ...) <SCHLAC-EXPR>)
                    | (<SCHLAC-EXPR> <SCHLAC-EXPR> <SCHLAC-EXPR> ...)

Since this language has no primitive values (other than functions),
Racket numbers and booleans are also considered identifiers, and have no
built-in value that come with the language. In addition, all functions
and function calls are curried, so

    (lambda (x y z) (z y x))

is actually shorthand for

    (lambda (x) (lambda (y) (lambda (z) ((z y) x))))

The rules for evaluation are simple, there is one very important rule
for evaluation which is called "beta reduction":

    ((lambda (x) E1) E2) --> E1[E2/x]

where substitution in this context requires being careful so you won't
capture names. This requires you to be able to do another kind of
transformation which is called "alpha conversion", which basically says
that you can rename identifiers as long as you keep the same binding
structure (eg, a valid renaming does not change the de-Bruijn form of
the expression). There is one more rule that can be used, *eta
conversion* which says that `(lambda (x) (f x))` is the same as `f` (we
used this rule above when deriving the Y combinator).

One last difference between Schlac and Racket is that Schlac is a *lazy*
language. This will be important since we do not have any built-in
special forms like `if`.

Here is a Schlac definition for the identity function:

    (define identity (lambda (x) x))

and there is not much that we can do with this now:

    > identity
    #<procedure:identity>
    > (identity identity)
    #<procedure:identity>
    > (identity identity identity)
    #<procedure:identity>

(In the last expression, note that `(id id id)` is shorthand for `((id
id) id)`, and since `(id id)` is the identity, applying that on `id`
returns it again.)

> Something to think about: are we losing anything because we have no
> no-argument functions?


------------------------------------------------------------------------
# Church Numerals [Tuesday, February 18th]

So far, it seems like it is impossible to do anything useful in this
language, since all we have are functions and applications. We know how
to write the identity function, but what about other values? For
example, can you write code that evaluates to zero?

> What's zero? I only know how to write functions!
>
> (Turing Machine / Assembly programmer: "What's a function? --- I only
> know how to write 0s and 1s!")

The first thing we therefore need is to be able to *encode* numbers as
functions. For zero, we will use a function of two arguments that simply
returns its second value:

    (define 0 (lambda (f) (lambda (x) x)))

or, more concisely

    (define 0 (lambda (f x) x))

This is the first step in an encoding that is known as *Church
Numerals*: an encoding of natural numbers as functions. The number zero
is encoded as a function that takes in a function and a second value,
and applies the function zero times on the argument (which is really
what the above definition is doing). Following this view, the number one
is going to be a function of two arguments, that applies the first on
the second one time:

    (define 1 (lambda (f x) (f x)))

and note that `1` is just like the identity function (as long as you
give it a function as its first input, but this is always the case in
Schlac). The next number on the list is two --- which applies the first
argument on the second one twice:

    (define 2 (lambda (f x) (f (f x))))

We can go on doing this, but what we really want is a way to perform
arbitrary arithmetic. The first requirement for that is an `add1`
function that increments its input (an encoded natural number) by one.
To do this, we write a function that expects an encoded number:

    (define add1 (lambda (n) ...))

and this function is expected to return an encoded number, which is
always a function of `f` and `x`:

    (define add1 (lambda (n) (lambda (f x) ...)))

Now, in the body, we need to apply `f` on `x` n+1 times --- but remember
that `n` is a function that will do `n` applications of its first
argument on its second:

    (define add1 (lambda (n) (lambda (f x) ... (n f x) ...)))

and all we have left to do now is to apply `f` one more time, yielding
this definition for `add1`:

    (define add1 (lambda (n) (lambda (f x) (f (n f x)))))

Using this, we can define a few useful numbers:

    (define 1 (add1 0))
    (define 2 (add1 1))
    (define 3 (add1 2))
    (define 4 (add1 3))
    (define 5 (add1 4))

This is all nice theoretically, but how can we make sure that it is
correct? Well, Schlac has a few additional built-in functions that
translate Church numerals into Racket numbers. To try our definitions we
use the `->nat` (read: to natural number):

    (->nat 0)
    (->nat 5)
    (->nat (add1 (add1 5)))

You can now verify that the identity function is really the same as the
number 1:

    (->nat identity)

We can even write a test case, since Schlac contains the `test` special
form, but we have to be careful in that --- first of all, we cannot test
whether functions are equal (why?) so we must use `->nat`, but

    (test (->nat (add1 (add1 5))) => 7)

will not work since `7` is undefined. To overcome this, Schlac has a
`back-door` for primitive Racket values --- just use a quote:

    (test (->nat (add1 (add1 5))) => '7)

We can now define natural number addition --- one simple idea is to get
two encoded numbers `m` and `n`, then start with `x`, apply `f` on it
`n` times by using it as a function, then apply `f` `m` more times on
the result in the same way:

    (define + (lambda (m n) (lambda (f x) (m f (n f x)))))

or equivalently:

    (define + (lambda (m n f x) (m f (n f x))))

Another idea is to use `add1` and increment `n` by `m` using `add1`:

    (define + (lambda (m n) (m add1 n)))
    (->nat (+ 4 5))

We can also define multiplication of `m` and `n` quite easily --- begin
with addition --- `(lambda (x) (+ n x))` is a function that expects an
`x` and returns `(+ x n)` --- it's an increment-by-n function. But since
all functions and applications are curried, this is actually the same as
`(lambda (x) ((+ n) x))` which is the same as `(+ n)`. Now, what we want
to do is repeat this operation `m` times over zero, which will add `n`
to zero `m` times, resulting in `m` * `n`. The definition is therefore:

    (define * (lambda (m n) (m (+ n) 0)))
    (->nat (* 4 5))
    (->nat (+ 4 (* (+ 2 5) 5)))

An alternative approach is to consider

    (lambda (x) (n f x))

for some encoded number `n` and a function `f` --- this function is like
`f`^`n` (f composed n times with itself). But remember that this is
shorthand for

    (lambda (x) ((n f) x))

and we know that `(lambda (x) (foo x))` is just like `foo` (if it is a
function), so this is equivalent to just

    (n f)

So `(n f)` is `f`^`n`, and in the same way `(m g)` is `g`^`m` --- if we
use `(n f)` for `g`, we get `(m (n f))` which is n self-compositions of
`f`, self-composed m times. In other words, `(m (n f))` is a function
that is like `m`*`n` applications of `f`, so we can define
multiplication as:

    (define * (lambda (m n) (lambda (f) (m (n f)))))

which is the same as

    (define * (lambda (m n f) (m (n f))))

The same principle can be used to define exponentiation (but now we have
to be careful with the order since exponentiation is not commutative):

    (define ^ (lambda (m n) (n (* m) 1)))
    (->nat (^ 3 4))

And there is a similar alternative here too ---
* a Church numeral `m` is the m-self-composition function,
* and `(1 m)` is just like `m`^`1` which is the same as `m`
  (`1`=`identity`)
* and `(2 m)` is just like `m`^`2` --- it takes a function `f`, self
  composes it `m` times, and self composes the result `m` times --- for
  a total of `f`^`(m*m)`
* and `(3 m)` is similarly `f`^`(m*m*m)`
* so `(n m)` is `f`^`(m^n)` (note that the first `^` is
  self-compositions, and the second one is a mathematical exponent)
* so `(n m)` is a function that returns `m`^`n` self-compositions of an
  input function,
Which means that `(n m)` is the Church numeral for `m`^`n`, so we get:

    (define ^ (lambda (m n) (n m)))

which basically says that any number encoding `n` is also the `?`^`n`
operation.

All of this is was not too complicated --- but all so far all we did is
write functions that increment their inputs in various ways. What about
`sub1`? For that, we need to do some more work --- we will need to
encode booleans.


------------------------------------------------------------------------
# More Encodings [Tuesday, February 18th]

Our choice of encoding numbers makes sense --- the idea is that the main
feature of a natural number is repeating something a number of times.
For booleans, the main property we're looking for is choosing between
two values. So we can encode true and false by functions of two
arguments that return either the first or the second argument:

    (define #t (lambda (x y) x))
    (define #f (lambda (x y) y))

Note that this encoding of `#f` is really the same as the encoding of
`0`, so we have to know what type to expect an use the proper operations
(this is similar to C, where everything is just integers). Now that we
have these two, we can define `if`:

    (define if (lambda (c t e) (c t e)))

it expects a boolean which is a function of two arguments, and passes it
the two expressions. The `#t` boolean will simply return the first, and
the `#f` boolean will return the second. Strictly speaking, we don't
really need this definition, since instead of writing `(if c t e)`, we
can simply write `(c t e)`. In any case, we need the language to be lazy
for this to work. To demonstrate this, we'll intentionally use the quote
back-door to use a non-functional value, using this will normally result
in an error:

    (+ '1 '2)

But testing our `if` definition, things work just fine:

    (if #t (+ 4 5) (+ 1 2))

and we see that DrRacket leaves the second addition expression in red,
which indicates that it was not executed. We can also make sure that
even when it is defined as a function, it is still working fine because
the language is lazy:

    (if #f ((lambda (x) (x x)) (lambda (x) (x x))) 3)

What about `and` and `or`? Simple, `or` takes two arguments, and returns
either true or false if one of the inputs is true:

    (define or (lambda (a b) (if a #t (if b #t #f))))

but `(if b #t #f)` is really the same as just `b` because it must be a
boolean (we cannot use more than one "truty" or "falsy" values):

    (define or (lambda (a b) (if a #t b)))

also, if `a` is true, we want to return `#t`, but that is exactly the
value of `a`, so:

    (define or (lambda (a b) (if a a b)))

and finally, we can get rid of the `if` (which is actually breaking the
`if` abstraction, if we encode booleans in some other way):

    (define or (lambda (a b) (a a b)))

Similarly, you can convince yourself that the definition of `and` is:

    (define and (lambda (a b) (a b a)))

Schlac has to-Racket conversion functions for booleans too:

    (->bool (or #f #f))
    (->bool (or #f #t))
    (->bool (or #t #f))
    (->bool (or #t #t))

and

    (->bool (and #f #f))
    (->bool (and #f #t))
    (->bool (and #t #f))
    (->bool (and #t #t))

A `not` function is quite simple --- one alternative is to choose from
true and false in the usual way:

    (define not (lambda (a) (a #f #t)))

and another is to return a function that switches the inputs to an input
boolean:

    (define not (lambda (a) (lambda (x y) (a y x))))

which is the same as

    (define not (lambda (a x y) (a y x)))

We can now put numbers and booleans together: we define a `zero?`
function.

    (define zero? (lambda (n) (n (lambda (x) #f) #t)))
    (test (->bool (and (zero? 0) (not (zero? 3)))) => '#t)

(Good question: is this fast?)

(Note that it is better to test that the value is explicitly `#t`, if we
just use `(test (->bool ...))` then the test will work even if the
expression in question evaluated to some bogus value.)

The idea is simple --- if `n` is the encoding of zero, it will return
it's second argument which is `#t`:

    (zero? 0) --> ((lambda (f n) n) (lambda (x) #f) #t) -> #t

if `n` is an encoding of a bigger number, then it is a self-composition,
and the function that we give it is one that always returns `#f`, no
matter how many times it is self-composed. Try `2` for example:

    (zero? 2) --> ((lambda (f n) (f (f n))) (lambda (x) #f) #t)
              --> ((lambda (x) #f) ((lambda (x) #f) #t))
              --> #f

Now, how about an encoding for compound values? A minimal approach is
what we use in Racket --- a way to generate pairs (`cons`), and encode
lists as chains of pairs with a special value at the end (`null`). There
is a natural encoding for pairs that we have previously seen --- a pair
is a function that expects a selector, and will apply that on the two
values:

    (define cons (lambda (x y) (lambda (s) (s x y))))

Or, equivalently:

    (define cons (lambda (x y s) (s x y)))

To extract the two values from a pair, we need to pass a selector that
consumes two values and returns one of them. In our framework, this is
exactly what the two boolean values do, so we get:

    (define car (lambda (x) (x #t)))
    (define cdr (lambda (x) (x #f)))

    (->nat (+ (car (cons 2 3)) (cdr (cons 2 3))))

We can even do this:

    (define 1st car)
    (define 2nd (lambda (l) (car (cdr l))))
    (define 3rd (lambda (l) (car (cdr (cdr l)))))
    (define 4th (lambda (l) (car (cdr (cdr (cdr l))))))
    (define 5th (lambda (l) (car (cdr (cdr (cdr (cdr l)))))))

or write a `list-ref` function:

    (define list-ref (lambda (l n) (car (n cdr l))))

Note that we don't need a recursive function for this: our encoding of
natural numbers makes it easy to "iterate N times". What we get with
this encoding is essentially free natural-number recursion.

We now need a special `null` value to mark list ends. This value should
have the same number of arguments as a `cons` value (one: a
selector/boolean function), and it should be possible to distinguish it
from other values. We choose

    (define null (lambda (s) #t))

Testing the list encoding:

    (define l123 (cons 1 (cons 2 (cons 3 null))))
    (->nat (2nd l123))

And as with natural numbers and booleans, Schlac has built-in facility
to convert encoded lists to Racket values, except that this requires
specifying the type of values in a list so it's a higher-order function:

    ((->listof ->nat) l123)

which ("as usual") can be written as

    (->listof ->nat l123)

We can even do this:

    (->listof (->listof ->nat) (cons l123 (cons l123 null)))

Defining `null?` is now relatively easy (and it's actually already used
by the above `->listof` conversion). The following definition

    (define null? (lambda (x) (x (lambda (x y) #f))))

works because if `x` is null, then it simply ignores its argument and
returns `#t`, and if it's a pair, then it uses the input selector, which
always returns `#f` in its turn. Using some arbitrary `A` and `B`:

    (null? (cons A B))
      --> ((lambda (x) (x (lambda (x y) #f))) (lambda (s) (s A B)))
      --> ((lambda (s) (s A B)) (lambda (x y) #f))
      --> ((lambda (x y) #f) A B)
      --> #f
    (null? null)
      --> ((lambda (x) (x (lambda (x y) #f))) (lambda (s) #t))
      --> ((lambda (s) #t) (lambda (x y) #f))
      --> #t

We can use the Y combinator to create recursive functions --- we can
even use the rewrite rules facility that Schlac contains (the same one
that we have previously seen):

    (define Y
      (lambda (f)
        ((lambda (x) (x x)) (lambda (x) (f (x x))))))
    (rewrite (define/rec f E) => (define f (Y (lambda (f) E))))

and using it:

    (define/rec length
      (lambda (l)
        (if (null? l)
          0
          (add1 (length (cdr l))))))
    (->nat (length l123))

And to complete this, um, journey --- we're still missing subtraction.
There are many ways to solve the problem of subtraction, and for a
challenge try to come up with a solution yourself. One of the clearer
solutions uses a simple idea --- begin with a pair of two zeroes
`<0,0>`, and repeat this transformation `n` times: `<a,b>` -> `<b,b+1>`.
After `n` steps, we will have `<n-1,n>` --- so we get:

    (define inccons (lambda (p) (cons (cdr p) (add1 (cdr p)))))
    (define sub1 (lambda (n) (car (n inccons (cons 0 0)))))
    (->nat (sub1 5))

And from this the road is short to general subtraction, `m`-`n` is
simply `n` applications of `sub1` on `m`:

    (define - (lambda (m n) (n sub1 m)))
    (test (->nat (- 3 2)) => '1)
    (test (->nat (- (* 4 (* 5 5)) 5)) => '95)

We now have a normal-looking language, and we're ready to do anything we
want. Here are two popular examples:

    (define/rec fact
      (lambda (x)
        (if (zero? x) 1 (* x (fact (sub1 x))))))
    (test (->nat (fact 5)) => '120)

    (define/rec fib
      (lambda (x)
        (if (or (zero? x) (zero? (sub1 x)))
          1
          (+ (fib (- x 1)) (fib (- x 2))))))
    (test (->nat (fib (* 5 2))) => '89)

To get generalized arithmetic capability, Schlac has yet another
built-in facility for translating Racket natural numbers into Church
numerals:

    (->nat (fib (nat-> '10)))

... and to get to that frightening expression in the beginning, all you
need to do is replace all definitions in the `fib` definition over and
over again until you're left with nothing but lambda expressions and
applications, then reformat the result into some cute shape. For extra
fun, you can look for immediate applications of lambda expressions and
reduce them manually.

All of this is in the following code:

    ;;; ---<<<CHURCH>>>-------------------------------------------------
    ;; Making Schlac into a practical language (not an interpreter)

    #lang pl schlac

    (define identity (lambda (x) x))

    ;; Natural numbers

    (define 0 (lambda (f x) x))

    (define add1 (lambda (n) (lambda (f x) (f (n f x)))))
    ;; same as:
    ;;   (define add1 (lambda (n) (lambda (f x) (n f (f x)))))

    (define 1 (add1 0))
    (define 2 (add1 1))
    (define 3 (add1 2))
    (define 4 (add1 3))
    (define 5 (add1 4))
    (test (->nat (add1 (add1 5))) => '7)

    (define + (lambda (m n) (m add1 n)))
    (test (->nat (+ 4 5)) => '9)

    ;; (define * (lambda (m n) (m (+ n) 0)))
    (define * (lambda (m n f) (m (n f))))
    (test (->nat (* 4 5)) => '20)
    (test (->nat (+ 4 (* (+ 2 5) 5))) => '39)

    ;; (define ^ (lambda (m n) (n (* m) 1)))
    (define ^ (lambda (m n) (n m)))
    (test (->nat (^ 3 4)) => '81)

    ;; Booleans

    (define #t (lambda (x y) x))
    (define #f (lambda (x y) y))

    (define if (lambda (c t e) (c t e))) ; not really needed

    (test (->nat (if #t 1 2)) => '1)
    (test (->nat (if #t (+ 4 5) (+ '1 '2))) => '9)

    (define and (lambda (a b) (a b a)))
    (define or  (lambda (a b) (a a b)))
    ;; (define not (lambda (a) (a #f #t)))
    (define not (lambda (a x y) (a y x)))

    (test (->bool (and #f #f)) => '#f)
    (test (->bool (and #t #f)) => '#f)
    (test (->bool (and #f #t)) => '#f)
    (test (->bool (and #t #t)) => '#t)
    (test (->bool (or  #f #f)) => '#f)
    (test (->bool (or  #t #f)) => '#t)
    (test (->bool (or  #f #t)) => '#t)
    (test (->bool (or  #t #t)) => '#t)
    (test (->bool (not #f)) => '#t)
    (test (->bool (not #t)) => '#f)

    (define zero? (lambda (n) (n (lambda (x) #f) #t)))
    (test (->bool (and (zero? 0) (not (zero? 3)))) => '#t)

    ;; Lists

    (define cons (lambda (x y s) (s x y)))

    (define car (lambda (x) (x #t)))
    (define cdr (lambda (x) (x #f)))

    (test (->nat (+ (car (cons 2 3)) (cdr (cons 2 3)))) => '5)

    (define 1st car)
    (define 2nd (lambda (l) (car (cdr l))))
    (define 3rd (lambda (l) (car (cdr (cdr l)))))
    (define 4th (lambda (l) (car (cdr (cdr (cdr l))))))
    (define 5th (lambda (l) (car (cdr (cdr (cdr (cdr l)))))))

    (define null (lambda (s) #t))
    (define null? (lambda (x) (x (lambda (x y) #f))))

    (define l123 (cons 1 (cons 2 (cons 3 null))))
    ;; Note that `->listof' is a H.O. converter
    (test ((->listof ->nat) l123) => '(1 2 3))
    (test (->listof ->nat l123) => '(1 2 3))  ; same as the above
    (test (->listof (->listof ->nat) (cons l123 (cons l123 null)))
          => '((1 2 3) (1 2 3)))

    ;; Subtraction is tricky

    (define inccons (lambda (p) (cons (cdr p) (add1 (cdr p)))))
    (define sub1 (lambda (n) (car (n inccons (cons 0 0)))))

    (test (->nat (sub1 5)) => '4)

    (define - (lambda (a b) (b sub1 a)))

    (test (->nat (- 3 2)) => '1)
    (test (->nat (- (* 4 (* 5 5)) 5)) => '95)
    (test (->nat (- 2 4)) => '0) ; this is "natural subtraction"

    ;; Recursive functions

    (define Y
      (lambda (f)
        ((lambda (x) (x x)) (lambda (x) (f (x x))))))
    (rewrite (define/rec f E) => (define f (Y (lambda (f) E))))

    (define/rec length
      (lambda (l)
        (if (null? l)
          0
          (add1 (length (cdr l))))))
    (test (->nat (length l123)) => '3)

    (define/rec fact
      (lambda (x)
        (if (zero? x) 1 (* x (fact (sub1 x))))))
    (test (->nat (fact 5)) => '120)

    (define/rec fib
      (lambda (x)
        (if (or (zero? x) (zero? (sub1 x)))
          1
          (+ (fib (sub1 x)) (fib (sub1 (sub1 x)))))))
    (test (->nat (fib (* 5 2))) => '89)

    #|
    ;; Fully-expanded Fibonacci
    (define fib
      ((lambda (f)
         ((lambda (x) (x x)) (lambda (x) (f (x x)))))
       (lambda (f)
         (lambda (x)
           ((lambda (c t e) (c t e))
            ((lambda (a b) (a a b))
             ((lambda (n)
                (n (lambda (x) (lambda (x y) y)) (lambda (x y) x)))
              x)
             ((lambda (n)
                (n (lambda (x) (lambda (x y) y)) (lambda (x y) x)))
              ((lambda (n)
                 ((lambda (x) (x (lambda (x y) x)))
                  (n (lambda (p)
                       ((lambda (x y s) (s x y))
                        ((lambda (x) (x (lambda (x y) y))) p)
                        ((lambda (n) (lambda (f x) (f (n f x))))
                         ((lambda (x) (x (lambda (x y) y))) p))))
                     ((lambda (x y s) (s x y))
                      (lambda (f x) x)
                      (lambda (f x) x)))))
               x)))
            ((lambda (n) (lambda (f x) (f (n f x)))) (lambda (f x) x))
            ((lambda (x y)
               (x (lambda (n) (lambda (f x) (f (n f x)))) y))
             (f ((lambda (n)
                   ((lambda (x) (x (lambda (x y) x)))
                    (n (lambda (p)
                         ((lambda (x y s) (s x y))
                          ((lambda (x) (x (lambda (x y) y))) p)
                          ((lambda (n) (lambda (f x) (f (n f x))))
                           ((lambda (x) (x (lambda (x y) y))) p))))
                       ((lambda (x y s) (s x y))
                        (lambda (f x) x)
                        (lambda (f x) x)))))
                 x))
             (f ((lambda (n)
                   ((lambda (x) (x (lambda (x y) x)))
                    (n (lambda (p)
                         ((lambda (x y s) (s x y))
                          ((lambda (x) (x (lambda (x y) y))) p)
                          ((lambda (n) (lambda (f x) (f (n f x))))
                           ((lambda (x) (x (lambda (x y) y))) p))))
                       ((lambda (x y s) (s x y))
                        (lambda (f x) x)
                        (lambda (f x) x)))))
                 ((lambda (n)
                    ((lambda (x) (x (lambda (x y) x)))
                     (n (lambda (p)
                          ((lambda (x y s) (s x y))
                           ((lambda (x) (x (lambda (x y) y))) p)
                           ((lambda (n) (lambda (f x) (f (n f x))))
                            ((lambda (x) (x (lambda (x y) y))) p))))
                        ((lambda (x y s) (s x y))
                         (lambda (f x) x)
                         (lambda (f x) x)))))
                  x)))))))))

    ;; The same after reducing all immediate function applications
    (define fib
      ((lambda (f)
         ((lambda (x) (x x)) (lambda (x) (f (x x)))))
       (lambda (f)
         (lambda (x)
           (((x (lambda (x) (lambda (x y) y)) (lambda (x y) x))
             (x (lambda (x) (lambda (x y) y)) (lambda (x y) x))
             (((x (lambda (p)
                    (lambda (s)
                      (s (p (lambda (x y) y))
                         (lambda (f x)
                           (f ((p (lambda (x y) y)) f x))))))
                  (lambda (s)
                    (s (lambda (f x) x) (lambda (f x) x))))
               (lambda (x y) x))
              (lambda (x) (lambda (x y) y))
              (lambda (x y) x)))
            (lambda (f x) (f x))
            ((f ((x (lambda (p)
                      (lambda (s)
                        (s (p (lambda (x y) y))
                           (lambda (f x)
                             (f ((p (lambda (x y) y)) f x))))))
                    (lambda (y s)
                      (s (lambda (f x) x) (lambda (f x) x))))
                 (lambda (x y) x)))
             (lambda (n) (lambda (f x) (f (n f x))))
             (f ((((x (lambda (p)
                        (lambda (s)
                          (s (p (lambda (x y) y))
                             (lambda (f x)
                               (f ((p (lambda (x y) y)) f x))))))
                      (lambda (s)
                        (s (lambda (f x) x) (lambda (f x) x))))
                   (lambda (x y) x))
                  (lambda (p)
                    (lambda (s)
                      (s (p (lambda (x y) y))
                         (lambda (f x)
                           (f ((p (lambda (x y) y)) f x))))))
                  (lambda (s)
                    (s (lambda (f x) x) (lambda (f x) x))))
                 (lambda (x y) x)))))))))

    ;; Cute reformatting of the above:
    (define fib((lambda(f)((lambda(x)(x x))(lambda(x)(f(x x)))))(lambda(
    f)(lambda(x)(((x(lambda(x)(lambda(x y)y))(lambda(x y)x))(x(lambda(x)
    (lambda(x y)y))(lambda(x y) x))(((x(lambda(p)(lambda(s)(s(p(lambda(x
    y)y))(lambda(f x)(f((p(lambda(x y)y))f x))))))(lambda(s) (s(lambda(f
    x)x)(lambda(f x)x))))(lambda(x y)x))(lambda(x)(lambda(x y)y))(lambda
    (x y)x)))(lambda(f x)(f x))((f((x(lambda(p)(lambda(s)(s(p(lambda(x y
    )y))(lambda(f x)(f((p(lambda(x y)y))f x))))))(lambda(y s)(s(lambda(f
    x)x)(lambda(f x)x))))(lambda(x y)x)))(lambda(n)(lambda(f x)(f(n f x)
    )))(f((((x(lambda(p)(lambda(s)(s(p (lambda(x y)y))(lambda(f x)(f((p(
    lambda(x y) y))f x))))))(lambda(s)(s(lambda(f x)x)(lambda(f x)x))))(
    ;;                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ;;                      `---------------(cons 0 0)---------------'
    lambda(x y)x))(lambda(p)(lambda(s)(s(p(lambda(x y)y))(lambda(f x)(f(
    (p(lambda(x y)y))f x))))))(lambda(s)(s(lambda(f x)x)(lambda(f x)x)))
    )(lambda(x y)x)))))))))

    ;; And for extra fun:

             (λ(f)(λ
             (x)(((x(λ(
             x)(λ(x y)y)
             )(λ(x y)x))(
             x(λ(x)(λ(x y)
                 y))(λ(x y
                   )x))(((
                    x(λ(p)(
                     λ(s)(s
                     (p (λ(
                     x y)y))
                      (λ(f x
                      )(f((p(
                       λ(x y)
                       y))f x
                       ))))))(
                       λ(s)(s(
                      λ(f x)x)
                      (λ(f x)x)
                     )))(λ(x y)
                     x))(λ(x)(λ(
                     x y)y)) (λ(
                    x y) x)))(λ(
                    f x)(f x))((f
                   ((x(λ(p )(λ (s
                   )(s(p(  λ(x y)
                  y))(λ (  f x)(f(
                  (p (λ(    x y)y)
                 )f x)))    )))(λ(
                 y s)(s     (λ (f x
                 )x)(λ(      f x)x)
                )))(λ(       x y)x))
                )(λ(n)        (λ (f
               x)(f (n        f x)))
               )(f(((         (x(λ(p)
              (λ(s)(s          (p( λ(
              x y )y           ))(λ(f
             x) (f((           p(λ(x y
             )y)) f             x)))))
            )(λ(s)(             s(λ(f x
            )x)(λ(               f x)x)
            ))) (λ               (x y)x
           ))(λ(p                )(λ(s)(
           s(p(λ(                 x y)y)
          )(λ (f                  x)(f((
          p(λ (x                  y)y)) f
         x))))))                   (λ(s)(
         s(λ (f                    x)x)(λ
        (f x)x)                     )))(λ(
        x y)x)                      ))))))

    |#


------------------------------------------------------------------------
# Alternative Church Encoding [Tuesday, February 18th]

Finally, note that this is just one way to encode things --- other
encodings are possible. One alternative encoding is in the following
code --- it uses a list of `N` falses as the encoding for `N`. This
encoding makes it easier to `add1` (just `cons` another `#f`), and to
`sub1` (simply `cdr`). The tradeoff is that some arithmetics operations
becomes more complicated, for example, the definition of `+` requires
the fixpoint combinator. (As expected, some people want to see what can
we do with a language without recursion, so they don't like jumping to Y
too fast.)

    ;;; ---<<<CHURCH-ALTERNATIVE>>>-------------------------------------
    ;; An alternative "Church" encoding: use lists to encode numbers

    #lang pl schlac

    (define identity (lambda (x) x))

    ;; Booleans (same as before)

    (define #t (lambda (x y) x))
    (define #f (lambda (x y) y))

    (define if (lambda (c t e) (c t e))) ; not really needed

    (test (->bool (if #t #f #t))
          => '#f)
    (test (->bool (if #f ((lambda (x) (x x)) (lambda (x) (x x))) #t))
          => '#t)

    (define and (lambda (a b) (a b a)))
    (define or  (lambda (a b) (a a b)))
    (define not (lambda (a x y) (a y x)))

    (test (->bool (and #f #f)) => '#f)
    (test (->bool (and #t #f)) => '#f)
    (test (->bool (and #f #t)) => '#f)
    (test (->bool (and #t #t)) => '#t)
    (test (->bool (or  #f #f)) => '#f)
    (test (->bool (or  #t #f)) => '#t)
    (test (->bool (or  #f #t)) => '#t)
    (test (->bool (or  #t #t)) => '#t)
    (test (->bool (not #f)) => '#t)
    (test (->bool (not #t)) => '#f)

    ;; Lists (same as before)

    (define cons (lambda (x y s) (s x y)))
    (define car (lambda (x) (x #t)))
    (define cdr (lambda (x) (x #f)))

    (define 1st car)
    (define 2nd (lambda (l) (car (cdr l))))
    (define 3rd (lambda (l) (car (cdr (cdr l)))))
    (define 4th (lambda (l) (car (cdr (cdr (cdr l))))))
    (define 5th (lambda (l) (car (cdr (cdr (cdr (cdr l)))))))

    (define null (lambda (s) #t))
    (define null? (lambda (x) (x (lambda (x y) #f))))

    ;; Natural numbers (alternate encoding)

    (define 0 identity)

    (define add1 (lambda (n) (cons #f n)))

    (define zero? car) ; tricky

    (define sub1 cdr) ; this becomes very simple

    ;; Note that we could have used something more straightforward:
    ;; (define 0 null)
    ;; (define add1 (lambda (n) (cons #t n))) ; cons anything
    ;; (define zero? null?)
    ;; (define sub1 (lambda (l) (if (zero? l) l (cdr l))))

    (define 1 (add1 0))
    (define 2 (add1 1))
    (define 3 (add1 2))
    (define 4 (add1 3))
    (define 5 (add1 4))

    (test (->nat* (add1 (add1 5))) => '7)
    (test (->nat* (sub1 (sub1 (add1 (add1 5))))) => '5)
    (test (->bool (and (zero? 0) (not (zero? 3)))) => '#t)
    (test (->bool (zero? (sub1 (sub1 (sub1 3))))) => '#t)

    ;; list-of-numbers tests
    (define l123 (cons 1 (cons 2 (cons 3 null))))
    (test (->listof ->nat* l123) => '(1 2 3))
    (test (->listof (->listof ->nat*) (cons l123 (cons l123 null)))
          => '((1 2 3) (1 2 3)))

    ;; Recursive functions

    (define Y
      (lambda (f)
        ((lambda (x) (x x)) (lambda (x) (f (x x))))))
    (rewrite (define/rec f E) => (define f (Y (lambda (f) E))))

    ;; note that this example is doing something silly now
    (define/rec length
      (lambda (l)
        (if (null? l)
          0
          (add1 (length (cdr l))))))
    (test (->nat* (length l123)) => '3)

    ;; addition becomes hard since it requires a recursive definition
    ;; (define/rec +
    ;;   (lambda (m n) (if (zero? n) m (+ (add1 m) (sub1 n)))))
    ;; (test (->nat* (+ 4 5)) => '9)

    ;; faster alternative:
    (define/rec +
      (lambda (m n)
        (if (zero? m) n
            (if (zero? n) m
                (add1 (add1 (+ (sub1 m) (sub1 n))))))))
    (test (->nat* (+ 4 5)) => '9)

    ;; subtraction is similar to addition
    ;; (define/rec -
    ;;   (lambda (m n) (if (zero? n) m (- (sub1 m) (sub1 n)))))
    ;; (test (->nat* (- (+ 4 5) 4)) => '5)
    ;; but this is not "natural subtraction": doesn't work when n>m,
    ;; because (sub1 0) does not return 0.

    ;; a solution is like alternative form of +:
    (define/rec -
      (lambda (m n)
        (if (zero? m) 0
            (if (zero? n) m
                (- (sub1 m) (sub1 n))))))
    (test (->nat* (- (+ 4 5) 4)) => '5)
    (test (->nat* (- 2 5)) => '0)
    ;; alternatively, could change sub1 above:
    ;; (define sub1 (lambda (n) (if (zero? n) n (cdr n))))

    ;; we can do multiplication in a similar way
    (define/rec *
      (lambda (m n)
        (if (zero? m) 0
            (+ n (* (sub1 m) n)))))
    (test (->nat* (* 4 5)) => '20)
    (test (->nat* (+ 4 (* (+ 2 5) 5))) => '39)

    ;; and the rest of the examples

    (define/rec fact
      (lambda (x)
        (if (zero? x) 1 (* x (fact (sub1 x))))))
    (test (->nat* (fact 5)) => '120)

    (define/rec fib
      (lambda (x)
        (if (or (zero? x) (zero? (sub1 x)))
          1
          (+ (fib (sub1 x)) (fib (sub1 (sub1 x)))))))
    (test (->nat* (fib (* 5 2))) => '89)

    #|
    ;; Fully-expanded Fibonacci (note: much shorter than the previous
    ;; encoding, but see how Y appears twice -- two "((lambda" pairs)
    (define fib((lambda(f)((lambda(x)(x x))(lambda(x)(f(x x)))))(lambda(
    f)(lambda(x)(((((x(lambda(x y)x))(x(lambda(x y)x)))((x(lambda(x y)y)
    )(lambda(x y)x)))(lambda(s)(s(lambda(x y)y)(lambda(x)x))))((((lambda
    (f)((lambda(x)(x x))(lambda(x)(f(x x))))) (lambda(f)(lambda(m n)((m(
    lambda(x y)x))n (((n(lambda(x y)x)) m)(lambda(s)((s (lambda(x y)y))(
    lambda(s)((s (lambda(x y)y))((f(m(lambda(x y)y)))(n(lambda(x y)y))))
    ))))))))(f(x(lambda(x y)y))))(f((x(lambda(x y)y))(lambda(x y)y))))))
    )))
    |#


------------------------------------------------------------------------
# Implementing `define-type` & `cases` in Schlac [Tuesday, February 18th]

(The following explanation originates from Chris Okasaki.)

Another interesting way to implement lists follows the pattern matching
approach, where both pairs and the null value are represented by a
function that serves as a kind of a `match` dispatcher. This function
takes in two inputs --- if it is the representation of null then it will
return the first input, and if it is a pair, then it will apply the
second input on the two parts of the pair. This is implemented as
follows (with type comments to make it clear):

    ;; null : List
    (define null
      (lambda (n p)
        n))

    ;; cons : A List -> List
    (define cons
      (lambda (x y)
        (lambda (n p)
          (p x y))))

This might seem awkward, but it follows the intended use of pairs and
null as a match-like construct. Here is an example, with the equivalent
Racket code on the side:

    ;; Sums up a list of numbers
    (define/rec (sum l)
      (l                    ; (match l
       0                    ;   ['() 0]
       (lambda (x xs)       ;   [(cons x xs)
         (+ x (sum xs)))))  ;    (+ x (sum xs))])

In fact, it's easy to implement our selectors and predicate using this:

    (define null? (lambda (l) (l #t (lambda (x xs) #f))))
    (define car   (lambda (l) (l #f (lambda (x xs) x))))
    (define cdr   (lambda (l) (l #f (lambda (x xs) xs))))
    ;; in the above `#f' is really any value, since it
    ;; should be an error alternatively:
    (define car (lambda (l)
                  (l ((lambda (x) (x x)) (lambda (x) (x x))) ; "error"
                     (lambda (x y) x))))

The same approach can be used to define any kind of new data type in a
way that looks like our own `define-type` definitions. For example,
consider a much-simplified definition of the AE type we've seen early in
the semester, and a matching `eval` definition as an example for using
`cases`:

    (define-type AE
      [Num Number]
      [Add AE AE])
    (: eval : AE -> Number)
    (define (eval expr)
      (cases expr
        [(Num n)   n]
        [(Add l r) (+ (eval l) (eval r))]))

We can follow the above approach now to write Schlac code that more than
being equivalent, is also very similar in nature. Note that the type
definition is replaced by two definitions for the two constructors:

    (define Num (lambda (n)   (lambda (num add) (num n  ))))
    (define Add (lambda (l r) (lambda (num add) (add l r))))
    (define/rec eval
      (lambda (expr) ; `expr` is always a (lambda (num add) ...), and it
                     ; expects a unary `num` argument and a binary `add`
        (expr (lambda (n)   n)
              (lambda (l r) (+ (eval l) (eval r))))))
    (test (->nat (eval (Add (Num 1) (Num 2)))) => '3)

We can even take this further: the translations from `define-type` and
`cases` are mechanical enough that we could implement them almost
exactly via rewrites (there are a subtle change in that we're list field
names rather than types):

    (rewrite (define-type -ignored-Type- [Variant arg ...] ...)
          => (define Variant
               (lambda (arg ...)
                 (lambda (Variant ...) (Variant arg ...))))
             ...)
    (rewrite (cases value [(-ignored-Variant- arg ...) result] ...)
          => (value (lambda (arg ...) result) ...))

And using that, an evluator is simple:

    (define-type AE [Num n] [Add l r] [Sub l r] [Mul l r])
    (define/rec eval
      (lambda (expr)
        (cases expr
          [(Num n)   n]
          [(Add l r) (+ (eval l) (eval r))]
          [(Sub l r) (- (eval l) (eval r))]
          [(Mul l r) (* (eval l) (eval r))])))
    (test (->nat (eval (Mul (Add (Num 1) (Num 2))
                            (Sub (Num 4) (Num 2)))))
          => '6)


------------------------------------------------------------------------
# Recursive Environments [Tuesday, February 18th]

> [PLAI §11.5]

What we really need for recursion, is a special kind of an environment,
one that can refer to itself. So instead of doing (note: `call`s removed
for readability):

    {with {fact {fun {n}
                  {if {zero? n} 1 {* n {fact {- n 1}}}}}}
      {fact 5}}

which does not work for the usual reasons, we want to use some

    {rec {fact {fun {n}
                 {if {zero? n} 1 {* n {fact {- n 1}}}}}}
      {fact 5}}

that will do the necessary magic.

One way to achieve this is using the Y combinator as we have seen --- a
kind of a "constructor" for recursive functions. We can do that in a
similar way to the `rewrite` rule that we have seen in Schlac ---
translate the above expression to:

    {with {fact {make-rec
                  {fun {fact}
                    {fun {n}
                      {if {zero? n} 1 {* n {fact {- n 1}}}}}}}}
      {fact 5}}

or even:

    {with {fact {{fun {f} {{fun {x} {f {x x}}} {fun {x} {f {x x}}}}}
                 {fun {fact}
                   {fun {n}
                     {if {zero? n} 1 {* n {fact {- n 1}}}}}}}}
      {fact 5}}

Now, we will see how it can be used in *our* code to implement a
recursive environment.

If we look at what `with` does in

    {with {fact {fun {n}
                  {if {zero? n} 1 {* n {call fact {- n 1}}}}}}
      {call fact 5}}

then we can say that to evaluate this expression, we evaluate the body
expression in an extended environment that contains `fact`, even if a
bogus one that is good for `0` only --- the new environment is created
with something like this:

    extend("fact", make-fact-closure(), env)

so we can take this whole thing as an operation over `env`

    add-fact(env) := extend("fact", make-fact-closure(), env)

This gives us the first-level fact. But `fact` itself is still undefined
in `env`, so it cannot call itself. We can try this:

    add-fact(add-fact(env))

but that still doesn't work, and it will never work no matter how far we
go:

    add-fact(add-fact(add-fact(add-fact(add-fact(...env...)))))

What we really want is infinity: a place where add-fact works and the
result is the same as what we've started with --- we want to create a
"magical" environment that makes this possible:

    let magic-env = ???
    such that:
       add-fact(magic-env) = magic-env

which basically gives us the illusion of being at the infinity point.
This magic-env thing is exactly the *fixed-point* of the `add-fact`
operation. We can use:

    magic-env = rec(add-fact)

and following the main property of the Y combinator, we know that:

    magic-env = rec(add-fact)           ; def. of magic-env
              = add-fact(rec(add-fact)) ; Y(f) = f(Y(f))
              = add-fact(magic-env)     ; def. of magic-env

What does all this mean? It means that if we have a fixed-point operator
at the level of the implementation of our environments, then we can use
it to implement a recursive binder. In our case, this means that a
fixpoint in Racket can be used to implement a recursive language. But we
have that --- Racket does have recursive functions, so we should be able
to use that to implement our recursive binder.

There are two ways that make it possible to write recursive functions in
Racket. One is to define a function, and use its name to do a recursive
call --- using the Racket formal rules, we can see that we said that we
mark that we now *know* that a variable is bound to a value. This is
essentially a side-effect --- we modify what we know, which corresponds
to modifying the global environment. The second way is a new form:
`letrec`. This form is similar to `let`, except that the scope that is
established includes the named expressions --- it is exactly what we
want `rec` to do. A third way is using recursive local definitions, but
that is equivalent to using `letrec`, more on this soon.


------------------------------------------------------------------------
# Recursion: Racket's `letrec` [Tuesday, February 18th]

So we want to add recursion to our language, practically. We already
know that Racket makes it possible to write recursive functions, which
is possible because of the way it implements its "global environment":
our evaluator can only *extend* an environment, while Racket *modifies*
its global environment. This means that whenever a function is defined
in the global environment, the resulting closure will have it as its
environment "pointer", but the global environment was not extended ---
it stays the same, and was just modified with one additional binding.

But Racket has another, a bit more organized way of using recursion:
there is a special local-binding construct that is similar to `let`, but
allows a function to refer to itself. It is called `letrec`:

    (letrec ([fact (lambda (n)
                     (if (zero? n)
                       1
                       (* n (fact (- n 1)))))])
      (fact 5))

Some people may remember that there was a third way for creating
recursive functions: using local definition in function bodies. For
example, we have seen things like:

    (define (length list)
      (define (helper list len)
        (if (null? list)
          len
          (helper (rest list) (+ len 1))))
      (helper list 0))

This looks like the same kind of environment magic that happens with a
global `define` --- but actually, Racket defines the meaning of internal
definitions using `letrec` --- so the above code is exactly the same as:

    (define (length list)
      (letrec ([helper (lambda (list len)
                         (if (null? list)
                           len
                           (helper (rest list) (+ len 1))))])
        (helper list 0)))

The scoping rules for a `letrec` is that the scope of the bound name
covers both the body *and* the named expression. Furthermore, multiple
names can be bound to multiple expressions, and the scope of each name
covers all named expression as well as the body. This makes it easy to
define mutually recursive functions, such as:

    (letrec ([even? (lambda (n) (if (zero? n) #t (odd?  (- n 1))))]
             [odd?  (lambda (n) (if (zero? n) #f (even? (- n 1))))])
      (even? 99))

But it is not a required functionality --- it could be done with a
single recursive binding that contains several functions:

    (letrec ([even+odd
              (list (lambda (n)
                      (if (zero? n) #t ((second even+odd) (- n 1))))
                    (lambda (n)
                      (if (zero? n) #f ((first  even+odd) (- n 1)))))])
      ((first even+odd) 99))

This is basically the same problem we face if we want to use the Y
combinator for mutually recursive bindings. The above solution is
inconvenient, but it can be improved using more `let`s to have easier
name access. For example:

    (letrec ([even+odd
              (list (lambda (n)
                      (let ([even? (first  even+odd)]
                            [odd?  (second even+odd)])
                        (if (zero? n) #t (odd? (- n 1)))))
                    (lambda (n)
                      (let ([even? (first  even+odd)]
                            [odd?  (second even+odd)])
                        (if (zero? n) #f (even? (- n 1))))))])
      (let ([even? (first  even+odd)]
            [odd?  (second even+odd)])
        (even? 99)))


------------------------------------------------------------------------
# Implementing Recursion using `letrec` [Tuesday, February 18th]

We will see how to add a similar construct to our language --- for
simplicity, we will add a `rec` form that handles a single binding:

    {rec {fact {fun {n}
                 {if {= 0 n}
                   1
                   {* n {fact {- n 1}}}}}}
      {fact 5}}

Using this, things can get a little tricky. What should we get if we do:

    {rec {x x} x}

? Currently, it seems like there is no point in using any expression
except for a *function* expression in a `rec` expression, so we will
handle only these cases.

(BTW, under what circumstances would non-function values be useful in a
letrec?)

------------------------------------------------------------------------

One way to achieve this is to use the same trick that we have recently
seen: instead of re-implementing language features, we can use existing
features in our own language, which hopefully has the right
functionality in a form that can be re-used to in our evaluator.

Previously, we have seen a way to implement environments using Racket
closures:

    ;; Define a type for functional environments
    (define-type ENV = Symbol -> VAL)

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

    (: lookup : Symbol ENV -> VAL)
    (define (lookup name env)
      (env name))

    (: Extend : Symbol VAL ENV -> ENV)
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name))))

We can use this implementation, and create circular environments using
Racket's `letrec`. The code for handling a `with` expressions is:

    [(With bound-id named-expr bound-body)
     (eval bound-body
           (Extend bound-id (eval named-expr env) env))]

It looks like we should be able to handle `rec` in a similar way (the
AST constructor name is `WRec` ("with-rec") so it doesn't collide with
TR's `Rec` constructor for recursive types):

    [(WRec bound-id named-expr bound-body)
     (eval bound-body
           (Extend bound-id (eval named-expr env) env))]

but this won't work because the named expression is evaluated
prematurely, in the previous environment. Instead, we will move
everything that needs to be done, including evaluation, to a separate
`extend-rec` function:

    [(WRec bound-id named-expr bound-body)
     (eval bound-body
           (extend-rec bound-id named-expr env))]

Now, the `extend-rec` function needs to provide the new, "magically
circular" environment. Following what we know about the arguments to
`extend-rec`, and the fact that it returns a new environment (= a lookup
function), we can sketch a rough definition:

    (: extend-rec : Symbol FLANG ENV -> ENV) ; FLANG, not VAL!
    ;; extend an environment with a new binding that is the result of
    ;; evaluating an expression in the same environment as the extended
    ;; result
    (define (extend-rec id expr rest-env)
      (lambda (name)
        (if (eq? name id)
          ... something that uses expr to get a value ...
          (rest-env name))))

What should the missing expression be? It can simply evaluate the object
given itself:

    (define (extend-rec id expr rest-env)
      (lambda (name)
        (if (eq? name id)
          (eval expr ...this environment...)
          (rest-env name))))

But how do we get *this environment*, before it is defined? Well, the
environment is itself a Racket *function*, so we can use Racket's
`letrec` to make the function refer to itself recursively:

    (define (extend-rec id expr rest-env)
      (letrec ([rec-env (lambda (name)
                          (if (eq? name id)
                            (eval expr rec-env)
                            (rest-env name)))])
        rec-env))

It's a little more convenient to use an internal definition, and add a
type for clarity:

    (define (extend-rec id expr rest-env)
      (: rec-env : Symbol -> VAL)
      (define (rec-env name)
        (if (eq? name id)
          (eval expr rec-env)
          (rest-env name)))
      rec-env)

This works, but there are several problems:

1. First, we no longer do a simple lookup in the new environment.
   Instead, we evaluate the expression on *every* such lookup. This
   seems like a technical point, because we do not have side-effects in
   our language (also because we said that we want to handle only
   function expressions). Still, it wastes space since each evaluation
   will allocate a new closure.

2. Second, a related problem --- what happens if we try to run this:

       {rec {x x} x}

   ? Well, we do that stuff to extend the current environment, then
   evaluate the body in the new environment, this body is a single
   variable reference:

       (eval (Id 'x) the-new-env)

   so we look up the value:

       (lookup 'x the-new-env)

   which is:

       (the-new-env 'x)

   which goes into the function which implements this environment, there
   we see that `name` is the same as `name1`, so we return:

       (eval expr rec-env)

   but the `expr` here is the original named-expression which is itself
   `(Id 'x)`, and we're in an infinite loop.

We can try to get over these problems using another binding. Racket
allows several bindings in a single `letrec` expression or multiple
internal function definitions, so we change `extend-rec` to use the
newly-created environment:

    (define (extend-rec id expr rest-env)
      (: rec-env : Symbol -> VAL)
      (define (rec-env name)
        (if (eq? name id)
          val
          (rest-env name)))
      (: val : VAL)
      (define val (eval expr rec-env))
      rec-env)

This runs into an interesting type error, which complains about possibly
getting some `Undefined` value. It does work if we switch to the untyped
language for now (using `#lang pl untyped`) --- and it seems to run fine
too. But it raises more questions, beginning with: what is the meaning
of:

    (letrec ([x ...]
             [y ...x...])
      ...)

or equivalently, an internal block of

    (define x ...)
    (define y ...x...)

? Well, DrRacket seems to do the "right thing" in this case, but what
about:

    (letrec ([y ...x...]
             [x ...])
      ...)

? As a hint, see what happens when we now try to evaluate the
problematic

    {rec {x x} x}

expression, and compare that with the result that you'd get from Racket.
This also clarifies the type error that we received.

It should be clear now why we want to restrict usage to just binding
recursive functions. There are no problems with such definitions because
when we evaluate a `fun` expression, there is no evaluation of the body,
which is the only place where there are potential references to the same
function that is defined --- a function's body is *delayed*, and
executed only when the function is applied later.

But the biggest question that is still open: we just implemented a
circular environment using Racket's own circular environment
implementation, and that does not explain how they are actually
implemented. The cycle of pointers that we've implemented depends on the
cycle of pointers that Racket uses, and that is a black box we want to
open up.

For reference, the complete code is below.

    #lang pl

    #|
    The grammar:
      <FLANG> ::= <num>
                | { + <FLANG> <FLANG> }
                | { - <FLANG> <FLANG> }
                | { * <FLANG> <FLANG> }
                | { / <FLANG> <FLANG> }
                | { with { <id> <FLANG> } <FLANG> }
                | { rec { <id> <FLANG> } <FLANG> }
                | <id>
                | { fun { <id> } <FLANG> }
                | { call <FLANG> <FLANG> }

    Evaluation rules:
      eval(N,env)                = N
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({* E1 E2},env)        = eval(E1,env) * eval(E2,env)
      eval({/ E1 E2},env)        = eval(E1,env) / eval(E2,env)
      eval(x,env)                = lookup(x,env)
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({rec {x E1} E2},env)  = ???
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise
    |#

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [WRec Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [(cons 'rec more)
         (match sexpr
           [(list 'rec (list (symbol: name) named) body)
            (WRec name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `rec' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    (define-type VAL
      [NumV Number]
      [FunV Symbol FLANG ENV])

    ;; Define a type for functional environments
    (define-type ENV = Symbol -> VAL)

    (: EmptyEnv : -> ENV)
    (define (EmptyEnv)
      (lambda (id) (error 'lookup "no binding for ~s" id)))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (env name))

    (: Extend : Symbol VAL ENV -> ENV)
    ;; extend a given environment cache with a new binding
    (define (Extend id val rest-env)
      (lambda (name)
        (if (eq? name id)
          val
          (rest-env name))))

    (: extend-rec : Symbol FLANG ENV -> ENV)
    ;; extend an environment with a new binding that is the result of
    ;; evaluating an expression in the same environment as the extended
    ;; result
    (define (extend-rec id expr rest-env)
      (: rec-env : Symbol -> VAL)
      (define (rec-env name)
        (if (eq? name id)
          val
          (rest-env name)))
      (: val : VAL)
      (define val (eval expr rec-env))
      rec-env)

    (: NumV->number : VAL -> Number)
    ;; convert a FLANG runtime numeric value to a Racket one
    (define (NumV->number val)
      (cases val
        [(NumV n) n]
        [else (error 'arith-op "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (NumV->number val1) (NumV->number val2))))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (eval named-expr env) env))]
        [(WRec bound-id named-expr bound-body)
         (eval bound-body
               (extend-rec bound-id named-expr env))]
        [(Id name) (lookup name env)]
        [(Fun bound-id bound-body)
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) (EmptyEnv))])
        (cases result
          [(NumV n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Implementing `rec` Using Cyclic Structures [Tuesday, February 25th]

> [PLAI §10]

Looking at the arrows in the environment diagrams, what we're really
looking for is a closure that has an environment pointer which is the
same environment in which it was defined. This will make it possible for
`fact` to be bound to a closure that can refer to *itself* since its
environment is the same one in which it is defined. However, so far we
have no tools that makes it possible to do this.

What we need is to create a "cycle of pointers", and so far we do not
have a way of achieving that: when we create a closure, we begin with an
environment which is saved in the slot's environment slot, but we want
that closure to be the value of a binding in that same environment.


------------------------------------------------------------------------
# Boxes and Mutation [Tuesday, February 25th]

To actually implement a circular structure, we will now use
*side-effects*, using a new kind of Racket value which supports
mutation: a box. A box value is built with the `box` constructor:

    (define my-thing (box 7))

the value is retrieved with the `unbox' function,

    (* 6 (unbox my-thing))

and finally, the value can be changed with the `set-box!` function.

    (set-box! my-thing 17)
    (* 6 (unbox my-thing))

An important thing to note is that `set-box!` is much like `display`
etc, it returns a value that is not printed in the Racket REPL, because
there is no point in using the result of a `set-box!`, it is called for
the side-effect it generates. (Languages like C blur this distinction
between returning a value and a side-effect with its assignment
statement.)

As a side note, we now have side effects of two kinds: mutation of
state, and I/O (at least the O part). (Actually, there is also infinite
looping that can be viewed as another form of a side effect.) This means
that we're now in a completely different world, and lots of new things
can make sense now. A few things that you should know about:

* We never used more than one expression in a function body because
  there was no point in it, but now there is. To evaluate a sequence of
  Racket expressions, you wrap them in a `begin` expression.

* In most places you don't actually need to use `begin` --- these are
  places that are said to have an *implicit* `begin`: the body of a
  function (or any lambda expression), the body of a `let` (and
  `let`-relatives), the consequence positions in `cond`, `match`, and
  `cases` clauses and more. One of the common places where a `begin` is
  used is in an `if` expression (and some people prefer using `cond`
  instead when there is more than a single expression).

* `cond` without an `else` in the end can make sense, if all you're
  using it it for is side-effects.

* `if` could get a single expression which is executed when the
  condition is true (and an unspecified value is used otherwise), but
  our language (as well as the default Racket language) always forbids
  this --- there are convenient special forms for a one-sided `if`s:
  `when` & `unless`, and they can have any number of expressions (they
  have an implicit `begin`). They have an advantage of saying "this code
  does some side-effects here" more explicit.

* There is a function called `for-each` which is just like `map`, except
  that it doesn't collect the list of results, it is used only for
  performing side effects.

* Aliasing and the concept of "object equality": `equal?` vs `eq?`. For
  example:

      (: foo : (Boxof ...) (Boxof ...) -> ...)
      (define (foo a b)
        (set-box! a 1)) ;*** this might change b, can check `eq?`

When any one of these things is used (in Racket or other languages), you
can tell that side-effects are involved, because there is no point in
any of them otherwise. In addition, any name that ends with a `!`
("bang") is used to mark a function that changes state (usually a
function that only changes state).

So how do we create a cycle? Simple, boxes can have any value, and they
can be put in other values like lists, so we can do this:

    #lang pl untyped
    (define foo (list 1 (box 3)))
    (set-box! (second foo) foo)

and we get a circular value. (Note how it is printed.) And with types:

    #lang pl
    (: foo : (List Number (Boxof Any)))
    (define foo (list 1 (box 3)))
    (set-box! (second foo) foo)


------------------------------------------------------------------------
# Types for Boxes [Tuesday, February 25th]

Obviously, `Any` is not too great --- it is the most generic type, so it
provides the least information. For example, notice that

    (unbox (second foo))

returns the right list, which is equal to `foo` itself --- but if we try
to grab some part of the resulting list:

    (second (unbox (second foo)))

we get a type error, because the result of the `unbox` is `Any`, so
Typed Racket knows nothing about it, and won't allow you to treat it as
a list. It is not too surprising that the type constructor that can help
in this case is `Rec` which we have already seen --- it allows a type
that can refer to itself:

    #lang pl
    (: foo : (Rec this (List Number (Boxof (U #f this)))))
    (define foo (list 1 (box #f)))
    (set-box! (second foo) foo)

Note that either `foo` or the value in the box are both printed with a
`Rec` type --- the value in the box can't just have a `(U #f this)`
type, since `this` doesn't mean anything in there, so the whole type
needs to still be present.

There is another issue to be aware of with `Boxof` types. For most type
constructors (like `Listof`), if `T1` is a subtype of `T2`, then we also
know that`(Listof T1)` is a subtype of `(Listof T2)`. This makes the
following code typecheck:

    #lang pl
    (: foo : (Listof Number) -> Number)
    (define (foo l)
      (first l))
    (: bar : Integer -> Number)
    (define (bar x)
      (foo (list x)))

Since the `(Listof Integer)` is a subtype of the `(Listof Number)` input
for `foo`, the application typechecks. But this is *not* the same for
the output type, for example --- if we change the `bar` type to:

    (: bar : Integer -> Integer)

we get a type error since `Number` is not a subtype of `Integer`. So
subtypes are required to "go higher" on the input side and "lower" on
the other. So, in a sense, the fact that boxes are mutable means that
their contents can be considered to be on the other side of the arrow,
which is why for such `T1` subtype of `T2`, it is `(Boxof T2)` that is a
subtype of `(Boxof T1)`, instead of the usual. For example, this doesn't
work:

    #lang pl
    (: foo : (Boxof Number) -> Number)
    (define (foo b)
      (unbox b))
    (: bar : Integer -> Number)
    (define (bar x)
      (: b : (Boxof Integer))
      (define b (box x))
      (foo b))                   ;***

And you can see why this is the case --- the marked line is fine given a
`Number` contents, so if the type checker allows passing in a box
holding an integer, then that expression would mutate the contents and
make it an invalid value.

However, boxes are not only mutable, they hold a value that can be read
too, which means that they're on *both* sides of the arrow, and this
means that `(Boxof T1)` is a subtype of `(Boxof T2)` if `T2` is a
subtype of `T1` *and* `T1` is a subtype of `T2` --- in other words, this
happens only when `T1` and `T2` are the same type. (See below for an
extended demonstration of all of this.)

Note also that this demonstration requires that extra `b` definition, if
it's skipped:

    (define (bar x)
      (foo (box x)))

then this will typecheck again --- Typed Racket will just consider the
context that requires a box holding a `Number`, and it is still fine to
initialize such a box with an `Integer` value.

> As a side comment, this didn't always work. Earlier in its existence,
> Typed Racket would always choose a specific type for values, which
> would lead to confusing errors with boxes. For example, the above
> would need to be written as
>
>     (define (bar x)
>       (foo (box (ann x : Number))))
>
> to prevent Typed Racket from inferring a specific type. This is no
> longer the case, but there can still be some surprises. A similar
> annotation was needed in the case of a list holding a self-referential
> box, to avoid the initial `#f` from getting a specific-but-wrong type.

> Another way to see the problem these days is to enter the following
> expressions and see what types Typed Racket guesses for them:
>
>     > (define a 0)
>     > (define b (box 0))
>     > a
>     - : Integer [more precisely: Zero]         ;***
>     0
>     > b
>     - : (Boxof Integer)                        ;***
>     '#&0
>
> Note that for `a`, the assigned type is *very* specific, because Typed
> Racket assumes that it will not change. But with a boxed value, using
> a type of `(Boxof Zero)` would lead to a useless box, since it'll only
> allow using `set-box!` with `0`, and therefore can never change. This
> shows that this is exactly that: a guess given the lack or explicit
> user-specified type, so there's no canonical guess that can be
> inferred here.


------------------------------------------------------------------------
# `Boxof`'s Lack of Subtyping [Tuesday, February 25th]

The lack of any subtype relations between `(Boxof T)` and `(Boxof S)`
regardless of `S` and `T` can roughly be explained as follows.

First, a box is a container that you can pull a value out of --- which
makes it similar to lists. In the case of lists, we have:

    if:           S   subtype-of          T
    then: (Listof S)  subtype-of  (Listof T)

This is true for all such containers that you can pull a value out of:
if you expect to pull a `T` but you're given a container of a subtype
`S`, then things are still fine (you'll get an `S` which is also a `T`).
Such "containers" include functions that produce a value --- for
example:

    if:         S  subtype-of       T
    then:  Q -> S  subtype-of  Q -> T

However, functions also have the other side, where things are different
--- instead of a side of some *produced* value, it's the side of the
*consumed* value. We get the opposite rule there:

    if:    T       subtype-of  S
    then:  S -> Q  subtype-of  T -> Q

To see why this is right, use `Number` and `Integer` for `S` and `T`:

    if:    Integer      subtype-of  Number
    then:  Number -> Q  subtype-of  Integer -> Q

so --- if you expect a function that takes an integer, a valid *subtype*
value that I can give you is a function that takes a number. In other
words, every function that takes a number is also a function that takes
an integer, but not the other way.

To summarize all of this, when you make the output type of a function
"smaller" (more constrained), the resulting type is smaller (a subset),
but on the input side things are flipped --- a bigger input type means a
more constrained function.

> The technical names for these properties are: a "covariant" type is
> one that preserves the subtype relationship, and a "contravairant"
> type is one that reverses it. (Which is similar to how these terms are
> used in math.)

(Side note: this is related to the fact that in logic, `P => Q` is
roughly equivalent to `not(P) or Q` --- the left side, `P`, is inside
negation. It also explains why in `((S -> T) -> Q)` the `S` obeys the
first rule, as if it was on the right side --- because it's negated
twice.)

Now, a `(Boxof T)` is a producer of `T` when you pull a value out of the
box, but it's also a consumer of `T` when you put such a value in it.
This means that --- using the above analogy --- the `T` is on both sides
of the arrow. This means that

    if:    S subtype-of T  *and*  T subtype-of S
    then:  (Boxof S) subtype-of (Boxof T)

which is actually:

    if:           S   is-the-same-type-as         T
    then:  (Boxof S)  is-the-same-type-as  (Boxof T)

A different way to look at this conclusion is to consider the function
type of `(A -> A)`: when is it a subtype of some other `(B -> B)`? Only
when `A` is a subtype of `B` and `B` is a subtype of `A`, which means
that this happens only when `A` and `B` are the same type.

> The term for this is "nonvariant" (or "invariant"): `(A -> A)` is
> unrelated to `(B -> B)` regardless of how `A` and `B` are related. The
> only exception is, of course, when they are the same type. The
> Wikipedia entry about these puts the terms together nicely in the face
> of mutation:
>
> > Read-only data types (sources) can be covariant; write-only data
> > types (sinks) can be contravariant. Mutable data types which act as
> > both sources and sinks should be invariant.

The following piece of code makes the analogy to function types more
formally. Boxes behave as if their contents is on both sides of a
function arrow --- on the right because they're readable, and on the
left because they're writable, which the conclusion that a `(Boxof A)`
type is a subtype of itself and no other `(Boxof B)`.

    #lang pl

    ;; a type for a "read-only" box
    (define-type (Boxof/R A) = (-> A))
    ;; Boxof/R constructor
    (: box/r : (All (A) A -> (Boxof/R A)))
    (define (box/r x) (lambda () x))
    ;; we can see that (Boxof/R T1) is a subtype of (Boxof/R T2)
    ;; if T1 is a subtype of T2 (this is not surprising, since
    ;; these boxes are similar to any other container, like lists):
    (: foo1 : Integer -> (Boxof/R Integer))
    (define (foo1 b) (box/r b))
    (: bar1 : (Boxof/R Number) -> Number)
    (define (bar1 b) (b))
    (test (bar1 (foo1 123)) => 123)

    ;; a type for a "write-only" box
    (define-type (Boxof/W A) = (A -> Void))
    ;; Boxof/W constructor
    (: box/w : (All (A) A -> (Boxof/W A)))
    (define (box/w x) (lambda (new) (set! x new)))
    ;; in contrast to the above, (Boxof/W T1) is a subtype of
    ;; (Boxof/W T2) if T2 is a subtype of T1, *not* the other way
    ;; (and note how this is related to A being on the *left* side
    ;; of the arrow in the `Boxof/W' type):
    (: foo2 : Number -> (Boxof/W Number))
    (define (foo2 b) (box/w b))
    (: bar2 : (Boxof/W Integer) Integer -> Void)
    (define (bar2 b new) (b new))
    (test (bar2 (foo2 123) 456))

    ;; combining the above two into a type for a "read/write" box
    (define-type (Boxof/RW A) = (A -> A))
    ;; Boxof/RW constructor
    (: box/rw : (All (A) A -> (Boxof/RW A)))
    (define (box/rw x) (lambda (new) (let ([old x]) (set! x new) old)))
    ;; this combines the above two: `A' appears on both sides of the
    ;; arrow, so (Boxof/RW T1) is a subtype of (Boxof/RW T2) if T1
    ;; is a subtype of T2 (because there's an A on the right) *and*
    ;; if T2 is a subtype of T1 (because there's another A on the
    ;; left) -- and that can happen only when T1 and T2 are the same
    ;; type. So this is a type error:
    ;;   (: foo3 : Integer -> (Boxof/RW Integer))
    ;;   (define (foo3 b) (box/rw b))
    ;;   (: bar3 : (Boxof/RW Number) Number -> Number)
    ;;   (define (bar3 b new) (b new))
    ;;   (test (bar3 (foo3 123) 456) => 123)
    ;;   ** Expected (Number -> Number), but got (Integer -> Integer)
    ;; And this a type error too:
    ;;   (: foo3 : Number -> (Boxof/RW Number))
    ;;   (define (foo3 b) (box/rw b))
    ;;   (: bar3 : (Boxof/RW Integer) Integer -> Integer)
    ;;   (define (bar3 b new) (b new))
    ;;   (test (bar3 (foo3 123) 456) => 123)
    ;;   ** Expected (Integer -> Integer), but got (Number -> Number)
    ;; The two types must be the same for this to work:
    (: foo3 : Integer -> (Boxof/RW Integer))
    (define (foo3 b) (box/rw b))
    (: bar3 : (Boxof/RW Integer) Integer -> Integer)
    (define (bar3 b new) (b new))
    (test (bar3 (foo3 123) 456) => 123)


------------------------------------------------------------------------
# Implementing a Circular Environment [Tuesday, February 25th]

We now use this to implement `rec` in the following way:

1. Change environments so that instead of values they hold boxes of
   values: `(Boxof VAL)` instead of `VAL`, and whenever `lookup` is
   used, the resulting boxed value is unboxed,

2. In the `WRec` case, create the new environment with some temporary
   binding for the identifier --- any value will do since it should not
   be used (when named expressions are always `fun` expressions),

3. Evaluate the expression in the new environment,

4. Change the binding of the identifier (the box) to the result of this
   evaluation.

The resulting definition is:

    (: extend-rec : Symbol FLANG ENV -> ENV)
    ;; extend an environment with a new binding that is the result of
    ;; evaluating an expression in the same environment as the extended
    ;; result
    (define (extend-rec id expr rest-env)
      (let ([new-cell (box (NumV 42))])
        (let ([new-env (Extend id new-cell rest-env)])
          (let ([value (eval expr new-env)])
            (set-box! new-cell value)
            new-env))))

Racket has another `let` relative for such cases of multiple-nested
`let`s --- `let*`. This form is a derived form --- it is defined as a
shorthand for using nested `let`s. The above is therefore exactly the
same as this code:

    (: extend-rec : Symbol FLANG ENV -> ENV)
    ;; extend an environment with a new binding that is the result of
    ;; evaluating an expression in the same environment as the extended
    ;; result
    (define (extend-rec id expr rest-env)
      (let* ([new-cell (box (NumV 42))]
             [new-env  (Extend id new-cell rest-env)]
             [value    (eval expr new-env)])
        (set-box! new-cell value)
        new-env))

This `let*` form can be read almost as a C/Java-ish kind of code:

    fun extend_rec(id, expr, rest_env) {
      new_cell  = new NumV(42);
      new_env   = new ExtendedEnv(id, new_cell, rest_env);
      value     = eval(expr, new_env);
      *new_cell = value;
      return new_env;
    }

The code can be simpler if we fold the evaluation into the `set-box!`
(since `value` is used just there), and if use `lookup` to do the
mutation --- since this way there is no need to hold onto the box. This
is a bit more expensive, but since the binding is guaranteed to be the
first one in the environment, the addition is just one quick step. The
only binding that we need is the one for the new environment, which we
can do as an internal definition, leaving us with:

    (: extend-rec : Symbol FLANG ENV -> ENV)
    (define (extend-rec id expr rest-env)
      (define new-env (Extend id (box (NumV 42)) rest-env))
      (set-box! (lookup id new-env) (eval expr new-env))
      new-env)

A complete rehacked version of FLANG with a `rec` binding follows. We
can't test `rec` easily since we have no conditionals, but you can at
least verify that

    (run "{rec {f {fun {x} {call f x}}} {call f 0}}")

is an infinite loop.

    ;;; ---<<<FLANG-BOX>>>----------------------------------------------
    #lang pl

    (define-type FLANG
      [Num  Number]
      [Add  FLANG FLANG]
      [Sub  FLANG FLANG]
      [Mul  FLANG FLANG]
      [Div  FLANG FLANG]
      [Id   Symbol]
      [With Symbol FLANG FLANG]
      [WRec Symbol FLANG FLANG]
      [Fun  Symbol FLANG]
      [Call FLANG FLANG])

    (: parse-sexpr : Sexpr -> FLANG)
    ;; parses s-expressions into FLANGs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons (or 'with 'rec) more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [(list 'rec (list (symbol: name) named) body)
            (WRec name (parse-sexpr named) (parse-sexpr body))]
           [(cons x more)
            (error 'parse-sexpr "bad `~s' syntax in ~s" x sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(list '+ lhs rhs) (Add (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '* lhs rhs) (Mul (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '/ lhs rhs) (Div (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call (parse-sexpr fun) (parse-sexpr arg))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> FLANG)
    ;; parses a string containing a FLANG expression to a FLANG AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Types for environments, values, and a lookup function

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol (Boxof VAL) ENV])

    (define-type VAL
      [NumV Number]
      [FunV Symbol FLANG ENV])

    (: lookup : Symbol ENV -> (Boxof VAL))
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id boxed-val rest-env)
         (if (eq? id name) boxed-val (lookup name rest-env))]))

    (: extend-rec : Symbol FLANG ENV -> ENV)
    ;; extend an environment with a new binding that is the result of
    ;; evaluating an expression in the same environment as the extended
    ;; result
    (define (extend-rec id expr rest-env)
      (define new-env (Extend id (box (NumV 42)) rest-env))
      (set-box! (lookup id new-env) (eval expr new-env))
      new-env)

    (: NumV->number : VAL -> Number)
    ;; convert a FLANG runtime numeric value to a Racket one
    (define (NumV->number val)
      (cases val
        [(NumV n) n]
        [else (error 'arith-op "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (NumV->number val1) (NumV->number val2))))

    (: eval : FLANG ENV -> VAL)
    ;; evaluates FLANG expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Add l r) (arith-op + (eval l env) (eval r env))]
        [(Sub l r) (arith-op - (eval l env) (eval r env))]
        [(Mul l r) (arith-op * (eval l env) (eval r env))]
        [(Div l r) (arith-op / (eval l env) (eval r env))]
        [(With bound-id named-expr bound-body)
         (eval bound-body
               (Extend bound-id (box (eval named-expr env)) env))]
        [(WRec bound-id named-expr bound-body)
         (eval bound-body
               (extend-rec bound-id named-expr env))]
        [(Id name) (unbox (lookup name env))]
        [(Fun bound-id bound-body)
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (box (eval arg-expr env)) f-env))]
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]))

    (: run : String -> Number)
    ;; evaluate a FLANG program contained in a string
    (define (run str)
      (let ([result (eval (parse str) (EmptyEnv))])
        (cases result
          [(NumV n) n]
          [else (error 'run "evaluation returned a non-number: ~s"
                       result)])))

    ;; tests
    (test (run "{call {fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3}
                        {fun {y} {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)


------------------------------------------------------------------------
# Variable Mutation [Tuesday, February 25th]

> [PLAI §12] and [PLAI §13] (different: adds boxes to the language)
>
> [PLAI §14] (that's what we do)

The code that we now have implements recursion by *changing* bindings,
and to make that possible we made environments hold boxes for all
bindings, therefore bindings are *all* mutable now. We can use this to
add more functionality to our evaluator, by allowing changing any
variable --- we can add a `set!` form:

    {set! <id> <FLANG>}

to the evaluator that will modify the value of a variable. To implement
this functionality, all we need to do is to use `lookup` to retrieve
some box, then evaluate the expression and put the result in that box.
The actual implementation is left as a homework exercise.

One thing that should be considered here is --- all of the expressions
in our language evaluate to some value, the question is what should be
the value of a `set!` expression? There are three obvious choices:

1. return some bogus value,

2. return the value that was assigned,

3. return the value that was previously in the box.

Each one of these has its own advantage --- for example, C uses the
second option to `chain` assignments (eg, `x = y = 0`) and to allow side
effects where an expression is expected (eg, `while (x = x-1) ...`).

The third one is useful in cases where you might use the old value that
is overwritten --- for example, if C had this behavior, you could `pop`
a value from a linked list using something like:

    first(stack = rest(stack));

because the argument to `first` will be the old value of `stack`, before
it changed to be its `rest`. You could also swap two variables in a
single expression: `x = y = x`.

(Note that the expression `x = x + 1` has the meaning of C's `++x` when
option (2) is used, and `x++` when option (3) is used.)

Racket chooses the first option, and we will do the same in our
language. The advantage here is that you get no discounts, therefore you
must be explicit about what values you want to return in situations
where there is no obvious choice. This leads to more robust programs
since you do not get other programmers that will rely on a feature of
your code that you did not plan on.

In any case, the modification that introduces mutation is small, but it
has a tremendous effect on our language: it was true for Racket, and it
is true for FLANG. We have seen how mutation affects the language subset
that we use, and in the extension of our FLANG the effect is even
stronger: since *any* variable can change (no need for explicit
`box`es). In other words, a binding is not always the same --- in can
change as a result of a `set!` expression. Of course, we could extend
our language with boxes (using Racket boxes to implement FLANG boxes),
but that will be a little more verbose.

> Note that Racket does have a `set!` form, and in addition, fields in
> structs can be made modifiable. However, we do not use any of these.
> At least not for now.


------------------------------------------------------------------------
# State and Environments [Tuesday, February 25th]

A quick example of how mutation can be used:

    (define counter
      (let ([counter (box 0)])
        (lambda ()
          (set-box! counter (+ 1 (unbox counter)))
          (unbox counter))))

and compare that to:

    (define (make-counter)
      (let ([counter (box 0)])
        (lambda ()
          (set-box! counter (+ 1 (unbox counter)))
          (unbox counter))))

It is a good idea if you follow the exact evaluation of

    (define foo (make-counter))
    (define bar (make-counter))

and see how both bindings have separate environment so each one gets its
own private state. The equivalent code in the homework interpreter
extended with `set!` doesn't need boxes:

    {with {make-counter
            {fun {}
              {with {counter 0}
                {fun {}
                  {set! counter {+ counter 1}}
                  counter}}}}
      {with {foo {call make-counter}}
        {with {bar {call make-counter}}
          ...}}}

To see multiple values from a single expression you can extend the
language with a `list` binding. As a temporary hack, we can use dummy
function inputs to cover for our lack of nullary functions, and use
`with` (with dummy bound ids) to sequence multiple expressions:

    {with {make-counter
            {fun {init}
              {with {counter init}
                {fun {_}
                  {with {_ {set! counter {+ counter 1}}}
                    counter}}}}}
      {with {foo {call make-counter 0}}
        {with {bar {call make-counter 1}}
          {+ {+ {call foo 0} {+ {* 10 {call foo 0}}
                                {* 100 {call foo 0}}}}
             {* 10000 {+ {call bar 0} {+ {* 10 {call bar 0}}
                                         {* 100 {call bar 0}}}}}}}}}

Note that we cannot describe this behavior with substitution rules! We
now use the environments to make it possible to change bindings --- so
finally an environment is actually an environment rather than a
substitution cache.

When you look at the above, note that we still use lexical scope --- in
fact, the local binding is actually a private state that nobody can
access. For example, if we write this:

    (define counter
      (let ([counter (box 0)])
        (lambda ()
          (set-box! counter (+ 1 (unbox counter)))
          (if (zero? (modulo (unbox counter) 4)) 'tock 'tick))))

then the resulting function that us bound to `counter` keeps a local
integer state which no other code can access --- you cannot modify it,
reset it, or even know if it is really an integer that is used in there.


------------------------------------------------------------------------
# Implementing Objects with State [Tuesday, February 25th]

We have already seen how several pieces of information can be
encapsulate in a Racket closure that keeps them all; now we can do a
little more --- we can actually have mutable state, which leads to a
natural way to implement objects. For example:

    (define (make-point x y)
      (let ([xb (box x)]
            [yb (box y)])
        (lambda (msg)
          (match msg
            ['getx (unbox xb)]
            ['gety (unbox yb)]
            ['incx (set-box! xb (add1 (unbox xb)))]))))

implements a constructor for `point` objects which keep two values and
can move one of them. Note that the messages act as a form of methods,
and that the values themselves are hidden and are accessible only
through the interface that these messages make. For example, if these
points correspond to some graphic object on the screen, we can easily
incorporate a necessary screen update:

    (define (make-point x y)
      (let ([xb (box x)]
            [yb (box y)])
        (lambda (msg)
          (match msg
            ['getx (unbox xb)]
            ['gety (unbox yb)]
            ['incx (set-box! xb (add1 (unbox xb)))
                   (update-screen)]))))

and be sure that this is always done when the value changes --- since
there is no way to change the value except through this interface.

A more complete example would define functions that actually send these
messages --- here is a better implementation of a point object and the
corresponding accessors and mutators:

    (define (make-point x y)
      (let ([xb (box x)]
            [yb (box y)])
        (lambda (msg)
          (match msg
            ['getx (unbox xb)]
            ['gety (unbox yb)]
            [(list 'setx newx)
             (set-box! xb newx)
             (update-screen)]
            [(list 'sety newy)
             (set-box! yb newy)
             (update-screen)]))))
    (define (point-x p) (p 'getx))
    (define (point-y p) (p 'gety))
    (define (set-point-x! p x) (p (list 'setx x)))
    (define (set-point-y! p y) (p (list 'sety y)))

And a quick imitation of inheritance can be achieved using delegation to
an instance of the super-class:

    (define (make-colored-point x y color)
      (let ([p (make-point x y)])
        (lambda (msg)
          (match msg
            ['getcolor color]
            [else (p msg)]))))

You can see how all of these could come from some preprocessing of a
more normal-looking class definition form, like:

    (defclass point (x y)
      (public (getx) x)
      (public (gety) y)
      (public (setx new) (set! x newx))
      (public (setx new) (set! x newx)))

    (defclass colored-point point (c)
      (public (getcolor) c))


------------------------------------------------------------------------
# The Toy Language [Tuesday, February 25th]

> Not in PLAI

A quick note: from now on, we will work with a variation of our language
--- it will change the syntax to look a little more like Racket, and we
will use Racket values for values in our language and Racket functions
for built-ins in our language.

Main highlights:

* There can be multiple bindings in function arguments and local `bind`
  forms --- the names are required to be distinct.

* There are now a few keywords like `bind` that are parsed in a special
  way. Other forms are taken as function application, which means that
  there are no special parse rules (and AST entries) for arithmetic
  functions. They're now bindings in the global environment, and treated
  in the same way as all bindings. For example, `*` is an expression
  that evaluates to the *primitive* multiplication function, and
  `{bind {{+ *}} {+ 2 3}}` evaluates to `6`.

* Since function applications are now the same for primitive functions
  and user-bound functions, there is no need for a `call` keyword. Note
  that the function call part of the parser must be last, since it
  should apply only if the input is not some other known form.

* Note the use of `make-untyped-list-function`: it's a library function
  (included in the course language) that can convert a few known Racket
  functions to a function that consumes a list of *any* Racket values,
  and returns the result of applying the given Racket function on these
  values. For example:

      (define add (make-untyped-list-function +))
      (add (list 1 2 3 4))

  evaluates to `10`.

* Another important aspect of this is its type --- the type of `add` in
  the previous example is `(List -> Any)`, so the resulting function can
  consume *any* input values. If it gets a bad value, it will throw an
  appropriate error. This is a hack: it basically means that the
  resulting `add` function has a very generic type (requiring just a
  list), so errors can be thrown at run-time. However, in this case, a
  better solution is not going to make these run-time errors go away
  because the language that we're implementing is not statically typed.

* The benefit of this is that we can avoid the hassle of more verbose
  code by letting these functions dynamically check the input values, so
  we can use a single `RktV` variant in `VAL` which wraps any Racket
  value. (Otherwise we'd need different wrappers for different types,
  and implement these dynamic checks.)

The following is the complete implementation.

    ;;; ---<<<TOY>>>----------------------------------------------------
    #lang pl

    ;;; ----------------------------------------------------------------
    ;;; Syntax

    #| The BNF:
       <TOY> ::= <num>
               | <id>
               | { bind {{ <id> <TOY> } ... } <TOY> }
               | { fun { <id> ... } <TOY> }
               | { if <TOY> <TOY> <TOY> }
               | { <TOY> <TOY> ... }
    |#

    ;; A matching abstract syntax tree datatype:
    (define-type TOY
      [Num  Number]
      [Id   Symbol]
      [Bind (Listof Symbol) (Listof TOY) TOY]
      [Fun  (Listof Symbol) TOY]
      [Call TOY (Listof TOY)]
      [If   TOY TOY TOY])

    (: unique-list? : (Listof Any) -> Boolean)
    ;; Tests whether a list is unique, guards Bind and Fun values.
    (define (unique-list? xs)
      (or (null? xs)
          (and (not (member (first xs) (rest xs)))
               (unique-list? (rest xs)))))

    (: parse-sexpr : Sexpr -> TOY)
    ;; parses s-expressions into TOYs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'bind more)
         (match sexpr
           [(list 'bind (list (list (symbol: names) (sexpr: nameds))
                              ...)
              body)
            (if (unique-list? names)
              (Bind names (map parse-sexpr nameds) (parse-sexpr body))
              (error 'parse-sexpr "duplicate `bind' names: ~s" names))]
           [else (error 'parse-sexpr "bad `bind' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: names) ...) body)
            (if (unique-list? names)
              (Fun names (parse-sexpr body))
              (error 'parse-sexpr "duplicate `fun' names: ~s" names))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'if more)
         (match sexpr
           [(list 'if cond then else)
            (If (parse-sexpr cond)
                (parse-sexpr then)
                (parse-sexpr else))]
           [else (error 'parse-sexpr "bad `if' syntax in ~s" sexpr)])]
        [(list fun args ...) ; other lists are applications
         (Call (parse-sexpr fun)
               (map parse-sexpr args))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> TOY)
    ;; Parses a string containing an TOY expression to a TOY AST.
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;;; ----------------------------------------------------------------
    ;;; Values and environments

    (define-type ENV
      [EmptyEnv]
      [FrameEnv FRAME ENV])

    ;; a frame is an association list of names and values.
    (define-type FRAME = (Listof (List Symbol VAL)))

    (define-type VAL
      [RktV  Any]
      [FunV  (Listof Symbol) TOY ENV]
      [PrimV ((Listof VAL) -> VAL)])

    (: extend : (Listof Symbol) (Listof VAL) ENV -> ENV)
    ;; extends an environment with a new frame.
    (define (extend names values env)
      (if (= (length names) (length values))
        (FrameEnv (map (lambda ([name : Symbol] [val : VAL])
                         (list name val))
                       names values)
                  env)
        (error 'extend "arity mismatch for names: ~s" names)))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, frame by frame,
    ;; return its value or throw an error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(FrameEnv frame rest)
         (let ([cell (assq name frame)])
           (if cell
             (second cell)
             (lookup name rest)))]))

    (: unwrap-rktv : VAL -> Any)
    ;; helper for `racket-func->prim-val': unwrap a RktV wrapper in
    ;; preparation to be sent to the primitive function
    (define (unwrap-rktv x)
      (cases x
        [(RktV v) v]
        [else (error 'racket-func "bad input: ~s" x)]))

    (: racket-func->prim-val : Function -> VAL)
    ;; converts a racket function to a primitive evaluator function
    ;; which is a PrimV holding a ((Listof VAL) -> VAL) function.
    ;; (the resulting function will use the list function as is,
    ;; and it is the list function's responsibility to throw an error
    ;; if it's given a bad number of arguments or bad input types.)
    (define (racket-func->prim-val racket-func)
      (define list-func (make-untyped-list-function racket-func))
      (PrimV (lambda (args)
               (RktV (list-func (map unwrap-rktv args))))))

    ;; The global environment has a few primitives:
    (: global-environment : ENV)
    (define global-environment
      (FrameEnv (list (list '+ (racket-func->prim-val +))
                      (list '- (racket-func->prim-val -))
                      (list '* (racket-func->prim-val *))
                      (list '/ (racket-func->prim-val /))
                      (list '< (racket-func->prim-val <))
                      (list '> (racket-func->prim-val >))
                      (list '= (racket-func->prim-val =))
                      ;; values
                      (list 'true  (RktV #t))
                      (list 'false (RktV #f)))
                (EmptyEnv)))

    ;;; ----------------------------------------------------------------
    ;;; Evaluation

    (: eval : TOY ENV -> VAL)
    ;; evaluates TOY expressions
    (define (eval expr env)
      ;; convenient helper
      (: eval* : TOY -> VAL)
      (define (eval* expr) (eval expr env))
      (cases expr
        [(Num n)   (RktV n)]
        [(Id name) (lookup name env)]
        [(Bind names exprs bound-body)
         (eval bound-body (extend names (map eval* exprs) env))]
        [(Fun names bound-body)
         (FunV names bound-body env)]
        [(Call fun-expr arg-exprs)
         (define fval (eval* fun-expr))
         (define arg-vals (map eval* arg-exprs))
         (cases fval
           [(PrimV proc) (proc arg-vals)]
           [(FunV names body fun-env)
            (eval body (extend names arg-vals fun-env))]
           [else (error 'eval "function call with a non-function: ~s"
                        fval)])]
        [(If cond-expr then-expr else-expr)
         (eval* (if (cases (eval* cond-expr)
                      [(RktV v) v] ; Racket value => use as boolean
                      [else #t])   ; other values are always true
                  then-expr
                  else-expr))]))

    (: run : String -> Any)
    ;; evaluate a TOY program contained in a string
    (define (run str)
      (let ([result (eval (parse str) global-environment)])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

    ;;; ----------------------------------------------------------------
    ;;; Tests

    (test (run "{{fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}} {add3 1}}")
          => 4)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}
                       {add1 {fun {x} {+ x 1}}}}
                  {bind {{x 3}} {add1 {add3 x}}}}")
          => 7)
    (test (run "{bind {{identity {fun {x} x}}
                       {foo {fun {x} {+ x 1}}}}
                  {{identity foo} 123}}")
          => 124)
    (test (run "{bind {{x 3}}
                  {bind {{f {fun {y} {+ x y}}}}
                    {bind {{x 5}}
                      {f 4}}}}")
          => 7)
    (test (run "{{{fun {x} {x 1}}
                  {fun {x} {fun {y} {+ x y}}}}
                 123}")
          => 124)

    ;; More tests for complete coverage
    (test (run "{bind x 5 x}")      =error> "bad `bind' syntax")
    (test (run "{fun x x}")         =error> "bad `fun' syntax")
    (test (run "{if x}")            =error> "bad `if' syntax")
    (test (run "{}")                =error> "bad syntax")
    (test (run "{bind {{x 5} {x 5}} x}") =error> "duplicate*bind*names")
    (test (run "{fun {x x} x}")     =error> "duplicate*fun*names")
    (test (run "{+ x 1}")           =error> "no binding for")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{1 2}")             =error> "with a non-function")
    (test (run "{{fun {x} x}}")     =error> "arity mismatch")
    (test (run "{if {< 4 5} 6 7}")  => 6)
    (test (run "{if {< 5 4} 6 7}")  => 7)
    (test (run "{if + 6 7}")        => 6)
    (test (run "{fun {x} x}")       =error> "returned a bad value")

    ;;; ----------------------------------------------------------------


------------------------------------------------------------------------
# Compilation [Tuesday, March 11th]

Instead of interpreting an expression, which is performing a full
evaluation, we can think about *compiling* it: translating it to a
different language which we can later run more easily, more efficiently,
on more platforms, etc. Another feature that is usually associated with
compilation is that a lot more work was done at the compilation stage,
making the actual running of the code faster.

For example, translating an AST into one that has de-Bruijn indexes
instead of identifier names is a form of compilation --- not only is it
translating one language into another, it does the work involved in name
lookup before the program starts running.

This is something that we can experiment with now. An easy way to
achieve this is to start with our evaluation function:

    (: eval : TOY ENV -> VAL)
    ;; evaluates TOY expressions
    (define (eval expr env)
      ;; convenient helper
      (: eval* : TOY -> VAL)
      (define (eval* expr) (eval expr env))
      (cases expr
        [(Num n)   (RktV n)]
        [(Id name) (lookup name env)]
        [(Bind names exprs bound-body)
         (eval bound-body (extend names (map eval* exprs) env))]
        [(Fun names bound-body)
         (FunV names bound-body env)]
        [(Call fun-expr arg-exprs)
         (define fval (eval* fun-expr))
         (define arg-vals (map eval* arg-exprs))
         (cases fval
           [(PrimV proc) (proc arg-vals)]
           [(FunV names body fun-env)
            (eval body (extend names arg-vals fun-env))]
           [else (error 'eval "function call with a non-function: ~s"
                        fval)])]
        [(If cond-expr then-expr else-expr)
         (eval* (if (cases (eval* cond-expr)
                      [(RktV v) v] ; Racket value => use as boolean
                      [else #t])   ; other values are always true
                  then-expr
                  else-expr))]))

and change it so it compiles a given expression to a Racket function.
(This is, of course, just to demonstrate a conceptual point, it is only
the tip of what compilers actually do...) This means that we need to
turn it into a function that receives a TOY expression and compiles it.
In other words, `eval` no longer consumes and environment argument which
makes sense because the environment is a place to hold run-time values,
so it is a data structure that is not part of the compiler (it is
usually represented as the call stack).

So we split the two arguments into a compile-time and run-time, which
can be done by simply currying the `eval` function --- here this is
done, and all calls to `eval` are also curried:

    (: eval : TOY -> ENV -> VAL) ;*** note the curried type
    ;; evaluates TOY expressions
    (define (eval expr)
      (lambda (env)
        ;; convenient helper
        (: eval* : TOY -> VAL)
        (define (eval* expr) ((eval expr) env))
        (cases expr
          [(Num n)   (RktV n)]
          [(Id name) (lookup name env)]
          [(Bind names exprs bound-body)
           ((eval bound-body) (extend names (map eval* exprs) env))]
          [(Fun names bound-body)
           (FunV names bound-body env)]
          [(Call fun-expr arg-exprs)
           (define fval (eval* fun-expr))
           (define arg-vals (map eval* arg-exprs))
           (cases fval
             [(PrimV proc) (proc arg-vals)]
             [(FunV names body fun-env)
              ((eval body) (extend names arg-vals fun-env))]
             [else (error 'eval "function call with a non-function: ~s"
                          fval)])]
          [(If cond-expr then-expr else-expr)
           (eval* (if (cases (eval* cond-expr)
                        [(RktV v) v] ; Racket value => use as boolean
                        [else #t])   ; other values are always true
                    then-expr
                    else-expr))])))

We also need to change the `eval` call in the main `run` function:

    (: run : String -> Any)
    ;; evaluate a TOY program contained in a string
    (define (run str)
      (let ([result ((eval (parse str)) global-environment)])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

Not much has changed so far.

Note that in the general case of a compiler we need to run a program
several times, so we'd want to avoid parsing it over and over again. We
can do that by keeping a single parsed AST of the input. Now we went one
step further by making it possible to do more work ahead and keep the
result of the first "stage" of eval around (except that "more work" is
really not saying much at the moment):

    (: run : String -> Any)
    ;; evaluate a TOY program contained in a string
    (define (run str)
      (let* ([compiled (eval (parse str))]
             [result   (compiled global-environment)])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

At this point, even though our "compiler" is not much more than a
slightly different representation of the same functionality, we rename
`eval` to `compile` which is a more appropriate description of what we
intend it to do (so we change the purpose statement too):

    (: compile : TOY -> ENV -> VAL)
    ;; compiles TOY expressions to Racket functions.
    (define (compile expr)
      (lambda (env)
        (: compile* : TOY -> VAL)
        (define (compile* expr) ((compile expr) env))
        (cases expr
          [(Num n)   (RktV n)]
          [(Id name) (lookup name env)]
          [(Bind names exprs bound-body)
           ((compile bound-body)
            (extend names (map compile* exprs) env))]
          [(Fun names bound-body)
           (FunV names bound-body env)]
          [(Call fun-expr arg-exprs)
           (define fval (compile* fun-expr))
           (define arg-vals (map compile* arg-exprs))
           (cases fval
             [(PrimV proc) (proc arg-vals)]
             [(FunV names body fun-env)
              ((compile body) (extend names arg-vals fun-env))]
             [else (error 'call ; this is *not* a compilation error
                          "function call with a non-function: ~s"
                          fval)])]
          [(If cond-expr then-expr else-expr)
           (compile* (if (cases (compile* cond-expr)
                           [(RktV v) v] ; Racket value => use as boolean
                           [else #t])   ; other values are always true
                       then-expr
                       else-expr))])))

    (: run : String -> Any)
    ;; evaluate a TOY program contained in a string
    (define (run str)
      (let* ([compiled (compile (parse str))]
             [result   (compiled global-environment)])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

------------------------------------------------------------------------

Not much changed, still. We curried the `eval` function and renamed it
to `compile`. But when we actually call compile almost nothing happens
--- all it does is create a Racket closure which will do the rest of the
work. (And this closure closes over the given expression.)

Running this "compiled" code is going to be very much like the previous
usage of `eval`, except a little *slower*, because now every recursive
call involves calling `compile` to generate a closure, which is then
immediately used --- so we just added some allocations at the recursive
call points! (Actually, the extra cost is minimal because the Racket
compiler will optimize away such immediate closure applications.)

Another way to see how this is not really a compiler yet is to consider
*when* `compile` gets called. A proper compiler is something that does
all of its work *before* running the code, which means that once it
spits out the compiled code it shouldn't be used again (except for
compiling other code, of course). Our current code is not really a
compiler since it breaks this feature. (For example, if GCC behaved this
way, then it would "compile" files by producing code that invokes GCC to
compile the next step, which, when run, invokes GCC again, etc.)

However, the conceptual change is substantial --- we now have a function
that does its work in two stages --- the first part gets an expression
and *can* do some compile-time work, and the second part does the
run-time work, and includes anything inside the (lambda (env) ...). The
thing is that so far, the code does nothing at the compilation stage
(remember: only creates a closure). But because we have two stages, we
can now shift work from the second stage (the run-time) to the first
(the compile-time).

For example, consider the following simple example:

    #lang pl

    (: foo : Number Number -> Number)
    (define (foo x y)
      (* x y))

    (: bar : Number -> Number)
    (define (bar c)
      (: loop : Number Number -> Number)
      (define (loop n acc)
        (if (< 0 n)
            (loop (- n 1) (+ (foo c n) acc))
            acc))
      (loop 4000000000 0))

    (time (bar 0))

We can do the same thing here --- separate `foo` it into two stages
using currying, and modify `bar` appropriately:

    #lang pl

    (: foo : Number -> Number -> Number)
    (define (foo x)
      (lambda (y)
        (* x y)))

    (: bar : Number -> Number)
    (define (bar c)
      (: loop : Number Number -> Number)
      (define (loop n acc)
        (if (< 0 n)
            (loop (- n 1) (+ ((foo c) n) acc))
            acc))
      (loop 4000000000 0))

    (time (bar 0))

Now instead of a simple multiplication, lets expand it a little, for
example, do a case split on common cases where `x` is `0`, `1`, or `2`:

    (: foo : Number -> Number -> Number)
    (define (foo x)
      (lambda (y)
        (cond [(= x 0) 0]
              [(= x 1) y]
              [(= x 2) (+ y y)] ; assume that this is faster
              [else (* x y)])))

This is not much faster, since Racket already optimizes multiplication
in a similar way.

Now comes the real magic: deciding what branch of the cond to take
depends *only* on x, so we can `push` the lambda inside:

    (: foo : Number -> Number -> Number)
    (define (foo x)
      (cond [(= x 0) (lambda (y) 0)]
            [(= x 1) (lambda (y) y)]
            [(= x 2) (lambda (y) (+ y y))]
            [else (lambda (y) (* x y))]))

We just made an improvement --- the comparisons for the common cases are
now done as soon as (foo x) is called, they're not delayed to when the
resulting function is used. Now go back to the way this is used in `bar`
and make it call `foo` once for the given `c`:

    #lang pl

    (: foo : Number -> Number -> Number)
    (define (foo x)
      (cond [(= x 0) (lambda (y) 0)]
            [(= x 1) (lambda (y) y)]
            [(= x 2) (lambda (y) (+ y y))]
            [else (lambda (y) (* x y))]))

    (: bar : Number -> Number)
    (define (bar c)
      (define foo-c (foo c))
      (: loop : Number Number -> Number)
      (define (loop n acc)
        (if (< 0 n)
            (loop (- n 1) (+ (foo-c n) acc))
            acc))
      (loop 4000000000 0))

    (time (bar 0))

Now foo-c is generated once, and if `c` happens to be one of the three
common cases (as in the last expression), we can avoid doing any
multiplication. (And if we hit the default case, then we're doing the
same thing we did before.)

[However, the result runs at roughly the same speed! This heavily
depends on what kind of optimizations the compiler can do, in this case,
optimizing multiplications (which are essentially a single machine-code
instruction) vs optimizing multiple-stage function calls.]

Here is another useful example that demonstrates this:

    (define (foo list)
      (map (lambda (n) (if ...something... E1 E2))
           list))

    -->

    (define (foo list)
      (map (if ...something...
             (lambda (n) E1)
             (lambda (n) E2))
           list))

(Question: when can you do that?)

This is not unique to Racket, it can happen in any language. Racket (or
any language with first class function values) only makes it easy to
create a local function that is specialized for the flag.

------------------------------------------------------------------------

Getting our thing closer to a compiler is done in a similar way --- we
push the `(lambda (env) ...)` inside the various cases. (Note that
`compile*` depends on the `env` argument, so it also needs to move
inside --- this is done for all cases that use it, and will eventually
go away.) We actually need to use `(lambda ([env : ENV]) ...)` though,
to avoid upsetting the type checker:

    (: compile : TOY -> ENV -> VAL)
    ;; compiles TOY expressions to Racket functions.
    (define (compile expr)
      (cases expr
        [(Num n)   (lambda ([env : ENV]) (RktV n))]
        [(Id name) (lambda ([env : ENV]) (lookup name env))]
        [(Bind names exprs bound-body)
         (lambda ([env : ENV])
           (: compile* : TOY -> VAL)
           (define (compile* expr) ((compile expr) env))
           ((compile bound-body)
            (extend names (map compile* exprs) env)))]
        [(Fun names bound-body)
         (lambda ([env : ENV]) (FunV names bound-body env))]
        [(Call fun-expr arg-exprs)
         (lambda ([env : ENV])
           (: compile* : TOY -> VAL)
           (define (compile* expr) ((compile expr) env))
           (define fval (compile* fun-expr))
           (define arg-vals (map compile* arg-exprs))
           (cases fval
             [(PrimV proc) (proc arg-vals)]
             [(FunV names body fun-env)
              ((compile body) (extend names arg-vals fun-env))]
             [else (error 'call ; this is *not* a compilation error
                          "function call with a non-function: ~s"
                          fval)]))]
        [(If cond-expr then-expr else-expr)
         (lambda ([env : ENV])
           (: compile* : TOY -> VAL)
           (define (compile* expr) ((compile expr) env))
           (compile* (if (cases (compile* cond-expr)
                           [(RktV v) v] ; Racket value => use as boolean
                           [else #t])   ; other values are always true
                       then-expr
                       else-expr)))]))

and with this we shifted a bit of actual work to compile time --- the
code that checks what structure we have, and extracts its different
slots. But this is still not good enough --- it's only the first
top-level `cases` that is moved to compile-time --- recursive calls to
`compile` are still there in the resulting closures. This can be seen by
the fact that we have those calls to `compile` in the Racket closures
that are the results of our compiler, which, as discussed above, mean
that it's not an actual compiler yet.

For example, consider the `Bind` case:

    [(Bind names exprs bound-body)
     (lambda ([env : ENV])
       (: compile* : TOY -> VAL)
       (define (compile* expr) ((compile expr) env))
       ((compile bound-body)
        (extend names (map compile* exprs) env)))]

At compile-time we identify and deconstruct the Bind structure, then
create a the runtime closure that will access these parts when the code
runs. But this closure will itself call `compile` on `bound-body` and
each of the expressions. Both of these calls can be done at compile
time, since they only need the expressions --- they don't depend on the
environment. Note that `compile*` turns to `run` here, since all it does
is run a compiled expression on the current environment.

    [(Bind names exprs bound-body)
     (define compiled-body (compile bound-body))
     (define compiled-exprs (map compile exprs))
     (lambda ([env : ENV])
       (: run : (ENV -> VAL) -> VAL)
       (define (run compiled-expr) (compiled-expr env))
       (compiled-body (extend names
                              (map run compiled-exprs)
                              env)))]

We can move it back up, out of the resulting functions, by making it a
function that consumes an environment and returns a "caller" function:

    (define (compile expr)
      ;; convenient helper
      (: caller : ENV -> (ENV -> VAL) -> VAL)
      (define (caller env)
        (lambda (compiled) (compiled env)))
      (cases expr
        ...
        [(Bind names exprs bound-body)
         (define compiled-body (compile bound-body))
         (define compiled-exprs (map compile exprs))
         (lambda ([env : ENV])
           (compiled-body (extend names
                                  (map (caller env) compiled-exprs)
                                  env)))]
        ...))

Once this is done, we have a bunch of work that can happen at compile
time: we pre-scan the main "bind spine" of the code.

We can deal in a similar way with other occurrences of `compile` calls
in compiled code. The two branches that need to be fixed are:

1. In the `If` branch, there is not much to do. After we make it
   pre-compile the `cond-expr`, we also need to make it pre-compile both
   the `then-expr` and the `else-expr`. This might seem like doing more
   work (since before changing it only one would get compiled), but
   since this is compile-time work, then it's not as important. Also,
   `if` expressions are evaluated many times (being part of a loop, for
   example), so overall we still win.

2. The `Call` branch is a little trickier: the problem here is that the
   expressions that are compiled are coming from the closure that is
   being applied. The solution for this is obvious: we need to change
   the closure type so that it closes over *compiled* expressions
   instead of over plain ones. This makes sense because closures are
   run-time values, so they need to close over the compiled expressions
   since this is what we use as "code" at run-time.

Again, the goal is to have no `compile` calls that happen at runtime:
they should all happen before that. This would allow, for example, to
obliterate the compiler once it has done its work, similar to how you
don't need GCC to run a C application. Yet another way to look at this
is that we shouldn't look at the AST at runtime --- again, the analogy
to GCC is that the AST is a data structure that the compiler uses, and
it does not exist at runtime. Any runtime reference to the TOY AST is,
therefore, as bad as any runtime reference to `compile`.

When we're done with this process we'll have something that is an actual
compiler: translating TOY programs into Racket closures. To see how this
is an actual compiler consider the fact that Racket uses a JIT to
translate bytecode into machine code when it's running functions. This
means that the compiled version of our TOY programs are, in fact,
translated all the way down to machine code.


------------------------------------------------------------------------
## Is this really a compiler? [Tuesday, March 11th]

Yes, it is, though it's hard to see it when we're compiling TOY code
directly to Racket closures.

Another way to see this in a more obvious way is to change the compiler
code so instead of producing a Racket closure it spits out the Racket
code that makes up these closures when evaluated.

The basic idea is to switch from a function that has code that "does
stuff", to a function that *emits* that code indtead. For example,
consider a function that computes the average of two numbers

    (define (average x y)
      (/ (+ x y) 2))

to one that instead returns the actual code

    (define (make-average-expression x y)
      (string-append "(/ (+ " x " " y ") 2)"))

It is, however, inconvenient to use strings to represent code:
S-expressions are a much better fit for representing code:

    (define (make-average-expression x y)
      (list '/ (list '+ x y) 2))

This is still tedious though, since the clutter of `list`s and quotes
makes it hard to see the actual code that is produced. It would be nice
if we could quote the whole thing instead:

    (define (make-average-expression x y)
      '(/ (+ x y) 2))

but that's wrong since we don't want to include the `x` and `y` symbols
in the result, but rather their values. Racket (and all other lisp
dialects) have a tool for that: `quasiquote`. In code, you just use a
backquote `` ` `` instead of a `'`, and then you can unquote parts of
the quasi-quoted code using `,` (which is called `unquote`). (Later in
the course we'll talk about these "`` ` ``"s and "`,`"s more.)

So the above becomes:

    (define (make-average-expression x y)
      `(/ (+ ,x ,y) 2))

Note that this would work fine if `x` and `y` are numbers, but they're
now essentially arguments that hold *expression* values (as
S-expressions). For example, see what you get with:

    (make-average-expression 3 `(/ 8 2))

Back to the compiler, we change the closure-generating compiler code

    [(Bind names exprs bound-body)
     (define compiled-body (compile bound-body))
     (define compiled-exprs (map compile exprs))
     (lambda ([env : ENV])
       (compiled-body (extend ...)))]

into

    [(Bind names exprs bound-body)
     (define compiled-body (compile bound-body))
     (define compiled-exprs (map compile exprs))
     `(lambda ([env : ENV])
        (,compiled-body (extend ...)))]

Doing this systematically would result in something that is more clearly
a compiler: the result of `compile` would be an S-expression that you
can then paste in the Racket file to run it.

An example of this idea taken seriously is the graal + truffle
combination for implementing fast JIT compiled languages:
* https://medium.com/graalvm/writing-truly-memory-safe-jit-compilers-f79ad44558dd


------------------------------------------------------------------------
# Lazy Evaluation: Using a Lazy Racket [Tuesday, March 18th]

> [PLAI §7] (done with Haskell)

For this part, we will use a new language, Lazy Racket.

    #lang pl lazy

As the name suggests, this is a version of the normal (untyped) Racket
language that is lazy.

First of all, let's verify that this is indeed a lazy language:

    > (define (foo x) 3)
    > (foo (+ 1 "2"))
    3

That went without a problem --- the argument expression was indeed not
evaluated. In this language, you can treat all expressions as future
`promises` to evaluate. There are certain points where such promises are
actually `forced`, all of these stem from some need to print a resulting
value, in our case, it's the REPL that prints such values:

    > (+ 1 "2")
    +: expects type <number> as 2nd argument,
    given: "2"; other arguments were: 1

The expression by itself only generates a promise, but when we want to
print it, this promise is forced to evaluate --- this forces the
addition, which forces its arguments (plain values rather than
computation promises), and at this stage we get an error. (If we never
want to see any results, then the language will never do anything at
all.) So a promise is forced either when a value printout is needed, or
if it is needed to recursively compute a value to print:

    > (* 1 (+ 2 "3"))
    +: expects type <number> as 2nd argument,
    given: "3"; other arguments were: 2

Note that the error was raised by the internal expression: the outer
expression uses `*`, and `+` requires actual values not promises.

Another example, which is now obvious, is that we can now define an `if`
function:

    > (define (my-if x y z) (if x y z))
    > (my-if (< 1 2) 3 (+ 4 "5"))
    3

Actually, in this language `if`, `and`, and `or` are all function values
instead of special forms:

    > (list if and or)
    (#<procedure:if> #<procedure:and> #<procedure:or>)
    > ((third (list if and or)) #t (+ 1 "two"))
    #t

(By now, you should know that these have no value in Racket --- using
them like this in plain will lead to syntax errors.) There are some
primitives that do not force their arguments. Constructors fall in this
category, for example `cons` and `list`:

    > (define (fib n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))
    > (define a (list (+ 1 2) (+ 3 "4") (fib 30) (* 5 6)))

Nothing --- the definition simply worked, but that's expected, since
nothing is printed. If we try to inspect this value, we can get some of
its parts, provided we do not force the bogus one:

    > (first a)
    3
    > (fourth a)
    30
    > (third a)
    196418
    > (second a)
    +: contract violation, expected: number?, given: "4" ...

The same holds for cons:

    > (second (cons 1 (cons 2 (first null))))
    2

Now if this is the case, then how about this:

    > (define ones (cons 1 ones))

Everything is fine, as expected --- but what is the value of `ones` now?
Clearly, it is a list that has 1 as its first element:

    > (first ones)
    1

But what do we have in the tail of this list? We have `ones` which we
already know is a list that has 1 in its first place --- so following
Racket's usual rules, it means that the second element of `ones` is,
again, 1. If we continue this, we can see that `ones` is, in fact, an
*infinite* list of 1s:

    > (second ones)
    1
    > (fifth ones)
    1

In this sense, the way `define` behaves is that it defines a true
equation: if ones is defined as (cons 1 ones), then the real value does
satisfy

    (equal? ones (cons 1 ones))

which means that the value is the fixpoint of the defined expression.

We can use `append` in a similar way:

    > (define foo (append (list 1 2 3) foo))
    > (fourth foo)
    1

This looks like it has some common theme with the discussion of
implementing recursive environments --- it actually demonstrates that in
this language, `letrec` can be used for *simple* values too. First of
all, a side note --- here an expression that indicated a bug in our
substituting evaluator:

    > (let ([x (list y)])
        (let ([y 1])
          x))
    reference to undefined identifier: y

When our evaluator returned `1` for this, we noticed that this was a
bug: it does not obey the lexical scoping rules. As seen above, Lazy
Racket is correctly using lexical scope. Now we can go back to the use
of `letrec` --- what do we get by this definition:

    > (define twos (let ([xs (cons 2 xs)]) xs))

we get an error about `xs` being undefined.

`xs` is unbound because of the usual scope that `let` uses. How can we
make this work? --- We simply use `letrec`:

    > (define twos (letrec ([xs (cons 2 xs)]) xs))
    > (first twos)
    2

As expected, if we try to print an infinite list will cause an infinite
loop, which DrRacket catches and prints in that weird way:

    > twos
    #0=(2 . #0#)

How would we inspect an infinite list? We write a function that returns
part of it:

    > (define (take n l)
        (if (or (<= n 0) (null? l))
          null
          (cons (first l) (take (sub1 n) (rest l)))))
    > (take 10 twos)
    (2 2 2 2 2 2 2 2 2 2)
    > (define foo (append (list 1 2 3) foo))
    > (take 10 foo)
    (1 2 3 1 2 3 1 2 3 1)

Dealing with infinite lists can lead to lots of interesting things, for
example:

    > (define fibs (cons 1 (cons 1 (map + fibs (rest fibs)))))
    > (take 10 fibs)
    (1 1 2 3 5 8 13 21 34 55)

To see how it works, see what you know about `fibs[n]` which will be our
notation for the nth element of `fibs` (starting from `1`):

    fibs[1] = 1  because of the first `cons'
    fibs[2] = 1  because of the second `cons'

and for all `n>2`:

    fibs[n] = (map + fibs (rest fibs))[n-2]
            = fibs[n-2] + (rest fibs)[n-2]
            = fibs[n-2] + fibs[n-2+1]
            = fibs[n-2] + fibs[n-1]

so it follows the exact definition of Fibonacci numbers.

Note that the list examples demonstrate that laziness applies to nested
values (actually, nested computations) too: a value that is not needed
is not computed, even if it is *contained* in a value that is needed.
For example, in:

    (define x (/ 1 0))
    (if (list (+ 1 x)) 1 2)

the `if` needs to know only whether its first argument (note: it *is* an
*argument*, since this `if` is a function) is `#f` or not. Once it is
determined that it is a pair (a `cons` cell), there is no need to
actually look at the values inside the pair, and therefore `(+ 1 x)`
(and more specifically, `x`) is never evaluated and we see no error.


------------------------------------------------------------------------
# Lazy Evaluation: Some Issues [Tuesday, March 18th]

There are a few issues that we need to be aware of when we're dealing
with a lazy language. First of all, remember that our previous attempt
at lazy evaluation has made

    {with {x y}
      {with {y 1}
        x}}

evaluate to 1, which does not follow the rules of lexical scope. This is
*not* a problem with lazy evaluation, but rather a problem with our
naive implementation. We will shortly see a way to resolve this problem.
In the meanwhile, remember that when we try the same in Lazy Racket we
do get the expected error:

    > (let ([x y])
        (let ([y 1])
          x))
    reference to undefined identifier: y

A second issue is a subtle point that you might have noticed when we
played with Lazy Racket: for some of the list values we have see a "`.`"
printed. This is part of the usual way Racket displays an *improper
list* --- any list that does not terminate with a null value. For
example, in plain Racket:

    > (cons 1 2)
    (1 . 2)
    > (cons 1 (cons 2 (cons 3 4)))
    (1 2 3 . 4)

In the dialect that we're using in this course, this is not possible.
The secret is that the `cons` that we use first checks that its second
argument is a proper list, and it will raise an error if not. So how
come Lazy Racket's `cons` is not doing the same thing? The problem is
that to know that something is a proper list, we will have to force it,
which will make it not behave like a constructor.

> As a side note, we can achieve some of this protection if we don't
> insist on immediately checking the second argument completely, and
> instead we do the check when needed --- lazily:
>
>     (define (safe-cons x l)
>       (cons x (if (pair? l) l (error "poof"))))

Finally, there are two consequences of using a lazy language that make
it more difficult to debug (or at lease take some time to get used to).
First of all, control tends to flow in surprising ways. For example,
enter the following into DrRacket, and run it in the normal language
level for the course:

                     (define (foo3 x)
                       (/ x "1"))
    (define (foo2 x)
      (foo3 x))
                     (define (foo1 x)
                       (list (foo2 x)))
    (define (foo0 x)
      (first (foo1 x)))

             (+ 1 (foo0 3))

In the normal language level, we get an error, and red arrows that show
us how where in the computation the error was raised. The arrows are all
expected, except that `foo2` is not in the path --- why is that?
Remember the discussion about tail-calls and how they are important in
Racket since they are the only tool to generate loops? This is what
we're seeing here: `foo2` calls `foo3` in a tail position, so there is
no need to keep the `foo2` context anymore --- it is simply replaced by
`foo3`. (Incidentally, there is also no arrow that goes through `foo1`:
Racket does some smart inlining, and it figures out that `foo0`+`foo1`
are simply returning the same value, so it skips `foo1`.)

Now switch to Lazy Racket and re-run --- you'll see no arrows at all.
What's the problem? The call of `foo0` creates a promise that is forced
in the top-level expression, that simply returns the `first` of the
`list` that `foo1` created --- and all of that can be done without
forcing the `foo2` call. Going this way, the computation is finally
running into an error *after* the calls to `foo0`, `foo1`, and `foo2`
are done --- so we get the seemingly out-of-context error.

To follow what's happening here, we need to follow how promise are
forced: when we have code like

    > (define (foo x) (/ x 0))
    > (foo 9)

then the `foo` call is a *strict point*, since we need an actual value
to display on the REPL. Since it's in a strict position, we do the call,
but when we're in the function there is no need to compute the division
result --- so it is returned as a lazy promise value back to the
toplevel. It is only then that we continue the process of getting an
actual value, which leads to trying to compute the division and get the
error.

Finally, there are also potential problems when you're not careful about
memory use. A common technique when using a lazy language is to generate
an infinite list and pull out its Nth element. For example, to compute
the Nth Fibonacci number, we've seen how we can do this:

    (define fibs (cons 1 (cons 1 (map + fibs (rest fibs)))))
    (define (fib n) (list-ref fibs n))

and we can also do this (reminder: `letrec` is the same as an internal
definition):

    (define (fib n)
      (letrec ([fibs (cons 1 (cons 1 (map + fibs (rest fibs))))])
        (list-ref fibs n))) ; tail-call => no need to keep `fibs`

but the problem here is that when `list-ref` is making its way down the
list, it might still hold a reference to `fibs`, which means that as the
list is forced, all intermediate values are held in memory. In the first
of these two, this is guaranteed to happen since we have a binding that
points at the head of the `fibs` list. With the second form things can
be confusing: it might be that our language implementation is smart
enough to see that `fibs` is not really needed anymore and release the
offending reference. If it isn't, then we'd have to do something like

    (define (fib n)
      (list-ref
       (letrec ([fibs (cons 1 (cons 1 (map + fibs (rest fibs))))])
         fibs)
       n))

to eliminate it. But even if the implementation does know that there is
no need for that reference, there are other tricky situations that are
hard to avoid.

Side note: Racket didn't use to do this optimization, but now it does,
and the lazy language helped in clarifying more cases where references
should be released. To see that, consider these two variants:

    (define (nat1 n)
      (define nats (cons 1 (map add1 nats)))
      (if (number? (list-ref nats n))
        "a number"
        "not a number"))

    ;; we want to provide some information: show the first element
    (define (nat2 n)
      (define nats (cons 1 (map add1 nats)))
      (if (number? (list-ref nats n))
        "a number"
        (error 'nat "the list starting with ~s is broken"
               (first nats))))

If we try to use them with a big input:

    (nat1 300000) ; or with nat2

then `nat1` would work fine, whereas `nat2` will likely run into
DrRacket's memory limit and the computation will be terminated. The
problem is that `nat2` uses the `nats` value *after* the `list-ref`
call, which will make a reference to the head of the list, preventing it
from being garbage-collected while `list-ref` is `cdr`-ing down the list
and making more cons cells materialize.

It's still possible to show the extra information though --- just save
it:

    ;; we want to provide some information: show the first element
    (define (nat3 n)
      (define nats (cons 1 (map add1 nats)))
      (define fst (first nats))
      (if (number? (list-ref nats n))
        "a number"
        (error 'nat "the list starting with ~s is broken" fst)))

It looks like it's spending a redundant runtime cycle in the extra
computation, but it's a lazy language so this is not a problem.


------------------------------------------------------------------------
# Lazy Evaluation: Shell Examples [Tuesday, March 18th]

There is a very simple and elegant principle in shell programming --- we
get a single data type, a character stream, and many small functions,
each doing a single simple job. With these small building blocks, we can
construct more sequences that achieve more complex tasks, for example
--- a sorted frequency table of lines in a file:

    sort foo | uniq -c | sort -nr

This is very much like a programming language --- we get small blocks,
and build stuff out of them. Of course there are swiss army knives like
awk that try to do a whole bunch of stuff, (the same attitude that
brought Perl to the world...) and even these respect the "stream" data
type. For example, a simple `{ print $1 }` statement will work over all
lines, one by one, making it a program over an infinite input stream,
which is what happens in reality in something like:

    cat /dev/console | awk ...

But there is something else in shell programming that makes so
effective: it is implementing a sort of a lazy evaluation. For example,
compare this:

    cat foo | awk '{ print $1+$2; }' | uniq

to:

    cat foo | awk '{ print $1+$2; }' | uniq | head -5

Each element in the pipe is doing its own small job, and it is always
doing just enough to feed its output. Each basic block is designed to
work even on infinite inputs! (Even sort works on unlimited inputs...)
(Soon we will see a stronger connection with lazy evaluation.)

> Side note: (Alan Perlis) "It is better to have 100 functions operate
> on one data structure than 10 functions on 10 data structures"... But
> the uniformity comes at a price: the biggest problem shells have is in
> their lack of a recursive structure, contaminating the world with way
> too many hacked up solutions. More than that, it is extremely
> inefficient and usually leads to data being re-parsed over and over
> and over --- each small Unix command needs to always output stuff that
> is human readable, but the next command in the pipe will need to
> re-parse that, eg, rereading decimal numbers. If you look at pipelines
> as composing functions, then a pipe of numeric commands translates to
> something like:
>
>     itoa(baz(atoi(itoa(bar(atoi(itoa(foo(atoi(inp)))))))))
>
> and it is impossible to get rid of the redundant `atoi(itoa(...))`s.


------------------------------------------------------------------------
# Lazy Evaluation: Programming Examples [Tuesday, March 18th]

We already know that when we use lazy evaluation, we are guaranteed to
have more robust programs. For example, a function like:

    (define (my-if x y z)
      (if x y z))

is completely useless in Racket because all functions are eager, but in
a lazy language, it would behave exactly like the real if. Note that we
still need *some* primitive conditional, but this primitive can be a
function (and it is, in Lazy Racket).

But we get more than that. If we have a lazy language, then
*computations* are pushed around as if they were values (computations,
because these are expressions that are yet to be evaluated). In fact,
there is no distinction between computations and values, it just happens
that some values contain "computational promises", things that will do
something in the future.

To see how this happens, we write a simple program to compute the
(infinite) list of prime numbers using the sieve of Eratosthenes. To do
this, we begin by defining the list of all natural numbers:

    (define nats (cons 1 (map add1 nats)))

And now define a `sift` function: it receives an integer `n` and an
infinite list of integers `l`, and returns a list without the numbers
that can be divided by `n`. This is simple to write using `filter`:

    (define (sift n l)
      (filter (lambda (x) (not (divides? n x))) l))

and it requires a definition for `divides?` --- we use Racket's `modulo`
for this:

    (define (divides? n m)
      (zero? (modulo m n)))

Now, a `sieve` is a function that consumes a list that begins with a
prime number, and returns the prime numbers from this list. To do this,
it returns a list that has the same first number, and for its tail it
sifts out numbers that are divisible by the first from the original
list's tail, and calls itself recursively on the result:

    (define (sieve l)
      (cons (first l) (sieve (sift (first l) (rest l)))))

Finally, the list of prime numbers is the result of applying `sieve` on
the list of numbers from `2`. The whole program is now:

    #lang pl lazy

    (define nats (cons 1 (map add1 nats)))

    (define (divides? n m)
      (zero? (modulo m n)))

    (define (sift n l)
      (filter (lambda (x) (not (divides? n x))) l))

    (define (sieve l)
      (cons (first l) (sieve (sift (first l) (rest l)))))

    (define primes (sieve (rest nats)))

To see how this runs, we trace `modulo` to see which tests are being
used. The effect of this is that each time `divides?` is actually
required to return a value, we will see a line with its inputs, and its
output. This output looks quite tricky --- things are computed only on a
"need to know" basis, meaning that debugging lazy programs can be
difficult, since things happen when they are needed which takes time to
get used to. However, note that the program actually performs the same
tests that you'd do using any eager-language implementation of the sieve
of Eratosthenes, and the advantage is that we don't need to decide in
advance how many values we want to compute --- all values will be
computed when you want to see the corresponding result. Implementing
*this* behavior in an eager language is more difficult than a simple
program, yet we don't need such complex code when we use lazy
evaluation.

Note that if we trace `divides?` we see results that are some promise
struct --- these are unevaluated expressions, and they point at the fact
that when `divides?` is used, it doesn't really force its arguments ---
this happens later when these results are forced.

The analogy with shell programming using pipes should be clear now ---
for example, we have seen this:

    cat foo | awk '{ print $1+$2; }' | uniq | head -5

The last `head -5` means that no computation is done on parts of the
original file that are not needed. It is similar to a `(take 5 l)`
expression in Lazy Racket.


------------------------------------------------------------------------
## Side Note: Similarity to Generators and Channels [Tuesday, March 18th]

Using infinite lists is similar to using channels --- a tool for
synchronizing threads and (see a [Rob Pike's talk]), and generators (as
they exist in Python). Here are examples of both, note how similar they
both are, and how similar they are to the above definition of `primes`.
(But note that there is an important difference, can you see it? It has
to be with whether a stream is reusable or not.)

[Rob Pike's talk]: https://www.youtube.com/watch?v=hB05UFqOtFA&t=1680s

First, the threads & channels version:

    #lang racket

    (define-syntax-rule (bg expr ...) (thread (lambda () expr ...)))

    (define nats
      (let ([out (make-channel)])
        (define (loop i) (channel-put out i) (loop (add1 i)))
        (bg (loop 1))
        out))

    (define (divides? n m)
      (zero? (modulo m n)))

    (define (filter pred c)
      (define out (make-channel))
      (define (loop)
        (let ([x (channel-get c)])
          (when (pred x) (channel-put out x))
          (loop)))
      (bg (loop))
      out)

    (define (sift n c)
      (filter (lambda (x) (not (divides? n x))) c))

    (define (sieve c)
      (define out (make-channel))
      (define (loop c)
        (define first (channel-get c))
        (channel-put out first)
        (loop (sift first c)))
      (bg (loop c))
      out)

    (define primes
      (begin (channel-get nats) (sieve nats)))

    (define (take n c)
      (if (zero? n) null (cons (channel-get c) (take (sub1 n) c))))

    (take 10 primes)

And here is the generator version:

    #lang racket

    (require racket/generator)

    (define nats
      (generator ()
        (define (loop i)
          (yield i)
          (loop (add1 i)))
        (loop 1)))

    (define (divides? n m)
      (zero? (modulo m n)))

    (define (filter pred g)
      (generator ()
        (define (loop)
          (let ([x (g)])
            (when (pred x) (yield x))
            (loop)))
        (loop)))

    (define (sift n g)
      (filter (lambda (x) (not (divides? n x))) g))

    (define (sieve g)
      (define (loop g)
        (define first (g))
        (yield first)
        (loop (sift first g)))
      (generator () (loop g)))

    (define primes
      (begin (nats) (sieve nats)))

    (define (take n g)
      (if (zero? n) null (cons (g) (take (sub1 n) g))))

    (take 10 primes)


------------------------------------------------------------------------
# Call by Need vs Call by Name [Tuesday, March 18th]

Finally, note that on requiring different parts of the `primes`, the
same calls are not repeated. This indicates that our language implements
"call by need" rather than "call by name": once an expression is forced,
its value is remembered, so subsequent usages of this value do not
require further computations.

Using "call by name" means that we actually use expressions which can
lead to confusing code. An old programming language that used this is
Algol. A confusing example that demonstrates this evaluation strategy
is:

    #lang algol60
    begin
      integer procedure SIGMA(x, i, n);
        value n;
        integer x, i, n;
      begin
        integer sum;
        sum := 0;
        for i := 1 step 1 until n do
          sum := sum + x;
        SIGMA := sum;
      end;
      integer q;
      printnln(SIGMA(q*2-1, q, 7));
    end

`x` and `i` are arguments that are passed by name, which means that they
can use the same memory location. This is called *aliasing*, a problem
that happens when pointers are involved (eg, pointers in C and
`reference` arguments in C++). The code, BTW, is called "Jensen's
device".


------------------------------------------------------------------------
# Example of Feature Embedding [Tuesday, March 18th]

Another interesting behavior that we can now observe, is that the TOY
evaluation rule for `with`:

    eval({with {x E1} E2}) = eval(E2[eval(E1)/x])

is specifying an eager evaluator *only if* the language that this rule
is written in is itself eager. Indeed, if we run the TOY interpreter in
Lazy Racket (or other interpreters we have implemented), we can verify
that running:

    (run "{bind {{x {/ 1 0}}} 1}")

is perfectly fine --- the call to Racket's division is done in the
evaluation of the TOY division expression, but since Lazy Racket is
lazy, then if this value is never used then we never get to do this
division! On the other hand, if we evaluate

    (run "{bind {{x {/ 1 0}}} {+ x 1}}")

we do get an error when DrRacket tries to display the result, which
forces strictness. Note how the arrows in DrRacket that show where the
computation is are quite confusing: the computation seem to go directly
to the point of the arithmetic operations (`arith-op`) since the rest of
the evaluation that the evaluator performed was already done, and
succeeded. The actual failure happens when we try to force the resulting
promise which contains only the strict points in our code.


------------------------------------------------------------------------
# Implementing Laziness (in plain Racket) [Tuesday, March 18th]

> [PLAI §8]

Generally, we know how lazy evaluation works when we use the
substitution model. We even know that if we have:

    {bind {{x y}}
      {bind {{y 2}}
        {+ x y}}}

then the result should be an error because we cannot substitute the
binding of `x` into the body expression because it will capture the `y`
--- changing the binding structure. As an indication, the original
expression contains a free reference to `y`, which is exactly why we
shouldn't substitute it. But what about:

    {bind {{x {+ 4 5}}}
      {bind {{y {+ x x}}}
        {bind {{z y}}
          {bind {{x 4}}
            z}}}}

Evaluating this eagerly returns 18, we therefore expect any other
evaluation (eager or lazy, using substitutions or environments) to
return 18 too, because any of these options should not change the
meaning of numbers, of addition, *or* of the scoping rules. (And we know
that no matter what evaluation strategy we choose, if we get to a value
(no infinite loop or exception) then it'll always be the same value.)
For example, try using lazy evaluation with substitutions:

    {bind {{x {+ 4 5}}}
      {bind {{y {+ x x}}}
        {bind {{z y}}
          {bind {{x 4}}
            z}}}}
    -->
    {bind {{y {+ {+ 4 5} {+ 4 5}}}}
      {bind {{z y}}
        {bind {{x 4}}
          z}}}
    -->
    {bind {{z {+ {+ 4 5} {+ 4 5}}}}
      {bind {{x 4}}
        z}}
    -->
    {bind {{x 4}}
      {+ {+ 4 5} {+ 4 5}}}
    -->
    {+ {+ 4 5} {+ 4 5}}
    -->
    {+ 9 9}
    -->
    18

And what about lazy evaluation using environments:

    {bind {{x {+ 4 5}}}
      {bind {{y {+ x x}}}
        {bind {{z y}}
          {bind {{x 4}}
            z}}}}         []
    -->
    {bind {{y {+ x x}}}
      {bind {{z y}}
        {bind {{x 4}}
          z}}}            [x:={+ 4 5}]
    -->
    {bind {{z y}}
      {bind {{x 4}}
        z}}               [x:={+ 4 5}, y:={+ x x}]
    -->
    {bind {{x 4}}
      z}                  [x:={+ 4 5}, y:={+ x x}, z:=y]
    -->
    z                     [x:=4, y:={+ x x}, z:=y]
    -->
    y                     [x:=4, y:={+ x x}, z:=y]
    -->
    {+ x x}               [x:=4, y:={+ x x}, z:=y]
    -->
    {+ 4 4}               [x:=4, y:={+ x x}, z:=y]
    -->
    8                     [x:=4, y:={+ x x}, z:=y]

We have a problem! This problem should be familiar now, it is very
similar to the problem that led us down the mistaken path of dynamic
scoping when we tried to have first-class functions. In both cases,
substitution always worked, and it looks like in both cases the problem
is that we don't remember the environment of an expression: in the case
of functions, it is the environment at the time of creating the closure
that we want to capture and use when we go back later to evaluate the
body of the function. Here we have a similar situation, except that we
don't need a function to defer computation: *most* expressions get
evaluated at some time in the future, so every time we defer such a
computation we need to remember the lexical environment of the
expression.

This is the major point that will make things work again: every
expression creates something like a closure --- an object that closes
over an expression and an environment at the (lexical) place where that
expression was used, and when we actually want to evaluate it later, we
need to do it in the right lexical context. So it is like a closure
except it doesn't need to be applied, and there are no arguments. In
fact it is also a form of a closure --- instead of closing over a
function body and an environment, it closes over any expression and an
environment. (As we shall see, lazy evaluation is tightly related to
using nullary functions: *thunks*.)


------------------------------------------------------------------------
# Sloth: A Lazy Evaluator [Tuesday, March 18th]

So we implement this by creating such closure values for all expressions
that are not evaluated right now. We begin with the Toy language, and
rename it to "Sloth". We then add one more case to the data type of
values which implements the new kind of expression closures, which
contains the expression and its environment:

    (define-type VAL
      [RktV  Any]
      [FunV  (Listof Symbol) SLOTH ENV]
      [ExprV                 SLOTH ENV] ;*** new: expression and scope
      [PrimV ((Listof VAL) -> VAL)])

(Intuition#1: `ExprV` is a delayed evaluation and therefore it has the
two values that are ultimately passed to `eval`. Intuition#2: laziness
can be implemented with thunks, so we hold the same information as a
`FunV` does, only there's no need for the argument names.)

Where should we use the new `ExprV`? --- At any place where we want to
be lazy and defer evaluating an expression for later. The two places in
the interpreter where we want to delay evaluation are the named
expressions in a bind form and the argument expressions in a function
application. Both of these cases use the helper `eval*` function to do
their evaluations, for example:

    [(Bind names exprs bound-body)
     (eval bound-body (extend names (map eval* exprs) env))]

To delay these evaluations, we need to change `eval*` so it returns an
expression closure instead of actually doing the evaluation --- change:

    (: eval* : SLOTH -> VAL)
    (define (eval* expr) (eval expr env))

to:

    (: eval* : SLOTH -> VAL)
    (define (eval* expr) (ExprV expr env))

Note how simple this change is --- instead of an `eval` function call,
we create a value that contains the parts that would have been used in
the `eval` function call. This value serves as a promise to do this
evaluation (the `eval` call) later, if needed. (This is exactly why a
Lazy Racket would make this a lazy evaluator: in it, *all* function
calls are promises.)

------------------------------------------------------------------------

Side note: this can be used in any case when you're using an eager
language, and you want to delay some function call --- all you need to
do is replace (using a C-ish syntax)

    int foo(int x, str y) {
      ...do some work...
    }

with

    // rename `foo':
    int real_foo(int x, str y) {
      ...same work...
    }

    // `foo' is a delayed constructor, instead of a plain function
    struct delayed_foo {
      int x;
      str y;
    }
    delayed_foo foo(int x, str y) {
      return new delayed_foo(x, y);
    }

now all calls to `foo` return a `delayed_foo` instance instead of an
integer. Whenever we want to force the delayed promise, we can use this
function:

    int force_foo(delayed_foo promise) {
      return real_foo(promise.x, promise.y);
    }

You might even want to make sure that each such promise is evaluated
exactly once --- this is simple to achieve by adding a cache field to
the struct:

    int real_foo(int x, str y) {
      ...same work...
    }

    struct delayed_foo {
      int  x;
      str  y;
      bool is_computed;
      int  result;
    }
    delayed_foo foo(int x, str y) {
      return new delayed_foo(x, y, false, 0);
    }

    int force_foo(delayed_foo promise) {
      if (!promise.is_computed) {
        promise.result = real_foo(promise.x, promise.y);
        promise.is_computed = true;
      }
      return promise.result;
    }

As we will see shortly, this corresponds to switching from a
call-by-name lazy language to a call-by-need one.

------------------------------------------------------------------------

Back to our Sloth interpreter --- given the `eval*` change, we expect
that `eval`-uating:

    {bind {{x 1}} x}

will return:

    (ExprV (Num 1) ...the-global-environment...)

and the same goes for `eval`-uating

    {{fun {x} x} 1}

Similarly, evaluating

    {bind {{x {+ 1 2}}} x}

should return

    (ExprV (Call (Id +) (Num 1) (Num 2)) ...the-global-environment...)

But what about evaluating an expression like this one:

    {bind {{x 2}}
      {+ x x}}

?

Using what we have so far, we will get to evaluate the body, which is a
(Call ...) expression, but when we evaluate the arguments for this
function call, we will get `ExprV` values --- so we will not be able to
perform the addition. Instead, we will get an error from the function
that `racket-func->prim-val` creates, due to the value being an `ExprV`
instead of a `RktV`.

What we really want is to actually add two *values*, not promises. So
maybe distinguish the two applications --- treat `PrimV` differently
from `FunV` closures?

    (: eval* : SLOTH -> VAL)
    (define (eval* expr) (ExprV expr env))
    (: real-eval* : SLOTH -> VAL)
    (define (real-eval* expr) (eval expr env))
    (cases expr
      ...
      [(Call fun-expr arg-exprs)
       (define fval (eval fun-expr env))
       ;; move: (define arg-vals (map eval* arg-exprs))
       (cases fval
         [(PrimV proc) (proc (map real-eval* arg-exprs))] ; change
         [(FunV names body fun-env)
          (eval body (extend names (map eval* arg-exprs) fun-env))]
         ...)]
      ...)

This still doesn't work --- the problem is that the function now gets a
bunch of values, where some of these can still be `ExprV`s because the
evaluation itself can return such values... Another way to see this
problem is to consider the code for evaluating an `If` conditional
expression:

    [(If cond-expr then-expr else-expr)
     (eval* (if (cases (real-eval* cond-expr)
                  [(RktV v) v] ; Racket value => use as boolean
                  [else #t])   ; other values are always true
              then-expr
              else-expr))]

...we need to take care of a possible `ExprV` here. What should we do?
The obvious solution is to use `eval` if we get an `ExprV` value:

    [(If cond-expr then-expr else-expr)
     (eval* (if (cases (real-eval* cond-expr)
                  [(RktV v) v] ; Racket value => use as boolean
                  [(ExprV expr env) (eval expr env)] ; force a promise
                  [else #t])   ; other values are always true
              then-expr
              else-expr))]

Note how this translates back the data structure that represents a
delayed `eval` promise back into a real `eval` call...

Going back to our code for `Call`, there is a problem with it --- the

    (define (real-eval* expr) (eval expr env))

will indeed evaluate the expression instead of lazily deferring this to
the future, but this evaluation might itself return such lazy values. So
we need to inspect the resulting value again, forcing the promise if
needed:

    (define (real-eval* expr)
      (let ([val (eval expr env)])
        (cases val
          [(ExprV expr env) (eval expr env)]
          [else val])))

But we *still* have a problem --- programs can get an arbitrarily long
nested chains of `ExprV`s that get forced to other `ExprV`s.

    {bind {{x true}}
      {bind {{y x}}
        {bind {{z y}}
          {if z
            {foo}
            {bar}}}}}

What we really need is to write a loop that keeps forcing promises over
and over until it gets a proper non-`ExprV` value.

    (: strict : VAL -> VAL)
    ;; forces a (possibly nested) ExprV promise,
    ;; returns a VAL that is not an ExprV
    (define (strict val)
      (cases val
        [(ExprV expr env) (strict (eval expr env))] ; loop back
        [else val]))

Note that it's close to `real-eval*`, but there's no need to mix it with
`eval`. The recursive call is important: we can never be sure that
`eval` didn't return an `ExprV` promise, so we have to keep looping
until we get a "real" value.

Now we can change the evaluation of function calls to something more
manageable:

    [(Call fun-expr arg-exprs)
     (define fval (strict (eval* fun-expr)))        ;*** strict!
     (define arg-vals (map eval* arg-exprs))
     (cases fval
       [(PrimV proc) (proc (map strict arg-vals))]  ;*** strict!
       [(FunV names body fun-env)
        (eval body (extend names arg-vals fun-env))]
       [else (error 'eval "function call with a non-function: ~s"
                    fval)])]

The code is fairly similar to what we had previously --- the only
difference is that we wrap a `strict` call where a proper value is
needed --- the function value itself, and arguments to primitive
functions.

The `If` case is similar (note that it doesn't matter if `strict` is
used with the result of `eval` or `eval*` (which returns an `ExprV`)):

    [(If cond-expr then-expr else-expr)
     (eval* (if (cases (strict (eval* cond-expr))
                  [(RktV v) v] ; Racket value => use as boolean
                  [else #t])   ; other values are always true
              then-expr
              else-expr))]

Note that, like before, we always return `#t` for non-`RktV` values ---
this is because we know that the value there is never an `ExprV`. All we
need now to get a working evaluator, is one more strictness point: the
outermost point that starts our evaluation --- `run` --- needs to use
`strict` to get a proper result value.

    (: run : String -> Any)
    ;; evaluate a SLOTH program contained in a string
    (define (run str)
      (let ([result (strict (eval (parse str) global-environment))])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

With this, all of the tests that we took from the Toy evaluator run
successfully. To make sure that the interpreter is lazy, we can add a
test that will fail if the language is strict:

    ;; Test laziness
    (test (run "{{fun {x} 1} {/ 9 0}}") => 1)
    (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1)
    (test (run "{bind {{x {{fun {x} {x x}} {fun {x} {x x}}}}} 1}") => 1)

[In fact, we can continue and replace all `eval` calls with `ExprV`,
leaving only the one call in `strict`. This doesn't make any difference,
because the resulting promises will eventually be forced by `strict`
anyway.]


------------------------------------------------------------------------
## Getting more from Sloth [Tuesday, March 25th]

As we've seen, using `strict` in places where we need an actual value
rather than a delayed promise is enough to get a working lazy evaluator.
Our current implementation assumes that all primitive functions need
strict values, therefore the argument values are all passed through the
`strict` function --- but this is not always the case. Specifically, if
we have constructor functions, then we don't need (and usually don't
want) to force the promises. This is basically what allows us to use
infinite lists in Lazy Racket: the fact that `list` and `cons` do not
require forcing their arguments.

To allow some primitive functions to consume strict values and some to
leave them as is, we're going to change `racket-func->prim-val` and add
a flag that indicates whether the primitive function is strict or not.
Obviously, we also need to move the `strict` call around arguments to a
primitive function application into the `racket-func->prim-val`
generated function --- which simplifies the `Call` case in `eval` (we go
from (proc (map strict arg-vals)) back to (proc arg-vals)). The new code
for `racket-func->prim-val` and its helper is:

    (: unwrap-rktv : VAL -> Any)
    ;; helper for `racket-func->prim-val': strict and unwrap a RktV
    ;; wrapper in preparation to be sent to the primitive function
    (define (unwrap-rktv x)
      (let ([s (strict x)])
        (cases s
          [(RktV v) v]
          [else (error 'racket-func "bad input: ~s" s)])))

    (: racket-func->prim-val : Function Boolean -> VAL)
    ;; converts a racket function to a primitive evaluator function ...
    (define (racket-func->prim-val racket-func strict?)
      (define list-func (make-untyped-list-function racket-func))
      (PrimV (lambda (args)
               (let ([args (if strict?
                             (map unwrap-rktv args)
                             args)])  ;*** use values as is!
                 (RktV (list-func args))))))

We now need to annotate the primitives in the global environment, as
well as add a few constructors:

    ;; The global environment has a few primitives:
    (: global-environment : ENV)
    (define global-environment
      (FrameEnv (list (list '+ (racket-func->prim-val + #t))
                      (list '- (racket-func->prim-val - #t))
                      (list '* (racket-func->prim-val * #t))
                      (list '/ (racket-func->prim-val / #t))
                      (list '< (racket-func->prim-val < #t))
                      (list '> (racket-func->prim-val > #t))
                      (list '= (racket-func->prim-val = #t))
                      ;; note flags:
                      (list 'cons  (racket-func->prim-val cons  #f))
                      (list 'list  (racket-func->prim-val list  #f))
                      (list 'first (racket-func->prim-val car   #t)) ;**
                      (list 'rest  (racket-func->prim-val cdr   #t)) ;**
                      (list 'null? (racket-func->prim-val null? #t))
                      ;; values
                      (list 'true  (RktV #t))
                      (list 'false (RktV #f))
                      (list 'null  (RktV null)))
                (EmptyEnv)))

Note that this last change raises a subtle type issue: we're actually
abusing the Racket `list` and `cons` constructors to hold Sloth values.
One way in which this becomes a problem is the current assumption that a
primitive function always returns a Racket value (it is always wrapped
in a `RktV`) --- but this is no longer the case for `first` and `rest`:
when we use

    {cons 1 null}

in Sloth, the resulting value will be

    (RktV (cons (ExprV (Num 1) ...) (ExprV (Id null) ...)))

This leads to two problems: first, if we use Racket's `first` and
`rest`, they will complain (throw a runtime error) since the input value
is not a *proper* list (it's a pair that has a non-list value in its
tail). To resolve that, we use the more primitive `car` and `cdr`
functions to implement Sloth's `first` and `rest`.

The second problem happens when we try and grab the first value of this

    {first {cons 1 null}}

we will eventually get back the `ExprV` and wrap it in a `RktV`:

    (RktV (ExprV (Num 1) ...))

and finally `run` will strip off the `RktV` and return the `ExprV`. A
solution to this is to make our `first` and `rest` functions return a
value *without* wrapping it in a `RktV` --- we can identify this
situation by the fact that the returned value is already a VAL instead
of some other Racket value. We can identify such values with the `VAL?`
predicate that gets defined by our `define-type`, implemented by a new
`wrap-in-val` helper:

    (: unwrap-rktv : VAL -> Any)
    ;; helper for `racket-func->prim-val': strict and unwrap a RktV
    ;; wrapper in preparation to be sent to the primitive function
    (define (unwrap-rktv x)
      (let ([s (strict x)])
        (cases s
          [(RktV v) v]
          [else (error 'racket-func "bad input: ~s" s)])))

    (: wrap-in-val : Any -> VAL)
    ;; helper that ensures a VAL output using RktV wrapper when needed,
    ;; but leaving as is otherwise
    (define (wrap-in-val x)
      (if (VAL? x) x (RktV x)))

    (: racket-func->prim-val : Function Boolean -> VAL)
    ;; converts a racket function to a primitive evaluator function ...
    (define (racket-func->prim-val racket-func strict?)
      (define list-func (make-untyped-list-function racket-func))
      (PrimV (lambda (args)
               (let ([args (if strict? (map unwrap-rktv args) args)])
                 (wrap-in-val (list-func args))))))

Note that we don't need to worry about the result being an `ExprV` ---
that will eventually be taken care of by `strict`.


------------------------------------------------------------------------
## The Sloth Implementation [Tuesday, March 25th]

The complete Sloth code follows. It can be used to do the same fun
things we did with Lazy Racket.

    ;;; ---<<<SLOTH>>>--------------------------------------------------
    #lang pl

    ;;; ----------------------------------------------------------------
    ;;; Syntax

    #| The BNF:
       <SLOTH> ::= <num>
                 | <id>
                 | { bind {{ <id> <SLOTH> } ... } <SLOTH> }
                 | { fun { <id> ... } <SLOTH> }
                 | { if <SLOTH> <SLOTH> <SLOTH> }
                 | { <SLOTH> <SLOTH> ... }
    |#

    ;; A matching abstract syntax tree datatype:
    (define-type SLOTH
      [Num  Number]
      [Id   Symbol]
      [Bind (Listof Symbol) (Listof SLOTH) SLOTH]
      [Fun  (Listof Symbol) SLOTH]
      [Call SLOTH (Listof SLOTH)]
      [If   SLOTH SLOTH SLOTH])

    (: unique-list? : (Listof Any) -> Boolean)
    ;; Tests whether a list is unique, guards Bind and Fun values.
    (define (unique-list? xs)
      (or (null? xs)
          (and (not (member (first xs) (rest xs)))
               (unique-list? (rest xs)))))

    (: parse-sexpr : Sexpr -> SLOTH)
    ;; parses s-expressions into SLOTHs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'bind more)
         (match sexpr
           [(list 'bind (list (list (symbol: names) (sexpr: nameds))
                              ...)
              body)
            (if (unique-list? names)
              (Bind names (map parse-sexpr nameds) (parse-sexpr body))
              (error 'parse-sexpr "duplicate `bind' names: ~s" names))]
           [else (error 'parse-sexpr "bad `bind' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: names) ...) body)
            (if (unique-list? names)
              (Fun names (parse-sexpr body))
              (error 'parse-sexpr "duplicate `fun' names: ~s" names))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'if more)
         (match sexpr
           [(list 'if cond then else)
            (If (parse-sexpr cond)
                (parse-sexpr then)
                (parse-sexpr else))]
           [else (error 'parse-sexpr "bad `if' syntax in ~s" sexpr)])]
        [(list fun args ...) ; other lists are applications
         (Call (parse-sexpr fun)
               (map parse-sexpr args))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> SLOTH)
    ;; Parses a string containing an SLOTH expression to a SLOTH AST.
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;;; ----------------------------------------------------------------
    ;;; Values and environments

    (define-type ENV
      [EmptyEnv]
      [FrameEnv FRAME ENV])

    ;; a frame is an association list of names and values.
    (define-type FRAME = (Listof (List Symbol VAL)))

    (define-type VAL
      [RktV  Any]
      [FunV  (Listof Symbol) SLOTH ENV]
      [ExprV SLOTH ENV]
      [PrimV ((Listof VAL) -> VAL)])

    (: extend : (Listof Symbol) (Listof VAL) ENV -> ENV)
    ;; extends an environment with a new frame.
    (define (extend names values env)
      (if (= (length names) (length values))
        (FrameEnv (map (lambda ([name : Symbol] [val : VAL])
                         (list name val))
                       names values)
                  env)
        (error 'extend "arity mismatch for names: ~s" names)))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, frame by frame,
    ;; return its value or throw an error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(FrameEnv frame rest)
         (let ([cell (assq name frame)])
           (if cell
             (second cell)
             (lookup name rest)))]))

    (: unwrap-rktv : VAL -> Any)
    ;; helper for `racket-func->prim-val': strict and unwrap a RktV
    ;; wrapper in preparation to be sent to the primitive function
    (define (unwrap-rktv x)
      (let ([s (strict x)])
        (cases s
          [(RktV v) v]
          [else (error 'racket-func "bad input: ~s" s)])))

    (: wrap-in-val : Any -> VAL)
    ;; helper that ensures a VAL output using RktV wrapper when needed,
    ;; but leaving as is otherwise
    (define (wrap-in-val x)
      (if (VAL? x) x (RktV x)))

    (: racket-func->prim-val : Function Boolean -> VAL)
    ;; converts a racket function to a primitive evaluator function
    ;; which is a PrimV holding a ((Listof VAL) -> VAL) function.
    ;; (the resulting function will use the list function as is,
    ;; and it is the list function's responsibility to throw an error
    ;; if it's given a bad number of arguments or bad input types.)
    (define (racket-func->prim-val racket-func strict?)
      (define list-func (make-untyped-list-function racket-func))
      (PrimV (lambda (args)
               (let ([args (if strict? (map unwrap-rktv args) args)])
                 (wrap-in-val (list-func args))))))

    ;; The global environment has a few primitives:
    (: global-environment : ENV)
    (define global-environment
      (FrameEnv (list (list '+ (racket-func->prim-val + #t))
                      (list '- (racket-func->prim-val - #t))
                      (list '* (racket-func->prim-val * #t))
                      (list '/ (racket-func->prim-val / #t))
                      (list '< (racket-func->prim-val < #t))
                      (list '> (racket-func->prim-val > #t))
                      (list '= (racket-func->prim-val = #t))
                      ;; note flags:
                      (list 'cons  (racket-func->prim-val cons  #f))
                      (list 'list  (racket-func->prim-val list  #f))
                      (list 'first (racket-func->prim-val car   #t))
                      (list 'rest  (racket-func->prim-val cdr   #t))
                      (list 'null? (racket-func->prim-val null? #t))
                      ;; values
                      (list 'true  (RktV #t))
                      (list 'false (RktV #f))
                      (list 'null  (RktV null)))
                (EmptyEnv)))

    ;;; ----------------------------------------------------------------
    ;;; Evaluation

    (: strict : VAL -> VAL)
    ;; forces a (possibly nested) ExprV promise, returns a VAL that is
    ;; not an ExprV
    (define (strict val)
      (cases val
        [(ExprV expr env) (strict (eval expr env))]
        [else val]))

    (: eval : SLOTH ENV -> VAL)
    ;; evaluates SLOTH expressions
    (define (eval expr env)
      ;; convenient helper
      (: eval* : SLOTH -> VAL)
      (define (eval* expr) (ExprV expr env))
      (cases expr
        [(Num n)   (RktV n)]
        [(Id name) (lookup name env)]
        [(Bind names exprs bound-body)
         (eval bound-body (extend names (map eval* exprs) env))]
        [(Fun names bound-body)
         (FunV names bound-body env)]
        [(Call fun-expr arg-exprs)
         (define fval (strict (eval* fun-expr)))
         (define arg-vals (map eval* arg-exprs))
         (cases fval
           [(PrimV proc) (proc arg-vals)]
           [(FunV names body fun-env)
            (eval body (extend names arg-vals fun-env))]
           [else (error 'eval "function call with a non-function: ~s"
                        fval)])]
        [(If cond-expr then-expr else-expr)
         (eval* (if (cases (strict (eval* cond-expr))
                      [(RktV v) v] ; Racket value => use as boolean
                      [else #t])   ; other values are always true
                  then-expr
                  else-expr))]))

    (: run : String -> Any)
    ;; evaluate a SLOTH program contained in a string
    (define (run str)
      (let ([result (strict (eval (parse str) global-environment))])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

    ;;; ----------------------------------------------------------------
    ;;; Tests

    (test (run "{{fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}} {add3 1}}")
          => 4)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}
                       {add1 {fun {x} {+ x 1}}}}
                  {bind {{x 3}} {add1 {add3 x}}}}")
          => 7)
    (test (run "{bind {{identity {fun {x} x}}
                       {foo {fun {x} {+ x 1}}}}
                  {{identity foo} 123}}")
          => 124)
    (test (run "{bind {{x 3}}
                  {bind {{f {fun {y} {+ x y}}}}
                    {bind {{x 5}}
                      {f 4}}}}")
          => 7)
    (test (run "{{{fun {x} {x 1}}
                  {fun {x} {fun {y} {+ x y}}}}
                 123}")
          => 124)

    ;; More tests for complete coverage
    (test (run "{bind x 5 x}")      =error> "bad `bind' syntax")
    (test (run "{fun x x}")         =error> "bad `fun' syntax")
    (test (run "{if x}")            =error> "bad `if' syntax")
    (test (run "{}")                =error> "bad syntax")
    (test (run "{bind {{x 5} {x 5}} x}") =error> "duplicate*bind*names")
    (test (run "{fun {x x} x}")     =error> "duplicate*fun*names")
    (test (run "{+ x 1}")           =error> "no binding for")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{1 2}")             =error> "with a non-function")
    (test (run "{{fun {x} x}}")     =error> "arity mismatch")
    (test (run "{if {< 4 5} 6 7}")  => 6)
    (test (run "{if {< 5 4} 6 7}")  => 7)
    (test (run "{if + 6 7}")        => 6)
    (test (run "{fun {x} x}")       =error> "returned a bad value")

    ;; Test laziness
    (test (run "{{fun {x} 1} {/ 9 0}}") => 1)
    (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1)
    (test (run "{bind {{x {{fun {x} {x x}} {fun {x} {x x}}}}} 1}") => 1)

    ;; Test lazy constructors
    (test (run "{bind {{l {list 1 {/ 9 0} 3}}}
                  {+ {first l} {first {rest {rest l}}}}}")
          => 4)

    ;;; ----------------------------------------------------------------


------------------------------------------------------------------------
## Shouldn't there be more `ExprV` promises? [Tuesday, March 25th]

You might notice that there are some apparently missing promises. For
example, consider our evaluation of `Bind` forms:

        [(Bind names exprs bound-body)
         (eval bound-body (extend names (map eval* exprs) env))]

The named expressions are turned into expression promises via `eval*`,
but shouldn't we change the first `eval` (the one that evaluates the
body) into a promise too? This is a confusing point, and the bottom line
is that there is no need to create a promise there. The main idea is
that the `eval` function is actually called from contexts that actually
*need* to be evaluated. One example is when we force a promise via
`strict`, and another one is when `run` calls `eval`. Note that in both
of these cases, we actuallly need a (forced) value, so creating a
promise in there doesn't make any difference.

To see this differently, consider how `bind` might be used *within* the
language. The first case is when `bind` is the topmost expression, or
part of a `bind` "spine":

    {bind {{x ...}}
      {bind {{y ...}}
        ...}}

In these cases we evaluate the `bind` expression when we need to return
a result for the whole run, so adding an `ExprV` is not going to make a
difference. The second case is when `bind` is used in an expression line
a function argument:

    {foo {bind {{x ...}} ...}}

Here there is also no point in adding an `ExprV` to the `Bind` case,
since the evaluation of the whole argument (the `Bind` value) will be
wrapped in an `ExprV`, so it is already delayed. (And when it get
forced, we will need to do the `bind` evaluation anyway, so again, it
adds no value.)

A generalization of this is that when we actually call `eval` (either
directly or via `strict`), there is never any point in making the result
that it returns a promise.

(And if you'll follow this carefully and look at all of the `eval`
calls, you will see that this means that *neither* of the `eval*`s in
the `If` case are needed!)


------------------------------------------------------------------------
# Implementing Call by Need [Tuesday, March 25th]

As we have seen, there are a number of advantages for lazy evaluation,
but its main disadvantage is the fact that it is extremely inefficient,
to the point of rendering lots of programs impractical, for example, in:

    {bind {{x {+ 4 5}}}
      {bind {{y {+ x x}}}
        y}}

we end up adding 4 and 5 twice. In other words, we don't suffer from
textual redundancy (each expression is written once), but we don't avoid
dynamic redundancy. We can get it back by simply caching evaluation
results, using a box that will be used to remember the results. The box
will initially hold `#f`, and it will change to hold the VAL that
results from evaluation:

    (define-type VAL
      [RktV  Any]
      [FunV  (Listof Symbol) SLOTH ENV]
      [ExprV SLOTH ENV (Boxof (U #f VAL))] ;*** new: mutable cache field
      [PrimV ((Listof VAL) -> VAL)])

We need a utility function to create an evaluation promise, because when
an `ExprV` is created, its initial cache box needs to be initialized.

    (: eval-promise : SLOTH ENV -> VAL)
    ;; used instead of `eval' to create an evaluation promise
    (define (eval-promise expr env)
      (ExprV expr env (box #f)))

(And note that Typed Racket needs to figure out that the `#f` in this
definition has a type of `(U #f VAL)` and not just `#f`.)

This `eval-promise` is used instead of `ExprV` in eval. Finally,
whenever we force such an `ExprV` promise, we need to check if it was
already evaluated, otherwise force it and cache the result. This is
simple to do since there is a single field that is used both as a flag
and a cached value:

    (: strict : VAL -> VAL)
    ;; forces a (possibly nested) ExprV promise, returns a VAL that is
    ;; not an ExprV
    (define (strict val)
      (cases val
        [(ExprV expr env cache)
         (or (unbox cache)
             (let ([val* (strict (eval expr env))])
               (set-box! cache val*)
               val*))]
        [else val]))

But note that this makes using side-effects in our interpreter even more
confusing. (It was true with call-by-name too.)

The resulting code follows.

    ;;; ---<<<SLOTH-CACHED>>>-------------------------------------------
    ;; A call-by-need version of the SLOTH interpreter

    #lang pl

    ;;; ----------------------------------------------------------------
    ;;; Syntax

    #| The BNF:
       <SLOTH> ::= <num>
                 | <id>
                 | { bind {{ <id> <SLOTH> } ... } <SLOTH> }
                 | { fun { <id> ... } <SLOTH> }
                 | { if <SLOTH> <SLOTH> <SLOTH> }
                 | { <SLOTH> <SLOTH> ... }
    |#

    ;; A matching abstract syntax tree datatype:
    (define-type SLOTH
      [Num  Number]
      [Id   Symbol]
      [Bind (Listof Symbol) (Listof SLOTH) SLOTH]
      [Fun  (Listof Symbol) SLOTH]
      [Call SLOTH (Listof SLOTH)]
      [If   SLOTH SLOTH SLOTH])

    (: unique-list? : (Listof Any) -> Boolean)
    ;; Tests whether a list is unique, guards Bind and Fun values.
    (define (unique-list? xs)
      (or (null? xs)
          (and (not (member (first xs) (rest xs)))
               (unique-list? (rest xs)))))

    (: parse-sexpr : Sexpr -> SLOTH)
    ;; parses s-expressions into SLOTHs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(cons 'bind more)
         (match sexpr
           [(list 'bind (list (list (symbol: names) (sexpr: nameds))
                              ...)
              body)
            (if (unique-list? names)
              (Bind names (map parse-sexpr nameds) (parse-sexpr body))
              (error 'parse-sexpr "duplicate `bind' names: ~s" names))]
           [else (error 'parse-sexpr "bad `bind' syntax in ~s" sexpr)])]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: names) ...) body)
            (if (unique-list? names)
              (Fun names (parse-sexpr body))
              (error 'parse-sexpr "duplicate `fun' names: ~s" names))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'if more)
         (match sexpr
           [(list 'if cond then else)
            (If (parse-sexpr cond)
                (parse-sexpr then)
                (parse-sexpr else))]
           [else (error 'parse-sexpr "bad `if' syntax in ~s" sexpr)])]
        [(list fun args ...) ; other lists are applications
         (Call (parse-sexpr fun)
               (map parse-sexpr args))]
        [else (error 'parse-sexpr "bad syntax in ~s" sexpr)]))

    (: parse : String -> SLOTH)
    ;; Parses a string containing an SLOTH expression to a SLOTH AST.
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;;; ----------------------------------------------------------------
    ;;; Values and environments

    (define-type ENV
      [EmptyEnv]
      [FrameEnv FRAME ENV])

    ;; a frame is an association list of names and values.
    (define-type FRAME = (Listof (List Symbol VAL)))

    (define-type VAL
      [RktV  Any]
      [FunV  (Listof Symbol) SLOTH ENV]
      [ExprV SLOTH ENV (Boxof (U #f VAL))]
      [PrimV ((Listof VAL) -> VAL)])

    (: extend : (Listof Symbol) (Listof VAL) ENV -> ENV)
    ;; extends an environment with a new frame.
    (define (extend names values env)
      (if (= (length names) (length values))
        (FrameEnv (map (lambda ([name : Symbol] [val : VAL])
                         (list name val))
                       names values)
                  env)
        (error 'extend "arity mismatch for names: ~s" names)))

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, frame by frame,
    ;; return its value or throw an error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(FrameEnv frame rest)
         (let ([cell (assq name frame)])
           (if cell
             (second cell)
             (lookup name rest)))]))

    (: unwrap-rktv : VAL -> Any)
    ;; helper for `racket-func->prim-val': strict and unwrap a RktV
    ;; wrapper in preparation to be sent to the primitive function
    (define (unwrap-rktv x)
      (let ([s (strict x)])
        (cases s
          [(RktV v) v]
          [else (error 'racket-func "bad input: ~s" s)])))

    (: wrap-in-val : Any -> VAL)
    ;; helper that ensures a VAL output using RktV wrapper when needed,
    ;; but leaving as is otherwise
    (define (wrap-in-val x)
      (if (VAL? x) x (RktV x)))

    (: racket-func->prim-val : Function Boolean -> VAL)
    ;; converts a racket function to a primitive evaluator function
    ;; which is a PrimV holding a ((Listof VAL) -> VAL) function.
    ;; (the resulting function will use the list function as is,
    ;; and it is the list function's responsibility to throw an error
    ;; if it's given a bad number of arguments or bad input types.)
    (define (racket-func->prim-val racket-func strict?)
      (define list-func (make-untyped-list-function racket-func))
      (PrimV (lambda (args)
               (let ([args (if strict? (map unwrap-rktv args) args)])
                 (wrap-in-val (list-func args))))))

    ;; The global environment has a few primitives:
    (: global-environment : ENV)
    (define global-environment
      (FrameEnv (list (list '+ (racket-func->prim-val + #t))
                      (list '- (racket-func->prim-val - #t))
                      (list '* (racket-func->prim-val * #t))
                      (list '/ (racket-func->prim-val / #t))
                      (list '< (racket-func->prim-val < #t))
                      (list '> (racket-func->prim-val > #t))
                      (list '= (racket-func->prim-val = #t))
                      ;; note flags:
                      (list 'cons  (racket-func->prim-val cons  #f))
                      (list 'list  (racket-func->prim-val list  #f))
                      (list 'first (racket-func->prim-val car   #t))
                      (list 'rest  (racket-func->prim-val cdr   #t))
                      (list 'null? (racket-func->prim-val null? #t))
                      ;; values
                      (list 'true  (RktV #t))
                      (list 'false (RktV #f))
                      (list 'null  (RktV null)))
                (EmptyEnv)))

    ;;; ----------------------------------------------------------------
    ;;; Evaluation

    (: eval-promise : SLOTH ENV -> VAL)
    ;; used instead of `eval' to create an evaluation promise
    (define (eval-promise expr env)
      (ExprV expr env (box #f)))

    (: strict : VAL -> VAL)
    ;; forces a (possibly nested) ExprV promise, returns a VAL that is
    ;; not an ExprV
    (define (strict val)
      (cases val
        [(ExprV expr env cache)
         (or (unbox cache)
             (let ([val* (strict (eval expr env))])
               (set-box! cache val*)
               val*))]
        [else val]))

    (: eval : SLOTH ENV -> VAL)
    ;; evaluates SLOTH expressions
    (define (eval expr env)
      ;; convenient helper
      (: eval* : SLOTH -> VAL)
      (define (eval* expr) (eval-promise expr env))
      (cases expr
        [(Num n)   (RktV n)]
        [(Id name) (lookup name env)]
        [(Bind names exprs bound-body)
         (eval bound-body (extend names (map eval* exprs) env))]
        [(Fun names bound-body)
         (FunV names bound-body env)]
        [(Call fun-expr arg-exprs)
         (define fval (strict (eval* fun-expr)))
         (define arg-vals (map eval* arg-exprs))
         (cases fval
           [(PrimV proc) (proc arg-vals)]
           [(FunV names body fun-env)
            (eval body (extend names arg-vals fun-env))]
           [else (error 'eval "function call with a non-function: ~s"
                        fval)])]
        [(If cond-expr then-expr else-expr)
         (eval* (if (cases (strict (eval* cond-expr))
                      [(RktV v) v] ; Racket value => use as boolean
                      [else #t])   ; other values are always true
                  then-expr
                  else-expr))]))

    (: run : String -> Any)
    ;; evaluate a SLOTH program contained in a string
    (define (run str)
      (let ([result (strict (eval (parse str) global-environment))])
        (cases result
          [(RktV v) v]
          [else (error 'run "evaluation returned a bad value: ~s"
                       result)])))

    ;;; ----------------------------------------------------------------
    ;;; Tests

    (test (run "{{fun {x} {+ x 1}} 4}")
          => 5)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}} {add3 1}}")
          => 4)
    (test (run "{bind {{add3 {fun {x} {+ x 3}}}
                       {add1 {fun {x} {+ x 1}}}}
                  {bind {{x 3}} {add1 {add3 x}}}}")
          => 7)
    (test (run "{bind {{identity {fun {x} x}}
                       {foo {fun {x} {+ x 1}}}}
                  {{identity foo} 123}}")
          => 124)
    (test (run "{bind {{x 3}}
                  {bind {{f {fun {y} {+ x y}}}}
                    {bind {{x 5}}
                      {f 4}}}}")
          => 7)
    (test (run "{{{fun {x} {x 1}}
                  {fun {x} {fun {y} {+ x y}}}}
                 123}")
          => 124)

    ;; More tests for complete coverage
    (test (run "{bind x 5 x}")      =error> "bad `bind' syntax")
    (test (run "{fun x x}")         =error> "bad `fun' syntax")
    (test (run "{if x}")            =error> "bad `if' syntax")
    (test (run "{}")                =error> "bad syntax")
    (test (run "{bind {{x 5} {x 5}} x}") =error> "duplicate*bind*names")
    (test (run "{fun {x x} x}")     =error> "duplicate*fun*names")
    (test (run "{+ x 1}")           =error> "no binding for")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{+ 1 {fun {x} x}}") =error> "bad input")
    (test (run "{1 2}")             =error> "with a non-function")
    (test (run "{{fun {x} x}}")     =error> "arity mismatch")
    (test (run "{if {< 4 5} 6 7}")  => 6)
    (test (run "{if {< 5 4} 6 7}")  => 7)
    (test (run "{if + 6 7}")        => 6)
    (test (run "{fun {x} x}")       =error> "returned a bad value")

    ;; Test laziness
    (test (run "{{fun {x} 1} {/ 9 0}}") => 1)
    (test (run "{{fun {x} 1} {{fun {x} {x x}} {fun {x} {x x}}}}") => 1)
    (test (run "{bind {{x {{fun {x} {x x}} {fun {x} {x x}}}}} 1}") => 1)

    ;; Test lazy constructors
    (test (run "{bind {{l {list 1 {/ 9 0} 3}}}
                  {+ {first l} {first {rest {rest l}}}}}")
          => 4)

    ;;; ----------------------------------------------------------------


------------------------------------------------------------------------
# Side Effects in a Lazy Language [Tuesday, March 25th]

We've seen that a lazy language without the call-by-need optimization is
too slow to be practical, but the optimization makes using side-effects
extremely confusing. Specifically, when we deal with side-effects (I/O,
mutation, errors, etc) the order of evaluation matters, but in our
interpreter expressions are getting evaluated as needed. (Remember
tracing the prime-numbers code in Lazy Racket --- numbers are tested as
needed, not in order.) If we can't do these things, the question is
whether there is any point in using a purely functional lazy language at
all --- since computer programs often interact with an imperative world.

There is a solution for this: the lazy language does not have any (sane)
facilities for *doing* things (like `printf` that prints something in
plain Racket), but it can use a data structure that *describes* such
operations. For example, in Lazy Racket we cannot print stuff sanely
using `printf`, but we can construct a string using `format` (which is
just like `printf`, except that it returns the formatted string instead
of printing it). So (assuming Racket syntax for simplicity), instead of:

    (define (foo n)
      (printf "~s + 1 = ~s\n" n (+ n 1)))

we will write:

    (define (foo n)
      (format "~s + 1 = ~s\n" n (+ n 1)))

and get back a string. We can now change the way that our interpreter
deals with the output value that it receives after evaluating a lazy
expression: if it receives a string, then it can take that string as
denoting a request for printout, and simply print it. Such an evaluator
will do the printout when the lazy evaluation is done, and everything
works fine because we don't try to use any side-effects in the lazy
language --- we just describe the desired side-effects, and constructing
such a description does not require *performing* side-effects.

But this only solves printing a single string, and nothing else. If we
want to print two strings, then the only thing we can do is concatenate
the two strings --- but that is not only inefficient, it cannot describe
infinite output (since we will not be able to construct the infinite
string in memory). So we need a better way to chain several printout
representations. One way to do so is to use a list of strings, but to
make things a little easier to manage, we will create a type for I/O
descriptions --- and populate it with one variant holding a string (for
plain printout) and one for holding a chain of two descriptions (which
can be used to construct an arbitrarily long sequence of descriptions):

    (define-type IO
      [Print  String]
      [Begin2 IO IO])

Now we can use this to chain any number of printout representations by
turning them into a single `Begin2` request, which is very similar to
simply using a loop to print the list. For example, the eager printout
code:

    (: print-list : (Listof A) -> Void)
    (define (print-list l)
      (if (null? l)
        (printf "\n")
        (begin (printf "~s " (first l))
               (print-list (rest l)))))

turns to the following code:

    (: print-list : (Listof A) -> IO)
    (define (print-list l)
      (if (null? l)
        (Print "\n")
        (Begin2 (Print (format "~s " (first l)))
                (print-list (rest l)))))

This will basically scan an input list like the eager version, but
instead of printing the list, it will convert it into a single output
request that forms a recipe for this printout. Note that within the lazy
world, the result of `print-list` is just a value, there are no side
effects involved. Turning this value into the actual printout is
something that needs to be done on the eager side, which must be part of
the implementation. In the case of Lazy Racket, we have no access to the
implementation, but we can do so in our Sloth implementation: again,
`run` will inspect the result and either print a given string (if it
gets a `Print` value), or print two things recursively (if it gets a
`Begin2` value). (To implement this, we will add an `IOV` variant to the
`VAL` type definition, and have it contain an `IO` description of the
above type.)

Because the sequence is constructed in the lazy world, it will not
require allocating the whole sequence in memory --- it can be forced
bits by bits (using `strict`) as the imperative back-end (the `run` part
of the implementation) follows the instructions in the resulting IO
description. More concretely, it will also work on an infinite list: the
translation of an infinite-loop printout function will be one that
returns an infinite IO description tree of `Begin2` values. This loop
will also force only what it needs to print and will go on recursively
printing the whole sequence (possibly not terminating). For example
(again, using Racket syntax), the infinite printout loop

    (: print-loop : -> Void)
    (define (print-loop)
      (printf "foo\n")
      (print-loop))

is translated into a function that returns an infinite tree of print
operations:

    (: print-loop : -> IO)
    (define (print-loop)
      (Begin2 (Print "foo\n")
              (print-loop)))

When this tree is converted to actions, it will result in an infinite
loop that produces the same output --- it is essentially the same
infinite loop, only now it's derived by an infinite description rather
than an infinite process.

Finally, how should we deal with inputs? We can add another variant to
our type definition that represents a `read-line` operation, assuming
that like `read-line` it does not require any arguments:

    (define-type IO
      [Print    String]
      [ReadLine ]
      [Begin2   IO IO])

Now the eager implementation can invoke `read-line` when it encounters a
`ReadLine` value --- but what should it do with the resulting string?
Even worse, naively binding a value to `ReadLine`

    (let ([name (ReadLine)])
      (Print (format "Your name is ~a" name)))

doesn't get us the string that is read --- instead, the value is a
*description* of a read operation, which is very different from the
actual string value that we want in the binding.

The solution is to take the "code that acts on the string value" and
make *it* be the argument to `ReadLine`. In the above example, that
would be the `let` expression without the `(ReadLine)` part --- and as
you remember from the time we introduced `fun` into `WAE`, taking away a
named expression from a binding expression leads to a function. With
this in mind, it makes sense to make `ReadLine` take a function value
that represents what to do in the future, once the reading is actually
done.

    (ReadLine (lambda (name)
                (Print (format "Your name is ~a" name))))

This receiver value is a kind of a *continuation* of the computation,
provided as a callback value --- it will get the string that was read on
the terminal, and will return a new description of side-effects that
represents the rest of the process:

    (define-type IO
      [Print    String]
      [ReadLine (String -> IO)]
      [Begin2   IO IO])

Now, when the eager side sees a `ReadLine` value, it will read a line,
and invoke the callback function with the string that it has read. By
doing this, the control goes back to the lazy world to process the value
and get back another IO value to continue the processing. This results
in a process where the lazy code generates some IO descriptions, then
the imperative side will execute it and control goes back to the lazy
code, then back to the imperative side, etc.

As a more verbose example of all of the above, this silly loop:

    (: silly-loop : -> Void)
    (define (silly-loop)
      (printf "What is your name? ")
      (let ([name (read-line)])
        (if (equal? name "quit")
          (printf "bye\n")
          (begin (printf "Your name is ~s\n" name)
                 (silly-loop)))))

is now translated to:

    (: silly-loop : -> IO)
    (define (silly-loop)
      (Begin2 (Print "What is your name? ")
              (ReadLine
               (lambda (name)
                 (if (equal? name "quit")
                   (Print "bye\n")
                   (Begin2 (Print (format "Your name is ~s\n" name))
                           (silly-loop)))))))

Using this strategy to implement side-effects is possible, and you will
do that in the homework --- some technical details are going to be
different but the principle is the same as discussed above. The last
problem is that the above code is difficult to work with --- in the
homework you will see how to use syntactic abstractions to make things
much simpler.


------------------------------------------------------------------------
# Designing Domain Specific Languages (DSLs) [Tuesday, March 25th]

> [PLAI §35]

Programming languages differ in numerous ways:

1. Each uses different notations for writing down programs. As we've
   observed, however, syntax is only partially interesting. (This is,
   however, less true of languages that are trying to mirror the
   notation of a particular domain.)

2. Control constructs: for instance, early languages didn't even support
   recursion, while most modern languages still don't have
   continuations.

3. The kinds of data they support. Indeed, sophisticated languages like
   Racket blur the distinction between control and data by making
   fragments of control into data values (such as first-class functions
   and continuations).

4. The means of organizing programs: do they have functions, modules,
   classes, namespaces, ...?

5. Automation such as memory management, run-time safety checks, and so
   on.

Each of these items suggests natural questions to ask when you design
your own languages in particular domains.

Obviously, there are a lot of domain specific languages these days ---
and that's not new. For example, four of the oldest languages were
conceived as domain specific languages:

 * **Fortran** --- *Formula Translator*
 * **Algol**   --- *Algorithmic Language*
 * **Cobol**   --- *Common Business Oriented Language*
 * **Lisp**    --- *List Processing*

Only in the late 60s / early 70s languages began to get free from their
special purpose domain and become *general purpose* languages (GPLs).
These days, we usually use some GPL for our programs and often come up
with small *domain specific* languages (DSLs) for specific jobs. The
problem is designing such a specific language. There are lots of
decisions to make, and as should be clear now, many ways of shooting
your self in the foot. You need to know:

* What is your domain?

* What are the common notations in this domain (need to be convenient
  both for the machine and for humans)?

* What do you expect to get from your DSL? (eg, performance gains when
  you know that you're dealing with a certain limited kind of
  functionality like arithmetics.)

* Do you have any semantic reason for a new language? (For example,
  using special scoping rules, or a mixture of lazy and eager
  evaluation, maybe a completely different way of evaluation (eg,
  makefiles).)

* Is your language expected to envelope other functionality (eg, shell
  scripts, TCL), perhaps throwing some functionality on a different
  language (makefiles and shell scripts), or is it going to be embedded
  in a bigger application (eg, PHP), or embedded in a way that exposes
  parts of an application to user automation (Emacs Lisp, Word Basic,
  Visual Basic for Office Application or Some Other Long List of
  Buzzwords).

* If you have one language embedded in another enveloping language ---
  how do you handle syntax? How can they communicate (eg, share
  variables)?

And very important:

* Is there a benefit for implementing a DSL over using a GPL --- how
  much will your DSL grow (usually more than you think)? Will it get to
  a point where it will need the power of a full GPL? Do you want to
  risk doing this just to end up admitting that you need a "Real
  Language" and dump your solution for "Visual Basic for Applications"?
  (It might be useful to think ahead about things that you know you
  don't need, rather than things you need.)

To clarify why this can be applicable in more situations than you think,
consider what programming languages are used for. One example that
should not be ignored is using a programming language to implement a
programming language --- for example, what we did so far (or any other
interpreter or compiler). In the same way that some piece of code in a
PL represent functions about the "real world", there are other programs
that represent things in a language --- possibly even the same one. To
make a side-effect-full example, the meaning of `one-brick` might
abstract over laying a brick when making a wall --- it abstracts all the
little details into a function:

    (define (one-brick wall brick-pile)
      (move-eye (location brick-pile))
      (let ([pos (find-available-brick-position brick-pile)])
        (move-hand pos)
        (grab-object))
      (move-eye wall)
      (let ([pos (find-next-brick-position wall)])
        (move-hand pos)
        (drop-object)))

and we can now write

    (one-brick my-wall my-brick-pile)

instead of all of the above. We might use that in a loop:

    (define (build-wall wall pile)
      (define (loop n)
        (when (< n 500)
          (one-brick wall pile)
          (loop (add1 n))))
      (loop 0))

This is a common piece of looping code that we've seen in many forms,
and a common complaint of newcomers to functional languages is the lack
of some kind of a loop. But once you know the template, writing such
loops is easy --- and in fact, you can write code that would take
something like:

    (define (build-wall wall pile)
      (loop-for i from 1 to 500
        (one-brick wall pile)))

and produce the previous code. Note the main point here: we switch from
code that deals with bricks to code that deals with code.

Now, a viable option for implementing a new DSL is to do so by
transforming it into an existing language. Such a process is usually
tedious and error prone --- tedious because you need to deal with the
boring parts of a language (making a parser etc), and error prone
because it's easy to generate bad code (especially when you're dealing
with strings) and you get bad errors in terms of the translated code
instead of the actual code, resorting to debugging the intermediate
generated programs. Lisp languages traditionally have taken this idea
one level further than other languages: instead of writing a new
transformer for your language, you use the host language, but you extend
and customize it by adding you own forms.


------------------------------------------------------------------------
# Syntax Transformations: Macros [Tuesday, April 1st]

> [PLAI §36]

Macros are one of the biggest advantages of all Lisps, and specifically
even more so an advantage of Scheme implementations, and yet more
specifically, it is a major Racket feature: this section is therefore
specific to Racket (which has this unique feature), although most of
this is the same in most Schemes.

As we have previously seen, it is possible to implement one language
construct using another. What we did could be described as bits of a
compiler, since they translate one language to another.

We will see how this can be done *in* Racket: implementing some new
linguistic forms in terms of ones that are already known. In essence, we
will be translating Racket constructs to other Racket constructs --- and
all that is done *in Racket*, no need to go back to the language Racket
was implemented in (C).

This is possible with a simple "trick": the Racket implementation uses
some syntax objects. These objects are implemented somehow inside
Racket's own source code. But these objects are also directly available
for our use --- part of the implementation is exposed to our level. This
is quite similar to the way we have implemented pairs in our language
--- a TOY or a SLOTH pair is implemented using a Racket pair, so the
*same* data object is available at both levels.

This is the big idea in Lisp, which Scheme (and therefore Racket)
inherited from (to some extent): programs are made of numbers, strings,
symbols and lists of these --- and these are all used both at the
meta-level as well as the user level. This means that instead of having
no meta-language at all (locking away a lot of useful stuff), and
instead of having some crippled half-baked meta language (CPP being both
the most obvious (as well as the most popular) example for such a meta
language), instead of all this we get exactly the same language at both
levels.

How is this used? Well, the principle is simple. For example, say we
want to write a macro that will evaluate two forms in sequence, but if
the first one returns a result that is not false then it returns it
instead of evaluating the second one too. This is exactly how `or`
behaves, so pretend we don't have it --- call our version `orelse`:

    (orelse <expr1> <expr2>)

in effect, we add a new *special* form to our language, with its own
evaluation rule, only we know how to express this evaluation rule by
translating it to things that are already part of our language.

We could do this as a simple function --- only if we're willing to
explicitly delay the arguments with a `lambda`, and use the thunks in
the function:

    (define (orelse thunk1 thunk2)
      (if (thunk1)
        (thunk1) ; ignore the double evaluation for now
        (thunk2)))

or:

    (define (orelse thunk1 thunk2)
      ((if (thunk1)
         thunk1
         thunk2)))

and using it like this:

    (orelse (lambda () 1) (lambda () (error "boom")))

But this is clearly not the right way to do this: whoever uses this code
must be aware of the need to send us thunks, and it's verbose and
inconvenient.

> Note that this could be a feasible solution if there was a uniform way
> to have an easy syntactic way to say "a chunk of code" instead of
> immediately execute it --- this is exactly what `(lambda () ...)`
> does. So we could, for example, make `{...}` be a shorthand for that,
> which is what Perl-6 is doing. However, we will soon see examples
> where we want more than just delay the evaluation of some code.

We want to translate

    (orelse <expr1> <expr2>)

    --to-->

    (if <expr1>
      <expr1>
      <expr2>)

If we look at the code as an s-expression, then we can write the
following function:

    (define (translate-orelse l)
      (if (and (list? l)
               (= 3 (length l))
               (eq? 'orelse (first l)))
        (list 'if (second l) (second l) (third l))
        (error 'translate-orelse "bad input: ~s" l)))

We can now try it with a simple list:

    (translate-orelse '(orelse foo1 foo2))

and note that the result is correct.

How is this used? Well, all we need is to hook *our* function into our
implementation's evaluator. In Lisp, we get a `defmacro` form for this,
and many Schemes inherited it or something similar. In Racket, we need
to

    (require compatibility/defmacro)

but it requires the transformation to be a little different in a way
that makes life easier: the above contains a lot of boilerplate code.
Usually, we will require the input to be a list of some known length,
the first element to be a symbol that specifies our form, and then do
something with the other arguments. So we'd want to always follow a
template that looks like:

    (define (translate-??? exprs)
      (if (and (list? exprs)
               (= N (length exprs))
               (eq? '??? (car exprs)))
        (let ([E1 (cadr exprs)]
              [E2 (caddr exprs)]
              ...)
          ...make result expression...)
        (error ...)))

But this looks very similar to making sure that a function call is a
specific function call (and for a good reason --- macro usages look just
like function calls). So make the translation function get a number of
arguments one each for each part of the input, an s-expression. For
example, the above translation and test become:

    (define (translate-orelse <expr1> <expr2>)
      (list 'if <expr1> <expr1> <expr2>))

    (translate-orelse 'foo1 'foo2)

The number of arguments is used to check the input (turning an arity
error for the macro to an arity error for the translator function call),
and we don't need to "caddr our way" to arguments.

This gives us the simple definition --- but what about the promised
hook? --- All we need is to use `define-macro` instead of `define`, and
change the name to the name that will trigger this translation
(providing the last missing test of the input):

    (define-macro (orelse <expr1> <expr2>)
      (list 'if <expr1> <expr1> <expr2>))

and test it:

    (orelse 1 (error "boom"))

Note that this is basically a (usually purely functional) lazy language
of transformations which is built on top of Racket. It is possible for
macros to generate pieces of code that contain references to these same
macros, and they will be used to expand those instances again.

Now we start with the problems, one by one.


------------------------------------------------------------------------
# Macro Problems [Tuesday, April 1st]

There is an inherent problem when macros are being used, in any form and
any language (even in CPP): you must remember that you are playing with
*expressions*, not with values --- which is why this is problematic:

    (define (foo x) (printf "foo ~s!\n" x) x)

    (or (foo 1) (foo 2))

    (orelse (foo 1) (foo 2))

And the reason for this should be clear. The standard solution for this
is to save the value as a binding --- so back to the drawing board, we
want this transformation instead:

    (orelse <expr1> <expr2>)
    -->
    (let ((val <expr1>))
      (if val
        val
        <expr2>))

(Note that we would have the same problem in the version that used
simple functions and thunks.)

And to write the new code:

    (define-macro (orelse <expr1> <expr2>)
      (list 'let (list (list 'val <expr1>))
        (list 'if 'val
          'val
          <expr2>)))

    (orelse (foo 1) (foo 2))

and this works like we want it to.


------------------------------------------------------------------------
## Complexity of S-expression transformations [Tuesday, April 1st]

As can be seen, writing a simple macro doesn't look too good --- what if
we want to write a more complicated macro? A solution to this is to look
at the above macro and realize that it *almost* looks like the code we
want --- we basically want to return a list of a certain fixed shape, we
just want some parts to be filled in by the given arguments. Something
like:

    (define-macro (orelse <expr1> <expr2>)
      '(let ((val <expr1>))
         (if val
           val
           <expr2>)))

if we had a way to make the `<...>`s not be a fixed part of the result,
but we actually want the values that the transformation function
received. (Remember that the `<` and `>` are just a part of the name, no
magic, just something to make these names stand out.) This is related to
notational problems that logicians and philosophers had problems with
for centuries. One solution that Lisp uses for this is: instead of a
quote, use backquote (called `quasiquote` in Racket) which is almost
like quote, except that you can `unquote` parts of the value inside.
This is done with a "`,`" comma. Using this, the above code can be
written like this:

    (define-macro (orelse <expr1> <expr2>)
      `(let ((val ,<expr1>))
         (if val
           val
           ,<expr2>)))


------------------------------------------------------------------------
## Scoping problems [Tuesday, April 1st]

You should be able to guess what's this problem about. The basic problem
of these macros is that they cannot be used reliably --- the name that
is produced by the macro can shadow a name that is in a completely
different place, therefore destroying lexical scope. For example, in:

    (let ((val 4))
      (orelse #f val))

the `val` in the macro shadows the use of this name in the above. One
way to solve this is to write macros that look like this:

    (define-macro (orelse <expr1> <expr2>)
      `(let ((%%!my*internal*var-do-not-use!%% ,<expr1>))
         (if %%!my*internal*var-do-not-use!%%
           %%!my*internal*var-do-not-use!%%
           ,<expr2>)))

or:

    (define-macro (orelse <expr1> <expr2>)
      `(let ((i-am-using-orelse-so-i-should-not-use-this-name ,<expr1>))
         (if i-am-using-orelse-so-i-should-not-use-this-name
           i-am-using-orelse-so-i-should-not-use-this-name
           ,<expr2>)))

or (this is actually similar to using UUIDs):

    (define-macro (orelse <expr1> <expr2>)
      `(let ((eli@barzilay.org/foo/bar/2002-02-02-10:22:22 ,<expr1>))
         (if eli@barzilay.org/foo/bar/2002-02-02-10:22:22
           eli@barzilay.org/foo/bar/2002-02-02-10:22:22
           ,<expr2>)))

Which is really not too good because such obscure variables tend to
clobber each other too, in all kinds of unexpected ways.

Another way is to have a function that gives you a different variable
name every time you call it:

    (define-macro (orelse <expr1> <expr2>)
      (let ((temp (gensym)))
        `(let ((,temp ,<expr1>))
           (if ,temp
             ,temp
             ,<expr2>))))

but this is not safe either since there might still be clashes of these
names (eg, if they're using a counter that is specific to the current
process, and you start a new process and load code that was generated
earlier). The Lisp solution for this (which Racket's `gensym` function
implements as well) is to use *uninterned* symbols --- symbols that have
their own identity, much like strings, and even if two have the same
name, they are not `eq?`.

Note also that there is the mirror side of this problem --- what happens
if we try this:

    (let ([if 123]) (orelse #f #f))

? This leads to capture in the other direction --- the code above
shadows the `if` binding that the macro produces.

Some Schemes will allow something like

    (define-macro (foo x)
      `(,mul-list ,x))

but this is a hack since the macro outputs something that is not a pure
s-expression (and it cannot work for a syntactic keyword like `if`).
Specifically, it is not possible to write the resulting expression (to a
compiled file, for example).

We will ignore this for a moment.

------------------------------------------------------------------------

Another problem --- manageability of these transformations.

Quasiquotes gets us a long way, but it is still insufficient.

For example, lets write a Racket `bind` that uses `lambda` for binding.
The transformation we now want is:

    (bind ((var expr) ...)
      body)
    -->
    ((lambda (var ...) body)
     expr ...)

The code for this looks like this:

    (define-macro (bind var-expr-list body)
      (cons (list 'lambda (map car var-expr-list) body)
            (map cadr var-expr-list)))

This already has a lot more pitfalls. There are `list`s and `cons`es
that you should be careful of, there are `map`s and there are `cadr`s
that would be catastrophic if you use `car`s instead. The quasiquote
syntax is a little more capable --- you can write this:

    (define-macro (bind var-expr-list body)
      `((lambda ,(map car var-expr-list) ,body)
        ,@(map cadr var-expr-list)))

where "`,@`" is similar to "`,`" but the unquoted expression should
evaluate to a list that is *spliced* into its surrounding list (that is,
its own parens are removed and it's made into elements in the containing
list). But this is still not as readable as the transformation you
actually want, and worse, it is not checking that the input syntax is
valid, which can lead to very confusing errors.

This is yet another problem --- if there is an error in the resulting
syntax, the error will be reported in terms of this result rather than
the syntax of the code. There is no easy way to tell where these errors
are coming from. For example, say that we make a common mistake: forget
the "`@`" character in the above macro:

    (define-macro (bind var-expr-list body)
      `((lambda ,(map car var-expr-list) ,body)
        ,(map cadr var-expr-list)))

Now, someone else (the client of this macro), tries to use it:

    > (bind ((x 1) (y 2)) (+ x y))
    procedure application: expected procedure,
    given: 1; arguments were: 2

Yes? Now what? Debugging this is difficult, since in most cases it is
not even clear that you were using a macro, and in any case the macro
comes from code that you have no knowledge of and no control over. [The
problem in this specific case is that the macro expands the code to:

    ((lambda (x y) (+ x y))
     (1 2))

so Racket will to use `1` as a function and throw a runtime error.]

Adding error checking to the macro results in this code:

    (define-macro (bind var-expr-list body)
      (if (andmap (lambda (var-expr)
                    (and (list? var-expr)
                         (= 2 (length var-expr))
                         (symbol? (car var-expr))))
                  var-expr-list)
        `((lambda ,(map car var-expr-list) ,body)
          ,@(map cadr var-expr-list))
        (error 'bind "bad syntax whaaaa!")))

Such checks are very important, yet writing this is extremely tedious.


------------------------------------------------------------------------
# Scheme (and Racket) Macros [Tuesday, April 1st]

Scheme, Racket included (and much extended), has a solution that is
better than `defmacro`: it has `define-syntax` and `syntax-rules`. First
of all, `define-syntax` is used to create the "magical connection"
between user code and some macro transformation code that does some
rewriting. This definition:

    (define-syntax foo
      ...something...)

makes `foo` be a special syntax that, when used in the head of an
expression, will lead to transforming the expression itself, where the
result of this transformation is what gets used instead of the original
expression. The "`...something...`" in this code fragment should be a
transformation function --- one that consumes the expression that is to
be transformed, and returns the new expression to run.

Next, `syntax-rules` is used to create such a transformation in an easy
way. The idea is that what we thought to be an informal specification of
such rewrites, for example:

* `let` can be defined as the following transformation:

      (let ((x v) ...) body ...)
      --> ((lambda (x ...) body ...) v ...)

* `let*` is defined with *two* transformation rules:

  1. (let* () body ...)
     --> (let () body ...)
  2. (let* ((x1 v1) (x2 v2) ...) body ...)
     --> (let ((x1 v1)) (let* ((x2 v2) ...) body ...))

can actually be formalized by automatically creating a syntax
transformation function from these rule specifications. (Note that this
example has round parentheses so we don't fall into the illusion that
square brackets are different: the resulting transformation would be the
same.) The main point is to view the left hand side as a *pattern* that
can match some forms of syntax, and the right hand side as producing an
output that can use some matched patterns.

`syntax-rules` is used with such rewrite specifications, and it produces
the corresponding transformation function. For example, this:

    (syntax-rules () ;*** ignore this "()" for now
      [(x y) (y x)])

evaluates to a function that is somewhat similar to:

    (lambda (expr)
      (if (and (list? expr) (= 2 (length expr)))
        (list (second expr) (first expr))
        (error "bad syntax")))

but `match` is a little closer, since it uses similar input patterns:

    (lambda (expr)
      (match expr
        [(list x y) (list y x)]
        [else (error "bad syntax")]))

Such transformations are used in a `define-syntax` expression to tie the
transformer back into the compiler by hooking it on a specific keyword.
You can now appreciate how all this work when you see how easy it is to
define macros that are very tedious with `define-macro`. For example,
the above `bind`:

    (define-syntax bind
      (syntax-rules ()
        [(bind ((x v) ...) body ...)
         ((lambda (x ...) body ...) v ...)]))

and `let*` with its two rules:

    (define-syntax let*
      (syntax-rules ()
        [(let* () body ...)
         (let () body ...)]
        [(let* ((x v) (xs vs) ...) body ...)
         (let ((x v)) (let* ((xs vs) ...) body ...))]))

These transformations are so convenient to follow, that Scheme
specifications (and reference manuals) describe forms by specifying
their definition. For example, the Scheme report, specifies `let*` as a
"derived form", and explains its semantics via this transformation.

The input patterns in these rules are similar to `match` patterns, and
the output patterns assemble an s-expression using the matched parts in
the input. For example:

    (x y) --> (y x)

does the thing you expect it to do --- matches a parenthesized form with
two sub-forms, and produces a form with the two sub-forms swapped. The
rules for "`...`" on the left side are similar to `match`, as we have
seen many times, and on the right side it is used to *splice* a matched
sequence into the resulting expression and it is required to use the
`...` for sequence-matched pattern variables. For example, here is a
list of some patterns, and a description of how they match an input when
used on the left side of a transformation rule and how they produce an
output expression when they appear on the right side:

* `(x ...)`

  > **LHS:** matches a parenthesized sequence of zero or more
  > expressions, and the `x` pattern variable is bound to this whole
  > sequence; `match` analogy: `(match ? [(list x ...) ?])`
  >
  > **RHS:** when `x` is bound to a sequence, this will produce a
  > parenthesized expression containing this sequence; `match` analogy:
  > `(match ? [(list x ...) x])`

* `(x1 x2 ...)`

  > **LHS:** matches a parenthesized sequence of one or more
  > expressions, the first is bound to `x1` and the rest of the sequence
  > is bound to `x2`;
  >
  > `match` analogy: `(match ? [(list x1 x2 ...) ?])`
  >
  > **RHS:** produces a parenthesized expression that contains the
  > expression bound to `x1` first, then all of the expressions in the
  > sequence that `x2` is bound to;
  >
  > `match` analogy: `(match ? [(list x1 x2 ...) (cons x1 x2)])`

* `((x y) ...)`

  > **LHS:** matches a parenthesized sequence of 2-form parenthesized
  > sequences, binding `x` to all the first forms of these, and `y` to
  > all the seconds of these (so they will both have the same number of
  > items);
  >
  > `match` analogy: `(match ? [(list (list x y) ...) ?])`
  >
  > **RHS:** produces a list of forms where each one is made of
  > consecutive forms in the `x` sequence and consecutive forms in the
  > `y` sequence (both sequences should have the same number of
  > elements);
  >
  > `match` analogy:
  >
  >     (match ? [(list (list x y) ...)
  >               (map (lambda (x y) (list x y)) x y)])

Some examples of transformations that would be very tedious to write
code manually for:

* `((x y) ...) --> ((y x) ...)`

  Matches a sequence of 2-item sequences, produces a similar sequence
  with all of the nested 2-item sequences flipped.

* `((x y) ...) --> ((x ...) (y ...))`

  Matches a similar sequence, and produces a sequence of two sequences,
  one of all the first items, and one of the second ones.

* `((x y ...) ...) --> ((y ... x) ...)`

  Similar to the first example, but the nested sequences can have 1 or
  more items in them, and the nested sequences in the result have the
  first element moved to the end. Note how the `...` are nested: the
  rule is that for each pattern variable you count how many `...`s apply
  to it, and that tells you what it holds --- and you have to use the
  same `...` nestedness for it in the output template.

------------------------------------------------------------------------

This is solving the problems of easy code --- no need for `list`, `cons`
etc, not even for quasiquotes and tedious syntax massaging. But there
were other problems. First, there was a problem of bad scope, one that
was previously solved with a `gensym`:

    (define-macro (orelse <expr1> <expr2>)
      (let ((temp (gensym)))
        `(let ((,temp ,<expr1>))
           (if ,temp ,temp ,<expr2>))))

Translating this to `define-syntax` and `syntax-rules` we get something
simpler:

    (define-syntax orelse
      (syntax-rules ()
        [(orelse <expr1> <expr2>)
         (let ((temp <expr1>))
           (if temp temp <expr2>))]))

Even simpler, Racket has a macro called `define-syntax-rule` that
expands to a `define-syntax` combined with a `syntax-rules` --- using
it, we can write:

    (define-syntax-rule (orelse <expr1> <expr2>)
      (let ((temp <expr1>))
        (if temp temp <expr2>)))

This looks like like a function --- but you must remember that it is a
transformation rule specification which is a *very* different beast, as
we'll see.

The main thing here is that Racket takes care of making bindings follow
the lexical scope rules:

    (let ([temp 4])
      (orelse #f temp))

works fine. In fact, it fully respects the scoping rules: there is no
confusion between bindings that the macro introduces and bindings that
are introduced where the macro is used. (Think about different colors
for bindings introduced by the macro and other bindings.) It's also fine
with many cases that are much harder to cope with otherwise (eg, cases
where there is no `gensym` magic solution):

    (let ([if +])
      (orelse 1 1))

    (let ([if +])
      (if (orelse 1 1) 10 100)) ; two different `if's here

or combining both:

    (let ([if #f] [temp 4])
      (orelse if temp))

(You can try DrRacket's macro debugger to see how the various bindings
get colored differently.)

`define-macro` advocates will claim that it is difficult to make a macro
that intentionally plants a *known* identifier. Think about a `loop`
macro that has an `abort` that can be used inside its body. Or an
`if-it` form that is like `if`, but makes it possible to use the
condition's value in the "then" branch as an `it` binding. It is
possible with all Scheme macro systems to "break hygiene" in such ways,
and we will later see how to do this in Racket. However, Racket also
provides a better way to deal with such problems (think about `it` being
always "bound to a syntax error", but locally rebound in an `if-it`
form).

Scheme macros are said to be *hygienic* --- a term used to specify that
they respect lexical scope. There are several implementations of
hygienic macro systems across Scheme implementations, Racket uses the
one known as "syntax-case system", named after the `syntax-case`
construct that we discuss below.

> All of this can get much more important in the presence of a module
> system, since you can write a module that provides transformations
> rules, not just values and functions. This means that the concept of
> "a library" in Racket is something that doesn't exist in other
> languages: it's a library that has values, functions, as well as
> macros (or, "compiler plugins").

------------------------------------------------------------------------

The way that Scheme implementations achieve hygiene in a macro system is
by making it deal with more than just raw S-expressions. Roughly
speaking, it deals with syntax objects that are sort of a wrapper
structure around S-expression, carrying additional information. The
important part of this information when it gets to dealing with hygiene
is the "lexical scope" --- which can roughly be described as having
identifiers be represented as symbols plus a "color" which represents
the scope. This way such systems can properly avoid confusing
identifiers with the same name that come from different scopes.

There was also the problem of making debugging difficult, because a
macro can introduce errors that are "coming out of nowhere". In the
implementation that we work with, this is solved by adding yet more
information to these syntax objects --- in addition to the underlying
S-expression and the lexical scope, they also contain source location
information. This allows Racket (and DrRacket) to locate the source of a
specific syntax error, so locating the offending code is easy.
DrRacket's macro debugger heavily relies on this information to provide
a very useful tool --- since writing macros can easily become a hard
job.

Finally, there was the problem of writing bad macros. For example, it is
easy to forget that you're dealing with a macro definition and write:

    (define-syntax-rule (twice x) (+ x x))

just because you want to inline the addition --- but in this case you
end up duplicating the input expression which can have a disastrous
effect. For example:

    (twice (twice (twice (twice (twice (twice (twice (twice 1))))))))

expands to a *lot* of code to compile.

Another example is:

    (define-syntax-rule (with-increment var expr)
      (let ([var (add1 var)]) expr))
    ...
    (with-increment (* foo 2)
      ...code...)

the problem here is that (* foo 2) will be used as an identifier to be
bound by the `let` expression --- which can lead to a confusing syntax
error.

Racket provides many tools to help macro programmers --- in addition to
a user-interface tool like the macro debugger there are also
programmer-level tools where you can reject an input if it doesn't
contain an identifier at a certain place etc. Still, writing macros is
much harder than writing functions --- some of these problems are
inherent to the problem that macros solve; for example, you may *want* a
`twice` macro that replicates an expression. By specifying a
transformation to the core language, a macro writer has full control
over which expressions get evaluated and how, which identifiers are
binding instances, and how is the scope of the given expression is
shaped.


------------------------------------------------------------------------
# Meta Macros [Tuesday, April 1st]

One of the nice results of `syntax-rules` dealing with the subtle points
of identifiers and scope is that things works fine even when we "go up a
level". For example, the short `define-syntax-rule` form that we've seen
is *itself* a defined as a simple macro:

    (define-syntax define-syntax-rule
      (syntax-rules ()
        [(define-syntax-rule (name P ...) B)
         (define-syntax name
           (syntax-rules ()
             [(name P ...) B]))]))

In fact, this is very similar to something that we have already seen:
the `rewrite` form that we have used in Schlac is implemented in just
this way. The only difference is that `rewrite` requires an actual `=>`
token to separate the input pattern from the output template. If we just
use it in a syntax rule:

    (define-syntax rewrite
      (syntax-rules ()
        [(rewrite (name P ...) => B)
         (define-syntax name
           (syntax-rules ()
             [(name P ...) B]))]))

it won't work. Racket treats the above `=>` just like any identifier,
which in this case acts as a pattern variable which matches anything.
The solution to this is to list the `=>` as a keyword which is expected
to appear in the macro use as-is --- and that's what the mysterious `()`
of `syntax-rules` is used for: any identifier listed there is taken to
be such a keyword. This makes the following version

    (define-syntax rewrite
      (syntax-rules (=>)
        [(rewrite (name P ...) => B)
         (define-syntax name
           (syntax-rules ()
             [(name P ...) B]))]))

do what we want and throw a syntax error unless `rewrite` is used with
an actual `=>` in the proper place.


------------------------------------------------------------------------
# Lazy Constructions in an Eager Language [Tuesday, April 1st]

> [PLAI §37] (has some examples)

This is not really lazy evaluation, but it gets close, and provides the
core useful property of easy-to-use infinite lists.

    (define-syntax-rule (cons-stream x y)
      (cons x (lambda () y)))
    (define stream? pair?)
    (define null-stream null)
    (define null-stream? null?)
    ;; note that there are not proper lists in racket,
    ;; so we use car and cdr here
    (define stream-first car)
    (define (stream-rest s) ((cdr s)))

Using it:

    (define ones (cons-stream 1 ones))
    (define (stream-map f s)
      (if (null-stream? s)
        null-stream
        (cons-stream (f (stream-first s))
                     (stream-map f (stream-rest s)))))
    (define (stream-map2 f s1 s2)
      (if (null-stream? s1)
        null-stream
        (cons-stream (f (stream-first s1) (stream-first s2))
                     (stream-map2 f (stream-rest s1)
                                    (stream-rest s2)))))
    (define ints (cons-stream 0 (stream-map2 + ones ints)))

Actually, all Scheme implementations come with a generalized tool for
(local) laziness: a `delay` form that delays computation of its body
expression, and a `force` function that forces such promises. Here is a
naive implementation of this:

    (define-type promise
      [make-promise (-> Any)])

    (define-syntax-rule (delay expr)
      (make-promise (lambda () expr)))

    (define (force p)
      (cases p [(make-promise thunk) (thunk)]))

Proper definitions of `delay`/`force` cache the result --- and practical
ones can get pretty complex, for example, in order to allow tail calls
via promises.


------------------------------------------------------------------------
# Recursive Macros [Tuesday, April 1st]

Syntax transformations can be recursive. For example, we have seen how
`let*` can be implemented by a transformation that uses two rules, one
of which expands to another use of `let*`:

    (define-syntax let*
      (syntax-rules ()
        [(let* () body ...)
         (let () body ...)]
        [(let* ((x v) (xs vs) ...) body ...)
         (let ((x v)) (let* ((xs vs) ...) body ...))]))

When Racket expands a `let*` expression, the result may contain a new
`let*` which needs extending as well. An important implication of this
is that recursive macros are fine, as long as the recursive case is
using a *smaller* expression. This is just like any form of recursion
(or loop), where you need to be looping over a `well-founded` set of
values --- where each iteration uses a new value that is closer to some
base case.

For example, consider the following macro:

    (define-syntax-rule (while condition body ...)
      (when condition
        body ...
        (while condition body ...)))

It seems like this is a good implementation of a `while` loop --- after
all, if you were to implement it as a function using thunks, you'd write
very similar code:

    (define (while condition-thunk body-thunk)
      (when (condition-thunk)
        (body-thunk)
        (while condition-thunk body-thunk)))

But if you look at the nested `while` form in the transformation rule,
you'll see that it is exactly the same as the input form. This means
that this macro can never be completely expanded --- it specifies
infinite code! In practice, this makes the (Racket) compiler loop
forever, consuming more and more memory. This is unlike, for example,
the recursive `let*` rule which uses one less binding-value pair than
specified as its input.

The reason that the function version of `while` is fine is that it
iterates using the *same* code, and the condition thunk will depend on
some state that converges to a base case (usually the body thunk will
perform some side-effects that makes the loop converge). But in the
macro case there is *no* evaluation happening, if the transformed syntax
contains the same input pattern, we end up having a macro that expands
infinitely.

The correct solution for a `while` macro is therefore to use plain
recursion using a local recursive function:

    (define-syntax-rule (while condition body ...)
      (letrec ([loop (lambda ()
                       (when condition
                         body ...
                         (loop)))])
        (loop)))

A popular way to deal with macros like this that revolve around a
specific control flow is to separate them into a function that uses
thunks, and a macro that does nothing except wrap input expressions as
thunks. In this case, we get this solution:

    (define (while/proc condition-thunk body-thunk)
      (when (condition-thunk)
        (body-thunk)
        (while/proc condition-thunk body-thunk)))

    (define-syntax-rule (while condition body ...)
      (while/proc (lambda () condition)
                  (lambda () body ...)))


------------------------------------------------------------------------
## Another example: a simple loop [Tuesday, April 1st]

Here is an implementation of a macro that does a simple arithmetic loop:

    (define-syntax for
      (syntax-rules (= to do)
        [(for x = m to n do body ...)
         (letrec ([loop (lambda (x)
                          (when (<= x n)
                            body ...
                            (loop (+ x 1))))])
           (loop m))]))

(Note that this is not complete code: it suffers from the usual problem
of multiple evaluations of the `n` expression. We'll deal with it soon.)

This macro combines both control flow and lexical scope. Control flow is
specified by the loop (done, as usual in Racket, as a tail-recursive
function) --- for example, it determines how code is iterated, and it
also determines what the `for` form will evaluate to (it evaluates to
whatever `when` evaluates to, the void value in this case). Scope is
also specified here, by translating the code to a function --- this code
makes `x` have a scope that covers the body so this is valid:

    (for i = 1 to 3 do (printf "i = ~s\n" i))

but it also makes the boundary expression `n` be in this scope, making
this:

    (for i = 1 to (if (even? i) 10 20) do (printf "i = ~s\n" i))

valid. In addition, while evaluating the condition on each iteration
might be desirable, in most cases it's not --- consider this example:

    (for i = 1 to (read) do (printf "i = ~s\n" i))

This is easily solved by using a `let` to make the expression evaluate
just once:

    (define-syntax for
      (syntax-rules (= to do)
        [(for x = m to n do body ...)
         (let ([m* m]  ; execution order
               [n* n])
           (letrec ([loop (lambda (x)
                            (when (<= x n*)
                              body ...
                              (loop (+ x 1))))])
             (loop m*)))]))

which makes the previous use result in a "`reference to undefined
identifier: i`" error.

Furthermore, the fact that we have a hygienic macro system means that it
is perfectly fine to use nested `for` expressions:

    (for a = 1 to 9 do
      (for b = 1 to 9 do (printf "~s,~s " a b))
      (newline))

The transformation is, therefore, completely specifying the semantics of
this new form.

Extending this syntax is easy using multiple transformation rules ---
for example, say that we want to extend it to have a `step` optional
keyword. The standard idiom is to have the step-less pattern translated
into one that uses `step 1`:

    (for x = m to n do body ...)
    --> (for x = m to n step 1 do body ...)

Usually, you should remember that `syntax-rules` tries the patterns one
by one until a match is found, but in this case there is no problems
because the keywords make the choice unambiguous:

    (define-syntax for
      (syntax-rules (= to do step)
        [(for x = m to n do body ...)
         (for x = m to n step 1 do body ...)]
        [(for x = m to n step d do body ...)
         (let ([m* m]
               [n* n]
               [d* d])
           (letrec ([loop (lambda (x)
                            (when (<= x n*)
                              body ...
                              (loop (+ x d*))))])
             (loop m*)))]))

    (for i = 1 to 10 do (printf "i = ~s\n" i))
    (for i = 1 to 10 step 2 do (printf "i = ~s\n" i))

We can even extend it to do a different kind of iteration, for example,
iterate over list:

    (define-syntax for
      (syntax-rules (= to do step in)
        [(for x = m to n do body ...)
         (for x = m to n step 1 do body ...)]
        [(for x = m to n step d do body ...)
         (let ([m* m]
               [n* n]
               [d* d])
           (letrec ([loop (lambda (x)
                            (when (<= x n*)
                              body ...
                              (loop (+ x d*))))])
             (loop m*)))]
        ;; list
        [(for x in l do body ...)
         (for-each (lambda (x) body ...) l)]))

    (for i in (list 1 2 3 4) do (printf "i = ~s\n" i))

    (for i in (list 1 2 3 4) do
      (for i = 0 to i do (printf "i = ~s  " i))
      (newline))


------------------------------------------------------------------------
## Yet Another: List Comprehension [Tuesday, April 1st]

At this point it's clear that macros are a powerful language feature
that makes it relatively easy to implement new features, making it a
language that is easy to use as a tool for quick experimentation with
new language features. As an example of a practical feature rather than
a toy, let's see how we can implement [Python's list comprehenions].
These are expressions that conveniently combine `map`, `filter`, and
nested uses of both.

First, a simple implementation that uses only the `map` feature:

    (define-syntax list-of
      (syntax-rules (for in)
        [(list-of EXPR for ID in LIST)
         (map (lambda (ID) EXPR)
              LIST)]))

    (list-of (* x x) for x in (range 10))

It is a good exercise to see how everything that we've seen above plays
a role here. For example, how we get the `ID` to be bound in `EXPR`.

Next, add a condition expression with an `if` keyword, and implemented
using a `filter`:

    (define-syntax list-of
      (syntax-rules (for in if)
        [(list-of EXPR for ID in LIST if COND)
         (map (lambda (ID) EXPR)
              (filter (lambda (ID) COND) LIST))]
        [(list-of EXPR for ID in LIST)
         (list-of EXPR for ID in LIST if #t)]))

    (list-of (* x x) for x in (range 10) if (odd? x))

Again, go over it and see how the binding structure makes the identifier
available in both expressions. Note that since we're just playing around
we're not paying too much attention to performance etc. (For example, if
we cared, we could have implemented the `if`-less case by not using
`filter` at all, or we could implement a `filter` that accepts `#t` as a
predicate and in that case just returns the list, or even implementing
it as a macro that identifies a `(lambda (_) #t)` pattern and expands to
just the list (a bad idea in general).)

The last step: Python's comprehension accepts multiple `for`-`in`s for
nested loops, possibly with `if` filters at each level:

    (define-syntax list-of
      (syntax-rules (for in if)
        [(list-of EXPR for ID in LIST if COND)
         (map (lambda (ID) EXPR)
              (filter (lambda (ID) COND) LIST))]
        [(list-of EXPR for ID in LIST)
         (list-of EXPR for ID in LIST if #t)]
        [(list-of EXPR for ID in LIST for MORE ...)
         (list-of EXPR for ID in LIST if #t for MORE ...)]
        [(list-of EXPR for ID in LIST if COND for MORE ...)
         (apply append (map (lambda (ID) (list-of EXPR for MORE ...))
                            (filter (lambda (ID) COND) LIST)))]))

A collection of examples that I found in the Python docs and elsewhere,
demonstrating all of these:

    ;; [x**2 for x in range(10)]
    (list-of (* x x) for x in (range 10))
    ;; [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
    (list-of (list x y) for x in '(1 2 3) for y in '(3 1 4)
                        if (not (= x y)))

    (define (round-n x n) ; python-like round to n digits
      (define 10^n (expt 10 n))
      (/ (round (* x 10^n)) 10^n))
    ;; [str(round(pi, i)) for i in range(1, 6)]
    (list-of (number->string (round-n pi i)) for i in (range 1 6))

    (define matrix
      '((1 2 3 4)
        (5 6 7 8)
        (9 10 11 12)))
    ;; [[row[i] for row in matrix] for i in range(4)]
    (list-of (list-of (list-ref row i) for row in matrix)
             for i in (range 4))

    (define text '(("bar" "foo" "fooba")
                   ("Rome" "Madrid" "Houston")
                   ("aa" "bb" "cc" "dd")))
    ;; [y for x in text if len(x)>3 for y in x]
    (list-of y for x in text if (> (length x) 3) for y in x)
    ;; [y for x in text for y in x if len(y)>4]
    (list-of y for x in text for y in x if (> (string-length y) 4))
    ;; [y.upper() for x in text if len(x) == 3
    ;;            for y in x if y.startswith('f')]
    (list-of (string-upcase y) for x in text if (= (length x) 3)
                               for y in x if (regexp-match? #rx"^f" y))

[Python's list comprehenions]:
  https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions


------------------------------------------------------------------------
# Problems of `syntax-rules` Macros [Tuesday, April 1st]

As we've seen, using `syntax-rules` solves many of the problems of
macros, but it comes with a high price tag: the macros are "just"
rewrite rules. As rewrite rules they're pretty sophisticated, but it
still loses a huge advantage of what we had with `define-macro` --- the
macro code is no longer Racket code but a simple language of rewrite
rules.

There are two big problems with this which we will look into now.
(DrRacket's macro stepper tool can be very useful in clarifying these
examples.) The first problem is that in some cases we want to perform
computations at the macro level --- for example, consider a `repeat`
macro that needs to expand like this:

    (repeat 1 E)  -->  (begin E)
    (repeat 2 E)  -->  (begin E E)
    (repeat 3 E)  -->  (begin E E E)
    ...

With a `syntax-rules` macro we can match over specific integers, but we
just cannot do this with *any* integer. Note that this specific case can
be done better via a function --- better by not replicating the
expression:

    (define (repeat/proc n thunk)
      (when (> n 0) (thunk) (repeat/proc (sub1 n) thunk)))
    (define-syntax-rule (repeat N E)
      (repeat/proc N (lambda () E)))

or even better, assuming the above `for` is already implemented:

    (define-syntax-rule (repeat N E)
      (for i = 1 to N do E))

But still, we want to have the ability to do such computation. A
similar, and perhaps better example, is better error reporting. For
example, the above `for` implementation blindly expands its input, so:

    > (for 1 = 1 to 3 do (printf "i = ~s\n" i))
    lambda: not an identifier in: 1

we get a bad error message in terms of `lambda`, which is breaking
abstraction (it comes from the expansion of `for`, which is an
implementation detail), and worse --- it is an error about something
that the user didn't write.

Yet another aspect of this problem is that sometimes we need to get
creative solutions where it would be very simple to write the
corresponding Racket code. For example, consider the problem of writing
a `rev-app` macro --- (rev-app F E ...) should evaluate to a function
similar to (F E ...), except that we want the evaluation to go from
right to left instead of the usual left-to-right that Racket does. This
code is obviously very broken:

    (define-syntax-rule (rev-app F E ...)
      (let (reverse ([x E] ...))
        (F x ...)))

because it *generates* a malformed `let` form --- there is no way for
the macro expander to somehow know that the `reverse` should happen at
the transformation level. In this case, we can actually solve this using
a helper macro to do the reversing:

    (define-syntax-rule (rev-app F E ...)
      (rev-app-helper F (E ...) ()))
    (define-syntax rev-app-helper
      (syntax-rules ()
        ;; this rule does the reversing, collecting the reversed
        ;; sequence in the last part
        [(rev-app-helper F (E0 E ...) (E* ...))
         (rev-app-helper F (E ...) (E0 E* ...))]
        ;; and this rule fires up when we're done with the reversal
        [(rev-app-helper F () (E ...))
         (let ([x E] ...)
           (F x ...))]))

There are still problems with this --- it complains about `x ...`
because there is a single `x` there rather than a sequence of them; and
even if it did somehow work, we also need the `x`s in that last line in
the original order rather than the reversed one. So the solution is
complicated by collecting new `x`s while reversing --- and since we need
them in both orders, we're going to collect both orders:

    (define-syntax-rule (rev-app F E ...)
      (rev-app-helper F (E ...) () () ()))
    (define-syntax rev-app-helper
      (syntax-rules ()
        ;; this rule does the reversing, collecting the reversed
        ;; sequence in the last part -- also make up new identifiers
        ;; and collect them in *both* directions (`X' is the straight
        ;; sequence of identifiers, `X*' is the reversed one, and `E*'
        ;; is the reversed expression sequence); note that each
        ;; iteration introduces a new identifier called `t'
        [(rev-app-helper F (E0 E ...) (X ...  ) (  X* ...) (   E* ...))
         (rev-app-helper F (   E ...) (X ... t) (t X* ...) (E0 E* ...))]
        ;; and this rule fires up when we're done with the reversal and
        ;; the generation
        [(rev-app-helper F () (x ...) (x* ...) (E* ...))
         (let ([x* E*] ...)
           (F x ...))]))

    ;; see that it works
    (define (show x) (printf ">>> ~s\n" x) x)
    (rev-app list (show 1) (show 2) (show 3))

So, this worked, but in this case the simplicity of the `syntax-rules`
rewrite language worked against us, and made a very inconvenient
solution. This could have been much easier if we could just write a
"meta-level" reverse, and a use of `map` to generate the names.

... And all of that was just the first problem. The second one is even
harder: `syntax-rules` is *designed* to avoid all name captures, but
what if we *want* to break hygiene? There are some cases where you want
a macro that "injects" a user-visible identifier into its result. The
most common (and therefore the classic) example of this is an anaphoric
`if` macro, that binds `it` to the result of the test (which can be any
value, not just a boolean):

    ;; find the element of `l' that is immediately following `x'
    ;; (assumes that if `x' is found, it is not the last one)
    (define (after x l)
      (let ([m (member x l)])
        (if m
          (second m)
          (error 'after "~s not found in ~s" x l))))

which we want to turn to:

    ;; find the element of `l' that is immediately following `x'
    ;; (assumes that if `x' is found, it is not the last one)
    (define (after x l)
      (if (member x l)
        (second it)
        (error 'after "~s not found in ~s" x l)))

The obvious definition of `if-it' doesn't work:

    (define-syntax-rule (if-it E1 E2 E3)
      (let ([it E1]) (if it E2 E3)))

The reason it doesn't work should be obvious now --- it is *designed* to
avoid the `it` that the macro introduced from interfering with the `it`
that the user code uses.

Next, we'll see Racket's "low level" macro system, which can later be
used to solve these problems.


------------------------------------------------------------------------
# Racket's "Low-Level" Macros [Tuesday, April 1st]

As we've previously seen, `syntax-rules` *creates* transformation
functions --- but there are other more direct ways to write these
functions. These involve writing a function directly rather than
creating one with `syntax-rules` --- and because this is a more
low-level approach than using `syntax-rules` to generate a transformer,
it is called a "low level macro system". All Scheme implementations have
such low-level systems, and these systems vary from one to the other.
They all involve some particular type that is used as "syntax"
--- this type is always related to S-expressions, but it cannot be the
simple `define-macro` tool that we've seen earlier if we want to avoid
the problems of capturing identifiers.

Historical note: For a very long time the Scheme standard had avoided a
concrete specification of this low-level system, leaving `syntax-rules`
as the only way to write portable Scheme code. This had lead some people
to explore more thoroughly the things that can be done with just
`syntax-rules` rewrites, even beyond the examples we've seen. As it
turns out, there's a lot that can be done with it --- in fact, it is
possible to write rewrite rules that [implement a lambda
calculus](http://www.phyast.pitt.edu/~micheles/syntax-rules.pdf), making
it possible to write things that look kind of like "real code". This is,
however, awkward to say the least, and redundant with a macro system
that can use the full language for arbitrary computations. It has also
became less popular recently, since R6RS dictates something that is
known as a "syntax-case macro system" (not really a good name, since
`syntax-case` is just a tool in this system).

Racket uses an extended version of this `syntax-case` system, which is
what we will discuss now. In the Racket macro system, "syntax" is a new
type, not just S-expressions as is the case with `define-macro`. The way
to think about this type is as a wrapper around S-expressions, where the
S-expression is the "raw" symbolic form of the syntax, and a bunch of
"stuff" is added. Two important bits of this "stuff" are the source
location information for the syntax, and its lexical scope. The source
location is what you'd expect: the file name for the syntax (if it was
read from a file), its position in the file, and its line and column
numbers; this information is mostly useful for reporting errors. The
lexical scope information is used in a somewhat peculiar way: there is
no direct way to access it, since usually you don't need to do so ---
instead, for the rare cases where you do need to manipulate it, you
*copy* the lexical scope of an existing syntax to create another. This
allows the macro interface to be usable without specification of a
specific representation for the scope.

The main piece of functionality in this system is `syntax-case` (which
has lead to its common name) --- a form that is used to deconstruct the
input via pattern-matching similar to `syntax-rules`. In fact, the
syntax of `syntax-case` looks very similar to the syntax of
`syntax-rules` --- there are zero or more parenthesized keywords, and
then clauses that begin with the same kind of patterns to match the
syntax against. The first obvious difference is that the syntax to be
matched is given explicitly:

    (syntax-case <value-to-match> (<keywords>)
      [<pattern> <result>]
      ...)

A macro is written as a plain function, usually used as the value in a
`define-syntax` form (but it could also be used in plain helper
functions). For example, here's how the `orelse` macro is written using
this:

    (define-syntax orelse
      (lambda (stx)
        (syntax-case stx ()
          [(orelse x y) ???])))

Racket's `define-syntax` can also use the same syntactic sugar for
functions as `define`:

    (define-syntax (orelse stx)
      (syntax-case stx ()
        [(orelse x y) ???]))

The second significant change from `syntax-rules` is that the
right-hand-side expressions in the branches are not patterns. Instead,
they're plain Racket expressions. In this example (as in most uses of
`syntax-case`) the result of the `syntax-case` form is going to be the
result of the macro, and therefore it should return a piece of syntax.
So far, the only piece of syntax that we see in this code is just the
input `stx` --- and returning that will get the macro expander in an
infinite loop (because we're essentially making a transformer for
`orelse` expressions that expands the syntax to itself).

To return a *new* piece of syntax, we need a way to write new syntax
values. The common way to do this is using a new special form: `syntax`.
This form is similar to `quote` --- except that instead of an
S-expression, it results in a syntax. For example, in this code:

    (define-syntax (orelse stx)
      (printf "Expanding ~s\n" stx)
      (syntax-case stx ()
        [(orelse x y) (syntax (printf "Running an orelse\n"))]))

the first printout happens during macro expansion, and the second is
part of the generated code. Like `quote`, `syntax` has a convenient
abbreviation --- "`#'`":

    (define-syntax (orelse stx)
      (printf "Expanding ~s\n" stx)
      (syntax-case stx ()
        [(orelse x y) #'(printf "Running an orelse\n")]))

Now the question is how we can get the actual macro working. The thing
is that `syntax` is not completely quoting its contents as a syntax ---
there could be some meta identifiers that are bound as "pattern
variables" in the `syntax-case` pattern that was matched for the current
clause --- in this case, we have `x` and `y` as such pattern variables.
(Actually, `orelse` is a pattern variable too, but this doesn't matter
for our purpose.) Using these inside a `syntax` will have them replaced
by the syntax that they matched against. The complete `orelse`
definition is therefore very easy:

    (define-syntax (orelse stx)
      (syntax-case stx ()
        [(orelse <expr1> <expr2>)
         #'(let ((temp <expr1>))
             (if temp temp <expr2>))]))

The same treatment of `...` holds here too --- in the matching pattern
they specify 0 or more occurrences of the preceding pattern, and in the
output template they mean that the matching sequence is "spliced" in.
Note that `syntax-rules` is now easy to define as a macro that expands
to a function that uses `syntax-case` to do the actual rewrite work:

    (define-syntax (syntax-rules stx)
      (syntax-case stx ()
        [(syntax-rules (keyword ...)
           [pattern template]
           ...)
         #'(lambda (stx)
             (syntax-case stx (keyword ...)
               [pattern #'template]
               ...))]))


------------------------------------------------------------------------
## Solving the `syntax-rules` problems [extra] [Tuesday, April 1st]

So far it looks like we didn't do anything new, but the important change
is already in: the fact that the results of a macro is a plain Racket
expression mean that we can now add more API functionality for dealing
with syntax values. There is no longer a problem with running
"meta-level" code vs generated runtime code: anything that is inside a
`syntax` (anything that is quoted with a "`#'`") is generated code, and
the rest is code that is executed when the macro expands. We will now
introduce some of the Racket macro API by demonstrating the solutions to
the `syntax-rules` problem that were mentioned earlier.

First of all, we've talked about the problem of reporting good errors.
For example, make this:

    (for 1 = 1 to 3 do ...)

throw a proper error instead of leaving it for `lambda` to complain
about. To make it easier to play with, we'll use a simpler macro:

    (define-syntax fun
      (syntax-rules (->)
        [(_ id -> E) (lambda (id) E)])) ; _ matches the head `fun'

and using an explicit function:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E) #'(lambda (id) E)]))

One of the basic API functions is `syntax-e` --- it takes in a syntax
value and returns the S-expression that it wraps. In this case, we can
pull out the identifier from this, and check that it is a valid
identifier using `symbol?` on what it wraps:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E)
         (if (symbol? (syntax-e (cadr (syntax-e stx))))
           #'(lambda (id) E)
           (error 'fun "bad syntax: expecting an identifier, got ~s"
                       (cadr (syntax-e stx))))]))

The error is awkward though --- it doesn't look like the usual kind of
syntax errors that Racket throws: it's shown in an ugly way, and its
source is not properly highlighted. A better way to do this is to use
`raise-syntax-error' --- it takes an error message, the offending
syntax, and the offending part of this syntax:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E)
         (if (symbol? (syntax-e (cadr (syntax-e stx))))
           #'(lambda (id) E)
           (raise-syntax-error
            'fun "bad syntax: expecting an identifier"
            stx (cadr (syntax-e stx))))]))

Another inconvenient issue is with pulling out the identifier. Consider
that `#'(lambda (id) E)` is a new piece of syntax that has the supposed
identifier in it --- we pull it from that instead of from `stx`, but it
would be even easier with `#'(id)`, and even easier than that with just
`#'id` which will be just the identifier:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E)
         (if (symbol? (syntax-e #'id))
           #'(lambda (id) E)
           (raise-syntax-error
            'fun "bad syntax: expecting an identifier"
            stx #'id))]))

Also, checking that something is an identifier is common enough that
there is another predicate for this (the combination of `syntax-e` and
`symbol?`) --- `identifier?`:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E)
         (if (identifier? #'id)
           #'(lambda (id) E)
           (raise-syntax-error
            'fun "bad syntax: expecting an identifier"
            stx #'id))]))

As a side note, checking the input pattern for validity is very common,
and in some cases might be needed to discriminate patterns (eg, one
result when `id` is an identifier, another when it's not). For this,
`syntax-cases` clauses have "guard expressions" --- so we can write the
above more simply as:

    (define-syntax (fun stx)
      (syntax-case stx (->)
        [(_ id -> E)
         (identifier? #'id)
         #'(lambda (id) E)]))

This, however, produces a less informative "bad syntax" error, since
there is no way to tell what the error message should be. (There is a
relatively new Racket tool called `syntax-parse` where such requirements
can be specified and a proper error message is generated on bad inputs.)

We can now resolve the `repeat` problem --- create a `(repeat N E)`
macro:

    (define-syntax (repeat stx)
      (define (n-copies n expr)
        (if (> n 0) (cons expr (n-copies (sub1 n) expr)) null))
      (syntax-case stx ()
        [(_ N E)
         (integer? (syntax-e #'N))
         #'(begin (n-copies (syntax-e #'N) #'E))]))

(Note that we can define an internal helper function, just like we do
with plain functions.) But this doesn't quite work (and if you try it,
you'll see an interesting error message) --- the problem is that we're
*generating* code with a call to `n-copies` in it, instead of actually
calling it. The problem is that we need to take the list that `n-copies`
generates, and somehow "plant" it in the resulting syntax. So far the
only things that were planted in it are pattern variables --- and we can
actually use another `syntax-case` to do just that: match the result of
`n-copies` against a pattern variable, and then use that variable in the
final syntax:

    (define-syntax (repeat stx)
      (define (n-copies n expr)
        (if (> n 0) (cons expr (n-copies (sub1 n) expr)) null))
      (syntax-case stx ()
        [(_ N E)
         (number? (syntax-e #'N))
         (syntax-case (n-copies (syntax-e #'N) #'E) ()
           [(expr ...) #'(begin expr ...)])]))

This works --- but one thing to note here is that `n-copies` returns a
list, not a syntax. The thing is that `syntax-case` will automatically
"coerce" S-expressions into a syntax in some way, easy to do in this
case since we only care about the elements of the list, and those are
all syntaxes.

However, this use of `syntax-case` as a pattern variable binder is
rather indirect, enough that it's hard to read the code. Since this is a
common use case, there is a shorthand for that too: `with-syntax`. It
looks as a kind of a `let`-like form, but instead of binding plain
identifiers, it binds pattern identifiers --- and in fact, the things to
be bound are themselves patterns:

    (define-syntax (repeat stx)
      (define (n-copies n expr)
        (if (> n 0) (cons expr (n-copies (sub1 n) expr)) null))
      (syntax-case stx ()
        [(_ N E)
         (number? (syntax-e #'N))
         (with-syntax ([(expr ...) (n-copies (syntax-e #'N) #'E)])
           #'(begin expr ...))]))

Note that there is no need to implement `with-syntax` as a primitive
form --- it is not too hard to implement it as a macro that expands to
the actual use of `syntax-case`. (In fact, you can probably guess now
that the Racket core language is much smaller than it seems, with large
parts that are implemented as layers of macros.)

There is one more related group of shorthands that is relevant here:
`quasisyntax`, `unsyntax`, and `unsyntax-splicing`. These are analogous
to the quoting forms by the same names, and they have similar
shorthands: "`` #` ``", "`#,`" and "`#,@`". They could be used to
implement this macro:

    (define-syntax (repeat stx)
      (define (n-copies n expr)
        (if (> n 0) (cons expr (n-copies (sub1 n) expr)) null))
      (syntax-case stx ()
        [(_ N E)
         (number? (syntax-e #'N))
         #`(begin #,@(n-copies (syntax-e #'N) #'E))]))

[As you might suspect now, these new forms are also implemented as
macros, which expand to the corresponding uses of `with-syntax`, which
in turn expand into `syntax-case` forms.]

We now have almost enough machinery to implement the `rev-app` macro,
and compare it to the original (complex) version that used
`syntax-rules`. The only thing that is missing is a way to generate a
number of new identifiers --- which we achieved earlier by a number of
macro expansion (each expansion of a macro that has a new identifier `x`
will have this identifier different from other expansions, which is why
it worked). Racket has a function for this: `generate-temporaries`.
Since it is common to generate temporaries for input syntaxes, the
function expects an input syntax that has a list as its S-expression
form (or a plain list).

    (define-syntax (rev-app stx)
      (syntax-case stx ()
        [(_ F E ...)
         (let ([temps (generate-temporaries #'(E ...))])
           (with-syntax ([(E* ...) (reverse (syntax-e #'(E ...)))]
                         [(x  ...) temps]
                         [(x* ...) (reverse temps)])
             #'(let ([x* E*] ...)
                 (F x ...))))]))

    ;; see that it works
    (define (show x) (printf ">>> ~s\n" x) x)
    (rev-app list (show 1) (show 2) (show 3))

Note that this is not shorter than the `syntax-rules` version, but it is
easier to read since `reverse` and `generate-temporaries` have an
obvious direct intention, eliminating the need to wonder through rewrite
rules and inferring how they do their work. In addition, this macro
expands in one step (use the macro stepper to compare it with the
previous version), which makes it much more efficient.


------------------------------------------------------------------------
## Breaking Hygiene, How Bad is it? [extra] [Tuesday, April 1st]

We finally get to address the second deficiency of `syntax-rules` ---
its inability to intentionally capture an identifier so it is visible in
user code. Let's start with the simple version, the one that didn't
work:

    (define-syntax-rule (if-it E1 E2 E3)
      (let ([it E1]) (if it E2 E3)))

and translate it to `syntax-case`:

    (define-syntax (if-it stx)
      (syntax-case stx ()
        [(if-it E1 E2 E3)
         #'(let ([it E1]) (if it E2 E3))]))

The only problem here is that the `it` identifier is introduced by the
macro, or more specifically, by the `syntax` form that makes up the
return syntax. What we need now is a programmatic way to create an
identifier with a lexical context that is different than the default. As
mentioned above, Racket's syntax system (and all other `syntax-case`
systems) doesn't provide a direct way to manipulate the lexical context.
Instead, it provides a way to create a piece of syntax by copying the
lexical scope of another one --- and this is done with the
`datum->syntax` function. The function consumes a syntax value to get
the lexical scope from, and a "datum" which is an S-expression that can
contain syntax values. The result will have these syntax values as given
on the input, but raw S-expressions will be converted to syntaxes, using
the given lexical context. In the above case, we need to convert an `it`
symbol into the same-named identifier, and we can do that using the
lexical scope of the input syntax. As we've seen before, we use
`with-syntax` to inject the new identifier into the result:

    (define-syntax (if-it stx)
      (syntax-case stx ()
        [(if-it E1 E2 E3)
         (with-syntax ([it (datum->syntax stx 'it)])
           #'(let ([it E1]) (if it E2 E3)))]))

We can even control the scope of the user binding --- for example, it
doesn't make much sense to have `it` in the `else` branch. We can do
this by first binding a plain (hygienic) identifier to the result, and
only bind `it` to that when needed:

    (define-syntax (if-it stx)
      (syntax-case stx ()
        [(if-it E1 E2 E3)
         (with-syntax ([it (datum->syntax stx 'it)])
           #'(let ([tmp E1]) (if tmp (let ([it tmp]) E2) E3)))]))

[A relevant note: Racket provides something that is known as "The Macro
Writer's Bill of Rights" --- in this case, it guarantees that the extra
`let` does not imply a runtime or a memory overhead.]

This works --- and it's a popular way for creating such user-visible
bindings. However, breaking hygiene this way can lead to some confusing
problems. Such problems are usually visible when we try to compose
macros --- for example, say that we want to create a `cond-it` macro,
the anaphoric analogue of `cond`, which binds `it` in each branch. It
seems that an obvious way of doing this is by layering it on top of
`if-it` --- it should even be simple enough to be defined with
`syntax-rules`:

    (define-syntax cond-it
      (syntax-rules (else)
        [(_ [test1 expr1 ...] [tests exprs ...] ...)
         (if-it test1
           (begin expr1 ...)
           (cond-it [tests exprs ...] ...))]
        ;; two end cases -- one with an `else' and one without
        [(_ [else expr ...]) (begin expr ...)]
        [(_) (void)]))

Surprisingly, this does not work! Can you see what went wrong?

The problem lies in how the `it` identifier is generated --- it used the
lexical context of the whole `if-it` expression, which seemed like
exactly what we wanted. But in this case, the `if-it` expression is
coming from the `cond-it` macro, not from the user input. Or to be more
accurate: it's the `cond-it` macro which is the user of `if-it`, so `it`
is visible to `cond-it`, but not to its own users...

Note that these anaphoric macros are a popular example, but these
problems do pop up elsewhere too. For example, imagine a loop macro that
wants to bind `break` unhygienically, a class macro that binds `this`,
and many others.

How can we solve this? There are several ways for this:

* Don't break hygiene. For example, instead of `if-it` and `cond-it`
  forms that have an implicit `it`, use forms with an explicit
  identifiers. For example: `(if* it <test> <then> <else>)`. This might
  be a little more verbose at times, but it makes everything behave very
  well, since the identifiers always have the right scope.

* Try to patch things up with a little more unhygienic in your macros.
  In this case, try to make `cond-it` introduce `if-it` unhygienically,
  so when it introduces `it` in its own turn, it will be the right one.
  This is bad, since we started trying to get hygienic macros, and there
  are no easy discounts. (For example, what if there's a different
  `if-it` that is used at the place where `cond-it` is used?) In fact,
  the unhygienic `define-macro` that we've seen is an extreme example of
  this: there is no lexical scope anywhere; so `it` is the same
  identifier no matter where it's introduced. But as we've seen, this
  means that hygiene is always broken when possible.

* Try to make `cond-it` come up with its own unhygienic `it`, then bind
  this `it` to the `it` that `if-it` creates. This can work but on one
  hand it's difficult and fragile to write such code, and on the other
  hand it defeats the simplicity of macros.

* Finally, Racket provides an elegant solution in the form of *syntax
  parameters*. The idea is to avoid the unhygienic binding: have a
  single global binding for `it`, and change the meaning of this binding
  on uses of `if-it`. (If you're interested, see "Keeping it Clean with
  Syntax Parameters" for details.)


------------------------------------------------------------------------
# Macros in Racket's Module System [extra] [Tuesday, April 1st]

> Not in PLAI

One of the main things that Racket pioneered is integrating its syntax
system with its module system. In plain Racket (`#lang racket`, not the
course languages), every file is a module that can `provide` some
functionality, for when you put this code in a file:

    #lang racket
    (provide plus)
    (define (plus x y) (+ x y))

You get a library that gives you a `plus` function. This is just the
usual thing that you'd expect from a library facility in a language ---
but Racket allows you to do the same with syntax definitions. For
example, if we add the following to this file:

    (provide with)
    (define-syntax-rule (with [x V] E)
      (let ([x V]) E))

we --- the users of this library --- also get to have a `with` binding,
which is a "FLANG-compatibility" macro that expands into a `let`. Now,
on a brief look, this doesn't seem all too impressive, but consider the
fact that `with` is actually a translation function that lives at the
syntax level, as a kind of a compiler plugin, and you'll see that this
is not as trivial as it seems. Racket arranges to do this with a concept
of *instantiating* code at the compiler level, so libraries are used in
two ways: either the usual thing as a runtime instantiation, or at
compile time.


------------------------------------------------------------------------
# Defining Languages in Racket [extra] [Tuesday, April 1st]

But then Racket takes this concept even further. So far, we treated the
thing that follows a `#lang` as a kind of a language specification ---
but the more complete story is that this specification is actually just
a *module*. The only difference between such modules like `racket` or
`pl` and "library modules" as our above file is that language modules
provide a bunch of functionality that is specific to a language
implementation. However, you don't need to know about these things up
front: instead, there's a few tools that allow you to provide everything
that some other module provides --- if we add this to the above:

    (provide (all-from-out racket))

then we get a library that provides the same two bindings as above
(`plus` and `with`) --- *in addition* to everything from the `racket`
library (which it got from its own `#lang racket` line).

To use this file as a language, the last bit that we need to know about
is the actual concrete level syntax. Racket provides an `s-exp` language
which is a kind of a meta language for reading source code in the usual
S-expression syntax. Assuming that the above is in a file called
`mylang.rkt`, we can use it (from a different file in the same
directory) as follows:

    #lang s-exp "mylang.rkt"

which makes the language of this file be (a) read using the S-expression
syntax, and (b) get its bindings from our module, so

    #lang s-exp "mylang.rkt"
    (with [x 10] (* x 4))

will show a result of `40`.

So far this seems like just some awkward way to get to the same
functionality as a simple library --- but now we can use more tools to
make things more interesting. First, we can provide everything from
`racket` *except* for `let` --- change the last `provide` to:

    (provide (except-out (all-from-out racket) let))

Next, we can provide our `with` but make it have the name `let` instead
--- by replacing that `(provide with)` with:

    (provide (rename-out [with let]))

The result is a language that is the same as Racket, except that it has
an additional `plus` "built-in" function, and its `let` syntax is
different, as specified by our macro:

    #lang s-exp "mylang.rkt"
    (let [x 10] (plus x 4))

To top things off, there are a few "special" implicit macros that Racket
uses. One of them, `#%app`, is a macro that is used implicitly whenever
there's an expression that looks like a function application. In our
terms, that's the `Call` AST node that gets used whenever a braced-form
isn't one of the known forms. If we override this macro in a similar way
that we did for `let`, we're essentially changing the semantics of
application syntax. For example, here's a definition that makes it
possible to use a `@` keyword to get a list of results of applying a
function on several arguments:

    (define-syntax my-app
      (syntax-rules (@)
        [(_ F @ E ...)
         (list (F E) ...)]
        [(_ x ...) (x ...)]))

This makes the `(my-app add1 @ 1 2)` application evaluate to `'(2 3)`,
but if `@` is not used (as the second subexpression), we get the usual
function application. (Note that this is because the last clause expands
to `(x ...)` which implicitly has the usual Racket function
application.) We can now make our language replace Racket's implicit
`#%app` macro with this, in the same way as we did before: first, drop
Racket's version from what we `provide`:

    (provide (except-out (all-from-out racket) let #%app))

and then `provide` our definition instead

    (provide (rename-out [my-app #%app]))

Users of our language get this as the regular function application:

    #lang s-exp "mylang.rkt"
    (let [x (plus 6 10)] (sqrt @ (plus x -7) x (plus x 9)))

Since `#%app` is a macro, it can evaluate to anything, even to things
that are not function applications at all. For example, here's an
extended definition that adds an arrow syntax that expands to a `lambda`
expression not to an actual application:

    (define-syntax my-app
      (syntax-rules (@ =>)
        [(_ F @ E ...)
         (list (F E) ...)]
        [(_ x => E ...)
         (lambda x E ...)]
        [(_ x ...) (x ...)]))

And an example of using it

    #lang s-exp "mylang.rkt"
    (define add1 ((x) => (+ x 1)))
    ;; or, combining all application forms in one example:
    (((x) => (plus x 7)) @ 10 20 30)

Another such special macro is `#%module-begin`: this is a macro that is
wrapped around the whole module body. Changing it makes it possible to
change the semantics of a sequence of toplevel expressions in our
language. The following is our complete language, with an example of
redefining `#%module-begin` to create a "verbose" language that prints
out expressions and what they evaluate to (note the `verbose` helper
macro that is completely internal):

    ;;; ---<<<MYLANG>>>-------------------------------------------------
    ;; A language that is built as an extension on top of Racket

    #lang racket

    (provide (except-out (all-from-out racket)
                         let #%app #%module-begin))

    (provide plus)
    (define (plus x y) (+ x y))

    (provide (rename-out [with let]))
    (define-syntax-rule (with [x V] E)
                        (let ([x V]) E))

    (provide (rename-out [my-app #%app]))
    (define-syntax my-app
      (syntax-rules (=> @)
        [(_ x => E ...)
         (lambda x E ...)]
        [(_ F @ E ...)
         (list (F E) ...)]
        [(_ x ...) (x ...)]))

    (provide (rename-out [mod-beg #%module-begin]))
    (define-syntax-rule (mod-beg E ...)
                        (#%module-begin (verbose E) ...))
    (define-syntax verbose
      (syntax-rules ()
        [(_ (define name value)) ; assume no (define (foo ...) ...)
         (begin (define name value)
                (printf "~s := ~s\n" 'name name))]
        [(_ E)
         (printf "~s --> ~s\n" 'E E)]))

And for reference, try that language with the above example:

    #lang s-exp "mylang.rkt"
    (define seven (+ 3 4))
    (define add1 ((x) => (+ x 1)))
    (((x) => (plus x seven)) @ 10 20 30)


------------------------------------------------------------------------
# Macro Conclusions [Tuesday, April 1st]

> [PLAI §37.5]

Macros are extremely powerful, but this also means that their usage
should be restricted only to situations where they are really needed.
You can view any function as extending the current collection of tools
that you provide --- where these tools are much more difficult for your
users to swallow than plain functions: evaluation can happen in any way,
with any scope, unlike the uniform rules of function application. An
analogy is that every function (or value) that you provide is equivalent
to adding nouns to a vocabulary, but macros can add completely new rules
for reading, since using them might result in a completely different
evaluation. Because of this, adding macros carelessly can make code
harder to read and debug --- and using them should be done in a way that
is as clear as possible for users.

When should a macro be used?

* Providing cosmetics: eliminating some annoying repetitiveness and/or
  inconvenient verbosity. This is usually macros that are intended to
  beautify code, for example, we could use a macro to make this bit of
  the Sloth source:

      (list '+ (box (racket-func->prim-val + #t)))
      (list '- (box (racket-func->prim-val - #t)))
      (list '* (box (racket-func->prim-val + #t)))

  look much better, by using a macro instead of the above. We can try to
  use a function, but we still need two inputs for each call --- the
  name and the function:

      (rfpv '+ + #t)
      (rfpv '- - #t)
      (rfpv '* + #t)

  and a macro can eliminate this (small, but potentially dangerous)
  redundancy. For example:

      (define-syntax-rule (rfpv fun flag)
        (list 'fun (box (racket-func->prim-val fun flag))))

  and then:

      (rfpv + #t)
      (rfpv - #t)
      (rfpv * #t)

  eliminates the typo that was in the previous examples (did you catch
  that?).

* Altering the order of evaluation: as seen with the `orelse` macro, we
  can control evaluation order in our macro. This is achieved by
  translating the macro into Racket code with a known evaluation order.
  We even choose not to evaluate some parts, or evaluate some parts
  multiple times (eg, the `for` macro).

  Note that by itself, we could get this if only we had a more
  light-weight notation for thunks, since then we could simply use
  functions. For example, a `while` function could easily be used with
  thunks:

      (define (while cond body)
        (when (cond)
          (body)
          (while cond body)))

  if the syntax for a thunk would be as easy as, for example, using
  curly braces:

      (let ([i 0])
        (while { (< i 10) }
          { (printf "i = ~s\n" i) (set! i (+ i 1)) }))

* Introducing binding constructs: macros that have a different binding
  structure from Racket built-ins. These kind of macros are ones that
  makes a powerful language --- for example, we've seen how we can
  survive without basic built-ins like `let`. For example, the `for`
  macro has its own binding structure.

  Note that with a sufficiently concise syntax for functions such as the
  arrow functions in JavaScript, we can get away with plain functions
  here too. For example, instead of a `with` macro, we could do it with
  a function:

      function with(val,fun) { return fun(val); }
      with( 123, x => x*x );

  (The obvious inconvenience is that the order can be weird.)

* Defining data languages: macros can be used for expressions that are
  not Racket expressions themselves. For example, the parens that wrap
  binding pairs in a `let` form are not function applications. Some
  times it is possible to use quotes for that, but then we get run-time
  values rather than being able to translate them into Racket code.
  Another usage of this category is to hide representation details that
  might involve implicit lambda's (for example, `delay`) --- if we
  define a macro, then there is a single point where we control whether
  an expression is used within some `lambda` --- but it it was a
  function, we'd have to change every usage of it to add an explicit
  lambda.

It is also important to note that macros should not be used too
frequently. As said above, every macro adds a completely different way
of reading your code --- a way that doesn't use the usual "nouns" and
"verbs", but there are other reasons not to use a macro.

One common usage case is as an optimization --- trying to avoid an extra
function call. For example, this:

    int min(int x, int y) {
      if ( x < y ) then return x; else return y;
    }

might seem wasteful if you don't want a full function call on every
usage of `min`. So you might be tempted to use this instead:

    #define min(x,y) x<y ? x : y

you even know the pitfalls of C macros so you make it more robust:

    #define min(x,y) (((x)<(y)) ? (x) : (y))

But small functions like the above are things that any decent compiler
should know how to optimize, and even if your compiler doesn't, it's
still not worth doing this optimization because programmer time is the
most expensive factor in any computer system. In addition, a compiler is
committed to doing these optimizations only when possible (eg, it is not
possible to in-line a recursive function) and to do proper in-lining
--- unlike the `min` CPP macro above which is erroneous in case `x` or
`y` are expressions that have side-effects.


------------------------------------------------------------------------
## Side-note: macros in mainstream languages [Tuesday, April 1st]

Macros are an extremely powerful tool in Racket (and other languages in
the Lisp family) --- how come nobody else uses them?

Well, people have tried to use them in many contexts. The problem is
that you cannot get away with a simple solution that does nothing more
than textual manipulation of your programs. For example, the standard C
preprocessor is a macro language, but it is fundamentally limited to
very simple situations. This is still a hot topic these days, with
modern languages trying out different solutions (or giving up and
claiming that macros are evil).

Here is an example that was written by Mark Jason Dominus ("Higher Order
Perl"), in a [Perl mailing list post] among further discussion on macros
in Lisp vs other languages, including Perl's source transformers that
are supposed to fill a similar role.

[Perl mailing list post]:
  http://lists.warhead.org.uk/pipermail/iwe/2005-July/000130.html

The example starts with writing the following simple macro:

    #define square(x) x*x

This doesn't quite work because

    2/square(10)

expands to

    2/10*10

which is 2, but you wanted 0.02. So you need this instead:

    #define square(x) (x*x)

but this breaks because

    square(1+1)

expands to

    (1+1*1+1)

which is 3, but you wanted 4. So you need this instead:

    #define square(x) ((x)*(x))

But what about

    x = 2;
    square(x++);

which expands to

    ((x++)*(x++))

or an expensive expression? So you need this instead:

    int __MYTMP;
    #define square(x) (__MYTMP = (x), __MYTMP*__MYTMP)

but now it only works for ints; you can't do square(3.5) any more. To
really fix this you have to use a new C feature, something like:

    #define square(x) ({ typeof(x) xval = x; xval*xval; })

or more like:

    #define square(x) \
      ({ typeof(x) xval = (x); \
         xval*xval; })

And that's just to get trivial macros, like "square()", to work.

------------------------------------------------------------------------

You should be able to appreciate now the tremendous power of macros.
This is why there are so many "primitive features" of programming
languages that can be considered as merely library functionality given a
good macro system. For example, people are used to think about OOP as
some inherent property of a language --- but in Racket there are at
least two very different object systems that it comes with, and several
others in user-distributed code. All of these are implemented as a
library which provides the functionality as well as the necessary syntax
in the form of macros. So the basic principle is to have a small core
language with powerful constructs, and make it easy to express complex
ideas using these constructs.

This is an important point to consider before starting a new DSL
(reminder: domain specific language) --- if you need something that
looks like a simple DSL but might grow to a full language, you can
extend an existing language with macros to have the features you want,
and you will always be able to grow to use the full language if
necessary. This is particularly easy with Racket, but possible in other
languages too.

------------------------------------------------------------------------

Side note: the principle of a powerful but simple code language and easy
extensions is not limited to using macros --- other factors are
involved, like first-class functions. In fact, "first class"-ness can
help in many situations, for example: single inheritance + classes as
first-class values can be used instead of multiple inheritance.


------------------------------------------------------------------------
# Types [Tuesday, April 8th]

> [PLAI §24]

In our Toy language implementation, there are certain situations that
are not covered. For example,

    {< {+ 1 2} 3}

is not a problem, but

    {+ {< 1 2} 3}

will eventually use Racket's addition function on a boolean value, which
will crash our evaluator. Assuming that we go back to the simple
language we once had, where there were no booleans, we can still run
into errors --- except now these are the errors that our code raises:

    {+ {fun {} 1} 2}

or

    {1 2 3}

or

    {{fun {x y} {+ x y}} 5}

In any case, it would be good to avoid such errors right from the start
--- it seems like we should be able to identify such bad code and not
even try to run it. One thing that we can do is do a little more work at
parse time, and declare the `{1 2 3}` program fragment as invalid. We
can even try to forbid

    {bind {{x 1}} {x 2 3}}

in the same way, but what should we do with this? ---

    {fun {x} {x 2 3}}

The validity of this depends on how it is used. The same goes for some
invalid expressions --- the above bogus expression can be fine if it's
in a context that shadows `<`:

    {bind {{< *}}
      {+ {< 1 2} 3}}

Finally, consider this:

    {+ 3 {if <mystery> 5 {fun {x} x}}}

where mystery contains something like `random` or `read`. In general,
knowing whether a piece of code will run with no errors is a problem
that is equivalent to the halting problem --- and because of this, there
is no way to create an "exact" type system: they are all either too
restrictive (rejecting programs that would run with no errors) or too
permissive (accepting programs that might crash). This is a very
practical issue --- type safety means a lot less bugs in the system. A
good type system is still an actively researched problem.


------------------------------------------------------------------------
# What is a Type? [Tuesday, April 8th]

> [PLAI §25]

A type is any property of a program (or an expression) that can be
determined without running the program. (This is different than what is
considered a `type` in Racket which is a property that is known only at
run-time, which means that before run-time we know nothing so in essence
we have a single type (in the static sense).) Specifically, we want to
use types in a way that predicts some aspects of the program's behavior,
for example, whether a program will crash.

Usually, types are being used as the kind of value that an expression
can evaluate to, not the precise value itself. For example, we might
have two kinds of values --- functions and numbers, and we know that
addition always operates on numbers, therefore

    {+ 1 {fun {x} x}}

is a type error. Note that to determine this we don't care about the
actual function, just the fact that it is a function.

Important: types can discriminate certain programs as invalid, but they
cannot discriminate correct programs from incorrect ones. For example,
there is no way for any type system to know that this:

    {fun {x} {+ x 1}}

is an incorrect decrease-by-one function.

In general, type systems try to get to the optimal point where as much
information as possible is known, yet the language is not too
restricted, no significant computing resources are wasted, and
programmers don't spend much time annotating their code.

Why would you want to use a type system?

* Catch errors even in code that you don't execute, for example, when
  your tests are too weak (but they do *not* substitute proper test
  suites).

* They help reduce the time spent on debugging (when they detect
  legitimate errors, rather than force you to change your code).

* As we have seen, they help in documenting code (but they do *not*
  substitute proper documentation).

* Compilers can use type information to make programs run faster.

* They encourage a more organized code development process. For example,
  our use of `define-type` and `cases` (inspired by ML) help guide your
  code. (But note that the actual code can be as disorganized as usual,
  typechecking or not...)


------------------------------------------------------------------------
# Our Types --- The Picky Language [Tuesday, April 8th]

The first thing we need to do is to agree on what types are. Earlier, we
talked about two types: numbers and functions (ignore booleans or
anything else for now), we will use these two types for now.

> In general, this means that we are using the *Types are Sets* meaning
> for types, and specifically, we will be implmenting a type system
> known as a *Hindley-Milner* system. This is *not* what Typed Racket is
> using. In fact, one of the main differences is that in our type system
> each binding has exactly one type, whereas in Typed Racket an
> identifier can have different types in different places in the code.
> An example of this is something that we've talked about earlier:
>
>     (: foo : (U String Number) -> Number)
>     (define (foo x)          ; \ these `x`s have a
>       (if (number? x)        ; / (U Number String) type
>         (+ x 1)              ; > this one is a Number
>         (string-length x)))  ; > and this one is a String

A type system is presented as a collection of rules called "type
judgments", which describe how to determine the type of an expression.
Beside the types and the judgments, a type system specification needs a
(decidable) algorithm that can assign types to expressions.

Such a specification should have one rule for every kind of syntactic
construct, so when we get a program we can determine the precise type of
any expression. Also, these judgments are usually recursive since a type
judgment will almost always rely on the types of sub-expressions (if
any).

For our restricted system, we have two rules that we can easily specify:

    n : Number  (any numeral `n' is a number)
    {fun {x} E} : Function

(These rules are actually "axioms", since the state facts that are true
by themselves, with no need for any further work.)

And what about an identifier? Well, it is clear that we need to keep
some form of an environment that will keep an account of types assigned
to identifiers (note: all of this is not at run-time). This environment
is used in all type judgments, and usually written as a capital Greek
Gamma character (in some places `G` is used to stick to ASCII texts).

The conventional way to write the above two axioms is:

    Γ ⊢ n : Number
    Γ ⊢ {fun {x} E} : Function

The first one is read as "Gamma proves that `n` has the type `Number`".
Note that this is a syntactic environment, much like DE-ENVs that you
have seen in homework.

So, we can write a rule for identifiers that simply has the type
assigned by the environment:

    Γ ⊢ x : Γ(x)    ; "Γ(x)" is "lookup(x, Γ)" for a new "type env"

We now need a rule for addition and a rule for application (note: we're
using a very limited subset of our old language, where arithmetic
operators are not function applications). Addition is easy: if we can
prove that both `a` and `b` are numbers in some environment Γ, then we
know that `{+ a b}` is a number in the same environment. We write this
as follows:

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {+ A B} : Number

Now, what about application? We need to refer to some arbitrary type
now, and the common letter for that is a Greek lowercase tau:

    Γ ⊢ F : Function   Γ ⊢ V : τᵥ
    —————————————————————————————
        Γ ⊢ {call F V} : ???

that is --- if we can prove that `f` is a function, and that `v` is a
value of some type `τₐ`, then ... ??? Well, we need to know more about
`f`: we need to know what type it consumes and what type it returns. So
a simple `function` is not enough --- we need some sort of a function
type that specifies both input and output types. We will use the
notation that was seen throughout the semester and dump `function`. Now
we can write:

    Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
    ——————————————————————————————
         Γ ⊢ {call F V} : τ₂

which makes sense --- if you take a function of type `τ₁->τ₂` and you
feed it what it expects, you get the obvious output type. But going back
to the language --- where do we get these new arrow types from? We will
modify the language and require that every function specifies its input
and output type (and assume we have only one argument functions). For
example, we will write something like this for a function that is the
curried version of addition:

    {fun {x : Number} : (Number -> Number)
      {fun {y : Number} : Number
        {+ x y}}}

So: the revised syntax for the limited language that contains only
additions, applications and single-argument functions, and for fun ---
go back to using the `call` keyword is. The syntax we get is:

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | ( <TYPE> -> <TYPE> )

and the typing rules are:

    Γ ⊢ n : Number

    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

    Γ ⊢ x : Γ(x)

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {+ A B} : Number

    Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
    ——————————————————————————————
         Γ ⊢ {call F V} : τ₂

But we're still missing a big part --- the current axiomatic rule for a
`fun` expression is too weak. If we use it, we conclude that these
expressions:

    {fun {x : Number} : (Number -> Number)
      3}
    {fun {x : Number} : Number
      {call x 2}}

are valid, as well concluding that this program:

    {call {call {fun {x : Number} : (Number -> Number)
                  3}
                5}
          7}

is valid, and should return a number. What's missing? We need to check
that the body part of the function is correct, so the rule for typing a
`fun` is no longer a simple axiom but rather a type judgment. Here is
how we check the body instead of blindly believing program annotations:

              Γ[x:=τ₁] ⊢ E : τ₂             ; Γ[x:=τ₁] is
    ——————————————————————————————————————  ;     extend(Γ, x, τ₁)
    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)  ; for the new type envs

That is --- we want to make sure that if `x` has type `τ₁`, then the
body expression `E` has type `τ₂`, and if we can prove this, then we can
trust these annotations.

There is an important relationship between this rule and the `call` rule
for application:

* In this rule we assume that the input will have the right type and
  guarantee (via a proof) that the output will have the right type.

* In the application rule, we guarantee (by a proof) an input of the
  right type and assume a result of the right type.

(Side note: Racket comes with a contract system that can identify type
errors dynamically, and assign blame to either the caller or the callee
--- and these correspond to these two sides.)

Note that, as we said, `number` is really just a property of a certain
kind of values, we don't know exactly what numbers are actually used. In
the same way, the arrow function types don't tell us exactly what
function it is, for example, `(Number -> Number)` can indicate a
function that adds three to its argument, subtracts seven, or multiplies
it by 7619. But it certainly contains much more than the previous naive
`function` type. (Consider also Typed Racket here: it goes much further
in expressing facts about code.)

For reference, here is the complete BNF and typing rules:

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | ( <TYPE> -> <TYPE> )

    Γ ⊢ n : Number

    Γ ⊢ x : Γ(x)

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {+ A B} : Number

              Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————————————
    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

    Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
    ——————————————————————————————
         Γ ⊢ {call F V} : τ₂

------------------------------------------------------------------------

Examples of using types (abbreviate `Number` as `Num`) --- first, a
simple example:

                    {} ⊢ 5 : Num   {} ⊢ 7 : Num
                    ———————————————————————————
    {} ⊢ 2 : Num         {} ⊢ {+ 5 7} : Num
    ———————————————————————————————————————————
              {} ⊢ {+ 2 {+ 5 7}} : Num

and a little more involved one:

        [x:=Num] ⊢ x : Num   [x:=Num] ⊢ 3 : Num
        ———————————————————————————————————————
               [x:=Num] ⊢ {+ x 3} : Num
    ———————————————————————————————————————————————
    {} ⊢ {fun {x : Num} : Num {+ x 3}} : Num -> Num   {} ⊢ 5 : Num
    ——————————————————————————————————————————————————————————————
          {} ⊢ {call {fun {x : Num} : Num {+ x 3}} 5} : Num

Finally, try a buggy program like

    {+ 3 {fun {x : Number} : Number x}}

and see where it is impossible to continue.

The main thing here is that to know that this is a type error, we have
to prove that there is no judgment for a certain type (in this case, no
way to prove that a `fun` expression has a `Num` type), which we
(humans) can only do by inspecting all of the rules. Because of this, we
need to also add an algorithm to our type system, one that we can follow
and determine when it gives up.


------------------------------------------------------------------------
# Typing control [Tuesday, April 8th]

> [PLAI §26]

We will now extend our typed Picky language to have a conditional
expression, and predicates. First, we extend the BNF with a predicate
expression, and we also need a type for the results:

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { < <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }
              | { if <PICKY> <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | Boolean
              | ( <TYPE> -> <TYPE> )

Initially, we use the same rules, and add the obvious type for the
predicate:

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {< A B} : Boolean

And what should the rule for `if` look like? Well, to make sure that the
condition is a boolean, it should be something of this form:

    Γ ⊢ C : Boolean   Γ ⊢ T : ???   Γ ⊢ E : ???
    ———————————————————————————————————————————
               Γ ⊢ {if C T E} : ???

What would be the types of `t` and `e`? A natural choice would be to let
the programmer use any two types:

    Γ ⊢ C : Boolean   Γ ⊢ T : τ₁   Γ ⊢ E : τ₂
    —————————————————————————————————————————
              Γ ⊢ {if C T E} : ???

But what would the return type be? This is still a problem. (BTW, some
kind of a union would be nice, but it has some strong implications that
we will not discuss.) In addition, we will have a problem detecting
possible errors like:

    {+ 2 {if <mystery> 3 {fun {x} x}}}

Since we know nothing about the condition, we can just as well be
conservative and force both arms to have the same type. The rule is
therefore:

    Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
    ———————————————————————————————————————
              Γ ⊢ {if C T E} : τ

--- using the same letter indicates that we expect the types to be
identical, unlike the previous attempt. Consequentially, this type
system is fundamentally weaker than Typed Racket which we use in this
class.

Here is the complete language specification with this extension:

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { < <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }
              | { if <PICKY> <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | Boolean
              | ( <TYPE> -> <TYPE> )

    Γ ⊢ n : Number

    Γ ⊢ x : Γ(x)

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {+ A B} : Number

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {< A B} : Boolean

              Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————————————
    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

    Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
    ——————————————————————————————
         Γ ⊢ {call F V} : τ₂

    Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
    ———————————————————————————————————————
              Γ ⊢ {if C T E} : τ


------------------------------------------------------------------------
## Extending Picky [Tuesday, April 8th]

In general, we can extend this language in one of two ways. For example,
lets say that we want to add the `with` form. One way to add it is what
we did above --- simply add it to the language, and write the rule for
it. In this case, we get:

    Γ ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————
     Γ ⊢ {with {x : τ₁ V} E} : τ₂

Note how this rule encapsulates information about the scope of `with`.
Also note that we need to specify the types for the bound values.

Another way to achieve this extension is if we add `with` as a derived
rule. We know that when we see a

    {with {x V} E}

expression, we can just translate it into

    {call {fun {x} E} V}

So we could achieve this extension by using a rewrite rule to translate
all `with` expressions into `call`s of anonymous functions (eg, using
the `with-stx` facility that we have seen recently). This could be done
formally: begin with the `with` form, translate to the `call` form, and
finally show the necessary goals to prove its type. The only thing to be
aware of is the need to translate the types too, and there is one type
that is missing from the typed-with version above --- the output type of
the function. This is an indication that we don't really need to specify
function output types --- we can just deduce them from the code,
provided that we know the input type to the function.

Indeed, if we do this on a general template for a `with` expression,
then we end up with the same goals that need to be proved as in the
above rule:

          Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————————————
    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)       Γ ⊢ V : τ₁
    ———————————————————————————————————————————————————————
             Γ ⊢ {call {fun {x : τ₁} : τ₂ E} V} : τ₂
             ———————————————————————————————————————
                Γ ⊢ {with {x : τ₁ V} E} : τ₂

------------------------------------------------------------------------

Conclusion --- we've seen type judgment rules, and using them in proof
trees. Note that in these trees there is a clear difference between
rules that have no preconditions --- there are axioms that are always
true (eg, a numeral is always of type `num`).

The general way of proving a type seems similar to evaluation of an
expression, but there is a huge difference --- *nothing* is really
getting evaluated. As an example, we always go into the body of a
function expression, which is done to get the function's type, and this
is later used anywhere this function is used --- when you evaluate this:

    {with {f {fun {x : Number} : Number x}}
      {+ {call f 1} {call f 2}}}

you first create a closure which means that you don't touch the body of
the function, and later you use it twice. In contrast, when you prove
the type of this expression, you immediately go into the body of the
function which you have to do to prove that it has the expected
`Number->Number` type, and then you just use this type twice.

Finally, we have seen the importance of using the same type letters to
enforce types, and in the case of typing an `if` statement this had a
major role: specifying that the two arms can be any two types, or the
same type.


------------------------------------------------------------------------
# Implementing Picky [Tuesday, April 8th]

The following is a simple implementation of the Picky language. It is
based on the environments-based Flang implementation. Note the two main
functions here --- `typecheck` and `typecheck*`.

    ;;; ---<<<PICKY1>>>-------------------------------------------------
    ;; The Picky interpreter, verbose version

    #lang pl

    #|
    The grammar:
      <PICKY> ::= <num>
                | <id>
                | { + <PICKY> <PICKY> }
                | { - <PICKY> <PICKY> }
                | { = <PICKY> <PICKY> }
                | { < <PICKY> <PICKY> }
                | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
                | { call <PICKY> <PICKY> }
                | { with { <id> : <TYPE> <PICKY> } <PICKY> }
                | { if <PICKY> <PICKY> <PICKY> }
      <TYPE>  ::= Num | Number
                | Bool | Boolean
                | { <TYPE> -> <TYPE> }

    Evaluation rules:
      eval(N,env)                = N
      eval(x,env)                = lookup(x,env)
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({= E1 E2},env)        = eval(E1,env) = eval(E2,env)
      eval({< E1 E2},env)        = eval(E1,env) < eval(E2,env)
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise <-- never happens
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({if E1 E2 E3},env)    = eval(E2,env)  if eval(E1,env) is true
                                 = eval(E3,env)  otherwise

    Type checking rules:

      Γ ⊢ n : Number

      Γ ⊢ x : Γ(x)

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {+ A B} : Number

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {< A B} : Boolean

                Γ[x:=τ₁] ⊢ E : τ₂
      ——————————————————————————————————————
      Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

      Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
      ——————————————————————————————
           Γ ⊢ {call F V} : τ₂

      Γ ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
      ——————————————————————————————
       Γ ⊢ {with {x : τ₁ V} E} : τ₂

      Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
      ———————————————————————————————————————
                Γ ⊢ {if C T E} : τ

    |#

    (define-type PICKY
      [Num   Number]
      [Id    Symbol]
      [Add   PICKY PICKY]
      [Sub   PICKY PICKY]
      [Equal PICKY PICKY]
      [Less  PICKY PICKY]
      [Fun   Symbol TYPE PICKY TYPE] ; name, in-type, body, out-type
      [Call  PICKY PICKY]
      [With  Symbol TYPE PICKY PICKY]
      [If    PICKY PICKY PICKY])

    (define-type TYPE
      [NumT]
      [BoolT]
      [FunT TYPE TYPE])

    (: parse-sexpr : Sexpr -> PICKY)
    ;; parses s-expressions into PICKYs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(list '+ lhs rhs) (Add   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '= lhs rhs) (Equal (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '< lhs rhs) (Less  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call  (parse-sexpr fun) (parse-sexpr arg))]
        [(list 'if c t e)
         (If (parse-sexpr c) (parse-sexpr t) (parse-sexpr e))]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name) ': itype) ': otype body)
            (Fun name
                 (parse-type-sexpr itype)
                 (parse-sexpr body)
                 (parse-type-sexpr otype))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) ': type named) body)
            (With name
                  (parse-type-sexpr type)
                  (parse-sexpr named)
                  (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [else (error 'parse-sexpr "bad expression syntax: ~s" sexpr)]))

    (: parse-type-sexpr : Sexpr -> TYPE)
    ;; parses s-expressions into TYPEs
    (define (parse-type-sexpr sexpr)
      (match sexpr
        ['Number  (NumT)]
        ['Boolean (BoolT)]
        ;; allow shorter names too
        ['Num  (NumT)]
        ['Bool (BoolT)]
        [(list itype '-> otype)
         (FunT (parse-type-sexpr itype) (parse-type-sexpr otype))]
        [else (error 'parse-type-sexpr "bad type syntax in ~s" sexpr)]))

    (: parse : String -> PICKY)
    ;; parses a string containing a PICKY expression to a PICKY AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Typechecker and related types and helpers

    ;; this is similar to ENV, but it holds type information for the
    ;; identifiers during typechecking; it is essentially "Γ"
    (define-type TYPEENV
      [EmptyTypeEnv]
      [ExtendTypeEnv Symbol TYPE TYPEENV])

    (: type-lookup : Symbol TYPEENV -> TYPE)
    ;; similar to `lookup' for type environments; note that the
    ;; error is phrased as a typecheck error, since this indicates
    ;; a failure at the type checking stage
    (define (type-lookup name typeenv)
      (cases typeenv
        [(EmptyTypeEnv) (error 'typecheck "no binding for ~s" name)]
        [(ExtendTypeEnv id type rest-env)
         (if (eq? id name) type (type-lookup name rest-env))]))

    (: typecheck : PICKY TYPE TYPEENV -> Void)
    ;; Checks that the given expression has the specified type.
    ;; Used only for side-effects (to throw a type error), so return
    ;; a void value.
    (define (typecheck expr type type-env)
      (unless (equal? type (typecheck* expr type-env))
        (error 'typecheck "type error for ~s: expecting a ~s"
               expr type)))

    (: typecheck* : PICKY TYPEENV -> TYPE)
    ;; Returns the type of the given expression (which also means that
    ;; it checks it). This is a helper for the real typechecker that
    ;; also checks a specific return type.
    (define (typecheck* expr type-env)
      (: two-nums : PICKY PICKY -> Void)
      (define (two-nums e1 e2)
        (typecheck e1 (NumT) type-env)
        (typecheck e2 (NumT) type-env))
      (cases expr
        [(Num n) (NumT)]
        [(Id name) (type-lookup name type-env)]
        [(Add   l r) (two-nums l r) (NumT)]
        [(Sub   l r) (two-nums l r) (NumT)]
        [(Equal l r) (two-nums l r) (BoolT)]
        [(Less  l r) (two-nums l r) (BoolT)]
        [(Fun bound-id in-type bound-body out-type)
         (typecheck bound-body out-type
                    (ExtendTypeEnv bound-id in-type type-env))
         (FunT in-type out-type)]
        [(Call fun arg)
         (cases (typecheck* fun type-env)
           [(FunT in-type out-type)
            (typecheck arg in-type type-env)
            out-type]
           [else (error 'typecheck "type error for ~s: expecting a fun"
                        expr)])]
        [(With bound-id itype named-expr bound-body)
         (typecheck named-expr itype type-env)
         (typecheck* bound-body
                     (ExtendTypeEnv bound-id itype type-env))]
        [(If cond-expr then-expr else-expr)
         (typecheck cond-expr (BoolT) type-env)
         (let ([type (typecheck* then-expr type-env)])
           (typecheck else-expr type type-env) ; enforce same type
           type)]))

    ;; Evaluator and related types and helpers

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol VAL ENV])

    (define-type VAL
      [NumV  Number]
      [BoolV Boolean]
      [FunV  Symbol PICKY ENV])

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

    (: strip-numv : Symbol VAL -> Number)
    ;; converts a VAL to a Racket number if possible, throws an error if
    ;; not using the given name for the error message
    (define (strip-numv name val)
      (cases val
        [(NumV n) n]
        ;; this error will never be reached, see below for more
        [else (error name "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (strip-numv 'arith-op val1)
                (strip-numv 'arith-op val2))))

    (: bool-op : (Number Number -> Boolean) VAL VAL -> VAL)
    ;; gets a Racket numeric binary predicate, and uses it
    ;; within a BoolV wrapper
    (define (bool-op op val1 val2)
      (BoolV (op (strip-numv 'bool-op val1)
                 (strip-numv 'bool-op val2))))

    (: eval : PICKY ENV -> VAL)
    ;; evaluates PICKY expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Id name) (lookup name env)]
        [(Add   l r) (arith-op + (eval l env) (eval r env))]
        [(Sub   l r) (arith-op - (eval l env) (eval r env))]
        [(Equal l r) (bool-op  = (eval l env) (eval r env))]
        [(Less  l r) (bool-op  < (eval l env) (eval r env))]
        [(Fun bound-id in-type bound-body out-type)
         ;; note that types are not used at runtime,
         ;; so they're not stored in the closure
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           ;; `cases' requires complete coverage of all variants, but
           ;; this `else' is never used since we typecheck programs
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]
        [(With bound-id type named-expr bound-body)
         (eval bound-body (Extend bound-id (eval named-expr env) env))]
        [(If cond-expr then-expr else-expr)
         (let ([bval (eval cond-expr env)])
           (if (cases bval
                 [(BoolV b) b]
                 ;; same as above: this case is never reached
                 [else (error 'eval "`if' expects a boolean, got: ~s"
                                    bval)])
             (eval then-expr env)
             (eval else-expr env)))]))

    (: run : String -> Number)
    ;; evaluate a PICKY program contained in a string
    (define (run str)
      (let ([prog (parse str)])
        (typecheck prog (NumT) (EmptyTypeEnv))
        (let ([result (eval prog (EmptyEnv))])
          (cases result
            [(NumV n) n]
            ;; this error is never reached, since we make sure
            ;; that the program always evaluates to a number above
            [else (error 'run "evaluation returned a non-number: ~s"
                              result)]))))

    ;; tests -- including translations of the FLANG tests
    (test (run "5") => 5)
    (test (run "{< 1 2}") =error> "type error")
    (test (run "{fun {x : Num} : Num {+ x 1}}") =error> "type error")
    (test (run "{call {fun {x : Num} : Num {+ x 1}} 4}") => 5)
    (test (run "{with {x : Num 3} {+ x 1}}") => 4)
    (test (run "{with {identity : {Num -> Num} {fun {x : Num} : Num x}}
                  {call identity 1}}")
          => 1)
    (test (run "{with {add3 : {Num -> Num}
                       {fun {x : Num} : Num {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 : {Num -> Num}
                       {fun {x : Num} : Num {+ x 3}}}
                  {with {add1 : {Num -> Num}
                         {fun {x : Num} : Num {+ x 1}}}
                    {with {x : Num 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity : {{Num -> Num} -> {Num -> Num}}
                       {fun {x : {Num -> Num}} : {Num -> Num} x}}
                  {with {foo : {Num -> Num}
                         {fun {x : Num} : Num {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x : Num 3}
                  {with {f : {Num -> Num}
                         {fun {y : Num} : Num {+ x y}}}
                    {with {x : Num 5}
                      {call f 4}}}}")
          => 7)
    (test (run "{call {with {x : Num 3}
                        {fun {y : Num} : Num {+ x y}}}
                      4}")
          => 7)
    (test (run "{with {f : {Num -> Num}
                       {with {x : Num 3} {fun {y : Num} : Num {+ x y}}}}
                  {with {x : Num 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x : {Num -> {Num -> Num}}}
                                 : {Num -> Num}
                              {call x 1}}
                            {fun {x : Num} : {Num -> Num}
                              {fun {y : Num} : Num {+ x y}}}}
                      123}")
          => 124)
    (test (run "{call {fun {x : Num} : Num {if {< x 2} {+ x 5} {+ x 6}}}
                      1}")
          => 6)
    (test (run "{call {fun {x : Num} : Num {if {< x 2} {+ x 5} {+ x 6}}}
                      2}")
          => 8)

One thing that is very obvious when you look at the examples is that
this language is way too verbose to be practical --- types are repeated
over and over again. If you look carefully at the typechecking fragments
for the two relevant expressions --- `fun` and `with` --- you can see
that we can actually get rid of almost all of the type annotations.


------------------------------------------------------------------------
## Improving Picky [Tuesday, April 8th]

The following version does that, there are no types mentioned except for
the input type for a function. Note that we can do that at this point
because our language is so simple that many pieces of code have a
specific type. (For example, if we add polymorphism things get more
complicated.)

    ;;; ---<<<PICKY2>>>-------------------------------------------------
    ;; The Picky interpreter, almost no explicit types

    #lang pl

    #|
    The grammar:
      <PICKY> ::= <num>
                | <id>
                | { + <PICKY> <PICKY> }
                | { - <PICKY> <PICKY> }
                | { = <PICKY> <PICKY> }
                | { < <PICKY> <PICKY> }
                | { fun { <id> : <TYPE> } <PICKY> }
                | { call <PICKY> <PICKY> }
                | { with { <id> <PICKY> } <PICKY> }
                | { if <PICKY> <PICKY> <PICKY> }
      <TYPE>  ::= Num | Number
                | Bool | Boolean
                | { <TYPE> -> <TYPE> }

    Evaluation rules:
      eval(N,env)                = N
      eval(x,env)                = lookup(x,env)
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({= E1 E2},env)        = eval(E1,env) = eval(E2,env)
      eval({< E1 E2},env)        = eval(E1,env) < eval(E2,env)
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise <-- never happens
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({if E1 E2 E3},env)    = eval(E2,env)  if eval(E1,env) is true
                                 = eval(E3,env)  otherwise

    Type checking rules (note how implicit types are made):

      Γ ⊢ n : Number

      Γ ⊢ x : Γ(x)

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {+ A B} : Number

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {< A B} : Boolean

              Γ[x:=τ₁] ⊢ E : τ₂
      —————————————————————————————————
      Γ ⊢ {fun {x : τ₁} E} : (τ₁ -> τ₂)

      Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
      ——————————————————————————————
           Γ ⊢ {call F V} : τ₂

      Γ ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
      ——————————————————————————————
         Γ ⊢ {with {x V} E} : τ₂

      Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
      ———————————————————————————————————————
                Γ ⊢ {if C T E} : τ

    |#

    (define-type PICKY
      [Num   Number]
      [Id    Symbol]
      [Add   PICKY PICKY]
      [Sub   PICKY PICKY]
      [Equal PICKY PICKY]
      [Less  PICKY PICKY]
      [Fun   Symbol TYPE PICKY]  ; no output type
      [Call  PICKY PICKY]
      [With  Symbol PICKY PICKY] ; no types here
      [If    PICKY PICKY PICKY])

    (define-type TYPE
      [NumT]
      [BoolT]
      [FunT TYPE TYPE])

    (: parse-sexpr : Sexpr -> PICKY)
    ;; parses s-expressions into PICKYs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(list '+ lhs rhs) (Add   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '= lhs rhs) (Equal (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '< lhs rhs) (Less  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call  (parse-sexpr fun) (parse-sexpr arg))]
        [(list 'if c t e)
         (If (parse-sexpr c) (parse-sexpr t) (parse-sexpr e))]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name) ': itype) body)
            (Fun name (parse-type-sexpr itype) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [else (error 'parse-sexpr "bad expression syntax: ~s" sexpr)]))

    (: parse-type-sexpr : Sexpr -> TYPE)
    ;; parses s-expressions into TYPEs
    (define (parse-type-sexpr sexpr)
      (match sexpr
        ['Number  (NumT)]
        ['Boolean (BoolT)]
        ;; allow shorter names too
        ['Num  (NumT)]
        ['Bool (BoolT)]
        [(list itype '-> otype)
         (FunT (parse-type-sexpr itype) (parse-type-sexpr otype))]
        [else (error 'parse-type-sexpr "bad type syntax in ~s" sexpr)]))

    (: parse : String -> PICKY)
    ;; parses a string containing a PICKY expression to a PICKY AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Typechecker and related types and helpers

    ;; this is similar to ENV, but it holds type information for the
    ;; identifiers during typechecking; it is essentially "Γ"
    (define-type TYPEENV
      [EmptyTypeEnv]
      [ExtendTypeEnv Symbol TYPE TYPEENV])

    (: type-lookup : Symbol TYPEENV -> TYPE)
    ;; similar to `lookup' for type environments; note that the
    ;; error is phrased as a typecheck error, since this indicates
    ;; a failure at the type checking stage
    (define (type-lookup name typeenv)
      (cases typeenv
        [(EmptyTypeEnv) (error 'typecheck "no binding for ~s" name)]
        [(ExtendTypeEnv id type rest-env)
         (if (eq? id name) type (type-lookup name rest-env))]))

    (: typecheck : PICKY TYPE TYPEENV -> Void)
    ;; Checks that the given expression has the specified type.
    ;; Used only for side-effects (to throw a type error), so return
    ;; a void value.
    (define (typecheck expr type type-env)
      (unless (equal? type (typecheck* expr type-env))
        (error 'typecheck "type error for ~s: expecting a ~s"
               expr type)))

    (: typecheck* : PICKY TYPEENV -> TYPE)
    ;; Returns the type of the given expression (which also means that
    ;; it checks it). This is a helper for the real typechecker that
    ;; also checks a specific return type.
    (define (typecheck* expr type-env)
      (: two-nums : PICKY PICKY -> Void)
      (define (two-nums e1 e2)
        (typecheck e1 (NumT) type-env)
        (typecheck e2 (NumT) type-env))
      (cases expr
        [(Num n) (NumT)]
        [(Id name) (type-lookup name type-env)]
        [(Add   l r) (two-nums l r) (NumT)]
        [(Sub   l r) (two-nums l r) (NumT)]
        [(Equal l r) (two-nums l r) (BoolT)]
        [(Less  l r) (two-nums l r) (BoolT)]
        [(Fun bound-id in-type bound-body)
         (FunT in-type
               (typecheck* bound-body
                           (ExtendTypeEnv bound-id in-type type-env)))]
        [(Call fun arg)
         (cases (typecheck* fun type-env)
           [(FunT in-type out-type)
            (typecheck arg in-type type-env)
            out-type]
           [else (error 'typecheck "type error for ~s: expecting a fun"
                        expr)])]
        [(With bound-id named-expr bound-body)
         (typecheck* bound-body
                     (ExtendTypeEnv bound-id
                                    (typecheck* named-expr type-env)
                                    type-env))]
        [(If cond-expr then-expr else-expr)
         (typecheck cond-expr (BoolT) type-env)
         (let ([type (typecheck* then-expr type-env)])
           (typecheck else-expr type type-env) ; enforce same type
           type)]))

    ;; Evaluator and related types and helpers

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol VAL ENV])

    (define-type VAL
      [NumV  Number]
      [BoolV Boolean]
      [FunV  Symbol PICKY ENV])

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

    (: strip-numv : Symbol VAL -> Number)
    ;; converts a VAL to a Racket number if possible, throws an error if
    ;; not using the given name for the error message
    (define (strip-numv name val)
      (cases val
        [(NumV n) n]
        ;; this error will never be reached, see below for more
        [else (error name "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (strip-numv 'arith-op val1)
                (strip-numv 'arith-op val2))))

    (: bool-op : (Number Number -> Boolean) VAL VAL -> VAL)
    ;; gets a Racket numeric binary predicate, and uses it
    ;; within a BoolV wrapper
    (define (bool-op op val1 val2)
      (BoolV (op (strip-numv 'bool-op val1)
                 (strip-numv 'bool-op val2))))

    (: eval : PICKY ENV -> VAL)
    ;; evaluates PICKY expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Id name) (lookup name env)]
        [(Add   l r) (arith-op + (eval l env) (eval r env))]
        [(Sub   l r) (arith-op - (eval l env) (eval r env))]
        [(Equal l r) (bool-op  = (eval l env) (eval r env))]
        [(Less  l r) (bool-op  < (eval l env) (eval r env))]
        [(Fun bound-id in-type bound-body)
         ;; note that types are not used at runtime,
         ;; so they're not stored in the closure
         (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           ;; `cases' requires complete coverage of all variants, but
           ;; this `else' is never used since we typecheck programs
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]
        [(With bound-id named-expr bound-body)
         (eval bound-body (Extend bound-id (eval named-expr env) env))]
        [(If cond-expr then-expr else-expr)
         (let ([bval (eval cond-expr env)])
           (if (cases bval
                 [(BoolV b) b]
                 ;; same as above: this case is never reached
                 [else (error 'eval "`if' expects a boolean, got: ~s"
                                    bval)])
             (eval then-expr env)
             (eval else-expr env)))]))

    (: run : String -> Number)
    ;; evaluate a PICKY program contained in a string
    (define (run str)
      (let ([prog (parse str)])
        (typecheck prog (NumT) (EmptyTypeEnv))
        (let ([result (eval prog (EmptyEnv))])
          (cases result
            [(NumV n) n]
            ;; this error is never reached, since we make sure
            ;; that the program always evaluates to a number above
            [else (error 'run "evaluation returned a non-number: ~s"
                              result)]))))

    ;; tests -- including translations of the FLANG tests
    (test (run "5") => 5)
    (test (run "{fun {x : Num} {+ x 1}}") =error> "type error")
    (test (run "{call {fun {x : Num} {+ x 1}} 4}") => 5)
    (test (run "{with {x 3} {+ x 1}}") => 4)
    (test (run "{with {identity {fun {x : Num} x}} {call identity 1}}")
          => 1)
    (test (run "{with {add3 {fun {x : Num} {+ x 3}}}
                  {call add3 1}}")
          => 4)
    (test (run "{with {add3 {fun {x : Num} {+ x 3}}}
                  {with {add1 {fun {x : Num} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x : {Num -> Num}} x}}
                  {with {foo {fun {x : Num} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y : Num} {+ x y}}}
                    {with {x 5} {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3} {fun {y : Num} {+ x y}}} 4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y : Num} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x : {Num -> {Num -> Num}}} {call x 1}}
                            {fun {x : Num} {fun {y : Num} {+ x y}}}}
                      123}")
          => 124)
    (test (run "{call {fun {x : Num} {if {< x 2} {+ x 5} {+ x 6}}} 1}")
          => 6)
    (test (run "{call {fun {x : Num} {if {< x 2} {+ x 5} {+ x 6}}} 2}")
          => 8)

Finally, an obvious question is whether we can get rid of *all* of the
type declarations. The main point here is that we need to somehow be
able to typecheck expressions and assign "temporary types" to them that
will later on change --- for example, when we typecheck this:

    {with {identity {fun {x} x}}
      {call identity 1}}

we need to somehow decide that the named expression has a general
function type, with no commitment on the actual input and output types
--- and then change them after we typecheck the body. (We could try to
resolve that somehow by typechecking the body first, but that will not
work, since the body must be checked with *some* type assigned to the
identifier, or it will fail.)


------------------------------------------------------------------------
## Improving Picky (contd.) [Tuesday, April 8th]

### Even better... #####################################################

This can be done using *type variables* --- things that contain boxes
that can be used to change types as typecheck progresses. The following
version does that. (Also, it gets rid of the `typecheck*` thing, since
it can be achieved by using a type-variable and a call to `typecheck`.)
Note the interesting tests at the end.

    ;;; ---<<<PICKY3>>>-------------------------------------------------
    ;; The Picky interpreter, no explicit types

    #lang pl

    #|
    The grammar:
      <PICKY> ::= <num>
                | <id>
                | { + <PICKY> <PICKY> }
                | { - <PICKY> <PICKY> }
                | { = <PICKY> <PICKY> }
                | { < <PICKY> <PICKY> }
                | { fun { <id> } <PICKY> }
                | { call <PICKY> <PICKY> }
                | { with { <id> <PICKY> } <PICKY> }
                | { if <PICKY> <PICKY> <PICKY> }

    The types are no longer part of the input syntax.

    Evaluation rules:
      eval(N,env)                = N
      eval(x,env)                = lookup(x,env)
      eval({+ E1 E2},env)        = eval(E1,env) + eval(E2,env)
      eval({- E1 E2},env)        = eval(E1,env) - eval(E2,env)
      eval({= E1 E2},env)        = eval(E1,env) = eval(E2,env)
      eval({< E1 E2},env)        = eval(E1,env) < eval(E2,env)
      eval({fun {x} E},env)      = <{fun {x} E}, env>
      eval({call E1 E2},env1)    = eval(B,extend(x,eval(E2,env1),env2))
                                  if eval(E1,env1) = <{fun {x} B}, env2>
                                 = error!  otherwise <-- never happens
      eval({with {x E1} E2},env) = eval(E2,extend(x,eval(E1,env),env))
      eval({if E1 E2 E3},env)    = eval(E2,env)  if eval(E1,env) is true
                                 = eval(E3,env)  otherwise

    Type checking rules (note the extra complexity in the `fun' rule):

      Γ ⊢ n : Number

      Γ ⊢ x : Γ(x)

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {+ A B} : Number

      Γ ⊢ A : Number   Γ ⊢ B : Number
      ———————————————————————————————
           Γ ⊢ {< A B} : Boolean

           Γ[x:=τ₁] ⊢ E : τ₂
      ————————————————————————————
      Γ ⊢ {fun {x} E} : (τ₁ -> τ₂)

      Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
      ——————————————————————————————
           Γ ⊢ {call F V} : τ₂

      Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
      ———————————————————————————————————————
                Γ ⊢ {if C T E} : τ

      Γ ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
      ——————————————————————————————
         Γ ⊢ {with {x V} E} : τ₂

    |#

    (define-type PICKY
      [Num   Number]
      [Id    Symbol]
      [Add   PICKY PICKY]
      [Sub   PICKY PICKY]
      [Equal PICKY PICKY]
      [Less  PICKY PICKY]
      [Fun   Symbol PICKY] ; no types even here
      [Call  PICKY PICKY]
      [With  Symbol PICKY PICKY]
      [If    PICKY PICKY PICKY])

    (: parse-sexpr : Sexpr -> PICKY)
    ;; parses s-expressions into PICKYs
    (define (parse-sexpr sexpr)
      (match sexpr
        [(number: n)    (Num n)]
        [(symbol: name) (Id name)]
        [(list '+ lhs rhs) (Add   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '- lhs rhs) (Sub   (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '= lhs rhs) (Equal (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list '< lhs rhs) (Less  (parse-sexpr lhs) (parse-sexpr rhs))]
        [(list 'call fun arg)
                           (Call  (parse-sexpr fun) (parse-sexpr arg))]
        [(list 'if c t e)
         (If (parse-sexpr c) (parse-sexpr t) (parse-sexpr e))]
        [(cons 'fun more)
         (match sexpr
           [(list 'fun (list (symbol: name)) body)
            (Fun name (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `fun' syntax in ~s" sexpr)])]
        [(cons 'with more)
         (match sexpr
           [(list 'with (list (symbol: name) named) body)
            (With name (parse-sexpr named) (parse-sexpr body))]
           [else (error 'parse-sexpr "bad `with' syntax in ~s" sexpr)])]
        [else (error 'parse-sexpr "bad expression syntax: ~s" sexpr)]))

    (: parse : String -> PICKY)
    ;; parses a string containing a PICKY expression to a PICKY AST
    (define (parse str)
      (parse-sexpr (string->sexpr str)))

    ;; Typechecker and related types and helpers

    ;; this is not a part of the AST now, and it also has a new variant
    ;; for type variables (see `same-type' for how it's used)
    (define-type TYPE
      [NumT]
      [BoolT]
      [FunT TYPE TYPE]
      [?T (Boxof (U TYPE #f))])

    ;; this is similar to ENV, but it holds type information for the
    ;; identifiers during typechecking; it is essentially "Γ"
    (define-type TYPEENV
      [EmptyTypeEnv]
      [ExtendTypeEnv Symbol TYPE TYPEENV])

    (: type-lookup : Symbol TYPEENV -> TYPE)
    ;; similar to `lookup' for type environments; note that the
    ;; error is phrased as a typecheck error, since this indicates
    ;; a failure at the type checking stage
    (define (type-lookup name typeenv)
      (cases typeenv
        [(EmptyTypeEnv) (error 'typecheck "no binding for ~s" name)]
        [(ExtendTypeEnv id type rest-env)
         (if (eq? id name) type (type-lookup name rest-env))]))

    (: typecheck : PICKY TYPE TYPEENV -> Void)
    ;; Checks that the given expression has the specified type. Used
    ;; only for side-effects, so return a void value. There are two
    ;; side-effects that it can do: throw an error if the input
    ;; expression doesn't typecheck, and type variables can be mutated
    ;; once their values are known -- this is done by the `types='
    ;; utility function that follows.
    (define (typecheck expr type type-env)
      ;; convenient helpers
      (: type= : TYPE -> Void)
      (define (type= type2) (types= type type2 expr))
      (: two-nums : PICKY PICKY -> Void)
      (define (two-nums e1 e2)
        (typecheck e1 (NumT) type-env)
        (typecheck e2 (NumT) type-env))
      (cases expr
        [(Num n)     (type= (NumT))]
        [(Id name)   (type= (type-lookup name type-env))]
        [(Add   l r) (type= (NumT))  (two-nums l r)] ; note that the
        [(Sub   l r) (type= (NumT))  (two-nums l r)] ; order in these
        [(Equal l r) (type= (BoolT)) (two-nums l r)] ; things can be
        [(Less  l r) (type= (BoolT)) (two-nums l r)] ; swapped...
        [(Fun bound-id bound-body)
         (let (;; the identity of these type variables is important!
               [itype (?T (box #f))]
               [otype (?T (box #f))])
           (type= (FunT itype otype))
           (typecheck bound-body otype
                      (ExtendTypeEnv bound-id itype type-env)))]
        [(Call fun arg)
         (let ([type2 (?T (box #f))]) ; same here
           (typecheck arg type2 type-env)
           (typecheck fun (FunT type2 type) type-env))]
        [(With bound-id named-expr bound-body)
         (let ([type2 (?T (box #f))]) ; and here
           (typecheck named-expr type2 type-env)
           (typecheck bound-body type
                      (ExtendTypeEnv bound-id type2 type-env)))]
        [(If cond-expr then-expr else-expr)
         (typecheck cond-expr (BoolT) type-env)
         (typecheck then-expr type type-env)
         (typecheck else-expr type type-env)]))

    (: types= : TYPE TYPE PICKY -> Void)
    ;; Compares the two input types, and throw an error if they don't
    ;; match. This function is the core of `typecheck', and it is used
    ;; only for its side-effect. Another side effect in addition to
    ;; throwing an error is when type variables are present -- they will
    ;; be mutated in an attempt to make the typecheck succeed. Note that
    ;; the two type arguments are not symmetric: the first type is the
    ;; expected one, and the second is the one that the code implies
    ;; -- but this matters only for the error messages. Also, the
    ;; expression input is used only for these errors. As the code
    ;; clearly shows, the main work is done by `same-type' below.
    (define (types= type1 type2 expr)
      (unless (same-type type1 type2)
        (error 'typecheck "type error for ~s: expecting ~a, got ~a"
               expr (type->string type1) (type->string type2))))

    (: type->string : TYPE -> String)
    ;; Convert a TYPE to a human readable string,
    ;; used for error messages
    (define (type->string type)
      (format "~s" type)
      ;; The code below would be useful, but unfortunately it doesn't
      ;; work in some cases. To see the problem, try to run the example
      ;; below that applies identity on itself. It's left here so you
      ;; can try it out when you're not running into this problem.
      #|
      (cases type
        [(NumT)  "Num"]
        [(BoolT) "Bool"]
        [(FunT i o)
         (string-append (type->string i) " -> " (type->string o))]
        [(?T box)
         (let ([t (unbox box)])
           (if t (type->string t) "?"))])
      |#)

    ;; Convenience type to make it possible to have a single `cases'
    ;; dispatch on two types instead of nesting `cases' in each branch
    (define-type 2TYPES [PairT TYPE TYPE])

    (: same-type : TYPE TYPE -> Boolean)
    ;; Compares the two input types, return true or false whether
    ;; they're the same. The process might involve mutating ?T type
    ;; variables.
    (define (same-type type1 type2)
      ;; the `PairT' type is only used to conveniently match on both
      ;; types in a single `cases', it's not used in any other way
      (cases (PairT type1 type2)
        ;; flatten the first type, or set it to the second if it's unset
        [(PairT (?T box) type2)
         (let ([t1 (unbox box)])
           (if t1
             (same-type t1 type2)
             (begin (set-box! box type2) #t)))]
        ;; do the same for the second (reuse the above case)
        [(PairT type1 (?T box)) (same-type type2 type1)]
        ;; the rest are obvious
        [(PairT (NumT) (NumT)) #t]
        [(PairT (BoolT) (BoolT)) #t]
        [(PairT (FunT i1 o1) (FunT i2 o2))
         (and (same-type i1 i2) (same-type o1 o2))]
        [else #f]))

    ;; Evaluator and related types and helpers

    (define-type ENV
      [EmptyEnv]
      [Extend Symbol VAL ENV])

    (define-type VAL
      [NumV  Number]
      [BoolV Boolean]
      [FunV  Symbol PICKY ENV])

    (: lookup : Symbol ENV -> VAL)
    ;; lookup a symbol in an environment, return its value or throw an
    ;; error if it isn't bound
    (define (lookup name env)
      (cases env
        [(EmptyEnv) (error 'lookup "no binding for ~s" name)]
        [(Extend id val rest-env)
         (if (eq? id name) val (lookup name rest-env))]))

    (: strip-numv : Symbol VAL -> Number)
    ;; converts a VAL to a Racket number if possible, throws an error if
    ;; not using the given name for the error message
    (define (strip-numv name val)
      (cases val
        [(NumV n) n]
        ;; this error will never be reached, see below for more
        [else (error name "expected a number, got: ~s" val)]))

    (: arith-op : (Number Number -> Number) VAL VAL -> VAL)
    ;; gets a Racket numeric binary operator, and uses it within a NumV
    ;; wrapper
    (define (arith-op op val1 val2)
      (NumV (op (strip-numv 'arith-op val1)
                (strip-numv 'arith-op val2))))

    (: bool-op : (Number Number -> Boolean) VAL VAL -> VAL)
    ;; gets a Racket numeric binary predicate, and uses it
    ;; within a BoolV wrapper
    (define (bool-op op val1 val2)
      (BoolV (op (strip-numv 'bool-op val1)
                 (strip-numv 'bool-op val2))))

    (: eval : PICKY ENV -> VAL)
    ;; evaluates PICKY expressions by reducing them to values
    (define (eval expr env)
      (cases expr
        [(Num n) (NumV n)]
        [(Id name) (lookup name env)]
        [(Add   l r) (arith-op + (eval l env) (eval r env))]
        [(Sub   l r) (arith-op - (eval l env) (eval r env))]
        [(Equal l r) (bool-op  = (eval l env) (eval r env))]
        [(Less  l r) (bool-op  < (eval l env) (eval r env))]
        [(Fun bound-id bound-body) (FunV bound-id bound-body env)]
        [(Call fun-expr arg-expr)
         (define fval (eval fun-expr env))
         (cases fval
           [(FunV bound-id bound-body f-env)
            (eval bound-body
                  (Extend bound-id (eval arg-expr env) f-env))]
           ;; `cases' requires complete coverage of all variants, but
           ;; this `else' is never used since we typecheck programs
           [else (error 'eval "`call' expects a function, got: ~s"
                        fval)])]
        [(With bound-id named-expr bound-body)
         (eval bound-body (Extend bound-id (eval named-expr env) env))]
        [(If cond-expr then-expr else-expr)
         (let ([bval (eval cond-expr env)])
           (if (cases bval
                 [(BoolV b) b]
                 ;; same as above: this case is never reached
                 [else (error 'eval "`if' expects a boolean, got: ~s"
                                    bval)])
             (eval then-expr env)
             (eval else-expr env)))]))

    (: run : String -> Number)
    ;; evaluate a PICKY program contained in a string
    (define (run str)
      (let ([prog (parse str)])
        (typecheck prog (NumT) (EmptyTypeEnv))
        (let ([result (eval prog (EmptyEnv))])
          (cases result
            [(NumV n) n]
            ;; this error is never reached, since we make sure
            ;; that the program always evaluates to a number above
            [else (error 'run "evaluation returned a non-number: ~s"
                              result)]))))

    ;; tests -- including translations of the FLANG tests
    (test (run "5") => 5)
    (test (run "{fun {x} {+ x 1}}") =error> "type error")
    (test (run "{call {fun {x} {+ x 1}} 4}") => 5)
    (test (run "{with {x 3} {+ x 1}}") => 4)
    (test (run "{with {identity {fun {x} x}} {call identity 1}}") => 1)
    (test (run "{with {add3 {fun {x} {+ x 3}}} {call add3 1}}") => 4)
    (test (run "{with {add3 {fun {x} {+ x 3}}}
                  {with {add1 {fun {x} {+ x 1}}}
                    {with {x 3}
                      {call add1 {call add3 x}}}}}")
          => 7)
    (test (run "{with {identity {fun {x} x}}
                  {with {foo {fun {x} {+ x 1}}}
                    {call {call identity foo} 123}}}")
          => 124)
    (test (run "{with {x 3}
                  {with {f {fun {y} {+ x y}}}
                    {with {x 5} {call f 4}}}}")
          => 7)
    (test (run "{call {with {x 3} {fun {y} {+ x y}}} 4}")
          => 7)
    (test (run "{with {f {with {x 3} {fun {y} {+ x y}}}}
                  {with {x 100}
                    {call f 4}}}")
          => 7)
    (test (run "{call {call {fun {x} {call x 1}}
                            {fun {x} {fun {y} {+ x y}}}}
                      123}")
          => 124)
    (test (run "{call {fun {x} {if {< x 2} {+ x 5} {+ x 6}}} 1}") => 6)
    (test (run "{call {fun {x} {if {< x 2} {+ x 5} {+ x 6}}} 2}") => 8)

    ;; Note that we still have a language with the same type system,
    ;; even though it looks like it could be more flexible -- for
    ;; example, the following two examples work:
    (test (run "{with {identity {fun {x} x}}
                  {call identity 1}}")
          => 1)
    (test (run "{with {identity {fun {x} x}}
                  {if {call identity {< 1 2}} 1 2}}")
          => 1)
    ;; but this doesn't, since identity can not be used with different
    ;; types:
    (test (run "{with {identity {fun {x} x}}
                  {if {call identity {< 1 2}}
                    {call identity 1}
                    2}}")
          =error> "type error")
    ;; this doesn't work either -- with an interesting error message:
    (test (run "{with {identity {fun {x} x}}
                  {call {call identity identity} 1}}")
          =error> "type error")
    ;; ... but these two work fine:
    (test (run "{with {identity1 {fun {x} x}}
                  {with {identity2 {fun {x} x}}
                    {+ {call identity1 1}
                       {if {call identity2 {< 1 2}} 1 2}}}}")
          => 2)
    (test (run "{with {identity1 {fun {x} x}}
                  {with {identity2 {fun {x} x}}
                    {call {call identity1 identity2} 1}}}")
          => 1)

Here are two other interesting things to try out --- in particular, the
type that is shown in the error message is interesting:

    (run "{fun {x} x}")
    (run "{call {fun {x} {call x x}} {fun {x} {call x x}}}")

More specifically, it is interesting to try the following to see
explicitly what our typechecker infers for `{fun {x} {call x x}}`:

    > (define b (?T (box #f)))
    > (typecheck (parse "{fun {x} {call x x}}") b (EmptyTypeEnv))
    > (cases b [(?T b) (unbox b)] [else #f])
    - : (U False TYPE)
    (FunT #0=(?T (box (FunT (?T '#&#0#) #1=(?T '#&#f)))) #1#)

To see it clearly, we can replace each `(?T #&...)` with the `...` that
it contains:

    (FunT #0=(FunT #0# #1=#f) #1#)

and to clarify further, convert the `FunT` to an infix `->`, the `#f` to
a `<?>`, and use `α` for the unknown "type variable" that is represented
by the `#1` (which is used twice):

    (#0=(#0# -> α) -> α)

This shows us that the type is recursive.

> **Sidenote#1**: You can now go back to the code and look at
> `type->string`, which is an attempt to implement a nice string
> representation for types. Can you see now why it cannot work (at least
> not without more complex code)?
>
> **Sidenote#2**: Compare the above with OCaml, which can infer such
> types when started with a `-rectypes` flag:
>
>     # let foo = fun x -> x x ;;
>     val foo : ('a -> 'b as 'a) -> 'b = <fun>
>
> The type here is identical to our type: `'a` and `'b` should be read
> as `α` and `β` resp., and `as` is used in the same way that Racket
> shows a cyclic structure using `#0#`. As for the question of why OCaml
> doesn't *always* behave as if the `-rectypes` flag is given, the
> answer is that its type checker might fall into the same trap that
> ours does --- it gets stuck with:
>
>     # let foo = (fun x -> x x) (fun x -> x x) ;;

The `α` that we see here is "kind of" in a direction of something that
resembles a polymorphic type, but we really don't have polymorphism in
our language: each box can be filled just one time with one type, and
from then on that type is used in all further uses of the same box type.
For example, note the type error we get with:

    {with {f {fun {x} x}}
      {call f {< {call f 1} {call f 2}}}}


------------------------------------------------------------------------
# Typing Recursion [Tuesday, April 8th]

We already know that without recursion life can be very boring... So we
obviously want to be able to have recursive functions --- but the
question is how will they interact with our type system. One thing that
we have seen is that by just having functions we get recursion. This was
achieved by the Y combinator function. It seems like the same should
apply to our simple typed language. The core of the Y combinator was
using an expression similar to Omega that generates the infinite loop
that is needed. In our language:

    {call {fun {x} {call x x}} {fun {x} {call x x}}}

This expression was impossible to evaluate completely since it never
terminates, but it served as a basis for the Y combinator so we need to
be able to perform this kind of infinite loop. Now, consider the type of
the first `x` --- it's used in a `call` expression as a function, so its
type must be a function type, say τ₁->τ₂. In addition, its argument is
`x` itself so its type is also τ₁ --- this means that we have:

    τ₁ -> τ₂ = τ₁

and from this we get:

    => τ₁ = τ₁ -> τ₂
          = (τ₁ -> τ₂) -> τ₂
          = ((τ₁ -> τ₂) -> τ₂) -> τ₂
          = ...

And this is a type that does not exist in our type system, since we can
only have finite types. Therefore, we have a proof by contradiction that
this expression cannot be typed in our system.

This is closely related to the fact that the typed language we have
described so far is "strongly normalizing": no matter what program you
write, it will always terminate! To see this, very informally, consider
this language without functions --- this is clearly a language where all
programs terminate, since the only way to create a loop is through
function applications. Now add functions and function application --- in
the typing rules for the resulting language, each `fun` creates a
function type (creates an arrow), and each function application consumes
a function type (deletes one arrow) --- since types are finite, the
number of arrows is finite, which means that the number of possible
applications is finite, so all programs must run in finite time.

> Note that when we discussed how to type the Y combinator we needed to
> use a `Rec` constructor --- something that the current type system
> has. Using that, we could have easily solve the `τ₁ = τ₁ -> τ₂`
> equation with `(Rec τ₁ (τ₁ -> τ₂))`.

In the our language, therefore, the halting problem doesn't even exist,
since all programs (that are properly typed) are guaranteed to halt.
This property is useful in many real-life situations (consider firewall
rules, configuration files, devices with embedded code). But the
language that we get is very limited as a result --- we really want the
power to shoot our feet...


------------------------------------------------------------------------
## Extending Picky with recursion [Tuesday, April 8th]

As we have seen, our language is strongly normalizing, which means that
to get general recursion, we must introduce a new construct (unlike
previously, when we didn't really need one). We can do this as we
previously did --- by adding a new construct to the language, or we can
somehow extend the (sub) language of type descriptions to allow a new
kind of type that can be used to solve the `τ₁ = τ₁ -> τ₂` equation. An
example of this solution would be similar to the `Rec` type constructor
in Typed Racket: a new type constructor that allows a type to refer to
itself --- and using `(Rec τ₁ (τ₁ -> τ₂))` as the solution. However,
this complicates things: type descriptions are no longer unique, since
we have `Num`, `(Rec this Num)`, and `(Rec this (Rec that Num))` that
are all equal.

For simplicity we will now take the first route and add `rec` --- an
explicit recursive binder form to the language (as with `with`, we're
going back to `rec` rather than `bindrec` to keep things simple).

First, the new BNF:

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { < <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }
              | { with { <id> : <TYPE> <PICKY> } <PICKY> }
              | { rec { <id> : <TYPE> <PICKY> } <PICKY> }
              | { if <PICKY> <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | Boolean
              | ( <TYPE> -> <TYPE> )

We now need to add a typing judgment for `rec` expressions. What should
it look like?

                ???
    ———————————————————————————
    Γ ⊢ {rec {x : τ₁ V} E} : τ₂

`rec` is similar to all the other local binding forms, like `with`, it
can be seen as a combination of a function and an application. So we
need to check the two things that those rules checked --- first, check
that the body expression has the right type assuming that the type
annotation given to `x` is valid:

      Γ[x:=τ₁] ⊢ E : τ₂   ???
    ———————————————————————————
    Γ ⊢ {rec {x : τ₁ V} E} : τ₂

Now, we also want to add the other side --- making sure that the τ₁ type
annotation is valid:

    Γ[x:=τ₁] ⊢ E : τ₂   Γ ⊢ V : τ₁
    ——————————————————————————————
     Γ ⊢ {rec {x : τ₁ V} E} : τ₂

But that will not be possible in general --- `V` is an expression that
usually includes `x` itself --- that's the whole point. The conclusion
is that we should use a similar trick to the one that we used to specify
evaluation of recursive binders --- the same environment is used for
both the named expression and for the body expression:

    Γ[x:=τ₁] ⊢ E : τ₂   Γ[x:=τ₁] ⊢ V : τ₁
    —————————————————————————————————————
         Γ ⊢ {rec {x : τ₁ V} E} : τ₂

You can also see now that if this rule adds an arrow type to the Γ type
environment (i.e., `τ₁` is an arrow), then it is doing so in a way that
makes it possible to use it over and over, making it possible to run
infinite loops in this language.

Our complete language specification is below.

    <PICKY> ::= <num>
              | <id>
              | { + <PICKY> <PICKY> }
              | { < <PICKY> <PICKY> }
              | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
              | { call <PICKY> <PICKY> }
              | { with { <id> : <TYPE> <PICKY> } <PICKY> }
              | { rec  { <id> : <TYPE> <PICKY> } <PICKY> }
              | { if <PICKY> <PICKY> <PICKY> }

    <TYPE>  ::= Number
              | Boolean
              | ( <TYPE> -> <TYPE> )

    Γ ⊢ n : Number

    Γ ⊢ x : Γ(x)

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {+ A B} : Number

    Γ ⊢ A : Number   Γ ⊢ B : Number
    ———————————————————————————————
         Γ ⊢ {< A B} : Boolean

              Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————————————
    Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

    Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
    ——————————————————————————————
         Γ ⊢ {call F V} : τ₂

    Γ ⊢ C : Boolean   Γ ⊢ T : τ   Γ ⊢ E : τ
    ———————————————————————————————————————
              Γ ⊢ {if C T E} : τ

    Γ ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
    ——————————————————————————————
     Γ ⊢ {with {x : τ₁ V} E} : τ₂

    Γ[x:=τ₁] ⊢ V : τ₁   Γ[x:=τ₁] ⊢ E : τ₂
    —————————————————————————————————————
         Γ ⊢ {rec {x : τ₁ V} E} : τ₂


------------------------------------------------------------------------
# Typing Data [Tuesday, April 8th]

> [PLAI §27]

An important concept that we have avoided so far is user-defined types.
This issue exists in practically all languages, including the ones we
did so far, since a language without the ability to create new
user-defined types is a language with a major problem. (As a side note,
we did talk about mimicking an object system using plain closures, but
it turns out that this is insufficient as a replacement for true
user-defined types --- you can kind of see that in the Schlac language,
where the lack of all types mean that there is no type error.)

In the context of a statically typed language, this issue is even more
important. Specifically, we talked about typing recursive code, but we
should also consider typing recursive data. For example, we will start
with a `length` function in an extension of the language that has
`empty?`, `rest`, and `NumCons` and `NumEmpty` constructors:

    {rec {length : ???
          {fun {l : ???} : Number
            {if {empty? l}
              0
              {+ 1 {call length {rest l}}}}}}
      {call length {NumCons 1 {NumCons 2 {NumCons 3 {NumEmpty}}}}}}

But adding all of these new functions as built-ins is getting messy: we
want our language to have a form for defining new kinds of data. In this
example --- we want to be able to define the `NumList` type for lists of
numbers. We therefore extend the language with a new `with-type` form
for creating new user-defined types, using variants in a similar way to
our own course language:

    {with-type {NumList [NumEmpty]
                        [NumCons Number ???]}
      {rec {length : ???
            {fun {l : ???} : Number
              ...}}
        ...}}

We assume here that the `NumList` definition provides us with a number
of new built-ins --- `NumEmpty` and `NumCons` constructors, and assume
also a `cases` form that can be used to both test a value and access its
components (with the constructors serving as patterns). This makes the
code a little different than what we started with:

    {with-type {NumList [NumEmpty]
                        [NumCons Number ???]}
      {rec {length : ???
            {fun {l : ???} : Number
              {cases l
                [{NumEmpty}    0]
                [{NumCons x r} {+ 1 {call length r}}]}}}
        {call length {NumCons 1 {NumCons 2 {NumCons 3 {NumEmpty}}}}}}}

The question is what should the `???` be filled with? Clearly, recursive
data types are very common and we need to support them. The scope of
`with-type` should therefore be similar to `rec`, except that it works
at the type level: the new type is available for its own definition.
This is the complete code now:

    {with-type {NumList [NumEmpty]
                        [NumCons Number NumList]}
      {rec {length : (NumList -> Number)
            {fun {l : NumList} : Number
              {cases l
                [{NumEmpty}    0]
                [{NumCons x r} {+ 1 {call length r}}]}}}
        {call length {NumCons 1 {NumCons 2 {NumCons 3 {NumEmpty}}}}}}}

(Note that in the course language we can do just that, and in addition,
the `Rec` type constructor can be used to make up recursive types.)

An important property that we would like this type to have is for it to
be *well founded*: that we'd never get stuck in some kind of type-level
infinite loop. To see that this holds in this example, note that some of
the variants are self-referential (only `NumCons` here), but there is at
least one that is not (`NumEmpty`) --- if there wasn't any simple
variant, then we would have no way to construct instances of this type
to begin with!

[As a side note, if the language has lazy semantics, we could use such
types --- for example:

    {with-type {NumList [NumCons Number NumList]}
      {rec {ones : NumList {NumCons 1 ones}}
        ...}}

Reasoning about such programs requires more than just induction though.]


------------------------------------------------------------------------
## Judgments for recursive types [Tuesday, April 8th]

If we want to have a language that is basically similar to the course
language, then --- as seen above --- we'd use a similar `cases`
expression. How should we type-check such expressions? In this case, we
want to verify this:

    Γ ⊢ {cases l [{NumEmpty} 0]
                 [{NumCons x r} {+ 1 {call length r}}]} : Number

Similarly to the judgment for `if` expressions, we require that the two
result expressions are numbers. Indeed, you can think about `cases` as a
more primitive tool that has the functionality of `if` --- in other
words, given such user-defined types we could implement booleans as a
new type and and implement `if` using `cases`. For example, wrap
programs with:

    {with-type {Bool [True] [False]} ...}

and translate `{if E1 E2 E3}` to `{cases E1 [{True} E2] [{False} E3]}`.

Continuing with typing `cases`, we now have:

    Γ ⊢ 0 : Number           Γ ⊢ {+ 1 {call length r}} : Number
    ————————————————————————————————————————————————————————————
    Γ ⊢ {cases l [{NumEmpty} 0]
                 [{NumCons x r} {+ 1 {call length r}}]} : Number

But this will not work --- we have no type for `r` here, so we can't
prove the second subgoal. We need to consider the `NumList` type
definition as something that, in addition to the new built-ins, provides
us with type judgments for these built-ins. In the case of the `NumCons`
variant, we know that using `{NumCons x r}` is a pattern that matches
`NumList` values that are a result of this variant constructor but it
also binds `x` and `r` to the values of the two fields, and since all
uses of the constructor are verified, the fields have the declared
types. This means that we need to extend Γ in this rule so we're able to
prove the two subgoals. Note that we do the same for the `NumEmpty`
case, except that there are no new bindings there.

    Γ ⊢ 0 : Number
    Γ[x:=Number; r:=NumList] ⊢ {+ 1 {call length r}} : Number
    ————————————————————————————————————————————————————————————
    Γ ⊢ {cases l [{NumEmpty} 0]
                 [{NumCons x r} {+ 1 {call length r}}]} : Number

Finally, we need to verify that the value itself --- `l` --- has the
right type: that it is a `NumList`.

    Γ ⊢ l : NumList
    Γ ⊢ 0 : Number
    Γ[x:=Number; r:=NumList] ⊢ {+ 1 {call length r}} : Number
    ————————————————————————————————————————————————————————————
    Γ ⊢ {cases l [{NumEmpty} 0]
                 [{NumCons x r} {+ 1 {call length r}}]} : Number

But why `NumList` and not some other defined type? This judgment needs
to do a little more work: it should inspect all of the variants that are
used in the branches, find the type that defines them, then use that
type as the subgoal. Furthermore, to make the type checker more useful,
it can check that we have complete coverage of the variants, and that no
variant is used twice:

    Γ ⊢ l : NumList
        (also need to show that NumEmpty and NumCons are all of
        the variants of NumList, with no repetition or extras.)
    Γ ⊢ 0 : Number
    Γ[x:=Number; r:=NumList] ⊢ {+ 1 {call length r}} : Number
    ————————————————————————————————————————————————————————————
    Γ ⊢ {cases l [{NumEmpty} 0]
                 [{NumCons x r} {+ 1 {call length r}}]} : Number

Note that how this is different from the version in the textbook --- it
has a `type-case` expression with the type name mentioned explicitly ---
for example: `{type-case l NumList {{NumEmpty} 0} ...}`. This is
essentially the same as having each defined type come with its own
`cases` expression. Our rule needs to do a little more work, but overall
it is a little easier to use. (And the same goes for the actual
implementation of the two languages.)

In addition to `cases`, we should also have typing judgments for the
constructors. These are much simpler, for example:

    Γ ⊢ x : Number   Γ ⊢ r : NumList
    ————————————————————————————————
      Γ ⊢ {NumCons x r} : NumList

Alternatively, we could add the constructors as new functions instead of
new special forms --- so in the Picky language they'd be used in `call`
expressions. The `with-type` will then create the bindings for its scope
at runtime, and for the typechecker it will add the relevant types to Γ:

    Γ[NumCons:=(Number NumList -> NumList); NumEmpty:=(-> NumList)]

(This requires functions of any arity, of course.) Using accessor
functions could be similarly simpler than `cases`, but less convenient
for users.

Note about representation: a by-product of our type checker is that
whenever we have a `NumList` value, we know that it *must* be an
instance of either `NumEmpty` or `NumCons`. Therefore, we could
represent such values as a wrapped value container, with a single bit
that distinguishes the two. This is in contrast to dynamically typed
languages like Racket, where every new type needs to have its own
globally unique tag.


------------------------------------------------------------------------
## "Runaway" instances [extra] [Tuesday, April 8th]

Consider this code:

    {with-type {NumList [NumEmpty] ...} {NumEmpty}}

We now know how to type check its validity, but what about the type of
this whole expression? The obvious choice would be `NumList`:

    {with-type {NumList [NumEmpty] ...} {NumEmpty}} : NumList

There is a subtle but important problem here: the expression evaluates
to a `NumList`, but we can no longer use this value, since we're out of
the scope of the `NumList` type definition! In other words, we would
typecheck a program that is pretty much useless.

Even if we were to allow such a value to flow to a different context
with a `NumList` type definition, we wouldn't want the two to be
confused --- following the principle of lexical scope, we'd want each
type definition to be unique to its own scope even if it has the same
concrete name. For example, using `NumList` as the type of the inner
`with-type` here:

    {with-type {NumList something-completely-different}
      {with-type {NumList [NumEmpty] ...}
        {NumEmpty}}}

would make it wrong.

(In fact, we might want to have a new type even if the value goes
outside of this scope and back in. The default struct definitions in
Racket have exactly this property --- they're *generative* --- which
means that each "call" to `define-struct` creates a new type, so:

    (define (two-foos)
      (define (foo x)
        (struct foo (x))
        (foo x))
      (list (foo 1) (foo 2)))

returns two instances of two *different* `foo` types!)

One way to resolve this is to just forbid the type from escaping the
scope of its definition --- so we would forbid the type of the
expression from being `NumList`, which makes

    {with-type {NumList [NumEmpty] ...} {NumEmpty}} : NumList

invalid. But that's not enough --- what about returning a compound value
that *contains* an instance of `NumList`? For example --- what if we
return a list or a function with a `NumList` instance?

    {with-type {NumList [NumEmpty] ...}
      {fun {x} {NumEmpty}}} : Num -> NumList??

Obviously, we would need to extend this restriction: the resulting type
should not mention the defined type *at all* --- not even in lists or
functions or anything else. This is actually easy to do: if the overall
expression is type-checked in the surrounding lexical scope, then it is
type-checked in the surrounding type environment (Γ), and that
environment has nothing in it about `NumList` (well, nothing about
*this* `NumList`).

Note that this is, very roughly speaking, what our course language does:
`define-type` can only define new types when it is used at the
top-level.

This works fine with the above assumption that such a value would be
completely useless --- but there are aspects of such values that are
useful. Such types are close to things that are known as "existential
types", and they are for defining opaque values that you can do nothing
with except pass them around, and only code in a specific lexical
context can actually use them. For example, you could lump together the
value with a function that can work on this value. If it wasn't for the
`define-type` top-level restriction, we could write the following:

    (: foo : Integer -> (List ??? (??? -> Integer)))
    (define (foo x)
      (define-type FOO [Foo Integer])
      (list (Foo 1)
            (lambda (f)
              (cases f [(Foo n) (* n n)]))))

There is nothing that we can do with resulting `Foo` instance (we don't
even have a way to name it) --- but in the result of the above function
we get also a function that could work on such values, even ones from
different calls:

    ((second (foo 1)) (first (foo 2))) -> 4

Since such kind of values are related to hiding information, they're
useful (among other things) when talking about module systems (and
object systems), where you want to have a local scope for a piece of
code with bindings that are not available outside it.


------------------------------------------------------------------------
# Type soundness [Tuesday, April 8th]

> [PLAI §28]

Having a type checker is obviously very useful --- but to be able to
*rely* on it, we need to provide some kind of a formal account of the
kind of guarantees that we get by using one. Specifically, we want to
guarantee that a program that type-checks is guaranteed to *never* fail
with a type error. Such type errors in Racket result in an exception
--- but in C they can result in anything. In our simple Picky
implementation, we still need to check the resulting value in `run`:

    (typecheck prog (NumT) (EmptyTypeEnv))
    (let ([result (eval prog (EmptyEnv))])
      (cases result
        [(NumV n) n]
        ;; this error is never reached, since we make sure
        ;; that the program always evaluates to a number above
        [else (error 'run "evaluation returned a non-number: ~s"
                          result)]))

A soundness proof for this would show that checking the result (in
`cases`) is not needed. However, the check must be there since Typed
Racket (or any other typechecker) is far from making up and verifying
such a proof by itsef.

In this context we have a specific meaning for "fail with a type error",
but these failures can be very different based on the kind of properties
that your type checker verifies. This property of a type system is
called *soundness*: a *sound* type system is one that will never allow
such errors for type-checked code:

>  For any program `p`, if we can type-check `p : τ`, then `p` will
>  evaluate to a value that is in the type `τ`.

The importance of this can be seen in that it is the *only* connection
between the type system and code execution. Without it, a type system is
a bunch of syntactic rules that are completely disconnected from how the
program runs. (Note also that --- "in the type" --- works for the
(common) case where types are sets of values.)

But this statement isn't exactly what we need --- it states a property
that is too strong: what if execution gets stuck in an infinite loop?
(That wasn't needed before we introduced `rec`, where we could extend
the conclusion part to: "... then `p` will terminate and evaluate to a
value that is in the type `τ`".) We therefore need to revise it:

>  For any program `p`, if we can type-check `p : τ`, and if `p`
>  terminates and returns `v`, then `v` is in the type `τ`.

But there are still problems with this. Some programs evaluate to a
value, some get stuck in an infinite loop, and some ... throw an error.
Even with type checking, there are still cases when we get runtime
errors. For example, in practically all statically typed languages the
length of a list is not encoded in its type, so `{first null}` would
throw an error. (It's possible to encode more information like that in
types, but there is a downside to this too: putting more information in
the type system means that things get less flexible, and it becomes more
difficult to write programs since you're moving towards proving more
facts about them.)

Even if we were to encode list lengths in the type, we would still have
runtime errors: opening a missing file, writing to a read-only file
fetching a non-existent URL, etc, so we must find some way to account
for these errors. Some "solutions" are:

* For all cases where an error should be raised, just return some value
  (of the appropriate type). For example, `(first l)` could return `0`
  if the list is empty; `(substring "foo" 10 20)` would return "huh?",
  etc. It seems like a dangerous way to resolve the issue, but in fact
  that's what most C library calls do: return some bogus value (for
  example, `malloc()` returns `NULL` when there is no available memory),
  and possibly set some global flag that specifies the exact error. (The
  main problem with this is that C programmers often don't check all of
  these conditions, leading to propagating undetected errors further
  down --- and all of this is a very rich source of security issues.)

* For all cases where an error should be raised, just get stuck into an
  infinite loop. This approach is obviously impractical --- but it is
  actually popular in some theoretical circles. The reason for that is
  that theory people will often talk about "domains", and to express
  facts about computation on these domains, they're extended with a
  "bottom" value that represents a diverging computation. Since this
  introduction is costly in terms of work that it requires, adding one
  more such value can lead to more effort than re-using the same
  "bottom" value.

* Raise an exception. This works out better than the above two extremes,
  and it is the approach taken by practically all modern languages.

So, assuming exceptions, we need to further refine what it means for a
type system to be sound:

>  For any program `p`, if we can type-check `p : τ`, and if `p`
>  terminates without exceptions and returns `v`, then `v` is in the
>  type `τ`.

An important thing to note here is that languages can have very
different ideas about where to raise an exception. For example, Scheme
implementations often have a trivial type-checker and throw runtime
exceptions when there is a type error. On the other hand, there are
systems that express much more in their type system, leaving much less
room for runtime exceptions.

A soundness proof ties together a particular type system with the
statement that it is sound. As such, it is where you tie the knot
between type checking (which happens at the syntactic level) and
execution (dealing with runtime values). These are two things that are
usually separate --- we've seen throughout the course many examples for
things that could be done only at runtime, and things that should happen
completely on the syntax. `eval` is the important *semantic function*
that connects the two worlds (`compile` also did this, when we converted
our evaluator to a compiler) --- and in here, it is the soundness proof
that makes the connection.

To demonstrate the kind of differences between the two sides, consider
an `if` expression --- when it is executed, only one branch is
evaluated, and the other is irrelevant, but when we check its type,
*both* sides need to be verified. The same goes for a function whose
execution get stuck in an infinite loop: the type checker will not get
into a loop since it is not executing the code, only scans the (finite)
syntax.

The bottom line here is that type soundness is really a claim that the
type system provides some guarantees about the runtime behavior of
programs, and its proof demonstrates that these guarantees do hold. A
fundamental problem with the type system of C and C++ is that it is not
sound: these languages *have* a type system, but it does not provide
such runtime guarantees. (In fact, C is even worse in that it really has
two type systems: there is the system that C programmers usually
interact with, which has a conventional set of type --- including even
higher-order function types; and there is the machine-level type system,
which only talks about various bit lengths of data. For example, using
"%s" in a printf() format string will blindly copy characters from the
address pointed to by the argument until it reaches a 0 character ---
even if the actual argument is really a floating point number or a
function.)

Note that people often talk about "strongly typed languages". This term
is often meaningless in that different people take it to mean different
things: it is sometimes used for a language that "has a static type
checker", or a language that "has a non-trivial type checker", and
sometimes it means that a language has a sound type system. For most
people, however, it means some vague idea like "a language like C or
Pascal or Java" rather than some concrete definition.


------------------------------------------------------------------------
# Explicit polymorphism [extra] [Tuesday, April 8th]

> [PLAI §29]

Consider the `length` definition that we had --- it is specific for
`NumList`s, so rename it to `lengthNum`:

    {with-type {NumList ...}
      {rec {lengthNum : (NumList -> Num)
            {fun {l : NumList} : Num
              {cases l
                [{NumEmpty}    0]
                [{NumCons x r} {+ 1 {call lengthNum r}}]}}}
        {call lengthNum
              {NumCons 1 {NumCons 2 {NumCons 3 {NumEmpty}}}}}}}

To simplify things, assume that types are previously defined, and that
we have an even more Racket-like language where we simply write a
`define` form:

    {define lengthNum
      {fun {l : NumList} : Num
        {cases l
          [{NumEmpty}    0]
          [{NumCons x r} {+ 1 {call lengthNum r}}]}}}

What would happen if, for example, we want to take the length of a list
of booleans? We won't be able to use the above code since we'd get a
type error. Instead, we'd need a separate definition for the other kind
of length:

    {define lengthBool
      {fun {l : BoolList} : Num
        {cases l
          [{BoolEmpty}    0]
          [{BoolCons x r} {+ 1 {call lengthBool r}}]}}}

We've designed a statically typed language that is effective in catching
a large number of errors, but it turns out that it's too restrictive ---
we cannot implement a single generic `length` function. Given that our
type system allows an infinite number of types, this is a major problem,
since every new type that we'll want to use in a list requires writing a
new definition for a length function that is specific to this type.

One way to address the problem would be to somehow add a new `length`
primitive function, with specific type rules to make it apply to all
possible types. (Note that the same holds for the list type too --- we
need a new type definition for each of these, so this solution implies a
new primitive type that will do the same generic trick.) This is
obviously a bad idea: there are other functions that will need the same
treatment (`append`, `reverse`, `map`, `fold`, etc), and there are other
types with similar problems (any new container type). A good language
should allow writing such a length function inside the language, rather
than changing the language for every new addition.

Going back to the code, a good question to ask is what is it exactly
that is different between the two `length` functions? The answer is that
there's very little that is different. To see this, we can take the code
and replace all occurrences of `Num` or `Bool` by some `???`. Even
better --- this is actually abstracting over the type, so we can use a
familiar type variable, τ:

    {define length〈τ〉
      {fun {l : 〈τ〉List} : Num
        {cases l
          [{〈τ〉Empty}    0]
          [{〈τ〉Cons x r} {+ 1 {call length〈τ〉 r}}]}}}

This is a kind of a very low-level "abstraction" --- we replace parts of
the text --- parts of identifiers --- with a kind of a syntactic meta
variable. But the nature of this abstraction is something that should
look familiar --- it's abstracting over the code, so it's similar to a
macro. It's not really a macro in the usual sense --- making it a real
macro involves answering questions like what does `length` evaluate to
(in the macro system that we've seen, a macro is not something that is a
value in itself), and how can we use these macros in the `cases`
patterns. But still, the similarity should provide a good intuition
about what goes on --- and in particular the basic fact is the same:
this *is* an abstraction that happens at the syntax level, since
typechecking is something that happens at that level.

To make things more manageable, we'll want to avoid the abstraction over
parts of identifiers, so we'll move all of the meta type variables, and
make them into arguments, using `〈...〉` brackets to stand for "meta
level applications":

    {define length〈τ〉
      {fun {l : List〈τ〉} : Num
        {cases l
          [{Empty〈τ〉}    0]
          [{Cons〈τ〉 x r} {+ 1 {call length〈τ〉 r}}]}}}

Now, the first "〈τ〉" is actually a kind of an input to `length`, it's
a binding that has the other `τ`s in its scope. So we need to have the
syntax reflect this somehow --- and since `fun` is the way that we write
such abstractions, it seems like a good choice:

    {define length
      {fun {τ}
        {fun {l : List〈τ〉} : Num
          {cases l
            [{Empty〈τ〉}    0]
            [{Cons〈τ〉 x r} {+ 1 {call length〈τ〉 r}}]}}}}

But this is very confused and completely broken. The new abstraction is
not something that is implemented as a function --- otherwise we'll need
to somehow represent type values within our type system. (Trying that
has some deep problems --- for example, if we have such type values,
then it needs to have a type too; and if we add some `Type` for this,
then `Type` itself should be a value --- one that has *itself* as its
type!)

So instead of `fun`, we need a new kind of a syntactic, type-level
abstraction. This is something that is acts as a function that gets used
by the type checker. The common way to write such functions is with a
capital `lambda` --- `Λ`. Since we already use Greek letters for things
that are related to types, we'll use that as is (again, with "〈〉"s),
instead of a confusing capitalized `Lambda` (or a similarly confusing
`Fun`):

    {define length
      〈Λ 〈τ〉                   ; sidenote: similar to (All (t) ...)
        {fun {l : List〈τ〉} : Num
          {cases l
            [{Empty〈τ〉}    0]
            [{Cons〈τ〉 x r} {+ 1 {call length〈τ〉 r}}]}}〉}

and to use this `length` we'll need to instantiate it with a specific
type:

    {+ {call length〈Num〉 {list 1 2}}
       {call length〈Bool〉 {list #t #f}}}

Note that we have several kinds of meta-applications, with slightly
different intentions:

* length〈τ〉 is the recursive call, which needs to keep using the same
  type that initiated the `length` call. It makes sense to have it
  there, since `length` is itself a type abstraction.

* List〈τ〉 is using `List` as if it's also this kind of an abstraction,
  except that instead of abstracting over some generic code, it
  abstracts over a generic type. This makes sense too: it naturally
  leads to a generic definition of `List` that works for all types since
  it is also an abstraction.

* Finally there are `Empty〈τ〉` and `Cons〈τ〉` that are used for
  patterns. This might not be necessary, since they are expected to be
  variants of the `List〈τ〉` type. But if we were doing this without
  pattern matching (for example, see the book) then we'd need `null?`
  and `rest` functions. In that case, the meta application would make
  sense --- `null?〈τ〉` and `rest〈τ〉` are the τ-specific versions of
  these functions which we get with this meta-application, in the same
  way that using `length` needs an explicit type.

Actually, the last item points at one way in which the above sample
calls:

    {+ {call length〈Num〉 {list 1 2}}
       {call length〈Bool〉 {list #t #f}}}

are broken --- we should also have a type argument for `list`:

    {+ {call length〈Num〉 {list〈Num〉 1 2}}
       {call length〈Bool〉 {list〈Bool〉 #t #f}}}

or, given that we're in the limited picky language:

    {+ {call length〈Num〉 {cons〈Num〉 1 {cons〈Num〉 2 null〈Num〉}}}
       {call length〈Bool〉 {cons〈Bool〉 #t {cons〈Bool〉 #f null〈Bool〉}}}}

Such a language is called "parametrically polymorphic with explicit type
parameters" --- it's *polymorphic* since it applies to any type, and
it's *explicit* since we have to specify the types in all places.


------------------------------------------------------------------------
## Polymorphism in the type description language [extra] [Tuesday, April 8th]

Given our definition for `length`, the type of `length〈Num〉` is
obvious:

    length〈Num〉 : List〈Num〉 -> Num

but what would be the type of `length` by itself? If it was a function
(which was a broken idea we've seen), then we would write:

    length : τ -> (List〈τ〉 -> Num)

But this is broken in the same way: the first arrow is fundamentally
different than the second --- one is used for a `Λ`, and the other for a
`fun`. In fact, the arrows are even more different, because the two `τ`s
are very different: the first one *binds* the second. So the first arrow
is bogus --- instead of an arrow we need some way to say that this is a
type that "for all τ" is "List〈τ〉 -> Num". The common way to write
this should be very familiar:

    length : ∀τ. List〈τ〉 -> Num

Finally, `τ` is usually used as a meta type variable; for these types
the convention is to use the first few Greek letters, so we get:

    length : ∀α. List〈α〉 -> Num

And some more examples:

    filter : ∀α. (α->Bool) × List〈α〉 -> List〈α〉
    map : ∀α,β. (α->β) × List〈α〉 -> List〈β〉

where `×` stands for multiple arguments (which isn't mentioned
explicitly in Typed Racket).


------------------------------------------------------------------------
## Type judgments for explicit polymorphism and execution [extra] [Tuesday, April 8th]

Given our notation for polymorphic functions, it looks like we're
introducing a runtime overhead. For example, our `length` definition:

    {define length
      〈Λ 〈α〉
        {fun {l : List〈α〉} : Num
          {cases l
            [{Empty〈α〉}    0]
            [{Cons〈α〉 x r} {+ 1 {call length〈α〉 r}}]}}〉}

looks like it now requires another curried call for each iteration
through the list. This would be bad for two reasons: first, one of the
main goals of static type checking is to *avoid* runtime work, so adding
work is highly undesirable. An even bigger problem is that types are
fundamentally a syntactic thing --- they should not exist at runtime, so
we don't want to perform these type applications at runtime simply
because we don't want types to exist at runtime. If you think about it,
then every traditional compiler that typechecks code does so while
compiling, not when the resulting compiled program runs. (A recent
exception in various languages are "dynamic" types that are used in a
way that is similar to plain (untyped) Racket.)

This means that we want to eliminate these applications in the
typechecker. Even better: instead of complicating the typechecker, we
can begin by applying all of the type meta-applications, and get a
result that does not have any such applications or any type variables
left --- then use the simple typechecker on the result. This process is
called "type elaboration".

As usual, there are two new formal rules for dealing with these
abstractions --- one for type abstractions and another for type
applications. Starting from the latter:

       Γ ⊢ E : ∀α.τ
    ———————————————————
    Γ ⊢ E〈τ₂〉 : τ[τ₂/α]

which means that when we encounter a type application E〈τ₂〉 where E
has a polymorphic type ∀α.τ, then we substitute the type variable α with
the input type τ₂. Note that this means that conceptually, the
typechecker is creating all of the different (monomorphic) `length`
versions, but we don't need all of them for execution --- having checked
the types, we can have a single `length` function which would be similar
to the function that Racket uses (i.e., the same "low level" code with
types erased).

To see how this works, consider our length use, which has a type of `∀α.
List〈α〉 -> Num`. We get the following proof that ends in the exact
type of `length` (remember that when you prove you climb up):

    Γ ⊢ length : ∀α. List〈α〉 -> Num
    ——————————————————————————————————————————————
    Γ ⊢ length〈Bool〉 : (List〈α〉 -> Num)[Bool/α]
    ——————————————————————————————————————————————
    Γ ⊢ length〈Bool〉 : List〈Bool〉 -> Num     [...]
    ——————————————————————————————————————————————
    Γ ⊢ {call length〈Bool〉 {cons〈Bool〉 ...}} : Num

The second rule for type abstractions is:

       Γ[α] ⊢ E : τ
    ———————————————————
    Γ ⊢ 〈Λ〈α〉 E〉 : ∀α.τ

This rule means that to typecheck a type abstraction, we need to check
the body while binding the type variable α --- but it's not bound to
some specific type. Instead, it's left unspecified (or
non-deterministic) --- and typechecking is expected to succeed without
requiring an actual type. If some specific type is actually required,
then typechecking should fail. The intuition behind this is that a
polymorphic function can be one only if it doesn't need some specific
type --- for example, `{fun {x} {- {+ x 1} 1}}` is an identity function,
but it's an identity that requires the input to be a number, and
therefore it cannot have a polymorphic ∀α.α type like `{fun {x} x}`.

Another example is our `length` function --- the actual type that the
list holds better not matter, or our `length` function is not really
polymorphic. This makes sense: to typecheck the function, this rule
means that we need to typecheck the body, with α being some unknown type
that cannot be used.

One thing that we need to be careful when applying any kind of
abstraction (and the first rule does just that for a very simple
lambda-calculus-like language) is infinite loops. But in the case of our
type language, it turns out that this lambda-calculus that gets used at
the meta-level is one of the strongly normalizing kinds, therefore no
infinite loops happen. Intuitively, this means that we should be able to
do this elaboration in just one pass over the code. Furthermore, there
are no side-effects, therefore we can safely cache the results of
applying type abstraction to speed things up. In the case of `length`,
using it on a list of `Num` will lead to one such application, but when
we later get to the recursive call we can reuse the (cached) first
result.


------------------------------------------------------------------------
## Explicit polymorphism conclusions [extra] [Tuesday, April 8th]

Quoted directly from the book:

> Explicit polymorphism seems extremely unwieldy: why would anyone want
> to program with it? There are two possible reasons. The first is that
> it's the only mechanism that the language designer gives for
> introducing parameterized types, which aid in code reuse. The second
> is that the language includes some additional machinery so you don't
> have to write all the types every time. In fact, C++ introduces a
> little of both (though much more of the former), so programmers are,
> in effect, manually programming with explicit polymorphism virtually
> every time they use the STL (Standard Template Library). Similarly,
> the Java 1.5 and C# languages support explicit polymorphism. But we
> can possibly also do better than foist this notational overhead on the
> programmer.


------------------------------------------------------------------------
# Web Programming [Tuesday, April 15th]

> [PLAI §15]

Consider web programming as a demonstration of a frequent problem. The
HTTP protocol is *stateless*: each HTTP query can be thought of as
running a program (or a function), getting a result, then killing it.
This makes interactive applications hard to write.

> For example, consider this behavior (which is based on a real story of
> a probably not-so-real bug known as "the ITA bug"):
>
> * You go on a flight reservation website, and look at flights to Paris
>   or London for a vacation.
>
> * You get a list of options, and choose one for Paris and one for
>   London, ctrl-click the first and then the second to open them in new
>   tabs.
>
> * You look at the descriptions and decide that you like the first one
>   best, so you click the button to buy the ticket.
>
> * A month later you go on your plane, and when you land you realize
>   that you're in the wrong country --- the ticket you paid for was the
>   second one after all...

Obviously there is some fundamental problem here --- especially given
that this problem plagued many websites early on (and these days these
kind of problems can still be found in some places (like the registrar's
system), except that people are much more aware of it, and are much more
prepared to deal with it). In an attempt to clarify what it is exactly
that went wrong, we might require that each interaction will result in
something that is deterministically based on what the browser window
shows when the interaction is made --- but even that is not always true.
Consider the same scenario except with a bookstore and an "add to my
cart" button. In this case you *want* to be able to add one item to the
cart in the first window, then switch to the second window and click
"add" there too: you want to end up with a cart that has *both* items.

The basic problem here is HTTP's statelessness, something that both web
servers and web browsers use extensively. Browsers give you navigation
buttons and sometimes will not even communicate with the web server when
you use them (instead, they'll show you cached pages), they give you the
ability to open multiple windows or tabs from the current one, and they
allow you to "clone" the current tab. If you view each set of HTTP
queries as a session --- this means that web browsers allow you to go
back and forth in time, explore multiple futures in parallel, and clone
your current world.

These are features that the HTTP protocol intentionally allows by being
stateless, and that people have learned to use effectively. A stateful
protocol (like ssh, or ftp) will run in a single process (or a program,
or a function) that is interacting with you directly, and this process
dies only when you disconnect. A big advantage of stateful protocols is
their ability to be very interactive and rely on state (eg, an editor
updates a character on the screen, relying on the rest of it showing the
same text); but stateless protocols can scale up better, and deal with a
more hectic kind of interaction (eg, open a page on an online store,
keep it open and buy the item a month later; or any of the above "time
manipulation" devices).

> Side-note: Some people think that Ajax is the answer to all of these
> problems. In reality, Ajax is layered on top of (asynchronous) web
> queries, so in fact it is the exact same situation. You do have an
> option of creating an application that works completely on the client
> side, but that wouldn't be as attractive --- and even if you do so,
> you're still working inside a browser that can play the same time
> tricks.


------------------------------------------------------------------------
## Basic web programming [Tuesday, April 15th]

> [PLAI §16]

Obviously, writing programs to run on a web server is a profitable
activity, and therefore highly desirable. But when we do so, we need to
somehow cope with the web's statelessness. To see the implications from
a PL point of view we'll use a small "toy" example that demonstrates the
basic issues --- an "addition" service:

- Server sends a page asking for a number,
- User types a number and hits enter,
- Server sends a second page asking for another number,
- User types a second number and hits enter,
- Server sends a page showing the sum of the two numbers.

[Such a small application is not realistic, of course: you can obviously
ask for both numbers on the same page. We still use it, though, to
minimize the general interaction problem to a more comprehensible core
problem.]

Starting from just that, consider how you'd *want* to write the code for
such a service. (If you have experience writing web apps, then try to
forget all of that now, and focus on just this basic problem.) The plain
version of what we want to implement is:

    (print
      (+ (read "First number")
         (read "Second number")))

which is trivially "translated" to:

    (web-display
      (+ (web-read "First number")
         (web-read "Second number")))

But this is never going to work. The interaction is limited to
presenting the user with some data and that's all --- you cannot do any
kind of interactive querying. For the purpose of making this more
concrete, imagine that `web-read` and `web-display` both communicate
information to the user via something like `error`: the information is
sent and at the same time the whole computation is aborted. With this,
the above code will just manage to ask for the first number and nothing
else happens.

We therefore must turn this server function into three separate
functions: one that shows the prompt for the first number, one that gets
the value entered and shows the second prompt, and a third that shows
the results page. The first two of these functions would send the
information (and the respective computation dies) to the browser,
including a form submission URL that will invoke the next function.

Assuming a generic "query argument" that represents the browser request,
and a return value that represents a page for the browser to render, we
have:

    (define (f1 query)
      ... show the first question ...)

    (define (f2 query)
      ... extract the number from the query ...
      ... show the second question ...)

    (define (f3 query)
      ... extract the number from the query ...
      ... show the sum ...)

Note that `f2` receives the first number directly, but `f3` doesn't.
Yet, it is obviously needed to show the sum. A typical hack to get
around this is to use a "hidden field" in the HTML form that `f2`
generates, where that field holds the second result. To make things more
concrete, we'll use some imaginary web API functions:

    (define (f1 query)
      (web-read "First number" 'n1 "f2"))

    (define (f2 query)
      (let ([n1 (get-field query 'n1)])
        ;; imagine that the following "configures" what web-read
        ;; produces by adding a hidden field to display
        (with-hidden-field 'n1 n1
          (web-read "Second number" 'n2 "f3"))))

    (define (f3 query)
      (web-display
        "Your two numbers sum up to: "
        (+ (get-field query 'n1)
           (get-field query 'n2))))

Which would (supposedly) result in something like the following html
forms when the user enters 1 and 2:

    http://.../f1
    <form action="http://.../f2">
      First number:
      <input type="text" name="n1" />
    </form>

    http://.../f2
    <form action="http://.../f3">
      <input type="hidden" name="n1" value="1" />
      Second number:
      <input type="text" name="n2" />
    </form>

    http://.../f3
    <p>Your two numbers sum up to: 3</p>

This is often a bad solution: it gets very difficult to manage with real
services where the "state" of the server consists of much more than just
a single number --- and it might even include values that are not
expressible as part of the form (for example an open database connection
or a running process). Worse, the state is all saved in the client
browser --- if it dies, then the interaction is gone. (Imagine doing
your taxes, and praying that the browser won't crash a third time.)

Another common approach is to store the state information on the server,
and use a small handle (eg, in a cookie) to identify the state, then
each function can use the cookie to retrieve the current state of the
service --- but this is exactly how we get to the above bugs. It will
fail with any of the mentioned time-manipulation features.


------------------------------------------------------------------------
# Continuations: Web Programming [Tuesday, April 15th]

To try and get a better solution, we'll re-start with the original
expression:

    (web-display (+ (web-read "First number")
                    (web-read "Second number")))

and assuming that `web-read` works as a regular function, we need to
begin with executing the first read:

                    (web-read "First number")

We then need to take that result and plug it into an expression that
will read the second number and sum the results --- that's the same as
the first expression, except that instead of the first `web-read` we use
a "hole":

    (web-display (+ <*>
                    (web-read "Second number")))

where `<*>` marks the point where we need to plug the result of the
first question into. A better way to explain this hole is to make the
expression into a function:

    (lambda (<*>)
      (web-display (+ <*>
                      (web-read "Second number"))))

We can split the second and third interactions in the same way. First we
can assemble the above two bits of code into an expression that has the
same meaning as the original one:

    ((lambda (<*>)
       (web-display (+ <*> (web-read "Second number"))))
     (web-read "First number"))

And now we can continue doing this and split the body of the consumer:

       (web-display (+ <*> (web-read "Second number")))

into a "reader" and the rest of the computation (using a new hole):

                           (web-read "Second number")   ; reader part

       (web-display (+ <*> <*2>))                       ; rest of comp

Doing all of this gives us:

    ((lambda (<*1>)
       ((lambda (<*2>)
          (web-display (+ <*1> <*2>)))
        (web-read "Second number")))
     (web-read "First number"))

And now we can proceed to the main trick. Conceptually, we'd like to
think about `web-read` as something that is implemented in a simple way:

    (define (web-read prompt)
      (printf "~a: " prompt)
      (read-number))

except that the "real" thing would throw an error and die once the
prompt is printed. The trick is one that we've already seen: we can turn
the code inside-out by making the above "hole functions" be an argument
to the reading function --- a consumer callback for what needs to be
done once the number is read. This callback is called a *continuation*,
and we'll use a `/k` suffix for names of functions that expect a
continuation (`k` is a common name for a continuation argument):

    (define (web-read/k prompt k)
      (printf "~a: " prompt)
      (k (read-number)))

This is not too different from the previous version --- the only
difference is that we make the function take a consumer function as an
input, and hand it what we read instead of just returning it. It makes
things a little easier, since we pass the hole function to `web-read/k`,
and it will invoke it when needed:

    (web-read/k "First number"
      (lambda (<*1>)
        (web-read/k "Second number"
          (lambda (<*2>)
            (web-display (+ <*1> <*2>))))))

You might notice that this looks too complicated; we could get exactly
the same result with:

    (web-display (+ (web-read/k "First number"
                                (lambda (<*>) <*>))
                    (web-read/k "Second number"
                                (lambda (<*>) <*>))))

but then there's not much point to having `web-read/k` at all... So why
have it? Remember that the main problem is that in the context of a web
server we think of `web-read` as something that throws an error and
kills the computation. So if we use such a `web-read/k` with a
continuation, we can make it save this continuation in some global
state, so it can be used later when there is a value.

As a side note, all of this might start looking very familiar to you if
you have any experience working with callback-heavy code. In fact,
consider the fact that the continuation (or `k`) is basically just a
callback, so the above is roughly:

    webRead("First number", function(a) {
      webRead("Second number", function(b) {
        webDisplay(a + b);
      });
    });

We'll talk more about JavaScript later.


------------------------------------------------------------------------
## Simulating web reading [Tuesday, April 15th]

We can now actually try all of this in plain Racket by simulating web
interactions. This is useful to look at the core problem while avoiding
the whole web mess that is uninteresting for the purpose of our
discussion. The main feature that we need to emulate is statelessness
--- and as we've discussed, we can simulate that using `error` to
guarantee that the process is properly killed for each interaction. We
will do this in `web-display` which simulates sending the results to the
client and therefore terminates the server process:

    (define (web-display n)
      (error 'web-display "~s" n))

More importantly, we need to do it in `web-read/k` --- but in this case,
we need more than just an `error` --- we need a way to store the `k` so
the computation can be resumed later. To continue with the web analogy
we do this in two steps: `error` is used to display the information (the
input prompt), and the user action of entering a number and submitting
it will be simulated by calling a function. Since the computation is
killed after we show the prompt, the way to implement this is by making
the user call a toplevel `submit` function --- and before throwing the
interaction error, we'll save the `k` continuation in a global box:

    (define (web-read/k prompt k)
      (set-box! resumer k)
      (error 'web-read
             "enter (submit N) to continue the following\n  ~a:"
             prompt))

`submit` uses the saved continuation:

    (define (submit n)
      ((unbox resumer) n))

For safety, we'll initialize `resumer` with a function that throws an
error (a real one, not intended for interactions), make `web-display`
reset it to the same function, and also make `submit` do so after
grabbing its value --- meaning that `submit` can only be used after a
`web-read/k`. And for convenience, we'll use `raise-user-error` instead
of `error`, which is a Racket function that throws an error without a
stack trace (since our errors are intended). It's also helpful to
disable debugging in DrRacket, so it won't take us back to the code over
and over again.

    ;;; ---<<<WEB-BASE-LIBRARY>>>---------------------------------------
    ;; Fake web interaction library (to be used with manual code CPS-ing
    ;; examples)

    #lang racket

    (define error raise-user-error)

    (define (nothing-to-do ignored)
      (error 'REAL-ERROR "No computation to resume."))

    (define resumer (box nothing-to-do))

    (define (web-display n)
      (set-box! resumer nothing-to-do)
      (error 'web-display "~s" n))

    (define (web-read/k prompt k)
      (set-box! resumer k)
      (error 'web-read
             "enter (submit N) to continue the following\n  ~a:"
             prompt))

    (define (submit n)
      ;; to avoid mistakes, we clear out `resumer' before invoking it
      (let ([k (unbox resumer)])
        (set-box! resumer nothing-to-do)
        (k n)))

We can now try out our code for the addition server, using plain
argument names instead of `<*>`s:

    (web-read/k "First number"
      (lambda (n1)
        (web-read/k "Second number"
          (lambda (n2)
            (web-display (+ n1 n2))))))

and see how everything works. You can also try now the bogus expression
that we mentioned:

    (web-display (+ (web-read/k "First number"  (lambda (n) n))
                    (web-read/k "Second number" (lambda (n) n))))

and see how it breaks: the first `web-read/k` saves the identity
function as the global resumer, losing the rest of the computation.

------------------------------------------------------------------------

Again, this should be familiar: we've taken a simple compound expression
and "linearized" it as a sequence of an input operation and a
continuation receiver for its result. This is essentially the same thing
that we used for dealing with inputs in the lazy language --- and the
similarity is not a coincidence. The problem that we faced there was
very different (representing IO as values that describe it), but it
originates from a similar situation --- some computation goes on (in
whatever way the lazy language decides to evaluate it), and when we have
a need to read something we must return a description of this read that
contains "the rest of the computation" to the eager part of the
interpreter that executes the IO. Once we get the user input, we send it
to this computation remainder, which can return another read request,
and so on.

Based on this intuition, we can guess that this can work for any piece
of code, and that we can even come up with a nicer "flat" syntax for it.
For example, here is a simple macro that flattens a sequence of reads
and a final display:

    (define-syntax web-code
      (syntax-rules (read display)
        [(_ (read n prompt) more ...)
         (web-read/k prompt
           (lambda (n)
             (web-code more ...)))]
        [(_ (display last))
         (web-display last)]))

and using it:

    (web-code (read x "First number")
              (read y "Second number")
              (display (+ x y)))

However, we'll avoid such cuteness to make the transformation more
explicit for the sake of the discussion. Eventually, we'll see how
things can become even better than that (as done in Racket): we can get
to write plain-looking Racket expressions and avoid even the need for an
imperative form for the code. In fact, it's easy to write this addition
server using Racket's web server framework, and the core of the code
looks very simple:

    (define (start initial-request)
      (page "The sum is: "
            (+ (web-read "First number")
               (web-read "Second number"))))

There is not much more than that --- it has two utilities, `page`
creates a well-formed web page, and `web-read` performs the reading. The
main piece of magic there is in `send/suspend` which makes the web
server capture the computation's continuation and store it in a hash
table, to be retrieved when the user visits the given URL. Here's the
full code:

    #lang web-server/insta
    (define (page . body)
      (response/xexpr
       `(html (body ,@(map (lambda (x)
                             (if (number? x) (format "~a" x) x))
                           body)))))
    (define (web-read prompt)
      ((compose string->number (curry extract-binding/single 'n)
                request-bindings send/suspend)
       (lambda (k)
         (page `(form ([action ,k])
                  ,prompt ": " (input ([type "text"] [name "n"])))))))
    (define (start initial-request)
      (page "The sum is: "
            (+ (web-read "First number")
               (web-read "Second number"))))


------------------------------------------------------------------------
# More Web Transformations [Tuesday, April 15th]

> [PLAI §17]


------------------------------------------------------------------------
## Transforming a recursive function [Tuesday, April 15th]

Going back to transforming code, we did the transformation on a simple
expression --- and as you'd guess, it's possible to make it work for
more complex code, even recursive functions. Let's start with some
simple function that sums up a bunch of numbers, given a list of prompts
for these numbers. Since it's a function, it's a reusable piece of code
that can be used in multiple places, and to demonstrate that, we add a
`web-display` with a specific list of prompts.

    (define (sum prompts)
      (if (null? prompts)
        0
        (+ (web-read (first prompts))
           (sum (rest prompts)))))

    (web-display (sum '("First" "Second" "Third")))

We begin by converting the `web-read` to its continuation version:

    (define (sum prompts)
      (if (null? prompts)
        0
        (web-read/k (first prompts)
          (lambda (n)
            (+ n
               (sum (rest prompts)))))))

    (web-display (sum '("First" "Second" "Third")))

But using `web-read/k` immediately terminates the running computation,
which means that when `sum` is called on the last line, the surrounding
`web-display` will be lost, and therefore this will not work. The way to
solve this is to make `sum` itself take a continuation, which we'll get
in a similar way --- by rewriting it as a `sum/k` function, and then we
can make our sample use pull in the web-display into the callback as
we've done before:

    (define (sum/k prompts k)
      (if (null? prompts)
        0
        (web-read/k (first prompts)
          (lambda (n)
            (+ n
               (sum (rest prompts)))))))

    (sum/k '("First" "Second" "Third")
           (lambda (sum) (web-display sum)))

We also need to deal with the recursive `sum` call and change it to a
`sum/k`. Clearly, the continuation is the same continuation that the
original sum was called with, so we need to pass it on in the recursive
call too:

    (define (sum/k prompts k)
      (if (null? prompts)
        0
        (web-read/k (first prompts)
          ;; get the value provided by the user, and add it to the value
          ;; that the recursive call generates
          (lambda (n)
            (+ n
               (sum/k (rest prompts)
                      k))))))

    (sum/k '("First" "Second" "Third")
           (lambda (sum) (web-display sum)))

But there is another problem now: the addition is done outside of the
continuation, therefore it will be lost as soon as there's a second
`web-read/k` call. In other words, computation bits that are outside of
any continuations are going to disappear, and therefore they must be
encoded as an explicit part of the continuation. The solution is
therefore to move the addition *into* the continuation:

    (define (sum/k prompts k)
      (if (null? prompts)
        0
        (web-read/k (first prompts)
          (lambda (n)
            (sum/k (rest prompts)
                   (lambda (sum-of-rest)
                     (k (+ n sum-of-rest))))))))

    (sum/k '("First" "Second" "Third")
           (lambda (sum) (web-display sum)))

Note that with this code every new continuation is bigger --- it
contains the previous continuation (note that "contains" here is done by
making it part of the closure), and it also contains one new addition.

But if the continuation is only getting bigger, then how do we ever get
a result out of this? Put differently, when we reach the end of the
prompt list, what do we do? --- Clearly, we just return 0, but that
silently drops the continuation that we worked so hard to accumulate.
This means that just returning 0 is wrong --- instead, we should send
the 0 to the pending continuation:

    (define (sum/k prompts k)
      (if (null? prompts)
        (k 0)
        (web-read/k (first prompts)
          (lambda (n)
            (sum/k (rest prompts)
                   (lambda (sum-of-rest)
                     (k (+ n sum-of-rest))))))))

    (sum/k '("First" "Second" "Third")
           (lambda (sum) (web-display sum)))

This makes sense now, and the code works as expected. This `sum/k` is a
utility to be used in a web server application, and such applications
need to be transformed in a similar way to what we're doing. Therefore,
our own `sum/k` is a function that expects to be invoked from such
transformed code --- so it needs to have an argument for the waiting
receiver, and it needs to pass that receiver around (accumulating more
functionality into it) until it's done.

As a side note, `web-display` is the only thing that is used in the
toplevel continuation, so we could have used it directly without a
`lambda` wrapper:

    (sum/k '("First" "Second" "Third")
           web-display)


------------------------------------------------------------------------
## Using `sum/k` [Tuesday, April 15th]

To get some more experience with this transformation, we'll try to
convert some code that uses the above `sum/k`. For example, lets add a
multiplier argument that will get multiplied by the sum of the given
numbers. Begin with the simple code. This is an actual application, so
we're writing just an expression to do the computation and show the
result, not a function.

    (web-display (* (web-read "Multiplier")
                    (sum '("First" "Second" "Third"))))

We now need to turn the two function calls into their `*/k` form. Since
we covered `sum/k` just now, begin with that. The first step is to
inspect its continuation: this is the same code after we replace the
`sum` call with a hole:

    (web-display (* (web-read "Multiplier")
                    <*>))

Now take this expression, make it into a function by abstracting over
the hole and call it `n`, and pass that to `sum/k`:

    (sum/k '("First" "Second" "Third")
           (lambda (n)
             (web-display (* (web-read "Multiplier")
                             n))))

(Note that this is getting rather mechanical now.) Now for the
`web-read` part, we need to identify its continuation --- that's the
expression that surrounds it inside the first continuation function, and
we'll use `m` for the new hole:

             (* m
                n)

As above, abstract over `m` to get a continuation, and pass it into
`web-read/k`:

    (sum/k '("First" "Second" "Third")
           (lambda (n)
             (web-read/k "Multiplier"
                         (lambda (m)
                           (web-display (* m n))))))

and we're done. An interesting question here is what would happen if
instead of the above, we start with the `web-read` and *then* get to the
`sum`? We'd end up with a different version:

    (web-read/k "Multiplier"
                (lambda (m)
                  (sum/k '("First" "Second" "Third")
                         (lambda (n)
                           (web-display (* m n))))))

Note how these options differ --- one reads the multiplier first, and
the other reads it last.

> Side-note: if in the last step of turning `web-read` to `web-read/k`
> we consider the *whole* expression when we formulate the continuation,
> then we get to the same code. But this isn't really right, since it is
> converting code that is already-converted.

In other words, our conversion results in code that fixes a specific
evaluation order for the original expression. The way that the inputs
happen in the original expression

    (web-display (* (web-read "Multiplier")
                    (sum '("First" "Second" "Third"))))

is unspecified in the code --- it only happens to be left-to-right
implicitly, because Racket evaluates function arguments in that order.
However, the converted code does *not* depend on how Racket evaluates
function arguments. (Can you see a similar conclusion here about
strictness?)

Note also another property of the converted code: every intermediate
result has a name now. This makes sense, since another way to fix the
evaluation order is to do just that. For example, convert the above to
either

    (let* ([m (web-read "Multiplier")]
           [n (sum '("First" "Second" "Third"))])
      (* m n))

or

    (let* ([n (sum '("First" "Second" "Third"))]
           [m (web-read "Multiplier")])
      (* m n))

This is also a good way to see why this kind of conversion can be a
useful tool in compiling code: the resulting code is in a kind of a
low-level form that makes it easy to translate to assembly form, where
function calls are eliminated, and instead there are only jumps (since
all calls are tail-calls). In other words, the above can be seen as a
piece of code that is close to something like:

    val n = sum(["First","Second","Third"])
    val m = web_read("Multiplier")
    web_display(m*n)

and it's almost visible in the original converted code if we format it
in a very weird way:

    ;; sum(["First","Second","Third"]) -> n
    (sum/k '("First" "Second" "Third") (lambda (n)
    ;; web_read("Multiplier") -> m
    (web-read/k "Multiplier" (lambda (m)
    ;; web_display(m*n)
    (web-display (* m n))))))


------------------------------------------------------------------------
## Converting stateful code [Tuesday, April 15th]

Another case to consider is applying this transformation to code that
uses mutation with some state. For example, here's some simple account
code that keeps track of a `balance` state:

    (define account
      (let ([balance (box 0)])
        (lambda ()
          (set-box! balance
                    (+ (unbox balance)
                       (web-read (format "Balance: ~s; Change"
                                         (unbox balance)))))
          (account))))

(Note that there is no `web-display` here, since it's an infinite loop.)
As usual, the fact that this function is expected to be used by a web
application means that it should receive a continuation:

    (define account/k
      (let ([balance (box 0)])
        (lambda (k)
          (set-box! balance
                    (+ (unbox balance)
                       (web-read (format "Balance: ~s; Change"
                                         (unbox balance)))))
          (account))))

Again, we need to convert the `web-read` into `web-read/k` by
abstracting out its continuation. We'll take the `set-box!` expression
and create a continuation out of it:

          (set-box! balance
                    (+ (unbox balance)
                       <*>))

and using `change` as the name for the continuation argument, we get:

    (define account/k
      (let ([balance (box 0)])
        (lambda (k)
          (web-read/k (format "Balance: ~s; Change"
                              (unbox balance))
                      (lambda (change)
                        (set-box! balance (+ (unbox balance) change))))
          (account))))

And finally, we translate the loop call to pass along the same
continuation it received (it seems suspicious, but there's nothing else
that could be used there):

    (define account/k
      (let ([balance (box 0)])
        (lambda (k)
          (web-read/k (format "Balance: ~s; Change" (unbox balance))
                      (lambda (change)
                        (set-box! balance (+ (unbox balance) change))))
          (account/k k))))

But if we try to run this --- (account/k web-display) --- we don't get
any result at all: it reads one number and then just stops without the
usual request to continue, and without showing any result. The lack of
printed result is a hint for the problem --- it must be the void return
value of the `set-box!`. Again, we need to remember that invoking a
`web-read/k` kills any pending computation and the following (resume)
will restart its continuation --- but the recursive call is not part of
the loop.

The problem is the continuation that we formulated:

          (set-box! balance
                    (+ (unbox balance)
                       change))

which should actually contain the recursive call too:

          (set-box! balance
                    (+ (unbox balance)
                       change))
          (account/k k)

In other words, the recursive call was left outside of the continuation,
and therefore it was lost when the fake server terminated the
computation on a `web-read/k` --- so it must move into the continuation
as well:

    (define account/k
      (let ([balance (box 0)])
        (lambda (k)
          (web-read/k (format "Balance: ~s; Change" (unbox balance))
                      (lambda (change)
                        (set-box! balance (+ (unbox balance) change))
                        (account/k k))))))

and the code now works. The only suspicious thing that we're still left
with is the loop that passes `k` unchanged --- but this actually is the
right thing to do here. The original loop had a tail-recursive call that
didn't pass along any new argument values, since the infinite loop is
doing its job via mutations to the box and nothing else was done in the
looping call. The continuation of the original call is therefore also
the continuation of the second call, etc. All of these continuations are
closing over a single box and this binding does not change (it *cannot*
change if we don't use a `set!`); instead, the boxed value is what
changes through the loop.


------------------------------------------------------------------------
## Converting higher order functions [Tuesday, April 15th]

Next we try an even more challenging transformation: a higher order
function. To get a chance to see more interesting examples, we'll have
some more code in this case.

For example, say that we want to compute the sum of squares of a list.
First, the simple code (as above, there's no need to wrap a
`web-display` around the whole thing, just make it return the result):

    (define (sum l) (foldl + 0 l))
    (define (square n) (* n n))
    (define (read-number prompt)
      (web-read (format "~a number" prompt)))
    (web-display (sum (map (lambda (prompt)
                             (square (read-number prompt)))
                           '("First" "Second" "Third"))))

Again, we can begin with `web-read` --- we want to convert it to the
continuation version, which means that we need to convert `read-number`
to get one too. This transformation is refreshingly trivial:

    (define (read-number/k prompt k)
      (web-read/k (format "~a number" prompt) k))

This is an interesting point --- it's a simple definition that just
passes `k` on, as is. The reason for this is similar to the simple
continuation passing of the imperative loop: the pre-translation
`read-number` is doing a simple tail call to `web-read`, so the
evaluation context of the two is identical. The only difference is the
prompt argument, and that's the same `format` call.

Of course things would be different if `format` itself required a web
interaction, since then we'd need some `format/k`, but without that
things are really simple. The same goes for the two utility functions
--- `sum` and `square`: they're not performing any web interaction so it
seems likely that they'll stay the same.

We now get to the main expression, which should obviously change since
it needs to call `read-number/k`, so it needs to send it some
continuation. By now, it should be clear that passing an identity
function as a continuation is going to break the surrounding context
once the running computation is killed for the web interaction. We need
to somehow generate a top-level identity continuation and propagate it
inside, and the `sum` call should be in that continuation together with
the `web-display` call. Actually, if we do the usual thing and write the
expression with a `<*>` hole, we get:

    (web-display (sum (map (lambda (prompt) (square <*>))
                           '("First" "Second" "Third"))))

and continuing with the mechanical transformation that we know, we need
to abstract over this expression+hole into a function, then pass it as
an argument to `read-number/k`:

    ;; very broken
    (read-number/k
     (lambda (<*>)
       (web-display (sum (map (lambda (prompt) (square <*>))
                              '("First" "Second" "Third"))))))

But that can't work in this case --- we need to send `read-number/k` a
prompt, but we can't get a specific one since there is a *list* of them.
In fact, this is related to a more serious problem --- pulling out
`read-number/k` like this is obviously broken since it means that it
gets called only once, instead, we need to call it once for each prompt
value.

The solution in this case is to convert `map` too:

    (web-display (sum (map/k (lambda (prompt)
                               (square (read-number prompt)))
                             '("First" "Second" "Third")
                             ...some-continuation...)))

and of course we should move `web-display` and `sum` into that
continuation:

    (map/k (lambda (prompt) (square (read-number prompt)))
           '("First" "Second" "Third")
           (lambda (l) (web-display (sum l))))

We can now use `read-number/k`, but the question is what should it get
for it's continuation?

    (map/k (lambda (prompt) (square (read-number/k prompt ???)))
           '("First" "Second" "Third")
           (lambda (l) (web-display (sum l))))

Clearly, `map/k` will need to somehow communicate *some* continuation to
the mapped function, which in turn will send it to `read-number/k`. This
means that the mapped function should get converted too, and gain a `k`
argument. To do this, we'll first make things convenient and have a name
for it (this is only for convenience, we could just as well convert the
`lambda` directly):

    (define (read-squared prompt)
      (square (read-number/k prompt ???)))
    (map/k read-squared
           '("First" "Second" "Third")
           (lambda (l) (web-display (sum l))))

Then convert it in the now-obvious way:

    (define (read-squared/k prompt k)
      (read-number/k prompt
                     (lambda (n)
                       (k (square n)))))
    (map/k read-squared/k
           '("First" "Second" "Third")
           (lambda (l) (web-display (sum l))))

Everything is in place now --- except for `map/k`, of course. We'll
start with the definition of plain `map`:

    (define (map f l)
      (if (null? l)
        null
        (cons (f (first l)) (map f (rest l)))))

The first thing in turning it into a `map/k` is adding a `k` input,

    (define (map f l k)
      (if (null? l)
        null
        (cons (f (first l)) (map f (rest l)))))

and now we need to face the fact that the `f` input is itself one with a
continuation --- an `f/k`:

    (define (map/k f/k l k)
      (if (null? l)
        null
        (cons (f (first l)) (map f (rest l)))))

Consider now the single `f` call --- that should turn into a call to
`f/k` with some continuation:

    (define (map/k f/k l k)
      (if (null? l)
        null
        (cons (f/k (first l) ???) (map f (rest l)))))

but since `f/k` will involve a web interaction, it will lead to killing
the `cons` around it. The solution is to move that `cons` into the
continuation that is handed to `f/k` --- and as usual, this involves the
second `cons` argument --- the continuation is derived from replacing
the `f/k` call by a hole:

        (cons <*> (map f (rest l)))

and abstracting that hole, we get:

    (define (map/k f/k l k)
      (if (null? l)
        null
        (f/k (first l)
             (lambda (result)
               (cons result (map f (rest l)))))))

We now do exactly the same for the recursive `map` call --- it should
use `map/k` with `f/k` and some continuation:

    (define (map/k f/k l k)
      (if (null? l)
        null
        (f/k (first l)
             (lambda (result)
               (cons result (map/k f/k (rest l) ???))))))

and we need to move the surrounding `cons` yet again into this
continuation. The holed expression is:

               (cons result <*>)

and abstracting that and moving it into the `map/k` continuation we get:

    (define (map/k f/k l k)
      (if (null? l)
        null
        (f/k (first l)
             (lambda (result)
               (map/k f/k (rest l)
                      (lambda (new-rest)
                        (cons result new-rest)))))))

There are just one more problem with this --- the `k` argument is never
used. This implies two changes, since it needs to be used once in each
of the conditional branches. Can you see where it should be added? (Try
to do this before reading the code below.)

The complete code follows:

    (define (map/k f/k l k)
      (if (null? l)
        (k null)
        (f/k (first l)
             (lambda (result)
               (map/k f/k (rest l)
                      (lambda (new-rest)
                        (k (cons result new-rest))))))))
    (define (sum l) (foldl + 0 l))
    (define (square n) (* n n))
    (define (read-number/k prompt k)
      (web-read/k (format "~a number" prompt) k))
    (define (read-squared/k prompt k)
      (read-number/k prompt (lambda (n) (k (square n)))))
    (map/k read-squared/k
           '("First" "Second" "Third")
           (lambda (l) (web-display (sum l))))


------------------------------------------------------------------------
# Highlevel Overview on Continuations [Tuesday, April 15th]

Very roughly speaking, the transformation we made turns a function call
like

    (...stuff... (f ...args...) ...more-stuff...)

into

    (f/k ...args...
         (lambda (<*>)
           (...stuff... <*> ...more-stuff...)))

This is the essence of the solution to the statelessness problem: to
remember where we left off, we conveniently flip the expression
inside-out so that its context is all stored in its continuation. One
thing to note is that we did this only for functions that had some kind
of web interaction, either directly or indirectly (since in the indirect
case they still need to carry around the continuation).

If we wanted to make this process a completely mechanical one, then we
wouldn't have been able to make this distinction. After all, a function
like `map` is perfectly fine as it is, unless it happens to be used with
a continuation-carrying function --- and that's something that we know
only at runtime. We would therefore need to transform *all* function
calls as above, which in turn means that all functions would need to get
an extra continuation argument.

Here are a few things to note about such fully-transformed code:

* All function calls in such code are tail calls. There is no single
  call with some context around it that is left for the time when the
  call is done. This is the exact property that makes it useful for a
  stateless interaction: such contexts are bad since a web interaction
  will mean that the context is discarded and lost. (In our pretend
  system, this is done by throwing an error.) Having no non-tail context
  means that capturing the continuation argument is sufficient, and no
  context gets lost.

* An implication of this, when you consider how the language is
  implemented, is that there is no need to have anything "on the stack"
  to execute fully transformed code. (If you'd use the stepper on such
  code, there would be no accumulation of context.) So is this some
  radical new way of computing without a stack? Not really: if you think
  about it, continuation arguments hold the exact same information that
  is traditionally put on the stack. (There is therefore an interesting
  relationship between continuations and runtime stacks, and in fact,
  one way of making it possible to grab continuations without doing such
  a transformation is to capture the current stack.)

* The evaluation order is fixed. Obviously, if Racket guarantees a
  left-to-right evaluation, then the order is always fixed --- but in
  the fully transformed code there are no function calls where this
  makes any difference. If Racket were to change, the transformed code
  would still retain the order it uses. More specifically, when we do
  the transformation, we control the order of evaluation by choosing how
  to proceed at every point. For example, if we have:

      (...stuff... (f1 ...args...) (f2 ...args...) ...more-stuff...)

  then it's up to use to choose whether to pull `f1` first, or maybe
  we'd want to start with `f2`.

  But there's more: the resulting code is independent of the evaluation
  strategy of the language. Even if the language is lazy, the
  transformed code is still executing things in the same order.
  (Alternatively, we could convert things so that the resulting
  computation corresponds to a lazy evaluation strategy even in a strict
  language.)

* In other words, the converted code is completely sequential. The
  conversion process requires choosing left-to-right or delaying some
  evaluations (or all), but the resulting code is free from any of these
  and has exactly one specific (sequential) order. You can therefore see
  how this kind of transformation is something that a compiler would
  want to do, since the resulting sequential code is easier for
  execution on a sequential base (like machine code, or C code). Another
  way to see this is that we have explicit names for each and every
  intermediate result --- so the converted code would have a direct
  mapping between identifiers and machine registers (unlike "plain" code
  where some of these are implicit and compilation needs to make up
  names).

* The transformation is a *global* one. Not only do we have to transform
  the first top-level expression that makes up the web application, we
  also need to convert every function that is mentioned in the code, and
  in functions that those functions mentioned, etc. Even worse, the
  converted code is very different from the original version, since
  everything is shuffled around --- in a way that matches the sequential
  execution, but it's very hard to even see the original intention
  through all of these explicit continuations and the new intermediate
  result names.

  The upshot of this is that it's not really something that we need to
  do manually, but instead we'd like it to be done automatically for us,
  by the compiler of the language.

What we did here is the tedious way of getting continuations: we
basically implemented them by massaging our code, turning it inside-out
into code with the right shape. The problem with this is that the
resulting code is no longer similar to what we had originally written,
which makes it more difficult to debug and to maintain. We therefore
would like to have this done in some automatic way, ideally in a way
that means that we can leave our plain original code as is.


------------------------------------------------------------------------
# An Automatic Continuation Converter [Tuesday, April 15th]

> [PLAI §18]

The converted code that we produced manually above is said to be written
in "Continuation Passing Style", or CPS. What we're looking for is for a
way to generate such code automatically --- a way to "CPS" a given
source code. When you think about it, this process is essentially a
source to source function which should be bolted onto the compiler or
evaluator. In fact, if we want to do this in Racket, then this
description makes it sound a lot like a macro --- and indeed it could be
implemented as such.

> Note that "CPS" has two related but distinct meanings here: you could
> have code that is written "in CPS style", which means that it handles
> explicit continuations. Uses of this term usually refer to using
> continuation functions in some places in the code, not for fully
> transformed code. The other meaning is used for fully-CPS-ed code,
> which is almost never written directly. In addition, "CPS" is often
> used as a verb --- either the manual process of refactoring code into
> passing some continuations explicitly (in the first case), or the
> automatic process of fully converting code (in the second one).

Before we get to the actual implementation, consider how we did the
translation --- there was a difference in how we handled plain top-level
expressions and library functions. In addition, we had some more
discounts in the manual process --- one such discount was that we didn't
treat all value expressions as possible computations that require
conversion. For example, in a function application, we took the function
sub-expression as a simple value and left it as is, but for an automatic
translation we need to convert that expression too since it might itself
be a more complicated expression.

Instead of these special cases and shortcuts, we'll do something more
uniform: we will translate *every* expression into a function. This
function will accept a receiver (= a continuation) and will pass it the
value of the expression. This will be done for *all* expressions, even
simple ones like plain numbers, for example, we will translate the `5`
expression into (lambda (k) (k 5)), and the same goes for other
constants and plain identifiers. Since we're specifying a transformation
here, we will treat it as a kind of a meta function and use a `CPS[x]`
to make it easy to talk about:

    CPS[5]
    -->
    (lambda (k) (k 5)) ; same for other numbers and constants

    CPS[x]
    -->
    (lambda (k) (k x)) ; same as above for identifiers too

When we convert a primitive function application, we still do the usual
thing, which is now easier to see as a general rule --- using `CPS[?]`
as the meta function that does the transformation:

    CPS[(+ E1 E2)]
    -->
    (lambda (k)        ; everything turns to cont.-consuming functions
      (CPS[E1]         ; the CPS of E1 -- it expects a cont. argument
       (lambda (v1)    ; send this cont. to CPS[E1], so v1 is its value
         (CPS[E2]      ; same for E2 -- expects a cont.
          (lambda (v2) ; and again, v2 becomes the value of E2
            (k (+ v1 v2))))))) ; finally return the sum to our own cont.

In the above, you can see that (CPS[E] (lambda (v) ...)) can be read as
"evaluate `E` and bind the result to `v`". (But note that the CPS
conversion is not doing any evaluation, it just reorders code to
determine how it gets evaluated when it later runs --- so "compute"
might be a better term to use here.) With this in mind, we can deal with
other function applications: evaluate the function form, evaluate the
argument form, then apply the first value on the second value, and
finally wrap everything with a (lambda (k) ...) and return the result to
this continuation:

    CPS[(E1 E2)]
    -->
    (lambda (k)
      (CPS[E1]         ; bind the result of evaluating E1
       (lambda (v1)    ; to v1
         (CPS[E2]      ; and the result of evaluating E2
          (lambda (v2) ; to v2
            (k (v1 v2))))))) ; apply and return the result

But this is the rule that we should use for primitive non-continuation
functions only --- it's similar to what we did with `+` (except that we
skipped evaluating `+` since it's known). Instead, we're dealing here
with functions that are defined in the "web language" (in the code that
is being converted), and as we've seen, these functions get a `k`
argument which they use to return the result to. That was the whole
point: pass `k` on to functions, and have them return the value directly
to the `k` context. So the last part of the above should be fixed:

    CPS[(E1 E2)]
    -->
    (lambda (k)
      (CPS[E1]         ; bind the result of evaluating E1
       (lambda (v1)    ; to v1
         (CPS[E2]      ; and the result of evaluating E2
          (lambda (v2) ; to v2
            (v1 v2 k)))))) ; apply and have it return the result to k

There's a flip side to this transformation --- whenever a function is
created with a `lambda` form, we need to add a `k` argument to it, and
make it return its value to it. Then, we need to "lift" the whole
function as usual, using the same transformation we used for other
values in the above. We'll use `k` for the latter continuation argument,
and `cont` for the former:

    CPS[(lambda (arg) E)]
    -->
    (lambda (k) ; this is the usual
      (k        ; lifting of values
       (lambda (arg cont) ; the translated function has a cont. input
         (CPS[E] cont)))) ; the translated body returns its result to it

It is interesting to note the two continuations in the translated
result: the first one (using `k`) is the continuation for the function
value, and the second one (using `cont`) is the continuation used when
the function is applied. Comparing this to our evaluators --- we can say
that the first roughly corresponds to evaluating a function form to get
a closure, and the second corresponds to evaluating the body of a
function when it's applied, which means that `cont` is the dynamic
continuation that matches the dynamic context in which the function is
executed. Inspecting the CPS-ed form of the identity function is
unsurprising: it simply passes its first argument (the "real" one) into
the continuation since that's how we return values in this system:

    CPS[(lambda (x) x)]
    -->
    (lambda (k)
      (k
       (lambda (x cont)
         (CPS[x] cont))))
    -->
    (lambda (k)
      (k
       (lambda (x cont)
         ((lambda (k) (k x)) cont))))
    --> ; reduce the redundant function application
    (lambda (k)
      (k
       (lambda (x cont)
         (cont x))))

Note the reduction of a trivial application --- doing this systematic
conversion leads to many of them.

We now get to the transformation of the form that is the main reason we
started with all of this --- `web-read`. This transformation is simple,
it just passes along the continuation to `web-read/k`:

    CPS[(web-read E)]
    -->
    (lambda (k)
      (CPS[E]           ; evaluate the prompt expression
       (lambda (prompt) ; and bind it to prompt
         (web-read/k prompt k)))) ; use the prompt and the current cont.

We also need to deal with `web-display` --- we changed the function
calling protocol by adding a continuation argument, but `web-display` is
defined outside of the CPS-ed language so it doesn't have that argument.
Another way of fixing it could be to move its definition into the
language, but then we'll still need to have a special treatment for the
`error` that it uses.

    CPS[(web-display E)]
    -->
    (lambda (k)
      (CPS[E]          ; evaluate the expression
       (lambda (val)   ; and bind it to val
         (web-display val))))

As you can see, all of these transformations are simple rewrites. We can
use a simple `syntax-rules` macro to implement this transformation,
essentially creating a DSL by translating code into plain Racket. Note
that in the specification above we've implicitly used some parts of the
input as keywords --- `lambda`, `+`, `web-read`, and `define` --- this
is reflected in the macro code. The order of the rules is important, for
example, we need to match first on (web-read E) and then on the more
generic (E1 E2), and we ensure that the last default lifting of values
has a simple expression by matching on (x ...) before that.

    (define-syntax CPS
      (syntax-rules (+ lambda web-read web-display) ;*** keywords
        [(CPS (+ E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (k (+ v1 v2)))))))]
        [(CPS (web-read E))
         (lambda (k)
           ((CPS E)
            (lambda (val)
              (web-read/k val k))))]
        [(CPS (web-display E))
         (lambda (k)                ; could be:
           ((CPS E)                 ;     (lambda (k)
            (lambda (val)           ;       ((CPS E) web-display))
              (web-display val))))] ; but keep it looking uniform
        [(CPS (lambda (arg) E))
         (lambda (k)
           (k (lambda (arg cont)
                ((CPS E)
                 cont))))]
        [(CPS (E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (v1 v2 k))))))]
        ;; the following pattern ensures that the last rule is used only
        ;; with simple values and identifiers
        [(CPS (x ...))
         ---syntax-error---]
        [(CPS V)
         (lambda (k) (k V))]))

The transformation that this code implements is one of the oldest CPS
transformations --- it is called the Fischer Call by Value CPS
transformation, and is due Michael Fischer. There has been much more
research into such transformations --- the Fischer translation, while
easy to understand due to its uniformity, introduces significant
overhead in the form of many new functions in its result. Some of these
are easy to optimize --- for example, things like ((lambda (k) (k v)) E)
could be optimized to just (E v) assuming a left-to-right evaluation
order or proving that E has no side-effects (and Racket performs this
optimization and several others), but some of the overhead is not easily
optimized. There have been several other CPS transformations, in an
attempt to avoid such overhead.

Finally, trying to run code using this macro can be a little awkward. We
need to explicitly wrap all values in definitions by a `CPS`, and we
need to invoke top-level expressions with a particular continuation ---
`web-display` in our context. We can do all of that with a convenience
macro that will transform a number of definitions followed by an
optional expression.

> Note the use of `begin` --- usually, it is intended for sequential
> execution, but it is also used in macro result expressions when we
> need a macro to produce multiple expressions (since the result of a
> macro must be a single S-expression) --- this is why it's used here,
> not for sequencing side-effects.

    (define-syntax CPS-code
      (syntax-rules (define)
        [(CPS-code (define (id arg) E) more ...)
         ;; simple translation to `lambda'
         (CPS-code (define id (lambda (arg) E)) more ...)]
        [(CPS-code (define id E) more ...)
         (begin (define id ((CPS E) (lambda (x) x)))
                (CPS-code more ...))]
        [(CPS-code last-expr)
         ((CPS last-expr) web-display)]
        [(CPS-code) ; happens when there is no plain expr at
         (begin)])) ; the end so do nothing in this case

The interesting thing that this macro does is set up a proper
continuation for definitions and top-level expressions. In the latter
case, it passes `web-display` as the continuation, and in the former
case, it passes the identity function as the continuation --- which is
used to "lower" the lifted value from its continuation form into a plain
value. Using the identity function as a continuation is not really
correct: it means that if evaluating the expression to be bound performs
some web interaction, then the definition will be aborted, leaving the
identifier unbound. The way to solve this is by arranging for the
definition operation to be done in the continuation, for example, we can
get closer to this using an explicit mutation step:

        [(CPS-code (define id E) more ...)
         (begin (define id #f)
                ((CPS E) (lambda (x) (set! id x)))
                (CPS-code more ...))]

But there are two main problems with this: first, the rest of the code
--- `(CPS-code more ...)` --- should also be done in the continuation,
which will defeat the global definitions. We could try to use the
continuation to get the scope:

        [(CPS-code (define id E) more ...)
         ((CPS E) (lambda (id) (CPS-code more ...)))]

but that breaks recursive definitions. In any case, the second problem
is that this is not accurate even if we solved the above: we really need
to have parts of the Racket definition mechanism exposed to make it
work. So we'll settle with the simple version as an approximation. It
works fine if we use definitions only for functions, and invoke them in
toplevel expressions.

For reference, the complete code at this point follows.

    ;;; ---<<<WEB-SIMPLE-LANGUAGE>>>------------------------------------
    ;; Simulation of web interactions with a CPS converter (not an
    ;; interpreter)

    #lang racket

    (define error raise-user-error)

    (define (nothing-to-do ignored)
      (error 'nothing-to-do "No computation to resume."))

    (define resumer (box nothing-to-do))

    (define (web-display n)
      (set-box! resumer nothing-to-do)
      (error 'web-display "~s" n))

    (define (web-read/k prompt k)
      (set-box! resumer k)
      (error 'web-read
             "enter (submit N) to continue the following\n  ~a:"
             prompt))

    (define (submit n)
      ;; to avoid mistakes, we clear out `resumer' before invoking it
      (let ([k (unbox resumer)])
        (set-box! resumer nothing-to-do)
        (k n)))

    (define-syntax CPS
      (syntax-rules (+ lambda web-read web-display) ;*** keywords
        [(CPS (+ E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (k (+ v1 v2)))))))]
        [(CPS (web-read E))
         (lambda (k)
           ((CPS E)
            (lambda (val)
              (web-read/k val k))))]
        [(CPS (web-display E))
         (lambda (k)
           ((CPS E)
            (lambda (val)
              (web-display val))))]
        [(CPS (lambda (arg) E))
         (lambda (k)
           (k (lambda (arg cont)
                ((CPS E)
                 cont))))]
        [(CPS (E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (v1 v2 k))))))]
        ;; the following pattern ensures that the last rule is used only
        ;; with simple values and identifiers
        [(CPS (x ...))
         ---syntax-error---]
        [(CPS V) ; <-- only numbers, other literals, and identifiers
         (lambda (k)
           (k V))]))

    (define-syntax CPS-code
      (syntax-rules (define)
        [(CPS-code (define (id arg) E) more ...)
         ;; simple translation to `lambda'
         (CPS-code (define id (lambda (arg) E)) more ...)]
        [(CPS-code (define id E) more ...)
         (begin (define id ((CPS E) (lambda (x) x)))
                (CPS-code more ...))]
        [(CPS-code last-expr)
         ((CPS last-expr) web-display)]
        [(CPS-code) ; happens when there is no plain expr at
         (begin)])) ; the end so do nothing in this case

Here is a quick example of using this:

    (CPS-code
      (web-display (+ (web-read "First number")
                      (web-read "Second number"))))

Note that this code uses `web-display`, which is not really needed since
`CPS-code` would use it as the top-level continuation. (Can you see why
it works the same either way?) So this is even closer to a plain
program:

    (CPS-code (+ (web-read "First number")
                 (web-read "Second number")))

A slightly more complicated example:

    (CPS-code
      (define (add n)
        (lambda (m)
          (+ m n)))
      (define (read-and-add n)
        ((add n) (web-read "Another number")))
      (read-and-add (web-read "A number")))

Using this for the other examples is not possible with the current state
of the translation macro. These example will require extending the CPS
transformation with functions of any arity, multiple expressions in a
body, and it recognize additional primitive functions. None of these is
difficult, it will just make it more verbose.


------------------------------------------------------------------------
# Continuations as a Language Feature [Tuesday, April 15th]

> This is conceptually between [PLAI §18] and [PLAI §19]

In the list of CPS transformation rules there were two rules that
deserve additional attention in how they deal with their continuation.

First, note the rule for `web-display`:

    [(CPS (web-display E))
     (lambda (k)
       ((CPS E)
        (lambda (val)
          (web-display val))))]

--- it simply ignores its continuation. This means that whenever
`web-display` is used, the rest of the computation is simply discarded,
which seems wrong --- it's the kind of problem that we've encountered
several times when we discussed the transforming web application code.
Of course, this doesn't matter much for our implementation of
`web-display` since it aborts the computation anyway using `error` ---
but what if we did that intentionally? We would get a kind of an "abort
now" construct: we can implement this as a new `abort` form that does
just that:

    (define-syntax CPS
      (syntax-rules (...same... abort) ;*** new keyword
        ...
        [(CPS (abort E))
         (lambda (k)
           ((CPS E) (lambda (x) x)))] ; ignore `k'
        ...))

You could try that --- (CPS-code (+ 1 2)) produces 3 as "web output",
but (CPS-code (+ 1 (abort 2))) simply returns 2. In fact, it doesn't
matter how complicated the code is --- as soon as it encounters an
`abort` the whole computation is discarded and we immediately get its
result, for example, try this:

    (CPS-code
      (define (add n)
        (lambda (m)
          (+ m n)))
      (define (read-and-add n)
        ((abort 999) ((add n) (web-read "Another number"))))
      (read-and-add (web-read "A number")))

it reads the first number and then it immediately returns 999. This
seems like a potentially useful feature, except that it's a little too
"untamed" --- it aborts the program completely, getting all the way back
to the top-level with a result. (It's actually quite similar to throwing
an exception, only without a way to catch it.) It would be more useful
to somehow control the part of the computation that gets aborted
instead.

That leads to the second exceptional form in our translator: `web-read`.
If you look closely at all of our transformation rules, you'll see that
the continuation argument is never made accessible to user code --- the
`k` argument is always generated by the macro (and inaccessible to user
code due to the hygienic macro system). The continuation is only passed
as the extra argument to user functions, but in the rule that adds this
argument:

        [(CPS (lambda (arg) E))
         (lambda (k)
           (k (lambda (arg cont)
                ((CPS E)
                 cont))))]

the new `cont` argument is introduced by the macro so it is inaccessible
as well. The only place where the `k` argument is actually used is in
the `web-read` rule, where it is sent to the resulting `web-read/k`
call. (This makes sense, since web reading is how we mimic web
interaction, and therefore it is the only reason for CPS-ing our code.)
However, in our fake web framework this function is a given built-in, so
the continuation is still not accessible for user code.

What if we pass the continuation argument to a user function in a way
that *intentionally* exposes it? We can achieve this by writing a
function that is similar to `web-read/k`, except that it will somehow
pass the continuation to user code. A simple way to do that is to have
the new function take a function value as its primary input, and call
this function with the continuation (which is still received as the
implicit second argument):

    (define (call-k f k)
      (f k))

This is close, but it fails because it doesn't follow our translated
function calling protocol, where every function receives two inputs ---
the original argument and the continuation. Because of this, we need to
call `f` with a second continuation value, which is `k` as well:

    (define (call-k f k)
      (f k k))

But we also fail to follow the calling protocol by passing `k` as is: it
is a continuation value, which in our CPS system is a one-argument
function. (In fact, this is another indication that continuations are
not accessible to user code --- they don't follow the same function
calling protocol.) It is best to think about continuations as *meta*
values that are not part of the user language just yet. To make it
usable, we need to wrap it so we get the usual two-argument function
which user code can call:

    (define (call-k f k)
      (f (lambda (val cont) (k val)) k))

This explicit wrapping is related to the fact that continuations are a
kind of meta-level value --- and the wrapping is needed to "lower" it to
the user's world. (This is sometimes called "reification": a meta value
is *reified* as a user value.)

Using this new definition, we can write code that can access its own
continuation as a plain value. Here is a simple example that grabs the
top-level continuation and labels it `abort`, then uses it in the same
way we've used the above `abort`:

    > (CPS-code (call-k (lambda (abort) (+ 1 (abort 2)))))
    web-display: 2

But we can grab any continuation we want, not just the top-level one:

    (CPS-code (+ 100 (call-k (lambda (abort) (+ 1 (abort 2))))))
    web-display: 102

Side note: how come we didn't need a new CPS translation rule for this
function? There is no need for one, since `call-k` is already written in
a way that follows our calling convention, and no translation rule is
needed. In fact, no such rule is needed for `web-read` too --- except
for changing the call to `web-read/k`, it does exactly the same thing
that a function call does, so we can simply rename `web-read/k` as
`web-read` and drop the rule. (Note that the rewritten function call
will have a (CPS web-read) --- but CPS-ing an identifier results in the
identifier itself.) The same holds for `web-display` --- we just need to
make it adhere to the calling convention and add a `k` input which is
ignored. One minor complication is that `web-display` is also used as a
continuation value for a top-level expression in `CPS-code` --- so we
need to wrap it there.

The resulting code follows:

    #lang racket

    (define error raise-user-error)

    (define (nothing-to-do ignored)
      (error 'nothing-to-do "No computation to resume."))

    (define resumer (box nothing-to-do))

    (define (web-display n k) ; note that k is not used!
      (set-box! resumer nothing-to-do)
      (error 'web-display "~s" n))

    (define (web-read prompt k)
      (set-box! resumer k)
      (error 'web-read
             "enter (submit N) to continue the following\n  ~a:"
             prompt))

    (define (submit n)
      ;; to avoid mistakes, we clear out `resumer' before invoking it
      (let ([k (unbox resumer)])
        (set-box! resumer nothing-to-do)
        (k n)))

    (define (call-k f k)
      (f (lambda (val cont) (k val)) k))

    (define-syntax CPS
      (syntax-rules (+ lambda)
        [(CPS (+ E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (k (+ v1 v2)))))))]
        [(CPS (lambda (arg) E))
         (lambda (k)
           (k (lambda (arg cont)
                ((CPS E)
                 cont))))]
        [(CPS (E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (v1 v2 k))))))]
        ;; the following pattern ensures that the last rule is used only
        ;; with simple values and identifiers
        [(CPS (x ...))
         ---syntax-error---]
        [(CPS V)
         (lambda (k)
           (k V))]))

    (define-syntax CPS-code
      (syntax-rules (define)
        [(CPS-code (define (id arg) E) more ...)
         ;; simple translation to `lambda'
         (CPS-code (define id (lambda (arg) E)) more ...)]
        [(CPS-code (define id E) more ...)
         (begin (define id ((CPS E) (lambda (x) x)))
                (CPS-code more ...))]
        [(CPS-code last-expr)
         ((CPS last-expr) (lambda (val) (web-display val 'whatever)))]
        [(CPS-code) ; happens when there is no plain expr at
         (begin)])) ; the end so do nothing in this case

Obviously, given `call-k` we could implement `web-read/k` in user code:
`call-k` makes the current continuation available and going on from
there is simple (it will require a little richer language, so we will do
that in a minute). In fact, there is no real reason to stick to the fake
web framework to play with continuations. (Note: since we don't throw an
error to display the results, we can also allow multiple non-definition
expressions in `CPS-code`.)

    ;;; ---<<<CPS-LANGUAGE>>>-------------------------------------------
    ;; A language that is CPS-transformed (not an interpreter)

    #lang racket

    (define (call-k f k)
      (f (lambda (val cont) (k val)) k))

    (define-syntax CPS
      (syntax-rules (+ lambda)
        [(CPS (+ E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (k (+ v1 v2)))))))]
        [(CPS (lambda (arg) E))
         (lambda (k)
           (k (lambda (arg cont)
                ((CPS E)
                 cont))))]
        [(CPS (E1 E2))
         (lambda (k)
           ((CPS E1)
            (lambda (v1)
              ((CPS E2)
               (lambda (v2)
                 (v1 v2 k))))))]
        ;; the following pattern ensures that the last rule is used only
        ;; with simple values and identifiers
        [(CPS (x ...))
         ---syntax-error---]
        [(CPS V)
         (lambda (k)
           (k V))]))

    (define-syntax CPS-code
      (syntax-rules (define)
        [(CPS-code (define (id arg) E) more ...)
         ;; simple translation to `lambda'
         (CPS-code (define id (lambda (arg) E)) more ...)]
        [(CPS-code (define id E) more ...)
         (begin (define id ((CPS E) (lambda (x) x)))
                (CPS-code more ...))]
        [(CPS-code expr more ...)
         (begin ((CPS expr) (lambda (x) x))
                (CPS-code more ...))]
        [(CPS-code) (begin)])) ; done

    (CPS-code (call-k (lambda (abort) (+ 1 (abort 2))))
              (+ 100 (call-k (lambda (abort) (+ 1 (abort 2))))))


------------------------------------------------------------------------
# Continuations in Racket [Tuesday, April 15th]

As we have seen, CPS-ing code makes it possible to implement web
applications with a convenient interface. This is fine in theory, but in
practice it suffers from some problems. Some of these problems are
technicalities: it relies on proper implementation of tail calls (since
all calls are tail calls), and it represents the computation stack as a
chain of closures and therefore prevents the usual optimizations. But
there is one problem that is much more serious: it is a *global*
transformation, and as such, it requires access to the complete program
code. As an example, consider how `CPS-code` deals with definitions: it
uses an identity function as the continuation, but that wasn't the
proper way to do them, since it would break if computing the value
performs some web interaction. A good solution would instead put the
side-effect that `define` performs in the continuation --- but this side
effect is not even available for us when we work inside Racket.

Because of this, the proper way to make continuations available is for
the language implementation itself to provide it. There are a few
languages that do just that --- and Scheme has pioneered this as part of
the core requirements that the standard dictates: a Scheme
implementation needs to provide `call-with-current-continuation`, which
is the same tool as our `call-k`. Usually it is also provided with a
shorter name, `call/cc`. Here are our two examples, re-done with
Racket's built-in `call/cc`:

    (call/cc (lambda (abort) (+ 1 (abort 2))))
    (+ 100 (call/cc (lambda (abort) (+ 1 (abort 2)))))

[Side note: continuations as we see here are still provided only by a
few "fringe" functional languages. However, they are slowly making their
way into more mainstream languages --- Ruby has these continuations too,
and several other languages provide more limited variations, like
generators in Python. On the other hand, Racket provides a much richer
functionality: it has delimited continuations (which represents only a
part of a computation context), and its continuations are also
composable --- a property that goes beyond what we see here.]

Racket also comes with a more convenient `let/cc` form, which exposes
the "grab the current continuation" pattern more succinctly --- it's a
simple macro definition:

    (define-syntax-rule (let/cc k body ...)
      (call/cc (lambda (k) body ...)))

and the two examples become:

    (let/cc abort (+ 1 (abort 2)))
    (+ 100 (let/cc abort (+ 1 (abort 2))))

When it gets to choosing an implementation strategy, there are two
common approaches: one is to do the CPS transformation at the compiler
level, and another is to capture the actual runtime stack and wrap it in
an applicable continuation objects. The former can lead to very
efficient compilation of continuation-heavy programs, but the latter
makes it easier to deal with foreign functions (consider higher order
functions that are given as a library where you don't have its source)
and allows using the normal runtime stack that CPUs are using very
efficiently. Racket implements continuations with the latter approach
mainly for these reasons.

To see how these continuations expose some of the implementation details
that we normally don't have access to, consider grabbing the
continuation of a definition expression:

    > (define b (box #f))
    > (define a (let/cc k (set-box! b k) 123))
    > a
    123
    > ((unbox b) 1000)
    > a
    1000

> Note that using a top-level (let/cc abort ...code...) is not really
> aborting for a reason that is related to this: a true `abort` must
> capture the continuation before any computation begins. A natural
> place to do this is in the REPL implementation.

Finally, we can use these to re-implement our fake web framework, using
Racket's continuations instead of performing our own transformation. The
only thing that requires continuations is our `web-read` --- and using
the Racket facilities we can implement it as follows:

    (define (web-read prompt) ; no `k' input
      (let/cc k ; instead, get it with `let/cc'
        ;; and the body is the same as it was
        (set-box! resumer k)
        (error 'web-read
               "enter (submit N) to continue the following\n  ~a:"
               prompt)))

Note that this kind of an implementation is no longer a "language" ---
it is implemented as a plain library now, demonstrating the flexibility
that having continuations in our language enables. While this is still
just our fake toy framework, it is the core way in which the Racket web
server is implemented (see the "addition server" implementation above),
using a hash table that maps freshly made URLs to stored continuations.
The complete code follows:

    ;;; ---<<<CONTINUATION-BASED-WEB-LIBRARY>>>-------------------------
    ;; Simulation of web interactions with Racket's built-in
    ;; continuation facility

    #lang racket

    (define error raise-user-error)

    (define (nothing-to-do ignored)
      (error 'nothing-to-do "No computation to resume."))

    (define resumer (box nothing-to-do))

    (define (web-display n)
      (set-box! resumer nothing-to-do)
      (error 'web-display "~s" n))

    (define (web-read prompt)
      (let/cc k
        (set-box! resumer k)
        (error 'web-read
               "enter (submit N) to continue the following\n  ~a:"
               prompt)))

    (define (submit n)
      ;; to avoid mistakes, we clear out `resumer' before invoking it
      (let ([k (unbox resumer)])
        (set-box! resumer nothing-to-do)
        (k n)))

Using this, you can try out some of the earlier examples, which now
become much simpler since there is no need to do any CPS-ing. For
example, the code that required transforming `map` into a `map/k` can
now use the plain `map` directly. In fact, that's the exact code we
started that example with --- no changes needed:

    (define (sum l) (foldl + 0 l))
    (define (square n) (* n n))
    (define (read-number prompt)
      (web-read (format "~a number" prompt)))
    (web-display (sum (map (lambda (prompt)
                             (square (read-number prompt)))
                           '("First" "Second" "Third"))))

Note how `web-read` is executed directly --- it is a plain library
function.


------------------------------------------------------------------------
# Playing with Continuations [extra] [Tuesday, April 15th]

> [PLAI §19]

So far we've seen a number of "tricks" that can be done with
continuations. The simplest was aborting a computation --- here's an
implementation of functions with a `return` that can be used to exit the
function early:

    (define-syntax (fun stx)
      (syntax-case stx ()
        [(_ name (x ...) body ...)
         (with-syntax ([return (datum->syntax #'name 'return)])
           #'(define (name x ...) (let/cc return body ...)))]))

    ;; try it:
    (fun mult (list)
      (define (loop list)
        (cond [(null? list) 1]
              [(zero? (first list)) (return 0)] ; early return
              [else (* (first list) (loop (rest list)))]))
      (loop list))
    (mult '(1 2 3 0 x))

[Side note: This is a cheap demonstration. If we rewrite the loop
tail-recursively, then aborting it is simple --- just return 0 instead
of continuing the loop. And that's not a coincidence, aborting from a
tail-calling loop is easy, and CPS makes such aborts possible by making
only tail calls.]

But such uses of continuations are simple because they're used only to
"jump out" of some (dynamic) context. More exotic uses of continuations
rely on the ability to jump into a previously captured continuation. In
fact, our `web-read` implementation does just that (and more). The main
difference is that in the former case the continuation is used exactly
once --- either explicitly by using it, or implicitly by returning a
value (without aborting). If a continuation can be used after the
corresponding computation is over, then why not use it over and over
again... For example, we can try an infinite loop by capturing a
continuation and later use it as a jump target:

    (define (foo)
      (define loop (let/cc k k))    ; captured only for the context
      (printf "Meh.\n")
      (loop 'something))            ; need to give it some argument

This almost works --- we get two printouts so clearly the jump was
successful. The problem is that the captured `loop` continuation is the
one that expects a value to bind to `loop` itself, so the second
attempted call has `'something` as the value of `loop`, obviously,
leading to an error. This can be used as a hint for a solution ---
simply pass the continuation to itself:

    (define (foo)
      (define loop (let/cc k k))
      (printf "Meh.\n")
      (loop loop))                  ; keep the value of `loop'

Another way around this problem is to capture the continuation that is
just *after* the binding --- but we can't do that (try it...). Instead,
we can use side-effects:

    (define (foo)
      (define loop (box #f))
      (let/cc k (set-box! loop k))  ; cont. of the outer expression
      (printf "Meh.\n")
      ((unbox loop) 'something))

> Note: the `'something` value goes to the continuation which makes it
> the result of the `(let/cc ...)` expression --- which means that it's
> never actually used now.

This might seem like a solution that is not as "clean", since it uses
mutation --- but note that the problem that we're solving stems from a
continuation that exposes the mutation that Racket performs when it
creates a new binding.

Here's an example of a loop that does something a little more
interesting in a goto-like kind of way:

    (define (foo)
      (define n (box 0))
      (define loop (box #f))
      (let/cc k (set-box! loop k))
      (set-box! n (add1 (unbox n)))
      (printf "n = ~s\n" (unbox n))
      ((unbox loop)))

> Note: in this example the continuation is called *without* any inputs.
> How is this possible? As we've seen, the `'something` value in the
> last example is the never-used result of the `let/cc` expression. In
> this case, the continuation is called with no input, which means that
> the `let/cc` expression evaluates to ... nothing! This is not just
> some `void` value, but no value at all. The complete story is that in
> Racket expressions can evaluate to *multiple* values, and in this
> case, it's no values at all.

Given such examples it's no wonder that continuations tend to have a
reputation for being "similar to goto in their power". This reputation
has some vague highlevel justification in that both features can produce
obscure "spaghetti code" --- but in practice they are very different. On
one hand continuations are more limited: unlike `goto`, you can only
jump to a continuation that you have already "visited". On the other
hand, jumping to a continuation is doing much more than jumping to a
goto label, the latter changes the next instruction to execute (the
"program counter" register), but the former changes current computation
context (in low level terms, both the PC register and the stack). (See
also the `setjmp()` and `longjmp()` functions in C, or the `context`
related functions (`getcontext()`, `setcontext()`, `swapcontext()`).)

To demonstrate how different continuations are from plain gotos, we'll
start with a variation of the above loop --- instead of performing the
loop we just store it in a global box, and we return the counter value
instead of printing it:

    (define loop (box #f))

    (define (foo)
      (define n (box 0))
      (let/cc k (set-box! loop k))
      (set-box! n (add1 (unbox n)))
      (unbox n))

Now, the first time we call (foo), we get 1 as expected, and then we can
call (unbox loop) to re-invoke the continuation and get the following
numbers:

    > (foo)
    1
    > ((unbox loop))
    2
    > ((unbox loop))
    3

[Interesting experiment: try doing the same, but use (list (foo)) as the
first interaction, and the same ((unbox loop)) later.]

The difference between this use and a `goto` is now evident: we're not
just just jumping to a label --- we're jumping back into a computation
that returns the next number. In fact, the continuation can include a
context that is outside of `foo`, for example, we can invoke the
continuation from a different function, and `loop` can be used to search
for a specific number:

    (define (bar)
      (let ([x (foo)])
        (unless (> x 10) ((unbox loop)))
        x))

and now (bar) returns 11. The loop is now essentially going over the
obvious part of `foo` but also over parts of `bar`. Here's an example
that makes it even more obvious:

    (define (bar)
      (let* ([x (foo)]
             [y (* x 2)])
        (unless (> x 10) ((unbox loop)))
        y))

Since the `y` binding becomes part of the loop. Our `foo` can be
considered as a kind of a producer for natural numbers that can be used
to find a specific number, invoking the `loop` continuation to try the
next number when the last one isn't the one we want.


------------------------------------------------------------------------
## The Ambiguous Operator: `amb` [extra] [Tuesday, April 15th]

Our `foo` is actually a very limited version of something that is known
as "McCarthy's Ambiguous Operator", usually named `amb`. This operator
is used to perform a kind of a backtrack-able choice among several
values.

To develop our `foo` into such an `amb`, we begin by renaming `foo` as
`amb` and `loop` as `fail`, and instead of returning natural numbers in
sequence we'll have it take a list of values and return values from this
list. Also, we will use mutable variables instead of boxes to reduce
clutter (a feature that we've mostly ignored so far). The resulting code
is:

    (define fail #f)

    (define (amb choices)
      (let/cc k (set! fail k))
      (let ([choice (first choices)])
        (set! choices (rest choices))
        choice))

Of course, we also need to check that we actually have values to return:

    (define fail #f)

    (define (amb choices)
      (let/cc k (set! fail k))
      (if (pair? choices)
        (let ([choice (first choices)])
          (set! choices (rest choices))
          choice)
        (error "no more choices!")))

The resulting `amb` can be used in a similar way to the earlier `foo`:

    (define (bar)
      (let* ([x (amb '(5 10 15 20))]
             [y (* x 2)])
        (unless (> x 10) (fail))
        y))
    (bar)

This is somewhat useful, but searching through a simple list of values
is not too exciting. Specifically, we can have only one search at a
time. Making it possible to have multiple searches is not too hard:
instead of a single failure continuation, store a stack of them, where
each new `amb` pushes a new one on it.

We define `failures` as this stack and push a new failure continuation
in each `amb`. `fail` becomes a function that simply invokes the most
recent failure continuation, if one exists.

    (define failures null)

    (define (fail)
      (if (pair? failures)
        ((first failures))
        (error "no more choices!")))

    (define (amb choices)
      (let/cc k (set! failures (cons k failures)))
      (if (pair? choices)
        (let ([choice (first choices)])
          (set! choices (rest choices))
          choice)
        (error "no more choices!")))

This is close, but there's still something missing. When we run out of
options from the `choices` list, we shouldn't just throw an error ---
instead, we should invoke the previous failure continuation, if there is
one. In other words, we want to use `fail`, but before we do, we need to
pop up the top-most failure continuation since it is the one that we are
currently dealing with:

    (define failures null)

    (define (fail)
      (if (pair? failures)
        ((first failures))
        (error "no more choices!")))

    (define (amb choices)
      (let/cc k (set! failures (cons k failures)))
      (if (pair? choices)
        (let ([choice (first choices)])
          (set! choices (rest choices))
          choice)
        (begin (set! failures (rest failures))
               (fail))))

    (define (assert condition)
      (unless condition (fail)))

Note the addition of a tiny `assert` utility, something that is commonly
done with `amb`. We can now play with this code as before:

    (let* ([x (amb '(5 10 15 20))]
           [y (* x 2)])
      (unless (> x 10) (fail))
      y)

But of course the new feature is more impressive, for example, find two
numbers that sum up to 6 and the first is the square of the second:

    (let ([a (amb '(1 2 3 4 5 6 7 8 9 10))]
          [b (amb '(1 2 3 4 5 6 7 8 9 10))])
      (assert (= 6 (+ a b)))
      (assert (= a (* b b)))
      (list a b))

Find a Pythagorean triplet:

    (let ([a (amb '(1 2 3 4 5 6))]
          [b (amb '(1 2 3 4 5 6))]
          [c (amb '(1 2 3 4 5 6))])
      (assert (= (* a a) (+ (* b b) (* c c))))
      (list a b c))

Specifying the list of integers is tedious, but easily abstracted into a
function:

    (let* ([int6 (lambda () (amb '(1 2 3 4 5 6)))]
           [a (int6)]
           [b (int6)]
           [c (int6)])
      (assert (= (* a a) (+ (* b b) (* c c))))
      (list a b c))

A more impressive demonstration is finding a solution to tests known as
"Self-referential Aptitude Test", for example, [here's one such test]
(by Christian Schulte and Gert Smolka) --- it's a 10-question multiple
choice test:

[here's one such test]:
  https://mozart.github.io/mozart-v1/doc-1.4.0/fdt/node38.html

  1. The first question whose answer is b is question (a) 2; (b) 3; (c)
     4; (d) 5; (e) 6.
  2. The only two consecutive questions with identical answers are
     questions (a) 2 and 3; (b) 3 and 4; (c) 4 and 5; (d) 5 and 6; (e) 6
     and 7.
  3. The answer to this question is the same as the answer to question
     (a) 1; (b) 2; (c) 4; (d) 7; (e) 6.
  4. The number of questions with the answer a is (a) 0; (b) 1; (c) 2;
     (d) 3; (e) 4.
  5. The answer to this question is the same as the answer to question
     (a) 10; (b) 9; (c) 8; (d) 7; (e) 6.
  6. The number of questions with answer a equals the number of
     questions with answer (a) b; (b) c; (c) d; (d) e; (e) none of the
     above.
  7. Alphabetically, the answer to this question and the answer to the
     following question are (a) 4 apart; (b) 3 apart; (c) 2 apart; (d) 1
     apart; (e) the same.
  8. The number of questions whose answers are vowels is (a) 2; (b) 3;
     (c) 4; (d) 5; (e) 6.
  9. The number of questions whose answer is a consonant is (a) a prime;
     (b) a factorial; (c) a square; (d) a cube; (e) divisible by 5.
  10. The answer to this question is (a) a; (b) b; (c) c; (d) d; (e) e.

and the solution is pretty much a straightforward translation:

    (define (self-test)
      (define (choose-letter) (amb '(a b c d e)))
      (define q1  (choose-letter))
      (define q2  (choose-letter))
      (define q3  (choose-letter))
      (define q4  (choose-letter))
      (define q5  (choose-letter))
      (define q6  (choose-letter))
      (define q7  (choose-letter))
      (define q8  (choose-letter))
      (define q9  (choose-letter))
      (define q10 (choose-letter))
      ;; 1. The first question whose answer is b is question (a) 2;
      ;;    (b) 3; (c) 4; (d) 5; (e) 6.
      (assert (eq? q1 (cond [(eq? q2 'b) 'a]
                            [(eq? q3 'b) 'b]
                            [(eq? q4 'b) 'c]
                            [(eq? q5 'b) 'd]
                            [(eq? q6 'b) 'e]
                            [else (assert #f)])))
      ;; 2. The only two consecutive questions with identical answers
      ;;    are questions (a) 2 and 3; (b) 3 and 4; (c) 4 and 5; (d) 5
      ;;    and 6; (e) 6 and 7.
      (define all (list q1 q2 q3 q4 q5 q6 q7 q8 q9 q10))
      (define (count-same-consecutive l)
        (define (loop x l n)
          (if (null? l)
            n
            (loop (first l) (rest l)
                  (if (eq? x (first l)) (add1 n) n))))
        (loop (first l) (rest l) 0))
      (assert (eq? q2 (cond [(eq? q2 q3) 'a]
                            [(eq? q3 q4) 'b]
                            [(eq? q4 q5) 'c]
                            [(eq? q5 q6) 'd]
                            [(eq? q6 q7) 'e]
                            [else (assert #f)])))
      (assert (= 1 (count-same-consecutive all))) ; exactly one
      ;; 3. The answer to this question is the same as the answer to
      ;;    question (a) 1; (b) 2; (c) 4; (d) 7; (e) 6.
      (assert (eq? q3 (cond [(eq? q3 q1) 'a]
                            [(eq? q3 q2) 'b]
                            [(eq? q3 q4) 'c]
                            [(eq? q3 q7) 'd]
                            [(eq? q3 q6) 'e]
                            [else (assert #f)])))
      ;; 4. The number of questions with the answer a is (a) 0; (b) 1;
      ;;    (c) 2; (d) 3; (e) 4.
      (define (count x l)
        (define (loop l n)
          (if (null? l)
            n
            (loop (rest l) (if (eq? x (first l)) (add1 n) n))))
        (loop l 0))
      (define num-of-a (count 'a all))
      (define num-of-b (count 'b all))
      (define num-of-c (count 'c all))
      (define num-of-d (count 'd all))
      (define num-of-e (count 'e all))
      (assert (eq? q4 (case num-of-a
                        [(0) 'a]
                        [(1) 'b]
                        [(2) 'c]
                        [(3) 'd]
                        [(4) 'e]
                        [else (assert #f)])))
      ;; 5. The answer to this question is the same as the answer to
      ;;    question (a) 10; (b) 9; (c) 8; (d) 7; (e) 6.
      (assert (eq? q5 (cond [(eq? q5 q10) 'a]
                            [(eq? q5 q9) 'b]
                            [(eq? q5 q8) 'c]
                            [(eq? q5 q7) 'd]
                            [(eq? q5 q6) 'e]
                            [else (assert #f)])))
      ;; 6. The number of questions with answer a equals the number of
      ;;    questions with answer (a) b; (b) c; (c) d; (d) e; (e) none
      ;;    of the above.
      (assert (eq? q6 (cond [(= num-of-a num-of-b) 'a]
                            [(= num-of-a num-of-c) 'b]
                            [(= num-of-a num-of-d) 'c]
                            [(= num-of-a num-of-e) 'd]
                            [else 'e])))
      ;; 7. Alphabetically, the answer to this question and the answer
      ;;    to the following question are (a) 4 apart; (b) 3 apart; (c)
      ;;    2 apart; (d) 1 apart; (e) the same.
      (define (choice->integer x)
        (case x [(a) 1] [(b) 2] [(c) 3] [(d) 4] [(e) 5]))
      (define (distance x y)
        (if (eq? x y)
          0
          (abs (- (choice->integer x) (choice->integer y)))))
      (assert (eq? q7 (case (distance q7 q8)
                        [(4) 'a]
                        [(3) 'b]
                        [(2) 'c]
                        [(1) 'd]
                        [(0) 'e]
                        [else (assert #f)])))
      ;; 8. The number of questions whose answers are vowels is (a) 2;
      ;;    (b) 3; (c) 4; (d) 5; (e) 6.
      (assert (eq? q8 (case (+ num-of-a num-of-e)
                        [(2) 'a]
                        [(3) 'b]
                        [(4) 'c]
                        [(5) 'd]
                        [(6) 'e]
                        [else (assert #f)])))
      ;; 9. The number of questions whose answer is a consonant is (a) a
      ;;    prime; (b) a factorial; (c) a square; (d) a cube; (e)
      ;;    divisible by 5.
      (assert (eq? q9 (case (+ num-of-b num-of-c num-of-d)
                        [(2 3 5 7) 'a]
                        [(1 2 6)   'b]
                        [(0 1 4 9) 'c]
                        [(0 1 8)   'd]
                        [(0 5 10)  'e]
                        [else (assert #f)])))
      ;; 10. The answer to this question is (a) a; (b) b; (c) c; (d) d;
      ;;     (e) e.
      (assert (eq? q10 q10)) ; (note: does nothing...)
      ;; The solution should be: (c d e b e e d c b a)
      all)

Note that the solution is simple because of the freedom we get with
continuations: the search is not a sophisticated one, but we're free to
introduce ambiguity points anywhere that fits, and mix assertions with
other code without worrying about control flow (as you do in an
implementation that uses explicit loops). On the other hand, it is not
too efficient since it uses a naive search strategy. (This could be
improved somewhat by deferring ambiguous points, for example, don't
assign q7, q8, q9, and q10 before the first question; but much of the
cost comes from the strategy for implementing continuation in Racket,
which makes capturing continuations a relatively expensive operation.)

When we started out with the modified loop, we had a representation of
an arbitrary natural number --- but with the move to lists of choices we
lost the ability to deal with such infinite choices. Getting it back is
simple: delay the evaluation of the `amb` expressions. We can do that by
switching to a list of thunks instead. The change in the code is in the
result: just return the result of calling `choice` instead of returning
it directly. We can then rename `amb` to `amb/thunks` and reimplement
`amb` as a macro that wraps all of its sub-forms in thunks.

    (define (amb/thunks choices)
      (let/cc k (set! failures (cons k failures)))
      (if (pair? choices)
        (let ([choice (first choices)])
          (set! choices (rest choices))
          (choice))                    ;*** call the choice thunk
        (begin (set! failures (rest failures))
               (fail))))

    (define-syntax-rule (amb E ...)
      (amb/thunks (list (lambda () E) ...)))

With this, we can implement code that computes choices rather than
having them listed:

    (define (integers-between n m)
      (assert (<= n m))
      (amb n (integers-between (add1 n) m)))

or even ones that are infinite:

    (define (integers-from n)
      (amb n (integers-from (add1 n))))

As with any infinite sequence, there are traps to avoid. In this case,
trying to write code that can find any Pythagorean triplet as:

    (collect 7
      (let ([a (integers-from 1)]
            [b (integers-from 1)]
            [c (integers-from 1)])
        (assert (= (* a a) (+ (* b b) (* c c))))
        (list a b c)))

will not work. The problem is that the search loop will keep
incrementing `c`, and therefore will not find any solution. The search
can work if only the top-most choice is infinite:

    (collect 7
      (let* ([a (integers-from 1)]
             [b (integers-between 1 a)]
             [c (integers-between 1 a)])
        (assert (= (* a a) (+ (* b b) (* c c))))
        (list a b c)))

The complete code follows:

    ;;; ---<<<AMB>>>----------------------------------------------------
    ;; The ambiguous operator and related utilities

    #lang racket

    (define failures null)

    (define (fail)
      (if (pair? failures)
        ((first failures))
        (error "no more choices!")))

    (define (amb/thunks choices)
      (let/cc k (set! failures (cons k failures)))
      (if (pair? choices)
        (let ([choice (first choices)])
          (set! choices (rest choices))
          (choice))
        (begin (set! failures (rest failures))
               (fail))))

    (define-syntax-rule (amb E ...)
      (amb/thunks (list (lambda () E) ...)))

    (define (assert condition)
      (unless condition (fail)))

    (define (integers-between n m)
      (assert (<= n m))
      (amb n (integers-between (add1 n) m)))

    (define (integers-from n)
      (amb n (integers-from (add1 n))))

    (define (collect/thunk n thunk)
      (define results null)
      (let/cc too-few
        (set! failures (list too-few))
        (define result (thunk))
        (set! results (cons result results))
        (set! n (sub1 n))
        (unless (zero? n) (fail)))
      (set! failures null)
      (reverse results))

    (define-syntax collect
      (syntax-rules ()
        ;; collect N results
        [(_ N E) (collect/thunk N (lambda () E))]
        ;; collect all results
        [(_ E) (collect/thunk -1 (lambda () E))]))

As a bonus, the code includes a `collect` tool that can be used to
collect a number of results --- it uses `fail` to iterate until a
sufficient number of values is collected. A simple version is:

    (define (collect/thunk n thunk)
      (define results null)
      (define result (thunk))
      (set! results (cons result results))
      (set! n (sub1 n))
      (unless (zero? n) (fail))
      (reverse results))

(Question: why does this code use mutation to collect the results?)

But since this might run into a premature failure, the actual version in
the code installs its own failure continuation that simply aborts the
collection loop. To try it out:

    (collect (* (integers-between 1 3) (integers-between 1 5)))


------------------------------------------------------------------------
## Generators (and Producers) [extra] [Tuesday, April 15th]

Another popular facility that is related to continuations is generators.
The idea is to split code into separate "producers" and "consumers",
where the computation is interleaved between the two. This simplifies
some notoriously difficult problems. It is also a twist on the idea of
co-routines, where two functions transfer control back and forth as
needed. (Co-routines can be developed further into a "cooperative
threading" system, but we will not cover that here.)

A classical example that we have mentioned previously is the "same
fringe" problem. One of the easy solutions that we talked about was to
run two processes that spit out the tree leaves, and a third process
that grabs both outputs as they come and compares them. Using a lazy
language allowed a very similar solution, where the two processes are
essentially represented as two lazy lists. But with continuations we can
find a solution that works in a strict language too, and in fact, one
that is very close to the two processes metaphor.

The fact that continuations can support such a solution shouldn't be
surprising: as with the kind of server-client interactions that we've
seen with the web language, and as with the `amb` tricks, the main theme
is the same --- the idea of suspending computation. (Intuitively, this
also explains why a lazy language is related: it is essentially making
all computations suspendable in a sense.)

To implement generators, we begin with a simple code that we want to
eventually use:

    (define (producer)
      (yield 1)
      (yield 2)
      (yield 3))

where `yield` is expected to behave similarly to a `return` --- it
should make the function return 1 when called, and then somehow return 2
and 3 on subsequent calls. To make it easier to develop, we'll make
`yield` an argument to the producer:

    (define (producer yield)
      (yield 1)
      (yield 2)
      (yield 3))

To use this producer, we need to find a proper value to call it with.
Sending it an identity, (lambda (x) x), is clearly not going to work: it
will make all `yield`s executed on the first call, returning the last
value. Instead, we need some way to abort the computation on the first
`yield`. This, of course, can be done with a continuation, which we
should send as the value of the `yield` argument. And indeed,

    > (let/cc k (producer k))
    1

returns `1` as we want. But if we use this expression again, we get more
`1`s as results:

    > (let/cc k (producer k))
    1
    > (let/cc k (producer k))
    1

The problem is obvious: our producer starts every time from scratch,
always sending the first value to the given continuation. Instead, we
need to make it somehow save where it stopped --- its own continuation
--- and on subsequent calls it should resume from that point. We start
with adding a `resume` continuation to save our position into:

    (define (producer yield)
      (define resume #f)
      (if (not resume)   ; we just started, so no resume yet
        (begin (yield 1)
               (yield 2)
               (yield 3))
        (resume 'blah))) ; we have a resume, use it

Next, we need to make it so that each use of `yield` will save its
continuation as the place to resume from:

    (define (producer yield)
      (define resume #f)
      (if (not resume)
        (begin (let/cc k (set! resume k) (yield 1))
               (let/cc k (set! resume k) (yield 2))
               (let/cc k (set! resume k) (yield 3)))
        (resume 'blah)))

But this is still broken in an obvious way: every time we invoke this
function, we define a new local `resume` which is always `#f`, leaving
us with the same behavior. We need `resume` to persist across calls ---
which we can get by "pulling it out" using a `let`:

    (define producer
      (let ([resume #f])
        (lambda (yield)
          (if (not resume)
            (begin (let/cc k (set! resume k) (yield 1))
                   (let/cc k (set! resume k) (yield 2))
                   (let/cc k (set! resume k) (yield 3)))
            (resume 'blah)))))

And this actually works:

    > (let/cc k (producer k))
    1
    > (let/cc k (producer k))
    2
    > (let/cc k (producer k))
    3

(Tracing how it works is a good exercise.)

Before we continue, we'll clean things up a little. First, to make it
easier to get values from the producer, we can write a little helper:

    (define (get producer)
      (let/cc k (producer k)))

Next, we can define a local helper inside the producer to improve it in
a similar way by making up a `yield` that wraps the `raw-yield` input
continuation (also flip the condition):

    (define producer
      (let ([resume #f])
        (lambda (raw-yield)
          (define (yield value)
            (let/cc k (set! resume k) (raw-yield value)))
          (if resume
            (resume 'blah)
            (begin (yield 1)
                   (yield 2)
                   (yield 3))))))

And we can further abstract out the general producer code from the
specific 1-2-3 producer that we started with. The complete code is now:

    (define (make-producer producer)
      (let ([resume #f])
        (lambda (raw-yield)
          (define (yield value)
            (let/cc k (set! resume k) (raw-yield value)))
          (if resume
            (resume 'blah)
            (producer yield)))))

    (define (get producer)
      (let/cc k (producer k)))

    (define producer
      (make-producer (lambda (yield)
                       (yield 1)
                       (yield 2)
                       (yield 3))))

When we now evaluate (get producer) three times, we get back the three
values in the correct order. But there is a subtle bug here, first try
this (after re-running!):

    > (list (get producer) (get producer))

Seems that this is stuck in an infinite loop. To see where the problem
is, re-run to reset the producer, and then we can see the following
interaction:

    > (* 10 (get producer))
    10
    > (* 100 (get producer))
    20
    > (* 12345 (get producer))
    30

This looks weird... Here's a more clarifying example:

    > (list (get producer))
    '(1)
    > (get producer)
    '(2)
    > (get producer)
    '(3)

Can you see what's wrong now? It seems that all three invocations of the
producer use the same continuation --- the first one, specifically, the
`(list <*>)` continuation. This also explains why we run into an
infinite loop with `(list (get producer) (get producer))` --- the first
continuation is:

    (list <*> (get producer))

so when we get the first `1` result we plug it in and proceed to
evaluate the second `(get producer)`, but that re-invokes the *first*
continuation again, getting into an infinite loop. We need to look
closely at our `make-producer` to see the problem:

    (define (make-producer producer)
      (let ([resume #f])
        (lambda (raw-yield)
          (define (yield value)
            (let/cc k (set! resume k) (raw-yield value)))
          (if resume
            (resume 'blah)
            (producer yield)))))

When `(make-producer (lambda (yield) ...))` is first called, `resume` is
initialized to `#f`, and the result is the `(lambda (raw-yield) ...)`,
which is bound to the global `producer`. Next, we call this function,
and since `resume` is `#f`, we apply the `producer` on our `yield` ---
which is a closure that has a reference to the `raw-yield` that we
received --- the continuation that was used in this first call. The
problem is that on subsequent calls `resume` will contain a continuation
which it is called, but this will jump back to that first closure with
the original `raw-yield`, so instead of returning to the current calling
context, we re-return to the first context --- the same first
continuation. The code can be structured slightly to make this a little
more obvious: push the `yield` definition into the only place it is used
(the first call):

    (define (make-producer producer)
      (let ([resume #f])
        (lambda (raw-yield)
          (if resume
            (resume 'blah)
            (let ([yield (lambda (value)
                           (let/cc k
                             (set! resume k)
                             (raw-yield value)))])
              (producer yield))))))

`yield` is not used elsewhere, so this code has exactly the same meaning
as the previous version. You can see now that when the producer is first
used, it gets a `raw-yield` continuation which is kept in a newly made
closure --- and even though the following calls have *different*
continuations, we keep invoking the first one. These calls get new
continuations as their `raw-yield` input, but they ignore them. It just
happened that the when we evaluated `(get producer)` three times on the
REPL, all calls had essentially the same continuation (the `P` part of
the REPL), so it seemed like things are working fine.

To fix this, we must avoid calling the same initial `raw-yield` every
time: we must change it with each call so it is the right one. We can do
this with another mutation --- introduce another state variable that
will refer to the correct `raw-yield`, and update it on every call to
the producer. Here's one way to do this:

    (define (make-producer producer)
      (let ([resume #f]
            [return-to-caller #f])
        (lambda (raw-yield)
          (set! return-to-caller raw-yield)
          (if resume
            (resume 'blah)
            (let ([yield (lambda (value)
                           (let/cc k
                             (set! resume k)
                             (return-to-caller value)))])
              (producer yield))))))

Using this, we get well-behaved results:

    > (list (get producer))
    '(1)
    > (* 8 (get producer))
    16
    > (get producer)
    3

or (again, after restarting the producer by re-running the code):

    > (list (get producer) (get producer) (get producer))
    '(1 2 3)

> Side-note: a different way to achieve this is to realize that when we
> invoke `resume`, we're calling the continuation that was captured by
> the `let/cc` expression. Currently, we're sending just `'blah` to that
> continuation, but we could send `raw-yield` there instead. With that,
> we can make that continuation be the target of setting the
> `return-to-caller` state variable. (This is how PLAI solves this
> problem.)

    (define (make-producer producer)
      (let ([resume #f])
        (lambda (raw-yield)
          (define return-to-caller raw-yield)
          (define (yield value)
            (set! return-to-caller
                  (let/cc k
                    (set! resume k)
                    (return-to-caller value))))
          (if resume
            (resume raw-yield)
            (producer yield)))))

Continuing with our previous code, and getting the `yield` back into a a
more convenient definition form, we have this complete code:

    ;;; ---<<<PRODUCER>>>-----------------------------------------------
    ;; An implementation of producer functions

    #lang racket

    (define (make-producer producer)
      (let ([resume #f]
            [return-to-caller #f])
        (lambda (raw-yield)
          (define (yield value)
            (let/cc k (set! resume k) (return-to-caller value)))
          (set! return-to-caller raw-yield)
          (if resume
            (resume 'blah)
            (producer yield)))))

    (define (get producer)
      (let/cc k (producer k)))

    (define producer
      (make-producer (lambda (yield)
                       (yield 1)
                       (yield 2)
                       (yield 3))))

There is still a small problem with this code:

    > (list (get producer) (get producer) (get producer))
    '(1 2 3)
    > (get producer)
    ;; infinite loop

Tracking this problem is another good exercise, and finding a solution
is easy. (For example, throwing an error when the producer is exhausted,
or returning `'done`, or returning the return value of the producer
function.)


------------------------------------------------------------------------
## Delimited Continuations [extra] [Tuesday, April 15th]

While the continuations that we have seen are a useful tool, they are
often "too global" --- they capture the complete computation context.
But in many cases we don't want that, instead, we want to capture a
specific context. In fact, this is exactly why producer code got
complicated: we needed to keep capturing the `return-to-caller`
continuation to make it possible to return to the correct context rather
than re-invoking the initial (and wrong) context.

Additional work on continuations resulted in a feature that is known as
"delimited continuations". These kind of continuations are more
convenient in that they don't capture the complete context --- just a
potion of it up to a specific point. To see how this works, we'll
restart with a relatively simple producer definition:

    (define producer
      (let ()
        (define (cont)
          (let/cc ret
            (define (yield value)
              (let/cc k (set! cont k) (ret value)))
            (yield 1)
            (yield 2)
            (yield 3)
            4))
        (define (generator) (cont))
        generator))

This producer is essentially the same as one that we've seen before: it
seems to work in that it returns the desired values for every call:

    > (producer)
    1
    > (producer)
    2
    > (producer)
    3

But fails in that it always returns to the initial context:

    > (list (producer))
    '(1)
    > (+ 100 (producer))
    '(2)
    > (* "bogus" (producer))
    '(3)

Fixing this will lead us down the same path we've just been through: the
problem is that `generator` is essentially an indirection "trampoline"
function that goes to whatever `cont` currently holds, and except for
the initial value of `cont` the other values are continuations that are
captured inside `yield`, meaning that the calls are all using the same
`ret` continuation that was grabbed once, at the beginning. To fix it,
we will need to re-capture a return continuation on every use of
`yield`, which we can do by modifying the `ret` binding, giving us a
working version:

    (define producer
      (let ()
        (define (cont)
          (let/cc ret
            (define (yield value)
              (let/cc k
                (set! cont (lambda () (let/cc r (set! ret r) (k))))
                (ret value)))
            (yield 1)
            (yield 2)
            (yield 3)
            4))
        (define (generator) (cont))
        generator))

This pattern of grabbing the current continuation and then jumping to
another --- `(let/cc k (set! cont k) (ret value))` --- is pretty common,
enough that there is a specific construct that does something similar:
`control`. Translating the `let/cc` form to it produces:

    (control k (set! cont ...) value)

A notable difference here is that we don't use a `ret` continuation.
Instead, another feature of the `control` form is that the value returns
to a specific point back in the current computation context that is
marked with a `prompt`. (Note that the `control` and `prompt` bindings
are not included in the default `racket` language, we need to get them
from a library: `(require racket/control)`.) The fully translated code
simply uses this `prompt` in place of the outer capture of the `ret`
continuation:

    (define producer
      (let ()
        (define (cont)
          (prompt
            (define (yield value)
              (control k
                (set! cont ???)
                value))
            (yield 1)
            (yield 2)
            (yield 3)
            4))
        (define (generator) (cont))
        generator))

We also need to translate the `(lambda () (let/cc r (set! ret r) (k)))`
expression --- but there is no `ret` to modify. Instead, we get the same
effect by another use of `prompt` which is essentially modifying the
implicitly used return continuation:

    (define producer
      (let ()
        (define (cont)
          (prompt
            (define (yield value)
              (control k
                (set! cont (lambda () (prompt (k))))
                value))
            (yield 1)
            (yield 2)
            (yield 3)
            4))
        (define (generator) (cont))
        generator))

This looks like the previous version, but there's an obvious advantage:
since there is no `ret` binding that we need to maintain, we can pull
out the `yield` definition to a more convenient place:

    (define producer
      (let ()
        (define (yield value)
          (control k
            (set! cont (lambda () (prompt (k))))
            value))
        (define (cont)
          (prompt
            (yield 1)
            (yield 2)
            (yield 3)
            4))
        (define (generator) (cont))
        generator))

Note that this is an important change, since the producer machinery can
now be abstracted into a `make-producer` function, as we've done before:

    (define (make-producer producer)
      (define (yield value)
        (control k
          (set! cont (lambda () (prompt (k))))
          value))
      (define (cont) (prompt (producer yield)))
      (define (generator) (cont))
      generator)

    (define producer
      (make-producer (lambda (yield)
                       (yield 1)
                       (yield 2)
                       (yield 3)
                       4)))

This is, again, a common pattern in such looping constructs --- where
the continuation of the loop keeps modifying the prompt as we do in the
thunk assigned to `cont`. There are two other operators that are similar
to `control` and `prompt`, which re-instate the point to return to
automatically. Confusingly, they have completely different name: `shift`
and `reset`. In the case of our code, we simply do the straightforward
translation, and drop the extra wrapping step inside the value assigned
to `cont` since that is done automatically. The resulting definition
becomes even shorter now:

    (define (make-producer producer)
      (define (yield value) (shift k (set! cont k) value))
      (define (cont) (reset (producer yield)))
      (define (generator) (cont))
      generator)

(Question: which set of forms is the more powerful one?)

It even looks like this code works reasonably well when the producer is
exhausted:

    > (list (producer) (producer) (producer) (producer) (producer))
    '(1 2 3 4 4)

But the problem is still there, except a but more subtle. We can see it
if we add a side-effect:

    (define producer
      (make-producer (lambda (yield)
                       (yield 1)
                       (yield 2)
                       (yield 3)
                       (printf "Hey!\n")
                       4)))

and now we get:

    > (list (producer) (producer) (producer) (producer) (producer))
    Hey!
    Hey!
    '(1 2 3 4 4)

This can be solved in the same way as we've discussed earlier --- for
example, grab the result value of the producer (which means that we get
the value only after it's exhausted), then repeat returning that value.
A particularly easy way to do this is to set `cont` to a thunk that
returns the value --- since the resulting `generator` function simply
invokes it, we get the desired behavior of returning the last value on
further calls:

    (define (make-producer producer)
      (define (yield value) (shift k (set! cont k) value))
      (define (cont)
        (reset (let ([retval (producer yield)])
                 ;; we get here when the producer is done
                 (set! cont (lambda () retval))
                 retval)))
      (define (generator) (cont))
      generator)

    (define producer
      (make-producer (lambda (yield)
                       (yield 1)
                       (yield 2)
                       (yield 3)
                       (printf "Hey!\n")
                       4)))

and now we get the improved behavior:

    > (list (producer) (producer) (producer) (producer) (producer))
    Hey!
    '(1 2 3 4 4)


------------------------------------------------------------------------
# Continuation Conclusions [Tuesday, April 15th]

Continuations are often viewed as a feature that is too complicated to
understand and/or are hard to implement. As a result, very few languages
provide general first-class continuations. Yet, they are an extremely
useful tool since they enable implementing new kinds of control
operators as user-written libraries. The "user" part is important here:
if you want to implement producers (or a convenient `web-read`, or an
ambiguous operator, or any number of other uses) in a language that
doesn't have continuations your options are very limited. You can ask
for the new feature and wait for the language implementors to provide
it, or you can CPS the relevant code (and the latter option is possible
only if you have complete control over the whole code source to
transform). With continuations, as we have seen, it is not only possible
to build such libraries, the resulting functionality is as if the
language has the desired feature already built-in. For example, Racket
comes with a generator library that is very similar to Python generators
--- but in contrast to Python, it is implemented completely in user
code. (In fact, the implementation is very close to the delimited
continuations version that we've seen last.)

Obviously, in cases where you don't have continuations and you need them
(or rather when you need some functionality that is implementable via
continuations), you will likely resort to the CPS approach, in some
limited version. For example, the [Racket documentation search page]
allows input to be typed while the search is happening.

[Racket documentation search page]: http://docs.racket-lang.org/search/

This is a feature that by itself is not available in JavaScript --- it
is as if there are two threads running (one for the search and one to
handle input), where JS is single-threaded on principle. This was
implemented by making the search code side-effect free, then CPS-ing the
code, then mimic threads by running the search for a bit, then storing
its (manually built) continuation, handling possible new input, then
resuming the search via this continuation. An approach that solves a
similar problem using a very different approach is node.js --- a
JavaScript-based server where all IO is achieved via functions that
receive callback functions, resulting in a style of code that is
essentially writing CPSed code. For example, it is similar in principle
to write code like:

    ;; copy "foo" to "tmp", read a line, delete "tmp", log the line
    (copy-file "foo" "tmp"
      (lambda ()
        (read-line "tmp"
          (lambda (line)
            (delete-file "tmp"
              (lambda ()
                (log-line line
                  (lambda ()
                    (printf "All done.\n")))))))))

or a concrete node.js example --- to swap two files, you could write:

    function swap(path1, path2, callback) {
      fs.rename(path1, "temp-name",
        function() {
          fs.rename(path2, path1,
            function() {
              fs.rename("temp-name", path2, callback);
            });
        });
    }

and if you want to follow the convention of providing a "convenient"
synchronous version, you would also add:

    function swapSync(path1, path2) {
      fs.renameSync(path1, "temp-name");
      fs.renameSync(path2, path1);
      fs.renameSync("temp-name", path2);
    }

As we have seen in the web server application example, this style of
programming tends to be "infectious", where a function that deals with
these callback-based functions will itself consume a callback ---

    ;; abstract the above as a function
    (define (safe-log-line in-file callback)
      (copy-file in-file "tmp"
        (lambda ()
          ... (log-line line callback))))

You should be able to see now what is happening here, without even
mentioning the word "continuation" in the docs... See also this [Node vs
Apache] video and read this [extended JS rant]. Quote:

> No one ever for a second thought that a programmer would write actual
> code like that. And then Node came along and all of the sudden here we
> are pretending to be compiler back-ends. Where did we go wrong?

[Node vs Apache]:
  https://www.youtube.com/watch?v=bzkRVzciAZg

[extended JS rant]:
  https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function

> (Actually, JavaScript has gotten better with promises, and then even
> better with `async`/`await` --- but these are new, so it is actually
> common to find libraries that provide two such versions and a *third*
> promise-based one, and even a *fourth* one, using `async`. See for
> example the `replace-in-file` package on NPM.)

Finally, as mentioned a few times, there has been extensive research
into many aspects of continuations. Different CPS approaches, different
implementation strategies, [a zoo-full of control operators], assigning
types to continuation-related functions and connections between
continuations and types, even connections between CPS and certain proof
techniques. Some research is still going on, though not as hot as it was
--- but more importantly, many modern languages "discover" the utility
of having continuations, sometimes in some more limited forms (eg,
Python and JavaScript generators), and sometimes in full form (eg,
Ruby's `callcc`).

[a zoo-full of control operators]:
  https://docs.racket-lang.org/reference/cont.html#%28part._.Additional_.Control_.Operators%29


------------------------------------------------------------------------
## Sidenote: JavaScript: Continuations, Promises, and Async/Await [Tuesday, April 15th]

JavaScript went through a long road that is directly related to
continuations. The motivating language feature that started all of this
is the early decision to make the language single-threaded. This feature
does make code easier to handle WRT side effects of all kinds. For
example, consider:

    let x = 0;
    x = x + 1;
    console.log(`A. x = ${x}`);
    x = x + 1;
    console.log(`B. x = ${x}`);

In JS, this code is guaranteed to show exactly the two output lines,
with the expected values of 1 and 2. This is not guaranteed in any
language that has threads (or any form of parallelism), since other
threads might kick in at any point, possibly producing more output or
mutating the value of `x`.

For several years JS was used only as a lightweight scripting language
for simple tasks on web pages, and life was simple with the single
threaded-ness nature of the language. Note that the environment *around*
the language (= the browser) was very early quite thread-heavy, with
different threads used for text layout, rendering, loading, user
interactions, etc. But then came Node and JS was used for more
traditional uses, with a web server being one of the first use cases.
This meand that there was a a need for some way to handle "multiple
threads" --- it's impractical to wait for any server-side interaction to
be done before serving the next.

This was addressed by designing Node as a framework that makes heavy use
of callback functions and IO events. In cases where you really want a
linear sequence of operations you can use `*Sync` functions, such as the
one we've seen above:

    function swapSync(path1, path2) {
      fs.renameSync(path1, "temp-name");
      fs.renameSync(path2, path1);
      fs.renameSync("temp-name", path2);
    }

But you can also use the non-`Sync` version which allows to switch
between different active jobs --- translating the above to (ignoring
errors for simplicity):

    function swap(path1, path2) {
      fs.rename(path1, "temp-name", () => {
        fs.rename(path2, path1, () => {
          fs.rename("temp-name", path2);
        });
      });
    }

This could be better in some cases --- for example, if the filesystem is
a remote-mounted directory, each rename will likely take noticeable
time. Using callbacks means that after firing each rename request the
whole process is free to do any other work, and later resume the
execution with the next operation when the last request is done.

But there is a major problem with this code: we can be more efficient
since the main process can do anything while waiting for the three
operations, but if you want to run this function to swap two files then
you likely have some code that needs to run after executing it --- code
that expects the name swapping to have happened:

    swap("foo", "bar");
    ... more code ...

The problem is that the function call returns *immediately*, and just
registers the first rename operation. There might be a small chance
(*very* unlikely) for the first rename to have happened, but even if it
happens fast, the callback is *guaranteed* to not be called (again,
since the core language is single-threaded), making this code completely
broken.

This means that for such uses we *must* switch to using callbacks in the
main function too:

    function swap(path1, path2, cb) {
      fs.rename(path1, "temp-name", () => {
        fs.rename(path2, path1, () => {
          fs.rename("temp-name", path2, () =>
            cb()); // or just use cb directly
        });
      });
    }

And the uses too:

    swap("foo", "bar", () => {
      ... more code ...
    });

This is the familiar manual CPS-ing of code that was for many years
something that all JS programmes needed to use.

As we've seen, the problem here is that writing such code is difficult,
and even more so working with such code when it needs to be extended,
debugged, etc. Eventually, the JS world settled on using "promises" to
simplify all of this: the idea is that instead of building "towers of
callbacks", we abstract them as promises that are threaded using a
`.then` property for the chain of operations.

    function swapPromise(path1, path2) {
      return fsPromises.rename(path1, "temp-name")
               .then(() => fsPromises.rename(path2, path1))
               .then(() => fsPromises.rename("temp-name", path2));
    }

Note that here too, the changes in the function body are reflated in how
the whole function is used: the body uses promises, and therefore the
whole function returns a promise.

This is a little better in that we don't suffer from nested code towers,
but it's still not too convenient. The last step in this evolution was
the introduction of `async`/`await` --- language features that declare a
promise-ified function which can wait for promises to be executed in a
more-or-less plain looking code:

    async function swap(path1, path2) {
      await fsPromises.rename(path1, "temp-name");
      await fsPromises.rename(path2, path1);
      await fsPromises.rename("temp-name", path2);
    }

(The implementation of this feature is interesting: it relies on using
generator functions, and using the fact that you can send values *into*
a generator when you resume it --- and all of this is, roughly speaking,
used to implement a threading system.)

This looks almost like normal code, but note that while doing so we lose
the simplification of a single-threaded language. Uses of `await` are
equivalent to function calls that can switch the context to other code
which might introduce unexpected side-effects. This is not really
getting back to a proper multi-threaded language like Racket: it's
rather a language with "cooperative threads". You still have the
advantage that if you *don't* use `await`, then code executes
sequentially and uninterrupted. Another problem that is a result of all
of this is that you might have some code where you need some async
operation: and just adding that might lead to new problems if you're not
fully aware of how it might affect your code.

Whether it is useful or not to complicate these cases just to be able to
write non-async code is a question... Meanwhile, you can see that there
are *still* significant problems in the JS world that are a result of
this mess. For example, the most popular web server framework,
[express](https://expressjs.com/), was finally released in its version 5
form (which supports async handlers) at the end of 2024, after around a
*decade* of working on it --- and it will take a bunch of time for it to
gain popularity.