PL: Lecture #27  Tuesday, April 11th

Web Programming

PLAI §15

Consider web programming as a demonstration of a frequent problem. The HTTP protocol is stateless: each HTTP query can be thought of as running a program (or a function), getting a result, then killing it. This makes interactive applications hard to write.

For example, consider this behavior (which is based on a real story of a probably not-so-real bug known as “the ITA bug”):

Obviously there is some fundamental problem here, especially given that it plagued many websites early on. These kinds of problems can still be found in some places today (like the registrar’s system), except that people are much more aware of them and much better prepared to deal with them. In an attempt to clarify what exactly went wrong, we might require that each interaction results in something that is deterministically based on what the browser window shows when the interaction is made, but even that is not always what we want. Consider the same scenario except with a bookstore and an “add to my cart” button. In this case you want to be able to add one item to the cart in the first window, then switch to the second window and click “add” there too: you want to end up with a cart that has both items.

The basic problem here is HTTP’s statelessness, something that both web servers and web browsers use extensively. Browsers give you navigation buttons, and sometimes will not even communicate with the web server when you use them (instead, they’ll show you cached pages); they give you the ability to open multiple windows or tabs from the current one; and they allow you to “clone” the current tab. If you view each set of HTTP queries as a session, this means that web browsers allow you to go back and forth in time, explore multiple futures in parallel, and clone your current world.

These are features that the HTTP protocol intentionally allows by being stateless, and that people have learned to use effectively. A stateful protocol (like ssh, or ftp) will run in a single process (or a program, or a function) that is interacting with you directly, and this process dies only when you disconnect. A big advantage of stateful protocols is their ability to be very interactive and rely on state (eg, an editor updates a character on the screen, relying on the rest of it showing the same text); but stateless protocols can scale up better, and deal with a more hectic kind of interaction (eg, open a page on an online store, keep it open and buy the item a month later; or any of the above “time manipulation” devices).

Side-note: Some people think that Ajax is the answer to all of these problems. In reality, Ajax is layered on top of (asynchronous) web queries, so in fact it is the exact same situation. You do have an option of creating an application that works completely on the client side, but that wouldn’t be as attractive — and even if you do so, you’re still working inside a browser that can play the same time tricks.

Basic web programming

PLAI §16

Obviously, writing programs to run on a web server is a profitable activity, and therefore highly desirable. But when we do so, we need to somehow cope with the web’s statelessness. To see the implications from a PL point of view we’ll use a small “toy” example that demonstrates the basic issues — an “addition” service:

[Such a small application is not realistic, of course: you can obviously ask for both numbers on the same page. We still use it, though, to reduce the general interaction problem to a more comprehensible core problem.]

Starting from just that, consider how you’d want to write the code for such a service. (If you have experience writing web apps, then try to forget all of that now, and focus on just this basic problem.) The plain version of what we want to implement is:

(print
  (+ (read "First number")
     (read "Second number")))

which is trivially “translated” to:

(web-display
  (+ (web-read "First number")
     (web-read "Second number")))

But this is never going to work. The interaction is limited to presenting the user with some data and that’s all — you cannot do any kind of interactive querying. For the purpose of making this more concrete, imagine that web-read and web-display both communicate information to the user via something like error: the information is sent and at the same time the whole computation is aborted. With this, the above code will just manage to ask for the first number and nothing else happens.
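To see this failure concretely, here is a minimal sketch that assumes the error-based model just described. The definitions of web-read and web-display below are hypothetical stand-ins (not a real web API), and run-request is a helper invented here to capture what the user would see from a single request:

```racket
#lang racket

;; Hypothetical stand-ins: each one "sends" its text to the user by
;; raising an error, which also aborts the computation in progress.
(define (web-read prompt)
  (raise-user-error 'web-read "~a:" prompt))

(define (web-display n)
  (raise-user-error 'web-display "~s" n))

;; Invented helper: run one "request" and return the text that
;; reached the user.
(define (run-request thunk)
  (with-handlers ([exn:fail:user? exn-message])
    (thunk)))

;; The naive translation only gets as far as the first prompt: the
;; second read and the addition are never reached.
(define first-page
  (run-request
   (lambda ()
     (web-display (+ (web-read "First number")
                     (web-read "Second number"))))))
```

Under these assumptions, first-page holds only the first prompt, and there is nothing left on the server that could continue with the second read.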

We therefore must turn this server function into three separate functions: one that shows the prompt for the first number, one that gets the value entered and shows the second prompt, and a third that shows the results page. Each of the first two functions sends its page to the browser (at which point its computation dies), and that page includes a form submission URL that will invoke the next function.

Assuming a generic “query argument” that represents the browser request, and a return value that represents a page for the browser to render, we have:

(define (f1 query)
  ... show the first question ...)

(define (f2 query)
  ... extract the number from the query ...
  ... show the second question ...)

(define (f3 query)
  ... extract the number from the query ...
  ... show the sum ...)

Note that f2 receives the first number directly, but f3 doesn’t. Yet, it is obviously needed to show the sum. A typical hack to get around this is to use a “hidden field” in the HTML form that f2 generates, where that field holds the first number. To make things more concrete, we’ll use some imaginary web API functions:

(define (f1 query)
  (web-read "First number" 'n1 "f2"))

(define (f2 query)
  (let ([n1 (get-field query 'n1)])
    ;; imagine that the following "configures" what web-read
    ;; produces by adding a hidden field to display
    (with-hidden-field 'n1 n1
      (web-read "Second number" 'n2 "f3"))))

(define (f3 query)
  (web-display
    "Your two numbers sum up to: "
    (+ (get-field query 'n1)
      (get-field query 'n2))))

This would (supposedly) result in something like the following HTML forms when the user enters 1 and 2:

http://.../f1
<form action="http://.../f2">
  First number:
  <input type="text" name="n1" />
</form>

http://.../f2
<form action="http://.../f3">
  <input type="hidden" name="n1" value="1" />
  Second number:
  <input type="text" name="n2" />
</form>

http://.../f3
<p>Your two numbers sum up to: 3</p>

This is often a bad solution: it gets very difficult to manage with real services where the “state” of the server consists of much more than just a single number — and it might even include values that are not expressible as part of the form (for example an open database connection or a running process). Worse, the state is all saved in the client browser — if it dies, then the interaction is gone. (Imagine doing your taxes, and praying that the browser won’t crash a third time.)
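The hidden-field approach can be simulated without a browser. The following is a rough sketch under simplifying assumptions: a query is modeled as an immutable hash, a page as a struct, and submit-form is an invented helper that plays the role of the browser submitting a form. None of these names belong to a real web library.

```racket
#lang racket

;; A page is modeled as: the text shown, the name of the visible input
;; field (or #f), the hidden fields, and the name of the next handler.
(struct page (text field hidden next) #:transparent)

(define (f1 query)
  (page "First number" 'n1 (hash) 'f2))

(define (f2 query)
  (page "Second number" 'n2
        (hash 'n1 (hash-ref query 'n1)) ; carry the first number along
        'f3))

(define (f3 query)
  (page (format "Your two numbers sum up to: ~a"
                (+ (hash-ref query 'n1) (hash-ref query 'n2)))
        #f (hash) #f))

;; Simulates the browser submitting a form: the resulting query holds
;; the page's hidden fields plus the value that the user typed.
(define (submit-form p value)
  (hash-set (page-hidden p) (page-field p) value))

(define p1 (f1 (hash)))
(define p2 (f2 (submit-form p1 1)))
(define p3 (f3 (submit-form p2 2)))
```

Note that all of the state lives in the page value itself: submitting p2 a second time with a different number gives an independent result, which is the good side of this approach, and losing p2 loses the interaction, which is the bad side.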

Another common approach is to store the state information on the server, and use a small handle (eg, in a cookie) to identify the state, then each function can use the cookie to retrieve the current state of the service — but this is exactly how we get to the above bugs. It will fail with any of the mentioned time-manipulation features.
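Here is a small sketch of how a single server-side slot per cookie goes wrong once two windows share that cookie. The store, the function names, and the two-window scenario are all made up for illustration:

```racket
#lang racket

;; Hypothetical server-side store: one mutable "current selection"
;; per session cookie.
(define sessions (make-hash))

;; Viewing an item records it as the session's current selection and
;; returns what the window now shows.
(define (view-item! cookie item)
  (hash-set! sessions cookie item)
  item)

;; The action uses whatever the server recorded last, not what the
;; window that was clicked actually shows.
(define (act-on-selection cookie)
  (hash-ref sessions cookie))

(define window-1 (view-item! 'some-user "item A")) ; first window
(define window-2 (view-item! 'some-user "item B")) ; second window, same cookie
(define acted-on (act-on-selection 'some-user))    ; click in the first window
```

The click happens in a window that shows item A, but the server acts on item B, since the second window silently overwrote the shared state.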

Continuations: Web Programming

To try and get a better solution, we’ll re-start with the original expression:

(web-display (+ (web-read "First number")
                (web-read "Second number")))

and assuming that web-read works as a regular function, we need to begin with executing the first read:

                (web-read "First number")

We then need to take that result and plug it into an expression that will read the second number and sum the results — that’s the same as the first expression, except that instead of the first web-read we use a “hole”:

(web-display (+ <*>
                (web-read "Second number")))

where <*> marks the point where we need to plug the result of the first question into. A better way to explain this hole is to make the expression into a function:

(lambda (<*>)
  (web-display (+ <*>
                  (web-read "Second number"))))

We can split the second and third interactions in the same way. First we can assemble the above two bits of code into an expression that has the same meaning as the original one:

((lambda (<*>)
   (web-display (+ <*> (web-read "Second number"))))
 (web-read "First number"))

And now we can continue doing this and split the body of the consumer:

  (web-display (+ <*> (web-read "Second number")))

into a “reader” and the rest of the computation (using a new hole):

                      (web-read "Second number")  ; reader part

  (web-display (+ <*> <*2>))                      ; rest of comp

Doing all of this gives us:

((lambda (<*1>)
   ((lambda (<*2>)
      (web-display (+ <*1> <*2>)))
    (web-read "Second number")))
 (web-read "First number"))
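As a sanity check that this rewriting does not change the meaning, the following sketch runs both forms against canned answers. The web-read and web-display here are testing stand-ins invented for this purpose (they read from a box instead of a user), and plain names replace the hole markers:

```racket
#lang racket

;; Testing stand-ins: `web-read` pops the next canned answer and
;; `web-display` just returns its argument.
(define inputs (box '()))

(define (web-read prompt)
  (define ins (unbox inputs))
  (set-box! inputs (rest ins))
  (first ins))

(define (web-display n) n)

;; The original expression.
(define (direct)
  (web-display (+ (web-read "First number")
                  (web-read "Second number"))))

;; The same computation, split into a reader and a consumer twice.
(define (split)
  ((lambda (n1)
     ((lambda (n2)
        (web-display (+ n1 n2)))
      (web-read "Second number")))
   (web-read "First number")))

;; Runs a thunk with the given list of canned answers.
(define (run-with answers thunk)
  (set-box! inputs answers)
  (thunk))
```

Both (run-with '(1 2) direct) and (run-with '(1 2) split) perform the reads in the same order and produce the same sum.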

And now we can proceed to the main trick. Conceptually, we’d like to think about web-read as something that is implemented in a simple way:

(define (web-read prompt)
  (printf "~a: " prompt)
  (read-number))

except that the “real” thing would throw an error and die once the prompt is printed. The trick is one that we’ve already seen: we can turn the code inside-out by making the above “hole functions” be an argument to the reading function — a consumer callback for what needs to be done once the number is read. This callback is called a continuation, and we’ll use a /k suffix for names of functions that expect a continuation (k is a common name for a continuation argument):

(define (web-read/k prompt k)
  (printf "~a: " prompt)
  (k (read-number)))

This is not too different from the previous version — the only difference is that we make the function take a consumer function as an input, and hand it what we read instead of just returning it. It makes things a little easier, since we pass the hole function to web-read/k, and it will invoke it when needed:

(web-read/k "First number"
  (lambda (<*1>)
    (web-read/k "Second number"
      (lambda (<*2>)
        (web-display (+ <*1> <*2>))))))

You might notice that this looks too complicated; we could get exactly the same result with:

(web-display (+ (web-read/k "First number"
                            (lambda (<*>) <*>))
                (web-read/k "Second number"
                            (lambda (<*>) <*>))))

but then there’s not much point to having web-read/k at all… So why have it? Remember that the main problem is that in the context of a web server we think of web-read as something that throws an error and kills the computation. So if we use such a web-read/k with a continuation, we can make it save this continuation in some global state, so it can be used later when there is a value.

As a side note, all of this might start looking very familiar to you if you have any experience working with callback-heavy code. In fact, the continuation (or k) is basically just a callback, so the above is roughly:

webRead("First number", function(a) {
  webRead("Second number", function(b) {
    webDisplay(a + b);
  });
});

if you follow the JS convention of having a plain name be a callback-able function (vs the fooSync variants).

Simulating web reading

We can now actually try all of this in plain Racket by simulating web interactions. This is useful for looking at the core problem while avoiding the whole web mess, which is uninteresting for the purpose of our discussion. The main feature that we need to emulate is statelessness, and as we’ve discussed, we can simulate that using error to guarantee that the process is properly killed for each interaction. We will do this in web-display, which simulates sending the results to the client and therefore terminates the server process:

(define (web-display n)
  (error 'web-display "~s" n))

More importantly, we need to do it in web-read/k — but in this case, we need more than just an error — we need a way to store the k so the computation can be resumed later. To continue with the web analogy we do this in two steps: error is used to display the information (the input prompt), and the user action of entering a number and submitting it will be simulated by calling a function. Since the computation is killed after we show the prompt, the way to implement this is by making the user call a toplevel submit function — and before throwing the interaction error, we’ll save the k continuation in a global box:

(define (web-read/k prompt k)
  (set-box! resumer k)
  (error 'web-read
        "enter (submit N) to continue the following\n  ~a:"
        prompt))

submit uses the saved continuation:

(define (submit n)
  ((unbox resumer) n))

For safety, we’ll initialize resumer with a function that throws an error (a real one, not intended for interactions), make web-display reset it to the same function, and also make submit do so after grabbing its value — meaning that submit can only be used after a web-read/k. And for convenience, we’ll use raise-user-error instead of error, which is a Racket function that throws an error without a stack trace (since our errors are intended). It’s also helpful to disable debugging in DrRacket, so it won’t take us back to the code over and over again.


web-base-library.rkt
;; Fake web interaction library (to be used with manual code CPS-ing
;; examples)

#lang racket

(define error raise-user-error)

(define (nothing-to-do ignored)
  (error 'REAL-ERROR "No computation to resume."))

(define resumer (box nothing-to-do))

(define (web-display n)
  (set-box! resumer nothing-to-do)
  (error 'web-display "~s" n))

(define (web-read/k prompt k)
  (set-box! resumer k)
  (error 'web-read
        "enter (submit N) to continue the following\n  ~a:"
        prompt))

(define (submit n)
  ;; to avoid mistakes, we clear out `resumer' before invoking it
  (let ([k (unbox resumer)])
    (set-box! resumer nothing-to-do)
    (k n)))

We can now try out our code for the addition server, using plain argument names instead of <*>s:

(web-read/k "First number"
  (lambda (n1)
    (web-read/k "Second number"
      (lambda (n2)
        (web-display (+ n1 n2))))))

and see how everything works. You can now also try the bogus expression that we mentioned:

(web-display (+ (web-read/k "First number"  (lambda (n) n))
                (web-read/k "Second number" (lambda (n) n))))

and see how it breaks: the first web-read/k saves the identity function as the global resumer, losing the rest of the computation.
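If you prefer to watch both runs without typing at the REPL, here is a sketch that drives the fake library programmatically. The library definitions are repeated so the block runs on its own; the step helper is an addition made here (it catches the interaction error and returns its message, standing in for the page a browser would show):

```racket
#lang racket

;; The fake web library from above, repeated for self-containment.
(define error raise-user-error)
(define (nothing-to-do ignored)
  (error 'REAL-ERROR "No computation to resume."))
(define resumer (box nothing-to-do))
(define (web-display n)
  (set-box! resumer nothing-to-do)
  (error 'web-display "~s" n))
(define (web-read/k prompt k)
  (set-box! resumer k)
  (error 'web-read
         "enter (submit N) to continue the following\n  ~a:"
         prompt))
(define (submit n)
  (let ([k (unbox resumer)])
    (set-box! resumer nothing-to-do)
    (k n)))

;; Invented helper: run one interaction and return what the user sees
;; (or the plain result, if the computation did not abort).
(define (step thunk)
  (with-handlers ([exn:fail:user? exn-message])
    (thunk)))

;; The proper version: each submit resumes the saved continuation.
(define good-1
  (step (lambda ()
          (web-read/k "First number"
            (lambda (n1)
              (web-read/k "Second number"
                (lambda (n2)
                  (web-display (+ n1 n2)))))))))
(define good-2 (step (lambda () (submit 1))))
(define good-3 (step (lambda () (submit 2))))

;; The bogus version: only the identity function gets saved.
(define bad-1
  (step (lambda ()
          (web-display (+ (web-read/k "First number"  (lambda (n) n))
                          (web-read/k "Second number" (lambda (n) n)))))))
(define bad-2 (step (lambda () (submit 1))))
(define bad-3 (step (lambda () (submit 2))))
```

In the proper run, good-3 is the final display message. In the bogus run, bad-2 is just the number that was submitted (the identity continuation returns it and nothing else happens), and bad-3 is the real error, since there is no computation left to resume.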


Again, this should be familiar: we’ve taken a simple compound expression and “linearized” it as a sequence of an input operation and a continuation receiver for its result. This is essentially the same thing that we used for dealing with inputs in the lazy language — and the similarity is not a coincidence. The problem that we faced there was very different (representing IO as values that describe it), but it originates from a similar situation — some computation goes on (in whatever way the lazy language decides to evaluate it), and when we have a need to read something we must return a description of this read that contains “the rest of the computation” to the eager part of the interpreter that executes the IO. Once we get the user input, we send it to this computation remainder, which can return another read request, and so on.
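The connection can be made concrete with a small sketch, which is not the lazy-language implementation itself but a simplified model of the same idea: a computation returns a description of the IO it wants, and a separate eager driver performs it. The struct names and the run driver are invented here, and the driver reads from a list of canned inputs rather than from a user:

```racket
#lang racket

;; A step is either a request to read, holding the rest of the
;; computation as a function of the input, or a final value to show.
(struct read-req (prompt k))
(struct display-req (value))

;; The addition service as a pure description: nothing is read here.
(define addition
  (read-req "First number"
            (lambda (n1)
              (read-req "Second number"
                        (lambda (n2)
                          (display-req (+ n1 n2)))))))

;; The eager driver: performs each described read and sends the input
;; to the remainder of the computation, until a display is reached.
(define (run req inputs)
  (match req
    [(display-req v) v]
    [(read-req _ k) (run (k (first inputs)) (rest inputs))]))
```

For example, (run addition '(1 2)) feeds 1 to the first remainder, gets back a second read request, feeds it 2, and ends with the sum.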

Based on this intuition, we can guess that this can work for any piece of code, and that we can even come up with a nicer “flat” syntax for it. For example, here is a simple macro that flattens a sequence of reads and a final display:

(define-syntax web-code
  (syntax-rules (read display)
    [(_ (read n prompt) more ...)
     (web-read/k prompt
       (lambda (n)
         (web-code more ...)))]
    [(_ (display last))
     (web-display last)]))

and using it:

(web-code (read x "First number")
          (read y "Second number")
          (display (+ x y)))
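To check that the macro nests any number of reads correctly, here is a sketch that uses it with three reads. To keep it runnable as a plain test, web-read/k and web-display are replaced with synchronous stand-ins that take canned answers (an assumption made for this example only, not the error-based library):

```racket
#lang racket

;; Testing stand-ins: `web-read/k` hands the next canned answer to its
;; continuation right away, and `web-display` returns its argument.
(define inputs (box '()))

(define (web-read/k prompt k)
  (define ins (unbox inputs))
  (set-box! inputs (rest ins))
  (k (first ins)))

(define (web-display n) n)

(define-syntax web-code
  (syntax-rules (read display)
    [(_ (read n prompt) more ...)
     (web-read/k prompt
       (lambda (n)
         (web-code more ...)))]
    [(_ (display last))
     (web-display last)]))

;; Each `read` clause becomes one more level of nested lambdas.
(define (sum-of-three answers)
  (set-box! inputs answers)
  (web-code (read x "First number")
            (read y "Second number")
            (read z "Third number")
            (display (+ x y z))))
```

Since each read clause wraps the remaining clauses in a lambda, all of x, y, and z are in scope in the final display expression.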

However, we’ll avoid such cuteness to make the transformation more explicit for the sake of the discussion. Eventually, we’ll see how things can become even better than that (as done in Racket): we can get to write plain-looking Racket expressions and avoid even the need for an imperative form for the code. In fact, it’s easy to write this addition server using Racket’s web server framework, and the core of the code looks very simple:

(define (start initial-request)
  (page "The sum is: "
        (+ (web-read "First number")
          (web-read "Second number"))))

There is not much more than that: it has two utilities, where page creates a well-formed web page and web-read performs the reading. The main piece of magic is in send/suspend, which makes the web server capture the computation’s continuation and store it in a hash table, to be retrieved when the user visits the given URL. Here’s the full code:

#lang web-server/insta
(define (page . body)
  (response/xexpr
   `(html (body ,@(map (lambda (x)
                         (if (number? x) (format "~a" x) x))
                       body)))))
(define (web-read prompt)
  ((compose string->number (curry extract-binding/single 'n)
            request-bindings send/suspend)
   (lambda (k)
     (page `(form ([action ,k])
              ,prompt ": " (input ([type "text"] [name "n"])))))))
(define (start initial-request)
  (page "The sum is: "
        (+ (web-read "First number")
           (web-read "Second number"))))
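To get a feeling for why storing continuations in a hash table helps, here is a sketch in the style of our fake library. It is not how the Racket web server implements send/suspend: it uses explicit continuation arguments, and the key format, the two-argument submit, and the step helper are all invented for this example. The idea is that each pending continuation gets its own key (standing in for a unique URL) instead of overwriting a single global slot:

```racket
#lang racket

;; Every saved continuation gets a fresh key, and is never overwritten.
(define continuations (make-hash))
(define counter 0)

(define (web-display n)
  (raise-user-error 'web-display "~s" n))

(define (web-read/k prompt k)
  (set! counter (add1 counter))
  (define key (format "k~a" counter))
  (hash-set! continuations key k)
  (raise-user-error 'web-read "~a: use (submit ~s N)" prompt key))

;; Resume the continuation stored under the given key.
(define (submit key n)
  ((hash-ref continuations key) n))

;; Run one interaction and return the message that the user sees.
(define (step thunk)
  (with-handlers ([exn:fail:user? exn-message])
    (thunk)))

(define (start)
  (web-read/k "First number"
    (lambda (n1)
      (web-read/k "Second number"
        (lambda (n2)
          (web-display (+ n1 n2)))))))

(define page-1  (step start))                        ; saves "k1"
(define page-2a (step (lambda () (submit "k1" 1))))  ; saves "k2", with n1 = 1
(define page-2b (step (lambda () (submit "k1" 10)))) ; "cloned tab": saves "k3", with n1 = 10
(define result-a (step (lambda () (submit "k2" 2))))
(define result-b (step (lambda () (submit "k3" 2))))
```

Submitting to "k1" twice simulates going back or cloning a tab: each submission creates its own second-stage continuation that remembers its own first number, so the two futures do not interfere with each other.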