PL: Lecture #2  Tuesday, January 8th
(text file)

Intro to Racket

Side-note: “Goto Statement Considered Harmful”

A review of “Goto Statement Considered Harmful”, by E.W. DIJKSTRA

This paper tries to convince us that the well-known goto statement should be eliminated from our programming languages or, at least (since I don’t think that it will ever be eliminated), that programmers should not use it. It is not clear what should replace it. The paper doesn’t explain to us what would be the use of the if statement without a goto to redirect the flow of execution: Should all our postconditions consist of a single statement, or should we only use the arithmetic if, which doesn’t contain the offensive goto?

And how will one deal with the case in which, having reached the end of an alternative, the program needs to continue the execution somewhere else?

The author is a proponent of the so-called “structured programming” style, in which, if I get it right, gotos are replaced by indentation. Structured programming is a nice academic exercise, which works well for small examples, but I doubt that any real-world program will ever be written in such a style. More than 10 years of industrial experience with Fortran have proved conclusively to everybody concerned that, in the real world, the goto is useful and necessary: its presence might cause some inconveniences in debugging, but it is a de facto standard and we must live with it. It will take more than the academic elucubrations of a purist to remove it from our languages.

Publishing this would waste valuable paper: Should it be published, I am as sure it will go uncited and unnoticed as I am confident that, 30 years from now, the goto will still be alive and well and used as widely as it is today.

Confidential comments to the editor: The author should withdraw the paper and submit it someplace where it will not be peer reviewed. A letter to the editor would be a perfect choice: Nobody will notice it there!

Quick Intro to Racket

Racket syntax: similar to other Sexpr-based languages.

Reminder: the parens can be compared to C/etc function call parens — they always mean that some function is applied. This is the reason why (+ (1) (2)) won’t work: if you use C syntax that is +(1(), 2()) but 1 isn’t a function so 1() is an error.

An important difference between syntax and semantics. A good way to think about this is the difference between the string 42 stored in a file somewhere (two ASCII values), and the number 42 stored in memory (in some representation). You could also continue with the above example: there is nothing wrong with “murder” — it’s just a word, but murder is something you’ll go to jail for. The evaluation function that Racket uses is actually a function that takes a piece of syntax and returns (or executes) its semantics.


define expressions are used for creating new bindings, do not try to use them to change values. For example, you should not try to write something like (define x (+ x 1)) in an attempt to mimic x = x+1. It will not work.


There are two boolean values built in to Racket: #t (true) and #f (false). They can be used in if statements, for example:

(if (< 2 3) 10 20)  -->  10

because (< 2 3) evaluates to #t. As a matter of fact, any value except for #f is considered to be true, so:

(if 0 1 2)        -->  1  ; all of these are "truthy"
(if "false" 1 2)  -->  1
(if "" 1 2)      -->  1
(if null 1 2)    -->  1
(if #t 1 2)      -->  1  ; including the true value
(if #f 1 2)      -->  2  ; the only false value
(if #false 1 2)  -->  2  ; another way to write it
(if false 1 2)    -->  2  ; also false since it's bound to #f

Note: Racket is a functional language — so everything has a value.

This means that the expression

(if test consequent)

has no meaning when test evaluates to #f. This is unlike Pascal/C where statements do something (side effect) like printing or an assignment — here an if statement with no alternate part will just do nothing if the test is false… Racket, however, must return some value — it could decide on simply returning #f (or some unspecified value) as the value of

(if #f something)

as some implementations do, but Racket just declares it a syntax error. (As we will see in the future, Racket has a more convenient when with a clearer intention.)


Well, almost everything is a value…

There are certain things that are part of Racket’s syntax — for example if and define are special forms, they do not have a value! More about this shortly.

(Bottom line: much more things do have a value, compared with other languages.)


cond is used for a ifelse ifelse ifelse … sequence. The problem is that nested ifs are inconvenient. For example,

(define (digit-num n)
  (if (<= n 9)
    1
    (if (<= n 99)
      2
      (if (<= n 999)
        3
        (if (<= n 9999)
          4
          "a lot")))))

In C/Java/Whatever, you’d write:

function digit_num(n) {
  if (n <= 9)        return 1;
  else if (n <= 99)  return 2;
  else if (n <= 999)  return 3;
  else if (n <= 9999) return 4;
  else return "a lot";
}

(Side question: why isn’t there a return statement in Racket?)

But trying to force Racket code to look similar:

(define (digit-num n)
  (if (<= n 9)
    1
  (if (<= n 99)
    2
  (if (<= n 999)
    3
  (if (<= n 9999)
    4
    "a lot")))))

is more than just bad taste — the indentation rules are there for a reason, the main one is that you can see the structure of your program at a quick glance, and this is no longer true in the above code. (Such code will be penalized!)

So, instead of this, we can use Racket’s cond statement, like this:

(define (digit-num n)
  (cond [(<= n 9)    1]
        [(<= n 99)  2]
        [(<= n 999)  3]
        [(<= n 9999) 4]
        [else        "a lot"]))

Note that else is a keyword that is used by the cond form — you should always use an else clause (for similar reasons as an if, to avoid an extra expression evaluation there, and we will need it when we use a typed language). Also note that square brackets are read by DrRacket like round parens, it will only make sure that the paren pairs match. We use this to make code more readable — specifically, there is a major difference between the above use of [] from the conventional use of (). Can you see what it is?

The general structure of a cond:

(cond [test-1 expr-1]
      [test-2 expr-2]
      ...
      [test-n expr-n]
      [else  else-expr])

Example for using an if expression, and a recursive function:

(define (fact n)
  (if (zero? n)
    1
    (* n (fact (- n 1)))))

Use this to show the different tools, especially:

An example of converting it to tail recursive form:

(define (helper n acc)
  (if (zero? n)
    acc
    (helper (- n 1) (* acc n))))

(define (fact n)
  (helper n 1))

Additional notes about homework submissions:

Lists & Recursion

Lists are a fundamental Racket data type.

A list is defined as either:

  1. the empty list (null, empty, or '()),

  2. a pair (cons cell) of anything and a list.

As simple as this may seem, it gives us precise formal rules to prove that something is a list.

Examples:

null
(cons 1 null)
(cons 1 (cons 2 (cons 3 null)))
(list 1 2 3) ; a more convenient function to get the above

List operations — predicates:

null? ; true only for the empty list
pair? ; true for any cons cell
list? ; this can be defined using the above

We can derive list? from the above rules:

(define (list? x)
  (if (null? x)
    #t
    (and (pair? x) (list? (rest x)))))

or better:

(define (list? x)
  (or (null? x)
      (and (pair? x) (list? (rest x)))))

But why can’t we define list? more simply as

(define (list? x)
  (or (null? x) (pair? x)))

The difference between the above definition and the proper one can be observed in the full Racket language, not in the student languages (where there are no pairs with non-list values in their tails).

List operations — destructors for pairs (cons cells):

first
rest

Traditionally called car, cdr.

Also, any c<x>r combination for <x> that is made of up to four as and/or ds — we will probably not use much more than cadr, caddr etc.


Example for recursive function involving lists:

(define (list-length list)
  (if (null? list)
    0
    (+ 1 (list-length (rest list)))))

Use different tools, esp:

How come we could use list as an argument — use the syntax checker

(define (list-length-helper list len)
  (if (null? list)
    len
    (list-length-helper (rest list) (+ len 1))))

(define (list-length list)
  (list-length-helper list 0))

Main idea: lists are a recursive structure, so functions that operate on lists should be recursive functions that follow the recursive definition of lists.

Another example for list function — summing a list of numbers

(define (sum-list l)
  (if (null? l)
    0
    (+ (first l) (sum-list (rest l)))))

Also show how to implement rcons, using this guideline.


More examples:

Define reverse — solve the problem using rcons.

rcons can be generalized into something very useful: append.

Redefine reverse using tail recursion.

Some Style

When you have some common value that you need to use in several places, it is bad to duplicate it. For example:

(define (how-many a b c)
  (cond [(> (* b b) (* 4 a c)) 2]
        [(= (* b b) (* 4 a c)) 1]
        [(< (* b b) (* 4 a c)) 0]))

What’s bad about it?

In general, the ability to use names is probably the most fundamental concept in computer science — the fact that makes computer programs what they are.

We already have a facility to name values: function arguments. We could split the above function into two like this:

(define (how-many-helper b^2 4ac) ; note: valid names!
  (cond [(> b^2 4ac) 2]
        [(= b^2 4ac) 1]
        [else        0]))

(define (how-many a b c)
  (how-many-helper (* b b) (* 4 a c)))

But instead of the awkward solution of coming up with a new function just for its names, we have a facility to bind local names — let. In general, the syntax for a let special form is

(let ([id expr] ...) expr)

For example,

(let ([x 1] [y 2]) (+ x y))

But note that the bindings are done “in parallel”, for example, try this:

(let ([x 1] [y 2])
  (let ([x y] [y x])
    (list x y)))

(Note that “in parallel” is quoted here because it’s not really parallelism, but just a matter of scopes: the RHSs are all evaluated in the surrounding scope!)

Using this for the above problem:

(define (how-many a b c)
  (let ([b^2 (* b b)]
        [4ac (* 4 a c)])
    (cond [(> b^2 4ac) 2]
          [(= b^2 4ac) 1]
          [else        0])))

Some notes on writing code (also see the style-guide in the handouts section)

Code quality will be graded to in this course!


The fact that in Racket we can use functions as values is very useful — for example, map, foldl & foldr, many more.

Example:

;; every? : (A -> Boolean) (Listof A) -> Boolean
;; Returns false if any element of lst fails
;; the given pred, true if all pass pred.
(define (every? pred lst)
  (or (null? lst)
      (and (pred (first lst))
          (every? pred (rest lst)))))

Tail calls

You should generally know what tail calls are, but here’s a quick review of the subject. A function call is said to be in tail position if there is no context to “remember” when you’re calling it. Very roughly, this means that function calls that are not nested in argument expressions of another call are tail calls. This definition is something that depends on a context, for example, in an expression like

(if (huh?)
  (foo (add1 (* x 3)))
  (foo (/ x 2)))

both calls to foo are tail calls, but they’re tail calls of this expression and therefore apply to this context. It might be that this code is inside another call, as in

(blah (if (huh?)
        (foo (add1 (* x 3)))
        (foo (/ x 2)))
      something-else)

and the foo calls are now not in tail position. The main feature of all Scheme implementations including Racket wrt tail calls is that calls that are in tail position of a function are said to be “eliminated”. That means that if we’re in an f function, and we’re about to call g in tail position and therefore whatever g returns would be the result of f too, then when Racket does the call to g it doesn’t bother keeping the f context — it won’t remember that it needs to “return” to f and will instead return straight to its caller. In other words, when you think about a conventional implementation of function calls as frames on a stack, Racket will get rid of a stack frame when it can.

Another way to see this is to use DrRacket’s stepper to step through a function call. The stepper is generally an alternative debugger, where instead of visualizing stack frames it assembles an expression that represents these frames. Now, in the case of tail calls, there is no room in such a representation to keep the call — and the thing is that in Racket that’s perfectly fine since these calls are not kept on the call stack.

Note that there are several names for this feature:

When should you use tail calls?

Often, people who are aware of tail calls will try to use them always. That’s not always a good idea. You should generally be aware of the tradeoffs when you consider what style to use. The main thing to remember is that tail-call elimination is a property that helps reducing space use (stack space) — often reducing it from linear space to constant space. This can obviously make things faster, but usually the speedup is just a constant factor since you need to do the same number of iterations anyway, so you just reduce the time spent on space allocation.

Here is one such example that we’ve seen:

(define (list-length-1 list)
  (if (null? list)
    0
    (+ 1 (list-length-1 (rest list)))))

;; versus

(define (list-length-helper list len)
  (if (null? list)
    len
    (list-length-helper (rest list) (+ len 1))))
(define (list-length-2 list)
  (list-length-helper list 0))

In this case the first (recursive) version version consumes space linear to the length of the list, whereas the second version needs only constant space. But if you consider only the asymptotic runtime, they are both O(length(l)).

A second example is a simple implementation of map:

(define (map-1 f l)
  (if (null? l) l (cons (f (first l)) (map-1 f (rest l)))))

;; versus

(define (map-helper f l acc)
  (if (null? l)
    (reverse acc)
    (map-helper f (rest l) (cons (f (first l)) acc))))
(define (map-2 f l)
  (map-helper f l '()))

In this case, both the asymptotic space and the runtime consumption are the same. In the recursive case we have a constant factor for the stack space, and in the iterative one (the tail-call version) we also have a similar factor for accumulating the reversed list. In this case, it is probably better to keep the first version since the code is simpler. In fact, Racket’s stack space management can make the first version run faster than the second — so optimizing it into the second version is useless.