# Lists & Recursion

Lists are a fundamental Racket data type.

A list is defined as either:

1. the empty list (`null`, `empty`, or `'()`),

2. a pair (`cons` cell) of anything and a list.

As simple as this may seem, it gives us precise formal rules to prove that something is a list.

• Why is there a “the” in the first rule?

Examples:

null
(cons 1 null)
(cons 1 (cons 2 (cons 3 null)))
(list 1 2 3) ; a more convenient function to get the above

List operations — predicates:

null? ; true only for the empty list
pair? ; true for any cons cell
list? ; this can be defined using the above

We can derive `list?` from the above rules:

(define (list? x)
  (if (null? x)
      #t
      (and (pair? x) (list? (rest x)))))

or better:

(define (list? x)
  (or (null? x)
      (and (pair? x) (list? (rest x)))))

But why can’t we define `list?` more simply as

(define (list? x)
  (or (null? x) (pair? x)))

The difference between the above definition and the proper one can be observed in the full Racket language, not in the student languages (where there are no pairs with non-list values in their tails).
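To see the difference concretely, here is a minimal sketch for the full Racket language. The names `proper-list?` and `sloppy-list?` are hypothetical, chosen to avoid shadowing the built-in `list?`, and `cdr` is used instead of `rest` because full Racket’s `rest` rejects pairs whose tail is not a list.

```racket
#lang racket

;; Hypothetical name: follows the two-rule definition of a list.
(define (proper-list? x)
  (or (null? x)
      (and (pair? x) (proper-list? (cdr x)))))

;; Hypothetical name: the "simpler" definition that only checks the outer shape.
(define (sloppy-list? x)
  (or (null? x) (pair? x)))

;; A pair whose tail is the number 2, which is not a list.
(define improper (cons 1 2))

(proper-list? improper) ; => #f, the tail is neither null nor a pair
(sloppy-list? improper) ; => #t, only the outermost pair is inspected
```

On proper lists such as `(list 1 2 3)` the two predicates agree, which is why the difference cannot be observed in the student languages.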

List operations — destructors for pairs (`cons` cells):

first
rest

Traditionally called `car`, `cdr`.

Also, any `c<x>r` combination for `<x>` that is made of up to four `a`s and/or `d`s — we will probably not use much more than `cadr`, `caddr` etc.
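As a small illustration of how these combinations read, the sketch below spells out a few of them next to their expanded forms. The variable names are only for illustration.

```racket
#lang racket

(define l (list 1 2 3 4))

;; Read the letters between `c` and `r` from right to left.
(define second-elt (cadr l))   ; same as (car (cdr l))
(define third-elt  (caddr l))  ; same as (car (cdr (cdr l)))
(define tail       (cddr l))   ; same as (cdr (cdr l))

second-elt ; => 2
third-elt  ; => 3
tail       ; => '(3 4)
```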

Example of a recursive function involving lists:

(define (list-length list)
  (if (null? list)
      0
      (+ 1 (list-length (rest list)))))

Use different tools, especially:

• syntax-checker
• stepper

How come we could use `list` as an argument name? Use the syntax checker to find out.

(define (list-length-helper list len)
  (if (null? list)
      len
      (list-length-helper (rest list) (+ len 1))))

(define (list-length list)
  (list-length-helper list 0))

Main idea: lists are a recursive structure, so functions that operate on lists should be recursive functions that follow the recursive definition of lists.

Another example of a list function: summing a list of numbers.

(define (sum-list l)
  (if (null? l)
      0
      (+ (first l) (sum-list (rest l)))))

Also show how to implement `rcons`, using this guideline.

More examples:

Define `reverse` — solve the problem using `rcons`.

`rcons` can be generalized into something very useful: `append`.

• How would we use `append` instead of `rcons`?

• How much time will this take? Does it matter if we use `append` or `rcons`?

Redefine `reverse` using tail recursion.

• Is the result more complex? (Yes, but not too bad because it collects the elements in reverse.)
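One possible way to work through these exercises is sketched below. The argument order of `rcons` (element first, list second) and the names `reverse-1`, `reverse-2`, and `reverse-helper` are assumptions made for this sketch, not definitions taken from the notes.

```racket
#lang racket

;; rcons: add an element at the END of a list (assumed argument order).
;; Follows the recursive definition of lists: empty case, then pair case.
(define (rcons x lst)
  (if (null? lst)
      (list x)
      (cons (first lst) (rcons x (rest lst)))))

;; reverse via rcons. Each rcons call walks the whole partial result,
;; so the total work is quadratic in the length of the list.
;; Using (append (reverse-1 (rest lst)) (list (first lst))) instead
;; walks the same partial result, so it is quadratic as well.
(define (reverse-1 lst)
  (if (null? lst)
      null
      (rcons (first lst) (reverse-1 (rest lst)))))

;; Tail-recursive reverse: the accumulator naturally collects the
;; elements in reverse order, so one linear pass is enough.
(define (reverse-helper lst acc)
  (if (null? lst)
      acc
      (reverse-helper (rest lst) (cons (first lst) acc))))
(define (reverse-2 lst)
  (reverse-helper lst null))

(rcons 4 (list 1 2 3))     ; => '(1 2 3 4)
(reverse-1 (list 1 2 3))   ; => '(3 2 1)
(reverse-2 (list 1 2 3))   ; => '(3 2 1)
```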

# Some Style

When you have some common value that you need to use in several places, it is bad to duplicate it. For example:

(define (how-many a b c)
  (cond [(> (* b b) (* 4 a c)) 2]
        [(= (* b b) (* 4 a c)) 1]
        [(< (* b b) (* 4 a c)) 0]))

• It’s longer than necessary, which will eventually make your code less readable.

• It’s slower: by the time you reach the last case, you have evaluated each of the two expressions three times.

• It’s more prone to bugs — the above code is short enough, but what if it was longer so you don’t see the three occurrences on the same page? Will you remember to fix all places when you debug the code months after it was written?

In general, the ability to use names is probably the most fundamental concept in computer science — the fact that makes computer programs what they are.

We already have a facility to name values: function arguments. We could split the above function into two like this:

(define (how-many-helper b^2 4ac) ; note: valid names!
  (cond [(> b^2 4ac) 2]
        [(= b^2 4ac) 1]
        [else        0]))

(define (how-many a b c)
  (how-many-helper (* b b) (* 4 a c)))

But instead of the awkward solution of coming up with a new function just for its names, we have a facility to bind local names — `let`. In general, the syntax for a `let` special form is

(let ([id expr] ...) expr)

For example,

(let ([x 1] [y 2]) (+ x y))

But note that the bindings are done “in parallel”, for example, try this:

(let ([x 1] [y 2])
  (let ([x y] [y x])
    (list x y)))

(Note that “in parallel” is quoted here because it’s not really parallelism, but just a matter of scopes: the RHSs are all evaluated in the surrounding scope!)
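The scoping rule can be made visible by renaming the inner bindings. In the sketch below, `new-x` and `new-y` are hypothetical names introduced only to show which `x` and `y` the right-hand sides refer to.

```racket
#lang racket

;; The original example: the inner RHSs see the OUTER x and y.
(define swapped
  (let ([x 1] [y 2])
    (let ([x y] [y x])
      (list x y))))

;; The same computation with the inner names made distinct.
(define swapped-renamed
  (let ([x 1] [y 2])
    (let ([new-x y] [new-y x])
      (list new-x new-y))))

swapped         ; => '(2 1)
swapped-renamed ; => '(2 1)
```

If the inner bindings were instead evaluated one after the other, the second right-hand side would see the new `x` and the result would be `'(2 2)`.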

Using this for the above problem:

(define (how-many a b c)
  (let ([b^2 (* b b)]
        [4ac (* 4 a c)])
    (cond [(> b^2 4ac) 2]
          [(= b^2 4ac) 1]
          [else        0])))
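A few quick checks of this version, as a sketch. The sample coefficients are chosen here for illustration and do not come from the notes.

```racket
#lang racket

(define (how-many a b c)
  (let ([b^2 (* b b)]
        [4ac (* 4 a c)])
    (cond [(> b^2 4ac) 2]
          [(= b^2 4ac) 1]
          [else        0])))

(how-many 1 0 -1) ; x^2 - 1:      b^2 = 0, 4ac = -4  => 2
(how-many 1 -2 1) ; x^2 - 2x + 1: b^2 = 4, 4ac = 4   => 1
(how-many 1 0 1)  ; x^2 + 1:      b^2 = 0, 4ac = 4   => 0
```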

Some notes on writing code (also see the style-guide in the handouts section)

Code quality will be graded too in this course!

• Use abstractions whenever possible, as said above. This is bad:

(define (how-many a b c)
  (cond
    [(> (* b b) (* 4 a c)) 2]
    [(= (* b b) (* 4 a c)) 1]
    [(< (* b b) (* 4 a c)) 0]))

(define (what-kind a b c)
  (cond
    [(= a 0) 'degenerate]
    [(> (* b b) (* 4 a c)) 'two]
    [(= (* b b) (* 4 a c)) 'one]
    [(< (* b b) (* 4 a c)) 'none]))

• But don’t over-abstract: `(define one 1)` or `(define two "two")`

• Always write test cases (show the coverage tool). You might want to comment them out, but you should always make sure your code works.

• Do not under-document, but also don’t over-document.

• INDENTATION! (Let DrRacket decide; get used to its rules.) This is part of the culture that was mentioned last time, but it’s done this way for good reason: decades of programming experience have shown this to be the most readable format. It’s also extremely important to keep good indentation, since programmers in all Lisps don’t count parens; they look at the structure.

• As a general rule, `if` should be either all on one line, or the condition on the first and each consequent on a separate line. Similarly for `define`: either all on one line, or a newline after the object that is being defined (either an identifier or an identifier with arguments).

• Another general rule: you should never have white space after an open-paren, or before a close paren (white space includes newlines). Also, before an open paren there should be either another open paren or white space, and the same goes for after a closing paren.

• Use the tools that are available to you: for example, use `cond` instead of nested `if`s (definitely do not force the indentation to make a nested `if` look like its C counterpart — remember to let DrRacket indent for you).

Another example — do not use `(+ 1 (+ 2 3))` instead of `(+ 1 2 3)` (this might be needed in extremely rare situations, only when you know your calculus and have extensive knowledge about round-off errors).

Another example — do not use `(cons 1 (cons 2 (cons 3 null)))` instead of `(list 1 2 3)`.

Also — don’t write things like:

(if (< x 100) #t #f)

since it’s the same as just

(< x 100)

A few more of these:

(if x #t y)  --same-as-->  (or x y)
(if x y #f)  --same-as-->  (and x y)
(if x #f #t)  --same-as-->  (not x)

(Actually the first of these is only almost the same: `(or 1 2)` returns `1`, whereas `(if 1 #t 2)` returns `#t`. Similarly, `(and 1 2)` will return `2`, not `#t`.)
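The following sketch checks how these rewrites behave when the values are not booleans. The names `x`, `y`, and the `via-...` definitions are hypothetical and only serve the comparison.

```racket
#lang racket

;; Any value other than #f counts as true in Racket.
(define x 1)
(define y 2)

(define via-if-or  (if x #t y)) ; always yields #t when x is true
(define via-or     (or x y))    ; yields the value of x itself
(define via-if-and (if x y #f))
(define via-and    (and x y))
(define via-if-not (if x #f #t))
(define via-not    (not x))

(list via-if-or via-or)   ; => '(#t 1), differ for non-boolean x
(list via-if-and via-and) ; => '(2 2)
(list via-if-not via-not) ; => '(#f #f)
```

When `x` is known to be a boolean, all three pairs agree exactly.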

• Use these as examples for many of these issues:

(define (interest x)
(* x (cond
[(and (> x 0) (<= x 1000)) 0.04]
[(and (> x 1000) (<= x 5000)) 0.045]
[else 0.05])))

(define (how-many a b c)
(cond ((> (* b b) (* (* 4 a) c))
2)
((< (* b b) (* (* 4 a) c))
0)
(else
1)))

(define (what-kind a b c)
(if (equal? a 0) 'degenerate
(if (equal? (how-many a b c) 0) 'zero
(if (equal? (how-many a b c) 1) 'one
'two)
)
)
)

(define (interest deposit)
(cond
[(< deposit 0) "invalid deposit"]
[(and (>= deposit 0) (<= deposit 1000)) (* deposit 1.04) ]
[(and (> deposit 1000) (<= deposit 5000)) (* deposit 1.045)]
[(> deposit 5000) (* deposit 1.05)]))

(define (interest deposit)
(if (< deposit 1001) (* 0.04 deposit)
(if (< deposit 5001) (* 0.045 deposit)
(* 0.05 deposit))))

(define (what-kind a b c) (cond ((= 0 a) 'degenerate)
(else (cond ((> (* b b)(*(* 4 a) c)) 'two)
(else (cond ((= (* b b)(*(* 4 a) c)) 'one)
(else 'none)))))));

The fact that in Racket we can use functions as values is very useful: for example, `map`, `foldl` and `foldr`, and many more.

Example:

;; every? : (A -> Boolean) (Listof A) -> Boolean
;; Returns false if any element of lst fails
;; the given pred, true if all pass pred.
(define (every? pred lst)
  (or (null? lst)
      (and (pred (first lst))
           (every? pred (rest lst)))))
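A short usage sketch for `every?`, repeating the definition so the block runs on its own. The sample lists are chosen here for illustration.

```racket
#lang racket

;; every? : (A -> Boolean) (Listof A) -> Boolean
(define (every? pred lst)
  (or (null? lst)
      (and (pred (first lst))
           (every? pred (rest lst)))))

(every? even? (list 2 4 6))  ; => #t
(every? even? (list 2 3 6))  ; => #f, stops at 3
(every? even? null)          ; => #t, vacuously true
(every? string? (list "a" "b")) ; => #t, any predicate can be passed in
```

Racket’s built-in `andmap` covers the same use case.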

# Tail calls

You should generally know what tail calls are, but here’s a quick review of the subject. A function call is said to be in tail position if there is no context to “remember” when you’re calling it. Very roughly, this means that function calls that are not nested in argument expressions of another call are tail calls. This definition is something that depends on a context, for example, in an expression like

(if (huh?)
    (foo (add1 x))
    (foo (/ x 2)))

both calls to `foo` are tail calls, but they’re tail calls of this expression and therefore apply to this context. It might be that this code is inside another call, as in

(blah (if (huh?)
          (foo (add1 x))
          (foo (/ x 2)))
      something-else)

and the `foo` calls are now not in tail position. The main feature of all Scheme implementations including Racket wrt tail calls is that calls that are in tail position of a function are said to be “eliminated”. That means that if we’re in an `f` function, and we’re about to call `g` in tail position and therefore whatever `g` returns would be the result of `f` too, then when Racket does the call to `g` it doesn’t bother keeping the `f` context — it won’t remember that it needs to “return” to `f` and will instead return straight to its caller. In other words, when you think about a conventional implementation of function calls as frames on a stack, Racket will get rid of a stack frame when it can.

Another way to see this is to use DrRacket’s stepper to step through a function call. The stepper is generally an alternative debugger, where instead of visualizing stack frames it assembles an expression that represents these frames. Now, in the case of tail calls, there is no room in such a representation to keep the call — and the thing is that in Racket that’s perfectly fine since these calls are not kept on the call stack.

Note that there are several names for this feature:

• “Tail recursion”. This is a common way to refer to the more limited optimization of only tail-recursive functions into loops. In languages that have tail calls as a feature, this is too limited, since they also optimize cases of mutual recursion, or any case of a tail call.

• “Tail call optimization”. In some languages, or more specifically in some compilers, you’ll hear this term. This is fine when tail calls are considered only an “optimization” — but in Racket’s case (as well as Scheme), it’s more than just an optimization: it’s a language feature that you can rely on. For example, a tail-recursive function like `(define (loop) (loop))` must run as an infinite loop, not just optimized to one when the compiler feels like it.

• “Tail call elimination”. This is so far the most common proper name for the feature: it’s not just recursion, and it’s not an optimization.
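To illustrate why “tail recursion” is too narrow a name, here is a minimal sketch of two mutually recursive functions. The names `my-even?` and `my-odd?` are hypothetical, chosen to avoid shadowing the built-ins. Neither function calls itself, yet every call is in tail position, so no context accumulates.

```racket
#lang racket

;; Each function ends by calling the OTHER one in tail position.
(define (my-even? n)
  (if (zero? n)
      #t
      (my-odd? (- n 1))))

(define (my-odd? n)
  (if (zero? n)
      #f
      (my-even? (- n 1))))

;; A million alternating calls, with no pending work left behind.
(my-even? 1000000) ; => #t
(my-odd? 7)        ; => #t
```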

### When should you use tail calls?

Often, people who are aware of tail calls will try to use them always. That’s not always a good idea. You should generally be aware of the tradeoffs when you consider what style to use. The main thing to remember is that tail-call elimination is a property that helps reduce space use (stack space), often from linear space to constant space. This can obviously make things faster, but usually the speedup is just a constant factor since you need to do the same number of iterations anyway, so you just reduce the time spent on space allocation.

Here is one such example that we’ve seen:

(define (list-length-1 list)
  (if (null? list)
      0
      (+ 1 (list-length-1 (rest list)))))

;; versus

(define (list-length-helper list len)
  (if (null? list)
      len
      (list-length-helper (rest list) (+ len 1))))
(define (list-length-2 list)
  (list-length-helper list 0))

In this case the first (recursive) version consumes space linear in the length of the list, whereas the second version needs only constant space. But if you consider only the asymptotic runtime, they are both O(length(l)).

A second example is a simple implementation of `map`:

(define (map-1 f l)
  (if (null? l) l (cons (f (first l)) (map-1 f (rest l)))))

;; versus

(define (map-helper f l acc)
  (if (null? l)
      (reverse acc)
      (map-helper f (rest l) (cons (f (first l)) acc))))
(define (map-2 f l)
  (map-helper f l '()))

In this case, both the asymptotic space and the runtime consumption are the same. In the recursive case we have a constant factor for the stack space, and in the iterative one (the tail-call version) we also have a similar factor for accumulating the reversed list. In this case, it is probably better to keep the first version since the code is simpler. In fact, Racket’s stack space management can make the first version run faster than the second — so optimizing it into the second version is useless.

# Note on Types

Types can become interesting when dealing with higher-order functions. For example, `map` receives a function and a list of some type, and applies the function over this list to accumulate its output, so its type is:

;; map : (A -> B) (Listof A) -> (Listof B)

Actually, `map` can use more than a single list: it will apply the function on the first element in all lists, then the second, and so on. So the type of `map` with two lists can be described as:

;; map : (A B -> C) (Listof A) (Listof B) -> (Listof C)
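For instance, the two-list form can be tried out as in the sketch below. The sample lists and the names `sums` and `pairs` are chosen here for illustration.

```racket
#lang racket

;; A = Number, B = Number, C = Number
(define sums (map + (list 1 2 3) (list 10 20 30)))

;; A = Symbol, B = Number, C = a pair of the two
(define pairs (map cons (list 'a 'b 'c) (list 1 2 3)))

sums  ; => '(11 22 33)
pairs ; => '((a . 1) (b . 2) (c . 3))
```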

Here’s a hairy example — what is the type of this function:

(define (foo x y)
  (map map x y))

Begin with what we know: both `map`s, call them `map1` and `map2`, have the double- and single-list types of `map` respectively. Here they are, with different names for the types:

;; the first `map', consumes a function and two lists
map1 : (A B -> C) (Listof A) (Listof B) -> (Listof C)

;; the second `map', consumes a function and one list
map2 : (X -> Y) (Listof X) -> (Listof Y)

Now, we know that `map2` is the first argument to `map1`, so the type of `map1`’s first argument should be the type of `map2`:

(A B -> C) = (X -> Y) (Listof X) -> (Listof Y)

From here we can conclude that

A = (X -> Y)

B = (Listof X)

C = (Listof Y)

If we use these equations in `map1`’s type, we get:

map1 : ((X -> Y) (Listof X) -> (Listof Y))
       (Listof (X -> Y))
       (Listof (Listof X))
       -> (Listof (Listof Y))

Now, `foo`’s two arguments are the 2nd and 3rd arguments of `map1`, and its result is `map1`’s result, so we can now write the type of `foo`:

;; foo : (Listof (X -> Y))
;;      (Listof (Listof X))
;;      -> (Listof (Listof Y))
(define (foo x y)
  (map map x y))

This should help you understand why, for example, this will cause a type error:

and why this is valid:
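To make the two cases concrete, here is one call of each kind. The specific arguments are assumptions chosen to match or violate the derived type of `foo`, not the calls from the original notes. Note also that plain `#lang racket` has no static type checker, so the ill-typed call shows up as a run-time error, which the sketch catches in order to stay runnable.

```racket
#lang racket

;; foo : (Listof (X -> Y))
;;       (Listof (Listof X))
;;       -> (Listof (Listof Y))
(define (foo x y)
  (map map x y))

;; Valid: a list of functions and a list of LISTS of numbers.
(define good
  (foo (list add1 sub1)
       (list (list 1 2 3) (list 10 20))))

;; Invalid: the second argument is a flat list of numbers, so the
;; inner `map` receives a number where it expects a list.
(define bad
  (with-handlers ([exn:fail? (lambda (e) 'error)])
    (foo (list add1 sub1)
         (list 1 2))))

good ; => '((2 3 4) (9 19))
bad  ; => 'error
```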

## Side-note: Names are important

An important “discovery” in computer science is that we don’t need names for every intermediate sub-expression — for example, in almost any language we can write the equivalent of:

s = (-b + sqrt(b^2 - 4*a*c)) / (2*a)

instead of

x = b * b
y = 4 * a
y = y * c
x = x - y
x = sqrt(x)
y = -b
x = y + x
y = 2 * a
s = x / y

Such languages are put in contrast to assembly languages, and were all put under the generic label of “high level languages”.

(Here’s an interesting idea — why not do the same for function values?)