PL: Lecture #23  Tuesday, November 27th

Macro Conclusions

PLAI §37.5

Macros are extremely powerful, but this also means that their usage should be restricted to situations where they are really needed. You can view any function (or value) that you provide as extending the current collection of tools, but macros are extensions that are much harder for your users to swallow than plain functions: evaluation can happen in any way, with any scope, unlike the uniform rules of function application. An analogy is that every function (or value) that you provide is equivalent to adding a noun to a vocabulary, while macros can add completely new rules for reading, since using them might result in a completely different evaluation. Because of this, adding macros carelessly can make code harder to read and debug, and when you do use them, their behavior should be as clear as possible for users.

When should a macro be used?

It is also important to note that macros should not be used too frequently. As said above, every macro adds a completely different way of reading your code, a way that doesn't use the usual “nouns” and “verbs”. But there are other reasons not to use a macro.

One common usage case is as an optimization — trying to avoid an extra function call. For example, this:

int min(int x, int y) {
  if ( x < y ) return x; else return y;
}

might seem wasteful if you don’t want a full function call on every usage of min. So you might be tempted to use this instead:

#define min(x,y) x<y ? x : y

Perhaps you even know the pitfalls of C macros, so you make it more robust:

#define min(x,y) (((x)<(y)) ? (x) : (y))

But small functions like the above are things that any decent compiler knows how to optimize, and even if your compiler doesn't, it's still not worth doing this optimization by hand, because programmer time is the most expensive factor in any computer system. In addition, a compiler performs these optimizations only when they are actually valid (eg, it is not possible to inline a recursive function), and it does the inlining properly, unlike the min CPP macro above, which is erroneous when x or y are expressions that have side effects: min(i++, j) expands to (((i++)<(j)) ? (i++) : (j)), which can increment i twice.

Side-note: macros in mainstream languages

Macros are an extremely powerful tool in Racket (and other languages in the Lisp family) — how come nobody else uses them?

Well, people have tried to use them in many contexts. The problem is that you cannot get away with a simple solution that does nothing more than textual manipulation of your programs. For example, the standard C preprocessor is a macro language, but it is fundamentally limited to very simple situations. This is still a hot topic these days, with modern languages trying out different solutions (or giving up and claiming that macros are evil).

Here is an example that was written by Mark Jason Dominus (author of “Higher Order Perl”) in a Perl mailing list post, as part of a discussion of macros in Lisp vs. other languages, including Perl's source transformers, which are supposed to fill a similar role.

The example starts with writing the following simple macro:

#define square(x) x*x

This doesn’t quite work because

2/square(10)

expands to

2/10*10

which is 2, but you wanted 0.02. So you need this instead:

#define square(x) (x*x)

but this breaks because

square(1+1)

expands to

(1+1*1+1)

which is 3, but you wanted 4. So you need this instead:

#define square(x) ((x)*(x))

But what about

x = 2;
square(x++)

which expands to

((x++)*(x++))

? So you need this instead:

int MYTMP;
#define square(x) (MYTMP = (x), MYTMP*MYTMP)

but now it only works for ints; you can’t do square(3.5) any more. To really fix this you have to use nonstandard extensions, something like:

#define square(x) ({typedef xtype = x; xtype xval = x; xval*xval; })

or more like:

#define square(x) \
  ({typedef xtype = (x); \
    xtype xval = (x); \
    xval*xval; })

And that’s just to get trivial macros, like “square()”, to work.
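
For contrast, here is roughly how this kind of macro can be written in Racket. This is only a sketch (this square is not part of any course code): the macro binds its input to a local variable, so the input expression is evaluated exactly once, and hygiene guarantees that the introduced binding cannot clash with anything in the user's code.

#lang racket

;; Evaluate the input expression once, bind it, then multiply;
;; the `val` binding is hygienic, so user code that has its own
;; `val` is unaffected.
(define-syntax-rule (square x)
  (let ([val x]) (* val val)))

(square (+ 1 1))   ; => 4     (no missing-parentheses surprise)
(/ 2 (square 10))  ; => 1/50  (no precedence surprise)

(define n 0)
(define (bump!) (set! n (add1 n)) n)
(square (bump!))   ; => 1, and n is now 1: the argument ran only once

Getting the same guarantees out of the C preprocessor is, as the quote shows, somewhere between painful and impossible.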


You should be able to appreciate now the tremendous power of macros. This is why so many “primitive features” of programming languages can be considered mere library functionality, given a good macro system. For example, people are used to thinking of OOP as some inherent property of a language — but Racket comes with at least two very different object systems, and several others are available in user-distributed code. All of these are implemented as libraries that provide the functionality as well as the necessary syntax, in the form of macros. So the basic principle is to have a small core language with powerful constructs, and to make it easy to express complex ideas using these constructs.

This is an important point to consider before starting a new DSL (reminder: domain-specific language) — if you need something that looks like a simple DSL but might grow into a full language, you can extend an existing language with macros to get the features you want, and you will always be able to grow into the full language if necessary. This is particularly easy with Racket, but possible in other languages too.


Side note: the principle of a powerful but simple core language with easy extensions is not limited to macros — other factors are involved, like first-class functions. In fact, “first class”-ness can help in many situations; for example, single inheritance plus classes as first-class values can be used instead of multiple inheritance.

Types

PLAI §24

In our Toy language implementation, there are certain situations that are not covered. For example,

{< {+ 1 2} 3}

is not a problem, but

{+ {< 1 2} 3}

will eventually use Racket’s addition function on a boolean value, which will crash our evaluator. Assuming that we go back to the simple language we once had, where there were no booleans, we can still run into errors — except now these are the errors that our code raises:

{+ {fun {} 1} 2}

or

{1 2 3}

or

{{fun {x y} {+ x y}} 5}

In any case, it would be good to avoid such errors right from the start — it seems like we should be able to identify such bad code and not even try to run it. One thing that we can do is do a little more work at parse time, and declare the {1 2 3} program fragment as invalid. We can even try to forbid

{bind {{x 1}} {x 2 3}}

in the same way, but what should we do with this? —

{fun {x} {x 2 3}}

The validity of this depends on how it is used. The same goes for some seemingly invalid expressions — the bogus {+ {< 1 2} 3} from above can be fine if it appears in a context that shadows <, for example by binding it to *:

{bind {{< *}}
  {+ {< 1 2} 3}}

Finally, consider this:

{+ 3 {if <mystery> 5 {fun {x} x}}}

where <mystery> is an expression that uses something like random or read. In general, knowing whether a piece of code will run without errors is equivalent to the halting problem — and because of this, there is no way to create an “exact” type system: all type systems are either too restrictive (rejecting programs that would run without errors) or too permissive (accepting programs that might crash). This is a very practical issue — type safety means far fewer bugs in the system. Designing a good type system is still an active research problem.

What is a Type?

PLAI §25

A type is any property of a program (or an expression) that can be determined without running the program. (This is different from what is considered a type in Racket, where a type is a property that is known only at run-time; before run-time we know nothing, so in the static sense we essentially have a single type.) Specifically, we want to use types in a way that predicts some aspects of the program's behavior, for example, whether a program will crash.

Usually, types describe the kind of value that an expression can evaluate to, not the precise value itself. For example, we might have two kinds of values — functions and numbers — and we know that addition always operates on numbers, therefore

{+ 1 {fun {x} x}}

is a type error. Note that to determine this we don’t care about the actual function, just the fact that it is a function.

Important: types can rule out certain programs as invalid, but they cannot distinguish correct programs from incorrect ones. For example, there is no way for any type system to know that this:

{fun {x} {+ x 1}}

is an incorrect decrease-by-one function.

In general, type systems try to get to the optimal point where as much information as possible is known, yet the language is not too restricted, no significant computing resources are wasted, and programmers don’t spend much time annotating their code.

Why would you want to use a type system?

Our Types — The Picky Language

The first thing we need to do is agree on what our types are. Earlier we talked about two kinds of values, numbers and functions (ignore booleans and anything else for now), so we will start with just these two types.

In general, this means that we are using the “types are sets of values” view of types, and specifically, we will be implementing a type system known as a Hindley-Milner system. This is not what Typed Racket uses. In fact, one of the main differences is that in our type system each binding has exactly one type, whereas in Typed Racket an identifier can have different types in different places in the code. An example of this is something that we've talked about earlier:

(: foo : (U String Number) -> Number)
(define (foo x)          ;\ these `x`s have a
  (if (number? x)        ;/ (U Number String) type
    (+ x 1)              ;> this one is a Number
    (string-length x)))  ;> and this one is a String

A type system is presented as a collection of rules called “type judgments”, which describe how to determine the type of an expression. Besides the types and the judgments, a type system specification needs a (decidable) algorithm that can assign types to expressions.

Such a specification should have one rule for every kind of syntactic construct, so when we get a program we can determine the precise type of any expression. Also, these judgments are usually recursive since a type judgment will almost always rely on the types of sub-expressions (if any).

For our restricted system, we have two rules (= judgments) that we can easily specify:

n : Number  (any numeral `n' is a number)
{fun {x} E} : Function

And what about an identifier? Well, it is clear that we need to keep some form of an environment that keeps track of the types assigned to identifiers (note: all of this is not at run-time). This environment is used in all type judgments, and is usually written as a capital Greek Gamma character (in some places a G is used to stick to ASCII text). The conventional way to write the two rules above is:

Γ ⊢ n : Number
Γ ⊢ {fun {x} E} : Function

The first one is read as “Gamma proves that n has the type Number”. Note that this is a syntactic environment, much like DE-ENVs that you have seen in homework.

So we can write a rule for identifiers that simply uses the type assigned by the environment:

Γ ⊢ x : Γ(x)

We now need a rule for addition and a rule for application (note: we're using a very limited subset of our old language, where arithmetic operators are not function applications). Addition is easy: if we can prove that both A and B are numbers in some environment Γ, then we know that {+ A B} is a number in the same environment. We write this as follows:

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {+ A B} : Number

Now, what about application? We need to refer to some arbitrary type now, and the common letter for that is a Greek lowercase tau:

Γ ⊢ F : Function  Γ ⊢ V : τₐ
—————————————————————————————
    Γ ⊢ {call F V} : ???

that is — if we can prove that F is a function, and that V is a value of some type τₐ, then … ??? Well, we need to know more about F: we need to know what type it consumes and what type it returns. So a plain Function type is not enough — we need a function type that specifies both the input and the output type. We will use the arrow notation that we have seen throughout the semester, and dump the plain Function type. Now we can write:

Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
——————————————————————————————
    Γ ⊢ {call F V} : τ₂

which makes sense — if you take a function of type (τ₁ -> τ₂) and you feed it what it expects, you get the obvious output type. But going back to the language — where do we get these new arrow types from? We will modify the language and require that every function specifies its input and output types (and assume that all functions take exactly one argument). For example, we would write something like this for a function that is the curried version of addition:

{fun {x : Number} : (Number -> Number)
  {fun {y : Number} : Number
    {+ x y}}}

(The whole fun expression above therefore has the type (Number -> (Number -> Number)).) So here is the revised syntax for the limited language that contains only additions, applications, and single-argument functions (and, for fun, we go back to using the call keyword):

<PICKY> ::= <num>
          | <id>
          | { + <PICKY> <PICKY> }
          | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
          | { call <PICKY> <PICKY> }

<TYPE>  ::= Number
          | ( <TYPE> -> <TYPE> )

and the typing rules are:

Γ ⊢ n : Number

Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

Γ ⊢ x : Γ(x)

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {+ A B} : Number

Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
——————————————————————————————
    Γ ⊢ {call F V} : τ₂

But we're still missing a big part — the current rule for a fun expression is too weak: it does not allow us to conclude that this expression:

{fun {x : Number} : (Number -> Number)
  3}

is invalid. Instead, it will make us think that this program:

{call {call {fun {x : Number} : (Number -> Number)
              3}
            5}
      7}

is valid, and should return a number. What's missing? We need to check that the body of the function matches the annotations, so the rule for typing a fun is no longer a simple one. Here is how we check the body instead of blindly believing the program's annotations:

          Γ[x:=τ₁] ⊢ E : τ₂
——————————————————————————————————————
Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

That is — we want to make sure that if x has type τ₁, then the body expression E has type τ₂, and if we can prove this, then we can trust these annotations.
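
With this rule, the bogus example above is indeed rejected: to give {fun {x : Number} : (Number -> Number) 3} an arrow type, we would need to prove

[x:=Number] ⊢ 3 : (Number -> Number)

and no rule assigns an arrow type to a numeral, so the proof cannot be completed.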

There is an important relationship between this rule and the call rule for application: the fun rule takes care of the callee side (assuming that the argument has type τ₁, the body must be shown to produce a τ₂), while the call rule takes care of the caller side (the caller must supply a τ₁, and may then rely on getting a τ₂ back).

(Side note: Racket comes with a contract system that can identify such errors dynamically, and assign blame to either the caller or the callee — and these two blame sides correspond to these two rules.)

Note that, as we said, Number is really just a property of a certain kind of values; it doesn't tell us exactly which numbers are used. In the same way, an arrow function type doesn't tell us exactly which function we have: for example, (Number -> Number) could be a function that adds three to its argument, subtracts seven from it, or multiplies it by 7619. But it certainly says much more than the naive Function type did. (Consider also Typed Racket here: it goes much further in expressing facts about code.)

For reference, here is the complete BNF and typing rules:

<PICKY> ::= <num>
          | <id>
          | { + <PICKY> <PICKY> }
          | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
          | { call <PICKY> <PICKY> }

<TYPE>  ::= Number
          | ( <TYPE> -> <TYPE> )

Γ ⊢ n : Number

Γ ⊢ x : Γ(x)

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {+ A B} : Number

          Γ[x:=τ₁] ⊢ E : τ₂
——————————————————————————————————————
Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
——————————————————————————————
    Γ ⊢ {call F V} : τ₂

Here are some examples of using these rules (we abbreviate Number as Num) — first, a simple example:

                {} ⊢ 5 : Num  {} ⊢ 7 : Num
                ———————————————————————————
{} ⊢ 2 : Num        {} ⊢ {+ 5 7} : Num
—————————————————————————————————————————————
          {} ⊢ {+ 2 {+ 5 7}} : Num

and a little more involved one:

    [x:=Num] ⊢ x : Num  [x:=Num] ⊢ 3 : Num
    ———————————————————————————————————————
          [x:=Num] ⊢ {+ x 3} : Num
———————————————————————————————————————————————
{} ⊢ {fun {x : Num} : Num {+ x 3}} : (Num -> Num)  {} ⊢ 5 : Num
——————————————————————————————————————————————————————————————
      {} ⊢ {call {fun {x : Num} : Num {+ x 3}} 5} : Num

Finally, try a buggy program like

{+ 3 {fun {x : Number} : Number x}}

and see where it is impossible to continue.
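
Specifically, the only way to type it would be to establish both premises of the addition rule:

{} ⊢ 3 : Number  {} ⊢ {fun {x : Number} : Number x} : Number
—————————————————————————————————————————————————————————————
      {} ⊢ {+ 3 {fun {x : Number} : Number x}} : Number

The left premise is an axiom, but the right one is where we get stuck: the only rule that applies to a fun expression gives it an arrow type, (Number -> Number) here, and never Number.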

The main thing to notice here is that to know that this is a type error, we have to show that there is no judgment that assigns a suitable type (in this case, that there is no way to prove that a fun expression has the type Num), which we (humans) can only do by inspecting all of the rules. Because of this, we also need to add an algorithm to our type system, one that we can follow mechanically and that tells us when to give up.
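
To make this concrete, here is a minimal sketch, in Racket, of what such an algorithm could look like. This is not the course's Picky implementation: the struct names (NumT, FunT, Num, Id, Add, Fun, Call) and the typecheck function are made up for this illustration. Each typing rule becomes one clause of a recursive function that carries Γ along, and “giving up” simply means raising an error when a premise cannot be established.

#lang racket

;; Types: (NumT) and (FunT in out), mirroring Number and (τ₁ -> τ₂)
(struct NumT ()              #:transparent)
(struct FunT (in out)        #:transparent)

;; Abstract syntax for the restricted Picky language
(struct Num  (n)             #:transparent) ; <num>
(struct Id   (name)          #:transparent) ; <id>
(struct Add  (l r)           #:transparent) ; {+ E1 E2}
(struct Fun  (x in out body) #:transparent) ; {fun {x : in} : out body}
(struct Call (f v)           #:transparent) ; {call F V}

;; The type environment Γ is an association list from identifiers to types
(define (lookup-type x Γ)
  (cond [(assq x Γ) => cdr]
        [else (error 'typecheck "free identifier: ~s" x)]))

;; typecheck : Expr TypeEnv -> Type
;; One clause per typing rule; errors out when no rule applies.
(define (typecheck expr Γ)
  (match expr
    [(Num n) (NumT)]                           ; Γ ⊢ n : Number
    [(Id x)  (lookup-type x Γ)]                ; Γ ⊢ x : Γ(x)
    [(Add l r)                                 ; both premises must be Number
     (unless (and (NumT? (typecheck l Γ)) (NumT? (typecheck r Γ)))
       (error 'typecheck "`+' expects two Numbers"))
     (NumT)]
    [(Fun x in out body)                       ; check the body in Γ[x:=τ₁]
     (unless (equal? (typecheck body (cons (cons x in) Γ)) out)
       (error 'typecheck "body doesn't have the declared output type"))
     (FunT in out)]
    [(Call f v)                                ; F : (τ₁ -> τ₂), V : τ₁
     (define f-type (typecheck f Γ))
     (unless (and (FunT? f-type) (equal? (typecheck v Γ) (FunT-in f-type)))
       (error 'typecheck "bad function call"))
     (FunT-out f-type)]))

;; {call {fun {x : Number} : Number {+ x 3}} 5}  ==>  (NumT)
(typecheck (Call (Fun 'x (NumT) (NumT) (Add (Id 'x) (Num 3))) (Num 5)) '())

;; The buggy  {+ 3 {fun {x : Number} : Number x}}  is rejected:
;; (typecheck (Add (Num 3) (Fun 'x (NumT) (NumT) (Id 'x))) '())
;; ==> typecheck: `+' expects two Numbers

Note how the Fun clause goes into the body with an extended environment, exactly like the Γ[x:=τ₁] premise in the rule, and how the buggy example fails in the Add clause because its second premise cannot be established.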

Typing control

PLAI §26

We will now extend our typed Picky language with a conditional expression and a predicate. First, we extend the BNF with the two new expression forms, and we also need a type for the result of the predicate:

<PICKY> ::= <num>
          | <id>
          | { + <PICKY> <PICKY> }
          | { < <PICKY> <PICKY> }
          | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
          | { call <PICKY> <PICKY> }
          | { if <PICKY> <PICKY> <PICKY> }

<TYPE>  ::= Number
          | Boolean
          | ( <TYPE> -> <TYPE> )

We keep the same rules, and add the obvious typing rule for the predicate:

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {< A B} : Boolean

And what should the rule for if look like? Well, to make sure that the condition is a boolean, it should be something of this form:

Γ ⊢ C : Boolean  Γ ⊢ T : ???  Γ ⊢ E : ???
———————————————————————————————————————————
          Γ ⊢ {if C T E} : ???

What should the types of T and E be? A natural choice would be to let the programmer use any two types:

Γ ⊢ C : Boolean  Γ ⊢ T : τ₁  Γ ⊢ E : τ₂
—————————————————————————————————————————
          Γ ⊢ {if C T E} : ???

But then what would the type of the whole if expression be? This is still a problem. (BTW, some kind of union type would be nice here, but it has some strong implications that we will not discuss.) In addition, we would still have a problem detecting possible errors like:

{+ 2 {if <mystery> 3 {fun {x} x}}}

Since we know nothing about the condition, we can just as well be conservative and force both arms to have the same type. The rule is therefore:

Γ ⊢ C : Boolean  Γ ⊢ T : τ  Γ ⊢ E : τ
———————————————————————————————————————
          Γ ⊢ {if C T E} : τ

— using the same letter τ for both arms indicates that we expect the two types to be identical, unlike the previous attempt. Consequently, this type system is fundamentally weaker than the Typed Racket system that we use in this class.
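
For example, the rule lets us derive (abbreviating Number as Num again):

{} ⊢ 1 : Num  {} ⊢ 2 : Num
———————————————————————————
  {} ⊢ {< 1 2} : Boolean       {} ⊢ 3 : Num  {} ⊢ 4 : Num
———————————————————————————————————————————————————————————
             {} ⊢ {if {< 1 2} 3 4} : Num

but an expression like {if {< 1 2} 3 {fun {x : Num} : Num x}} cannot be typed, since Num and (Num -> Num) are not the same τ, which is exactly what rules out the problematic {+ 2 {if …}} example above.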

Here is the complete language specification with this extension:

<PICKY> ::= <num>
          | <id>
          | { + <PICKY> <PICKY> }
          | { < <PICKY> <PICKY> }
          | { fun { <id> : <TYPE> } : <TYPE> <PICKY> }
          | { call <PICKY> <PICKY> }
          | { if <PICKY> <PICKY> <PICKY> }

<TYPE>  ::= Number
          | Boolean
          | ( <TYPE> -> <TYPE> )

Γ ⊢ n : Number

Γ ⊢ x : Γ(x)

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {+ A B} : Number

Γ ⊢ A : Number  Γ ⊢ B : Number
———————————————————————————————
    Γ ⊢ {< A B} : Boolean

          Γ[x:=τ₁] ⊢ E : τ₂
——————————————————————————————————————
Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)

Γ ⊢ F : (τ₁ -> τ₂)  Γ ⊢ V : τ₁
——————————————————————————————
    Γ ⊢ {call F V} : τ₂

Γ ⊢ C : Boolean  Γ ⊢ T : τ  Γ ⊢ E : τ
———————————————————————————————————————
          Γ ⊢ {if C T E} : τ

Extending Picky

In general, we can extend this language in one of two ways. For example, let's say that we want to add the with form. One way to add it is what we did above — simply add it to the language, and write a typing rule for it. In this case, we get:

Γ ⊢ V : τ₁  Γ[x:=τ₁] ⊢ E : τ₂
——————————————————————————————
Γ ⊢ {with {x : τ₁ V} E} : τ₂

Note how this rule encapsulates information about the scope of with. Also note that we need to specify the types for the bound values.

Another way to achieve this extension is to add with as a derived form. We know that when we see a

{with {x V} E}

expression, we can just translate it into

{call {fun {x} E} V}

So we could achieve this extension by using a rewrite rule that translates all with expressions into calls of anonymous functions (eg, using the with-stx facility that we have seen recently). This could be done formally: begin with the with form, translate it to the call form, and then show the goals that need to be proved for its type. The only thing to be aware of is that the types need to be translated too, and note that one type is missing from the typed with form above: the output type of the function. This is an indication that we don't really need to specify function output types — we can deduce them from the code, provided that we know the input type of the function.

Indeed, if we do this on a general template for a with expression, then we end up with the same goals that need to be proved as in the above rule:

      Γ[x:=τ₁] ⊢ E : τ₂
——————————————————————————————————————
Γ ⊢ {fun {x : τ₁} : τ₂ E} : (τ₁ -> τ₂)      Γ ⊢ V : τ₁
———————————————————————————————————————————————————————
        Γ ⊢ {call {fun {x : τ₁} : τ₂ E} V} : τ₂
        ———————————————————————————————————————
            Γ ⊢ {with {x : τ₁ V} E} : τ₂

Conclusion — we have seen type judgment rules, and how to use them in proof trees. Note that in these trees there is a clear difference between rules that have no preconditions — these are axioms that are always true (eg, a numeral always has the type Number) — and rules with preconditions, which can only be used after proving their premises.

The general process of proving a type looks similar to evaluating an expression, but there is a huge difference — nothing is actually being evaluated. For example, we always go into the body of a function expression, since that is how we find the function's type, and this type is later used wherever the function is used. When you evaluate this:

{with {f {fun {x : Number} : Number x}}
  {+ {call f 1} {call f 2}}}

you first create a closure, which means that you don't touch the body of the function, and only later do you use it twice. In contrast, when you prove the type of this expression, you go into the body of the function immediately (you have to, in order to prove that it has the expected (Number -> Number) type), and then you simply use this type twice.

Finally, we have seen the importance of using the same type letter to require identical types; in the case of typing an if expression this played a major role: it is the difference between allowing the two arms to have any two types and requiring them to have the same type.