2010-04-02 - Recursive Macros
========================================================================
>>> Recursive Macros

Syntax transformations can be recursive.  For example, we have seen how
`let*' can be implemented by a transformation that uses two rules, one
of which expands to another use of `let*':

  (define-syntax let*
    (syntax-rules ()
      [(let* () body ...) (let () body ...)]
      [(let* ((x v) (xs vs) ...) body ...)
       (let ((x v)) (let* ((xs vs) ...) body ...))]))

When Scheme expands a `let*' expression, the result may contain a new
`let*' which needs expanding as well.  An important implication of this
is that recursive macros are fine, as long as the recursive case uses a
*smaller* expression.  This is just like any form of recursion (or
loop), where you need to be looping over a `well-founded' set of values
-- where each iteration uses a new value that is closer to some base
case.

For example, consider the following macro:

  (define-syntax-rule (while condition body ...)
    (when condition
      body ...
      (while condition body ...)))

It seems like this is a good implementation of a `while' loop -- after
all, if you were to implement it as a function using thunks, you'd write
very similar code:

  (define (while condition-thunk body-thunk)
    (when (condition-thunk)
      (body-thunk)
      (while condition-thunk body-thunk)))

But if you look at the nested `while' form in the transformation rule,
you'll see that it is exactly the same as the input form.  This means
that this macro can never be completely expanded -- it specifies
infinite code!  In practice, this makes the (MzScheme) compiler loop
forever, consuming more and more memory.  This is unlike, for example,
the recursive `let*' rule, whose recursive use has one less
binding-value pair than its input.
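
To make the `smaller expression' point concrete, here is a minimal
sketch of the same two-rule transformation together with a hand-written
expansion trace.  The name `my-let*' is hypothetical, chosen only so
that the built-in `let*' is not shadowed, and the trace in the comments
is worked out by hand from the two rules rather than produced by the
expander.

```racket
#lang racket
;; `my-let*' is a hypothetical name for illustration; it uses the same
;; two rules as the `let*' definition in the text.
(define-syntax my-let*
  (syntax-rules ()
    [(_ () body ...) (let () body ...)]
    [(_ ((x v) (xs vs) ...) body ...)
     (let ((x v)) (my-let* ((xs vs) ...) body ...))]))

;; Each expansion step leaves a `my-let*' with one binding fewer, so
;; the process must reach the empty-bindings rule:
;;   (my-let* ((a 1) (b (+ a 1))) (* a b))
;;   => (let ((a 1)) (my-let* ((b (+ a 1))) (* a b)))
;;   => (let ((a 1)) (let ((b (+ a 1))) (my-let* () (* a b))))
;;   => (let ((a 1)) (let ((b (+ a 1))) (let () (* a b))))
(define result
  (my-let* ((a 1) (b (+ a 1)))
    (* a b)))
```

The number of bindings plays the role of the well-founded measure here:
it drops by one at every step, which is exactly what the recursive
`while' rule lacks.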

The reason that the function version of `while' is fine is that it
iterates using the *same* code, and the condition thunk will depend on
some state that converges to a base case (usually the body thunk will
perform some side-effects that make the loop converge).  But in the
macro case there is *no* evaluation happening: if the transformed syntax
contains the same input pattern, we end up with a macro that expands
infinitely.

The correct solution for a `while' macro is therefore to use plain
recursion using a local recursive function:

  (define-syntax-rule (while condition body ...)
    (letrec ([loop (lambda ()
                     (when condition
                       body ...
                       (loop)))])
      (loop)))

A popular way to deal with macros like this that revolve around a
specific control flow is to separate them into a function that uses
thunks, and a macro that does nothing except wrap input expressions as
thunks.  In this case, we get this solution:

  (define (while/proc condition-thunk body-thunk)
    (when (condition-thunk)
      (body-thunk)
      (while/proc condition-thunk body-thunk)))
  (define-syntax-rule (while condition body ...)
    (while/proc (lambda () condition)
                (lambda () body ...)))

========================================================================
>> Another Example: a simple loop.

Here is an implementation of a macro that does a simple arithmetic loop:

  (define-syntax for
    (syntax-rules (= to do)
      [(for x = m to n do body ...)
       (letrec ([loop (lambda (x)
                        (when (<= x n)
                          body ...
                          (loop (+ x 1))))])
         (loop m))]))

(Note that this is not complete code: it suffers from the usual problem
of multiple evaluations of the `n' expression.)

This macro combines both control flow and lexical scope.  You can see
here the immediate subform after `syntax-rules' being used -- normally,
identifiers in a transformation pattern can match anything, but in this
case we specify that `=', `to', and `do' must appear as keywords in the
form, unlike `x', `n', `m' and `body'.  For example, this makes

  (for i = 1 3 (printf "i = ~s\n" i))

a syntax error.
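
The multiple-evaluation problem noted above can be made visible with a
bound expression that counts how often it runs.  This is only a sketch:
`simple-for', `limit' and `evaluations' are hypothetical names
introduced for the illustration, and `simple-for' is the same
single-rule macro under a different name so that Racket's own `for' is
left alone.

```racket
#lang racket
;; Same transformation as the first `for' in the text, renamed
;; (hypothetically) to `simple-for'.
(define-syntax simple-for
  (syntax-rules (= to do)
    [(_ x = m to n do body ...)
     (letrec ([loop (lambda (x)
                      (when (<= x n)   ; `n' is re-evaluated on each test
                        body ...
                        (loop (+ x 1))))])
       (loop m))]))

;; A bound expression with a visible side effect (illustrative only).
(define evaluations 0)
(define (limit)
  (set! evaluations (+ evaluations 1))
  3)

;; The loop tests x = 1, 2, 3 (all pass) and x = 4 (fails), so the
;; bound expression runs once per test rather than once overall.
(simple-for i = 1 to (limit) do (void))
```

After the loop, `evaluations' is 4 instead of 1, since the `n'
expression is copied into the loop test and evaluated every time that
test runs.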

Control flow is specified by the loop (specified, as usual in Scheme, as
a tail-recursive function) -- for example, it determines how code is
iterated, and it also determines what the `for' form will evaluate to
(it evaluates to whatever `when' evaluates to, the void value in this
case).

Scope is also specified here, by translating the code to a function --
this code makes `x' have a scope that covers the body so this is valid:

  (for i = 1 to 3 do (printf "i = ~s\n" i))

but it also makes the boundary expression `n' be in this scope, making
this:

  (for i = 1 to (if (even? i) 10 20) do (printf "i = ~s\n" i))

valid.  This is easily solved by writing this:

  (define-syntax for
    (syntax-rules (= to do)
      [(for x = m to n do body ...)
       (let ([n* n] [m* m]) ; just in case
         (letrec ([loop (lambda (x)
                          (when (<= x n*)
                            body ...
                            (loop (+ x 1))))])
           (loop m*)))]))

which makes the previous use result in a "reference to undefined
identifier: i" error.  (Since `n*' is bound once, outside the loop, the
`n' expression is also no longer evaluated on every iteration.)

Furthermore, the fact that we have a hygienic macro system means that it
is perfectly fine to use nested `for' expressions:

  (for a = 1 to 9 do
    (for b = 1 to 9 do (printf "~s,~s " a b))
    (newline))

The transformation is, therefore, completely specifying the semantics of
this new form.

Extending this syntax is easy using multiple transformation rules -- for
example, say that we want to extend it to have a `step' optional
keyword.  The standard idiom is to have the step-less pattern translated
into one that uses `step 1':

  (for x = m to n do body ...)
  --> (for x = m to n step 1 do body ...)

Usually, you should remember that `syntax-rules' tries the patterns one
by one until a match is found, but in this case there is no problem
because the keywords make the choice unambiguous:

  (define-syntax for
    (syntax-rules (= to do step)
      [(for x = m to n do body ...)
       (for x = m to n step 1 do body ...)]
      [(for x = m to n step d do body ...)
       (let ([n* n] [m* m] [d* d])
         (letrec ([loop (lambda (x)
                          (when (<= x n*)
                            body ...
                            (loop (+ x d*))))])
           (loop m*)))]))
  (for i = 1 to 10 step 2 do (printf "i = ~s\n" i))

We can even extend it to do a different kind of iteration, for example,
iterate over a list:

  (define-syntax for
    (syntax-rules (= to do step in)
      [(for x = m to n do body ...)
       (for x = m to n step 1 do body ...)]
      [(for x = m to n step d do body ...)
       (let ([n* n] [m* m] [d* d])
         (letrec ([loop (lambda (x)
                          (when (<= x n*)
                            body ...
                            (loop (+ x d*))))])
           (loop m*)))]
      ;; list
      [(for x in l do body ...)
       (for-each (lambda (x) body ...) l)]))
  (for i in (list 1 2 3 4) do (printf "i = ~s\n" i))
  (for i in (list 1 2 3 4) do
    (for i = 0 to i do (printf "i = ~s " i))
    (newline))

========================================================================