The language for this homework is:
#lang pl 15 |
In this homework you will get to implement a Racket macro, essentially extending Racket with a small DSL.
“The Swine Before Perl” is a known talk by Shriram Krishnamurthi (who also wrote the textbook we’re using). In this talk, he demonstrates how a Racket macro can be used to implement a small domain specific language, using the problem of implementing a finite automaton. Not only does this result in extending Racket with a DSL for fininte automata, it is also a simpler solution than the usual one, and a more efficient one. The efficiency benefit comes from the fact that the automatons are essentially compiled to Racket code rather than being represented as data that is interpreted — the speed improvement that is mentioned in the talk would be more substantial now, since Racket has improved a lot since then, and together with a JIT, we get automaton definitions that compile all the way down to machine-code. (As a side-note, similar methods are common in implementations of regular expression matchers.)
In this homework, you will fix a bug in the automaton implementation, and extend it to implement pushdown-automatons. You should begin by going through the presentation slides, which shouldn’t take much time — it’s fine to begin reading at the slide that is titled “A Pearl”.
The automaton code is not actually present in one piece in the slides — it is a combination of the transformation that is specified in one slide, and the wrapper that uses define-syntax and syntax-rules on the next one. We’re going to use a slightly revised version of the macro: first of all, we will still be working in a typed language, and secondly, we will use the match form that is more powerful and convenient then the case form that is used in the slides.
The following is the version of the code with these changes:
;; A macro that defines a DFA language
(define-syntax automaton
(syntax-rules (: ->)
[(automaton init-state
[state : (input-sym -> new-state) ...]
...)
(lambda (string)
(: state : (Listof Symbol) -> Boolean)
...
(define (state stream)
(match stream
['() #t]
[(cons 'input-sym more) (new-state more)]
...
[_ #f]))
...
(init-state (string->symbols string)))])) |
Since the macro evaluates to a function, we can use it as usual, and bind some name to it. Note that the function form that automaton is translated to does not have a type, so we need to declare a type for a function that we define this way. Here is the automaton that is used in the slides — it is intended to identify strings that begin with a "c", end with an "r", and have any number of "a" and/or "d" in the middle:
(: cXr : String -> Boolean)
;; Identifies strings that match "c[ad]*r+"
(define cXr (automaton init
[init : (c -> more)]
[more : (a -> more)
(d -> more)
(r -> end)]
[end : (r -> end)])) |
Finally, here are some test cases to verify that everything works fine:
;; tests: (test (cXr "cadr")) (test (cXr "cadadadadadadddddaaarrr")) (test (not (cXr "ccadr"))) (test (not (cXr "c"))) ; BAD TEST! |
Your first task is to fix this bug. The fix will have two parts: (a) extend the automaton syntax so there is a place to specify an accepting state; (b) use the new input to fix the broken place in the code (hint: there is a single place in the output pattern where a #t can be the result). The following is how the resulting macro should be used for a correct implementation of the cXr function:
(define cXr (automaton init end ; is the accepting state
[init : (c -> more)]
[more : (a -> more)
(d -> more)
(r -> end)]
[end : (r -> end)])) |
Finally, make sure that you have complete coverage! You might see some part
of the code as uncovered after running the above tests. That is not
some weird result of using macros. In fact, if you do see some part that is
shown as uncovered, then this is most likely correct, and you need to add a
test to get the needed coverage for exactly that part. (But note that this
is a missing test for your cXr, it’s not a test of the macro
transformation.)
Side-comment: you might see complete coverage without adding a test.
This is fine too — it is probably a result of better code in your
expansion...
Now that you know how to implement a simple automaton, you can proceed to implement an extended version: a pushdown version. The syntax for the new pushdown syntax is similar to the automaton macro, and the implementation is basically an extension of the code that keeps track a stack of (symbolic) tokens. The specification still has an initial state, an accepting state, and a sequence of transition rules. The rules are similar to the automaton rules, except that each transition has:
To clarify the syntax, here is the definition of a simple one-state pushdown automaton that accepts strings with balanced parentheses, with tests. Note that string->symbols translates round parentheses to open and close symbols respectively.
(: balanced : String -> Boolean)
;; Identifies strings that contain only balanced parentheses
(define balanced (pushdown init init
[init : ((open) () -> init (open))
((close) (open) -> init ())
((*) (*) -> init ())]))
;; tests:
(test (balanced ""))
(test (balanced "()"))
(test (balanced "(((())))"))
(test (balanced "((()())(()))"))
(test (not (balanced "(")))
(test (not (balanced ")")))
(test (not (balanced ")("))) |
To make this question easy, here is what the above code should expand to. Important: this is just an example of how the above code expands, the actual solution should work with any automatons! It’s best if you write a different one to make sure that your code works as it should.
(lambda (string)
(: init : (Listof Symbol) (Listof Symbol) -> Boolean)
(define (init stream stack)
(match (list stream stack)
[(list '() '()) ...same as in automaton...]
[(list (list-rest 'open more-input)
(list-rest more-stack))
(init more-input (append '(open) more-stack))]
[(list (list-rest 'close more-input)
(list-rest 'open more-stack))
(init more-input (append '() more-stack))]
[(list (list-rest '* more-input)
(list-rest '* more-stack))
(init more-input (append '() more-stack))]
[_ #f]))
(init (append (string->symbols string) '(*)) '(*))) |
This code uses a new match pattern: list-rest. This pattern is similar to a cons pattern, but instead of matching just the head and the tail of the list, it can match any prefix of the list, and the rest of it. For example,