Assignment Two, CS351 S2001
Designing a Parser's API
Objective
In this assignment you will design an API for a simple parser,
motivate your design, and document the design. The documentation
will include some sample programs that use the API.
Specifications
The basic idea is to write a system that takes a specification for a
small ``expression language'' and gives the program the ability to
read from a stream and parse it up into the elements of this language,
represented in a form suitable for further processing by the program
that uses the parser. The range of languages allowed is quite
limited: they are simple expression languages, with no precedence ie
only left-to-write association of operators.
Grammar
The expression language consists of two sorts of tokens: operators,
which are one character long, and variables, which are sequences of
characters. The characters allowed in variables (called
constituent characters) are disjoint from those allowed in
operators. Parenthesis '(' and ')' can be used for grouping. Each
operator is either unary (prefix) or binary (infix), and these two
classes of operator are disjoint.
Order
The expression "(var1 opA var2 opB var3)" is equivalent to "(var1 opA
(var2 opB var3))". Whitespace can occur anywhere except inside a
variable. Unary operators bind ``tighter'' than binary ones, so "(op1
var1 op2 var2)" where op1 is unary and op2 is binary is identical to
"((op1 var1) op2 var2)".
Which characters are constituents, and the identity and nature of the
operators, can change from run to run - so your API must have a way of
getting this information from the program using it.
Expressions can be thought of as representing trees whose leaves are
"variables" and whose internal nodes are tagged with operators. There
would be two kinds of internal nodes: ones with two children which are
tagged with binary operators, and ones with one child which are tagged
with unary operators.
You will need to be represent the expressions that are read in a
fashion that allows the program using the library to examine and
manipulate them sensibly. Hint: define a separate API for objects
representing expressions and portions thereof.
Here is a formal grammar, in BNF.
EXPR := VAR | EXPR OP2 EXPR | OP1 EXPR | '(' EXPR ')'
OP1 := ... list of possible single chars, eg: '-' | '~'
OP2 := ... list of possible single chars, eg: '+' | '*' | '/'
VAR := CONSTITUENT+
CONSTITUENT := ... list of constituent chars, typically: '1' | '2' | 'a'
Note that the characters '(', ')', constituent characters, and OP1
characters, and OP2 characters are disjoint. Ie if '-' in a
unary operators then '-' is not a binary operator, or a constituent
character.
What you should turn in
format
All the below should be ASCII files, handed in by running the
cs351-handin program on a directory containing only the files you wish
to hand in. The documentation can be html format if you'd like.
Please do not use any binary format files. The file parser.h should
be legal C++ code. None of the member functions should have bodies,
ie none of them should be inline. The bodies will be defined in the
companion parser.cc file you will write later.
Design Constraints and Motivation
This explains how you thought about the problem, and gives the
criteria you thought the library needs to satisfy. It is motivation
for the exact decisions explained below. Eg if the program using the
library first has to do some setup before it can do real work, the
reason you thought this was reasonable and the reasoning behind why
you put the division between the various stages where you did should
be explained here.
file: parser.h
This formally sets out the classes and member functions in your API.
Documentation with Examples
This documents what all the (public) stuff in parser.h is for. After
reading this, a competent programmer should be able to write a program
that uses a library that meets your API.
Due date: 2pm Mon Feb 12.