Compiler Project 2010: A Front-End for Mini-Java
Implement a syntax analyzer and a partial semantic processor for
the Mini-Java programming language.
The Mini-Java manual gives a description of the source language.
The translator must correctly recognize and process any valid Mini-Java program.
It should report syntactic errors, and
then continue analyzing the rest of the source program.
The language processor must also construct an AST and make necessary
passes over this program representation.
The partial semantic analysis involves checking and binding names to their
definitions.
However, it does not include checking of other semantic constraints,
e.g., types of expressions or arguments.
These will part of the back-end of the translator,
optionally implemented as a separate project.
The Mini-Java analyzer must construct
and print out a symbol cross-reference table.
The produced cross-reference table indicates where in the source
program each identifier is defined,
and where it accessed (used as an operand or argument).
Note that any identifier may be redeclared at an inner
block, and you must make sure to bind
each applied occurrence of an identifier
to its current valid definition.
For each identifier, the printed table includes
- the identifier symbol,
- the static block level of its definition (numbered 1,2, etc.),
- the textual position of its definition, and
- the textual positions of all its accessing occurrences.
Sort and format the listing of the table for ease of use.
The work includes the following subtasks as parts of the assignment.
Note that the given (partially) informal Mini-Java
description must be transformed for implementation.
The given definition contains some syntactic issues that need first to
be formalised or resolved in order to implement its analyzers.
E.g., the token patterns are only described verbally.
Similarly, you may need to make transformations to the context-free
grammar in order to use top-down parsing technique (LL(1) parsing).
Such modifications and clarifications are documented as part of the
project report, as explained below.
However, you may not make arbitrary changes to the syntax.
All transformations must be justified by technical reasons
and they must not in practice affect the language that is
accepted and processed.
Requirements for the report
Write a report on the assignment, as
a document in PDF format.
The report must include appropriate identifications:
the name of the student, the name of the course,
and the correct date and time of delivery.
Describe the overall architecture of your language processor with
appropriate UML diagrams. Explain these diagrams.
Describe the testing and used test data.
Give instructions how to build and run your analyzer.
The report also includes the above mentioned
clarifications and transformations needed for the
Mini-Java syntax definition. You must specify
- the Mini-Java token patterns as regular expressions or,
alternatively, as regular definitions
- a modified context-free grammar suitable for
recursive-descent parsing (eliminating any LL(1)
violations or other ambiguities)
- another grammar must specify abstract syntax trees (AST), i.e.,
simplified, semantically relevent internal representation for
Mini-Java programs
(alternatively use UML diagrams to define the abstract syntax)
- error handling strategies and solutions used in your Mini-Java
analyzer
For completeness, include the original Mini-Java definitions
as a part of your document so that
you can refer to them when explaining your own
syntactic transformations.
Also, report any restrictions or interpretations of your own,
special solutions, data structures, and dependencies
(and any other items needed to help to understand your implementation).
Don't copy unnecessary declarations and definitions from the
program code to the document - your code should be self-documenting.
Instead, report things
that are not so self-evident from the program.
Explain and motivate your design decisions and solutions.
Tell in your own words about hard or unexpected problems,
lessons learned, and valuable experiences.
Especially, report all shortcomings or failures
(since you may still get points for a good explanation
of the dilemma or problem).
Delivery of the work
The project deadline is Friday 26th February 23:59.
The work should be returned to the exercise assistant via
e-mail.
Paula.Kuosmanen atta cs.helsinki.fi
Send only the address of the project stored in an
appropriate zip form.
This zip should include both the source files and the project
report, contained in a directory
named according to your unique department user name.
More detailed instructions and the requirements for the
assignment are given in the exercise groups.
Implementation requirements and grading criteria
The work is individual project: you must design and implement
your language processor for Mini-Java independently by yourself.
The implementation includes both recognition of Mini-Java
constructs and appropriate handling of errors.
The Mini-Java analyzer is to be written purely in a general-purpose
programming language. No language-processing tools
(libraries, language recognizer generators, frameworks) are allowed.
You must yourself make sure that
your Mini-Java analyzer can be build, run, and tested
with generally available standard tools at the CS department.
Java is to be used as the implementation language, by default
(you must consult the assistant about using other available
programming language).
Points for the work are approximately given
according to the following table
(preliminary):
Packaging (way of delivery, format, completeness) |
1 |
Scanner implementation |
5 |
Parser implementation |
5 |
  Implementation of (partial) semantic processing |
5 |
Overall quality: coding style, understandability, modularity,
maintainability |
5 |
Quality of the document:
completeness, clarity, relevance of contents and discussion |
5 |
Max of total points |
26 |
After the deadline 26th Feb 23:59,
the maximum points to be gained for a delivered work diminishes linearly.
One point is lost for missing the project deadline. And then
one more point will be lost per each full 6 hours delay of submission.
So, returning a work that is 3 days late can gain maximum 14 points.
After 1st March 23:59, assignments are not accepted.