Parsing with ANTLR 3.1
Frank Wierzbicki - Sun
- Talk Abstract
- Writing a basic recursive descent parser by hand is a fairly straightforward task. However, once the parser grows to cover error handling, language-specific semantic peculiarities, and the other real-world concerns of a production-ready language, it can become difficult to read and maintain. This is where Terence Parr's ANTLR (ANother Tool for Language Recognition) really helps. ANTLR is essentially a tool that lets you write a parser as an executable EBNF (Extended Backus–Naur Form) grammar with embeddable actions. Put another way, ANTLR provides a domain-specific language for writing parsers.
Over the last year I have worked extensively with ANTLR 3.0 and then 3.1 to write a new parser for Jython 2.5. In this talk I will discuss the techniques I used to make a correct parser for Jython that is still reasonably readable. I will also discuss the way Python exposes its internal AST (abstract syntax tree) and how I used this property to show empirically, to a reasonable level of confidence, that Jython's AST is the same as the AST produced by the C implementation of Python.
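To illustrate what it means for Python to expose its AST, here is a minimal sketch using the standard `ast` module found in current Python versions. It is not the comparison harness used for Jython; the helper names `ast_fingerprint` and `same_ast` are hypothetical. The idea is that a textual dump of the tree, with line and column attributes left out, can be produced by two implementations for the same source and compared as plain strings.

```python
import ast


def ast_fingerprint(source):
    """Return a textual dump of the AST built for `source`.

    Hypothetical helper: positions (lineno, col_offset) are omitted so
    that only the tree structure is compared.
    """
    tree = ast.parse(source)
    return ast.dump(tree, include_attributes=False)


def same_ast(source, reference_dump):
    """Compare the local AST dump against a dump produced elsewhere,
    for example by another Python implementation (assumed workflow)."""
    return ast_fingerprint(source) == reference_dump


if __name__ == "__main__":
    print(ast_fingerprint("x = 1 + 2"))
```

Running the same dump over a large body of source files on both implementations and diffing the results gives empirical, though not exhaustive, evidence that the two parsers agree.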
Some of the topics that will be discussed:
* Basic parsing with ANTLR 3.1
* AST generation and rewriting
* Heterogeneous node specification syntax
* How significant indents (part of Python) are parsed in Jython with ANTLR
* Error handling
* Visualizing grammars with ANTLRWorks
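As background for the significant-indents topic above, the following is a minimal sketch of the general technique of turning leading whitespace into INDENT and DEDENT tokens before the parser sees the input. It is written in Python for illustration only and is not Jython's implementation; the function name `indent_tokens` and the token shapes are assumptions, and only space indentation is handled.

```python
def indent_tokens(lines):
    """Convert source lines into a flat token list with INDENT/DEDENT markers.

    Illustrative only: tabs, line continuations, and brackets are ignored.
    """
    stack = [0]          # indentation widths of the currently open blocks
    tokens = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.strip():
            continue     # blank lines do not affect indentation
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            tokens.append("INDENT")
        else:
            # Shallower: close blocks until the widths match again.
            while width < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if width != stack[-1]:
                raise IndentationError("dedent does not match any outer level")
        tokens.append(("LINE", stripped.rstrip()))
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    return tokens
```

With tokens like these in the stream, a grammar can treat a block as an ordinary rule bracketed by INDENT and DEDENT, much as a brace-delimited language would use `{` and `}`.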
This talk assumes that the audience is familiar with EBNF and recursive descent parsers; however, no previous experience with ANTLR or Python is assumed.
Frank Wierzbicki is the Jython Project Lead. He is employed by Sun Microsystems where he works on Jython full-time. He has been a Java and a Python developer for over ten years. Frank has a B.S. in Biochemistry from Old Dominion University and an All-But-Dissertation in Neuroscience from Baylor College of Medicine.
Key Issues for Discussion (cooperative)
(please expand cooperatively) Talk:Jython