codersnotes

peg/leg parser generator

August 3rd, 2009

I've been on the hunt for a good parser generator for quite a while now. bison/yacc is starting to get a bit long in the tooth, but despite looking quite hard I have not been able to find anything good to replace it with.

All I wanted was something simple that could generate C or C++ code, and ideally would combine the lexer and parser into one process. It turns out there aren't that many of these around, and the ones that do exist, well, for some reason the authors don't seem to want them to actually be able to run on anyone's computer. (Don't even get me started on ANTLR...)

But then I found peg. And its cousin, leg. peg uses Parsing Expression Grammars, which allow you to do cool stuff like have optional or repeated sections. You'll probably just want to use leg for most yacc-style parsing, as it adds the ability for rules to have values associated with them. See calc.leg (included) for a good example.

So being the helpful soul that I am, I've taken the liberty of putting up some nice easy Win32 binaries of it here, for you all to download. I also fixed a nasty stack overflow bug in there.

Here are some useful rules you'll want to use if you're parsing most C-esque yaccy things.
- = ( [ \t] | EOL )* # eat any whitespace
-- = &[^a-zA-Z0-9_] - # used after any keyword, to enforce a break before whatever comes next
EOF = !.
EOL = [\n\r]
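Use the first rule (the "minus" rule) directly after any punctuation, e.g. sum = expr '*'- expr. Use the second rule (the "double minus" rule) directly after any keyword, e.g. statement = 'while'-- expr. The double minus peeks at the next character without consuming it, so 'while'-- won't accidentally match the front of an identifier like whilex.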
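If the lookahead bit isn't obvious, here's a rough sketch of what those two rules do, hand-written in Python rather than generated by leg. The function names (skip_ws, punct, keyword) are made up for illustration and are not anything peg or leg produces; each one returns the new position in the input, or None if the match fails.

```python
import re

# The "-" rule: ( [ \t] | EOL )* , where EOL = [\n\r]
WS = re.compile(r"[ \t\n\r]*")
# Characters that would continue an identifier (the inverse of [^a-zA-Z0-9_])
IDENT_CHAR = re.compile(r"[a-zA-Z0-9_]")


def skip_ws(text, pos):
    """The "minus" rule: eat any run of spaces, tabs and line ends."""
    return WS.match(text, pos).end()


def punct(text, pos, lit):
    """Hypothetical helper for  'lit'-  : match the literal, then eat whitespace."""
    if text.startswith(lit, pos):
        return skip_ws(text, pos + len(lit))
    return None


def keyword(text, pos, word):
    """Hypothetical helper for  'word'--  : match the literal, peek, then eat whitespace."""
    if not text.startswith(word, pos):
        return None
    end = pos + len(word)
    # &[^a-zA-Z0-9_] is a lookahead: it needs a next character to exist,
    # that character must not be an identifier character, and it is not consumed.
    # Note this means the rule, as written, also fails at the very end of input.
    if end >= len(text) or IDENT_CHAR.match(text, end):
        return None
    return skip_ws(text, end)


if __name__ == "__main__":
    print(keyword("while (x)", 0, "while"))
    print(keyword("whilex", 0, "while"))
    print(punct("* 3", 0, "*"))
```

The only real difference between the two helpers is the peek: punctuation can butt right up against whatever follows it, but a keyword has to be followed by something that couldn't be part of a longer name.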

Enjoy!

 

Written by Richard Mitton,

software engineer and travelling wizard.

Follow me on twitter: http://twitter.com/grumpygiant