Python's seeming lack of a good parser

I've been searching for a decent recursive descent parser for Python. Too bad I haven't found one :(

None are truly standalone, though many claim to be generators. Either generate code or give me a nice parser library, not something half-assed in the middle! Urgh!

ANTLR depends on import antlr and Java. PLY does something similar. Others simply suck. Who wants to lug around piles of libraries and modules? I don't. PLY may be an option, but it may be some time before I can make a decent grammar with it. Perhaps in a day or two when I have more time.

Granted, I'm probably just frustrated from many hours of trying parsers without success. It's not that there aren't any parsers that work. It's that there aren't any parsers that are as easy to use as Perl's Parse::RecDescent.

All I want is to parse an extremely simple config file of my own design. I may not even need recursive descent, seeing as how I only go one level deep. I would prefer a token parser that suited my needs (cfgparser is too limited, shlex is broken), but I haven't been able to find one.
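For a sense of scale, here is a rough sketch of what a hand-rolled token parser for a one-level config could look like, using only the standard re module. The format it parses (name { key = value; } blocks with # comments) is made up for illustration and is not grok's actual config syntax; the names TOKEN_RE, tokenize, and parse are likewise hypothetical.

```python
import re

# Token types for an invented config format (an assumption, not grok's syntax).
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<comment>\#[^\n]*)
  | (?P<string>"(?:[^"\\]|\\.)*")
  | (?P<punct>[{}=;])
  | (?P<word>[^\s{}=;"#]+)
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace and comments."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind in ("ws", "comment"):
            continue
        value = match.group()
        if kind == "string":
            # Drop the surrounding quotes and resolve backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        yield kind, value


def parse(text):
    """Parse 'name { key = value; ... }' blocks into a dict of dicts."""
    tokens = list(tokenize(text))
    pos = 0

    def take(expected=None):
        # Consume one token: a word/string by default, or the given punctuation.
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        kind, value = tokens[pos]
        pos += 1
        if expected is None:
            ok = kind in ("word", "string")
        else:
            ok = kind == "punct" and value == expected
        if not ok:
            wanted = expected if expected is not None else "a name or value"
            raise SyntaxError(f"expected {wanted!r}, got {value!r}")
        return value

    config = {}
    while pos < len(tokens):
        name = take()
        take("{")
        section = {}
        # Only one level deep: a block holds key/value pairs, never other blocks.
        while tokens[pos:pos + 1] != [("punct", "}")]:
            key = take()
            take("=")
            section[key] = take()
            take(";")
        take("}")
        config[name] = section
    return config
```

Calling parse() on a string of such blocks returns a plain dict keyed by block name, with one inner dict of settings per block. Since nothing nests, there is no recursion anywhere, which fits the hunch that recursive descent may be overkill for this job.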

I was able to get a config file parser working in older grok using Parse::RecDescent in only a few hours, and even after 10 minutes I was using it successfully. Have parsers fallen by the wayside with the advent of XML as a cure-all?

This pisses me off. I should be able to say, "here's the grammar for my data" and be happy. I really wanted to get the config parser done in py grok tonight. I'm giving serious consideration to adding multiline and statefulness support to grok, just so I can parse a damned config file. That is, use grok to read its own config file so that we can grok whatever data the config file says.

If you're reading and you have suggestions for Python text parsing modules that do not suck, please let me know.