Overview

Parsy is an easy way to combine simple, small parsers into complex, larger parsers.

If it means anything to you, it’s a monadic parser combinator library for LL(infinity) grammars in the spirit of Parsec, Parsnip, and Parsimmon.

If that means nothing, rest assured that parsy is a very straightforward and Pythonic solution for parsing text that doesn’t require knowing anything about monads.

Parsy differentiates itself from other solutions with the following:

  • it is not a parser generator, but a combinator-based parsing library.
  • a very clean implementation, only a few hundred lines, that borrows from the best of recent combinator libraries.
  • free, good-quality documentation, all in one place. (Please raise an issue on GitHub if you have any problems, or find the documentation lacking in any way.)
  • it avoids mutability, and therefore a ton of related bugs.
  • it has monadic binding with a nice syntax. In plain English:
    • we can easily handle cases where later parsing depends on the value of something parsed earlier, e.g. Hollerith constants.
    • it’s easy to build up complex result objects, rather than having lists of lists etc.
    • there is no need for things like pyparsing’s Forward class.
  • it has a minimalist philosophy. It doesn’t include built-in helpers for any specific grammars or languages, but provides building blocks for making these.

Basic usage looks like this:

Example 1 - parsing a set of alternatives:

>>> from parsy import string
>>> parser = (string('Dr.') | string('Mr.') | string('Mrs.')).desc("title")
>>> parser.parse('Mrs.')
'Mrs.'
>>> parser.parse('Mr.')
'Mr.'

>>> parser.parse('Joe')
ParseError: expected title at 0:0

>>> parser.parse_partial('Dr. Who')
('Dr.', ' Who')

Example 2 - parsing a dd-mm-yy date:

>>> from parsy import string, regex
>>> from datetime import date
>>> ddmmyy = regex(r'[0-9]{2}').map(int).sep_by(string("-"), min=3, max=3).combine(
...                lambda d, m, y: date(2000 + y, m, d))
>>> ddmmyy.parse('06-05-14')
datetime.date(2014, 5, 6)

To learn how to use parsy, you should continue with:

Other Python projects

  • pyparsing. Also a combinator approach, but in general much less cleanly implemented, and with rather scattered documentation.
  • funcparserlib - the most similar to parsy. It differs from parsy mainly in normally using a separate tokenization phase, lacking the convenience of the generate() method for creating parsers, and having documentation that relies on understanding Haskell type annotations.
  • Lark. With Lark you write a grammar definition in a separate mini-language as a string, and have a parser generated for you, rather than writing the grammar in Python. It has the advantages of speed and of being able to use different parsing algorithms.