# Separate lexing/tokenization phases

Most of the parsy documentation assumes that you pass a string to `Parser.parse()` and get back your final parsed, constructed object (of whatever type you desire).

A more classical approach to parsing is that you first have a lexing/tokenization phase, the result of which is a simple list of tokens. These tokens could be strings, or other objects.

You then have a separate parsing phase that consumes this list of tokens, and produces your final object, which is very often a tree-like structure or other complex object.

Parsy can actually work with either approach. Further, for the split lexing/parsing approach, parsy can be used either to implement the lexer, or the parser, or both! The following examples use parsy to do both lexing and parsing.

However, parsy’s features for this use case are not as developed as those of some other Python tools. If you are building a parser for a full language that needs the split lexing/parsing approach, you might be better off with PLY.

## Calculator

This example illustrates lexing and then parsing a sequence of mathematical operations, e.g. “1 + 2 * (3 - 4.5)”, with operator precedence.

In this case, while doing the parsing stage, instead of building up an AST of objects representing the operations, the parser actually evaluates the expression.

```python
from parsy import digit, generate, match_item, regex, string, success, test_item


def lexer(code):
    # Turn a string into a flat list of tokens: ints, floats,
    # and single-character strings for operators and parentheses.
    whitespace = regex(r"\s*")
    integer = digit.at_least(1).concat().map(int)
    float_ = (digit.many() + string(".").result(["."]) + digit.many()).concat().map(float)
    parser = whitespace >> ((float_ | integer | regex(r"[()*/+-]")) << whitespace).many()
    return parser.parse(code)


def eval_tokens(tokens):
    # This function parses and evaluates at the same time.

    lparen = match_item("(")
    rparen = match_item(")")

    @generate
    def expr():
        # Lowest precedence: a chain of "+" and "-" over multiplicative terms.
        res = yield multiplicative
        sign = match_item("+") | match_item("-")
        while True:
            operation = yield sign | success("")
            if not operation:
                break
            operand = yield multiplicative
            if operation == "+":
                res += operand
            elif operation == "-":
                res -= operand
        return res

    @generate
    def multiplicative():
        # Higher precedence: a chain of "*" and "/" over simple operands.
        res = yield simple
        op = match_item("*") | match_item("/")
        while True:
            operation = yield op | success("")
            if not operation:
                break
            operand = yield simple
            if operation == "*":
                res *= operand
            elif operation == "/":
                res /= operand
        return res

    @generate
    def number():
        # An optionally signed numeric token.
        sign = yield match_item("+") | match_item("-") | success("+")
        value = yield test_item(lambda x: isinstance(x, (int, float)), "number")
        return value if sign == "+" else -value

    # A parenthesized sub-expression or a number. `expr` and `multiplicative`
    # look up `simple` lazily, when they run, so defining it last is fine.
    simple = (lparen >> expr << rparen) | number

    return expr.parse(tokens)


def simple_eval(expr):
    return eval_tokens(lexer(expr))


import pytest  # noqa  isort:skip

# parsy's test_item is not a test; this stops pytest from collecting it.
test_item = pytest.mark.skip(test_item)


if __name__ == "__main__":
    print(simple_eval(input()))
```