lark
Modern Python parsing library with an EBNF-like grammar notation. It builds parsers from grammar strings or files using the Earley or LALR(1) algorithm. lark features: EBNF-style grammar syntax in a string or file, an Earley parser for any context-free grammar (including ambiguous ones), LALR(1) for high-performance parsing of unambiguous grammars, automatic parse tree construction, Transformer/Visitor patterns for tree processing, priority-based ambiguity resolution for terminals and rules, grammar imports via %import, standalone parser generation for LALR grammars (no lark dependency at runtime), regex terminal support, and the optional lark-cython plugin for faster LALR parsing.
Score Breakdown
⚙ Agent Friendliness
🔒 Security
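Pure parsing library with no network calls. Parsing user-controlled input: lark's parsing algorithms (Earley and LALR) work on context-free grammars and do not backtrack the way a regex engine does (they are not PEG), but terminals are compiled with Python's re module, so a poorly written regex terminal can still be exposed to ReDoS. Earley can also be slow on highly ambiguous grammars, so cap the size of untrusted input. Deeply nested input may hit Python's recursion limit when a recursive Transformer walks the tree; raise it with sys.setrecursionlimit() or use Transformer_NonRecursive. Transformer methods run on parsed input: make sure they never eval user data.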
⚡ Reliability
Best When
Building parsers for DSLs, config formats, or small languages where formal EBNF grammar notation produces correct, maintainable parsers — especially when LALR performance is needed.
Avoid When
Simple string splitting/regex patterns, HTML/XML (use BeautifulSoup/lxml), quick prototyping without grammar files (use pyparsing), or context-sensitive grammars.
Use Cases
- • Agent LALR grammar parser: from lark import Lark; grammar = '''start: WORD ("," WORD)*\nWORD: /[a-z]+/'''; parser = Lark(grammar, parser='lalr'); tree = parser.parse('hello,world'). LALR parsing; the agent parses comma-separated words with the efficient LALR(1) algorithm; tree is a Tree whose children are the WORD tokens (the anonymous "," is filtered out); string literals in a lark grammar use double quotes; fast for unambiguous grammars
- • Agent Earley ambiguous grammar: from lark import Lark; grammar = '''start: noun verb noun | noun noun verb\nnoun: WORD\nverb: WORD\nWORD: /[a-z]+/\n%ignore " "'''; parser = Lark(grammar, parser='earley', ambiguity='resolve'); tree = parser.parse('dog cat bites'). Natural language parsing; the agent handles ambiguous grammars where multiple parses exist; ambiguity='resolve' (the default) picks one parse, while ambiguity='explicit' keeps all of them under _ambig nodes; Earley handles all context-free grammars
- • Agent tree transformation: from lark import Transformer; class MyTransformer(Transformer): def add(self, items): return items[0] + items[1]; def number(self, items): return int(items[0]); result = MyTransformer().transform(parser.parse('1 + 2')). Tree transformation; this assumes a grammar whose rules (or aliases) are named add and number; the agent evaluates parsed expression trees by transforming them bottom-up, one method per rule name
- • Agent standalone parser: run python -m lark.tools.standalone grammar.lark > my_parser.py, then from my_parser import Lark_StandAlone; parser = Lark_StandAlone(); tree = parser.parse(text). Code generation; the agent generates a standalone parser module that works without lark installed; LALR grammars only; useful for deployment where lark adds too much dependency weight
- • Agent JSON parser example: from lark import Lark; json_grammar = open('json.lark').read(); parser = Lark(json_grammar, parser='lalr', lexer='basic'); tree = parser.parse(json_string). LALR JSON parsing; json.lark here is a grammar file you supply (lexer='basic' was called 'standard' in older releases); the agent learns grammar design from lark's examples; the lark repository includes a JSON parser example, and the package bundles grammars such as python.lark and common.lark
Not For
- • Simple pattern extraction — for simple regex patterns use Python's re module; lark is overkill for single-pattern extraction
- • Non-context-free grammars — lark handles context-free grammars only; for context-sensitive or format-specific parsing use dedicated parsers (HTML: BeautifulSoup)
- • Quick prototyping without grammar files — for rapid grammar prototyping pyparsing's Python-native notation is more convenient
Interface
Authentication
No auth — pure Python parsing library.
Pricing
lark is MIT licensed. Free for all use.
Agent Metadata
Known Gotchas
- ⚠ Create the parser once, parse many times: Lark(grammar) compiles the grammar, which can take roughly 100 ms to 1 s for larger grammars; agent code that calls Lark() inside a loop or per request is slow; create the parser at module level or in __init__; parser.parse(text) is fast after compilation, and the same parser can be reused for many inputs; for LALR, Lark(grammar, parser='lalr', cache=True) caches the compiled grammar on disk
- ⚠ LALR(1) needs a conflict-free grammar, Earley handles ambiguity: with parser='lalr', reduce/reduce conflicts raise GrammarError when the parser is built, while shift/reduce conflicts are by default resolved in favor of shift, which can silently produce an unintended parse; prototype with Earley first, then refactor the grammar for LALR; recent lark versions accept strict=True to turn such conflicts into errors, and debug=True can log them
- ⚠ Terminals are uppercase, rules are lowercase: this is lark's own naming rule, not part of EBNF; WORD and NUMBER are terminals (strings or regexes matched by the lexer); word and number are grammar rules; mixing case within a name raises GrammarError, and picking the wrong case changes what the name means; agent grammar files must follow this rule
- ⚠ Tree children include both tokens and subtrees: tree.children is a list that may contain Token objects (str subclasses with extra metadata) and Tree objects; tree.children[0] may be a Token (has .type and .value) or a Tree (has .data and .children); agent Transformer methods receive the children as a list in which subtrees have already been replaced by their transformed values, so check isinstance to handle each case
- ⚠ ?rule inlines a rule only when it has a single child: with ?expr: term | sum, a match that produces exactly one child is replaced by that child, and an expr node appears only when there are several children; useful for removing intermediate nodes; agent code calling tree.find_data('expr') will not find the inlined matches; use ? judiciously to simplify the tree without surprising Transformer implementations
- ⚠ Whitespace must be ignored explicitly: lark never skips whitespace on its own, and %import by itself does not ignore anything; the grammar needs %ignore /\s+/, or %import common.WS followed by %ignore WS (WS_INLINE covers spaces and tabs only); forgetting this raises UnexpectedCharacters at the first whitespace in the input; other common terminals are available via %import common
Alternatives
Scores are editorial opinions as of 2026-03-06.