
Only perl Can Parse Perl (and Rust)

Published:
3 min read · infrastructure

78% of Perl developers surveyed do not use a language server.[1] The existing options either need a running Perl process or stop short of the hard cases.

perl-lsp does neither. Zero-Perl native-Rust binary. 600,000 lines of Rust.

Larry Wall said “only perl can parse Perl.”[2] He was right. Perfect Perl parsing requires running Perl. But perfection is not the bar. Materiality is. 95% of CPAN parses clean without executing a line of Perl.

Verified, not read

The codebase is too large for anyone to read. That was never the plan.

Behavioral specs define what should happen. Tests lock the behavior in. Mutation testing verifies the tests. Property-based testing. Fuzz testing. Adversarial build loops where author agents write and critic agents attack. Each layer catches what the one below misses. No single layer is the answer. The stack is.
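To make one of those layers concrete: mutation testing deliberately breaks the code and checks that the tests notice. What follows is a hand-rolled sketch of the idea, not the tooling used on perl-lsp. `max_of`, the two mutants, and both suites are hypothetical names invented for the illustration.

```rust
/// Signature shared by the real function and its mutants (hypothetical example).
type MaxFn = fn(u32, u32) -> u32;

/// Function under test: the larger of two values.
fn max_of(a: u32, b: u32) -> u32 {
    if a > b { a } else { b }
}

/// Mutant 1: comparison flipped, so it returns the smaller value.
fn mutant_flipped(a: u32, b: u32) -> u32 {
    if a < b { a } else { b }
}

/// Mutant 2: branch removed, always returns the first argument.
fn mutant_first(a: u32, _b: u32) -> u32 {
    a
}

/// A weak suite: passes for the real code, but only checks equal inputs.
fn weak_suite(f: MaxFn) -> bool {
    f(3, 3) == 3
}

/// A stronger suite: also checks both orderings of unequal inputs.
fn strong_suite(f: MaxFn) -> bool {
    f(3, 3) == 3 && f(1, 2) == 2 && f(2, 1) == 2
}

/// Count mutants the suite fails to detect. Survivors point at missing tests.
fn survivors(suite: fn(MaxFn) -> bool, mutants: &[MaxFn]) -> usize {
    mutants.iter().filter(|m| suite(**m)).count()
}
```

Both suites pass against `max_of`, so a green run alone cannot tell them apart. Counting survivors can: the weak suite lets both mutants through, and the stronger one kills both.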

The real cost is verification, not generation.

The parser constraint

/ is division or a regex delimiter depending on what came before it. { } is an anonymous hash or a block. Heredocs are declared mid-line and deliver their body starting on the next line, after the rest of the declaring line has been lexed. BEGIN blocks execute at compile time and change what the parser sees next.
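The slash case can be sketched in a few lines. This is a toy, not perl-lsp's lexer: it tracks only whether the previous token completed an operand, and the names `Prev`, `Slash`, `classify_slash`, and `slash_kinds` are hypothetical. It assumes every bareword is a keyword or function that expects a term next, ignores regex flags, and skips the `%` sigil entirely (which has its own ambiguity with modulus).

```rust
/// Simplified class of the previously lexed token (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Prev {
    /// Start of input.
    Start,
    /// A complete operand: variable, number, closing `)` or `]`, or a finished regex.
    Operand,
    /// An operator, opening bracket, comma, or bareword that expects a term next.
    Operator,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Slash {
    Divide,
    RegexStart,
}

/// Decide how to read `/` from the previous token class alone.
fn classify_slash(prev: Prev) -> Slash {
    match prev {
        // A term just ended, so `/` has to be the division operator.
        Prev::Operand => Slash::Divide,
        // A term is expected here, so `/` opens a match.
        Prev::Start | Prev::Operator => Slash::RegexStart,
    }
}

/// Walk a line of Perl and report how each top-level `/` was read.
fn slash_kinds(src: &str) -> Vec<Slash> {
    let chars: Vec<char> = src.chars().collect();
    let mut prev = Prev::Start;
    let mut out = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '/' {
            let kind = classify_slash(prev);
            i += 1;
            if kind == Slash::RegexStart {
                // Skip the regex body up to the closing unescaped `/`.
                while i < chars.len() && chars[i] != '/' {
                    if chars[i] == '\\' {
                        i += 1; // skip the escaped character too
                    }
                    i += 1;
                }
                i += 1; // step over the closing delimiter
                prev = Prev::Operand;
            } else {
                prev = Prev::Operator;
            }
            out.push(kind);
        } else if c == '$' || c == '@' || c.is_ascii_digit() {
            // Sigil plus name, or a number: a complete operand.
            i += 1;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            prev = Prev::Operand;
        } else if c.is_alphabetic() || c == '_' {
            // Bareword: assumed to be a keyword or function expecting a term.
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            prev = Prev::Operator;
        } else if c == ')' || c == ']' {
            i += 1;
            prev = Prev::Operand;
        } else {
            i += 1;
            prev = Prev::Operator;
        }
    }
    out
}
```

Under these assumptions, `slash_kinds("$x / 2")` reads the slash as division and `slash_kinds("split /,/, $line")` reads it as the start of a regex. The bareword assumption is exactly where a real lexer needs far more context than this.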

Perl does not have a grammar. It has a state machine that looks like one. Standard parser tools cannot express that. Tree-sitter assumes context-free grammars. PEG backtracks into exponential walls on Perl’s ambiguity.
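Heredocs show the state-machine shape clearly. The lexer has to remember a tag it saw mid-line, finish that line, and only then read the body. A minimal sketch, assuming only the `<<TAG`, `<<"TAG"`, and `<<'TAG'` forms (no `<<~` indentation, no interpolation, and no proper left-shift disambiguation). `declared_tags` and `extract_heredocs` are hypothetical names, not perl-lsp's API.

```rust
use std::collections::VecDeque;

/// Find heredoc tags declared on one line of code.
/// Simplification: `1 << 2` is ignored only because of the space; `1<<2` would be misread.
fn declared_tags(line: &str) -> Vec<String> {
    let mut tags = Vec::new();
    let mut rest = line;
    while let Some(pos) = rest.find("<<") {
        rest = &rest[pos + 2..];
        // Drop an optional opening quote, then read the identifier.
        let unquoted = rest.trim_start_matches(|c: char| c == '"' || c == '\'');
        let tag: String = unquoted
            .chars()
            .take_while(|c| c.is_ascii_alphanumeric() || *c == '_')
            .collect();
        if !tag.is_empty() {
            tags.push(tag);
        }
    }
    tags
}

/// Return `(tag, body)` pairs in declaration order.
fn extract_heredocs(src: &str) -> Vec<(String, String)> {
    let mut found = Vec::new();
    let mut pending: VecDeque<String> = VecDeque::new();
    let mut lines = src.lines();
    while let Some(line) = lines.next() {
        // State carried past the end of the line: tags still owed a body.
        pending.extend(declared_tags(line));
        while let Some(tag) = pending.pop_front() {
            let mut body = String::new();
            // Consume following lines until the terminator line for this tag.
            for body_line in lines.by_ref() {
                if body_line == tag {
                    break;
                }
                body.push_str(body_line);
                body.push('\n');
            }
            found.push((tag, body));
        }
    }
    found
}
```

The `pending` queue is the part a context-free rule has no place for: two heredocs declared on one line deliver their bodies back to back, after that line's remaining tokens.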

Patrick McKenzie and Steven Zimmerman discuss why DSL parsing hits a wall.

Pest, a PEG parser generator for Rust, lasted one week. We built the state machine. Context-aware lexer, recursive descent parser, sublexing context stack. The same approach Perl’s own toke.c uses. AST, scope resolution, symbol tables, Moose/Moo-aware semantic analysis with MRO-aware method lookup. Everything a compiler does at parse time. Nothing it does at runtime.
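A context stack for sublexing can be pictured with a toy. Code can contain a double-quoted string, which can contain an embedded `@{ ... }` or `${ ... }` block of code, which can contain another string. The sketch below tracks only those two modes and records the stack depth after every push or pop. `Mode` and `stack_depths` are hypothetical names, and the real mode set in perl-lsp is certainly richer than this.

```rust
/// Lexing modes in this toy (hypothetical, heavily reduced).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    /// Ordinary code; counts braces opened inside this frame.
    Code { open_braces: u32 },
    /// Inside a double-quoted, interpolating string.
    DoubleQuoted,
}

/// Return the stack depth after each mode change while scanning `src`.
fn stack_depths(src: &str) -> Vec<usize> {
    let chars: Vec<char> = src.chars().collect();
    // The base frame is top-level code and is never popped.
    let mut stack = vec![Mode::Code { open_braces: 0 }];
    let mut depths = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        let next = chars.get(i + 1).copied();
        let before = stack.len();
        let top = *stack.last().expect("base frame is never popped");
        match top {
            Mode::DoubleQuoted => match (c, next) {
                // Backslash escapes the next character, including `"`.
                ('\\', _) => i += 1,
                // Closing quote: back to the enclosing code frame.
                ('"', _) => {
                    stack.pop();
                }
                // `@{` or `${` switches back to code inside the string.
                ('@', Some('{')) | ('$', Some('{')) => {
                    stack.push(Mode::Code { open_braces: 1 });
                    i += 1;
                }
                _ => {}
            },
            Mode::Code { open_braces } => match c {
                '"' => stack.push(Mode::DoubleQuoted),
                '{' => {
                    *stack.last_mut().unwrap() = Mode::Code { open_braces: open_braces + 1 };
                }
                // The brace that closes an embedded block returns to the string.
                '}' if open_braces == 1 && before > 1 => {
                    stack.pop();
                }
                '}' if open_braces > 0 => {
                    *stack.last_mut().unwrap() = Mode::Code { open_braces: open_braces - 1 };
                }
                _ => {}
            },
        }
        if stack.len() != before {
            depths.push(stack.len());
        }
        i += 1;
    }
    depths
}
```

For the Perl line `print "a @{[ "b" ]} c";` this sketch climbs from code into the string, into the embedded block, into the inner string, and back out again. The same `"` character means "open" or "close" depending purely on which frame is on top.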

The line is execution, not analysis. The remaining edge cases are being hammered out. The test ratchet keeps us moving forward.

Where this goes

It started by building a tree-sitter-perl-compatible AST in Rust. tree-sitter-perl kept breaking our internal AST tooling, and instead of turning Perl off, I decided to replace the parser. The parser needed a test corpus, the corpus needed real-world Perl, and at some point I had a language server and a methodology that compounds.

The receipts lied often enough that verification became its own subsystem. Tokens cost less than CI and senior attention. Perl was the proving ground.

Hear about Parsing Perl Without Perl online with Toronto Perl Mongers, April 30.

Footnotes

  1. 2025 Perl IDE Survey. 130 of 602 submissions checked yes to using a Perl LSP. October–December 2025.

  2. “As is often said, only perl can parse Perl.” The Perl Journal #19, page 28. Archived PDF.
