- Subject: Re: LPeg pattern optimization
- From: Fabio Mascarenhas <mascarenhas@...>
- Date: Sat, 20 Feb 2010 16:17:22 -0200
When using locals you are effectively inlining the pattern for each of your lexical elements (terminals) inside the grammar definitions, so if you define your number terminal in a local and use it three times in your grammar then the resulting grammar will have three copies of the pattern for numbers.
When you moved the terminals to non-terminals inside the grammar you replaced the inlined copies with plain LPeg CALLs. Continuing the previous example, you now have only one copy of the pattern for numbers in your grammar, and three calls to it in the non-terminals that use it.
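A minimal sketch of the two styles, assuming the standard lpeg module is installed. The "triple" rule and the number pattern are made up for illustration and are not taken from the original parser. Both patterns accept the same input; the difference is only in how the compiled grammar is laid out.

```lua
local lpeg = require "lpeg"
local P, R, V = lpeg.P, lpeg.R, lpeg.V

-- Style 1: the terminal lives in a local, so its pattern is copied
-- into the rule at each of the three places where it is used.
local number = R"09"^1
local inlined = P{
  "triple",
  triple = number * P"," * number * P"," * number * -1,
}

-- Style 2: the terminal is a rule of the grammar, so the rule body
-- holds three references (V"number") to a single definition.
local shared = P{
  "triple",
  triple = V"number" * P"," * V"number" * P"," * V"number" * -1,
  number = R"09"^1,
}

-- Both styles recognize exactly the same strings.
print(inlined:match("1,22,333"), shared:match("1,22,333"))
```

The release build of LPeg does not expose the compiled size through its public API, so this sketch only shows that the two forms are equivalent as parsers.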
Whether this is a best practice depends on what you gain in terms of size: inlining patterns lets LPeg apply other kinds of optimization that speed up the actual parsing, while the final size of the grammar is generally unrelated to the parsing time. There are also other considerations. For example, keeping everything in the grammar lets you easily change the definition of a terminal and compile a slightly different grammar, which is impossible if the terminal was defined in a local, because the original definition is baked into the grammar rules that use it.
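The last point can be sketched as follows, again assuming the standard lpeg module. The rule names and the hexadecimal variant are hypothetical. The grammar is kept as a plain Lua table, compiled once, then one terminal is swapped and the table is compiled again.

```lua
local lpeg = require "lpeg"
local P, R, V = lpeg.P, lpeg.R, lpeg.V

-- The grammar is an ordinary table until it is passed to lpeg.P.
local rules = {
  "list",
  list = V"number" * (P"," * V"number")^0 * -1,
  number = R"09"^1,
}

-- Compile the first variant: comma-separated decimal numbers.
local decimal = P(rules)

-- Redefine only the terminal and compile again. The already compiled
-- `decimal` pattern is not affected by the change to the table.
rules.number = R("09", "af")^1
local hex = P(rules)

print(decimal:match("12,34"), decimal:match("1f,2a"), hex:match("1f,2a"))
```

Had the number pattern been a local spliced directly into the list rule, the list rule would have to be rebuilt from scratch to get the second variant.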
--
Fabio Mascarenhas
On Sat, Feb 20, 2010 at 2:34 PM, Ico <lua@zevv.nl> wrote:
Hi,
Recently I've hit the LPeg max-pattern-size limit while working on a
parser, so I've been trying to cut down the size of the resulting
compiled pattern.
My original pattern consisted of a lot of definitions of various lexical
elements and simple constructs in local variables, and a true LPeg
grammar definition using P{..}, only for describing the recursive part
of the parser.
While trying to optimize I have been moving the separate locals inside
the grammar, which seems to have shrunk the total size of the parser to
only 30% of its original size. Quite an optimization!
So, can anybody explain what is happening here, and why there is such a
big difference in the resulting pattern size? Is LPeg doing some kind of
optimization that is only possible inside a grammar definition? Is it
considered best practice to put as much as possible inside the grammar
definition to allow better optimization?