- Subject: Re: LPEG design patterns
- From: Sean Conner <sean@...>
- Date: Wed, 1 May 2019 16:54:09 -0400
It was thus said that the Great joy mondal once stated:
> Hi,
>
> In LPEG, you can pass a variable to your parser using Carg.
>
> BUT if you build your grammar dynamically for EACH string / file you could
> pre-create state for each of your Cmt functions.
>
> - One design is stateless (which has some dubious clarity)
>
> - One design creates a separate grammar for each string, which is likely
> slower?
You missed one: you could always use global variables and avoid both Carg()
and building a separate grammar entirely.
> Question 1: is option 2 an anti-pattern? Do we stick to option 1?
*I* think so, but that's me. I can't speak for others.
> Another issue I feel you have with using Carg is that all your rules have
> Carg all over the place.
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(1)* .................)
It depends upon what you are doing. I just finished a formatting program
(implements a form of OrgMode for my own blogging needs [1]) and I counted
only 15 instances of Carg() in 93 (if I counted correctly) named LPEG rules,
half of them in one rule:

        local style = Cmt(P"//" * Carg(1),stack "i")
                    + Cmt(P"/"  * Carg(1),stack "em")
                    + Cmt(P"**" * Carg(1),stack "b")
                    + Cmt(P"*"  * Carg(1),stack "strong")
                    + Cmt(P"+"  * Carg(1),stack "del")
                    + Cmt(P"="  * Carg(1),stack "code")
                    + Cmt(P"~~" * Carg(1),stack "tt")
                    + Cmt(P"~"  * Carg(1),stack "kbd")

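The stack() helper itself isn't shown in this message (the real one is in
format.lua, linked at [1]). As a rough sketch only, assuming the state
passed as the first extra argument is a table with a `stack` field, a helper
of that shape could toggle a tag open or closed each time its marker is
seen. The body of stack() and the render() wrapper below are hypothetical,
written for illustration, and are not the code from mod_blog:

```lua
local lpeg = require "lpeg"
local P, C, Ct, Carg, Cmt = lpeg.P, lpeg.C, lpeg.Ct, lpeg.Carg, lpeg.Cmt

-- Hypothetical helper: returns a match-time function for Cmt().
-- Cmt calls it with (subject, position, captures...); the only capture
-- here is the state table produced by Carg(1).
local function stack(tag)
  return function(_, pos, state)
    local top = state.stack[#state.stack]
    if top == tag then
      table.remove(state.stack)          -- marker closes the open tag
      return pos, "</" .. tag .. ">"
    else
      table.insert(state.stack, tag)     -- marker opens a new tag
      return pos, "<" .. tag .. ">"
    end
  end
end

-- Longer markers come first because + is ordered choice.
local style = Cmt(P"**" * Carg(1), stack "b")
            + Cmt(P"*"  * Carg(1), stack "strong")

-- Collect tags and plain characters, then join them.
local text = Ct((style + C(P(1)))^0)

local function render(s)
  return table.concat(text:match(s, 1, { stack = {} }))
end

local out = render("a *b* c")
```

The grammar holds no state of its own; everything mutable lives in the
table handed to match(), so each call to render() starts clean.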
> I understand that it's possible to pass multiple arguments with multiple
> states:
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(2)* .................)
>
> local patt3 = P(Carg(3)* .................)
>
> But it doesn't make things any clearer, and now you need to remember which
> number corresponds to which variable.
Easy enough to solve:

        local MACROS = 1
        local STACK  = 2
        local QUOTE  = 3

        local patt1 = P(Carg(MACROS) * ... )
        local patt2 = P(Carg(STACK)  * ... )
        local patt3 = P(Carg(QUOTE)  * ... )

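To make the named-index idea concrete, here is a small runnable sketch. The
extra arguments are whatever follows the init position in the call to
match(), and Carg(n) picks out the n-th one. The rules `macro` and `quote`
and the "$name" syntax are made up for this example; only the three
constant names come from the snippet above:

```lua
local lpeg = require "lpeg"
local P, R, C, Cs, Carg = lpeg.P, lpeg.R, lpeg.C, lpeg.Cs, lpeg.Carg

local MACROS = 1
local STACK  = 2   -- unused below, kept so the numbering matches
local QUOTE  = 3

-- "$name" is replaced by macros[name]; unknown names are left alone.
local name  = C(R"az"^1)
local macro = (P"$" * name * Carg(MACROS)) / function(n, macros)
  return macros[n] or ("$" .. n)
end

-- A double quote is replaced by whatever was passed as extra argument 3.
local quote = (P'"' * Carg(QUOTE)) / function(q) return q end

local text = Cs((macro + quote + P(1))^0)

-- Extra arguments follow the init position: macros, stack, quote.
local out = text:match('say "$greeting"', 1, { greeting = "hello" }, {}, "'")
```

The call site still has to supply the arguments in the right order, so the
constants only help on the grammar side.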
Or pass a table, which is what I'm doing:

        local state =
        {
          email_all = false,
          stack     = {},
          quote     = {},
          abbr      = {},
        }

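A sketch of how a single state table might be used in practice: the grammar
is built once, and each input gets its own fresh table through Carg(1). The
field names are taken from the table above, but the `abbr` rule, the
"(ABC)" syntax, and new_state() are assumptions made up for this example:

```lua
local lpeg = require "lpeg"
local P, R, C, Carg, Cmt = lpeg.P, lpeg.R, lpeg.C, lpeg.Carg, lpeg.Cmt

-- Hypothetical rule: count each abbreviation written as "(ABC)".
-- The match-time function receives the captured word, then the state.
local abbr = Cmt(P"(" * C(R"AZ"^2) * P")" * Carg(1),
  function(_, pos, word, state)
    state.abbr[word] = (state.abbr[word] or 0) + 1
    return pos                       -- succeed, produce no captures
  end)

local doc = (abbr + P(1))^0          -- compiled once, reused below

local function new_state()
  return { email_all = false, stack = {}, quote = {}, abbr = {} }
end

local s1, s2 = new_state(), new_state()
doc:match("see (LPEG) and (LPEG)", 1, s1)
doc:match("only (PEG) here", 1, s2)
```

Because every rule reaches the state by field name, adding a new piece of
state means adding a field, not another argument position.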
> My question is that of confidence and design - how large does your parser
> have to get before this method of arranging stops making sense ?
Hard to say. The parser I wrote is over 700 lines long and the method
doesn't seem all that onerous to me. I prefer the Carg() method over the
others because 1) there's no global data to mess up and 2) I only pay the
compilation overhead once.
-spc (You will not believe the number of times I got the "body may accept
empty string" error while writing this program ... )
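For anyone who hasn't hit that error yet, a minimal way to provoke it, as
far as I can tell, is to repeat a pattern that can itself match nothing.
The sketch below wraps the construction in pcall(), since the exact point
at which LPEG reports the problem and the exact wording of the message may
vary between versions:

```lua
local lpeg = require "lpeg"
local P = lpeg.P

-- P"a"^0 can match the empty string, so looping over it could never
-- make progress; LPEG refuses the pattern instead of spinning.
local ok, err = pcall(function()
  return ((P"a"^0)^0):match("aaa")
end)

-- One fix: make the loop body consume at least one character.
local fixed = (P"a"^1)^0
local pos = fixed:match("aaab")   -- position just past the run of a's
```

The usual cure is the same in a larger grammar: find the rule inside the
repetition that can succeed without consuming input and tighten it.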
[1] https://github.com/spc476/mod_blog/blob/master/Lua/format.lua
LPEG code starts at line 150.
A sample input file is here:
https://github.com/spc476/mod_blog/blob/master/NOTES/testmsg