It was thus said that the Great joy mondal once stated:
>  Hi,
> 
> In LPEG , you can pass a variable to your parser using Carg.
> 
> BUT if you build your grammar dynamically for EACH string / file you could
> pre-create state for each of your Cmt functions.
> 
> - One design is stateless ( which has some dubious clarity )
> 
> - One design creates a separate grammar for each string , which is likely
> slower ?

  You missed one---you could always use global variables and avoid Carg() or
building a separate grammar entirely.

> Question 1 :: is option 2 an anti-pattern ? do we stick to option 1 ?

  *I* think so, but that's me.  I can't speak for others.

> Another issue I feel you have with using Carg is that all your rules have
> Carg  all over the place.
> 
> local patt1 = P(Carg(1)* .................)
> 
> local patt2 = P(Carg(1)* .................)

  It depends upon what you are doing.  I just finished a formatting program
(implements a form of OrgMode for my own blogging needs [1]) and I counted
only 15 instances of Carg() in 93 (if I counted correctly) named LPEG rules,
half of them in one rule:

local style = Cmt(P"//" * Carg(1),stack "i")
            + Cmt(P"/"  * Carg(1),stack "em")
            + Cmt(P"**" * Carg(1),stack "b")
            + Cmt(P"*"  * Carg(1),stack "strong")
            + Cmt(P"+"  * Carg(1),stack "del")
            + Cmt(P"="  * Carg(1),stack "code")
            + Cmt(P"~~" * Carg(1),stack "tt")
            + Cmt(P"~"  * Carg(1),stack "kbd")
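
  To show how a rule like this hangs together, here is a minimal sketch.  The
stack() below is a hypothetical stand-in, not the one from format.lua: it
assumes the state table carries a list of open tags in state.stack and simply
toggles a tag open or closed.  The names render and line are also made up for
the example.

```lua
local lpeg = require "lpeg"
local P, C, Ct, Carg, Cmt = lpeg.P, lpeg.C, lpeg.Ct, lpeg.Carg, lpeg.Cmt

-- Hypothetical stand-in for stack(): returns a match-time function that
-- closes `tag` if it is the most recently opened tag, else opens it.
local function stack(tag)
  return function(subject, pos, state)
    local open = state.stack
    if open[#open] == tag then
      open[#open] = nil
      return pos, "</" .. tag .. ">"   -- keep position, emit closing tag
    end
    open[#open + 1] = tag
    return pos, "<" .. tag .. ">"      -- keep position, emit opening tag
  end
end

-- Longer markers go first so "**" is not read as two "*".
local style = Cmt(P"**" * Carg(1), stack "b")
            + Cmt(P"*"  * Carg(1), stack "strong")

-- Collect emitted tags and plain characters, in order.
local line = Ct((style + C(P(1)))^0)

local function render(text)
  -- A fresh state table per call; the compiled grammar is reused.
  return table.concat(line:match(text, 1, { stack = {} }))
end

print(render("a *b* c"))
```

  Since Carg(1) is the only capture inside each Cmt(), the match-time function
receives the subject, the current position, and then the state table.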

> I understand that its possible to pass multiple arguments with multiple
> states :
> 
> local patt1 = P(Carg(1)* .................)
> 
> local patt2 = P(Carg(2)* .................)
> 
> local patt3 = P(Carg(3)* .................)
> 
> But it doesn't make things any clearer, and now you need to remember which
> number corresponds to which variable.

  Easy enough to solve:

	local MACROS = 1
	local STACK  = 2
	local QUOTE  = 3

	local patt1 = P(Carg(MACROS) * ... )
	local patt2 = P(Carg(STACK)  * ... )
	local patt3 = P(Carg(QUOTE)  * ... )
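
  A runnable sketch of the named-index idea, assuming a made-up "$name" macro
syntax (the rule names and expand() are hypothetical, only there for
illustration).  The extra arguments to match() are what Carg(n) picks up, in
order:

```lua
local lpeg = require "lpeg"
local P, R, C, Cs, Carg = lpeg.P, lpeg.R, lpeg.C, lpeg.Cs, lpeg.Carg

-- Names for the positions of the extra arguments given to match().
local MACROS = 1
local STACK  = 2
local QUOTE  = 3

-- Hypothetical rule: "$name" is replaced by macros[name] if defined,
-- otherwise left untouched.
local name  = C(R("az")^1)
local macro = (P"$" * name * Carg(MACROS))
            / function(n, macros) return macros[n] or ("$" .. n) end
local text  = Cs((macro + P(1))^0)

local function expand(s, macros)
  -- Argument order must agree with MACROS, STACK, QUOTE above.
  return text:match(s, 1, macros, {}, {})
end

print(expand("hi $who", { who = "world" }))
```

  The parentheses around the macro sequence matter, because "/" binds tighter
than "*" in Lua.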

Or pass a table---that's what I'm doing:

	local state =
	{
	  email_all = false,
	  stack     = {},
	  quote     = {},
	  abbr      = {},
	}
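
  As a sketch of how one table can carry all the state, assume a hypothetical
rule that counts abbreviations (runs of two or more capitals) into state.abbr.
The rule and the collect() helper are invented for this example; only the
shape of the state table comes from the above.

```lua
local lpeg = require "lpeg"
local P, R, C, Carg, Cmt = lpeg.P, lpeg.R, lpeg.C, lpeg.Carg, lpeg.Cmt

-- Hypothetical rule: count each run of two or more capitals in state.abbr.
local abbr = Cmt(C(R("AZ")^2) * Carg(1), function(subject, pos, word, state)
  state.abbr[word] = (state.abbr[word] or 0) + 1
  return pos  -- succeed and continue from the current position
end)

-- Compiled once, shared by every call to collect().
local scan = (abbr + P(1))^0

local function collect(text)
  -- Each parse gets its own state, so nothing leaks between inputs.
  local state = { email_all = false, stack = {}, quote = {}, abbr = {} }
  scan:match(text, 1, state)
  return state
end

local s = collect("LPEG and PEG and LPEG")
print(s.abbr.LPEG, s.abbr.PEG)
```

  Every rule only ever needs Carg(1), and adding a new piece of state is just
another field in the table.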
	
> My question is that of confidence and design - how large does your parser
> have to get before this method of arranging stops making sense ?

  Hard to say.  The parser I wrote is over 700 lines long and the method
doesn't seem all that onerous to me.  I prefer the Carg() method over the
others because 1) there's no global data to mess up and 2) I only pay the
compilation overhead once.

  -spc (You will not believe the number of times I got the "body may accept
	empty string" error while writing this program ... )

[1]	https://github.com/spc476/mod_blog/blob/master/Lua/format.lua

	LPEG code starts at line 150.

	A sample input file is here:

	https://github.com/spc476/mod_blog/blob/master/NOTES/testmsg