- Subject: Re: Elegant design for creating error messages in LPEG parser
- From: joy mondal <joykrishnamondal@...>
- Date: Mon, 8 Apr 2019 11:09:30 +0400
It seems everything lpeg.Cf can do can be done with lpeg.Cmt.
Under what circumstances would using lpeg.Cmt INSTEAD of lpeg.Cf be considered a severe design failure?
I tried using lpeg.Cf recursively and it's quite convoluted.
For parsing things like this:
It's quite a bit easier with Cmt, since I can create an empty table (state) at the start of the loop. With Cf you can't be sure whether you are at the start, middle, or end of the loop.
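For comparison, here is a minimal sketch of the two styles side by side, modelled on the key=value list example from the LPeg manual. The names list_cf and list_cmt are hypothetical, and it assumes an LPeg version that still provides lpeg.Cf. With Cf the accumulator table has to be seeded by a capture (Ct("")), whereas with Cmt the function can simply create its own table when it runs.

```lua
local lpeg = require "lpeg"
local C, Cf, Cg, Cmt, Ct = lpeg.C, lpeg.Cf, lpeg.Cg, lpeg.Cmt, lpeg.Ct
local R, S = lpeg.R, lpeg.S

local space = S" \t"^0
local name  = C(R("az", "AZ")^1) * space
local sep   = S",;" * space
-- Each pair yields two capture values: the key and the value.
local pair  = Cg(name * "=" * space * name) * sep^-1

-- Fold style: Ct("") supplies the initial (empty) accumulator table,
-- and rawset(t, k, v) is applied once per pair, returning t each time.
local list_cf = Cf(Ct("") * pair^0, rawset)

-- Match-time style: collect the flat key/value sequence with Ct, then
-- build the result in a table created inside the function itself.
local list_cmt = Cmt(Ct(pair^0), function(_, position, flat)
  local t = {}
  for i = 1, #flat, 2 do
    t[flat[i]] = flat[i + 1]
  end
  return position, t   -- succeed at the current position, capture t
end)

local a = list_cf:match("a=b, c = hi; next = pi")
local b = list_cmt:match("a=b, c = hi; next = pi")
print(a.next, b.next)
```

Both patterns should produce the same table here. The difference is mainly where the state lives, and that the Cmt function runs during matching rather than after the whole match has succeeded.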
I had a look at the MoonScript code base (written using LPEG) and there seems to be no usage of lpeg.Cf.
What I am trying to find is the minimum set of functions that is needed to use LPEG.
Up until now I haven't found a function that cannot be used uniquely in a given situation, so I am quite curious to be proven wrong.
On 2019-04-03 9:50 p.m., Sean Conner wrote:
> I've also used it to fail a pattern that would otherwise match. It is
> possible to come up with a pattern that matches the numbers between 0 and
> 255 but it's quite involved and part of it looks like:
> dec_octet = P"25" * R"05"
> + P"2" * R"04" * R"09"
> + P"1" * ...
> I found it easier to use Cmt() instead:
> local dec_octet = Cmt(DIGIT^1,function(_,position,capture)
>   local n = tonumber(capture)
>   if n < 256 then
>     return position
>   end
> end)
> When the string of digits is greater than 255, this returns nil, which
> causes the match to fail. Doing this:
> dec_octet = DIGIT^1
>           / function(c)
>               local n = tonumber(c)
>               if n < 256 then
>                 return c
>               end
>             end
> won't cause the match to fail---instead it will return nil as the captured
> value.
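To make the difference concrete, here is a small runnable sketch of both variants. DIGIT is assumed to be R"09", since the quoted snippets do not define it. The names octet_cmt and octet_fn are hypothetical, and the trailing Cp() is added only to show whether matching carried on past the number.

```lua
local lpeg = require "lpeg"
local Cmt, Cp, R = lpeg.Cmt, lpeg.Cp, lpeg.R

local DIGIT = R"09"   -- assumption: the quoted code does not define DIGIT

-- Match-time capture: returning nothing makes the match itself fail.
local octet_cmt = Cmt(DIGIT^1, function(_, position, capture)
  if tonumber(capture) < 256 then
    return position
  end
end)

-- Function capture: the function only runs after the match has already
-- succeeded, so it can change the captured values but cannot reject.
local octet_fn = DIGIT^1 / function(c)
  if tonumber(c) < 256 then
    return c
  end
end

print((octet_cmt * Cp()):match("255"))  -- succeeds, Cp() reports position 4
print((octet_cmt * Cp()):match("256"))  -- nil: the whole match failed
print((octet_fn  * Cp()):match("255"))  -- "255" and position 4
print((octet_fn  * Cp()):match("256"))  -- still succeeds, Cp() reports 4
```

So a pattern sequenced after octet_fn keeps matching even when the number is out of range, while anything after octet_cmt is never reached.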
Unrelated, but, those aren't strictly equivalent:
tonumber("000001") --> 1
1 < 256, but "000001" doesn't follow the
This may or may not be a problem depending on your use-case. (e.g. \000
escapes in Lua take 3 digits and fail on >255, but some hypothetical
language could take up to 3 digits less than 255 such that e.g. \999
would be \99 followed by a literal 9)
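A rough sketch of how those variants could be written with Cmt follows. All pattern names are hypothetical, DIGIT is again assumed to be R"09", and the "longest prefix" rule is only one reading of the hypothetical language described above.

```lua
local lpeg = require "lpeg"
local Cmt, R = lpeg.Cmt, lpeg.R

local DIGIT = R"09"

-- Shared range check: keep the position only for values below 256.
local function below_256(_, position, digits)
  if tonumber(digits) < 256 then
    return position
  end
end

-- Lua-style: consume up to three digits, then fail if the value is too big.
local strict = Cmt(DIGIT * DIGIT^-2, below_256)

-- Hypothetical fallback style: try 3, then 2, then 1 digits, and take the
-- first prefix whose value is in range.
local longest = Cmt(DIGIT * DIGIT * DIGIT, below_256)
              + Cmt(DIGIT * DIGIT, below_256)
              + Cmt(DIGIT, below_256)

-- Reject leading zeros: "0" is accepted, "000001" is not.
local no_leading_zero = Cmt(DIGIT^1, function(_, position, digits)
  local canonical = digits == "0" or digits:sub(1, 1) ~= "0"
  if canonical and tonumber(digits) < 256 then
    return position
  end
end)

print(strict:match("999"))             -- nil
print(longest:match("999"))            -- 3, i.e. "99" matched, "9" left over
print(no_leading_zero:match("000001")) -- nil
```

Which of these is right depends entirely on the grammar being parsed. The point is only that the check moves into the Cmt function, so the pattern stays short.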