
• Subject: Re: lpeg as a part of lua (was: An introduction to Lua)
• From: Sean Conner <sean@...>
• Date: Sun, 22 Oct 2017 23:36:24 -0400

```
It was thus said that the Great Jonathan Goble once stated:
>
> Okay, interesting. So that particular pattern, at least, can be converted
> to LPeg. However, that was used merely as an example of a more general
> concern, which is that there are things matchable with Lua patterns but not
> matchable with LPeg.
>
> Is that true or not? If not, then that removes one of my objections to LPeg
> replacing standard patterns (though the first objection, regarding the
> difficulty of LPeg, would still stand).

I'm looking through both the Lua patterns and LPeg.  Let's see ...

Character class:

x	lpeg.P("x")
.	lpeg.P(1)
%a	lpeg.R("AZ","az")
%A	lpeg.P(1) - lpeg.R("AZ","az")
%c	lpeg.R("\0\31","\127\127")
%C	lpeg.R(" ~","\128\255")
%d	lpeg.R"09"
%D	lpeg.P(1) - lpeg.R"09"
%g	lpeg.R"!~"
%G	lpeg.P(1) - lpeg.R"!~"
%l	lpeg.R"az"
%L	lpeg.P(1) - lpeg.R"az"
...

Yeah, I think character classes are well covered by both.  So are sets and
complements of sets:

[1aA]	lpeg.S"1aA"
[^1aA]	lpeg.P(1) - lpeg.S"1aA"

Pattern items:

x	lpeg.P"x"
x*	lpeg.P"x"^0
x+	lpeg.P"x"^1
x-	Erm ...
x?	lpeg.P"x"^-1
%n	lpeg.Cmt(lpeg.Cb"n", cmp)	-- after lpeg.Cg(patt,"n"); cmp compares at match time
%b()	Yes, see below ...
%f[set]	Erm ... I think -lpeg.B(set) * #set

So the two patterns that are not trivial are "x-" and "%b()".  The latter
is straightforward:

lpeg.P {
  "start",
  char  = (lpeg.P(1) - lpeg.S"()") + lpeg.V"start",
  start = lpeg.P"(" * lpeg.V"char"^0 * lpeg.P")"
}

and in fact, isn't restricted to a single character per delimiter:

lpeg.P {
  "start",
  char  = (lpeg.P(1) - (lpeg.P"<q>" + lpeg.P"</q>")) + lpeg.V"start",
  start = lpeg.P"<q>" * lpeg.V"char"^0 * lpeg.P"</q>"
}

I would have to play around with the %f[set] pattern, not being terribly
familiar with it, but I think the LPeg I have for it is correct if I
understand the documentation.

The other pattern, "x-" is not quite straightforward, but it can be worked
around (as I did for your sample pattern) by knowing what the pattern is,
and what stops it.  Perhaps an automatic translation could do a "look
ahead" for the next pattern (or patterns) and generate an LPeg expression,
but I tend to reach for LPeg over Lua patterns.  Anyway ...

Captures are easy:

()aa()		lpeg.Cp() * lpeg.P"aa" * lpeg.Cp()
(bbb)		lpeg.C("bbb")
(a*(.)%w(%s*))	lpeg.C(
		    lpeg.P"a"^0				-- a*
		  * lpeg.C(1)				-- (.)
		  * lpeg.R("AZ","az","09")		-- %w
		  * lpeg.C(lpeg.S" \t\v\f\r\n"^0)	-- (%s*)
		)

I'm thinking there's not anything that Lua patterns do that you can't do
in LPeg, but there are certainly things you can do in LPeg that you can't do
using Lua patterns.

And I contend that the only reason LPeg looks difficult is that you are
not familiar with it.  I personally find it difficult to read Lua patterns
(and even regexes, which tend to look like line noise to me) but that's because
I rarely use them, instead using LPeg.

-spc

```
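
The three items the post singles out ("%b()", "x-" and "%f[set]") can be checked side by side. What follows is only a sketch, not the poster's code: the helper names (`balanced`, `lazy`, `word_starts`, `find_group`, `bracketed`, `all_starts`) and the sample strings are made up for illustration, and the frontier translation `-lpeg.B(set) * #set` assumes an LPeg version that provides `lpeg.B`. The stock-pattern half runs on plain Lua; the LPeg half is skipped if the module cannot be loaded.

```lua
-- Stock Lua patterns: the three items discussed in the post.
local function balanced(s)        -- %b(): first balanced (...) group
  return s:match("%b()")
end

local function lazy(s)            -- .-: shortest run before the closing "]"
  return s:match("%[(.-)%]")
end

local function word_starts(s)     -- %f[%a]: positions where a run of letters begins
  local out = {}
  for pos in s:gmatch("%f[%a]()") do
    out[#out + 1] = pos
  end
  return out
end

-- LPeg translations, exercised only when the module can be loaded.
local ok, lpeg = pcall(require, "lpeg")
if ok then
  local P, S, R, V, B = lpeg.P, lpeg.S, lpeg.R, lpeg.V, lpeg.B
  local C, Cp, Ct = lpeg.C, lpeg.Cp, lpeg.Ct

  -- %b(): the grammar from the post, wrapped in a "search" grammar
  -- because lpeg.match is anchored at the start of the subject.
  local group = P{
    "start",
    char  = (P(1) - S"()") + V"start",
    start = P"(" * V"char"^0 * P")",
  }
  local find_group = P{ C(group) + 1 * V(1) }

  -- x-: state what stops the repetition instead of relying on backtracking.
  local bracketed = P"[" * C((P(1) - P"]")^0) * P"]"

  -- %f[%a]: the previous character is not a letter, the next one is.
  local letter = R("AZ", "az")
  local frontier = -B(letter) * #letter
  local all_starts = Ct(((frontier * Cp())^-1 * P(1))^0)

  -- Both sides should agree on these (hypothetical) sample subjects.
  assert(find_group:match("f(a(b)c) d)") == balanced("f(a(b)c) d)"))
  assert(bracketed:match("[a][b]") == lazy("[a][b]"))
  assert(table.concat(all_starts:match("hi, lua!"), ",")
         == table.concat(word_starts("hi, lua!"), ","))
end

print(balanced("f(a(b)c) d)"))
print(lazy("[a][b]"))
print(table.concat(word_starts("hi, lua!"), ","))
```

The "x-" case shows the design difference most clearly: the Lua pattern backtracks until the rest matches, while the LPeg version has to name the stopping condition up front (`P(1) - P"]"`).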