- Subject: lpeg.Cg, lpeg.Cb, and how to visualize what they do
- From: Eduardo Ochs <eduardoochs@...>
- Date: Sun, 13 Aug 2023 15:50:24 -0300
Hi list,
I'm trying to learn how group captures in lpeg "really" work, and it
seems that they store data in a certain data structure that is
somewhere between a table and a list of values. I will refer to it as
"Ltables", but this is obviously an improvised name...
In the examples below I will use my favorite pretty-printing function,
"PP", whose output is like this:
PP(2, "3", {4, 5, a=6, [{7,8}]=9, [{7,8}]=10})
--> 2 "3" {1=4, 2=5, "a"=6, {1=7, 2=8}=9, {1=7, 2=8}=10}
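For readers who do not have PP at hand, here is a minimal stand-in in plain Lua. It is only a sketch: the names pp_format and pp_line are made up for this example and are not the author's code. It assumes that numeric keys should come first and that other keys can be ordered by their printed form, and it does not handle cyclic tables.

```lua
-- Hypothetical stand-in for PP (assumption: not the author's implementation).
-- Globals are used on purpose so the functions survive between REPL lines.

-- Format one value: strings are quoted, tables are printed as {key=value, ...}.
function pp_format(v)
  if type(v) == "string" then return '"' .. v .. '"' end
  if type(v) ~= "table" then return tostring(v) end
  local items = {}
  for k, val in pairs(v) do
    items[#items + 1] = {k = k, key = pp_format(k), val = pp_format(val)}
  end
  -- Numeric keys first, in increasing order; other keys by their printed form.
  table.sort(items, function(a, b)
    local an, bn = type(a.k) == "number", type(b.k) == "number"
    if an ~= bn then return an end
    if an then return a.k < b.k end
    if a.key ~= b.key then return a.key < b.key end
    return a.val < b.val
  end)
  local parts = {}
  for i, item in ipairs(items) do parts[i] = item.key .. "=" .. item.val end
  return "{" .. table.concat(parts, ", ") .. "}"
end

-- Format all arguments (including nils) separated by spaces.
function pp_line(...)
  local parts = {}
  for i = 1, select("#", ...) do parts[i] = pp_format((select(i, ...))) end
  return table.concat(parts, " ")
end

print(pp_line(2, "3", {4, 5, a=6}))
--> 2 "3" {1=4, 2=5, "a"=6}
```

With this stand-in, lpeg.pm could be written as function (pat, str) print(pp_line(pat:match(str or ""))) end.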
If we run this in a REPL,
lpeg = require "lpeg"
B,C,P,R,S,V = lpeg.B,lpeg.C,lpeg.P,lpeg.R,lpeg.S,lpeg.V
Cb,Cc,Cf,Cg = lpeg.Cb,lpeg.Cc,lpeg.Cf,lpeg.Cg
Cp,Cs,Ct = lpeg.Cp,lpeg.Cs,lpeg.Ct
Carg,Cmt = lpeg.Carg,lpeg.Cmt
lpeg.pm = function (pat, str) PP(pat:match(str or "")) end
(Cc("a","b") * Cc("c","d")) :pm()
(Cc("a","b") * Cc("c","d"):Cg"e") :pm()
(Cc("a","b") * Cc("c","d"):Cg"e") :Ct():pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :Ct():pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e") :pm()
we get this output:
> (Cc("a","b") * Cc("c","d")) :pm()
"a" "b" "c" "d"
> (Cc("a","b") * Cc("c","d"):Cg"e") :pm()
"a" "b"
> (Cc("a","b") * Cc("c","d"):Cg"e") :Ct():pm()
{1="a", 2="b", "e"="c"}
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :Ct():pm()
{1="a", 2="b", 3="f", "e"="c"}
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
{1="a", 2="b", 3="f", 4="c", 5="d", "e"="c"}
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e") :pm()
"a" "b" "f" "c" "d"
>
If we define a table like this,
{20, 30, a=40, a=50, 60}
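then the second assignment to "a" will override the first one (at
least in practice: the Lua reference manual leaves the order of the
assignments in a constructor undefined, which only matters for
repeated keys like this one); lpeg.Cg does something similar to
that...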
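A small check of this in plain Lua. The version with explicit assignments is the one whose result the language guarantees; for the constructor with the repeated key the sketch only prints what the interpreter at hand produces. The variable names t and u are just for this example.

```lua
-- Constructor with a repeated key, as in the text above.
t = {20, 30, a=40, a=50, 60}

-- Positional entries get the indices 1, 2, 3 regardless of the keyed ones.
print(t[1], t[2], t[3])   --> 20 30 60

-- Which assignment to "a" wins is whatever this interpreter does;
-- the line below just shows it.
print(t.a)

-- The same idea with explicit assignments, where "last one wins"
-- is guaranteed by ordinary assignment semantics.
u = {20, 30, 60}
u.a = 40
u.a = 50
print(u.a)                --> 50
```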
Let me use this notation for Ltables. This lpeg pattern
Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f"
matches the empty string, and returns this Ltable:
{."a" "b" ["e"]={."c" "d".} "f".}
Ltables can be coerced both to tables, by lpeg.Ct, and to lists of
values. The output of
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :Ct():pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
(Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e") :pm()
is:
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :pm()
"a" "b" "f"
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f") :Ct():pm()
{1="a", 2="b", 3="f", "e"="c"}
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
{1="a", 2="b", 3="f", 4="c", 5="d", "e"="c"}
> (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e") :pm()
"a" "b" "f" "c" "d"
>
In my (current) way of thinking this lpeg pattern
Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"
returns this Ltable,
{."a" "b" ["e"]={."c" "d".} "f" ["e"].}
where the ["e"]={."c" "d".} stores an Ltable in the entry with the key
"e" in the current Ltable, and the ["e"] at the end reads the value
stored in the key "e", coerces it to a list of values, and adds these
values to the current Ltable...
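The mental model above can be made concrete with a toy implementation in plain Lua. This is only a sketch of the notation used in this message, not of how lpeg works internally: the representation ({group=..., items=...} for a named entry and {backref=...} for a read) and the function names are all made up here. It assumes a flat Ltable, so a read only looks for entries stored earlier in the same Ltable.

```lua
-- Toy model of the "Ltable" notation (assumption: NOT lpeg's internals).
-- An Ltable is a Lua list whose items are one of:
--   a plain value                  "a"
--   {group="e", items=<Ltable>}    ["e"]={. ... .}
--   {backref="e"}                  ["e"]

-- Find the most recent entry named `name` among items 1..upto.
function ltable_find(lt, name, upto)
  for i = upto, 1, -1 do
    local item = lt[i]
    if type(item) == "table" and item.group == name then return item.items end
  end
end

-- Coerce an Ltable to a list of values.
function ltable_values(lt)
  local out = {}
  for i, item in ipairs(lt) do
    if type(item) == "table" and item.group then
      -- a named entry contributes no values by itself
    elseif type(item) == "table" and item.backref then
      local found = assert(ltable_find(lt, item.backref, i - 1), "no such entry")
      for _, v in ipairs(ltable_values(found)) do out[#out + 1] = v end
    else
      out[#out + 1] = item
    end
  end
  return out
end

-- Coerce an Ltable to a table: positional values, plus one field per
-- named entry holding the first value of that entry.
function ltable_to_table(lt)
  local t = ltable_values(lt)
  for _, item in ipairs(lt) do
    if type(item) == "table" and item.group then
      t[item.group] = ltable_values(item.items)[1]
    end
  end
  return t
end

-- {."a" "b" ["e"]={."c" "d".} "f" ["e"].}
example = {"a", "b", {group="e", items={"c", "d"}}, "f", {backref="e"}}
vals = ltable_values(example)
tbl  = ltable_to_table(example)
print(table.concat(vals, " "))   --> a b f c d
print(#tbl, tbl.e)               --> 5 c
```

The two coercions reproduce the outputs shown earlier for the pattern with Cb"e", with and without :Ct(), which is all this model is meant to check.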
Questions:
==========
What is the official name of this data structure? Is there a place in
which it is described in more detail than in the Lpeg manual? Where?
The Lpeg manual only talks about the "most recent group capture", and
it says this:
"Most recent means the last complete outermost group capture with
the given name. A Complete capture means that the entire pattern
corresponding to the capture has matched. An Outermost capture
means that the capture is not inside another complete capture."
here:
http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#cap-g
http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#cap-b
I hope I'm not the only person who finds that too terse... and also,
is there anyone here - besides me - who has tried to draw diagrams to
understand how the operations on captures work? My current diagrams
are here:
http://anggtwu.net/LATEX/2023lpegcaptures.pdf
Thanks in advance!...
Eduardo Ochs
http://anggtwu.net/luaforth.html