Hi list,

I'm trying to learn how group captures in lpeg "really" work, and it
seems that they store data in a certain data structure that is
something between a table and a list of values. I will refer to it as
an "Ltable", but this is obviously an improvised name...

In the examples below I will use my favorite pretty-printing function,
"PP", whose output is like this:

  PP(2, "3", {4, 5, a=6, [{7,8}]=9, [{7,8}]=10})
  --> 2 "3" {1=4, 2=5, "a"=6, {1=7, 2=8}=9, {1=7, 2=8}=10}
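
PP itself is not shown in this message, so here is a minimal sketch of
a function with similar output, in case someone wants to run the
examples below. The names pp_value and pp_string are hypothetical, and
the key ordering (numeric keys first, then all other keys sorted by
their printed form) is an assumption that may differ from what the real
PP does. It does not handle cyclic tables.

```lua
-- Hypothetical stand-in for PP; the real PP may format or order keys differently.
local function pp_value(v)
  if type(v) == "string" then return string.format("%q", v) end
  if type(v) ~= "table" then return tostring(v) end
  -- Render every key=value pair, remembering the key for sorting.
  local items = {}
  for k, val in pairs(v) do
    items[#items + 1] = { key = k, text = pp_value(k) .. "=" .. pp_value(val) }
  end
  -- Assumed order: numeric keys ascending, then the rest by rendered text.
  table.sort(items, function (a, b)
    local na, nb = type(a.key) == "number", type(b.key) == "number"
    if na and nb then return a.key < b.key end
    if na ~= nb then return na end
    return a.text < b.text
  end)
  local texts = {}
  for i, item in ipairs(items) do texts[i] = item.text end
  return "{" .. table.concat(texts, ", ") .. "}"
end

-- Render all arguments, separated by spaces.
local function pp_string(...)
  local out = {}
  for i = 1, select("#", ...) do out[i] = pp_value((select(i, ...))) end
  return table.concat(out, " ")
end

print(pp_string(2, "3", {4, 5, a = 6}))
--> 2 "3" {1=4, 2=5, "a"=6}
```

Defining PP = function (...) print(pp_string(...)) end should be enough
to try the patterns below with this stand-in.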

If we run this in a REPL,

  lpeg = require "lpeg"   -- require does not set the global "lpeg" in Lua 5.2+
  B,C,P,R,S,V = lpeg.B,lpeg.C,lpeg.P,lpeg.R,lpeg.S,lpeg.V
  Cb,Cc,Cf,Cg = lpeg.Cb,lpeg.Cc,lpeg.Cf,lpeg.Cg
  Cp,Cs,Ct    = lpeg.Cp,lpeg.Cs,lpeg.Ct
  Carg,Cmt    = lpeg.Carg,lpeg.Cmt
  lpeg.pm     = function (pat, str) PP(pat:match(str or "")) end

  (Cc("a","b") * Cc("c","d"))                           :pm()
  (Cc("a","b") * Cc("c","d"):Cg"e")                     :pm()
  (Cc("a","b") * Cc("c","d"):Cg"e")                :Ct():pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")        :Ct():pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e")     :pm()

we get this output:

  >   (Cc("a","b") * Cc("c","d"))                           :pm()
   "a" "b" "c" "d"
  >   (Cc("a","b") * Cc("c","d"):Cg"e")                     :pm()
   "a" "b"
  >   (Cc("a","b") * Cc("c","d"):Cg"e")                :Ct():pm()
   {1="a", 2="b", "e"="c"}
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")        :Ct():pm()
   {1="a", 2="b", 3="f", "e"="c"}
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
   {1="a", 2="b", 3="f", 4="c", 5="d", "e"="c"}
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e")     :pm()
   "a" "b" "f" "c" "d"
  >

If we define a table like this,

  {20, 30, a=40, a=50, 60}

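then the second assignment to "a" will override the first one, at least
in the reference implementation (the Lua manual leaves the order of
assignments in a constructor formally undefined); lpeg.Cg does
something similar to that...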
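
A small sketch of that comparison, assuming lpeg is installed. The
variable names t and u are only for illustration. The plain-table half
uses explicit assignments so that the order is unambiguous, and the
lpeg half uses two groups with the same name inside an lpeg.Ct:

```lua
local lpeg = require "lpeg"
local Cc, Cg, Ct = lpeg.Cc, lpeg.Cg, lpeg.Ct

-- Plain table: the later assignment to "a" replaces the earlier one.
local t = {20, 30, 60}
t.a = 40
t.a = 50
print(t.a, #t)   -- t.a is 50, #t is 3

-- lpeg: two named groups "a"; Ct keeps the first value of the later group.
local u = Ct(Cc(20, 30) * Cg(Cc(40), "a") * Cg(Cc(50), "a") * Cc(60)):match("")
print(u.a, #u)   -- u.a is 50, #u is 3
```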

Let me use this notation for Ltables. This lpeg pattern

  Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f"

matches the empty string, and returns this Ltable:

  {."a" "b" e={."c" "d".} "f".}

Ltables can be coerced both to tables, by lpeg.Ct, and to lists of
values. The output of

  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")             :pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")        :Ct():pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
  (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e")     :pm()

is:

  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")             :pm()
   "a" "b" "f"
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f")        :Ct():pm()
   {1="a", 2="b", 3="f", "e"="c"}
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"):Ct():pm()
   {1="a", 2="b", 3="f", 4="c", 5="d", "e"="c"}
  >   (Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e")     :pm()
   "a" "b" "f" "c" "d"
  >

In my (current) way of thinking this lpeg pattern

  Cc("a","b") * Cc("c","d"):Cg"e" * Cc"f" * Cb"e"

returns this Ltable,

  {."a" "b" ["e"]={."c" "d".} "f" ["e"].}

where the ["e"]={."c" "d".} stores an Ltable in the entry with the key
"e" in the current Ltable, and the ["e"] at the end reads the value
stored in the key "e", coerces it to a list of values, and adds these
values to the current Ltable...
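
The mental model above can be sketched in plain Lua, without lpeg. This
is only a toy model of the "Ltable" idea as described in this message,
not a description of how lpeg is implemented. The entry shapes
({name=..., group=...} and {backref=...}) and the function names
to_values and to_table are all made up for the sketch, and it only
looks up groups at the same level.

```lua
-- Toy model of an "Ltable": an array whose entries are either
--   a plain value                 (a positional value),
--   {name = k, group = {...}}     (a named group, like pat:Cg(k)),
--   {backref = k}                 (a back reference, like Cb(k)).
local function has(e, field)
  return type(e) == "table" and e[field] ~= nil
end

-- Coerce an Ltable to a list of values.
-- Also returns the named groups seen at this level.
local function to_values(lt)
  local out, groups = {}, {}
  for _, e in ipairs(lt) do
    if has(e, "name") then
      groups[e.name] = e.group        -- stored; contributes no value here
    elseif has(e, "backref") then
      local g = groups[e.backref]
      if not g then error("back reference '" .. e.backref .. "' not found") end
      for _, v in ipairs(to_values(g)) do out[#out + 1] = v end
    else
      out[#out + 1] = e
    end
  end
  return out, groups
end

-- Coerce an Ltable to a table, in the style of lpeg.Ct: positional values
-- go to 1, 2, ...; each named group stores its first value under its name.
local function to_table(lt)
  local out, groups = to_values(lt)
  for name, g in pairs(groups) do out[name] = to_values(g)[1] end
  return out
end

local lt = {"a", "b", {name = "e", group = {"c", "d"}}, "f", {backref = "e"}}
print(table.concat((to_values(lt)), " "))   --> a b f c d
local t = to_table(lt)
print(#t, t[4], t[5], t.e)                  -- 5, "c", "d", "c"
```

In this sketch the two coercions differ only in what they do with the
named groups: to_values drops them, and to_table keeps the first value
of each one under its name.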


Questions:
==========
What is the official name of this data structure? Is there a place
where it is described in more detail than in the Lpeg manual?
The Lpeg manual only talks about the "most recent group capture", and
it says this:

  "Most recent means the last complete outermost group capture with
   the given name. A Complete capture means that the entire pattern
   corresponding to the capture has matched. An Outermost capture
   means that the capture is not inside another complete capture."

here:

  http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#cap-g
  http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#cap-b
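
To make the quoted definition a bit more concrete, here is a small
sketch, assuming lpeg is installed. The first pattern has two complete
groups named "e", and the back reference should pick the later one. In
the second pattern the group is inside a complete lpeg.Ct capture, so
by the "outermost" rule a Cb placed after the Ct is not expected to see
it; the pcall is there because the failed lookup is reported as an
error. The variable names are only for illustration.

```lua
local lpeg = require "lpeg"
local Cb, Cc, Cg, Ct = lpeg.Cb, lpeg.Cc, lpeg.Cg, lpeg.Ct

-- "Most recent": two complete groups named "e"; Cb sees the later one.
local recent = (Cg(Cc("x"), "e") * Cg(Cc("y"), "e") * Cb("e")):match("")
print(recent)   --> y

-- "Outermost": the group is nested inside a complete Ct capture,
-- so the back reference after the Ct does not find it.
local ok, err = pcall(function ()
  return (Ct(Cg(Cc("inner"), "e")) * Cb("e")):match("")
end)
print(ok)       --> false
```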

I hope I'm not the only person who finds that too terse... Also, is
there anyone here, besides me, who has tried to draw diagrams to
understand how the operations on captures work? My current diagrams
are here:

  http://anggtwu.net/LATEX/2023lpegcaptures.pdf

Thanks in advance!...
  Eduardo Ochs
  http://anggtwu.net/luaforth.html