On Mon, 14 Aug 2023 at 14:29, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>
> > Questions:
> > ==========
> > What is the official name of this data structure? Is there a place in
> > which it is described in more details than in the Lpeg manual? Where?
>
> Maybe I am misunderstanding something, but there is no such data
> structure. Captures are computations. You may try to represent
> the result of one single capture as a data structure, but there is
> no structure to represent the tree of all captures made in a match.
>
> A most archetypal capture is the function capture. What structure
> would represent the following match?
>
>   ((lpeg.C(1) * lpeg.C(1)) / function (a,b)
>     return string.char(string.byte(a) + string.byte(b))
>   end):match(" '")
>
> -- Roberto

Hi Roberto,

I would draw this

  > require "lpeg"
  > f = function (a,b) return string.char(string.byte(a) + string.byte(b)) end
  > = ((lpeg.C(1) * lpeg.C(1)) / f):match("#$")
  G
  >

as this - switch to a monospaced font if needed:

   #   $
  \-/ \-/
  "#" "$"
  \-----/
    "G"

and here is a low-tech animation of how the submatches happen and what
they return:

   #   $    #   $    #   $     #   $
           \-/      \-/ \-/   \-/ \-/
           "#"      "#" "$"   "#" "$"
                              \-----/
                                "G"

Here is a case that I find very strange... no, actually a case that is
simple to understand followed by one that I find very strange. Compare:

  > require "lpeg"
  > = ((lpeg.C(1):Cg"c" * lpeg.C(1):Cg"d") * lpeg.Cb"c"):match"ab"
  a
  > = (lpeg.C(1):Cg"c" * (lpeg.C(1):Cg"d" * lpeg.Cb"c")):match"ab"
  a
  >

= (lpeg.C(1):Cg"c" * (lpeg.C(1):Cg"d" * lpeg.Cb"x")):match"ab"
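The pattern just above refers to a group named "x" that no Cg defines, so it seems to be the case that really does raise the "not found" error. A minimal sketch, assuming lpeg is installed, that wraps the match in pcall so the error can be inspected instead of aborting (the name missing is made up for illustration):

```lua
local lpeg = require "lpeg"

-- Same pattern as above: groups "c" and "d" exist, group "x" does not.
local missing = lpeg.C(1):Cg"c" * (lpeg.C(1):Cg"d" * lpeg.Cb"x")

-- Building the pattern succeeds; the error is raised when the match runs.
local ok, err = pcall(lpeg.match, missing, "ab")
print(ok)    -- false
print(err)   -- an error message about the back reference 'x'
```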

I draw them like this:

    a     b                a     b
  \---/ \---/ \---/      \---/ \---/ \---/
   "a"   "b"  ["c"]       "a"   "b"  ["c"]
  \---/ \---/            \---/ \---/
  c="a" d="b"            c="a" d="b"
  \---------/                  \---------/
  c="a" d="b"                  d="b" ["c"]
  \---------------/            \---------/
  c="a" d="b" ["c"]            not found?
  \---------------/
  c="a" d="b"  "a"

Each ["c"] means "fetch the value associated to the key "c" and append
it to the current Ltable", and the lower underbrace in the first
diagram shows the moment in which that fetch happens and the ["c"] is
replaced by "a". The second diagram shows what I _expected_ that would
happen in the second match; I expected that in this subpattern

  (lpeg.C(1):Cg"d" * lpeg.Cb"c")

the lpeg.Cb"c" would look only at the "Cg"s that happen inside that
subpattern, and I would get an error like this one...

  stdin:1: back reference 'c' not found

...but I was wrong. I _guess_ that what is happening in the second
(...):match"ab" is this:

    a     b
  \---/ \---/ \---/
   "a"   "b"  ["c"]
  \---/ \---/
  c="a" d="b"
        \---------/
        d="b" ["c"]
  \---------------/
  c="a" d="b" ["c"]
  \---------------/
  c="a" d="b"  "a"

and the expansion ["c"] -> "a" is delayed as much as possible...
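One way to probe this guess is to wrap each pattern in lpeg.Ct, so that the named groups become table fields and the value fetched by lpeg.Cb becomes entry 1. The LPeg manual describes lpeg.Cb as using the most recent complete, outermost group with that name, where "outermost" means not inside another complete capture; the last pattern below is an attempt to hide the group in exactly that way. This is only a sketch, assuming lpeg is installed; show and hidden are made-up names.

```lua
local lpeg = require "lpeg"
local C, Cg, Cb, Ct = lpeg.C, lpeg.Cg, lpeg.Cb, lpeg.Ct

-- The two groupings from the transcript, each wrapped in a table capture
-- so that the named groups show up as fields.
local left  = Ct((Cg(C(1), "c") * Cg(C(1), "d")) * Cb("c"))
local right = Ct(Cg(C(1), "c") * (Cg(C(1), "d") * Cb("c")))

-- Hypothetical helper, only for printing the three slots of the diagrams.
local function show(t)
  return string.format('c=%q d=%q [1]=%q', t.c, t.d, t[1])
end

print(show(left:match("ab")))    -- c="a" d="b" [1]="a"
print(show(right:match("ab")))   -- c="a" d="b" [1]="a"

-- Here the group "c" sits inside a complete lpeg.C capture, so the
-- back reference that follows should not be able to see it.
local hidden = C(Cg(C(1), "c")) * Cb("c")
print(pcall(lpeg.match, hidden, "a"))   -- false, plus an error message
```

Both groupings give the same table, which suggests that the parenthesization of the sequence does not limit what lpeg.Cb can see; what hides a group is being enclosed in another capture that has already completed.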

Anyway, I hope that these diagrams make enough sense to the
people who can help me fix them, and who can help me fix my mental
model...

  Thanks in advance =S,
    Eduardo Ochs
    http://anggtwu.net/luaforth.html