> BTW, did you use recursive-descent parsing or packrat parsing?

Neither one nor the other. I used a virtual "parsing machine" that
executes the pattern. Each pattern is actually a program for this
machine. The function "lpeg.print(p)" prints the code of a given
pattern in a (somewhat) readable format. You may have some fun
"debugging" them. (I am writing something about this parsing machine,
but it is still far from presentable.)
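
A minimal sketch of inspecting a pattern's program, for readers who want to try it. The name of the dump function is an assumption that depends on the LPeg version and build: older releases export lpeg.print, while later ones appear to expose lpeg.pcode only in debug builds, so the code checks for both and falls back to a message.

```lua
local lpeg = require "lpeg"

-- A small pattern: the literal "ab", or one or more digits.
local p = lpeg.P("ab") + lpeg.R("09")^1

-- Assumption: the dump function may be called `print` or `pcode`,
-- or may be missing entirely, depending on the LPeg version and build.
local dump = lpeg.print or lpeg.pcode
if dump then
  dump(p)
else
  print("this LPeg build has no pattern dump function")
end
```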


> 5) Is there any difference (efficiency?) between the given methods to 
> search a pattern anywhere in a string?

I am not sure yet. I think it depends on the pattern. Anyway, I would
wait for the implementation to be more stable before taking too many
measurements.
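
For reference, two common ways to search for a pattern anywhere in a string can be sketched as follows. The helper names (anywhere_loop, anywhere_grammar, time_it) are hypothetical, and the timing loop is only a rough way to compare them on your own patterns; no particular result is implied.

```lua
local lpeg = require "lpeg"

-- Method 1: skip characters in a loop until `p` matches.
local function anywhere_loop(p)
  p = lpeg.P(p)
  return (1 - p)^0 * p
end

-- Method 2: a one-rule recursive grammar: match `p` here, or skip one
-- character and try the rule again.
local function anywhere_grammar(p)
  return lpeg.P{ lpeg.P(p) + 1 * lpeg.V(1) }
end

-- Hypothetical helper for rough timing; results depend on the pattern.
local function time_it(patt, subject, rounds)
  local start = os.clock()
  for _ = 1, rounds do patt:match(subject) end
  return os.clock() - start
end

local subject = string.rep("x", 2000) .. "needle"
print("loop   ", time_it(anywhere_loop("needle"), subject, 100))
print("grammar", time_it(anywhere_grammar("needle"), subject, 100))
```

Both functions return a pattern whose match yields the position just after the first occurrence of p.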


> 6) Do I understand correctly that now, the library is no longer able to 
> return key/value pairs in tables? Only numerical indices?
> Andreas Stenius gave an example of parsing XML-like attributes, pushing 
> name and value keys for each part of the pair. I rewrote it for LPeg-0.4 
> but I lost these keys. Not a big problem, but I found it convenient.

I don't remember Andreas' example, but I don't see how to use the label
mechanism for that. The (removed) label mechanism only allowed fixed
labels, not captured ones. Anyway, I guess you can use a function capture
(or an accumulator) to add key-value pairs into tables.
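
The function-capture approach could look roughly like the sketch below. It is not Andreas' example: the attribute syntax (name="value") and the helper name parse_attributes are assumptions chosen for illustration. Each pair is passed to a function that stores it in a table under the captured name.

```lua
local lpeg = require "lpeg"
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

-- Hypothetical helper: collect name="value" pairs into a table keyed
-- by the captured attribute name. Returns nil if the subject does not
-- consist entirely of such pairs.
local function parse_attributes(subject)
  local attrs = {}
  local space = S(" \t\r\n")^0
  local name  = C((R("az", "AZ", "09") + S("_-:"))^1)
  local value = P('"') * C((1 - P('"'))^0) * P('"')
  -- Function capture: receives the two captured strings and stores them.
  local pair  = (name * space * P("=") * space * value) /
                function(k, v) attrs[k] = v end
  local list  = space * (pair * space)^0 * P(-1)
  if list:match(subject) then
    return attrs
  end
  return nil
end

local t = parse_attributes('id="x1" class="big red"')
print(t.id, t.class)
```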


> 7) Well, you implemented Mike's suggestion of improvement to the lpeg.R 
> syntax... Do I have any chance of having my lpeg.Rep(patt, n) function, 
> ideally implemented at VM level (ie. without duplicating the pattern n 
> times)?

I guess not. I would like to have a shortcut for this functionality,
but the implementation would be the same as patt * patt * ... * patt.
If you check the current implementation, you will see that patt^n actually
duplicates the pattern n times. This is not bad, unless n or the pattern
is really large. You see, some compilers even do this to your C code to
optimize it (loop unrolling) ;)
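
A shortcut of this kind can be written in plain Lua on top of the library. The sketch below is a hypothetical helper named rep, not part of LPeg; it simply builds patt * patt * ... * patt with n copies, so the resulting program still grows with n.

```lua
local lpeg = require "lpeg"

-- Hypothetical helper (not part of LPeg): match `patt` exactly n times
-- by concatenating n copies, i.e. patt * patt * ... * patt.
local function rep(patt, n)
  local result = lpeg.P(true)   -- matches the empty string
  for _ = 1, n do
    result = result * patt
  end
  return result
end

local three = rep(lpeg.P("ab"), 3)
print(three:match("ababab"))   -- position just after the third "ab"
print(three:match("abab"))     -- nil: only two copies are present
```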

If the pattern or n is really large, you can use a grammar to
compact the code, but it will be slower; for instance, the following
grammar matches "patt" 625 times:

  { [1] = m.V(2) * m.V(2) * m.V(2) * m.V(2) * m.V(2),
    [2] = m.V(3) * m.V(3) * m.V(3) * m.V(3) * m.V(3),
    [3] = m.V(4) * m.V(4) * m.V(4) * m.V(4) * m.V(4),
    [4] = m.V(5) * m.V(5) * m.V(5) * m.V(5) * m.V(5),
    [5] = patt }

These tricks are better left for those that need them.
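
To check the count, the grammar can be wrapped in lpeg.P and run against a test subject. In this sketch the single character "a" stands in for patt, which is an assumption made only to keep the subject easy to build. Each of rules 1 to 4 calls the next rule five times, giving 5 * 5 * 5 * 5 = 625 matches of patt.

```lua
local lpeg = require "lpeg"
local m = lpeg

local patt = m.P("a")   -- stand-in for "patt"; any pattern would do

-- Rule 1 is the initial rule; rules 1..4 each call the next rule
-- five times, and rule 5 is the pattern itself.
local g = m.P{
  [1] = m.V(2) * m.V(2) * m.V(2) * m.V(2) * m.V(2),
  [2] = m.V(3) * m.V(3) * m.V(3) * m.V(3) * m.V(3),
  [3] = m.V(4) * m.V(4) * m.V(4) * m.V(4) * m.V(4),
  [4] = m.V(5) * m.V(5) * m.V(5) * m.V(5) * m.V(5),
  [5] = patt,
}

print(g:match(string.rep("a", 625)))  -- succeeds: exactly 625 copies
print(g:match(string.rep("a", 624)))  -- nil: one copy short
```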

-- Roberto