- Subject: Re: lpeg v0.4
- From: roberto@... (Roberto Ierusalimschy)
- Date: Sat, 27 Jan 2007 18:06:55 -0200
> BTW, did you use recursive-descent parsing or packrat parsing?
Neither one nor the other. I used a virtual "parsing machine" that
executes the pattern. Each pattern is actually a program for this
machine. The function "lpeg.print(p)" prints the code of a given
pattern in a (somewhat) readable format. You may have some fun
"debugging" them. (I am writing something about this parsing machine,
but it is still far from presentable.)
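To make the idea of "a pattern is a program" concrete, here is a toy sketch in plain Lua. It is not LPeg's actual machine: the instruction names (char, choice, commit, end), the table layout, and the function `run` are all assumptions made up for illustration. It only shows how an ordered choice can be executed with a backtrack stack.

```lua
-- Toy parsing machine (illustrative only; NOT LPeg's real instruction set).
-- A "program" is an array of instructions; `run` returns the position
-- after the match, or nil on failure.
local function run(program, subject)
  local pc, i = 1, 1            -- program counter, subject position
  local stack = {}              -- backtrack entries: { pc = ..., i = ... }
  while true do
    local ins = program[pc]
    local failed = false
    if ins.op == "char" then
      if subject:sub(i, i) == ins.c then
        pc, i = pc + 1, i + 1
      else
        failed = true
      end
    elseif ins.op == "choice" then
      -- remember where to resume if the first alternative fails
      stack[#stack + 1] = { pc = pc + ins.offset, i = i }
      pc = pc + 1
    elseif ins.op == "commit" then
      -- first alternative succeeded: drop the backtrack entry and jump
      stack[#stack] = nil
      pc = pc + ins.offset
    elseif ins.op == "end" then
      return i
    end
    if failed then
      local top = table.remove(stack)
      if not top then return nil end   -- nothing to backtrack to
      pc, i = top.pc, top.i
    end
  end
end

-- Hypothetical program for the ordered choice "ab" / "ac".
local ab_or_ac = {
  { op = "choice", offset = 4 },  -- 1: on failure, resume at 5
  { op = "char", c = "a" },       -- 2
  { op = "char", c = "b" },       -- 3
  { op = "commit", offset = 3 },  -- 4: jump to 7
  { op = "char", c = "a" },       -- 5
  { op = "char", c = "c" },       -- 6
  { op = "end" },                 -- 7
}

print(run(ab_or_ac, "ac"))  -- 3
```

The second alternative is tried only after the first one fails at "b", which is the backtracking behaviour an ordered choice needs.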
> 5) Is there any difference (efficiency?) between the given methods to
> search a pattern anywhere in a string?
I am not sure yet. I think it depends on the pattern. Anyway, I would
wait for the implementation to be more stable before taking too many
measurements.
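For reference, two common ways to search for a pattern anywhere in a string can be sketched as follows. This assumes a recent LPeg release loaded with `require "lpeg"` (the 0.4 API may differ), and the pattern "needle" and the subject string are just placeholders. No claim is made here about which one is faster.

```lua
local lpeg = require "lpeg"

local p = lpeg.P("needle")   -- placeholder pattern

-- Method 1: skip every character that does not start a match, then match p.
local search_loop = (1 - p)^0 * p

-- Method 2: a one-rule recursive grammar: match p here, or else
-- consume one character and try the same rule again.
local search_grammar = lpeg.P{ p + 1 * lpeg.V(1) }

local subject = "hay hay needle hay"
print(search_loop:match(subject))     -- 15
print(search_grammar:match(subject))  -- 15
```

Both return the position just after the first occurrence, or nil when the pattern does not occur.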
> 6) Do I understand correctly that now, the library is no longer able to
> return key/value pairs in tables? Only numerical indices?
> Andreas Stenius gave an example of parsing XML-like attributes, pushing
> name and value keys for each part of the pair. I rewrote it for LPeg-0.4
> but I lost these keys. Not a big problem, but I found it convenient.
I don't remember Andreas' example, but I don't see how to use the label
mechanism for that. The (removed) label mechanism only allowed fixed
labels, not captured ones. Anyway, I guess you can use a function capture
(or an accumulator) to add key-value pairs to tables.
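A minimal sketch of the function-capture approach, assuming a recent LPeg release (the 0.4 API may differ). The attribute syntax (name="value") and the helper `parse_attributes` are hypothetical; they are not taken from Andreas' example.

```lua
local lpeg = require "lpeg"
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

-- Hypothetical helper: parses  name="value" name="value" ...
-- and returns a table keyed by attribute name, or nil on a syntax error.
local function parse_attributes(s)
  local attrs = {}
  local space = S(" \t")^0
  local name  = C(R("az", "AZ")^1)
  local value = P('"') * C((1 - P('"'))^0) * P('"')
  -- Function capture: receives the two captured strings and stores
  -- them as a key/value pair; it produces no capture value itself.
  local pair  = (name * "=" * value) / function(k, v) attrs[k] = v end
  local list  = space * (pair * space)^0 * -1
  if list:match(s) then return attrs end
  return nil
end

local t = parse_attributes('id="42" class="big"')
print(t.id, t.class)
```

The closure writes into `attrs` only when the whole match succeeds, because function captures are evaluated after the match completes.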
> 7) Well, you implemented Mike's suggestion of improvement to the lpeg.R
> syntax... Do I have any chance of having my lpeg.Rep(patt, n) function,
> ideally implemented at VM level (ie. without duplicating the pattern n
> times)?
I guess not. I would like to have a shortcut for this functionality,
but the implementation would be the same as patt * patt * ... * patt.
If you check the current implementation, you will see that patt^n actually
duplicates the pattern n times. This is not bad, unless n or the pattern
is really large. You see, some compilers even do this to your C code to
optimize it (loop unrolling) ;)
If the pattern or n is really large, you can use a grammar to
compact the code, but it will be slower; for instance, the following
grammar matches "patt" 625 times:
{ [1] = m.V(2) * m.V(2) * m.V(2) * m.V(2) * m.V(2),
[2] = m.V(3) * m.V(3) * m.V(3) * m.V(3) * m.V(3),
[3] = m.V(4) * m.V(4) * m.V(4) * m.V(4) * m.V(4),
[4] = m.V(5) * m.V(5) * m.V(5) * m.V(5) * m.V(5),
[5] = patt }
These tricks are better left for those that need them.
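A runnable version of the grammar trick might look like the sketch below, assuming a recent LPeg release. The pattern "ab" and the helper `five` are placeholders introduced only for this example. Each rule calls the next one five times, so rule 1 matches patt 5 * 5 * 5 * 5 = 625 times.

```lua
local lpeg = require "lpeg"
local P, V = lpeg.P, lpeg.V

local patt = P("ab")   -- placeholder pattern

-- Hypothetical helper: five consecutive calls to the given rule.
local function five(rule)
  return V(rule) * V(rule) * V(rule) * V(rule) * V(rule)
end

-- Entry 1 is the initial rule; each level multiplies the count by 5.
local compact = P{
  [1] = five(2),
  [2] = five(3),
  [3] = five(4),
  [4] = five(5),
  [5] = patt,
}

local subject = string.rep("ab", 625)
print(compact:match(subject))  -- 1251
```

The grammar stores only one copy of patt plus twenty rule calls, at the cost of the call overhead on every repetition.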
-- Roberto