- Subject: Re: LPeg 0.10.2: no more captures returned by '#' predicate.
- From: Ico Doornekamp <lua@...>
- Date: Thu, 10 Mar 2011 15:38:37 +0100
* On Thu Mar 10 15:16:58 +0100 2011, Roberto Ierusalimschy wrote:
> > One of the changes from lpeg 0.9 to 0.10.2 is the new behaviour of the
> > and predicate (#) which now no longer keeps its captures. Unfortunately
> > this broke a big (>150 loc) parser in my application, which uses this
> > feature quite a lot.
> >
> > Is there a trick or workaround to make LPeg show similar behaviour as it
> > did in 0.9 ?
>
> You may try run-time captures. (They would have to save the captures
> somewhere.)
Yes, I have already been looking into this, and it looks promising.
> More often than not a positive look-ahead will be matched again outside
> the look-ahead (otherwise the match does not go on). In those cases
> you only have to change the capture from the look-ahead to the non
> look-ahead pattern. (This new form is also slightly more efficient, as
> it avoids some work when the look-ahead fails.)
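A minimal sketch of that rewrite, using a hypothetical `digits` pattern as a stand-in (it is not taken from the actual parser):

```lua
local lpeg = require "lpeg"
local R, C = lpeg.R, lpeg.C

-- Hypothetical stand-in for whatever the look-ahead checks.
local digits = R"09"^1

-- LPeg 0.9 style: the capture sits inside the look-ahead and the same
-- text is then consumed again. From 0.10.2 on this capture is discarded.
local old = #C(digits) * digits

-- Rewritten form: look ahead without capturing, and capture in the
-- pattern that actually consumes the input.
local new = #digits * C(digits)

print(new:match("5062"))
```

This only helps when the text inside the look-ahead is matched again afterwards by the same pattern.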
I understand. I'm still not really sure how to rewrite my actual parser though,
since I'm not sure the above applies to my code. The reason I used the
non-consuming 'and' predicate is that I need to parse certain parts of the
input twice. For example, the input
Via: SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062
is parsed into table captures which result in
{
  "value" = "SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062",
  "name" = "Via",
  "details" = table 0x828fc78 {
    "sent_by" = "79.125.108.118:5062",
    "protocol" = "SIP/2.0/UDP",
    "params" = table 0x81f8968 {
      1 = table 0x81e3298 {
        "key" = "branch",
        "value" = "1234567",
      },
      2 = table 0x81f6330 {
        "key" = "rport",
        "value" = "5062",
      },
    },
  },
},
As you can see the data in the "value" field is also found split up into its
parts in the "details" part of the table. First it is captured as a whole,
after which it is passed to another part of the parser that knows how to deal
with the particular header type and cut it in pieces.
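One way to get this capture-then-reparse behaviour without captures inside '#' is to capture the value as a plain string and run a second match on it from a function capture. The sketch below is only an illustration: the grammar is heavily simplified, and the names `token`, `via_details`, `detail_parsers` and `header` are hypothetical, not taken from the real parser.

```lua
local lpeg = require "lpeg"
local P, R, S, C, Cg, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.Cg, lpeg.Ct

-- Simplified token: letters, digits and a few punctuation characters.
local token = (R("az", "AZ", "09") + S("-._"))^1

-- One ";key=value" parameter, captured as a small table.
local param = Ct(P";" * Cg(C(token), "key") * P"=" * Cg(C(token), "value"))

-- Second-pass parser for the value of a Via header (assumed layout).
local via_details = Ct(
    Cg(C(token * P"/" * token * P"/" * token), "protocol") * P" " *
    Cg(C((1 - P";")^1), "sent_by") *
    Cg(Ct(param^0), "params")) * -1

-- Detail parsers keyed by header name; only Via is sketched here.
local detail_parsers = { Via = via_details }

-- First pass: capture name and whole value, then re-parse the value
-- with the matching detail parser, if there is one.
local header = Ct(
    Cg(C(token), "name") * P":" * P" "^0 *
    Cg(C(P(1)^0), "value")) / function(t)
  local sub = detail_parsers[t.name]
  if sub then t.details = sub:match(t.value) end
  return t
end

local t = header:match(
  "Via: SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062")
print(t.name, t.details.protocol, t.details.sent_by)
```

The value is scanned twice, once by `header` and once by the detail parser, but no capture has to survive a look-ahead, so the result should not depend on how '#' treats captures.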
For now I have taken the pragmatic route of just adding the LPeg 0.9 source file
to the project instead of depending on the version supplied by the operating
system. I'll figure out a better way as soon as the project leaves me some
time.
Thanks,
Ico
--
:wq
^X^Cy^K^X^C^C^C^C