Re: LPeg 0.10.2: no more captures returned by '#' predicate.

Subject: Re: LPeg 0.10.2: no more captures returned by '#' predicate.
From: Ico Doornekamp <lua@...>
Date: Thu, 10 Mar 2011 15:38:37 +0100

* On Thu Mar 10 15:16:58 +0100 2011, Roberto Ierusalimschy wrote:

> > One of the changes from lpeg 0.9 to 0.10.2 is the new behaviour of the
> > and predicate (#) which now no longer keeps its captures. Unfortunately
> > this broke a big (>150 loc) parser in my application, which uses this 
> > feature quite a lot.  
> > 
> > Is there a trick or workaround to make LPeg show similar behaviour as it
> > did in 0.9 ?
> 
> You may try run-time captures. (They would have to save the captures
> somewhere.) 

Yes, I have already been looking into this, and it looks promising.
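
A minimal sketch of that idea, not taken from the actual parser: a match-time
capture (lpeg.Cmt) runs its function immediately, even inside '#', so the
function can stash the value in a Lua table before the predicate throws its
captures away. The names `saved`, `remember` and `word` are made up for
illustration.

```lua
local lpeg = require "lpeg"
local R, C, Cmt = lpeg.R, lpeg.C, lpeg.Cmt

-- Hypothetical side table that receives values seen during look-ahead.
local saved = {}

local word = R("az")^1

-- Match-time capture: the function is called as soon as `word` matches.
-- It stores the captured text and returns true so the match succeeds
-- without producing any values of its own.
local remember = Cmt(C(word), function(subject, pos, w)
  saved[#saved + 1] = w
  return true
end)

-- The look-ahead consumes nothing; the word is then matched "for real".
local patt = #remember * C(word)

local result = patt:match("hello")
print(result, saved[1])
```

One caveat with this approach: the side effect is not undone when LPeg
backtracks, so `saved` can end up holding values from alternatives that
later failed.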

> More often than not a positive look-ahead will be matched again outside
> the look-ahead (otherwise the match does not go on). In those cases
> you only have to change the capture from the look-ahead to the non
> look-ahead pattern.  (This new form is also slightly more efficient, as
> it avoids some work when the look-ahead fails.)
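
To make the suggestion above concrete, here is a small sketch under my own
assumptions (the `name` pattern and the trailing ':' are invented for the
example). The capture moves out of the look-ahead and onto the pattern that
actually consumes the input.

```lua
local lpeg = require "lpeg"
local R, C = lpeg.R, lpeg.C

local name = R("az", "AZ")^1

-- 0.9-style: the capture sits inside the look-ahead. With LPeg 0.10.2 and
-- later the predicate discards it, so this pattern yields no captures.
local old_style = #(C(name) * ":") * name * ":"

-- Rewritten: the look-ahead only tests, the consuming pattern captures.
local new_style = #(name * ":") * C(name) * ":"

local captured = new_style:match("Via: SIP/2.0/UDP")
print(captured)
```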

I understand. I'm still not really sure how to rewrite my actual parser though,
since I'm not sure the above applies to my code. The reason I used the
non-consuming 'and' predicate is that I need to parse certain parts of the
input twice. For example, the input

  Via: SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062 

is parsed into table captures which result in

      {
        "value" = "SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062", 
        "name" = "Via", 
        "details" = table 0x828fc78 {
          "sent_by" = "79.125.108.118:5062", 
          "protocol" = "SIP/2.0/UDP", 
          "params" = table 0x81f8968 {
            1 = table 0x81e3298 {
              "key" = "branch", 
              "value" = "1234567", 
            }, 
            2 = table 0x81f6330 {
              "key" = "rport", 
              "value" = "5062", 
            }, 
          }, 
        }, 
      }, 

As you can see the data in the "value" field is also found split up into its
parts in the "details" part of the table. First it is captured as a whole,
after which it is passed to another part of the parser that knows how to deal
with the particular header type and cut it in pieces.
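
One way to get the same shape of result without '#' is to do the second
parse as a separate match on the captured string. The sketch below is only
an illustration under stated assumptions: the grammar is a heavily
simplified stand-in for the real SIP rules, and `token`, `via`,
`detail_parsers` and `parse_header` are hypothetical names, not those of
the actual parser.

```lua
local lpeg = require "lpeg"
local P, R, S, C, Cg, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.C, lpeg.Cg, lpeg.Ct

-- Simplified token: letters, digits and a few punctuation characters.
local token = (R("az", "AZ", "09") + S("-.!%*_+`'~"))^1

-- ";key=value" parameters, each collected into its own table.
local param = Ct(Cg(C(token), "key") * "=" * Cg(C(token), "value"))

-- Second-stage parser for the value of a Via header.
local via = Ct(
  Cg(C((1 - P" ")^1), "protocol") * P" "^1 *
  Cg(C((1 - P";")^1), "sent_by") *
  Cg(Ct((P";" * param)^0), "params")
) * -1

-- First stage: split "Name: value" and keep the value as one string.
local header = Ct(
  Cg(C(token), "name") * ":" * P" "^0 * Cg(C(P(1)^1), "value")
)

-- Header-specific parsers, looked up by header name.
local detail_parsers = { Via = via }

local function parse_header(line)
  local h = header:match(line)
  if not h then return nil end
  local detail = detail_parsers[h.name]
  if detail then
    -- Parse the already captured value a second time.
    h.details = detail:match(h.value)
  end
  return h
end

local h = parse_header(
  "Via: SIP/2.0/UDP 79.125.108.118:5062;branch=1234567;rport=5062")
print(h.name, h.details.protocol, h.details.sent_by)
```

The trade-off is that the value is scanned twice by two independent
matches, which is roughly what the look-ahead version did anyway.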

For now I have taken the pragmatic route of just adding the LPeg 0.9 source
file to the project instead of depending on the version supplied by the
operating system. I'll figure out a better way as soon as the project leaves
me some time.

Thanks,

Ico
-- 
:wq
^X^Cy^K^X^C^C^C^C
