[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: lpeg: bug or intentional? `… * (Cmt(0,f1) + Cmt(0,f2)) * …` doesn't call f2
- From: nobody <nobody+lua-list@...>
- Date: Sun, 7 Apr 2019 21:30:12 +0200
Hey everyone,
this behavior has surprised me a bit. I was trying to use this to keep
track of the nesting of structures (to produce a backtrace in terms of
grammar rules when stuff fails to match), but it doesn't work as
expected… Simplified example:
(checked in versions 1.0.1 and 1.0.2)
lpeg = require "lpeg" ; C, Cmt, P = lpeg.C, lpeg.Cmt, lpeg.P
depth = 0
enter = Cmt( 0, function(_,p) depth = depth + 1 ; return p end )
leave = Cmt( 0, function(_,p) depth = depth - 1 ; return p end )
-- wrap pattern with enter/leave (pattern simplified to "foo" here)
-- leave should always ping once (on success or by backtracking)
p = (enter + leave) * C"foo" * leave
-- tests
p:match "foo" ; print( depth ) --> 0 - ok!
p:match "bar" ; print( depth ) --> 1 - huh?!?
In other words, `leave` doesn't get called when backtracking. As a
further check, manually rewriting this as
q = (Cmt( 1, function( _,p,c )
depth = depth + 1
if c == "f" then return p end
end) + leave) * (P"oo"/"foo") * leave
(shifting the test for "f" into the Cmt, thus turning the 0+0 choice
into a 1+0 choice) results in the expected output
q:match "foo" ; print( depth ) --> 0
q:match "bar" ; print( depth ) --> 0
which makes me believe that the version at the top correctly describes
the intended behavior.
Looking at the debug output, the last step of p:match "bar" results in
varying huge numbers for the PC:
s: |bar| stck:0, dyncaps:0, caps:0 -17672968042060: giveup
s: |bar| stck:0, dyncaps:0, caps:0 -17849526133724: giveup
s: |bar| stck:0, dyncaps:0, caps:0 -17844495977256: giveup
etc.
(They stay constant over repeated runs against the same pattern, but
change every time you re-create `p` or re-start Lua. So maybe this is
some uninitialized / misinterpreted memory problem?)
Irrespective of whether this is a bug or not, does anyone know of a way
to get an equivalent of the `(enter+leave)*pat*leave` pattern to work?
-- nobody
Full debug info: (nothing else below)
The :ptree() of p seems to look sane (as far as I can tell…):
[1 = function 2 = function 3 = function ]
seq
choice
run-time
true
run-time
true
seq
capture kind: 'simple' key: 0
seq
char 'f'
seq
char 'o'
char 'o'
run-time
true
The :pcode() for reference to accompany the next block:
[1 = function 2 = function 3 = function ]
00: testany -> 9
02: choice -> 9
04: opencapture group (idx = 1)
05: any
06: closeruntime
07: commit -> 11
09: opencapture group (idx = 2)
10: closeruntime
11: char 'f'
12: char 'o'
13: char 'o'
14: fullcapture simple (size = 3) (idx = 0)
15: opencapture group (idx = 3)
16: closeruntime
17: end
The -DDEBUG trace: (with ((comments)) added)
((the first part looks ok i guess – do the `enter`))
-------------------------------------
======
=======
s: |bar| stck:1, dyncaps:0, caps:0 00: choice -> 6
-------------------------------------
======
=======
s: |bar| stck:2, dyncaps:0, caps:0 02: opencapture group (idx = 1)
-------------------------------------
======
group (idx: 1 - size: 0) -> 0x5563448d07a8
=======
s: |bar| stck:2, dyncaps:0, caps:1 03: closeruntime
-------------------------------------
======
=======
s: |bar| stck:2, dyncaps:0, caps:0 04: commit -> 8
((now f doesn't match b, so backtrack… - ok))
-------------------------------------
======
=======
s: |bar| stck:1, dyncaps:0, caps:0 08: char 'f'
**FAIL**
((huge (random!) negative number for PC - this looks fishy))
-------------------------------------
======
=======
s: |bar| stck:0, dyncaps:0, caps:0 -17672968042060: giveup