I also tested LPeg versions older than 1.0.0. The bug seems to be
present in LPeg 0.12.1 and later.

The problem happens when the parse tree is compiled to byte code.
I built LPeg 1.0.0 in debug mode and modified the test case to print
the tree before it's compiled.
I also reduced the test case further.


#!/usr/bin/env lua
lpeg = require "lpeg"

p = lpeg.P{
  "Line";
  P = lpeg.P"a",
  Q = lpeg.V"P",
  Line = lpeg.P"x" * lpeg.C(lpeg.V"P"),
}

p:ptree()
p:match("xx")


A successfully compiled tree looks like:

[1 = P  2 = P  3 = Line  ]
grammar 3
  rule n: 0  key: 3 -- Line = lpeg.P"x" * lpeg.C(lpeg.V"P"),
    seq
      char 'x'
      capture cap: 5  key: 0  n: 0
        call key: 1
  rule n: 1  key: 0  -- Q = lpeg.V"P"
    call key: 2
  rule n: 2  key: 2  -- P = lpeg.P"a"
    char 'a'

While a tree that fails compilation looks like:

[1 = P  2 = P  3 = Line  ]
grammar 3
  rule n: 0  key: 3 -- Line = lpeg.P"x" * lpeg.C(lpeg.V"P"),
    seq
      char 'x'
      capture cap: 5  key: 0  n: 0
        call key: 1
  rule n: 1  key: 2 -- P = lpeg.P"a"
    char 'a'
  rule n: 2  key: 0 -- Q = lpeg.V"P"
    call key: 2
Segmentation fault

I think the rule ordering is different because of string (key) hash
randomization.
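
To illustrate the ordering point, here is a minimal sketch in plain Lua,
without LPeg. It assumes that the rules of a grammar are collected by
traversing the grammar table, which I have not verified in detail. The
helper name key_order and the placeholder rule values are made up for
illustration.

```lua
-- Hypothetical helper, not part of LPeg: record the order in which
-- pairs() visits the string keys of a grammar-like table.
local function key_order(t)
  local order = {}
  for k in pairs(t) do
    if type(k) == "string" then
      order[#order + 1] = k
    end
  end
  return order
end

-- Same rule names as the reduced test case; the values are placeholders.
local grammar = { "Line", P = "a", Q = "V'P'", Line = "x * C(V'P')" }

-- The Lua manual leaves the traversal order unspecified. If the string
-- hash is seeded per state, the order may change from one run to the next.
print(table.concat(key_order(grammar), " "))
```

Running this several times in fresh interpreters should show whether the
key order varies on a given build.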

If I understand things correctly (and I may not), it seems that the
"Q" rule references itself instead of "P" (call key: 2 instead of call
key: 1 in the failing tree above). This leads to infinite recursion
and stack exhaustion during compilation.
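
The effect of a self-referencing rule can be sketched with a toy model in
plain Lua. This is only an illustration under my own assumptions, not
LPeg's actual tree representation. The names has_captures, tag and target
are made up here.

```lua
-- Toy model, NOT LPeg's real data structures: each rule body is either a
-- leaf or a call to another rule, referenced by position.
local function has_captures(rules, i)
  local node = rules[i]
  if node.tag == "capture" then
    return true
  elseif node.tag == "call" then
    -- Follow the call with no cycle check. The "or false" keeps this from
    -- being a tail call, so each step consumes stack as it would in C.
    return has_captures(rules, node.target) or false
  end
  return false  -- plain leaf such as char 'a'
end

-- Expected wiring: Q (rule 2) calls P (rule 1).
local good = { { tag = "char" }, { tag = "call", target = 1 } }
-- Broken wiring: Q (rule 2) calls itself.
local bad = { { tag = "char" }, { tag = "call", target = 2 } }

print(has_captures(good, 2))                 -- terminates normally
local ok, err = pcall(has_captures, bad, 2)  -- never reaches a leaf
print(ok, err)                               -- Lua raises a stack overflow error
```

In Lua the runaway recursion is caught as an error. In C there is no such
guard, which would be consistent with a segfault.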

It's also related to captures somehow. Without the capture, the tree
is built correctly. With the capture, the segfault happens.

I've been looking at TOpenCall -> TCall resolution in
lptree.c:finalfix, lptree.c:fixonecall and calls to
lptree.c:correctkeys when joining ktables, but I'm afraid I don't
really understand how it works at this point.

Sorry for sending a lot of emails, but I figure I might as well keep
anyone looking at this updated.

On Sun, Sep 11, 2016 at 11:09 PM, Sebastian Cato <seb.cato@gmail.com> wrote:
> Hello again,
>
> I've managed to reduce the test case to something a bit more manageable:
>
> #!/usr/bin/env lua
> lpeg = require "lpeg"
>
> print(lpeg.P{
>   "Line";
>   P = lpeg.P"a",
>   Q = lpeg.V"P",
>   Line = (lpeg.P"x" * lpeg.C(1-lpeg.V"P")),
> }:match("xx"))
>
> I've also tested it on FreeBSD with Lua 5.2.4 and LPeg 1.0.0 with the
> same intermittent segfault happening.
>
> //Sebastian Cato
>
> On Sun, Sep 11, 2016 at 6:22 PM, Sebastian Cato <seb.cato@gmail.com> wrote:
>> Sorry, hit enter too early. Should make a habit of filling out the address last.
>>
>> So, I started writing a markdown (or a dialect thereof) to HTML
>> translator to play around with LPeg. The code is at:
>>
>> https://github.com/sebcat/lpeg-markdown
>>
>> While running markdown_test.lua, the interpreter crashes intermittently.
>>
>> A commit that exhibits this problem, should the repo change:
>>
>> git checkout c82794f8f2637ac4161f21113bbf4f23aa34906e
>>
>> platform info:
>>
>> $ uname -a
>> Linux genesis 4.6.6-200.fc23.x86_64 #1 SMP Wed Aug 10 23:13:35 UTC
>> 2016 x86_64 x86_64 x86_64 GNU/Linux
>> $ lua -v
>> Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
>> > lpeg = require "lpeg"
>> > lpeg.version()
>> 1.0.0
>>
>> Since the problem only occurs intermittently, I run markdown_test.lua in a loop:
>> #!/usr/bin/bash -e
>> while true; do
>>   ./markdown_test.lua
>> done
>>
>> Looking at the core dump, I see:
>>
>> #0  0x00007f3fb7060889 in hascaptures (tree=0x0) at lpcode.c:131
>> #1  0x00007f3fb706092c in hascaptures (tree=0x56012245c45c) at lpcode.c:144
>> #2  0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #3  0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #4  0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #5  0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #6  0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #7  0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #8  0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #9  0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #10 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #11 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #12 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #13 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #14 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #15 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #16 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #17 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #18 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #19 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #20 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #21 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #22 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #23 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>> #24 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #25 0x00007f3fb706092c in hascaptures (tree=0x56012245c46c) at lpcode.c:144
>> #26 0x00007f3fb706092c in hascaptures (tree=0x56012245c454) at lpcode.c:144
>>
>> [...]
>>
>> #261630 0x00007f3fb706092c in hascaptures (tree=0x56012245c1dc) at lpcode.c:144
>> #261631 0x00007f3fb706092c in hascaptures (tree=0x56012245c18c) at lpcode.c:144
>> #261632 0x00007f3fb706092c in hascaptures (tree=0x56012245c10c) at lpcode.c:144
>> #261633 0x00007f3fb706092c in hascaptures (tree=0x56012245c674) at lpcode.c:144
>> #261634 0x00007f3fb7061c6e in codecapture (compst=0x7ffef2f91360,
>> tree=0x56012245c66c, tt=-1, fl=0x7ffef2f901a0) at lpcode.c:720
>> #261635 0x00007f3fb706266a in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c66c, opt=0, tt=-1, fl=0x7ffef2f901a0) at lpcode.c:905
>> #261636 0x00007f3fb7061af1 in codechoice (compst=0x7ffef2f91360,
>> p1=0x56012245c664, p2=0x56012245c66c, opt=0, fl=0x7ffef2f901a0) at
>> lpcode.c:682
>> #261637 0x00007f3fb70625db in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c65c, opt=0, tt=-1, fl=0x7ffef2f901a0) at lpcode.c:900
>> #261638 0x00007f3fb7062491 in codeseq1 (compst=0x7ffef2f91360,
>> p1=0x56012245c65c, p2=0x56012245c694, tt=-1, fl=0x7f3fb7063d40
>> <fullset_>) at lpcode.c:876
>> #261639 0x00007f3fb70626f3 in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c654, opt=0, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:910
>> #261640 0x00007f3fb7061d18 in codecapture (compst=0x7ffef2f91360,
>> tree=0x56012245c64c, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:726
>> #261641 0x00007f3fb706266a in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c64c, opt=0, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:905
>> #261642 0x00007f3fb7061da0 in coderuntime (compst=0x7ffef2f91360,
>> tree=0x56012245c644, tt=-1) at lpcode.c:734
>> #261643 0x00007f3fb7062685 in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c644, opt=0, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:906
>> #261644 0x00007f3fb70622f6 in codegrammar (compst=0x7ffef2f91360,
>> grammar=0x56012245c02c) at lpcode.c:850
>> #261645 0x00007f3fb706269d in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c02c, opt=0, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:907
>> #261646 0x00007f3fb7061d18 in codecapture (compst=0x7ffef2f91360,
>> tree=0x56012245c024, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:726
>> #261647 0x00007f3fb706266a in codegen (compst=0x7ffef2f91360,
>> tree=0x56012245c024, opt=0, tt=-1, fl=0x7f3fb7063d40 <fullset_>) at
>> lpcode.c:905
>> #261648 0x00007f3fb70629b1 in compile (L=0x560122446018,
>> p=0x56012245c018) at lpcode.c:977
>> #261649 0x00007f3fb705fd7d in prepcompile (L=0x560122446018,
>> p=0x56012245c018, idx=1) at lptree.c:1099
>> #261650 0x00007f3fb705ff7b in lp_match (L=0x560122446018) at lptree.c:1154
>> #261651 0x00007f3fb81d97aa in luaD_precall (L=L@entry=0x560122446018,
>> func=func@entry=0x560122456170, nresults=nresults@entry=1) at
>> ldo.c:365
>> #261652 0x00007f3fb81ef5fd in luaV_execute (L=L@entry=0x560122446018)
>> at lvm.c:1134
>> #261653 0x00007f3fb81d9b9f in luaD_call (L=L@entry=0x560122446018,
>> func=<optimized out>, nResults=<optimized out>) at ldo.c:496
>> #261654 0x00007f3fb81d9bf1 in luaD_callnoyield (L=0x560122446018,
>> func=<optimized out>, nResults=<optimized out>) at ldo.c:506
>> #261655 0x00007f3fb81d8fc2 in luaD_rawrunprotected
>> (L=L@entry=0x560122446018, f=f@entry=0x7f3fb81cf250 <f_call>,
>> ud=ud@entry=0x7ffef2f918a0) at ldo.c:142
>> #261656 0x00007f3fb81d9e7d in luaD_pcall (L=L@entry=0x560122446018,
>> func=func@entry=0x7f3fb81cf250 <f_call>, u=u@entry=0x7ffef2f918a0,
>> old_top=80, ef=<optimized out>)
>>     at ldo.c:727
>> #261657 0x00007f3fb81d0851 in lua_pcallk (L=0x560122446018, nargs=0,
>> nresults=-1, errfunc=<optimized out>, ctx=0, k=0x0) at lapi.c:968
>> #261658 0x0000560121f058ab in docall (L=0x560122446018, narg=0,
>> nres=-1) at lua.c:203
>> #261659 0x0000560121f0646c in handle_script (argv=<optimized out>,
>> L=0x560122446018) at lua.c:443
>> #261660 pmain (L=0x560122446018) at lua.c:577
>> #261661 0x00007f3fb81d97aa in luaD_precall (L=L@entry=0x560122446018,
>> func=0x560122446640, nresults=1) at ldo.c:365
>> #261662 0x00007f3fb81d9b93 in luaD_call (L=L@entry=0x560122446018,
>> func=<optimized out>, nResults=<optimized out>) at ldo.c:495
>> #261663 0x00007f3fb81d9bf1 in luaD_callnoyield (L=0x560122446018,
>> func=<optimized out>, nResults=<optimized out>) at ldo.c:506
>> #261664 0x00007f3fb81d8fc2 in luaD_rawrunprotected
>> (L=L@entry=0x560122446018, f=f@entry=0x7f3fb81cf250 <f_call>,
>> ud=ud@entry=0x7ffef2f91b40) at ldo.c:142
>> #261665 0x00007f3fb81d9e7d in luaD_pcall (L=L@entry=0x560122446018,
>> func=func@entry=0x7f3fb81cf250 <f_call>, u=u@entry=0x7ffef2f91b40,
>> old_top=16, ef=<optimized out>)
>>     at ldo.c:727
>> #261666 0x00007f3fb81d0851 in lua_pcallk (L=0x560122446018, nargs=2,
>> nresults=1, errfunc=<optimized out>, ctx=0, k=0x0) at lapi.c:968
>> #261667 0x0000560121f0565b in main (argc=2, argv=0x7ffef2f91c88) at lua.c:603
>>
>> I'm not familiar enough with LPeg or PEG in general to know what's
>> causing the crash. Any ideas?
>>
>> Again, sorry for sending a partial e-mail earlier.
>>
>> //Sebastian Cato
>>
>> On Sun, Sep 11, 2016 at 6:12 PM, Sebastian Cato <seb.cato@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I started writing a markdown (or a dialect thereof) to HTML translator to play around with LPeg.
>>>