- Subject: Re: LPeg indent/whitespace based grammar
- From: Sean Conner <sean@...>
- Date: Sat, 8 Jul 2017 16:44:44 -0400
It was thus said that the Great Matthias Dörfelt once stated:
> I actually do have a follow up question:
>
> I am now trying to wrap my head around the re module and tried making a
> simple grammar to parse something like this
>
> a = 'b'
>
> and put it into a dictionary style table a la:
>
> { a = 'b'}
>
> I tried to do so using table and named group capture, but I can’t for the
> life of me figure out how to dynamically set the name of the group
> capture. Will I have to write a custom helper function to achieve this or
> is there a pure re way?
I found *a* way. I'm not sure if it's the *best* way, and there's no way
to translate it to use only the re module. So, the code:

  lpeg = require "lpeg"

  local function doset(tab,name,value)
    if tab[name] == nil then
      tab[name] = value
    elseif type(tab[name]) == 'table' then
      table.insert(tab[name],value)
    else
      tab[name] = { tab[name] , value }
    end
    return tab
  end

  local token = lpeg.C(lpeg.R"!~"^1)
  local pair  = lpeg.Cg(token * lpeg.P" "^1 * token) * lpeg.P"\n"
  local list  = lpeg.Cf(lpeg.Ct"" * pair^1,doset)

  test = [[
field1 value1
field2 value2
field3 value3a
field3 value3b
field3 value3e
field4 value4
]]

  x = list:match(test)

lpeg.Cf() is a folding capture, which is used to accumulate a single result
from a stream of pattern matches. I first use lpeg.Ct() with an empty string
to obtain a table to use. Then, pair will capture two tokens (sequences of
characters separated by spaces) and group them with lpeg.Cg(). The initial
table and each pair of captures are passed to the function doset(), which
accumulates the results. The rather convoluted nature of doset() is to
handle the case of the same field name being repeated (a case I had to
handle when I came up with this). When you run the example above, you'll
end up with a table like:

  x =
  {
    field1 = "value1",
    field3 =
    {
      [1] = "value3a",
      [2] = "value3b",
      [3] = "value3e",
    },
    field4 = "value4",
    field2 = "value2",
  }

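
To see what the fold is doing without LPeg in the way, here is a minimal
plain-Lua sketch. It assumes the name/value pairs have already been captured
and are sitting in a list. The fold() helper and the sample pairs are made up
for illustration and are not part of LPeg; fold() only imitates the way
lpeg.Cf() hands the running result plus the next group of captures to the
accumulator function.

```lua
-- Same accumulator as in the message above.
local function doset(tab, name, value)
  if tab[name] == nil then
    tab[name] = value                  -- first time this name is seen
  elseif type(tab[name]) == 'table' then
    table.insert(tab[name], value)     -- third and later occurrences
  else
    tab[name] = { tab[name], value }   -- second occurrence: make a list
  end
  return tab
end

-- Hypothetical stand-in for lpeg.Cf(): start from an initial value and
-- feed each captured pair through the accumulator, in order.
local function fold(initial, pairs_list, func)
  local acc = initial
  for _, p in ipairs(pairs_list) do
    acc = func(acc, p[1], p[2])
  end
  return acc
end

local result = fold({}, {
  { "field1", "value1"  },
  { "field3", "value3a" },
  { "field3", "value3b" },
  { "field3", "value3e" },
}, doset)

print(result.field1)                     -- value1
print(table.concat(result.field3, ","))  -- value3a,value3b,value3e
```

The accumulator has to return the table each time, since whatever it returns
becomes the running result for the next pair.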
I do wish re had a folding capture syntax, but I can work around it.
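
For the original a = 'b' syntax, the same approach might look roughly like
the sketch below. The name, value and ws patterns are my guesses at the
syntax (identifier-style names, single-quoted values with no escapes); they
are assumptions, not something taken from the question. It uses rawset as the
accumulator, so a repeated name simply overwrites the earlier value; swap in
doset() if repeats should be collected instead.

```lua
local lpeg = require "lpeg"

-- Assumed syntax: name = 'value', one pair per line.
local ws    = lpeg.S" \t"^0                      -- optional blanks
local name  = lpeg.C(lpeg.R("az","AZ","__") * lpeg.R("az","AZ","09","__")^0)
local value = lpeg.P"'" * lpeg.C((1 - lpeg.P"'")^0) * lpeg.P"'"

-- Group the two captures so the fold sees them as one (name, value) step.
local pair  = lpeg.Cg(name * ws * lpeg.P"=" * ws * value)
            * ws * lpeg.P"\n"^-1

-- Start from an empty table and rawset each pair into it.
local list  = lpeg.Cf(lpeg.Ct"" * pair^1, rawset)

local t = list:match("a = 'b'\nc = 'd e'\n")
print(t.a)  -- b
print(t.c)  -- d e
```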
-spc