It was thus said that the Great Johann ''Myrkraverk'' Oskarsson once stated:
> Ok, so I am trying out LPEG. For somewhat rigid syntax, I came up with
>
>   local p = { } -- the patterns, all in a table
>   p.space = lpeg.S( " \t\n\r" ) ^ 1 -- whitespace
>   p.label = lpeg.Cg( lpeg.R( "az" ) ^ 0, "lbl" ) * p.space
>   p.instruction = ( lpeg.R( "az" ) + lpeg.R( "AZ" ) ) ^ 1
>   p.operand = lpeg.Cg( lpeg.R( "09" ) ^ 1, "imm" )
>   p.line = lpeg.Ct( p.label
>     * lpeg.Cg( p.instruction, "ins" ) * lpeg.S( " \t\n\r" ) ^ 0
>     * p.operand ) -- lbl, ins, op; in that order.
>
> [I hope the spacing comes out OK; there was a discussion earlier about
> issues with Thunderbird, which I'm using.]
It came out fine here.
> For lines that don't have a label, it creates a key with the empty
> string. That is, the table will be something like
>
>   {
>     imm = "35",
>     ins = "store",
>     lbl = ""
>   }
>
> Is there a way for LPEG to return nil in that case instead?
Yes. If you change p.label to read:
  p.label = (lpeg.Cg(lpeg.R("az")^1,'lbl') + lpeg.P(true))
          * p.space
then matching p.line against the string " store 35" (note the leading
space) will return:
  {
    ins = "store",
    imm = "35",
  }
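If you want to check this yourself, here is a minimal sketch that puts the revised p.label together with the rest of the original patterns. It assumes LPeg is installed and loadable as "lpeg"; the labelled test line "loop store 35" is my own made-up example, not something from the original post.

```lua
local lpeg = require "lpeg" -- assumes LPeg is installed

local p = { } -- the patterns, all in a table
p.space       = lpeg.S(" \t\n\r")^1
-- revised label: a non-empty name, or nothing at all, then mandatory space
p.label       = (lpeg.Cg(lpeg.R("az")^1, "lbl") + lpeg.P(true)) * p.space
p.instruction = (lpeg.R("az") + lpeg.R("AZ"))^1
p.operand     = lpeg.Cg(lpeg.R("09")^1, "imm")
p.line        = lpeg.Ct(p.label
                  * lpeg.Cg(p.instruction, "ins") * lpeg.S(" \t\n\r")^0
                  * p.operand)

-- no label: the lbl key is simply absent (nil)
local unlabelled = p.line:match(" store 35")
print(unlabelled.lbl, unlabelled.ins, unlabelled.imm)

-- hypothetical labelled line: lbl is filled in as before
local labelled = p.line:match("loop store 35")
print(labelled.lbl, labelled.ins, labelled.imm)
```

The only change from the original is in p.label. Because the group capture now requires at least one character, it is skipped entirely when there is no label, so no "lbl" field is created.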
A breakdown:
  p.label = (
              -- check for at least one character
              -- that makes a label
              lpeg.Cg(lpeg.R("az")^1,'lbl')
              -- if there isn't a character, we still
              -- want to succeed
              + lpeg.P(true) -- success, even though there's nothing
            )
            * p.space -- followed by mandatory space characters.
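As an aside, the "pattern or nothing" idiom above can also be spelled with LPeg's at-most-n repetition operator, patt^-1. The sketch below compares the two spellings on a labelled and an unlabelled prefix; the local names (name, with_choice, with_optional) are mine, chosen just for illustration.

```lua
local lpeg = require "lpeg" -- assumes LPeg is installed

local space = lpeg.S(" \t\n\r")^1
local name  = lpeg.Cg(lpeg.R("az")^1, "lbl")

-- the spelling used above: ordered choice with an always-succeeding pattern
local with_choice   = lpeg.Ct((name + lpeg.P(true)) * space)
-- alternative spelling: at most one occurrence of the label
local with_optional = lpeg.Ct(name^-1 * space)

for _, s in ipairs { " ", "loop " } do
  local a, b = with_choice:match(s), with_optional:match(s)
  print(("%q"):format(s), a.lbl, b.lbl)
end
```

Both versions leave lbl as nil for " " and set it to "loop" for "loop ", so which one you use is mostly a matter of taste.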
You can also tighten up p.instruction to read:
  p.instruction = lpeg.R("az","AZ")^1
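lpeg.R() accepts any number of two-character ranges, so the single call matches the same characters as the two-way choice. A quick sanity check, again assuming LPeg is installed (the sample strings are arbitrary ones I picked):

```lua
local lpeg = require "lpeg" -- assumes LPeg is installed

local long  = (lpeg.R("az") + lpeg.R("AZ"))^1 -- original spelling
local short = lpeg.R("az", "AZ")^1            -- tightened spelling

-- with no captures, match() returns the position after the match,
-- or nil when the pattern fails
for _, s in ipairs { "store", "STORE", "Store35", "35" } do
  assert(long:match(s) == short:match(s))
  print(s, short:match(s))
end
```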