On 09/04/2014 01:05 PM, Paul K wrote:
Hi All,

I've made some progress with relaxed parsing of Lua grammar (thanks to
all who helped with my earlier questions), but have stumbled on an
issue I can't find a solution for. I'm sure it's caused by my limited
understanding of LPEG processing, so would be interested in any
advice.

Here is the setup. I have a grammar that allows zero or more
statements of various types, but I also want to accept and ignore
anything that doesn't match any of those types. Using the grammar
(below): "do end", "do (1) end", "do (1)(2) end" are all valid
examples and "do (1)a(2) end" is not, but I want it to be processed in
the same way as "do (1)(2) end" (with "a" ignored).

What I tried to do is use lpeg.V("Stat")^0 + lpeg.C(lpeg.P(1)), but
this doesn't allow "a" to be captured and processing to continue; I
also tried (lpeg.V("Stat") + lpeg.C(lpeg.P(1)))^0, but this doesn't
work either, as it captures valid fragments before ^0 backtracks.
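
A minimal sketch of why the two forms above behave this way, assuming a hypothetical stand-in pattern `stat` (a parenthesised number) in place of the real Stat rule. In LPeg, `p^0` succeeds even when it matches nothing, so the second alternative of `p^0 + q` is never tried. With the fallback inside the loop, every position where `stat` fails is consumed, including characters such as "end" that an enclosing rule may still need:

```lua
local lpeg = require 'lpeg'
local P, C, R, Ct = lpeg.P, lpeg.C, lpeg.R, lpeg.Ct

-- Hypothetical stand-in for a statement: a parenthesised number such as "(1)".
local stat = P"(" * R"09"^0 * P")"

-- Form 1: stat^0 always succeeds (possibly on zero matches), so the
-- ordered choice never reaches C(P(1)).
local form1 = Ct(stat^0 + C(P(1)))

-- Form 2: the fallback sits inside the loop and is tried at every
-- position where stat fails.
local form2 = Ct((stat + C(P(1)))^0)

print(#form1:match("(1)a(2)"))                     --> 0
print(table.concat(form2:match("(1)a(2)"), ","))   --> a
print(table.concat(form2:match("(1)end"), ","))    --> e,n,d
```

The last line shows the side effect: nothing tells the fallback to stop before a closing keyword, so it has to be guarded explicitly.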

The question is: how do I write an expression that takes zero or more
repetitions of a pattern and (separately) captures all non-matching
strings?

Here is my simplified example. It almost works, but in addition to
capturing "a" as unknown (which is what I want), it also captures
"end" as unknown, which I don't want:

local lpeg = require 'lpeg'
local function recover(p, err)
   return p + lpeg.Cmt(lpeg.Cc(err),
     function(s, p, ...) print("recover", ...) return true end)
end
local function unknown(p)
   return p + lpeg.Cmt(lpeg.C(lpeg.P(1)),
     function(s, p, ...) print("unknown", ...) return true end)
end
local function capture(pos, ...)
   print("capture", pos, ...)
   return { pos = pos, ... }
end
local function token(p) return p * lpeg.S(" ")^0 end

local chunk = lpeg.P { "Chunk";
   Chunk = lpeg.V("Block") * -1 + error;
   Block = unknown(lpeg.V("Stat"))^0; --<-- "unknown" handling
   DoStat = lpeg.Cp() * token(lpeg.P"do") * lpeg.V("Block") *
     recover(token(lpeg.P"end"), "end") / capture;
   ExprStat = lpeg.Cp() * token(lpeg.P"(") * token(lpeg.R("09")^0) *
     recover(token(lpeg.P")"), ")") / capture;
   Stat = lpeg.V("ExprStat") + lpeg.V("DoStat");
}

print("matches", lpeg.match(chunk, "do (1)(2) end"))
print("matches with unknown", lpeg.match(chunk, "do (1)a(2) end"))

Thank you.

Paul.


I don't use LPEG much, but here is my (over-simplified) version:

~~~~
local lpeg = require 'lpeg'

local sp = (lpeg.S" ")^0
local keywords = lpeg.P"do" + lpeg.P"end"

local chunk = lpeg.P {
   "Chunk";
   Chunk = lpeg.V("Block") * -1 + error;
   Block = (lpeg.V("Stat"))^0;
   Stat = lpeg.V("ExprStat") + lpeg.V("DoStat") + lpeg.V("EatenChar");
   ExprStat = lpeg.P"(" * lpeg.R("09")^0 * lpeg.P")" * lpeg.Cc"ExprStat";
   DoStat = lpeg.P"do" * sp * lpeg.V("Block") * lpeg.P"end" * sp * lpeg.Cc"DoStat";
   -- Capture any single character, unless a keyword starts at this position.
   EatenChar = lpeg.C(lpeg.P(1)) - keywords;
}

print("matches", lpeg.match(chunk, "do (1)(2) end"))
print("matches with unknown", lpeg.match(chunk, "do (1)a(2) end"))
~~~~
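
One possible refinement of the keyword guard above, offered as a sketch rather than a definitive fix. It treats "do" and "end" as keywords only when no identifier character follows, consumes unknown text a whole word at a time (so a word like "endpoint" is not mistaken for "end"), and skips spaces instead of capturing them. The helper `kw`, the `idchar` set, the `Unknown` rule name, and the use of lpeg.Ct to collect the unknown fragments are assumptions introduced here for illustration. The `+ error` alternative is left out, so a failed match simply returns nil.

```lua
local lpeg = require 'lpeg'
local P, R, S, V, C, Ct = lpeg.P, lpeg.R, lpeg.S, lpeg.V, lpeg.C, lpeg.Ct

local sp = S" "^0
local idchar = R("az", "AZ", "09") + P"_"

-- A keyword only counts when no identifier character follows it.
local function kw(s) return P(s) * -idchar end
local keyword = kw"do" + kw"end"

local grammar = P {
   "Chunk";
   -- Collect every unknown fragment into one table.
   Chunk = Ct(sp * V"Block") * -1;
   Block = V"Stat"^0;
   Stat = (V"ExprStat" + V"DoStat" + V"Unknown") * sp;
   ExprStat = P"(" * R"09"^0 * P")";
   DoStat = kw"do" * sp * V"Block" * kw"end";
   -- Fallback: a whole word or a single character, unless a keyword starts here.
   Unknown = C(idchar^1 + P(1)) - keyword;
}

print(table.concat(grammar:match("do (1)a(2) end"), ","))   --> a
print(table.concat(grammar:match("do endpoint end"), ","))  --> endpoint
print(#grammar:match("do (1)(2) end"))                      --> 0
```

Because the guard is a negative lookahead on the fallback only, "end" is left in place for DoStat to consume, while everything else that no statement rule accepts is captured as unknown.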