Hi All,

I've made some progress with relaxed parsing of Lua grammar (thanks to
all who helped with my earlier questions), but have stumbled on an
issue I can't find a solution for. I'm sure it's caused by my limited
understanding of LPEG processing, so would be interested in any
advice.

Here is the setup. I have a grammar that allows zero or more
statements of various types, but I also want to accept and ignore
anything that doesn't match any of those types. Using the grammar
(below): "do end", "do (1) end", "do (1)(2) end" are all valid
examples and "do (1)a(2) end" is not, but I want it to be processed in
the same way as "do (1)(2) end" (with "a" ignored).

What I tried to do is to use lpeg.V("Stat")^0 + lpeg.C(lpeg.P(1)), but
this doesn't allow "a" to be captured and processing to continue. I
also tried (lpeg.V("Stat") + lpeg.C(lpeg.P(1)))^0, but this doesn't
work either, as it captures valid fragments before ^0 backtracks.
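
To make the difference between the two forms concrete, here is a
minimal flat sketch (no nested blocks). The names stat, unknown, first
and second are made up for this illustration, and the "stat:" and
"unknown:" tags are only there to make the captures readable.

```lua
local lpeg = require 'lpeg'

-- Illustrative stand-in for a statement: "(" digits ")"
local stat = lpeg.C(lpeg.P"(" * lpeg.R("09")^0 * lpeg.P")") /
  function(s) return "stat:" .. s end
-- Fallback: any single character, tagged as unknown
local unknown = lpeg.C(lpeg.P(1)) /
  function(c) return "unknown:" .. c end

-- First form: stat^0 always succeeds, so the fallback is never tried
local first = lpeg.Ct(stat^0 + unknown)
-- Second form: the fallback is tried wherever stat fails
local second = lpeg.Ct((stat + unknown)^0)

local r1 = lpeg.match(first, "(1)a(2)")
local r2 = lpeg.match(second, "(1)a(2)")
print(table.concat(r1, " "))  --> stat:(1)
print(table.concat(r2, " "))  --> stat:(1) unknown:a stat:(2)
```

In this flat case the second form skips "a" and carries on. The
difficulty only shows up once the loop sits inside a construct that
has its own terminator, because lpeg.P(1) will just as readily consume
the characters of that terminator.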

The question is: how do I write an expression that takes zero or more
repetitions of a pattern and (separately) captures all non-matching
strings?

Here is my simplified example. It almost works, but in addition to
capturing "a" as unknown (which is what I want), it also captures
"end" as unknown, which is what I don't want:

local lpeg = require 'lpeg'
local function recover(p, err)
  return p + lpeg.Cmt(lpeg.Cc(err),
    function(s, p, ...) print("recover", ...) return true end)
end
local function unknown(p)
  return p + lpeg.Cmt(lpeg.C(lpeg.P(1)),
    function(s, p, ...) print("unknown", ...) return true end)
end
local function capture(pos, ...)
  print("capture", pos, ...)
  return { pos = pos, ... }
end
local function token(p) return p * lpeg.S(" ")^0 end

local chunk = lpeg.P { "Chunk";
  Chunk = lpeg.V("Block") * -1 + error;
  Block = unknown(lpeg.V("Stat"))^0; --<-- "unknown" handling
  DoStat = lpeg.Cp() * token(lpeg.P"do") * lpeg.V("Block") *
    recover(token(lpeg.P"end"), "end") / capture;
  ExprStat = lpeg.Cp() * token(lpeg.P"(") * token(lpeg.R("09")^0) *
    recover(token(lpeg.P")"), ")") / capture;
  Stat = lpeg.V("ExprStat") + lpeg.V("DoStat");
}

print("matches", lpeg.match(chunk, "do (1)(2) end"))
print("matches with unknown", lpeg.match(chunk, "do (1)a(2) end"))
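
One possible direction, offered only as a sketch and not as a
confirmed solution: keep the single-character fallback from starting
where the closing keyword begins, using the pattern difference
operator. The helper unknown_except, the collectors unknowns and
recovered, and the parse wrapper are hypothetical names introduced
here; the collectors replace print so the result can be inspected.
The sketch also assumes that "end" never occurs as the prefix of
something else (a real grammar would need a word-boundary check).

```lua
local lpeg = require 'lpeg'

-- Hypothetical collectors, used instead of print
local unknowns, recovered = {}, {}

local function token(p) return p * lpeg.S(" ")^0 end

local function recover(p, err)
  return p + lpeg.Cmt(lpeg.Cc(err),
    function(s, pos, e) recovered[#recovered + 1] = e; return true end)
end

-- Like unknown(), but the fallback refuses to start where `stop`
-- matches, so the closing keyword is left for the enclosing statement
local function unknown_except(p, stop)
  return p + lpeg.Cmt(lpeg.C(lpeg.P(1) - stop),
    function(s, pos, c) unknowns[#unknowns + 1] = c; return true end)
end

local function capture(pos, ...) return { pos = pos, ... } end

local chunk = lpeg.P { "Chunk";
  -- No "+ error" here: a failed match simply returns nil
  Chunk = lpeg.Ct(lpeg.V("Block")) * -1;
  Block = unknown_except(lpeg.V("Stat"), lpeg.P"end")^0;
  DoStat = lpeg.Cp() * token(lpeg.P"do") * lpeg.V("Block") *
    recover(token(lpeg.P"end"), "end") / capture;
  ExprStat = lpeg.Cp() * token(lpeg.P"(") * token(lpeg.R("09")^0) *
    recover(token(lpeg.P")"), ")") / capture;
  Stat = lpeg.V("ExprStat") + lpeg.V("DoStat");
}

-- Resets the collectors, runs the match, returns all three results
local function parse(subject)
  unknowns, recovered = {}, {}
  local tree = lpeg.match(chunk, subject)
  return tree, unknowns, recovered
end

local tree, unk, rec = parse("do (1)a(2) end")
print(#tree[1], table.concat(unk, ","), #rec)  --> 2       a       0
```

With this change the "do" statement still gets its two nested
statements, only "a" is reported as unknown, and no recovery is
triggered for "end". One consequence of the design is that a stray
"end" at the top level is no longer swallowed as unknown, so the whole
match fails there; whether that is acceptable depends on how relaxed
the parser is meant to be.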

Thank you.

Paul.