- Subject: Re: Yieldable/streaming LPEG
- From: Sean Conner <sean@...>
- Date: Sun, 31 Aug 2014 21:00:49 -0400
It was thus said that the Great Paul K once stated:
>
> Are there any other options for parsing incomplete strings using LPEG?
I don't think LPeg contains much state itself (seeing how it's composable)
so there isn't much to save---it either matches a pattern, or it doesn't.
But I think that if you can break your parsing up into natural "units",
you should be able to parse a bit and if you fail to parse, add more data
until you can. As a proof-of-concept:
==[ proof-of-concept.lua ]=======================
BUF = 2 -- we'll be reading in this many characters at a time

local lpeg = require "lpeg"

-- ************************************************************************
-- parse a line of text. If a line doesn't end with a '\n' then it's an
-- error. Return the line of text and position at the end of the line, such
-- that we can resume parsing there if need be.
-- ************************************************************************

local line = lpeg.C((lpeg.P(1) - lpeg.P"\n")^0)
           * lpeg.P"\n"
           * lpeg.Cp()

-- ************************************************************************
-- Iterator to get lines from a file.
-- ************************************************************************

function getlines(file)
  local function getnext(state,var)
    -- -------------------------------
    -- attempt to get a line of data
    -- -------------------------------

    local l,pos = line:match(state.text,state.pos)

    -- -------------------------------------------------------------------
    -- if we get nil, or the new position is the same as the old position,
    -- then we've exhausted our buffer of data. Attempt to read more from
    -- our stream (in this case, a file)
    -- -------------------------------------------------------------------

    if l == nil or pos == state.pos then
      local data = state.file:read(BUF)

      -- ----------------------------------------------------------------
      -- check for end of our stream. If so, and the current position is at
      -- the start of the string, then we've encountered a partial, non-'\n'
      -- terminated line, so signal an error. Otherwise, we've successfully
      -- reached the end of the input.
      -- -----------------------------------------------------------------

      if not data or data == "" then
        if state.pos == 1 then
          error("bad token on line " .. tostring(state.line))
        end
        return nil
      end

      -- ------------------------------------------------------------------
      -- We've read more data from the stream. Discard what we've parsed so
      -- far from our buffered data, keep what we haven't parsed and append
      -- the new data, and try our parse again.
      -- ------------------------------------------------------------------

      state.text = state.text:sub(state.pos,-1) .. data
      state.pos  = 1
      return getnext(state,var)
    end

    -- --------------------------------------------------------------
    -- okay, update our state, and return the data we've just parsed.
    -- --------------------------------------------------------------

    state.line = state.line + 1
    state.pos  = pos
    return l
  end

  return getnext,{           -- our iterator function
    file = file,             -- stream we're reading from
    text = file:read(BUF),   -- buffered data
    pos  = 1,                -- starting position in buffer
    line = 1                 -- line (for error reporting)
  }
end

f = io.open(arg[1],"r")
for line in getlines(f) do
  print(line)
end
==[ END OF LINE ]========================================
So that's one way of doing it. It isn't that pretty, but it works.
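
The same idea also works when the data is pushed at you (say, from a
socket callback) instead of pulled from a file. What follows is only a
minimal sketch under that assumption, not a tested library: new_feeder(),
feed() and leftover() are hypothetical names made up for illustration and
are not part of LPeg. Each call to feed() prepends whatever was left
unparsed from the previous call, matches as many complete "units" (here,
'\n'-terminated lines) as it can, and keeps the unparsed tail for next time.

```lua
local lpeg = require "lpeg"

-- one '\n'-terminated line: capture its text, then the resume position
local line = lpeg.C((lpeg.P(1) - lpeg.P"\n")^0)
           * lpeg.P"\n"
           * lpeg.Cp()

-- new_feeder() is a hypothetical helper, not an LPeg function
local function new_feeder()
  local pending = "" -- unparsed tail carried over between chunks

  -- accept one chunk of input, return an array of the complete lines
  -- that could be parsed out of the buffered data so far
  local function feed(chunk)
    local text  = pending .. chunk
    local pos   = 1
    local lines = {}

    while true do
      local l,npos = line:match(text,pos)
      if not l then break end   -- no complete line left in the buffer
      lines[#lines + 1] = l     -- note: "" (an empty line) is not false
      pos = npos
    end

    pending = text:sub(pos)     -- keep only what we haven't parsed
    return lines
  end

  -- whatever is still buffered; non-empty at end of input means the
  -- stream ended with a partial, non-'\n' terminated line
  local function leftover()
    return pending
  end

  return feed,leftover
end

-- usage: chunk boundaries fall in arbitrary places
local feed,leftover = new_feeder()
for _,chunk in ipairs { "on" , "e\ntw" , "o\n\nthr" } do
  for _,l in ipairs(feed(chunk)) do
    print(l)
  end
end
```

The caller decides what a non-empty leftover() means at end of input; the
file-based version above treats it as an error.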
-spc