On Wed, Feb 01, 2012 at 05:05:24PM -0800, William Ahern wrote:
> On Wed, Feb 01, 2012 at 04:27:16PM -0800, William Ahern wrote:
> > On Wed, Feb 01, 2012 at 03:51:41PM -0200, Roberto Ierusalimschy wrote:
> > > > [Hmm, that brings to mind another question:  How much of the input
> > > > string does it have to accumulate in memory?  Can it start discarding
> > > > earlier portions of it at some point?  If not, it wouldn't be so useful
> > > > for the use I have in mind: parsing really big files without needing to
> > > > have them entirely in memory.]
> > > 
> > > That is an important point. Also, do you have benchmarks comparing your
> > > patch to standard LPeg?
> > > 
> > 
> > Attached is my LPeg JSON library used for testing.
> 
> Same library, but parsing a single 5.1MB JSON file:
> 

Parsing the same 5.1MB JSON file, but yielding every 512 bytes. It's about
0.03 seconds slower.

% for I in 1 2 3; do time ./rfc-index.lua < /tmp/rfc-index.json; done
lpeg 0.10 (yieldable)
./rfc-index.lua < /tmp/rfc-index.json  2.25s user 0.46s system 99% cpu 2.725 total
lpeg 0.10 (yieldable)
./rfc-index.lua < /tmp/rfc-index.json  2.25s user 0.47s system 99% cpu 2.732 total
lpeg 0.10 (yieldable)
./rfc-index.lua < /tmp/rfc-index.json  2.25s user 0.47s system 99% cpu 2.739 total
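
That's on the order of 10,000 yield/resume round trips for a 5.1MB input
at 512-byte steps, so the 0.03s delta works out to roughly 3 microseconds
per yield.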

Here's the yielding rfc-index.lua script:

#!/tmp/build/bin/lua5.2

local which = ...
local json = require(which or "json")

-- Slurp the entire document up front; the buffer below only controls
-- how much of it the parser is allowed to see at a time.
local input = io.stdin:read("*a")

local buffer = {
	len = 0,

	-- Called by the yieldable LPeg whenever it wants more input:
	-- yield first (when permitted), then expose another 512 bytes.
	-- Returns the backing string, the number of bytes visible so
	-- far, and an EOF flag.
	tovector = function(self, yieldable)
		if yieldable then
			coroutine.yield()
		end

		if self.len < #input then
			self.len = math.min(self.len + 512, #input)
		end

		return input, self.len, (self.len == #input)
	end
}

local done = false
local step = coroutine.wrap(function()
	local value = json.decode(buffer)	-- parsed result unused here
	done = true
end)

-- Drive the parser, resuming after every yield until it finishes.
repeat
	step()
until done
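
To the question quoted above about not keeping the whole input in memory:
here's a minimal, untested sketch of the same tovector interface reading
stdin in 512-byte chunks instead of slurping it first. It assumes the
patched decoder tolerates the backing string growing between tovector
calls; whether already-consumed input could ever be discarded is exactly
the open question above.

#!/tmp/build/bin/lua5.2

local json = require("json")

local input = ""	-- bytes read so far; grows, never shrinks
local eof = false

local buffer = {
	len = 0,
	tovector = function(self, yieldable)
		if yieldable then
			coroutine.yield()
		end

		-- Append the next chunk, if any. Note the O(n^2)
		-- concatenation: acceptable for a sketch, not for
		-- genuinely huge inputs.
		if not eof then
			local chunk = io.stdin:read(512)
			if chunk then
				input = input .. chunk
			else
				eof = true
			end
		end

		self.len = #input
		return input, self.len, eof
	end
}

local done = false
local step = coroutine.wrap(function()
	json.decode(buffer)
	done = true
end)

repeat
	step()
until done

Memory use is still proportional to the input size, since nothing is
discarded; true constant-memory parsing would additionally need the
parser to release input it has already consumed.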