- Subject: Re: Ideas about implementing a string type?
- From: Rici Lake <lua@...>
- Date: Sun, 01 Apr 2007 16:48:09 -0500
David Kastrup wrote:
> For some callbacks getting information from TeX, Taco switched the
> data passed into Lua from strings to integers and reportedly got a
> speedup of about 10 from that.
I suppose that's possible if you're measuring just the cost of
the callback. But it certainly doesn't jibe with my experience.
Here's a simple experiment: you can get string.gmatch to return
either an index (as an integer) or a capture (as a string). The
first option is slightly faster (but in my experience not enough
to make it useful if you're going to need the string eventually).
Anyway, it makes for a simple benchmark.
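To make the two modes concrete, here is a small sketch of what gmatch yields with a position capture versus a string capture. The sample string and the helper name `collect` are made up for illustration:

```lua
-- Illustrative only: `sample` and `collect` are hypothetical names.
local sample = "abcdefghi"

-- Gather the single value gmatch yields on each iteration for `pat`.
local function collect(pat)
  local out = {}
  for v in sample:gmatch(pat) do out[#out + 1] = v end
  return out
end

local ints = collect("()...")  -- position captures: the numbers 1, 4, 7
local strs = collect("(...)")  -- string captures: "abc", "def", "ghi"

print(type(ints[1]), table.concat(ints, " "))
print(type(strs[1]), table.concat(strs, " "))
```

In the first case no new string is created per match; in the second, each match produces a 3-character string.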
The following program runs through a string of about 50k a total of
1000 times, calling gmatch with either the pattern "()..." (which
returns every third index) or the pattern "(...)" (which returns
each successive substring of 3 characters). To increase the cost of
string allocation, I actually cycle through patterns with from
2 to 21 dots. Here's the program:
local data = [[
-- I used /usr/share/games/fortunes/literature here, it's about 50k
]]
local count = 0
local pat1, pat2
-- The first command-line argument selects the capture type.
if ... == "int" then
  pat1, pat2 = '()', ''
elseif ... == "str" then
  pat1, pat2 = '(', ')'
end
if pat1 then
  for j = 1, 1000 do
    -- Cycle through patterns with 2 to 21 dots.
    local pat = pat1 .. ('.'):rep(2 + j%20) .. pat2
    for i in data:gmatch(pat) do count = count + 1 end
  end
end
The compile time is negligible. Results on my machine, default lua build:
rici@rici-desktop:~/src/lua-5.1.1/src$ time lua test.lua int
rici@rici-desktop:~/src/lua-5.1.1/src$ time lua test.lua str
So the string case takes less than twice as long as the int case.
From my experience with, for example, UTF-8 iterators which return
either single-character UTF-8 strings or the integer code, that seems
about typical.
Perhaps Taco was doing something overly complex to pass strings into
Lua, I don't know. I'd be happy to review the code, though.
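
For anyone who wants to repeat the comparison without the external `time` command, here is a minimal sketch that times both patterns in one process with os.clock. It makes some assumptions: the data is synthetic filler of roughly 50k rather than the fortunes file, `bench` is a hypothetical helper name, and the round count is reduced from 1000 to keep it quick. Absolute timings will differ from machine to machine.

```lua
-- Sketch: time both capture styles in one process using os.clock (CPU time).
-- `data` is synthetic filler of about 50k, not the fortunes file.
local data = ("The quick brown fox jumps over the lazy dog.\n"):rep(1100)

-- `bench` is a hypothetical helper: runs the gmatch loop `rounds` times
-- and returns elapsed CPU seconds plus the number of matches seen.
local function bench(pat1, pat2, rounds)
  local count = 0
  local start = os.clock()
  for j = 1, rounds do
    local pat = pat1 .. ('.'):rep(2 + j % 20) .. pat2
    for _ in data:gmatch(pat) do count = count + 1 end
  end
  return os.clock() - start, count
end

local rounds = 100  -- the program above uses 1000
local t_int, n_int = bench('()', '', rounds)  -- position captures
local t_str, n_str = bench('(', ')', rounds)  -- string captures
print(("int: %.2fs  str: %.2fs  matches: %d / %d"):format(t_int, t_str, n_int, n_str))
```

Both runs should report the same number of matches, so any difference in time comes from what each match returns.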