- Subject: Re: String tokenization function
- From: Philipp Janda <siffiejoe@...>
- Date: Sat, 7 Nov 2015 20:19:50 +0100
On 07.11.2015 at 16:52, Marco Atzori wrote:
> On 07/11/2015 at 16:46, Matthew Wild wrote:
>> If you have actual constraints, revealing them might allow people to
>> help you better. If you don't have any constraints and just want it to
>> perform well, then there are much simpler and more efficient
>> solutions than your implementation.
>
> This code is part of a video game addon and is executed on every frame
> refresh (about every tenth of a second), so continually creating new
> tables is a considerable waste of memory. If there are simpler
> solutions, please show them to me as well. That is why I asked for
> your help ;-)
This one doesn't allocate any memory (except for the result string --
and the separator if you insist on passing ASCII codes around):
local s_find, s_sub, s_char = string.find, string.sub, string.char

local function numtok( s, c )
  local pos, n = 0, 0
  while pos do
    n, pos = n+1, s_find( s, c, pos+1, true )
  end
  return n
end

local function gettok( s, b, c, e )
  if b == 0 then return s end
  c = s_char( c ) -- XXX why pass ASCII codes anyway???!
  e = e or b -- default value for e
  if b < 0 or e < -1 then
    local cnt = numtok( s, c )
    if b < 0 then b = b + cnt + 1 end -- make b positive
    if e < -1 then e = e + cnt + 1 end -- make e positive
  end
  if e > 0 and b > e then b, e = e, b end -- fix order
  local bpos, pos, n = 1, 0, 1
  while n < b and pos do -- find first requested token
    n, pos = n+1, s_find( s, c, pos+1, true )
  end
  if pos then
    bpos = pos+1
    if e < 1 then pos = nil end -- to the end of the input string
    while n < e+1 and pos do -- find end of last requested token
      n, pos = n+1, s_find( s, c, pos+1, true )
    end
  end
  return s_sub( s, bpos, (pos or #s+1)-1 )
end
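
To illustrate the scanning idea in isolation, here is a minimal sketch
reduced to a single token and taking the separator as a string instead of
an ASCII code. The name `nth_token` and its signature are made up for this
example and are not part of the code above. Returning nil for an
out-of-range index and requiring a non-empty separator are assumptions of
the sketch, not something the original code specifies.

```lua
local s_find, s_sub = string.find, string.sub

-- Hypothetical helper: return the i-th token (1-based) of s, split on the
-- plain-text separator sep (assumed non-empty), or nil if s has fewer than
-- i tokens. Only the result string is allocated; no table is built.
local function nth_token( s, sep, i )
  local start = 1                       -- start index of the current token
  for _ = 2, i do                       -- skip over i-1 separators
    local from, to = s_find( s, sep, start, true )  -- true = plain search
    if not from then return nil end     -- ran out of tokens
    start = to + 1
  end
  local from = s_find( s, sep, start, true )  -- end of the wanted token
  return s_sub( s, start, (from or #s+1) - 1 )
end

print( nth_token( "a,b,c", ",", 2 ) )  --> b
print( nth_token( "a,b,c", ",", 3 ) )  --> c
print( nth_token( "a,b,c", ",", 4 ) )  --> nil
```

The fourth argument `true` to `string.find` turns off pattern matching, so
separators such as "." or "%" are matched literally.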
However, if you extract tokens from each input string at least `n`
times, Dirk's approach is probably faster, and I expect that `n` to be
very small. Happy benchmarking!
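
A rough way to look for that break-even point is sketched below. Dirk's
code is not shown in this message, so the `split` function here is only a
stand-in that assumes a "split once into a table, then index" approach.
`scan_token`, `split` and `bench` are hypothetical names, and the input
string and repetition count are arbitrary. Timings depend on the Lua
version and the machine, so no expected numbers are given.

```lua
local s_find, s_sub = string.find, string.sub
local clock = os.clock

-- Scan variant: walk the string on every call, allocate only the result.
-- The separator is assumed to be a non-empty plain string.
local function scan_token( s, sep, i )
  local start = 1
  for _ = 2, i do
    local _, to = s_find( s, sep, start, true )
    if not to then return nil end
    start = to + 1
  end
  local from = s_find( s, sep, start, true )
  return s_sub( s, start, (from or #s+1) - 1 )
end

-- Split variant (assumed stand-in): build a table of all tokens once.
-- Returns the table and the number of tokens.
local function split( s, sep )
  local t, n, start = {}, 0, 1
  while true do
    local from, to = s_find( s, sep, start, true )
    n = n + 1
    t[ n ] = s_sub( s, start, (from or #s+1) - 1 )
    if not from then return t, n end
    start = to + 1
  end
end

-- Time `reps` rounds of fetching `lookups` tokens from the same string,
-- first by rescanning each time, then by splitting once per round.
local function bench( s, sep, lookups, reps )
  local ntok = select( 2, split( s, sep ) )
  local t0 = clock()
  for _ = 1, reps do
    for k = 1, lookups do scan_token( s, sep, (k-1) % ntok + 1 ) end
  end
  local t1 = clock()
  for _ = 1, reps do
    local t = split( s, sep )
    for k = 1, lookups do local _ = t[ (k-1) % ntok + 1 ] end
  end
  local t2 = clock()
  return t1 - t0, t2 - t1   -- CPU seconds: scan, split
end

local input = "alpha,beta,gamma,delta,epsilon"
for _, lookups in ipairs{ 1, 2, 4, 8 } do
  local scan_t, split_t = bench( input, ",", lookups, 100000 )
  print( lookups, scan_t, split_t )
end
```

Note that the split variant creates a new table per input string, so in a
per-frame addon the garbage collector's work should be weighed alongside
the raw timings.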
Philipp