- Subject: Re: Interning strings considered harmful (somewhat)
- From: Roberto Ierusalimschy <roberto@...>
- Date: Mon, 26 Oct 2009 14:11:03 -0200
> The second part of the following benchmark shows distinctly non-linear
> performance: Each pass of the first part runs under one second on my
> machine, but the passes in the second part take 4, 14, and 32 seconds.
> I'm a bit worried by this, and the impact on programs which process
> data from untrusted inputs. I haven't got a really good idea what can
> be done about this. In similar cases, people have used Jenkins'
> lookup3.c hash function with a random seed. Perhaps it is sufficient
> to drop the skipping from Lua's hash function and use a random seed
> stored in the Lua state, but the internal mixing of the current hash
> function seems to be rather weak.
What exactly is worrying you? Excluding malware, I do not think this
situation happens enough to justify any worry. Considering malware, I
guess there are plenty of ways for untrusted inputs to waste CPU time;
this is only one more.
(BTW, simply dropping the skip in the hash function does solve the
"problem"; just add a "step = 1" in lstring.c to check. But it creates a
new one: the hash function becomes too expensive for long strings.)
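
To make the trade-off concrete, here is a minimal sketch of the kind of
skipping hash being discussed. It is modeled on my recollection of the loop
in Lua 5.1's lstring.c and is not a verbatim copy; the name sketch_hash and
the "full" flag are hypothetical, introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, modeled on the string-hash loop in Lua 5.1's
   lstring.c (an assumption, not the exact source).
   full == 0: skip characters in long strings (the current behaviour).
   full != 0: force step = 1, i.e. hash every character. */
static unsigned int sketch_hash(const char *str, size_t l, int full) {
    unsigned int h = (unsigned int)l;          /* seed with the length */
    size_t step = full ? 1 : (l >> 5) + 1;     /* long strings: sample only */
    size_t l1;
    for (l1 = l; l1 >= step; l1 -= step)       /* walk from the end */
        h = h ^ ((h << 5) + (h >> 2) + (unsigned char)str[l1 - 1]);
    return h;
}
```

With the skip, the loop reads fewer than 32 characters whatever the length,
so two long strings of equal length that differ only at unsampled positions
get the same hash. For example, two 64-character strings that differ only in
their first character collide here, because step is 3 and index 0 is never
read. With step = 1 they hash differently, but the loop then runs once per
character, which is the cost for long strings mentioned above.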