[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: Hash Table Collisions (n.runs-SA-2011.004)
- From: Mark Hamburg <mark@...>
- Date: Thu, 29 Dec 2011 10:55:47 -0800
On Dec 29, 2011, at 10:28 AM, Michal Kottman wrote:
> On 29 December 2011 18:37, fredrik danerklint <fredan-lua@fredan.se> wrote:
>> Overview:
>>
>> Hash tables are a commonly used data structure in most programming
>> languages. Web application servers or platforms commonly parse
>> attacker-controlled POST form data into hash tables automatically, so
>> that they can be accessed by application developers.
>>
>> If the language does not provide a randomized hash function or the
>> application server does not recognize attacks using multi-collisions, an
>> attacker can degenerate the hash table by sending lots of colliding
>> keys. The algorithmic complexity of inserting n elements into the table
>> then goes to O(n**2), making it possible to exhaust hours of CPU time
>> using a single HTTP request.
>>
>> They did mention that Lua could be vulnerably to this kind of an attack at
>> the end of the presentation, so I would like to know if it is or not.
>
> As far as I understand it, the attacks target a kind of hash tables
> where colliding keys are stored in linked lists in the hash table.
> When the weak hash function is abused, and a lot of keys with the same
> hash are generated, the performance degrades to O(n**2). The attacks
> target the simple hashing functions by Dan Bernstein, like DJBX33A,
> which are used in many other languages.
>
> Lua uses the following implementation of hash tables: (quoting from
> the Lua 5.1 source code - ltable.c):
>
> ** Hash uses a mix of chained scatter table with Brent's variation.
> ** A main invariant of these tables is that, if an element is not
> ** in its main position (i.e. the `original' position that its hash gives
> ** to it), then the colliding element is in its own main position.
> ** Hence even when the load factor reaches 100%, performance remains good.
>
> Therefore I am not sure if the attack also applies to Lua's approach.
> Lua also uses a different hashing function, and I'm not sure their
> technique of generating colliding hash keys also applies to it:
>
> TString *luaS_newlstr (lua_State *L, const char *str, size_t l) {
> GCObject *o;
> unsigned int h = cast(unsigned int, l); /* seed */
> size_t step = (l>>5)+1; /* if string is too long, don't hash all its chars */
> size_t l1;
> for (l1=l; l1>=step; l1-=step) /* compute hash */
> h = h ^ ((h<<5)+(h>>2)+cast(unsigned char, str[l1-1]));
If you can generate several keys with the same hash value, then you can basically defeat any hash table implementation simply because the hash will become ineffective. Lua's hash tables more or less fall back to linked lists of cells on collisions. (The details are actually a bit more complicated.) Brent's variation as I understand it encourages having cells filled with entries whose keys actually hash to those cells, but if all the keys generate the same hash, it won't get you all that far.
So, the next question would be: can one reliably generate a collection of keys that all have the same hash value? And the answer is that it's pretty easy to do since for long strings the Lua string hash function starts skipping characters.
A secure implementation would introduce some randomizing element per run or per Lua universe that would result in hash values that would collide under some conditions not colliding under others.
Mark