- Subject: Re: Userdata with methods *and* table access.
- From: David Given <dg@...>
- Date: Wed, 13 Oct 2004 22:44:20 +0100
Tom Spilman wrote:
> [...]
>     if ( key[0] == 'x' && key[1] == 0 ) { // faster than strcmp()
> [...]
> I'm trying to speed up the function by the way as it's still a bit slower
> than a normal table index operation. I'm considering removing type checking
> in lclass_checkobject<>(), but aside from that does anyone have any other
> suggestions to speed this up?
This will only produce a tiny improvement, but switch is your friend:
    if (key[0] != '\0')
    {
        switch (key[0] | (key[1] << 8))
        {
            case 'x':
                /* do X thing */
                break;
            case 'y':
                /* do Y thing */
                break;
        }
    }
Using switch rather than a series of ifs will allow the compiler to generate
an inline binary tree, or a calculated jump table, or something similar. This
can be a lot faster than a series of ifs, even for small numbers of tests.
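To make the snippet above concrete, here is a minimal self-contained sketch. The names `lookup_field`, `FIELD_X` and friends are made up for illustration and are not from Tom's code; the casts to `unsigned char` are an addition so that bytes above 0x7f don't sign-extend.

```c
/* Hypothetical field identifiers; not part of the original code. */
enum field { FIELD_NONE, FIELD_X, FIELD_Y };

/* Pack the first two bytes of the key so that a single switch can test
   for the one-character names "x" and "y". key[1] is only read when
   key[0] is non-zero, so we never read past the terminator. */
static enum field lookup_field(const char *key)
{
    if (key[0] != '\0')
    {
        switch ((unsigned char)key[0] | ((unsigned char)key[1] << 8))
        {
            case 'x':               /* "x": the second byte is the NUL */
                return FIELD_X;
            case 'y':
                return FIELD_Y;
        }
    }
    return FIELD_NONE;              /* empty or unknown key */
}
```

With this, `lookup_field("x")` gives `FIELD_X`, while `lookup_field("xy")` and `lookup_field("")` both give `FIELD_NONE`, because the packed value includes the byte after the first character.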
Doing this:
        cmp r0, #'x'
        beq routine_to_do_x
        cmp r0, #'y'
        beq routine_to_do_y
        routine_to_do_everything_else
        b skip_to_end
routine_to_do_x
        b skip_to_end
routine_to_do_y
skip_to_end:
...is much kinder on the cache than:
        cmp r0, #'x'
        bne skip_x
        routine_to_do_x
skip_x:
        cmp r0, #'y'
        bne skip_y
        routine_to_do_y
skip_y:
In addition, this last piece of code, which is what a series of ifs will
almost certainly generate, will cause a pipeline flush at pretty much every
taken branch, at least on processors without decent branch prediction.
I've successfully used this technique to use switch to compare strings up to
seven bytes long:
    switch (unaligned_longlong_read(ptr))
    {
        case 'string1':
            /* do something */
            ...
    }
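For anyone who wants to try this without gcc-specific constants or a real unaligned read, here is a rough portable sketch. It is not the code I actually use: `pack_key`, `KEY7` and `classify` are hypothetical names, and the unaligned 64-bit read is swapped for a byte-by-byte loop that stops at the terminator.

```c
#include <stdint.h>

/* Hypothetical helper: packs up to eight bytes of a NUL-terminated string
   into a 64-bit value, first character in the low byte. It stops at the
   terminator, so it never reads past the end of the string. */
static uint64_t pack_key(const char *s)
{
    uint64_t v = 0;
    for (int i = 0; i < 8 && s[i] != '\0'; i++)
        v |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return v;
}

/* Builds the matching constant from seven characters. The result is an
   integer constant expression, so it can be used as a case label. */
#define KEY7(a, b, c, d, e, f, g)                                        \
    ((uint64_t)(a)       | (uint64_t)(b) << 8  | (uint64_t)(c) << 16 |   \
     (uint64_t)(d) << 24 | (uint64_t)(e) << 32 | (uint64_t)(f) << 40 |   \
     (uint64_t)(g) << 48)

/* Example dispatcher: returns 1 or 2 for the two known strings, else 0. */
static int classify(const char *s)
{
    switch (pack_key(s))
    {
        case KEY7('s', 't', 'r', 'i', 'n', 'g', '1'): return 1;
        case KEY7('s', 't', 'r', 'i', 'n', 'g', '2'): return 2;
        default:                                      return 0;
    }
}
```

Keeping the keys to seven characters leaves the eighth byte as zero for an exact match, so a longer string such as "string12" that merely starts with a key does not compare equal.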
However, there are some gotchas:
* You mustn't read past the end of the string, because you may overrun the
  memory block and cause a segmentation fault. That's why you need the if
  statement in the first example.
* You must either read each byte individually and chop them together, as in
  the first example, or use deep magic to get an unaligned read, as in the
  second.
* I use gcc for everything, which supports multi-character constants.
  Their value is implementation-defined, so other compilers may not handle
  them the same way. In any case, you've got to be careful of endianness
  issues. One trick is to do:
    #define STRINGCONST(s) \
        (s[0] | (s[1]<<8) | (s[2]<<16) | (s[3]<<24))
unsigned int s = STRINGCONST("RIFF");
With optimisation on, the compiler will fold this into a constant load, and
it's 100% legal. The first character ends up in the low byte, so the value
matches a little-endian 32-bit read, which is what you want for decoding
RIFF files. Very handy.
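As a rough illustration of how the macro might be used, here is a sketch that checks whether a buffer starts with a RIFF chunk ID. `read_le32` and `is_riff` are hypothetical helpers, and this version of the macro adds casts to `unsigned char` and `uint32_t` so that it also behaves for bytes above 0x7f.

```c
#include <stddef.h>
#include <stdint.h>

/* Same idea as STRINGCONST above, with casts added (an assumption of this
   sketch) to avoid sign extension and signed shifts. */
#define STRINGCONST(s)                                  \
    ((uint32_t)(unsigned char)(s)[0]       |            \
     (uint32_t)(unsigned char)(s)[1] << 8  |            \
     (uint32_t)(unsigned char)(s)[2] << 16 |            \
     (uint32_t)(unsigned char)(s)[3] << 24)

/* Hypothetical helper: reads a 32-bit little-endian value one byte at a
   time, so it works on any host endianness and any alignment. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]       | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 if the buffer starts with the RIFF chunk ID, 0 otherwise. */
static int is_riff(const unsigned char *buf, size_t len)
{
    return len >= 4 && read_le32(buf) == STRINGCONST("RIFF");
}
```

Here `STRINGCONST("RIFF")` works out to 0x46464952: 'R' (0x52) sits in the low byte and the second 'F' (0x46) in the high byte.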
This message brought to you by the late-night committee for anally retentive
code optimisation.
--
[insert interesting .sig here]