Re: Performance problem with table.move

On Mon, Dec 7, 2020 at 01:16, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

>>>>> "Ranier" == Ranier Vilela <ranier.vf@gmail.com> writes:

Ranier> For loops, the quickest option, are variables with the natural
Ranier> size of the machine.

>> No.

Ranier> No what?

I mean: No, that is a false statement.

For a concrete example: on 64-bit intel, there is an explicit
optimization guideline (from Intel's own optimization reference manual):

Assembly/Compiler Coding Rule 63. (H impact, M generality) Use the
32-bit versions of instructions in 64-bit mode to reduce code size
unless the 64-bit version is necessary to access 64-bit data or
additional registers.

("H impact" = high impact, "M generality" = medium generality)

Other 64-bit architectures typically have equal performance for 32-bit
and 64-bit operations.

If you want to prove me wrong, show an example of _actual code_ where
either 64-bit values or unsigned 32-bit are an improvement over signed
32-bit.

See:

https://blog.robertelder.org/signed-or-unsigned/

"To summarize my conclusion, I will claim that unsigned integers are better than signed integers for programming in C"

https://stackoverflow.com/questions/4712315/performance-of-unsigned-vs-signed-integers

"unsigned leads to the same or better performance than signed. Some examples:

* Division by a constant which is a power of 2 (see also the answer from FredOverflow)
* Division by a constant number (for example, my compiler implements division by 13 using 2 asm instructions for unsigned, and 6 instructions for signed)
* Checking whether a number is even (i have no idea why my MS Visual Studio compiler implements it with 4 instructions for signed numbers; gcc does it with 1 instruction, just like in the unsigned case)"

"Division by powers of 2 is faster with unsigned int, because it can be optimized into a single shift instruction."

That seems ideal for indexing arrays whose size is a power of 2.
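
A minimal sketch of that point, assuming a hypothetical table of 8 entries (TABLE_SIZE and the four function names are made up for illustration and come from neither the Lua sources nor the links above). With unsigned operands, division and remainder by a power of 2 are exactly a shift and a mask. With signed operands, C truncates toward zero, so the compiler has to emit extra fix-up code for negative values:

```c
#include <stdint.h>

/* Hypothetical table size for illustration; must be a power of 2. */
#define TABLE_SIZE 8u

/* Unsigned: i % 8 is exactly i & 7, and i / 8 is exactly i >> 3. */
uint32_t wrap_unsigned(uint32_t i) { return i % TABLE_SIZE; }
uint32_t div_unsigned(uint32_t i)  { return i / TABLE_SIZE; }

/* Signed: division truncates toward zero (C99 and later), so
 * -1 / 8 is 0 while an arithmetic shift of -1 by 3 would give -1.
 * A plain shift or mask is therefore not enough on its own. */
int32_t wrap_signed(int32_t i) { return i % 8; }
int32_t div_signed(int32_t i)  { return i / 8; }
```

Note that wrap_signed can return a negative value, which would not be usable as an array index without a further check.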

For loops, compare the code generated on https://godbolt.org/ by
x86-64 clang 11.0 (-O2).

Signed:

int f(int i) { int j, k = 0; for (j = i; j < i + 10; ++j) ++k; return k; }

f(int):                                 # @f(int)
        lea     eax, [rdi + 9]
        cmp     eax, edi
        cmovl   eax, edi
        sub     eax, edi
        add     eax, 1
        ret

Unsigned:

unsigned int f(unsigned int i) { unsigned int j, k = 0; for (j = i; j < i + 10; ++j) ++k; return k; }

f(unsigned int):                        # @f(unsigned int)
        xor     ecx, ecx
        cmp     edi, -10
        mov     eax, 10
        cmovae  eax, ecx
        ret
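
For anyone reading the unsigned listing, here is a rough C rendering of what it computes. This is only my reading of the assembly, assuming a 32-bit unsigned int. The names f_unsigned_loop and f_unsigned_closed are hypothetical and exist only for this comparison. The `cmp edi, -10` / `cmovae` pair selects 0 when i + 10 wraps around and 10 otherwise:

```c
#include <limits.h>

/* The loop as written in the unsigned test case above. */
unsigned int f_unsigned_loop(unsigned int i)
{
    unsigned int j, k = 0;
    for (j = i; j < i + 10; ++j)
        ++k;
    return k;
}

/* Closed form matching the generated code: when i >= UINT_MAX - 9,
 * i + 10 wraps to a value not greater than i, so the loop body
 * never runs; otherwise it runs exactly 10 times. */
unsigned int f_unsigned_closed(unsigned int i)
{
    return (i >= UINT_MAX - 9u) ? 0u : 10u;
}
```

The signed version has no such wrap-around case to describe in portable C, because overflow of i + 10 is undefined behavior for int.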

Ranier Vilela