On Mon, Dec 7, 2020 at 01:16, Andrew Gierth <> wrote:
>>>>> "Ranier" == Ranier Vilela <> writes:

 Ranier> For loops, the quickest option, are variables with the natural
 Ranier> size of the machine.

 >> No.

 Ranier> No what?

I mean: No, that is a false statement.

For a concrete example: on 64-bit intel, there is an explicit
optimization guideline (from Intel's own optimization reference manual):

  Assembly/Compiler Coding Rule 63. (H impact, M generality) Use the
  32-bit versions of instructions in 64-bit mode to reduce code size
  unless the 64-bit version is necessary to access 64-bit data or
  additional registers.

("H impact" = high impact, "M generality" = medium generality)

Other 64-bit architectures typically have equal performance for 32-bit
and 64-bit operations.

If you want to prove me wrong, show an example of _actual code_ where
either 64-bit values or unsigned 32-bit are an improvement over signed

See at:
"To summarize my conclusion, I will claim that unsigned integers are better than signed integers for programming in C"

"unsigned leads to the same or better performance than signed. Some examples:
  • Division by a constant which is a power of 2 (see also the answer from FredOverflow)
  • Division by a constant number (for example, my compiler implements division by 13 using 2 asm instructions for unsigned, and 6 instructions for signed)
  • Checking whether a number is even (I have no idea why my MS Visual Studio compiler implements it with 4 instructions for signed numbers; gcc does it with 1 instruction, just like in the unsigned case)"

"Division by powers of 2 is faster with unsigned int, because it can be optimized into a single shift instruction."

That seems ideal for indexing arrays whose size is a power of 2.
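
To make the power-of-2 point concrete, here is a minimal C sketch. The names (TABLE_SIZE, wrap_unsigned, wrap_signed, div8_unsigned, div8_signed, check_examples) are hypothetical and chosen only for illustration. It assumes C99 or later, where signed division truncates toward zero.

```c
#include <assert.h>

#define TABLE_SIZE 8u  /* hypothetical table size; must be a power of 2 */

/* Unsigned: for a power-of-2 size, i % TABLE_SIZE equals i & (TABLE_SIZE - 1),
   so the result is always a valid index. */
unsigned wrap_unsigned(unsigned i) { return i % TABLE_SIZE; }

/* Signed: the remainder of a negative dividend is zero or negative,
   so it cannot be used directly as an index. */
int wrap_signed(int i) { return i % (int)TABLE_SIZE; }

/* Unsigned division by 8 gives the same result as a right shift by 3. */
unsigned div8_unsigned(unsigned x) { return x / 8u; }

/* Signed division by 8 rounds toward zero, so for negative x it is not
   a plain arithmetic shift and needs a fix-up. */
int div8_signed(int x) { return x / 8; }

void check_examples(void) {
    assert(wrap_unsigned(13u) == 5u);
    assert(wrap_unsigned(13u) == (13u & (TABLE_SIZE - 1u)));
    assert(wrap_signed(-3) == -3);             /* not a usable index */
    assert(div8_unsigned(17u) == (17u >> 3));
    assert(div8_signed(-9) == -1);             /* an arithmetic shift would typically give -2 */
}
```

This only shows the semantic difference that forces the extra instructions in the signed case. Actual instruction counts depend on the compiler and flags.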

For loops:
x86-64 clang 11.0 (-O2):

int f(int i) { int j, k = 0; for (j = i; j < i + 10; ++j) ++k; return k; }
f(int):                                  # @f(int)
        lea     eax, [rdi + 9]
        cmp     eax, edi
        cmovl   eax, edi
        sub     eax, edi
        add     eax, 1

unsigned int f(unsigned int i) { unsigned int j, k = 0; for (j = i; j < i + 10; ++j) ++k; return k; }
f(unsigned int):                                  # @f(unsigned int)
        xor     ecx, ecx
        cmp     edi, -10
        mov     eax, 10
        cmovae  eax, ecx
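
As a rough check of what the unsigned listing computes, here is the same pair of loops in plain C. The functions are renamed count_signed and count_unsigned (hypothetical names) because C has no overloading. The unsigned version returns 10 unless i + 10 wraps around, which matches the cmp edi, -10 / cmovae sequence above. The signed version is only called with a value where i + 10 does not overflow, since signed overflow is undefined behavior.

```c
#include <assert.h>
#include <limits.h>

/* Signed loop from the listing. If i + 10 overflows, the behavior is undefined. */
int count_signed(int i) {
    int j, k = 0;
    for (j = i; j < i + 10; ++j) ++k;
    return k;
}

/* Unsigned loop from the listing. i + 10 wraps modulo UINT_MAX + 1, so for
   i >= UINT_MAX - 9 the bound is smaller than i and the loop body never runs. */
unsigned count_unsigned(unsigned i) {
    unsigned j, k = 0;
    for (j = i; j < i + 10; ++j) ++k;
    return k;
}

void check_counts(void) {
    assert(count_signed(5) == 10);
    assert(count_unsigned(0u) == 10u);
    assert(count_unsigned(UINT_MAX - 10u) == 10u);  /* last value before the bound wraps */
    assert(count_unsigned(UINT_MAX - 9u) == 0u);    /* i + 10 wraps to 0 */
}
```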

Ranier Vilela