I would need more tests, including on hardware I no longer have access to,
but I have had experience with packed structs and unaligned accesses and
in general you do not want them unless you have a very good reason.

- Not all ARM CPUs support them. On some 32-bit ARM CPUs that
  do not, you will get a SIGBUS on Android.

- With several CPU / libc combinations you will get a significant
  performance penalty. Basically there are architectures where the
  CPU handles the unaligned access and it causes a small
  performance hit that can be compensated by cache locality, but
  on other architectures it raises an exception that is caught by
  a handler in the libc, and that can be *super slow*.
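
For reference, here is a minimal sketch of the usual portable way to
touch data at an arbitrary byte offset without relying on the CPU or a
trap handler. It is not something proposed in this thread; the helper
names are hypothetical. memcpy has no alignment requirement, and
compilers typically turn it into a single load or store where the
target allows unaligned access, and into byte accesses where it does
not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a double stored at any byte offset.
   Going through memcpy avoids dereferencing a misaligned double*,
   which is what can raise SIGBUS on strict-alignment CPUs. */
static double load_double_unaligned(const unsigned char *buf, size_t offset) {
    double d;
    assert(buf != NULL);
    memcpy(&d, buf + offset, sizeof d);
    return d;
}

/* Hypothetical helper: write a double at any byte offset. */
static void store_double_unaligned(unsigned char *buf, size_t offset, double d) {
    assert(buf != NULL);
    memcpy(buf + offset, &d, sizeof d);
}
```

Calling store_double_unaligned(buf, 1, 3.5) followed by
load_double_unaligned(buf, 1) round-trips the value even though
offset 1 is not a multiple of 8.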

-- 
Pierre Chapuis

On Mon, Oct 11, 2021, at 18:39, Roberto Ierusalimschy wrote:
> Hugo Gualandi came up with the idea of using a packed structure to store
> Lua values. Intel CPUs (and it seems ARMs too) can work with unaligned
> data (or data aligned to weaker boundaries) and, at least for some
> architectures, with very small (or even no) performance penalties.
>
> As a very fast check, I simply changed the following line in lobject.h:
>
> -typedef struct TValue {
> +typedef struct __attribute__((packed)) TValue {
>    TValuefields;
>  } TValue;
>
> This is valid in gcc and clang.  (It gives one warning in ltable.c which
> for now I am ignoring. It is a trivial change to correct that: pass the
> second parameter of 'mainposition' by value instead of by reference.)
>
> I quickly tested that on two Intel i7 machines. As expected, memory
> use by arrays is cut by almost half (9/16). Perhaps unexpectedly, I
> did not see any relevant performance penalties at all. (In a few
> benchmarks, performance even improved, probably because there
> is less memory traffic.)
>
> It would be good to know how this change works in other architectures.
>
> -- Roberto
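
The 9/16 ratio and the by-value workaround mentioned in the quoted
message can be illustrated with a small sketch. It assumes a typical
64-bit target (8-byte pointers and doubles, 8-byte alignment) and gcc
or clang for the packed attribute. The Demo* names are simplified
stand-ins, not the actual definitions in lobject.h.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a Lua value payload: 8 bytes on a 64-bit target. */
typedef union { void *p; long long i; double n; } DemoPayload;

/* Payload plus a 1-byte type tag, with default alignment. */
typedef struct { DemoPayload value; unsigned char tt; } DemoValue;

/* Same fields, packed: no padding after the tag. */
typedef struct __attribute__((packed)) {
    DemoPayload value;
    unsigned char tt;
} DemoPackedValue;

static_assert(sizeof(DemoPayload) == 8, "sketch assumes an 8-byte payload");
static_assert(sizeof(DemoValue) == 16, "7 bytes of tail padding");
static_assert(sizeof(DemoPackedValue) == 9, "payload + tag, no padding");

/* Taking the value by copy avoids handing out a pointer to a field
   of a packed struct, which is the kind of use gcc and clang warn
   about. */
static double demo_number(DemoPackedValue v) {
    return v.value.n;
}

/* Element 1 of a packed array starts at byte offset 9. */
static double demo_second_number(const DemoPackedValue *arr) {
    return demo_number(arr[1]);
}
```

With these definitions an array of n packed values occupies 9*n bytes
instead of 16*n. Accesses written as arr[i].value.n stay correct
because the compiler knows the struct has alignment 1. The risk lies
in code that takes the address of a field and dereferences it later
through an ordinary pointer.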