



On Fri, May 15, 2020 at 11:36, Francisco Olarte <folarte@peoplecall.com> wrote:
Philippe:

On Fri, May 15, 2020 at 11:01 AM Philippe Verdy <verdyp@gmail.com> wrote:
>> Have you ever worked
> Yes of course !!!

It was rhetorical.


>> with an architecture needing alignment, or
>> considered the zero length array does not need to be of a trivially
>> aligned type like char, i.e., how about { char a; int b[] }?
> Yes, not a problem, the "char a" has no alignment constraints: sizeof(a)=1 and alignof(a) =1
> But given that, for example, sizeof(int)=4 and alignof(int)=4 may be needed on some platforms, b will be pre-padded with 3 bytes. The total structure then has 8 bytes, not 5

Are you sure? I would swear it is "4 not 1", and not "8 not 5":

gcc version 8.3.0 (Debian 8.3.0-6)
$ cat kk.c
#include <stdio.h>
struct {  char a;  int b[]; } x;
int main(int ac, char **av) {
  printf("%d\n", (int)sizeof(x));
  return 0;
}
$ gcc kk.c
$ ./a.out
4

You are NOT testing the same thing:

Here the size is the size of the char plus the padding *prepended* before a possible int (whose sizeof and alignof are both 4), so there are 3 bytes prepended *before* the member array b; they are not really appended after member a, which does not need them at all.

Also, I only took an example: it all depends on the platform's alignment constraints (for short, int, long, long long, float, short float, double, long double... but never for "[un]signed char" alone, which C guarantees to have size and alignment equal to 1, independently of the bit size of a char, which is not necessarily an "octet" but just a fuzzy "byte"!)

I know that there exist platforms whose addressable "byte" unit is smaller than an octet: some controllers use nibbles, and some platforms even allow bit-addressable memory. These can even be very fast when they use serial links rather than parallel buses, which are hard to synchronize reliably at very high clock speeds above 1 GHz except over very short distances, possibly only inside the same chip for clocks over 10 GHz: at these speeds, parallel buses use too much energy to stay synchronized and transfer data reliably.

I'm sure there will be very fast CPUs that restore bit-addressable memory and serial links everywhere, because this allows faster clocks and lower energy dissipation (external memory chips using serial links already exist for the same reason: to accelerate transfer rates with the same or improved reliability over larger distances). This is already why there are now optical links working reliably at 10 Gbps and 100 Gbps, and soon we'll have 1 Tbps optical links: parallel buses will no longer be the best choice except inside the same chip.

A C compiler, however, requires that the "char" type, with sizeof()=1, has a minimum bit size: the standard requires at least 8 bits (CHAR_BIT >= 8) and sets no maximum. But this says nothing at all about the minimum unit of addressable memory (that's also why pointers and pointer differences in C/C++ are not necessarily the same size as native integers: pointers and pointer diffs may be larger than a "native" integer).

The C/C++ standards also do not require the native "int" to have a fixed bit size, or a bit size that is a multiple of the bit size of a "char"; they just require its size in bytes to be a whole number of bytes, even if this wastes some unused padding bits. Most platforms, probably all, have avoided these padding bits and simply increased the bit size to extend the value domain within the same sizeof() value; alternatively, some of these bits could serve as control bits, for example as error-correcting codes for transmission over very fast links. This means that a char could still have an 8-bit value range but a 10-bit storage length with 2 error-correcting bits (correction would occur when loading the value into an 8-bit register, and the value may still be transmitted over a very fast, long-distance serial link; the encoding used on the serial link may also not be based on binary bits but could use ternary "trits", or a number of states which is not necessarily a power of 2, allowing more complex error-correcting codes without wasting energy and time on synchronizing parallel buses).

In programs, those details of the data units are invisible: it does not matter how memory is addressed. C/C++ just measures the data units that a program can freely use without limitation; it ignores all extra correcting bits and any other padding that may be needed between these units. The design of the serial links or buses does not matter either: there could be any protocol, such as an Ethernet frame, a USB frame or a CSMA wireless frame, including other specific codes for autosynchronization, for adaptation to environment noise or shared access to the transmission medium, or for rewriting over a storage area whose previous state is unknown.

Likewise, a C/C++ program can work even if the minimum addressable unit of memory is larger than a single char, without breaking the rule sizeof(char)=alignof(char)=1. For example, even if a "char" requires 16 bits, two chars in a structure or array could be stored at the same physical address, and the C/C++ compiler would turn its "pointers" into a pair containing a physical address and a relative bit number (converted to a bit shift and bit mask). Pointers in C/C++ are not necessarily the same as physical addresses, which may have a more complex structure than just a single integer (meaning that converting a pointer to an int in C/C++ can be lossy, and that the reverse will not work in general, except on some existing popular platforms). For C/C++ there's just an adaptation layer from the logical data units to the physical units, with the sole intent of being "fast" for most common applications, but multiple adaptations are possible that could be better for some platforms or some goals. It is this adaptation layer that determines the bit sizes of all basic types (char, int...) and their alignment and padding constraints; the adaptation layer must then just ensure that:

* 1==sizeof(char)<=sizeof(wide char), and
* 1<=sizeof(boolean)<=sizeof(short)<=sizeof(int)<=sizeof(long)<=sizeof(long long), and
* sizeof(type)==sizeof(signed type)==sizeof(unsigned type) for all integer types (including char, boolean, short, long, long long, and pointer diffs), and
* a "pointer diff" type is defined as one of the available integer types, and
* 1<=sizeof(pointer)<=sizeof(pointer diff), and
* 1<=sizeof(short float)<=sizeof(float)<=sizeof(double)<=sizeof(long double), and
* the "char" type must be at least 8 bits wide (CHAR_BIT >= 8 in <limits.h>); it may be signed or unsigned, and in either case it covers at least the range 0..127. Beyond that minimum, the bit size of char, its value range and its signedness are up to the implementation
* for all integer types, the negation of a value does not necessarily give a distinct value, and there's no guarantee that abs(value)==abs(-value), just the guarantee that value==abs(value) or value==-abs(value); this has consequences for comparison operators, because integer types do not obey the common arithmetic rules for absolute values: they can operate in a finite cyclic ring rather than an interval of the unbounded set of integer numbers
* the cardinality of the value range of integer types is not necessarily a power of 2: some platforms may use only BCD integers, so a single "char" could just as well store only the values 0 to 99 (using 8 bits internally; it would be unsigned in that case), or -499 to 499, or 0 to 999 (using 12 bits internally), or 0 to 9999 (using 16 bits internally).

The C/C++ standards require no particular alignment for any type. If alignment is needed and used by the compiler for its own performance goals (and measurable in "bytes" with "sizeof"), it is done only with *prepadding* before the type: no type is followed by trailing padding bytes.

_______________________________________________
lua-l mailing list -- lua-l@lists.lua.org
To unsubscribe send an email to lua-l-leave@lists.lua.org