lua-l archive

On Sat, 17 Nov 2018 at 07:55, Dirk Laurie <dirk.laurie@gmail.com> wrote:
On Sat, 17 Nov 2018 at 04:31, Tim Hill <drtimhill@gmail.com> wrote:

> I think “word” wrt CPU architecture was never a precise term, and probably never will be.

Oh, a mathematically precise definition is possible. "A word is an
individually addressable entity larger than the smallest such entity."
Whether this definition is useful can be established only by a
disciple of Bourbaki (one of which, an I mistake not, has recently
joined this list).

This definition also applies to Lua (or JavaScript) objects: they are the smallest individually addressable entities, just like nodes in Lisp, and they always carry a type with them (including at the lower VM level, where they are "addressed" relatively by the opcodes of the virtual instruction set). There is a special case for characters in strings, but strings are immutable objects, and getting characters from a string (e.g. extracting a substring) is a way to "project" the string through a function that returns another typed object (another string object or a number object). The "size" in bits of Lua objects is not defined precisely.
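
To make the "projection" idea concrete, here is a minimal sketch in C of a value that carries its type tag, with two functions that take a character out of a string and hand back another tagged value. The names (`value`, `make_string`, `byte_at`, `char_at`) are hypothetical and chosen for illustration; this is not how the Lua VM actually lays out its objects, and `char_at` returns a view into the same bytes instead of creating a new string object as Lua would.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tagged value: the type always travels with the payload. */
typedef enum { T_NUMBER, T_STRING } tag;

typedef struct {
    tag type;
    union {
        double number;
        struct { const char *data; size_t len; } string;
    } as;
} value;

value make_string(const char *s)
{
    value v;
    v.type = T_STRING;
    v.as.string.data = s;
    v.as.string.len = strlen(s);
    return v;
}

/* Roughly like string.byte(s, i): 1-based index, result is a number value. */
value byte_at(value s, size_t i)
{
    assert(s.type == T_STRING && i >= 1 && i <= s.as.string.len);
    value v;
    v.type = T_NUMBER;
    v.as.number = (unsigned char)s.as.string.data[i - 1];
    return v;
}

/* Roughly like string.sub(s, i, i): result is a one-character string value.
   Simplification: it points into the original bytes, which is safe only
   because the string is treated as immutable. */
value char_at(value s, size_t i)
{
    assert(s.type == T_STRING && i >= 1 && i <= s.as.string.len);
    value v;
    v.type = T_STRING;
    v.as.string.data = s.as.string.data + (i - 1);
    v.as.string.len = 1;
    return v;
}
```

Either way, the caller never gets a bare character: what comes back is again a typed object, a number in one case and a string in the other.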

Note that processors do not always have a single word size: there exist processors with several address spaces, which may be used either to refer to the same objects or parts of objects (as aliased addresses) or to completely separate spaces. The 80x86 has the concept of segments, for example, where these spaces are not necessarily independent. There are also microcontrollers that can address memory using different word sizes (e.g. bytes or bits), as well as I/O spaces (not necessarily separate from the memory space), interrupt spaces, MMU spaces (pages), and register files (with different word sizes between ALUs, FPUs and VPUs).
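
As an illustration of the aliasing mentioned for 80x86 segments, the following sketch models real-mode 8086 addressing, where the physical address is segment * 16 + offset, truncated to 20 bits. The type and function names (`far_addr`, `physical`, `aliases`, `normalize`) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode 8086 address is two 16-bit words, not one integer. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_addr;

/* Physical address: segment * 16 + offset, wrapped to 20 bits
   as on the original 8086 (1 MiB address space). */
uint32_t physical(far_addr a)
{
    return (((uint32_t)a.segment << 4) + a.offset) & 0xFFFFFu;
}

/* Two far addresses alias when they name the same physical byte. */
int aliases(far_addr a, far_addr b)
{
    return physical(a) == physical(b);
}

/* One canonical representative among the aliases of a physical
   address: the offset is kept in the range 0..15. */
far_addr normalize(uint32_t phys)
{
    assert(phys <= 0xFFFFFu);
    far_addr a = { (uint16_t)(phys >> 4), (uint16_t)(phys & 0xFu) };
    return a;
}
```

For instance, 1234:0010 and 1235:0000 both map to physical address 0x12350, so comparing the two words of a far address is not the same as comparing the addresses they designate.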

This makes it difficult to define an address (and the concept of a "minimal unit of information which is individually addressable"), notably in C. C++ solves the problem with more cleanly typed pointers (notably function pointers, or object method pointers, which carry with them the address of the object on which they apply and possibly a table entry number), but C also deals with it through things like near/far pointers. It is not evident that CPU integers can represent all the distinct addresses needed: you frequently need multiple integer-typed "words" to represent a single distinct address. So even at the CPU level, a word is not just an integer; it is typed, even if the types are not necessarily stored but implied by the instructions used to handle them. Converting a typed address to another type is most often lossy, and reconstructing a complete address from a partial address requires additional words (for example an access token or the current operating mode/state, representing a proof of access right or privilege).
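
The lossy conversion can be sketched with the near/far distinction. This is plain standard C modelling the idea with structs, not the `near`/`far` keywords of old DOS compilers; all names here (`far_ptr`, `near_ptr`, `to_near`, `to_far`, `same_far`, `demo_round_trip`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t segment; uint16_t offset; } far_ptr;  /* two words */
typedef struct { uint16_t offset; } near_ptr;                   /* one word  */

/* Far -> near: lossy, the segment word is simply dropped. */
near_ptr to_near(far_ptr p)
{
    near_ptr n = { p.offset };
    return n;
}

/* Near -> far: impossible without an extra word of context,
   here the segment that is assumed to be current. */
far_ptr to_far(near_ptr n, uint16_t current_segment)
{
    far_ptr p = { current_segment, n.offset };
    return p;
}

int same_far(far_ptr a, far_ptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* The round trip recovers the address only with the right context. */
void demo_round_trip(void)
{
    far_ptr p = { 0x2000, 0x0042 };
    near_ptr n = to_near(p);
    assert(same_far(to_far(n, 0x2000), p));   /* right segment: recovered  */
    assert(!same_far(to_far(n, 0x3000), p));  /* wrong segment: other byte */
}
```

The missing information lives outside the pointer value itself, which is the sense in which the "word" is typed by the instructions and state used to interpret it.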