- Subject: Re: string.pack with bit resolution
- From: bil til <flyer31@...>
- Date: Wed, 16 Oct 2019 23:23:16 -0700 (MST)
One further offer to make:
I would skip the signed bit numbers. Signed integers always have the
slightly awkward property that there is one more negative number than
there are positive ones, and this "negative surplus" is the "wicked 0x80",
which can give nightmares even to experienced programmers. In the case of
chars this is a defect of well under 1%, but in the case of a2 it is a 25%
defect, which is really hard to explain to any user.
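To make the asymmetry concrete, here is a small sketch (assuming Lua 5.3 or
later for the integer shift operators; the helper names are made up for
this illustration) that prints the value range of n-bit signed and unsigned
numbers:

```lua
-- Range of an n-bit two's-complement integer: one more negative value
-- than positive values.
local function signed_range(n)
  return -(1 << (n - 1)), (1 << (n - 1)) - 1
end

-- Range of an n-bit unsigned integer.
local function unsigned_range(n)
  return 0, (1 << n) - 1
end

for _, n in ipairs({ 2, 4, 8 }) do
  local smin, smax = signed_range(n)
  local umin, umax = unsigned_range(n)
  print(string.format("%d bits: signed %d..%d, unsigned %d..%d",
                      n, smin, smax, umin, umax))
end
```

For 2 bits the signed range is -2..1, so one of only four values is the
unmatched negative one.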
So let's concentrate on the positive world and on the unsigned bit numbers
A, A2 and A4.
Labeling bits with capital letters is of course half a nightmare again. So
if you are flexible enough for this, I would use the sign "." to mark a bit.
And if I have already convinced you of the importance of bits in such
packings, please also allow the shortcuts : and | for 2-bit and 4-bit
unsigned.
So then 3 more lines in your format list:
. a bit (value 0/nil/false, 1/true)
.[n] an (unsigned) bit number with n bits (value 0/nil/false, 1/true ... 2^n-1)
: an unsigned number with 2 bits (value 0/nil/false, 1/true, 2, 3)
| an unsigned number with 4 bits (value 0/nil/false, 1/true, 2 ... 15)
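As an illustration of what such bit fields would mean at the byte level,
here is a rough sketch in plain Lua (5.3 or later). The helper pack_bits is
hypothetical and not part of Lua; it assumes fields are filled starting at
the most significant bit and that the last byte is padded with zero bits,
which the proposal above does not specify:

```lua
-- Map the proposed "0/nil/false, 1/true" convention to integers.
local function to_int(v)
  if v == nil or v == false then return 0 end
  if v == true then return 1 end
  return v
end

-- Hypothetical helper: packs values[i] into a field of widths[i] bits,
-- most significant bit first, zero-padding the final byte.
local function pack_bits(widths, values)
  local acc, nbits, out = 0, 0, {}
  for i, w in ipairs(widths) do
    local v = to_int(values[i])
    assert(math.type(v) == "integer" and v >= 0 and v < (1 << w),
           "value does not fit in " .. w .. " bits")
    acc = (acc << w) | v
    nbits = nbits + w
    -- Emit every complete byte collected so far.
    while nbits >= 8 do
      nbits = nbits - 8
      out[#out + 1] = string.char((acc >> nbits) & 0xFF)
    end
    acc = acc & ((1 << nbits) - 1)  -- keep only the bits not yet emitted
  end
  if nbits > 0 then
    out[#out + 1] = string.char((acc << (8 - nbits)) & 0xFF)
  end
  return table.concat(out)
end

-- One bit, one bit, a 2-bit field and a 4-bit field share a single byte.
local s = pack_bits({ 1, 1, 2, 4 }, { true, false, 3, 10 })
print(#s, string.format("0x%02X", s:byte(1)))
```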
If possible, also the 2 following additional float types:
r short float
D long double
(remark: long double is an ANSI C standard type - this in ANY case needs to
be in the list somehow ... otherwise the 64-bit fans will kill you...)
As separators you already allow spaces; some people will surely want
commas, and I would also propose single quotes / apostrophes '. So then the
last line of the format list should read:
" ", ",", "'": These 3 characters (space, comma, single quote) are ignored
Then you could write nice formats to pack/unpack a string into its bits like
this (e.g. 1 byte, 1 short, 1 int):
"....'...." or "||"
"....'....'....'...." or "||'||"
"||||'||||"
(of course you should also allow "8." or "16." or "32." ... but the above
notations really look nice, even designers would like them, I hope)
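To check that the formats above really add up to 8, 16 and 32 bits, here is
a small sketch of how such a format string could be read. The function names
are made up for this illustration, and only the characters ".", ":", "|" and
the three ignored separators are handled (no counts like "8." or ".[n]"):

```lua
-- Bit width of each proposed specifier, and the separators to skip.
local WIDTH = { ["."] = 1, [":"] = 2, ["|"] = 4 }
local IGNORED = { [" "] = true, [","] = true, ["'"] = true }

-- Hypothetical parser: turns a bit format string into a list of widths.
local function bit_widths(fmt)
  local widths = {}
  for ch in fmt:gmatch(".") do  -- the pattern "." matches any character
    if WIDTH[ch] then
      widths[#widths + 1] = WIDTH[ch]
    elseif not IGNORED[ch] then
      error("unsupported format character '" .. ch .. "'")
    end
  end
  return widths
end

-- Total number of bits described by a format string.
local function total_bits(fmt)
  local sum = 0
  for _, w in ipairs(bit_widths(fmt)) do sum = sum + w end
  return sum
end

for _, fmt in ipairs({ "....'....", "||", "||'||", "||||'||||" }) do
  print(fmt, total_bits(fmt))
end
```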
A further VERY nice application would appear if you allowed specifying
a count n in the format list. You already use the letter n for lua_Number
(SIDE REMARK1: which makes sense - just please specify how you do this - I
assume you need _tt and then the native number of bytes (so in LUA_32BITS
this would be 4 bytes for int or 4 bytes for float) - you have to specify
how long the _tt is - I assume 1 byte is fine for this, and maybe zero for
float and 1 for integer and 2 for boolean or s - this of course really MUST
be specified exactly in the description of pack / unpack. You could e.g.
also use the _tt markings 0 for boolean with 1 byte, 4 for int32, 5 for
float32, 8 for int64, 9 for double64, maybe also FE for pointer32 and FF for
pointer64)
... so n is taken already ... then maybe use #.
(SIDE REMARK2: The specifiers j and J are stupid - they make no sense ... if
somebody wants an integer in this list, i should be used, please)
(SIDE REMARK3: In the format list for h and l you write "native size" - this
is stupid in my eyes, please change it to "2 bytes" for h and "4 bytes" for
l, or do you know some other native size for short and long??? - only
lua_Integer and lua_Number have a native size, as I see it)
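The question about native sizes can be checked directly with
string.packsize (Lua 5.3 or later). In the reference implementation h and l
correspond to the C types short and long, and the size of long in particular
differs between platforms (commonly 8 bytes on 64-bit Linux, 4 bytes on
Windows), so the output of this sketch depends on the machine it runs on:

```lua
-- Print the size in bytes that each option occupies on this platform.
-- "i2" and "i4" have an explicit size and are the same everywhere;
-- the others follow the native C types of the build.
for _, opt in ipairs({ "h", "i", "l", "j", "n", "i2", "i4" }) do
  print(opt, string.packsize(opt))
end
```

Formats that must be portable can use the explicitly sized forms such as i2
and i4 instead of h and l.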
So for the extension I want to describe in the following, please add two
further lines to your format list:
i# (or #| or #.4 ...): this number element is put into the string, and
additionally it is used as count specifier for the following elements (then
allowing to write #i #b #c)
^i (or: ^| or ^b or ^.4...) this number element is put into the string, and
additionally it is used as length specifier for the following elements (then
allowing to write i^ b^ c^)
[SIDE REMARK4: your existing format s1 is then an abbreviation of the
notation "^b c^"; s2 could then also be written "^h #c^", ...]
If you do this # parameter as I just defined, then you could also allow
tables as parameters for pack (and as return for unpack):
string.pack( "n# #n", #t, t)
this should please pack the index field of a table into the string. If you
want to allow similar unpacking of a table in such a very nice and compact
statement, it would be nice if it would be possible to write:
t.setn, t.seti= string.unpack( "n# #n")
If the user knows, that the table contains only booleans, then the user
could e. g. also write:
string.pack ( "n# #.", #t, t)
and if you supply the functions table.setn, table.seti:
t.setn, t.seti= string.unpack( "n# #.")
(this would require that your unpack function returns a Boolean by default
for the bit specifier ., but this is no restriction... if somebody
prefers lua_Number, the value will be converted more or less automatically
by the function receiving the number, as I see it)
... it is in such case VERY helpful, if the format string for packing and
unpacking is always EXACTLY the same, this makes life for configuration of
user data MUCH easier ...
... this would be really VERY nice for a whole bunch of users I am very
sure...
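For comparison, a count-prefixed array can already be written with the
existing string.pack (Lua 5.3 or later) by building the format string at run
time. This is only a sketch: pack_array and unpack_array are made-up names,
the 4-byte little-endian count is an arbitrary choice, and because every
element is passed as a separate argument it is only suitable for small
arrays:

```lua
-- Pack #t as a 4-byte count, followed by #t lua_Numbers.
local function pack_array(t)
  return string.pack("<I4" .. string.rep("n", #t), #t, table.unpack(t))
end

-- Read the count first, then build a matching format for the elements.
local function unpack_array(s)
  local count, pos = string.unpack("<I4", s)
  if count == 0 then return {} end
  local t = table.pack(string.unpack("<" .. string.rep("n", count), s, pos))
  t[t.n] = nil  -- string.unpack also returns the next position; drop it
  t.n = nil
  return t
end

local packed = pack_array({ 1.5, 2, -3 })
local restored = unpack_array(packed)
print(#packed, #restored, restored[1], restored[2], restored[3])
```

The pack and unpack formats are not identical here, which is exactly the
inconvenience the "n# #n" notation is meant to remove.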
If for the hash part of the table you allow further functions table.getnk()
(number of keys), table.getkv(), and table.setnkv( nk, ...) (the ...
here stands for successive pairs of keys and values), then you can use this
for the hash part of a table as well:
string.pack( "n# #z #z", t.getnk(), t.getkv())
t.setnkv( string.unpack( "n# #z #z"))
... this ALSO would be really VERY nice for a whole bunch of users, I
think...
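With the existing functions, the same idea for the hash part looks roughly
like the following sketch (Lua 5.3 or later). pack_map and unpack_map are
made-up names; the sketch assumes all keys and values are strings without
embedded zero bytes, since z packs zero-terminated strings, and it writes
the pairs in whatever order pairs() delivers them:

```lua
-- Pack the number of pairs as a 4-byte count, then each key and value
-- as zero-terminated strings.
local function pack_map(t)
  local parts, count = {}, 0
  for k, v in pairs(t) do
    parts[#parts + 1] = string.pack("zz", k, v)
    count = count + 1
  end
  return string.pack("<I4", count) .. table.concat(parts)
end

-- Read the count, then that many key/value pairs.
local function unpack_map(s)
  local count, pos = string.unpack("<I4", s)
  local t = {}
  for _ = 1, count do
    local k, v
    k, v, pos = string.unpack("zz", s, pos)
    t[k] = v
  end
  return t
end

local copy = unpack_map(pack_map({ name = "lua", version = "5.3" }))
print(copy.name, copy.version)
```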
Concerning error handling: What do you do if string.pack is invoked
with specifier s1 and a string argument of more than 255
bytes? Do you then give an error message, or do you just emit the first
255 bytes of this string? ... both would be OK, both have advantages and
disadvantages, just please specify this better. (To be honest, although this
is a lazy approach, I would give NO error in such a case ... just limit
to 255 bytes ... this is somehow the responsibility of the people who specify
the formats, and if they want to test this, they can write some more code...
likewise, if numbers overflow, it is best to just use the maximum possible
number; e.g. for bits it is clear that only nil, false or zero will create
zero, and any other value will create 1.)
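For reference, the string.pack in the released Lua 5.3 and 5.4 raises an
error in both cases rather than truncating or saturating, which can be seen
with pcall. The sketch below also shows that s1 is simply a one-byte length
followed by the bytes of the string, and how the lenient behaviour described
above could be layered on top in user code. The two wrapper functions are
hypothetical and assume integer inputs:

```lua
-- s1 is a one-byte length followed by the string bytes.
print(string.pack("s1", "hello") == string.pack("B", 5) .. "hello")

-- The existing implementation reports an error in both overflow cases.
local ok1, err1 = pcall(string.pack, "s1", string.rep("x", 256))
local ok2, err2 = pcall(string.pack, "i1", 200)
print(ok1, err1)
print(ok2, err2)

-- Hypothetical lenient variant: silently keep only the first 255 bytes.
local function pack_s1_truncating(s)
  return string.pack("s1", s:sub(1, 255))
end

-- Hypothetical lenient variant: clamp to 0..255, mapping nil/false to 0
-- and true to 1 (assumes v is otherwise an integer).
local function pack_u8_saturating(v)
  if not v then v = 0 elseif v == true then v = 1 end
  return string.pack("B", math.max(0, math.min(255, v)))
end

print(#pack_s1_truncating(string.rep("x", 300)))
print(pack_u8_saturating(1000):byte(1))
```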
[SIDE REMARK5: I am quite sure that some people would also like to use this
functionality for streams, without the need to invoke the
garbage collector to create an intermediate string object. For such cases it
would be very nice if your C code allowed passing
a "byte-by-byte stream function" as first parameter to this pack /
unpack function. So the C prototype of your base function for packing should
e.g. be something like this:
void stringpack( void* target, BOOL boTargetIsFunc, ...) (in C of course
this stupid va-stuff instead of ...)
so target can then be a lua_String object, but also a function which accepts
one byte and which is then invoked for every output byte of stringpack.
And the lua-lib function str_pack is then defined like
str_pack( lua_String, ...) (using va notation again...)
Then, if some C programmer wants, he can easily apply this to other
streams as well, e.g. he can then easily define can.pack, or tcip.pack, or
knx.pack for CAN / TCIP / KNX interfaces... (You could also consider passing
a stream as parameter to such a function, as is often allowed in C, but I
think this would NOT be good in the case of Lua; it would be too flexible
and could easily collide in some uncontrolled manner, e.g. in the case of
coroutines, if the Lua user is allowed to do such things himself.)]
... sorry, this was a bit much ...
(but please do not come with the argument that these things bloat the C
code for string.pack / string.unpack very much - this I do not believe
... these are just some minor additions to the C code, which make these
functions MUCH more flexible in use)
--
Sent from: http://lua.2524044.n2.nabble.com/Lua-l-f2524044.html