On Thu, Oct 23, 2014 at 04:06:50PM -0500, Andrew Starks wrote:
> On Thu, Oct 23, 2014 at 3:35 PM, Thiago L. <fakedme@gmail.com> wrote:
> 
> >
> > On 23/10/14 05:58 PM, Sean Conner wrote:
> >
> >> It was thus said that the Great Roberto Ierusalimschy once stated:
> >>
> >>> [...]
> >>>> lstrlib.c, line 1142.  Change:
> >>>>     buff[islittle ? i : size - 1 - i] = (n & MC);
> >>>> To:
> >>>>     buff[islittle ? i : size - 1 - i] = (char)(n & MC);
> >>>> Explanation:  Prevents compiler warning about possible loss of data.
> >>>>
> >>> This compiler seems quite dumb :-) How can (n & 0xFF) loose data??
> >>>
> >>    Being charitable [1], *technically* you are potentially losing
> >> information---values 128 to 255 may become -128 to -1, if chars are signed
> >> [2].
> >>
> >>    -spc (Or it could be an utterly stupid compiler)
> >>
> >> [1]     Like Microsoft needs any charity
> >>
> >> [2]     C standard leaves the signness [3] of a bare 'char' declaration up
> >>         to the implementation---it can be either signed or unsigned.
> >>
> >> [3]     Is that even a word?
> >>
> >>  I think signedness is a word... (don't ask me tho, idk)
> >
> Please forgive a naive for asking:
> 
> Technically speaking, if you change one lvalue's type to another one that
> isn't exactly equivalent, aren't you sort of "loosing data", or is there
> another, better term for that?

If the new type can represent the value, you've lost nothing. If the new
type cannot represent the value, then it depends on whether the new type is
signed. For unsigned types the C specification requires modulo behavior: the
value wraps modulo one more than the type's maximum. For signed types it
leaves it up to the implementation to decide.
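
To make the two cases concrete, here is a minimal sketch in C. The function
names (to_uchar, expected_wrap, fits_in_schar) are made up for illustration;
only the conversion rules themselves come from 6.3.1.3.

```c
#include <limits.h>

/* Converting to an unsigned type: the result is always the value reduced
   modulo UCHAR_MAX + 1 (6.3.1.3p2), whatever the hardware does. */
unsigned char to_uchar(unsigned long v)
{
    return (unsigned char)v;
}

/* The same reduction written out by hand, for comparison. */
unsigned long expected_wrap(unsigned long v)
{
    return v % ((unsigned long)UCHAR_MAX + 1UL);
}

/* Converting to a signed type is only portable when the value fits;
   otherwise the result is implementation-defined (6.3.1.3p3). */
int fits_in_schar(long v)
{
    return v >= SCHAR_MIN && v <= SCHAR_MAX;
}
```

With 8-bit chars, to_uchar(300) yields 44 and to_uchar(256) yields 0, while
fits_in_schar(129) reports that 129 has no portable signed char result.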

That means that for unsigned arithmetic on machines which use ones'
complement or signed-magnitude representation, the behavior of unsigned
types must be synthesized by the compiler. For signed types compilers
usually define the behavior to be whatever the hardware actually does, but
of course they're free to synthesize whatever behavior they want.

> That is, 2^7 +1 is -128 for a "signed char" (or whatever) or it's 129 for
> an unsigned char, even though the rvalue's bits are the same...?

I think you may have meant -127. But the bits are not necessarily the same.
That's only the case in two's complement representation. You have to be
careful about differentiating value and representation.[1]
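
A small sketch of that distinction, assuming 8-bit chars. The two function
names are hypothetical. One converts the value, the other copies the stored
byte untouched.

```c
#include <string.h>

/* Value conversion: defined by 6.3.1.3p2 in terms of values only, so with
   8-bit chars -127 becomes 129 on every implementation. */
unsigned char value_as_uchar(signed char c)
{
    return (unsigned char)c;
}

/* Representation: copy the stored byte without converting it. The result
   depends on how the implementation encodes negative numbers. */
unsigned char bits_as_uchar(signed char c)
{
    unsigned char raw;
    memcpy(&raw, &c, sizeof raw);
    return raw;
}
```

For -127, bits_as_uchar gives 0x81 under two's complement, 0x80 under ones'
complement and 0xFF under signed-magnitude, while value_as_uchar gives 129 in
all three cases.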

There's really no substitute for reading and comprehending what the standard
says about this stuff. Download

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

which is the last draft before publication of the C11 standard[2]. It's
nearly word-for-word equivalent to the official version, which is costly.

Try reading sections 6.2.5 and 6.2.6. Then 6.3.1.3 and 6.3.1.8.

It's important to first understand that operands of type char and short are
first promoted to int (the integer promotions of 6.3.1.1p2, which 6.3.1.8
builds on). Which means that ((char)1<<7)+1 actually always results in the
_value_ 129, because it becomes ((int)(char)1<<7)+1. If you subsequently
demote the type by converting it to the narrower signed char type, which
cannot represent the value 129 when char is 8 bits, then what happens is up
to the implementation (per 6.3.1.3p3): either the result is
implementation-defined or an implementation-defined signal is raised.
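
Here is a rough sketch of the promotion and of one way to stay clear of the
implementation-defined narrowing. All three function names are invented for
this example, and narrow_to_schar is just one possible defensive pattern,
not something the standard prescribes.

```c
#include <limits.h>

/* Both operands of << undergo the integer promotions, so the shift and the
   addition happen in int and the value is 129 on every implementation. */
int promoted_sum(void)
{
    char one = 1;
    return (one << 7) + 1;
}

/* The promoted expression has type int, not char. */
int result_is_int_sized(void)
{
    char one = 1;
    return sizeof(one << 7) == sizeof(int);
}

/* Narrow to signed char only when the value is representable, so the
   implementation-defined case of 6.3.1.3p3 is never reached.
   Returns 1 and stores the result on success, 0 otherwise. */
int narrow_to_schar(int v, signed char *out)
{
    if (v < SCHAR_MIN || v > SCHAR_MAX)
        return 0;
    *out = (signed char)v;
    return 1;
}
```

With 8-bit chars, narrow_to_schar(promoted_sum(), &sc) returns 0 and leaves
sc alone, so the caller decides what an out-of-range value should mean.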

[1] And representation does _not_ mean the same thing as what the bare-metal
hardware is doing: C compiled to JavaScript can use two's complement
representation even though the native arithmetic type in JavaScript is a
signed-magnitude floating-point type, which may or may not use a
signed-magnitude bit representation on the bare metal.

[2] The wording for the sections I mentioned has not changed much since C89.
But here are links to earlier versions.
C89: http://flash-gordon.me.uk/ansi.c.txt
C99 TC3: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf