On Wed, Jul 11, 2018, 4:57 PM Gregg Reynolds <dev@mobileink.com> wrote:


On Wed, Jul 11, 2018, 4:53 PM Coda Highland <chighland@gmail.com> wrote:
On Wed, Jul 11, 2018 at 3:58 PM, Gregg Reynolds <dev@mobileink.com> wrote:
>
>
> On Wed, Jul 11, 2018, 1:43 AM Dirk Laurie <dirk.laurie@gmail.com> wrote:
> ...
>>
>> From the point of view of the utf8 library, UTF-8 is a reversible way
>> of mapping a certain subset of strings (which I here call "codons",
>> borrowing a term from DNA theory) onto a certain subset of 32-bit
>> integers.
>
>
> Not even wrong. https://en.m.wikipedia.org/wiki/Not_even_wrong. Utf8 has
> nothing to do with "a certain subset of 32 bit integers".
>
> If you're talking about utf8, but you're not talking about Unicode, then
> what are you talking about? I'm not against it, I just don't see what you're
> after.

UTF-8 = Unicode Transformation Format, 8 bit. The transformation
methodology is independent of the character set it represents. The
CANONICAL APPLICATION of this transformation format is to represent
Unicode characters, but it can be considered to be a variable-length
integer representation scheme. There's nothing wrong with discussing
the manipulation of data encoded in this format without having to drag
in the concept of a character set.

Agreed. Then again, the only reason we have it is because we needed to deal with lots o' chars.

Anyway, my point was that there's nothing special about 32 bits in UTF-8.
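
To make the "variable-length integer representation" reading concrete, here is a rough sketch in Python. It assumes the original 1-to-6-byte layout from RFC 2279, which tops out at 31 bits of payload, and it deliberately does no Unicode validation. The names encode_varint and decode_varint are made up for this illustration; they are not part of Lua's utf8 library or of any other API.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer below 2**31 using the UTF-8 bit layout.

    Hypothetical helper: follows the original 1-to-6-byte scheme and
    applies no Unicode checks (surrogates, 0x10FFFF limit, and so on).
    """
    if n < 0 or n >= 1 << 31:
        raise ValueError("value out of range for the 6-byte scheme")
    if n < 0x80:
        return bytes([n])  # single byte: 0xxxxxxx
    # Payload capacity for 2..6 bytes is 11, 16, 21, 26, 31 bits.
    length = 2
    while n >= 1 << (5 * length + 1):
        length += 1
    out = bytearray(length)
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (n & 0x3F)  # continuation byte: 10xxxxxx
        n >>= 6
    # Lead byte: `length` one-bits, a zero bit, then the remaining payload.
    out[0] = ((0xFF << (8 - length)) & 0xFF) | n
    return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode exactly one sequence produced by encode_varint.

    Overlong forms are not rejected; this is an integer codec only.
    """
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:
        if len(data) != 1:
            raise ValueError("malformed sequence")
        return lead
    # Count the leading one-bits of the lead byte to get the length.
    length = 0
    while lead & (0x80 >> length):
        length += 1
    if length < 2 or length > 6 or len(data) != length:
        raise ValueError("malformed sequence")
    n = lead & (0x7F >> length)
    for b in data[1:]:
        if b & 0xC0 != 0x80:
            raise ValueError("expected a continuation byte")
        n = (n << 6) | (b & 0x3F)
    return n
```

With this sketch, encode_varint(0x20AC) gives the same three bytes as the UTF-8 encoding of U+20AC, and encode_varint(0x7FFFFFFF) comes out as six bytes. The upper limit falls out of the byte layout (31 payload bits here), not out of any 32-bit integer type.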

P.S. To the OP's point: it has nothing to do with strings.

/s/ Adam