On Wed, Jul 11, 2018 at 3:58 PM, Gregg Reynolds <dev@mobileink.com> wrote:
>
>
> On Wed, Jul 11, 2018, 1:43 AM Dirk Laurie <dirk.laurie@gmail.com> wrote:
> ...
>>
>> From the point of view of the utf8 library, UTF-8 is a reversible way
>> of mapping a certain subset of strings (which I here call "codons",
>> borrowing a term from DNA theory) onto a certain subset of 32-bit
>> integers.
>
>
> Not even wrong. https://en.m.wikipedia.org/wiki/Not_even_wrong. Utf8 has
> nothing to do with "a certain subset of 32 bit integers".
>
> If you're talking about utf8, but you're not talking about Unicode, then
> what are you talking about? I'm not against it, I just don't see what you're
> after.

UTF-8 = Unicode Transformation Format, 8-bit. The transformation
methodology is independent of the character set it represents. The
CANONICAL APPLICATION of this transformation format is to represent
Unicode characters, but the format itself can also be treated as a
variable-length integer representation scheme. There's nothing wrong
with discussing the manipulation of data encoded in this format without
having to drag in the concept of a character set.
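
To make the "variable-length integer" reading concrete, here is a
minimal sketch in Python that applies the UTF-8 bit layout to plain
integers and never mentions characters. The names encode_varint and
decode_varint are made up for this illustration, and the range of 0 to
2^31 - 1 assumes the original six-byte form of the scheme. It does not
reject overlong encodings, so it is not a validating decoder.

```python
def encode_varint(n: int) -> bytes:
    """Encode an integer in [0, 2**31) using the UTF-8 bit layout.

    Hypothetical helper: no character-set rules (surrogates, the
    0x10FFFF ceiling) are applied, only the byte layout.
    """
    if n < 0 or n > 0x7FFFFFFF:
        raise ValueError("value out of range")
    if n < 0x80:
        return bytes([n])  # single byte: 0xxxxxxx
    # Smallest sequence length whose payload bits can hold n.
    for length, limit in ((2, 0x800), (3, 0x10000), (4, 0x200000),
                          (5, 0x4000000), (6, 0x80000000)):
        if n < limit:
            break
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (n & 0x3F))  # continuation byte: 10xxxxxx
        n >>= 6
    # Lead byte: `length` one bits, a zero bit, then the remaining payload.
    lead = ((0xFF << (8 - length)) & 0xFF) | n
    return bytes([lead] + tail[::-1])


def decode_varint(data: bytes) -> int:
    """Decode exactly one sequence produced by encode_varint."""
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:
        length, value = 1, lead
    else:
        # Number of leading one bits in the lead byte gives the length.
        length = 8 - (lead ^ 0xFF).bit_length()
        if length < 2 or length > 6:
            raise ValueError("invalid lead byte")
        value = lead & (0x7F >> length)
    if len(data) != length:
        raise ValueError("wrong sequence length")
    for b in data[1:]:
        if (b & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (b & 0x3F)
    return value
```

For values that happen to be Unicode scalar values, the bytes agree
with an ordinary UTF-8 encoder. The same functions also round-trip
integers such as 0xD800 or 0x7FFFFFFF that a strict Unicode codec would
refuse, which is the sense in which the format can be discussed apart
from any character set.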

/s/ Adam