On Wed, Jul 11, 2018, 4:35 PM Jay Carlson <nop@nop.com> wrote:
On 2018-07-11, at 4:58 PM, Gregg Reynolds <dev@mobileink.com> wrote:
> On Wed, Jul 11, 2018, 1:43 AM Dirk Laurie <dirk.laurie@gmail.com> wrote:
> ...
> > From the point of view of the utf8 library, UTF-8 is a reversible way
> > of mapping a certain subset of strings (which I here call "codons",
> > borrowing a term from DNA theory) onto a certain subset of 32-bit
> > integers.
>
> Not even wrong (https://en.m.wikipedia.org/wiki/Not_even_wrong). UTF-8 has nothing to do with "a certain subset of 32-bit integers".
What's wrong with the claim?
Depends on what you mean by "the claim". In any case, UTF-8 is a well-defined variable-width mapping. Some code points end up as 32 bits (four bytes) once encoded, some don't. There is nothing magical about 32. Also, it has nothing to do with strings.
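
To make the variable-width point concrete, here is a minimal sketch. It is written in Python rather than Lua purely for illustration, and the names utf8_width_bits and samples are hypothetical, not anything from the utf8 library under discussion. It only shows how many bits the UTF-8 encoding of a few sample code points occupies, and that the mapping round-trips.

```python
def utf8_width_bits(code_point: int) -> int:
    """Return the number of bits used by the UTF-8 encoding of one code point."""
    return 8 * len(chr(code_point).encode("utf-8"))


# Hypothetical sample code points, one for each encoded length.
samples = [0x41, 0xE9, 0x20AC, 0x1F600]

# Map each code point to the width of its encoded form, in bits.
widths = {cp: utf8_width_bits(cp) for cp in samples}

# The mapping is reversible: decoding the bytes gives back the code point.
round_trips = all(
    ord(chr(cp).encode("utf-8").decode("utf-8")) == cp for cp in samples
)

for cp, bits in widths.items():
    print(f"U+{cp:04X} -> {bits} bits")
```

Under these assumptions, only the last sample needs the full 32 bits; the others take 8, 16, and 24.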