lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]

Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
From: Gregg Reynolds <dev@...>
Date: Tue, 10 Jul 2018 18:53:20 -0500

On Tue, Jul 10, 2018, 5:20 PM Alysson Cunha <alyssonrpg@gmail.com> wrote:

there are 3 entities with unicode strings::

1 - The bytes according to the encoding used (UTF-8, UTF-16 Big Endian, UTF-16 Little endian, UTF-32)
2 - The unicode code points - The union of one or more bytes compose the code points

Union? I don't think so.

3 - And the trickest of they, the glyphs. One or more unicode code points compose a single glyph.

Unicode does not traffic in glyphs. (Except when it must for backwards compatibility.)

Example: This flag "🏴󠁧󠁢󠁥󠁮󠁧󠁿" is composed of 7 unicode code points, these code-points encoded as UTF-8 occupies 14 bytes.
A single glyph (the flag) is composed by 7 unicode code points, or 14 UTF-8 bytes..

Please try to be precise. If you mean

Character 'BLACK FLAG' (U+2691)

Then you have a problem. That is exactly one code point and one char. If you mean some other Unicode char, then tell us what it is, in hex.

Follow-Ups:
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Sean Conner

References:
- Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Hugo Musso Gualandi
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Axel Kittenberger
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Lorenzo Donati
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Albert Chan
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Sean Conner
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Lorenzo Donati
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Dirk Laurie
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Soni "They/Them" L.
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Soni "They/Them" L.
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha

Prev by Date: Re: [ANN] LuaRocks 3.0.0beta1 for Unix
Next by Date: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
Previous by thread: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
Next by thread: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
Index(es):
- Date
- Thread