lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Gregg, you're mistaken.

Please, visit: http://unicode.org/Public/emoji/5.0/emoji-sequences.txt

That flag is not the BLACK FLAG... it is the England Flag

image.png

Another good example is the glyph "1️⃣"

It is product of the sequence of unicode code points: 0031 FE0F 20E3 . Note that the first unicode code point in that sequence is the traditional/default '1' character, the same value for the ASCII Encoding... but it is following with 2 more code points, changing the glyph to be rendered.



Em Ter, 10 de jul de 2018 20:53, Gregg Reynolds <dev@mobileink.com> escreveu:


On Tue, Jul 10, 2018, 5:20 PM Alysson Cunha <alyssonrpg@gmail.com> wrote:
there are 3 entities with unicode strings::

1 - The bytes according to the encoding used (UTF-8, UTF-16 Big Endian, UTF-16 Little endian, UTF-32)
2 - The unicode code points - The union of one or more bytes compose the code points

Union? I don't think so.

3 - And the trickest of they, the glyphs. One or more unicode code points compose a single glyph.

Unicode does not traffic in glyphs. (Except when it must for backwards compatibility.)

Example: This flag "🏴󠁧󠁢󠁥󠁮󠁧󠁿" is composed of 7 unicode code points, these code-points encoded as UTF-8 occupies 14 bytes.
A single glyph (the flag) is composed by 7 unicode code points, or 14 UTF-8 bytes..

Please try to be precise. If you mean 

Character 'BLACK FLAG' (U+2691)


Then you have a problem. That is exactly one code point and one char. If you mean some other Unicode char, then tell us what it is, in hex.