On Tue, Jul 10, 2018, 5:20 PM Alysson Cunha <email@example.com> wrote:there are 3 entities with unicode strings::1 - The bytes according to the encoding used (UTF-8, UTF-16 Big Endian, UTF-16 Little endian, UTF-32)2 - The unicode code points - The union of one or more bytes compose the code pointsUnion? I don't think so.3 - And the trickest of they, the glyphs. One or more unicode code points compose a single glyph.Unicode does not traffic in glyphs. (Except when it must for backwards compatibility.)
Example: This flag "🏴" is composed of 7 unicode code points, these code-points encoded as UTF-8 occupies 14 bytes.A single glyph (the flag) is composed by 7 unicode code points, or 14 UTF-8 bytes..Please try to be precise. If you mean
Character 'BLACK FLAG' (U+2691)Then you have a problem. That is exactly one code point and one char. If you mean some other Unicode char, then tell us what it is, in hex.