There are 3 entities to keep apart when working with Unicode strings:
1 - The bytes, according to the encoding used (UTF-8, UTF-16 big-endian, UTF-16 little-endian, UTF-32)
2 - The Unicode code points. One or more bytes encode each code point.
3 - And the trickiest of them, the glyphs (what Unicode calls grapheme clusters). One or more Unicode code points compose a single glyph.
Example: this flag "🏴" is composed of 7 Unicode code points. Each of them takes 4 bytes in UTF-8, so the flag occupies 28 bytes in UTF-8 (or 14 code units in UTF-16).
So a single glyph (the flag) is 7 Unicode code points, or 28 UTF-8 bytes.
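
To check these numbers yourself, here is a small Python sketch. I am assuming the flag is a subdivision flag like the flag of England (black flag U+1F3F4, five tag letters, then the cancel tag U+E007F); the variable names are just for illustration.

```python
# Flag of England, written with escapes so the invisible tag characters
# survive copy/paste: U+1F3F4 + tag letters g, b, e, n, g + U+E007F.
flag = "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F"

code_points = len(flag)                            # Python's len() counts code points
utf8_bytes = len(flag.encode("utf-8"))             # bytes in UTF-8
utf16_units = len(flag.encode("utf-16-be")) // 2   # 2 bytes per UTF-16 code unit
utf32_bytes = len(flag.encode("utf-32-be"))        # 4 bytes per code point

print(code_points, utf8_bytes, utf16_units, utf32_bytes)  # 7 28 14 28
```

The "-be" codec names are used so that no byte order mark is added to the counts.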
Many emojis are sequences of more than one code point. And there are the combining code points: "A" followed by U+0301 (combining acute accent) is 2 Unicode code points, but it may be presented as a single "Á" by text editors and renderers.
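
A quick Python sketch of the combining case, using only the standard `unicodedata` module. NFC normalization happens to merge this particular pair into one precomposed code point, but that does not work for every combining sequence, so take it as an illustration only.

```python
import unicodedata

decomposed = "A\u0301"  # "A" + U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # precomposed U+00C1

print(len(decomposed), len(composed))  # 2 1
print(composed == "\u00C1")            # True
# Same glyph on screen, different byte counts in UTF-8:
print(len(decomposed.encode("utf-8")), len(composed.encode("utf-8")))  # 3 2
```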
I think utf8.len() returns the number of Unicode code points, not glyphs.
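
To show what "counting code points" means at the byte level, here is a rough Python sketch. It is not how utf8.len() is actually implemented (I have not checked), and `count_code_points` is a made-up helper that assumes its input is already valid UTF-8.

```python
def count_code_points(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)

accented = "A\u0301".encode("utf-8")  # one glyph on screen, two code points
print(count_code_points(accented))    # 2
print(count_code_points("abc".encode("utf-8")))  # 3
```

A count like this sees 2 for the accented "A" even though the reader sees one glyph, which is exactly the gap between code points and glyphs.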
PS: In Delphi, I wrote my own library to handle glyphs, code points and bytes.