On Mon, Dec 17, 2018 at 10:40 PM Sean Conner <sean@conman.org> wrote:

>  Also, can someone describe the symbol I just copied?

If you look at the raw message payload, you will see that the character was encoded as =E2=80=89. In quoted-printable, that means three bytes with the given hex values (E2 80 89). The message headers indicate that the encoding was UTF-8, so running those three bytes through any UTF-8 decoder gives the Unicode code point U+2009 THIN SPACE [1]. The standard [2] describes it, along with the other space characters, as follows:

The main difference among other space characters is their width. U+2000..U+2006 are
standard quad widths used in typography. U+2007 figure space has a fixed width, known
as tabular width, which is the same width as digits used in tables. U+2008 punctuation
space is a space defined to be the same width as a period. U+2009 thin space and U+200A
hair space are successively smaller-width spaces used for narrow word gaps and for justification
of type. The fixed-width space characters (U+2000..U+200A) are derived from
conventional (hot lead) typography. Algorithmic kerning and justification in computerized
typography do not use these characters. However, where they are used (for example, in
typesetting mathematical formulae), their width is generally font-specified, and they typically
do not expand during justification. The exception is U+2009 thin space, which
sometimes gets adjusted.

(end)
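
The decoding steps described above (quoted-printable to raw bytes, raw bytes to a code point, code point to a name) can be reproduced with a few lines of Python. This is only a sketch using the standard `quopri` and `unicodedata` modules; the variable names are mine, and the input is just the three-byte sequence quoted above rather than the full message.

```python
import quopri
import unicodedata

# The quoted-printable sequence as it appears in the raw message payload.
raw = b"=E2=80=89"

# Step 1: undo the quoted-printable transfer encoding to get the raw bytes.
payload = quopri.decodestring(raw)

# Step 2: interpret the bytes as UTF-8, as declared in the message headers.
char = payload.decode("utf-8")

# Step 3: look up the code point and its Unicode character name.
codepoint = f"U+{ord(char):04X}"
name = unicodedata.name(char)

print(payload.hex(), codepoint, name)  # e28089 U+2009 THIN SPACE
```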

It is mentioned again in [3]:

Some or all of the following characters may be tailored to be in MidNum, depending on the environment, to allow for languages that use spaces as thousands separators, such as €1 234,56.
U+0020 SPACE
U+00A0 NO-BREAK SPACE 
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+202F NARROW NO-BREAK SPACE

(end)
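
To illustrate what treating these characters as thousands separators could look like in practice, here is a rough Python sketch. The helper `parse_grouped` is hypothetical and not taken from any library or from UAX #29; it assumes a comma as the decimal mark, as in the €1 234,56 example, and simply drops any of the six listed spaces before converting.

```python
import unicodedata

# The six space characters listed in the UAX #29 excerpt above.
SEPARATOR_SPACES = "\u0020\u00A0\u2007\u2008\u2009\u202F"

def parse_grouped(text):
    """Hypothetical helper: parse a number such as '1 234,56'.

    Assumes the listed spaces are digit-group separators and that
    a comma is the decimal mark.
    """
    # Map each separator code point to None so translate() deletes it.
    table = {ord(ch): None for ch in SEPARATOR_SPACES}
    cleaned = text.translate(table).replace(",", ".")
    return float(cleaned)

# Print each candidate separator with its Unicode name.
for ch in SEPARATOR_SPACES:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# A number grouped with U+2009 THIN SPACE.
print(parse_grouped("1\u2009234,56"))
```

A real parser would also need to decide which separators are acceptable in which positions; this one accepts any of them anywhere.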

That makes it relevant to the discussion.

Cheers,
V.


[2] The Unicode® Standard Version 11.0 – Core Specification, page 264. https://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf

[3] Unicode® Standard Annex #29, Unicode Text Segmentation. https://www.unicode.org/reports/tr29/tr29-33.html#Word_Boundary_Rules