Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

lua-l archive


Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
From: "Soni \"They/Them\" L." <fakedme@...>
Date: Tue, 10 Jul 2018 18:28:02 -0300



On 2018-07-10 06:20 PM, Gregg Reynolds wrote:

Your point being?

I mean, it's a joke, really, but if I were to actually redesign unicode, I'd throw away all those annoying character tables and encode them as part of the bits.

It would solve all practical problems with unicode. But we aren't gonna have that, so we should instead stick with no unicode support for the time being. At least until they finally decide that unicode was a huge mistake and restart the whole thing.

On Tue, Jul 10, 2018, 4:15 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:




    On 2018-07-10 05:31 PM, Gregg Reynolds wrote:
    >
    >
    > On Tue, Jul 10, 2018, 9:00 AM Dirk Laurie <dirk.laurie@gmail.com>
    > wrote:
    >
    >     2018-07-10 15:30 GMT+02:00 Lorenzo Donati
    >     <lorenzodonatibz@tiscali.it>:
    >
    >     > Unicode is great for typesetting (I regularly use LaTeX and it's
    >     > fun to find almost every symbol you may imagine, even ancient
    >     > German runic scripts!), but it sucks (IMHO) for general
    >     > programming or computer-related stuff. Too much mental overhead
    >     > to use correctly for little gain.
    >
    >     Yes, yes, but, if you will allow me to return to Lua and UTF-8,
    >     there would be more gain for a programmer if we had (if it is not
    >     too late already for Lua 5.4) utf8 versions of find, sub, match,
    >     gsub, gmatch, reverse. Just those, not asking for upper/lower,
    >     operating only on simple codepoints, no combining characters,
    >     no need for a C library.
    >
    >
    > Utf8 != Unicode. It's an encoding; you don't get to pick a subset and
    > still claim Unicode support.
    >
    > "Simple codepoints"? Does Unicode define that? If not, who decides
    > what that means? Zero-width space is pretty simple.
    >
    > No combining chars? Ok, but that would not be Unicode. Practical
    > result: massive confusion and complaining. You cannot accept Unicode
    > and reject combining chars.
    >
    >
    >
    >     utf8.find ("Hélène",'n')  --> 5 5
    >     utf8.sub ("Hélène",5)   --> 'ne'
    >     utf8.gsub ("Hélène","[éè]","e")  --> 'Helene' 2
    >     utf8.reverse ("Hélène")   --> 'enèléH'
    >

    https://gist.github.com/SoniEx2/ecd119507f160d9c26e3eabd9e012dc0
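
For readers following the quoted proposal, here is a minimal sketch of what codepoint-indexed `find`, `sub` and `reverse` could look like on top of the stock Lua 5.3 `utf8` library. It is not the code from the gist above and not anything Lua ships: the table name `u8` and its functions are hypothetical. It assumes valid UTF-8 input, handles only a plain (non-pattern) needle in `find` with a positive `init`, and treats every codepoint as independent, so combining sequences get split or reordered, which is exactly the objection raised in the thread.

```lua
-- Hypothetical helper table; none of these functions exist in stock Lua.
-- Assumes Lua 5.3+ (for the utf8 library) and valid UTF-8 input.
local u8 = {}

-- Like string.sub, but i and j count codepoints instead of bytes.
function u8.sub(s, i, j)
  local n = utf8.len(s)
  i = i or 1
  j = j or -1
  if i < 0 then i = math.max(n + i + 1, 1) elseif i == 0 then i = 1 end
  if j < 0 then j = n + j + 1 elseif j > n then j = n end
  if i > j then return "" end
  local first = utf8.offset(s, i)         -- byte where codepoint i starts
  local last = utf8.offset(s, j + 1) - 1  -- byte just before codepoint j+1
  return string.sub(s, first, last)
end

-- Plain-text search only; returns codepoint indices, not byte indices.
function u8.find(s, needle, init)
  local start = utf8.offset(s, init or 1)
  if not start then return nil end
  local b, e = string.find(s, needle, start, true)
  if not b then return nil end
  -- Convert byte positions back to codepoint positions.
  return utf8.len(s, 1, b - 1) + 1, utf8.len(s, 1, e)
end

-- Reverses codepoint by codepoint (combining marks end up misplaced).
function u8.reverse(s)
  local chars = {}
  for _, cp in utf8.codes(s) do
    chars[#chars + 1] = utf8.char(cp)
  end
  local out = {}
  for k = #chars, 1, -1 do
    out[#out + 1] = chars[k]
  end
  return table.concat(out)
end

print(u8.find("Hélène", "n"))   --> 5 5
print(u8.sub("Hélène", 5))      --> ne
print(u8.reverse("Hélène"))     --> enèléH
```

A `gsub` with a class such as `[éè]` is left out on purpose: Lua's byte-oriented patterns cannot match multibyte characters inside a class, so that case needs a real pattern engine rather than index translation.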

