lua-users home
lua-l archive

Your point being?

On Tue, Jul 10, 2018, 4:15 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:


On 2018-07-10 05:31 PM, Gregg Reynolds wrote:
>
>
> On Tue, Jul 10, 2018, 9:00 AM Dirk Laurie <dirk.laurie@gmail.com> wrote:
>
>     2018-07-10 15:30 GMT+02:00 Lorenzo Donati <lorenzodonatibz@tiscali.it>:
>
>     > Unicode is great for typesetting (I regularly use LaTeX and it's fun
>     > to find almost every symbol you may imagine, even ancient German
>     > runic scripts!), but it sucks (IMHO) for general programming or
>     > computer-related stuff. Too much mental overhead to use correctly
>     > for little gain.
>
>     Yes, yes, but (if you will allow me to return to Lua and UTF-8) there
>     would be more gain for a programmer if we had (if it is not too late
>     already for Lua 5.4) utf8 versions of find, sub, match, gsub, gmatch,
>     reverse. Just those, not asking for upper/lower, operating only on
>     simple codepoints, no combining characters, no need for a C library.
>
>
> UTF-8 != Unicode. It's an encoding; you don't get to pick a subset and
> still claim Unicode support.
>
> "Simple codepoints"? Does Unicode define that? If not, who decides
> what that means? Zero-width space is pretty simple.
>
> No combining chars? Ok, but that would not be Unicode. Practical
> result: massive confusion and complaining. You cannot accept Unicode
> and reject combining chars.
>
>
>
>     utf8.find ("Hélène",'n')  --> 5 5
>     utf8.sub ("Hélène",5)   --> 'ne'
>     utf8.gsub ("Hélène","[éè]","e")  --> 'Helene' 2
>     utf8.reverse ("Hélène")   --> 'enèléH'
>

https://gist.github.com/SoniEx2/ecd119507f160d9c26e3eabd9e012dc0
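
For readers following the thread, here is a minimal sketch of what codepoint-level find, sub and reverse could look like in pure Lua, built on the standard utf8 library from Lua 5.3. It is not the code behind the link above and not a proposed API: the table name u8 and its function signatures are hypothetical, find does plain-text matching only, sub handles only positive indices, and combining characters are treated as separate codepoints (which is exactly the limitation argued about in the quoted messages). The source file is assumed to be saved as UTF-8.

```lua
-- Hypothetical helper table; requires Lua 5.3+ for the standard utf8 library.
local u8 = {}

-- Plain-text find; returns start and end positions counted in codepoints.
function u8.find(s, needle, init)
  local start_byte = utf8.offset(s, init or 1)
  if not start_byte then return nil end
  local b, e = s:find(needle, start_byte, true)  -- true = no pattern matching
  if not b then return nil end
  -- Convert byte positions to codepoint positions.
  return utf8.len(s, 1, b - 1) + 1, utf8.len(s, 1, e)
end

-- Substring by codepoint index (positive indices only in this sketch).
function u8.sub(s, i, j)
  local len = assert(utf8.len(s), "invalid UTF-8")
  i = math.max(i or 1, 1)
  j = math.min(j or len, len)
  if i > j then return "" end
  local first = utf8.offset(s, i)
  local last = utf8.offset(s, j + 1) - 1  -- byte just before the next codepoint
  return s:sub(first, last)
end

-- Reverse by codepoint; combining sequences are NOT kept together.
function u8.reverse(s)
  local chars = {}
  for _, cp in utf8.codes(s) do
    chars[#chars + 1] = utf8.char(cp)
  end
  local n = #chars
  for k = 1, n // 2 do
    chars[k], chars[n - k + 1] = chars[n - k + 1], chars[k]
  end
  return table.concat(chars)
end

print(u8.find("Hélène", "n"))   --> 5       5
print(u8.sub("Hélène", 5))      --> ne
print(u8.reverse("Hélène"))     --> enèléH
```

The gsub example with the class "[éè]" is deliberately left out: Lua patterns are byte-oriented, so a character class containing multi-byte characters needs a real pattern rewrite rather than an index conversion. For simple one-to-one replacements, (s:gsub(utf8.charpattern, {["é"] = "e", ["è"] = "e"})) gives 'Helene', though the count it returns is the number of characters matched, not the number replaced.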