lua-l archive

On Sat, Jul 7, 2018 at 5:11 PM, Alysson Cunha <alyssonrpg@gmail.com> wrote:
>>>> However the correct space character is 0x20 (32).

> This is what I am telling.. What? Who said that 0x20 is the correct space
> character? Answer: ASCII

The Lua designers decide which characters count as whitespace. They
(correctly, IMO) decided that the non-breaking space is not one of them.

> But in Unicode, we have more than 1 "correct space character", because it is
> Unicode, not ASCII... So, current LUA version does not support unicode
> characters.

If your definition of "correct space char" is as narrow as "being
usable as a space separator in Lua", you still have more than one. I
think at least tabs work too.
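
A quick way to check which bytes the lexer accepts as a separator is to
compile a tiny chunk with each candidate between `return` and `1`. This
is only a sketch: the helper name `works_as_separator` is made up for
the illustration, and it assumes a stock PUC-Lua interpreter (5.1 or
later) running in the default "C" locale. There I would expect the
space, tab and newline to be accepted and both spellings of the
non-breaking space to be rejected.

```lua
-- loadstring exists in Lua 5.1; from 5.2 on, load accepts strings.
local compile = loadstring or load

-- Hypothetical helper: true if `sep` is accepted by the lexer as a
-- separator between the keyword `return` and the number `1`.
local function works_as_separator(sep)
  local chunk = compile("return" .. sep .. "1")
  -- compile returns nil (plus an error message) on a syntax error.
  return chunk ~= nil and chunk() == 1
end

local candidates = {
  { "space 0x20",            " "        },
  { "tab 0x09",              "\t"       },
  { "newline 0x0A",          "\n"       },
  { "NBSP, Latin-1 0xA0",    "\160"     },
  { "NBSP, UTF-8 0xC2 0xA0", "\194\160" },
}

for _, c in ipairs(candidates) do
  print(c[1], works_as_separator(c[2]))
end
```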

Also, 0xA0 is not some new Unicode thing. It is present in Latin-1
and many other ISO 8859 (one byte per char) encodings.

It works in Unicode because the first 0x100 code points are the same
as Latin-1.

> By miracle, if you do not use the "wrong" unicode characters, LUA accept it,
> because UNICODE was made to be backward compatible with ASCII till some
> point

With Latin-1, actually. Also, UTF-8 (which is a byte encoding) encodes
the first 0x80 chars the same as ASCII (but it does not encode the
second half of Latin-1 the same way as the usual one-byte Latin-1
encoding).
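
To make the difference concrete, here is a small sketch that encodes a
code point both ways and prints the resulting bytes. The functions
`latin1_encode`, `utf8_encode` and `hex` are names invented for this
example, and the UTF-8 part deliberately handles only code points below
0x800 (one- and two-byte sequences), which is enough to cover ASCII and
the second half of Latin-1.

```lua
-- Latin-1: every code point below 0x100 is stored as the single byte
-- with the same value.
local function latin1_encode(cp)
  assert(cp >= 0 and cp < 0x100, "not representable in Latin-1")
  return string.char(cp)
end

-- UTF-8, limited in this sketch to code points below 0x800.
local function utf8_encode(cp)
  assert(cp >= 0 and cp < 0x800, "sketch handles only 1- and 2-byte sequences")
  if cp < 0x80 then
    -- Same single byte as ASCII.
    return string.char(cp)
  end
  -- Two bytes: 110xxxxx 10xxxxxx (top 5 bits, then low 6 bits).
  return string.char(0xC0 + math.floor(cp / 64), 0x80 + cp % 64)
end

-- Render a byte string as space-separated hex for printing.
local function hex(s)
  return (s:gsub(".", function(c)
    return string.format("%02X ", c:byte())
  end))
end

-- 0x41 is 'A' (ASCII range); 0xA0 is the non-breaking space.
for _, cp in ipairs({ 0x41, 0xA0 }) do
  print(string.format("U+%04X", cp),
        "latin-1: " .. hex(latin1_encode(cp)),
        "utf-8: " .. hex(utf8_encode(cp)))
end
```

For 0x41 both encodings give the same single byte, while for 0xA0
Latin-1 gives one byte and UTF-8 gives two.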

> Note: Using the public unicode character database it's easy to handle all
> white space characters of unicode.

Yeah, and using said database is quite difficult. Have you ever tried
to implement something with it (not using a lib which does it, but
implementing the lib yourself)?

And, anyway, it seems nobody has problems with 0xA0 not being defined
as a space in Lua (whether using UTF-8, Latin-*, Windows-1252 or
others).

Francisco Olarte.