- Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
- From: Hugo Musso Gualandi <hgualandi@...>
- Date: Sat, 07 Jul 2018 13:41:12 -0300
> By miracle, if you do not use the "wrong" unicode characters, LUA
> accept it, because UNICODE was made to be backward compatible with
> ASCII till some point
To be pedantic, the backward compatibility comes from the UTF-8
encoding, not from Unicode itself. And that was on purpose, not by
miracle.
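As a rough illustration of that point (a sketch in Python rather than Lua,
with variable names made up for the example): UTF-8 encodes every ASCII
character as the same single byte, and encodes everything else using only
bytes of 0x80 and above, so a tool that understands only ASCII never
mistakes part of a non-ASCII character for an ASCII one.

```python
# ASCII-only source text: UTF-8 produces exactly the same bytes as ASCII.
ascii_source = "local x = 1"
utf8_bytes = ascii_source.encode("utf-8")
ascii_bytes = ascii_source.encode("ascii")
same_bytes = utf8_bytes == ascii_bytes

# U+00A0 (character 160, the non-breaking space from the subject line)
# becomes a two-byte sequence in UTF-8.
nbsp_bytes = "\u00a0".encode("utf-8")

# Every byte of a non-ASCII character is >= 0x80, outside the ASCII range.
high_only = all(b >= 0x80 for b in nbsp_bytes)

print(same_bytes, nbsp_bytes, high_only)
```

Here the non-breaking space comes out as the bytes 0xC2 0xA0, neither of
which is a valid ASCII character, which is why an ASCII-only lexer rejects
it instead of treating it as whitespace.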
> Note: Using the public unicode character database it's easy to handle
> all white space characters of unicode.
A full Unicode character database takes multiple megabytes. That is
dozens of times larger than the whole Lua interpreter is right now.
You would need to trim down the database, which would mean either a
restrictive "whitelist" of allowed characters (for example, different
whitespace is allowed but not Chinese characters) or an overly
permissive system (for example, all characters are allowed in
identifiers, including non-alphabetical ones). I'm not sure either of
these is better than the ASCII status quo.
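
To make the two options concrete, here is a minimal sketch in Python (not
how Lua's lexer works; both function names are hypothetical). The whitelist
hard-codes the code points that carry the Unicode White_Space property; the
permissive variant accepts any non-ASCII character in an identifier.

```python
# Option 1: restrictive whitelist. A hard-coded copy of the code points
# with the Unicode White_Space property; everything else non-ASCII is
# still rejected by the caller.
WHITESPACE_WHITELIST = frozenset(
    list(range(0x0009, 0x000E))      # tab, LF, VT, FF, CR
    + [0x0020, 0x0085, 0x00A0, 0x1680]
    + list(range(0x2000, 0x200B))    # en quad .. hair space
    + [0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)

def is_whitespace_whitelist(ch):
    """Hypothetical helper: True only for whitelisted whitespace."""
    return ord(ch) in WHITESPACE_WHITELIST

# Option 2: overly permissive. Any character outside ASCII is accepted
# as part of an identifier, with no database lookup at all.
def is_ident_char_permissive(ch):
    """Hypothetical helper: ASCII letters, digits, '_' or any non-ASCII."""
    cp = ord(ch)
    if cp >= 0x80:
        return True
    return ch.isalnum() or ch == "_"

print(is_whitespace_whitelist("\u00a0"))   # NBSP is on the whitelist
print(is_whitespace_whitelist("\u4e2d"))   # a Chinese character is not
print(is_ident_char_permissive("\u00a0"))  # but here NBSP is an identifier char
```

The whitelist needs only a couple dozen entries but leaves out everything
it does not list, while the permissive rule needs no table at all but lets
a non-breaking space hide inside an identifier.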