> By miracle, if you do not use the "wrong" unicode characters, LUA
> accept it, because UNICODE was made to be backward compatible with
> ASCII till some point

To be pedantic, the backward compatibility comes from the UTF-8
encoding, not from Unicode itself. And that was on purpose, not by
miracle :)
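
A quick way to see this for yourself (a small Python sketch, nothing to do with Lua's own source; the helper names are made up for illustration): ASCII-only text encodes to exactly the same bytes in UTF-8, and every byte of a multi-byte UTF-8 sequence has the high bit set, so it can never be confused with an ASCII character.

```python
def same_bytes_in_ascii_and_utf8(text: str) -> bool:
    """True if ASCII-only text encodes to identical bytes in both encodings."""
    return text.encode("ascii") == text.encode("utf-8")


def all_bytes_high(ch: str) -> bool:
    """True if every UTF-8 byte of ch has the high bit set (>= 0x80)."""
    return all(b >= 0x80 for b in ch.encode("utf-8"))


# A plain ASCII snippet is unchanged by UTF-8 encoding.
print(same_bytes_in_ascii_and_utf8("local x = 1"))

# e-acute, a CJK character, and a no-break space: show their UTF-8 bytes
# and whether all of them fall outside the ASCII range.
for ch in ("\u00e9", "\u4e2d", "\u00a0"):
    print(repr(ch), ch.encode("utf-8").hex(), all_bytes_high(ch))
```

That is why a byte-oriented lexer that only knows ASCII keeps working on UTF-8 input as long as the non-ASCII bytes only show up where arbitrary bytes are tolerated, such as strings and comments.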

> Note: Using the public unicode character database it's easy to handle
> all white space characters of unicode.

A full Unicode character database takes multiple megabytes[1]. That is
dozens of times larger than the whole Lua interpreter is right now.

You would need to trim down the database, which would mean either a
restrictive "whitelist" of allowed characters (for example, different
whitespace is allowed but not Chinese characters) or an overly
permissive system (for example, all characters are allowed in
identifiers, including non-alphabetical ones). I'm not sure either of
these is better than the ASCII status quo.
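
To make the two options concrete, here is a rough Python sketch. It is only an illustration of the trade-off, not a proposal for Lua's lexer: the function names and the table are hypothetical, and the whitespace table is a hand-picked list of non-ASCII code points that I believe carry the Unicode White_Space property.

```python
# Option 1: restrictive whitelist. A small hard-coded table of extra
# whitespace code points; identifiers would stay ASCII-only.
EXTRA_WHITESPACE = frozenset(
    {0x00A0, 0x1680, 0x2028, 0x2029, 0x202F, 0x205F, 0x3000}
    | set(range(0x2000, 0x200B))  # U+2000 through U+200A
)


def is_space_whitelist(ch: str) -> bool:
    """ASCII whitespace plus the hand-picked table above."""
    return ch in " \t\n\r\f\v" or ord(ch) in EXTRA_WHITESPACE


# Option 2: overly permissive. No table at all: anything outside ASCII
# counts as an identifier character.
def is_ident_char_permissive(ch: str) -> bool:
    """Underscore, ASCII letters and digits, or any non-ASCII character."""
    return ch == "_" or (ord(ch) < 0x80 and ch.isalnum()) or ord(ch) >= 0x80


# The whitelist accepts a no-break space as whitespace, but offers
# nothing for a CJK character.
print(is_space_whitelist("\u00a0"), is_space_whitelist("\u4e2d"))

# The permissive rule accepts the CJK character in an identifier, but it
# also accepts the no-break space and an arrow symbol.
print([is_ident_char_permissive(c) for c in ("\u4e2d", "\u00a0", "\u2192")])
```

The first option needs a table that someone has to choose and maintain; the second needs no table but lets invisible or symbolic characters into names.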

[1] http://apps.icu-project.org/datacustom/