- Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
- From: Gregg Reynolds <dev@...>
- Date: Sat, 7 Jul 2018 14:22:46 -0500
> I am raising the question: should the future Lua 5.5 have Unicode support? A lot (a lot, a lot) of encoding issues were solved with Unicode, and we are observing an international trend toward UTF-8.
> In my opinion, Unicode is the future (actually, Unicode has already been the present for years), and ASCII was developed in the 1960s. Today it is an old and very limited character encoding.
> I would love to see Lua keep up to date.
Unicode includes many, many code points whose only justification is either backwards compatibility (see e.g. Arabic presentation forms) or typesetting (non-breaking space, em-space, etc.).
Not to mention support for scholarship. Does Lua really need to support identifiers in Cuneiform or Egyptian hieroglyphics? Those codepoints are only there to support publishing, not programming.
IOW it is a hack, unavoidably.
It would be nice to support identifiers in multiple languages, but that would only be a subset of Unicode anyway. And the typesetting codes would never be needed for a programming language. You could make the lexer accept them, but only at a cost, and it makes more sense to put the burden on the programmer to normalize all space characters to a single one.
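
To make the "normalize all space characters to one" step concrete, here is a minimal sketch of what a programmer might run over a source file before handing it to the interpreter. It is written in Python rather than Lua purely for illustration. The function name `normalize_spaces` is hypothetical, and it assumes "space character" means anything in Unicode general category Zs (space separators), which covers the non-breaking space and em-space.

```python
import unicodedata

def normalize_spaces(source: str) -> str:
    """Replace every Unicode space separator (category Zs) with U+0020.

    Hypothetical helper: tabs and newlines are control characters (Cc),
    not Zs, so they pass through unchanged.
    """
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch
        for ch in source
    )

# A line containing a non-breaking space (U+00A0), an em space (U+2003)
# and an en space (U+2002) where plain spaces were intended.
line = "local\u00a0x\u2003=\u20021"
print(normalize_spaces(line))  # prints: local x = 1
```

Note that this sketch is deliberately naive: it also rewrites such characters inside string literals and comments, which may or may not be what you want.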