- Subject: Plea for the support of unicode escape sequences
- From: Edgar Toernig <froese@...>
- Date: Tue, 28 Jun 2011 21:08:10 +0200
As it's the last chance for probably 5 or 6 years to ask for it:
Could the next version have support for Unicode escape sequences?
(like "A smiley: \u263a, an en-dash: \u2013, an ellipsis: \u2026")
Unicode is in wide use now but encoding characters using the \x hex
escapes is annoying. Even now most extension libraries are at least
UTF-8 transparent but there's no sane way to enter non-trivial
Unicode characters. For example, the above string encoded by hand would
become "A smiley: \xe2\x98\xba, an en-dash: \xe2\x80\x93, an
ellipsis: \xe2\x80\xa6", and as Unicode tables usually don't contain
the UTF-8 encoded form, you have to do the conversion manually.
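
To illustrate the manual conversion described above, here is a minimal C sketch that turns a code point into its UTF-8 bytes. The name utf8_encode is hypothetical and not taken from the Lua sources. Rejecting surrogates and values above U+10FFFF is an assumption of this sketch, not part of the request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the Lua sources: writes the UTF-8
 * encoding of code point cp into out (room for at least 4 bytes) and
 * returns the number of bytes written, or 0 if cp is rejected. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* assumption: reject surrogates */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx + 3 continuation bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* assumption: beyond U+10FFFF is an error */
}
```

Calling utf8_encode(0x263A, buf) yields the three bytes e2 98 ba, which is the hand-encoded smiley from the string above.
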
For stock Lua, these Unicode escape sequences should generate UTF-8.
Modified versions using wchars may use an appropriate encoding (UTF-16
or UTF-32). I don't care whether the common \u+4hexdigits and
\U+8hexdigits or a variable \u+1to6or8hexdigits sequence is implemented
(but don't make it decimal; Unicode tables usually use hex numbering).
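
As a rough sketch of the fixed-width variant mentioned above (\u plus 4 hex digits, \U plus 8 hex digits), the digits could be read as follows. The function name parse_unicode_escape and its interface are hypothetical and not part of any Lua release. It assumes the input points just past the backslash and is NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of hex digit c, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: s points at the 'u' or 'U' that follows the
 * backslash. On success stores the code point in *cp and returns the
 * number of characters consumed (5 or 9); returns 0 on malformed input. */
size_t parse_unicode_escape(const char *s, unsigned long *cp)
{
    size_t ndigits, i;
    unsigned long value = 0;

    assert(s != NULL && cp != NULL);
    if (s[0] == 'u')
        ndigits = 4;
    else if (s[0] == 'U')
        ndigits = 8;
    else
        return 0;

    for (i = 1; i <= ndigits; i++) {
        int d = hex_value((unsigned char)s[i]);
        if (d < 0)
            return 0;          /* non-hex character or premature end of string */
        value = (value << 4) | (unsigned long)d;
    }
    *cp = value;
    return ndigits + 1;
}
```

The resulting code point would then be handed to whatever encoder the build uses: UTF-8 for stock Lua, or UTF-16/UTF-32 for a wchar-based variant.
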
I know that Lua's authors try to avoid bloat, but these additional
176 bytes (that's what an implementation of the \u4x/\U8x variant on
x86-32 costs) are IMHO very well spent.