Re: Future plans for Lua and Unicode

lua-l archive


Subject: Re: Future plans for Lua and Unicode
From: Roberto Ierusalimschy <roberto@...>
Date: Fri, 6 Jul 2012 09:36:51 -0300

> This makes me think of a trick that could be useful in some situations.
> I know it is illegal according to the UTF-8 specification...
> But overlong UTF-8 sequences could be used to *escape* special
> characters in string literals in a unified way!
> Typically, new line, carriage return, tabulation are entered as \n, \r
> and \t respectively. NUL byte and other control characters are written
> in decimal or hexadecimal form as \000 or \x01. And characters " ' and
> \ must often be entered as \", \' and \\.
> 
> [...]
> Is this idea completely stupid, or does it have any practical interest?

Sorry Patrick, but this one I would call "mostly stupid".  There are a
number of drawbacks to that approach. Because overlong sequences are
illegal, most text editors will reject or silently convert them, and
offer no way to enter such a sequence either. Other UTF-8-aware software
libraries will also reject overlong sequences. This seriously limits the
number of practical uses! :)
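
As a rough illustration of the rejection described above, here is a minimal
sketch that assumes Lua 5.3 or later, where the standard utf8 library is
available (it did not exist when this message was written). The variable
names are made up for the example. It builds the overlong two-byte form of
a newline (bytes 0xC0 0x8A) and checks how utf8.len treats it:

```lua
-- Assumption: Lua 5.3+ (the utf8 library is not present in 5.1/5.2).
-- Overlong two-byte encoding of U+000A (newline): 0xC0 0x8A.
local overlong_newline = string.char(0xC0, 0x8A)
local plain_newline = "\n"

-- utf8.len returns the number of characters for valid UTF-8, or a
-- false value plus the position of the first invalid byte otherwise.
local ok_len = utf8.len(plain_newline)
local bad_len, bad_pos = utf8.len(overlong_newline)

print(ok_len)
print(bad_len, bad_pos)
```

The well-formed newline counts as one character, while the overlong form
is reported as invalid at its first byte, so a string literal relying on it
would not survive a round trip through a validating library.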

Moreover, it is not that difficult to add escapes or to use [[...]].
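
For comparison, the two existing alternatives might look like the sketch
below. The string contents and variable names are invented for the example;
only the escape syntax and the long-bracket syntax come from the language.

```lua
-- Ordinary escapes inside a quoted string.
local escaped = "say \"hi\" and 'bye'\nC:\\temp"

-- The same text in a long-bracket literal: no escapes are processed,
-- and quotes, backslashes and line breaks are taken literally.
local long = [[say "hi" and 'bye'
C:\temp]]

-- A higher bracket level lets the text itself contain "]]".
local nested = [==[a ]] b]==]

-- Decimal escapes cover control characters such as NUL.
local with_nul = "a\0b"

print(escaped == long)
print(nested)
print(#with_nul)
```

Both literals denote the same string, which is why a separate escaping
mechanism based on invalid byte sequences adds little.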

-- Roberto

