lua-l archive


> This makes me think of a trick that could be useful in some situations.
> I know it is illegal according to UTF-8 specifications...
> But using overlong UTF-8 sequences could be used to *escape* special
> characters in string literals in a unified way !
> Typically, new line, carriage return, tabulation are entered as \n, \r
> and \t respectively. NUL byte and other control characters are written
> in decimal or hexadecimal form as \000 or \x01. And characters " ' and
> \ must often be entered as \", \' and \\.
> 
> [...]
> Is this idea completely stupid, or does it have any practical interest?

Sorry Patrick, but this one I would call "mostly stupid".  There are a
number of drawbacks to that approach. Because overlong sequences are
illegal, most text editors will reject or silently convert them, and
offer no way to enter them either. Other UTF-8-aware software
libraries will also reject overlong sequences. This seriously limits
the number of practical uses!   :)
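
As a rough illustration of the library point, here is a minimal sketch
that assumes Lua 5.3 or later, where the standard utf8 library is
available. It takes the two-byte overlong form of a newline (bytes
0xC0 0x8A, chosen here only as an example) and checks it with utf8.len;
the variable names are mine.

```lua
-- Sketch, assuming Lua 5.3+ (standard utf8 library).
-- "\n" is the single byte 0x0A; its overlong two-byte form is 0xC0 0x8A.
local plain = "\n"
local overlong = "\xC0\x8A"

-- utf8.len returns the number of characters for valid UTF-8, or a
-- false value plus the position of the first invalid byte otherwise.
local n_plain = utf8.len(plain)
local n_over, bad_pos = utf8.len(overlong)

print(n_plain)           -- valid: one character
print(n_over, bad_pos)   -- invalid: false value, offending position
```

Any other decoder that follows the UTF-8 rules strictly would be
expected to refuse the same two bytes.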

Moreover, it is not that difficult to add escapes or to use [[...]].
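
For comparison, a small sketch of the two existing mechanisms: ordinary
backslash escapes and long brackets. The strings and variable names are
only illustrative.

```lua
-- Quotes and a backslash, written with escapes and with long brackets.
local escaped = "He said \"hi\"\\n"
local raw     = [[He said "hi"\n]]   -- no escape processing inside [[...]]
assert(escaped == raw)

-- A real line break: "\n" escape versus a literal newline.
-- The newline right after the opening [[ is skipped.
local two_lines_escaped = "first\nsecond"
local two_lines_raw = [[
first
second]]
assert(two_lines_escaped == two_lines_raw)

-- A level-1 long bracket lets the text itself contain "]]".
local with_brackets = [==[a[b[1]] = 2]==]
assert(with_brackets == "a[b[1]] = 2")
```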

-- Roberto