Alex Queiroz wrote:
     UTF-8 is best for serialisation (writing text to disk, to socket
etc.). For in-memory strings it makes a lot of algorithms harder.
UCS-2 was a bad idea, but UTF-16 works perfectly well. UTF-32 is even

Not much, I'm afraid --- as each glyph can be composed of multiple code points, having fixed-size code points doesn't help a great deal. Your algorithms still have to cope with variable-sized groups of code points. And if you're going to do that, you might as well use UTF-8 for its ASCII interoperability.
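
To make that concrete, here is a rough Python sketch (the helper name `clusters` is made up for illustration). It only groups a base code point with the combining marks that follow it, which is a simplification of real grapheme segmentation (UAX #29), but it is enough to show that even UTF-32 leaves you with variable-sized groups:

```python
import unicodedata

def clusters(text):
    """Group each base code point with the combining marks that follow it.

    Hypothetical helper: a simplified stand-in for full grapheme
    cluster segmentation, which has many more rules than this.
    """
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch) != 0:
            groups[-1] += ch   # combining mark: attach to the previous base
        else:
            groups.append(ch)  # start a new group
    return groups

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT: four glyphs on screen.
word = "cafe\u0301"

code_points = len(word)                           # Python str is indexed by code point
utf8_bytes = len(word.encode("utf-8"))            # variable-width code units
utf16_units = len(word.encode("utf-16-le")) // 2
utf32_units = len(word.encode("utf-32-le")) // 4  # fixed-width, one unit per code point
glyphs = clusters(word)

print(code_points, utf8_bytes, utf16_units, utf32_units, len(glyphs))
```

The UTF-32 unit count matches the code point count, yet neither matches the number of glyphs, so the grouping pass is needed whichever encoding is used.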

┌─── ───── ─────
│ "They laughed at Newton. They laughed at Einstein. Of course, they
│ also laughed at Bozo the Clown." --- Carl Sagan