|
Alex Queiroz wrote: [...]
> UTF-8 is best for serialisation (writing text to disk, to socket etc.). For in-memory strings it makes a lot of algorithms harder. UCS-2 was a bad idea, but UTF-16 works perfectly well. UTF-32 is even better.
Not much, I'm afraid --- since a single glyph can be composed of multiple code points, having fixed-size code points doesn't help a great deal. Your algorithms still have to cope with variable-sized groups of code points (grapheme clusters). And if you're going to do that anyway, you might as well use UTF-8 and get its ASCII interoperability for free.
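To illustrate the point (a minimal sketch in Python, whose strings are sequences of code points): a combining accent sits in its own code point, so counting code points doesn't count what the user sees as characters, regardless of how wide each code point's storage is.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one visible glyph,
# but two code points -- indexing by code point splits the glyph.
s = "e\u0301"
print(len(s))                # 2 code points for one glyph
print(s == "\u00e9")         # False: the precomposed "é" is a different sequence

# NFC normalisation folds the pair into the single precomposed code point,
# but not every grapheme cluster has a precomposed form, so algorithms
# still have to handle multi-code-point groups.
print(len(unicodedata.normalize("NFC", s)))  # 1
```

Note the equality check fails even though both sequences render identically; that is exactly the variable-sized-group problem that fixed-width code units can't make go away.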
--
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│
│ "They laughed at Newton. They laughed at Einstein. Of course, they
│ also laughed at Bozo the Clown." --- Carl Sagan