Re: Plea for the support of unicode escape sequences

lua-l archive


Subject: Re: Plea for the support of unicode escape sequences
From: David Given <dg@...>
Date: Thu, 30 Jun 2011 13:20:07 +0100

Jim Whitehead II wrote:
[...]
> The official unicode roadmap includes a code map for Tengwar:
> http://en.wikipedia.org/wiki/Tengwar#Unicode

However, as I discovered the other day when I needed them for a
particularly esoteric program I was writing, there are no Malachim,
Celestial, Theban, or Transitus Fluvii scripts. There isn't even any
Enochian. Unicode 6 *has* added a bunch of alchemical symbols, but
there's only so much you can do with those...

Wrenching the discussion at least back in the direction of being on
topic, I think that at some point they're going to have to lift the
0x110000 limit on the Unicode space size. I'm pretty sure that limit was
only imposed to keep Java and Windows happy; they standardised on UCS-2
way too soon, back when they thought 0x10000 was more than anyone would
need, and as a result shot themselves in the foot really badly. If you
don't believe me, just go look at surrogates, and then check out the
Java String API and the hideous mess that is charAt() vs
codePointAt()... and then try using astral plane code points in online
services and seeing how many of them actually work.
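
To make the surrogate point concrete, here is a rough Python sketch of the
UTF-16 arithmetic. The helper names (to_surrogates, utf16_units) are made up
for illustration and come from no library. It shows that a code point above
U+FFFF becomes two 16-bit units, and that two 10-bit halves can only reach
0x10000 + 2^20 = 0x110000, which is where the limit comes from.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not an astral-plane code point")
    v = cp - 0x10000             # 20-bit offset into the astral planes
    high = 0xD800 + (v >> 10)    # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (v & 0x3FF)   # bottom 10 bits -> low (trail) surrogate
    return high, low


def utf16_units(s):
    """Return the 16-bit code units of s as encoded in UTF-16."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]


clef = "\U0001D11E"              # MUSICAL SYMBOL G CLEF, one code point
print(len(clef))                 # code points, as Python counts them
print([hex(u) for u in utf16_units(clef)])
print([hex(u) for u in to_surrogates(0x1D11E)])
```

Anything that indexes by 16-bit unit, the way Java's charAt() does, sees two
items here, while an index by code point sees one.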

Which is why, of course, people should not be using UCS-2 or UTF-16 for
anything. In fact, I'd suggest not using UCS-4 either --- it encourages
shortcuts in handling Unicode that aren't actually valid, like assuming
you can split strings anywhere. UTF-8 FTW.
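
A small Python sketch of the splitting problem, under the assumption that
Python 3 str indexing (which is by code point) is a fair stand-in for UCS-4.
Slicing between code points can still detach a combining mark from its base
letter. The second half shows one possible way to back a byte offset up to a
UTF-8 code point boundary. utf8_safe_split is a hypothetical helper, not a
library function, and it does nothing about combining sequences.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT: two code points, one visible letter.
s = "e\u0301"
head, tail = s[:1], s[1:]
# The split is legal at the code point level, but tail is now an orphaned
# accent (nonzero canonical combining class).
print(repr(head), repr(tail), unicodedata.combining(tail))


def utf8_safe_split(data, i):
    """Split UTF-8 bytes at or before index i, never inside a code point.

    Hypothetical helper: continuation bytes look like 0b10xxxxxx, so step
    back until the byte at i is not one of those.
    """
    while 0 < i < len(data) and (data[i] & 0xC0) == 0x80:
        i -= 1
    return data[:i], data[i:]


data = "a\u20acb".encode("utf-8")    # the euro sign takes three bytes
left, right = utf8_safe_split(data, 2)
print(left, right)
```

With UTF-8 you at least have to think about boundaries from the start, and
the continuation-byte pattern makes them cheap to find.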

--
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "I have always wished for my computer to be as easy to use as my
│ telephone; my wish has come true because I can no longer figure out
│ how to use my telephone." --- Bjarne Stroustrup


