- Subject: Re: Should Lua be more strict about Unicode errors?
- From: Dirk Laurie <dirk.laurie@...>
- Date: Wed, 9 Sep 2015 11:50:31 +0200
2015-09-08 22:51 GMT+02:00 Ross Berteig <Ross@cheshireeng.com>:
> UTF-8 is at least normalizable in a way that would stabilize and
> be immune to further normalization.
I think the intention of the disclaimer "Any operation that needs
the meaning of a character, such as character classification,
is outside its scope." is that the utf8 library does not claim to
provide the full Monty. This discussion has amply proved that
it is a nontrivial task to provide such a library.
In the documentation of the utf8 library there are provisos like
"assuming that the subject is a valid UTF-8 string". The scope
of the manual does not include spelling out what happens
when something is out of spec. For example, it is nowhere stated
what #tbl returns when the table is not a sequence.
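To make the "valid UTF-8" proviso concrete, here is a minimal sketch, assuming Lua 5.3 or later. One of the few places where the manual does say what happens on bad input is utf8.len, which returns a false value plus the position of the first invalid byte. The helper checked_len is a hypothetical name for illustration, not part of the utf8 library.

```lua
-- Requires Lua 5.3 or later (utf8 library and "\u{...}" escapes).
local valid   = "h\u{E9}llo"  -- U+00E9 encoded as two bytes (C3 A9)
local invalid = "h\xE9llo"    -- a lone byte E9 (Latin-1 style), not valid UTF-8

-- Byte count versus character count for the valid string.
print(#valid, utf8.len(valid))

-- For the invalid string, utf8.len reports a false value and the
-- byte position where validation failed.
print(utf8.len(invalid))

-- Hypothetical helper: validate first, so that the other utf8
-- functions are only ever handed strings that meet their proviso.
local function checked_len(s)
  local n, badpos = utf8.len(s)
  if not n then
    return nil, ("invalid UTF-8 at byte %d"):format(badpos)
  end
  return n
end

print(checked_len(valid))
print(checked_len(invalid))
```

Checking with utf8.len up front is only one possible policy; the library itself leaves the choice to the caller.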
I'm happy that the manual says enough to warn people that the
utf8 library is not an implementation of a standard.
A logician, a mathematician and a salesman visited Namibia
for the first time. From the window of their bus, a karakul
sheep could be seen.
"Amazing", said the salesman. "The sheep in Namibia are black".
"No", corrected the mathematician. "At least one sheep in
Namibia is black."
The logician pursed his lips and slowly brought the forefinger
and thumb of his right hand together. "There is at least one
sheep in Namibia, and the side of it that we can see is black."