On Thu, Apr 26, 2007 at 01:47:02PM +0200, David Kastrup wrote:
> The only documentation I have been able to find is in "unitest", and
> it is very, very sketchy.
Or, let's say, terse ;)
It does state that "UTF-8 operates on UTF-8 sequences as of RFC 3629",
and even that "format ... uses character counts for precision in %s".
The grapheme module counts grapheme clusters.

> --	NOTE: find positions are in bytes for all ctypes!
> --	use ascii.sub to cut found ranges!
Right, the fact that utf8.find _returns_ byte positions gets a special
note precisely because utf8.sub does NOT work with byte counts: it takes
character positions, so a range found by find has to be cut with ascii.sub.
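
To make the byte/character distinction concrete, here is a minimal
sketch in plain Lua that does not use the unicode module at all. The
helpers utf8_len and byte_to_char_index are hypothetical names made up
for this illustration, and the stock string.find / string.sub stand in
for the byte-oriented behaviour described in the note above. It assumes
the input is valid UTF-8.

```lua
-- Hypothetical helper (not part of any library): count UTF-8 characters
-- by counting every byte that is NOT a continuation byte (10xxxxxx).
-- Assumes s is valid UTF-8.
local function utf8_len(s)
  local n = 0
  for i = 1, #s do
    local b = s:byte(i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

-- Hypothetical helper: turn a byte position into a character index by
-- counting the characters in the prefix that ends at that byte.
local function byte_to_char_index(s, bytepos)
  return utf8_len(s:sub(1, bytepos))
end

-- "naïve café" written with decimal escapes: 12 bytes, 10 characters.
local s = "na\195\175ve caf\195\169"

-- A plain (non-pattern) byte-oriented search returns BYTE positions.
local first, last = string.find(s, "caf\195\169", 1, true)

-- Cutting with byte positions needs a byte-oriented sub.
local cut = string.sub(s, first, last)

-- The same range expressed in characters is different.
local cfirst = byte_to_char_index(s, first)
local clast = byte_to_char_index(s, last)

print(#s, utf8_len(s))   -- byte length vs. character count
print(first, last)       -- byte range of the match
print(cfirst, clast)     -- the same range counted in characters
print(cut)
```

Here the match sits at bytes 8..12 but at characters 7..10, which is
why feeding find's results into a character-based sub would cut the
wrong range.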

> It does not exactly sound like character-based indexing to me.
Sorry if this is confusing.

It would be great if somebody wrote some serious documentation.
However, a quick look at the test cases reveals not only what
the module is supposed to do, but also what it actually does.