lua-users home
lua-l archive

moin Denis

On Sun, Dec 27, 2009 at 01:42:21PM +0100, spir wrote:
> I'm building a unicode library. Basically, a UniString would be a real sequence of characters; which themselves mainly are defined by their code (point). Then, unistrings would have all typical string methods. (This is in contrast with common unicode string libraries that in fact provide --for me, useless-- methods on utf8 strings.)

while I am too dumb to understand what your issue is with
"unistrings would have all typical string methods
.. in contrast .. useless-- methods on utf8 strings.",
there is a library which has "all typical string methods" on utf8 strings

By itself it lacks support for all those nifty collations,
but with a suitable libc like glibc you can easily add that
by using strcoll and a suitable locale.
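
A minimal sketch of the strcoll route in plain C, to show how little is needed. The helper name collate_in_locale is made up for this illustration and is not part of any library; which locale names are installed (for example "de_DE.UTF-8") depends on the system, so the sketch keeps the previous collation order if the requested locale is missing.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: compare two strings under the collation rules of
 * the named locale. Returns <0, 0 or >0, like strcmp.
 * If the locale is not installed, setlocale() returns NULL and the
 * collation order that was active before the call is used instead. */
int collate_in_locale(const char *locale_name, const char *a, const char *b)
{
    assert(locale_name != NULL && a != NULL && b != NULL);

    /* Save the current LC_COLLATE setting so it can be restored. */
    char saved[128];
    const char *current = setlocale(LC_COLLATE, NULL);
    strncpy(saved, current != NULL ? current : "C", sizeof saved - 1);
    saved[sizeof saved - 1] = '\0';

    setlocale(LC_COLLATE, locale_name);  /* may fail and return NULL */
    int result = strcoll(a, b);          /* locale-aware comparison */
    setlocale(LC_COLLATE, saved);        /* restore previous setting */
    return result;
}
```

In the "C" locale this behaves like strcmp; with a UTF-8 locale installed, glibc applies that locale's collation rules to the UTF-8 bytes. From Lua the same effect should be reachable through os.setlocale, since Lua's string comparison is built on strcoll.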

Conversion/recoding on IO, including canonicalization (c14n), is a different issue,
probably better left to iconv and friends.

> Hints welcome.

reconsider your attitude of bringing the light to the Lua community
every other day.

Try to figure out what is already there with the humility
appropriate for an apprentice and you will stumble upon some good stuff.

For example you might understand why Lua does not have
a character data type in the first place,
and how all the upper/lower/ctype etc. methods work
pretty well on strings regardless of encoding.
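
To illustrate the last point, here is a rough C sketch of byte-wise uppercasing, which is in essence what string.upper does through the C library's toupper. The function name upper_bytes is invented for this example, and the sketch assumes the default "C" locale, where only a..z are mapped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: uppercase a byte string in place, one byte at a time.
 * In the "C" locale toupper() only maps 'a'..'z'; every byte >= 0x80 is
 * returned unchanged, so multi-byte UTF-8 sequences stay intact. */
void upper_bytes(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Cast to unsigned char: toupper() is undefined for negative values. */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

Given the UTF-8 bytes of "café", the ASCII letters become "CAF" while the two bytes encoding "é" pass through untouched. The non-ASCII letter is not uppercased, but the string remains valid UTF-8, and the same holds for any other encoding that leaves the ASCII range alone.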