On Thu, Oct 13, 2005 at 11:55:04AM -0700, Mark Hamburg wrote:
> Given the frequency with which these issues come up, it seems pretty clear
> that we need standard libraries for Lua which, if not part of the standard
> distribution, can be readily pointed to.
> 1. A UTF8 version of string. string.byte presumably turns into something
> like string.char. The answer to encoded strings is to use this library
> instead of the standard string library. (If writing this, I would probably
> optimize for the pure ASCII case and fallback to the standard distribution
> code.)
points to slnunicode at
which handles all valid Unicode characters in UTF-8,
not limited to the BMP (64K chars) and avoiding the
excess storage for ASCII.
Unlike Java, it can also address the i'th character
with combining marks taken into account. It does not
yet handle a few special cases; these are documented
in the source and would be easy to add.
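
To illustrate why a UTF-8 aware library is needed at all,
here is a rough plain-Lua sketch (not slnunicode's code;
the name utf8_len is made up for this example) that counts
code points instead of bytes. It assumes valid UTF-8, does
no validation, and ignores combining marks. It takes the
ASCII shortcut suggested in the quoted mail.

```lua
-- Hypothetical helper: count UTF-8 code points in s.
-- Assumes s is valid UTF-8; combining marks are counted separately.
local function utf8_len(s)
  -- Fast path: no byte >= 128 means pure ASCII, so bytes == characters.
  if not string.find(s, "[\128-\255]") then
    return #s
  end
  local n = 0
  for i = 1, #s do
    local b = string.byte(s, i)
    -- Bytes 128..191 are continuation bytes; every other byte starts a character.
    if b < 128 or b > 191 then
      n = n + 1
    end
  end
  return n
end

print(utf8_len("h\195\169llo"))      -- "e" with acute accent takes 2 bytes
print(utf8_len("\240\157\132\158"))  -- U+1D11E, outside the BMP, takes 4 bytes
```

The standard # operator would report 6 and 4 bytes for
these two strings, which is the mismatch a UTF-8 string
library has to hide.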

The module is based on the 5.0 string library with
some extensions from early 5.1 work. An update for 5.1,
with the new string find et al and conforming to the
5.1 module conventions, is on the way.

> 2. A StringBuffer class that implements editable strings. The table approach
> handles efficient concatenation, but doesn't allow for other changes. Of
> course, combined with the preceding, one actually needs a UTF8 buffer
> class...
We have an implementation, announced earlier but not yet
made public, based on a tiny generic I/O library (tio).
The "string buffer" is basically a file in RAM
made from a simple buffer chain (or, for that matter,
possibly using an external temp file for large buffers).

As a file it directly supports seek/write anywhere,
but not cut/paste. Adding these is not hard,
yet would cost some simplicity and speed.
See the discussion in April following
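
The idea of a "file in RAM" built from a buffer chain can
be sketched in plain Lua roughly as follows. This is only
an illustration under my own assumptions, not the tio code:
the MemFile name, its methods and the fixed chunk size are
all invented for the example. A write touches only the
chunks it covers, which is why seek/write is cheap while
cut/paste (which would shift all later bytes) is not
provided.

```lua
-- Hypothetical in-memory file: a chain of fixed-size chunks kept in a table.
local CHUNK = 4  -- deliberately tiny so the example crosses chunk boundaries

local MemFile = {}
MemFile.__index = MemFile

function MemFile.new()
  return setmetatable({ chunks = {}, size = 0, pos = 0 }, MemFile)
end

-- Set the current position to an absolute 0-based byte offset.
function MemFile:seek(offset)
  assert(offset >= 0 and offset <= self.size, "seek out of range")
  self.pos = offset
end

-- Overwrite (or append) data at the current position.
function MemFile:write(data)
  local i = 1
  while i <= #data do
    local idx = math.floor(self.pos / CHUNK) + 1   -- chunk holding pos
    local off = self.pos % CHUNK                   -- offset inside that chunk
    local take = math.min(CHUNK - off, #data - i + 1)
    local old = self.chunks[idx] or ""
    -- Lua strings are immutable, so rebuild just this one chunk.
    self.chunks[idx] = string.sub(old, 1, off)
      .. string.sub(data, i, i + take - 1)
      .. string.sub(old, off + take + 1)
    self.pos = self.pos + take
    i = i + take
  end
  if self.pos > self.size then
    self.size = self.pos
  end
end

-- Join the chain into one Lua string.
function MemFile:tostring()
  return table.concat(self.chunks)
end

local f = MemFile.new()
f:write("hello world")
f:seek(6)
f:write("Lua!!")
print(f:tostring())
```

A real implementation in C would overwrite bytes in place
instead of rebuilding chunk strings, and could spill large
buffers to a temp file as mentioned above.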

The main reason for not finishing this yet was to wait
for 5.1 to settle down; right now we are catching up
to the 5.1 alpha.

> If such things already exist, maybe we just need to make their existence
> more prominently known.
The existing module for 1. should serve most needs.
2. will have multiple implementations for different