github repo: https://github.com/starwing/luautf8


UTF-8 module for Lua 5.x
========================

This module adds UTF-8 support to Lua.

It uses data extracted from the Unicode Character Database[1], and has been
tested on Lua 5.2.3 and LuaJIT.

parseucd.lua is a pure Lua script that generates unidata.h, which provides
the data for converting characters and checking their categories.

It is mainly intended to be compatible with Lua's own string module; it passes
all string and pattern matching tests in the Lua test suite[2].
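
As a small illustration of why a UTF-8-aware counterpart to the string module
is useful, the sketch below uses only stock Lua and none of this module's
functions. The sample string and variable names are made up for the example.

```lua
-- "\195\169" is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
local s = "caf\195\169"

-- The stock length operator counts bytes, not characters:
-- four characters, but five bytes.
local byte_len = #s

-- Stock string.sub also works on bytes, so taking "the last character"
-- returns only the trailing continuation byte of the two-byte sequence.
local last = s:sub(-1)

print(byte_len, last == "\169")
```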

It also adds some useful routines for UTF-8-specific features, such as:
  - a convenient interface for escaping Unicode sequences in strings.
  - string insert/remove, since extracting UTF-8 substrings may be expensive.
  - calculating Unicode display width, useful when implementing e.g. a
    console emulator.
  - a useful interface for translating between Unicode (character) offsets
    and byte offsets.
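
To make the last item concrete, here is a minimal pure-Lua sketch of what
translating between character offsets and byte offsets involves. It is not
this module's implementation or API: the helper names char_to_byte and
byte_to_char are hypothetical, and the code assumes well-formed UTF-8 input.

```lua
-- A byte starts a character unless it is a continuation byte (10xxxxxx),
-- i.e. unless it lies in the range 0x80..0xBF.
local function is_lead(b)
  return b < 0x80 or b >= 0xC0
end

-- Hypothetical helper: byte position where the n-th character starts,
-- or nil if the string has fewer than n characters.
local function char_to_byte(str, n)
  local count = 0
  for i = 1, #str do
    if is_lead(str:byte(i)) then
      count = count + 1
      if count == n then return i end
    end
  end
  return nil
end

-- Hypothetical helper: index of the character that contains byte `pos`.
local function byte_to_char(str, pos)
  local count = 0
  for i = 1, math.min(pos, #str) do
    if is_lead(str:byte(i)) then count = count + 1 end
  end
  return count
end

-- "h", then U+00E9 as two bytes, then "llo": 5 characters, 6 bytes.
local s = "h\195\169llo"
print(char_to_byte(s, 3), byte_to_char(s, 4))
```

Both helpers scan from the start of the string, so each call is linear in the
string length; that cost is the reason a dedicated offset interface is handy.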

[1]: http://www.unicode.org/reports/tr44/
[2]: http://www.lua.org/tests/5.2/


--
regards,
Xavier Wang.