On Tue, Apr 30, 2013 at 3:24 PM, marbux <marbux@gmail.com> wrote:
> On Tue, Apr 30, 2013 at 5:16 AM, Rob Kendrick <rjek@rjek.com> wrote:
>
>> Also, I'm not sure how well Lua reacts when char is not 8 bits.
>
> This can be problematic absent a library for handling such characters.
> For UTF-8 chars, I've yet to hit a problem using the utf-8.lua library
> <http://www.curse.com/addons/wow/utf8> with either Lua 5.1.x or
> 5.2.x. Written in pure Lua, it
> provides UTF-8 aware substitutes for string.len, string.sub,
> string.reverse, string.upper, and string.lower. There are other
> solutions. See generally, <http://lua-users.org/wiki/LuaUnicode>.
>
> Paul
>

I think you missed what was being said. UTF-8 still uses an 8-bit code
unit, which can be stored in an array of type "char". But "char" has
not been 8 bits long on every architecture -- 7-bit and 9-bit chars
weren't rare historically (9-bit chars make a lot of sense on a system
with 36-bit words), and even on some modern architectures (TI's C54x
DSPs, for example) "char" is 16 bits long.
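
To make the distinction concrete, here is a minimal C sketch of how code
can ask the implementation how wide "char" is instead of assuming 8
bits. CHAR_BIT comes from the standard <limits.h>; the helper names
char_width_bits and all_octets are made up for this illustration and are
not taken from Lua's sources.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width of "char" in bits on this implementation. ISO C only
   guarantees CHAR_BIT >= 8; it may be 9, 16, 32, ... */
int char_width_bits(void)
{
    return CHAR_BIT;
}

/* Hypothetical helper: returns 1 if every element of s holds a value
   that fits in one octet (0..0xFF), which is what a UTF-8 code unit
   needs. When CHAR_BIT is 8 this is always true; on a wider "char"
   an element could carry extra high bits. */
int all_octets(const unsigned char *s, size_t n)
{
    size_t i;

    assert(n == 0 || s != NULL);
    for (i = 0; i < n; i++) {
        if ((s[i] & ~0xFFu) != 0)
            return 0;   /* bits above the low 8 are set */
    }
    return 1;
}
```

On a machine with 16-bit chars, a UTF-8 string would still fit in a
"char" array, one code unit per element; it is code that assumes each
element is exactly one octet that is at risk.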

/s/ Adam