If size is all you care about, maybe this version can give you some ideas:

-- Split off 6-bit continuation bytes until the rest fits in the lead byte.
local function utf8_sep(n, a, ...)
    if a < 2^(6-n) then return n+1, a, ... end
    return utf8_sep(n+1, math.floor(a/2^6), 0x80+a%2^6, ...)
end
-- Build the lead byte (top n bits set) and emit the whole sequence.
local function utf8_gen(n, a, ...)
    print(n, (2^n-1)*2^(8-n) + a, ...)  -- debug trace only
    return string.char((2^n-1)*2^(8-n) + a, ...)
end
local function utf8(code)
    if code < 0x80 then return string.char(code) end
    return utf8_gen(utf8_sep(0, code))
end

It's only 11 lines of code, while your [1] has 14.

Note that it will encode every value up to 32 bits, whereas yours
fails at 2097152 (0x200000, i.e. 2^21).
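
To make the recursion easier to follow, here is a rough iterative sketch of
the same idea, checked against a few well-known encodings. The names
utf8_iter and hex are made up for this illustration and are not part of the
code above; the `table.unpack or unpack` line is only there so the sketch
runs on both Lua 5.1 and later versions.

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 vs 5.2+ compatibility

-- Hypothetical iterative equivalent of utf8_sep + utf8_gen.
local function utf8_iter(code)
    if code < 0x80 then return string.char(code) end
    local bytes, n = {}, 0
    -- Peel off 6-bit continuation groups (10xxxxxx) from the low end
    -- until what is left fits into the payload bits of the lead byte.
    while code >= 2^(6-n) do
        table.insert(bytes, 1, 0x80 + code % 2^6)
        code = math.floor(code / 2^6)
        n = n + 1
    end
    n = n + 1  -- total length in bytes, lead byte included
    -- Lead byte: the top n bits set, followed by the remaining payload.
    table.insert(bytes, 1, (2^n - 1) * 2^(8-n) + code)
    return string.char(unpack(bytes))
end

-- Small helper to show a string as hex bytes.
local function hex(s)
    return (s:gsub(".", function(c) return string.format("%02X ", c:byte()) end))
end

assert(utf8_iter(0x41)    == "A")                 -- 1 byte
assert(utf8_iter(0xE9)    == "\195\169")          -- C3 A9
assert(utf8_iter(0x20AC)  == "\226\130\172")      -- E2 82 AC
assert(utf8_iter(0x1F600) == "\240\159\152\128")  -- F0 9F 98 80
print(hex(utf8_iter(0x20AC)))
```

The loop condition is the same test as in utf8_sep: each extra continuation
byte costs one more payload bit in the lead byte, hence 2^(6-n).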

2012/6/19 Patrick Rapin:
> Essentially as an exercise, I tried to write the smallest possible
> UTF-8 encoder in Lua [1].
> Compared to a naive implementation like in [2], it is around 2.6 times shorter.
> Still, I am wondering if the code could be further shortened (not
> counting space removal).
> [1]
> [2] (and that implementation doesn't
> handle 4-byte codes)