There's no utf8.sub. Anyone tried to code it in pure Lua yet?
This is my attempt:

function utf8.sub(s,i,j)
   i = i or 1
   j = j or -1
   if i<1 or j<1 then
      local n = utf8.len(s)
      if not n then return nil end
      if i<0 then i = n+1+i end
      if j<0 then j = n+1+j end
      -- clamp as string.sub does: start to at least 1, end to at most n
      if i<1 then i = 1 end
      if j>n then j = n end
   end
   if j<i then return "" end
   i = utf8.offset(s,i)
   j = utf8.offset(s,j+1)
   if i and j then return s:sub(i,j-1)
   elseif i then return s:sub(i)
   else return ""
   end
end
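
A quick way to check the index handling, assuming the goal is to mirror
string.sub: on ASCII-only strings characters and bytes coincide, so any
utf8.sub candidate should agree with string.sub for every index pair,
including zero, negative and out-of-range values. The helper name
count_mismatches is made up for this sketch. It is run here on string.sub
itself only to show that it works.

```lua
-- Hypothetical helper: compares a candidate f(s, i, j) with string.sub
-- on ASCII-only strings and counts the index pairs where they differ.
local function count_mismatches(f)
   local bad = 0
   for _, s in ipairs{"", "a", "hello"} do
      -- cover indices a little beyond both ends of the string
      for i = -#s-2, #s+2 do
         for j = -#s-2, #s+2 do
            if f(s, i, j) ~= s:sub(i, j) then
               bad = bad + 1
            end
         end
      end
   end
   return bad
end

print(count_mismatches(string.sub))   -- 0: string.sub agrees with itself
```

Passing a utf8.sub implementation instead of string.sub gives a rough check
of the clamping logic. It says nothing about multi-byte input.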

Notes:

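1. utf8.len(s) is computed only when i or j is less than 1; two positive
indices never need the length.
2. The use of utf8.offset implies that the result is undefined when s is not
a valid UTF-8 sequence. (If utf8.len happens to be called and finds invalid
input, the function returns nil.)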
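
For reference, a small sketch of how the two library functions used above
behave in Lua 5.3 and later. The sample string is my own choice: one 1-byte,
one 2-byte and one 3-byte character, written with \u escapes so the source
encoding does not matter.

```lua
local s = "a\u{E9}\u{20AC}"   -- 3 characters, 6 bytes

-- utf8.offset(s, n) is the byte position where the n-th character starts
print(utf8.offset(s, 1))   -- 1
print(utf8.offset(s, 2))   -- 2
print(utf8.offset(s, 3))   -- 4
print(utf8.offset(s, 4))   -- 7, i.e. #s + 1 (one past the last character)
print(utf8.offset(s, 5))   -- nil: there is no such position

-- utf8.len counts characters; on invalid input it returns nil plus the
-- byte position of the first invalid byte
print(utf8.len(s))          -- 3
print(utf8.len("a\xffb"))   -- nil   2
```

That n+1 case is why utf8.offset(s,j+1) still yields a usable end position
when j is the last character.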

This code is _long_. Sure would be nice if the utf8 library already had that :-)