- Subject: Re: UTF-8 patterns in Lua 5.3
- From: Roberto Ierusalimschy <roberto@...>
- Date: Mon, 21 Apr 2014 15:04:17 -0300
> Anyway, there is one bit of utf8 functionality I want in liblua.so not
> mentioned so far: safe handling of invalid sequences. Personally, I am
> going to assert() validity everywhere. But I know that "crash the current
> process" is unpopular.
The only "unsafe" function in the library is utf8.offset, which can
return an invalid sequence if it receives an invalid sequence. As
it does nothing with the sequence's meaning (e.g., compare it to
other stuff, filter it, etc.), any subsequent handling of the returned
subsequence should detect any problem. Moreover, as Dirk already
pointed out, you can use utf8.len to validate a sequence if necessary.
-- Roberto
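
As an illustration of the utf8.len suggestion above, here is a minimal
sketch, assuming Lua 5.3 or later. The helper name is_valid_utf8 is
hypothetical and not part of the utf8 library. It relies on utf8.len
returning a false value plus the byte position of the first invalid byte
when the string is not valid UTF-8.

```lua
-- Hypothetical helper (not part of the standard utf8 library).
-- Returns true when s is valid UTF-8; otherwise returns false plus
-- the byte position of the first invalid byte, as reported by utf8.len.
local function is_valid_utf8(s)
  local len, bad_pos = utf8.len(s)
  if len then
    -- Any number, including 0 for the empty string, is truthy in Lua.
    return true
  end
  return false, bad_pos
end

-- A valid string: "caf" followed by U+00E9.
local ok = is_valid_utf8("caf\u{E9}")

-- An invalid string: the byte 0xFF never occurs in UTF-8.
local bad, pos = is_valid_utf8("abc\xFFdef")

print(ok)        -- true
print(bad, pos)  -- false   4
```

A caller could run such a check once on untrusted input before passing
the string to utf8.offset, rather than asserting validity at every call.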