On Fri, Sep 27, 2013 at 11:22 AM, Coda Highland <chighland@gmail.com> wrote:
> I use a very simple JSON encoder that just scans the string character by
> character and substitutes the correct escape sequence whenever one of these
> characters is encountered.  I don't think you need to resort to Base64 or
> other binary encodings unless you really want to.

Two major problems here:

(1) Not every value is a valid Unicode character. There are several
ranges defined as illegal, for various reasons.

(2) Whether 8, 16, or 32 bit, not every byte sequence is a legal UTF
representation.

Sorry, you are absolutely right.

Getting back to the OP's question, the only remaining issue seems to be how to determine when base64 encoding is needed. I think the OP is correct that you will have to scan the string to check whether it contains any byte sequences that are invalid in the UTF representation being used (either that, or just base64-encode the data all the time, regardless). I couldn't easily tell from http://lua-users.org/wiki/LuaUnicode whether any of the Unicode packages already provide a function to test whether a string is a valid UTF encoding.
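
For what it's worth, here is a rough sketch of what such a scan could look like in plain Lua, assuming the target encoding is UTF-8. The function name is_valid_utf8 is made up for this example and does not come from any of the packages on that wiki page. It walks the string byte by byte and rejects truncated sequences, overlong forms, surrogates (U+D800..U+DFFF) and anything above U+10FFFF, following the byte ranges in RFC 3629.

```lua
-- Sketch only: returns true if s is well-formed UTF-8, false otherwise.
-- Uses nothing beyond string.byte, so it should run on Lua 5.1 and later.
local function is_valid_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local c = s:byte(i)
    -- len: total bytes in this sequence
    -- lo, hi: allowed range for the second byte (this is what rules out
    -- overlong forms, surrogates, and values above U+10FFFF)
    local len, lo, hi
    if c < 0x80 then
      len = 1
    elseif c >= 0xC2 and c <= 0xDF then
      len, lo, hi = 2, 0x80, 0xBF
    elseif c == 0xE0 then
      len, lo, hi = 3, 0xA0, 0xBF   -- below 0xA0 would be overlong
    elseif c == 0xED then
      len, lo, hi = 3, 0x80, 0x9F   -- above 0x9F would be a surrogate
    elseif c >= 0xE1 and c <= 0xEF then
      len, lo, hi = 3, 0x80, 0xBF
    elseif c == 0xF0 then
      len, lo, hi = 4, 0x90, 0xBF   -- below 0x90 would be overlong
    elseif c >= 0xF1 and c <= 0xF3 then
      len, lo, hi = 4, 0x80, 0xBF
    elseif c == 0xF4 then
      len, lo, hi = 4, 0x80, 0x8F   -- above 0x8F would exceed U+10FFFF
    else
      -- 0x80..0xC1 and 0xF5..0xFF can never start a sequence
      return false
    end
    -- sequence must not run past the end of the string
    if i + len - 1 > n then return false end
    for k = 1, len - 1 do
      local b = s:byte(i + k)
      local min, max = 0x80, 0xBF
      if k == 1 then min, max = lo, hi end
      if b < min or b > max then return false end
    end
    i = i + len
  end
  return true
end
```

The encoder could then call this once per string: if it returns true, escape the string as ordinary JSON text; if it returns false, fall back to base64. This only checks that the bytes are well-formed UTF-8. It does not try to filter out noncharacters or unassigned code points, so treat it as a starting point rather than a complete answer to point (1).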