lua-l archive


As I understand it, UTF-8 essentially incorporates 7-bit ASCII. If the 8th bit
is set, then its value is incremented by the next byte (and so on, up to 6
bytes). Am I wrong?

So any 7-bit ASCII string is effectively a "UTF-8" string? But not
vice-versa...
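
To make the byte layout concrete, here is a minimal sketch in Python (chosen only for illustration; the thread names no language). It encodes one code point by hand: values below 0x80 come out as the same single ASCII byte, and for larger values the lead byte announces the sequence length while each following byte carries six payload bits behind a 10xxxxxx prefix. The bits are concatenated, not added to the previous byte. The function name `encode_utf8` is hypothetical, and the sketch assumes the current four-byte limit of RFC 3629 rather than the original six-byte design.

```python
def encode_utf8(cp: int) -> bytes:
    """Hand-rolled UTF-8 encoder for a single code point (illustrative only)."""
    # Reject values outside Unicode and the UTF-16 surrogate range.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x80:
        # 0xxxxxxx: plain 7-bit ASCII, one byte, unchanged.
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx: 11 payload bits.
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx: 16 payload bits.
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 21 payload bits.
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

For example, `encode_utf8(0x41)` gives the single byte 0x41 ("A"), while `encode_utf8(0x20AC)` (the euro sign) gives the three bytes E2 82 AC. So every 7-bit ASCII string is already valid UTF-8, but a UTF-8 string is ASCII only if it contains no byte with the high bit set.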

Roberto Ierusalimschy said:
> > But two identical utf-8 characters can have different encoding, right? 
> 
> No. I mean, if they have the same unicode number, they must have the
> same utf-8 encoding.
> 
> -- Roberto
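
A small Python sketch of the point above, under the assumption that a strict decoder is used: the bit patterns would allow a longer "overlong" spelling of a code point, for instance 0xC0 0xAF for "/", but the standard requires the shortest form, so a conforming decoder rejects it and each code point has exactly one valid encoding. The helper name `is_valid_utf8` is hypothetical; `bytes.decode` and `UnicodeDecodeError` are standard Python.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The one valid encoding of U+002F ("/") is the single byte 0x2F.
slash = "/".encode("utf-8")

# An overlong two-byte spelling of the same code point; not valid UTF-8.
overlong_slash = b"\xc0\xaf"

print(slash, is_valid_utf8(slash))
print(overlong_slash, is_valid_utf8(overlong_slash))
```

A lenient or hand-written decoder might still accept the overlong form, which is why strict validation matters when comparing strings byte by byte.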