- Subject: Re: Database connectivity
- From: Jan Behrens <jbe-lua-l@...>
- Date: Fri, 20 Feb 2015 00:18:53 +0100
On Thu, 19 Feb 2015 14:05:46 -0800
William Ahern <william@25thandClement.com> wrote:
> On Thu, Feb 19, 2015 at 04:05:46PM -0500, Daurnimator wrote:
> > So, I probably would not use your library unless you put quite a
> > lot of effort in; This would include things like non-blocking
> > forms, and consistent encodings (e.g. MySQL you need utf8mb4).
> Wow. I just read this page
> where it says
> The character set named utf8 uses a maximum of three bytes per
> character and contains only BMP characters. As of MySQL
> 5.5.3, the utf8mb4 character set uses a maximum of four bytes per
> character and supports supplemental characters.
> For a supplementary character, utf8 cannot store the
> character at all, while utf8mb4 requires four bytes to store it.
> Since utf8 cannot store the character at all, you do not have any
> supplementary characters in utf8 columns and you need not worry about
> converting characters or losing data when upgrading utf8 data from
> older versions of MySQL.
> That's horrendous. I didn't think my opinion about the quality of
> MySQL could sink any lower. They can't even do the honorable thing
> and explicitly say (rather than leave it implied) that the problem
> isn't with UTF-8, but "utf8", a bastardized encoding with a
> confusingly similar name.
Not just an issue of MySQL, I believe.
See also http://en.wikipedia.org/wiki/CESU-8 and
Unicode Technical Report #26: http://www.unicode.org/reports/tr26/
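
To make the difference concrete, here is a rough Python sketch (not
taken from MySQL or from the report). It shows that a supplementary
character needs four bytes in real UTF-8, checks whether a string
contains only BMP characters (the limit the quoted documentation gives
for MySQL's "utf8"), and builds the six-byte CESU-8 form, which encodes
each half of the UTF-16 surrogate pair as its own three-byte sequence.
The function names utf8_len, is_bmp_only and cesu8_encode are made up
for illustration.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in standard UTF-8."""
    return len(ch.encode("utf-8"))


def is_bmp_only(text: str) -> bool:
    """True if every code point is in the BMP (<= U+FFFF).

    Per the quoted MySQL documentation, the three-byte "utf8" character
    set can store only such characters.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8 (illustrative helper, not a library API).

    BMP characters are encoded exactly as in UTF-8. Supplementary
    characters are first split into a UTF-16 surrogate pair, and each
    surrogate is then written as a three-byte sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp <= 0xFFFF:
            # "surrogatepass" lets lone surrogates through unchanged
            out += ch.encode("utf-8", "surrogatepass")
        else:
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)    # high (lead) surrogate
            low = 0xDC00 + (cp & 0x3FF)   # low (trail) surrogate
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
    return bytes(out)


if __name__ == "__main__":
    for ch in ("e", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}: utf-8 bytes={utf8_len(ch)}, "
              f"BMP only={is_bmp_only(ch)}, "
              f"cesu-8={cesu8_encode(ch).hex()}")
```

A check like is_bmp_only could be run on input before writing to a
three-byte "utf8" column, but that is only one possible approach.
Switching the column and the connection to utf8mb4, as Daurnimator
suggested, avoids the restriction altogether.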
Public Software Group e. V.
Johannisstr. 12, 10117 Berlin, Germany
vorstand at public-software-group.org
entered in the register of associations
of the Charlottenburg district court (Amtsgericht Charlottenburg)
Registration number: VR 28873 B