- Subject: Re: is string.gmatch(), string.upper() 7-bit ascii only?
- From: Marc Balmer <marc@...>
- Date: Thu, 7 Apr 2016 16:40:08 +0200
> On 07.04.2016 at 15:27, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
>
>> I am trying to manipulate text with umlauts. string.upper() does not produce the upper-case versions of umlauts such as ä, ö, ü.
>>
>> Also, the %g pattern, when used in string.gmatch(), does not match these umlauts.
>>
>> Is there anything that can be done about it? Or, am I making a stupid mistake?
>
> http://www.lua.org/manual/5.3/manual.html#6.4
> [...]
> The string library assumes one-byte character encodings.
>
> If you are using an 8-bit encoding (e.g., LATIN 1), then these should
> work, given a proper locale. Otherwise (e.g., UTF-8), you will need an
> external library.
>
Well, at least on an Ubuntu 14.04 system it does not work. Of course, I don't blame Lua if the toupper() C function supplied by the underlying OS doesn't do the job:
$ sudo locale-gen de_CH
Generating locales...
de_CH.ISO-8859-1... done
Generation complete.
$ lua
Lua 5.3.2 Copyright (C) 1994-2015 Lua.org, PUC-Rio
> = os.setlocale('de_CH.ISO-8859-1')
de_CH.ISO-8859-1
> = string.upper('äöü')
äöü
>
(Expected: 'ÄÖÜ')
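One thing the session above does not rule out is the terminal's own encoding. If the terminal sends UTF-8 (an assumption, nothing in the transcript confirms it), 'äöü' reaches Lua as six bytes rather than three Latin-1 bytes, and toupper() in a Latin-1 locale should leave those bytes alone. A minimal sketch that sidesteps the terminal by spelling the bytes out with decimal escapes; the helper name `bytes` is made up for illustration:

```lua
-- Latin-1 bytes for 'äöü', written as decimal escapes so the
-- terminal's encoding cannot affect what Lua receives.
local latin1 = "\228\246\252"
-- The same text encoded as UTF-8 is six bytes.
local utf8_text = "\195\164\195\182\195\188"

-- Hypothetical helper: render a string as space-separated hex bytes.
local function bytes(s)
  local out = {}
  for i = 1, #s do
    out[#out + 1] = string.format("%02X", s:byte(i))
  end
  return table.concat(out, " ")
end

-- os.setlocale returns nil when the requested locale is not installed;
-- in that case the previous locale stays in effect.
local name = os.setlocale("de_CH.ISO-8859-1", "ctype")
print("locale:", name or "(unavailable, previous locale kept)")
print("latin1:", bytes(latin1), "->", bytes(latin1:upper()))
print("utf8:  ", bytes(utf8_text), "->", bytes(utf8_text:upper()))
```

If the "latin1" line shows C4 D6 DC after the arrow, that would suggest toupper() handles Latin-1 correctly and the input encoding is the likelier culprit; if it stays E4 F6 FC, the locale itself is more likely at fault.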
- MARC