On Wed, Feb 11, 2015 at 11:47 AM, Christian <cn00@gmx.at> wrote:
>
> On 2015-02-11 20:29 Sean Conner wrote:
>>
>> It was thus said that the Great Igor Medeiros once stated:
>>>
>>> Dear contributors,
>>>
>>> Is there a way to convert a string whose characters are encoded in UTF-8
>>> to a string with characters encoded in ISO-8859-1, using only the Lua
>>> standard libraries? I cannot use libraries written in C.
>>>
>>> If there is, could you tell me how to do that, or point me to a site
>>> with this information?
>>
>>    I don't know of any existing Lua code to do this, but the concept is
>> straightforward:
>>
>>         1. Convert the UTF-8 sequence to a Unicode codepoint
>>            (http://en.wikipedia.org/wiki/UTF-8)
>>
>>         2. Convert the Unicode codepoint
>>            (http://www.unicode.org/Public/UCD/latest/charts/CodeCharts.pdf
>>            WARNING: LARGE PDF) to an ISO-8859-1 codepoint
>>            (http://en.wikipedia.org/wiki/ISO/IEC_8859-1)
>>
>>         3. Go back to step 1 if more data.
>>
>>    -spc (That should be enough to get you going ... )
>>
>>
>
> Actually, the first 256 Unicode codepoints (U+0000 to U+00FF) are the same
> as the Latin-1 codepoints; UTF-8 is just a different encoding of them. For
> input limited to that range, the following suffices to do the conversion
> with Lua 5.3:
>
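>     function utf8_to_latin1(s)
>         local r = ''
>         -- utf8.codes yields (byte position, codepoint) pairs
>         for _, c in utf8.codes(s) do
>             -- string.char raises an error if c > 255
>             r = r .. string.char(c)
>         end
>         return r
>     end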
>
>
> HTH, Christian
>

This assumes the output of string.char() is Latin-1. Is that always true,
or does it depend on the system locale?

/s/ Adam
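
For what it's worth, here is a minimal sketch of the same conversion with
explicit handling of codepoints outside Latin-1. It assumes Lua 5.3 or later
(for the utf8 library). The function name utf8_to_latin1_checked and its
optional replacement parameter are made up for illustration and do not come
from the thread. In the reference implementation, string.char simply stores
one byte per argument, so the result consists of byte values 0 to 255;
whether those bytes are then shown as Latin-1 is up to whatever program reads
them.

```lua
-- Sketch only: requires Lua 5.3+ for the utf8 library.
-- utf8_to_latin1_checked is a hypothetical name, not from the thread.
local function utf8_to_latin1_checked(s, replacement)
    local out = {}
    -- utf8.codes raises an error if s is not valid UTF-8
    for _, cp in utf8.codes(s) do
        if cp <= 0xFF then
            -- U+0000..U+00FF map one-to-one onto Latin-1 byte values
            out[#out + 1] = string.char(cp)
        elseif replacement then
            -- caller-supplied substitute for unrepresentable characters
            out[#out + 1] = replacement
        else
            error(("codepoint U+%04X has no Latin-1 equivalent"):format(cp))
        end
    end
    -- table.concat avoids building many intermediate strings
    return table.concat(out)
end
```

Called as utf8_to_latin1_checked("caf\xC3\xA9"), this should give the four
bytes "caf\xE9". Passing "?" as the second argument substitutes a question
mark for anything above U+00FF instead of raising an error.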