- Subject: Re: code page
- From: Marco Antonio Abreu <mabreu.ti@...>
- Date: Wed, 13 May 2009 15:48:42 -0300
The DLL works; it doesn't truncate the values any more. But now the problem of writing wrong chars (garbage) to the destination database is back. In our example, it now writes 'FlÃ¡via' in the field, even with the 'N' flag before the string. Should we resolve it with a gsub substitution, or do you know a better solution?
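For illustration, here is a minimal Python sketch of why 'FlÃ¡via' shows up and what a conversion (rather than a character-by-character gsub) might look like. It assumes the destination field is read using a single-byte Windows code page such as cp1252; the thread does not say which code page the database actually uses, and `utf8_to_codepage` is a hypothetical helper name.

```python
def utf8_to_codepage(raw: bytes, codepage: str = "cp1252") -> bytes:
    """Re-encode UTF-8 bytes into a single-byte code page.

    The default code page is an assumption, not something stated in the thread.
    Characters the code page cannot represent are replaced with '?'.
    """
    return raw.decode("utf-8").encode(codepage, errors="replace")


# The bytes Lua now receives for "Flávia" (7 bytes, "á" takes two).
raw = "Flávia".encode("utf-8")

# Reading those UTF-8 bytes as cp1252 reproduces the garbage seen in the field.
garbled = raw.decode("cp1252")
print(garbled)  # prints FlÃ¡via

# Converting before the write gives one byte per character again.
converted = utf8_to_codepage(raw)
print(len(raw), len(converted))
```

A gsub table would need an entry for every accented character that can occur, while a conversion of this kind covers the whole code page in one step.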
2009/5/13 Ignacio Burgueño <firstname.lastname@example.org>
Kind of. In fact the problem is that LuaCOM is truncating characters.
David Given wrote:
Marco Antonio Abreu wrote:
When a field
value has one accented char, it truncates the last one ('Flávia' comes
out like 'Fl??vi' - ?? are special chars); if the text has two accented
chars, it has the last two chars cut off, and so on...
This is a classic symptom of UTF-8 misparsing.
The issue is this. There's a function to convert from BSTR (UTF-16 strings, as used by COM) to Lua strings.
When converting "Flávia", it computes its size (6 characters) and converts to UTF-8 (which gives a 7-byte string, shown as FlÃ¡via when each byte is read as one character), BUT it pushes just 6 bytes to Lua (instead of the required 7).
So the strings got truncated depending on the number of non-ASCII characters present (roughly).
I'll push a fix for that to LuaCOM.
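The length mismatch described above can be reproduced in a few lines of Python. This is only a sketch of the behaviour as described in this message, not LuaCOM's actual code; the function names are hypothetical, and Python's `len()` counts code points, which matches the UTF-16 length only for characters in the Basic Multilingual Plane.

```python
def buggy_bstr_to_lua(text: str) -> bytes:
    """Mimic the reported bug: size the output by character count."""
    char_count = len(text)        # 6 for "Flávia"
    utf8 = text.encode("utf-8")   # 7 bytes, because "á" needs two
    return utf8[:char_count]      # pushes 6 bytes, so the final byte is lost


def fixed_bstr_to_lua(text: str) -> bytes:
    """Size the output by the length of the converted byte string."""
    utf8 = text.encode("utf-8")
    return utf8[:len(utf8)]       # pushes all 7 bytes


truncated = buggy_bstr_to_lua("Flávia")
complete = fixed_bstr_to_lua("Flávia")
print(len(truncated), len(complete))
```

Each two-byte character costs one trailing byte in the buggy version, which matches the report that two accented characters cut off the last two characters.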
Marco Antonio Abreu
Analista de Sistemas