Hi Ignacio,
The DLL works; it doesn't truncate the values any more. But now the problem of writing wrong chars (garbage) to the destination database is back. In our example, it now writes 'Flávia' in the field even with the 'N' flag before the string. Should we resolve it with a gsub substitution, or do you know a better solution?
Thanks,
Marco


2009/5/13 Ignacio Burgueño <ignaciob@inconcertcc.com>
David Given wrote:
Marco Antonio Abreu wrote:
When a field
value has one accented char, it truncates the last one ('Flávia' comes
out as 'Fl??vi', where ?? are special chars); if the text has two accented
chars, the last two chars are cut off, and so on...

This is a classic symptom of UTF-8 misparsing.

Kind of. In fact the problem is that LuaCOM is truncating characters.


The issue is this. There's a function to convert from BSTR (UTF-16 strings, as used by COM) to Lua strings.
When converting "Flávia", it computes the string's length in UTF-16 code units (6) and converts it to UTF-8 (which gives a 7-byte string, because "á" takes two bytes), but it then pushes just 6 bytes to Lua instead of the required 7.

So strings get truncated by (roughly) one byte for each non-ASCII code point they contain.
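
To make the length mismatch concrete, here is a minimal sketch in Python. It is not LuaCOM's actual code (LuaCOM is not written in Python), and the function names are made up for illustration; it only mimics the behaviour described above, counting UTF-16 code units and then keeping that many bytes of the UTF-8 result.

```python
# Illustrative sketch only: mimics the described bug, not LuaCOM's real code.
# All function names here are hypothetical.

def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units a UTF-16 string (such as a BSTR) needs."""
    return len(text.encode("utf-16-le")) // 2

def buggy_convert(text: str) -> bytes:
    """Convert to UTF-8 but keep only as many bytes as there were UTF-16 units."""
    utf8 = text.encode("utf-8")
    return utf8[:utf16_code_units(text)]

def fixed_convert(text: str) -> bytes:
    """Convert to UTF-8 and keep every byte of the converted result."""
    return text.encode("utf-8")

name = "Fl\u00e1via"  # "Flávia"
print(utf16_code_units(name))    # 6
print(len(fixed_convert(name)))  # 7
print(buggy_convert(name))       # b'Fl\xc3\xa1vi'  (the trailing "a" is lost)
```

The point of the sketch is that the length handed to Lua has to be the byte length of the converted UTF-8 buffer, not the length of the UTF-16 source.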

I'll push a fix for that to LuaCOM.

Regards,
Ignacio Burgueño




--
Marco Antonio Abreu
Analista de Sistemas