- Subject: Re: removing diacritical marks from strings
- From: "Eduardo Ochs" <eduardoochs@...>
- Date: Mon, 16 Apr 2007 03:29:36 -0300
Hi Norman,
> Does anybody happen to have a mapping (or an algorithm) that will
> remove diacritical marks from strings? I'd like to convert Jérôme to
> Jerome and that sort of thing.
I use this:
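-- Replace every character matched by the pattern `re` with its
-- entry in `tbl`; characters with no entry are left unchanged.
translatechars = function (str, re, tbl)
  -- The outer parentheses drop gsub's second result (the match count).
  return (string.gsub(str, re, function (c) return tbl[c] or c end))
end

-- Both strings must use one byte per character (e.g. Latin-1) so
-- that position i in one lines up with position i in the other.
unaccent_from, unaccent_to =
  "ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüý",
  "AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuy"

-- Build the lookup table: accented character -> plain ASCII letter.
unaccent_table = {}
for i = 1,string.len(unaccent_from) do
  unaccent_table[string.sub(unaccent_from, i, i)] =
    string.sub(unaccent_to, i, i)
end

-- Bytes 192-254 are where Latin-1 puts its accented letters.
unaccent_re = "([\192-\254])"

unaccent = function (str)
  return translatechars(str, unaccent_re, unaccent_table)
end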
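One caveat: the table is built byte by byte, so this only works when the
strings are in a single-byte encoding such as Latin-1. If your strings are
UTF-8, each of these accented letters takes two bytes and the positions no
longer line up. Below is a rough sketch of the same lookup-table idea for
UTF-8. It assumes the source file and the input are both UTF-8, and the
names unaccent_utf8, utf8_unaccent_table and two_byte are just made up for
this example. I have not tried it beyond the characters listed.

```lua
-- Sketch only: assumes this file and all input strings are UTF-8.
local from = "ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüý"
local to   = "AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuy"

-- U+00C0..U+00FF are encoded in UTF-8 as byte 195 followed by one
-- byte in the range 128..191, so this pattern matches one such character.
local two_byte = "\195[\128-\191]"

-- Walk `from` one two-byte character at a time and pair it with the
-- single-byte replacement at the same position in `to`.
local utf8_unaccent_table = {}
local i = 0
for ch in string.gmatch(from, two_byte) do
  i = i + 1
  utf8_unaccent_table[ch] = string.sub(to, i, i)
end

-- Hypothetical UTF-8 counterpart of unaccent; characters that are not
-- in the table (for example U+00D7, the multiplication sign) pass through.
unaccent_utf8 = function (str)
  return (string.gsub(str, two_byte, function (c)
    return utf8_unaccent_table[c] or c
  end))
end

print(unaccent_utf8("Jérôme"))  --> Jerome
```

This still covers only the Latin-1 letters in the list above; letters such
as ł or ő, or text that uses combining accents, would pass through unchanged.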
Cheers,
Eduardo Ochs
http://angg.twu.net/
eduardoochs@gmail.com