On Oct 28, 2005, at 05:50, Rici Lake wrote:

On 27-Oct-05, at 2:30 PM, Walter Cruz wrote:

Hi all. Some days ago, someone sent a mail to the list asking for an htmlentities function.

Well, I have been thinking about how to get a more complete list of HTML entities to use as the translation table.

There is a complete (and official) list at www.w3.org for each version of HTML (the lists are very similar).

Possibly a naive question, but here we go :)

Assuming a long list of character substitutions (e.g. "Ä" -> "A", etc):

http://dev.alt.textdrive.com/file/lu/LUStringBasicLatin.txt

What would be a reasonable implementation to actually perform the substitutions?

Right now, I invoke gsub() repeatedly, once for each substitution pair:

function self:basicLatin( aString )
        if aString ~= nil then
                local aMap = self:basicLatinMap()

                for aKey, aValue in aMap:iterator() do
                        aString = aString:gsub( aKey, aValue )
                end
        end

        return aString
end

This works, but quite slowly :)

The strings themselves are UTF-8 encoded, but I'm rather at a loss as to how to define a pattern that breaks them into meaningful 'characters', so that gsub() can work in terms of that pattern instead of each key.
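One possibility, as a rough sketch only: match each multi-byte UTF-8 sequence with a single pattern and look it up in a plain table, so the string is scanned once rather than once per key. The table contents and the pattern below are assumptions made for illustration (the real map would come from LUStringBasicLatin.txt), and the pattern does not validate the UTF-8, it only groups a lead byte with the continuation bytes that follow it.

```lua
-- Hypothetical substitution table, for illustration only.
-- Keys are UTF-8 encoded characters written as decimal byte escapes.
local aMap = {
    ["\195\132"] = "A",  -- Ä (U+00C4)
    ["\195\169"] = "e",  -- é (U+00E9)
    ["\195\159"] = "ss", -- ß (U+00DF)
}

-- A lead byte (192-255) followed by any continuation bytes (128-191).
-- This matches one multi-byte character and skips plain ASCII entirely.
local aPattern = "[\192-\255][\128-\191]*"

local function basicLatin( aString )
    if aString == nil then
        return nil
    end

    -- Single pass: characters missing from the map are left untouched.
    local aResult = aString:gsub( aPattern, function( aChar )
        return aMap[ aChar ] or aChar
    end )

    return aResult
end

print( basicLatin( "\195\132pfel und Stra\195\159e" ) ) --> Apfel und Strasse
```

With this shape the cost depends on the length of the string rather than on the number of substitution pairs, and the keys are never interpreted as patterns.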

Ideas?

Cheers

--
PA, Onnay Equitursay
http://alt.textdrive.com/