- Subject: Re: Problem with table key iteration when string keys are unpredictable
- From: "S. Fisher" <expandafter@...>
- Date: Tue, 21 May 2013 03:48:45 -0700 (PDT)
--- On Mon, 5/20/13, marbux <marbux@gmail.com> wrote:
> I'm working on an autoreplace script and am hoping for a tip that
> might get me past a problem in unpredictability of key names. (Users
> enter key/value pairs in a GUI to build a table of strings that will
> replace other strings.)
>
> Consider:
>
> function Replace_Substrings_in_String(s, tSubs)
>   for k, v in pairs(tSubs) do
>     s = string.gsub(s, k, v)
>   end
>   return s
> end -- function
>
> tSubs = {
>   ["---"] = "—", -- em dash
>   ["--"] = "–", -- en dash
>   ["sss"] = "§", -- section
>   ["ssss"] = "§§", -- sections
>   ["ppp"] = "¶", -- paragraph
>   ["pppp"] = "¶¶", -- paragraphs
> }
>
> s = Replace_Substrings_in_String(s, tSubs)
>
> (Special characters are UTF-8.)
>
> Because the order in which Lua returns non-array keys is
> unpredictable, this type of substitution is problematic. For example,
> if Lua returns the key for the en dash (two hyphens) before the key
> for the em dash (three hyphens), the script will produce instead of an
> em dash an en dash trailed by a hyphen.
>
> So my question is how I can assure that when multiple abbreviations
> share the same leading sequence of identical characters, the keys are
> processed in longest to shortest order? (I don't anticipate any
> problems if all keys were processed in longest to shortest order.)
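-- Escape every Lua pattern magic character so that a key is matched literally.
-- The outer parentheses discard gsub's second return value (the match count).
function escape( s )
  return ( string.gsub( s, "[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0" ) )
end

function substitute( s, substitutions )
  -- Collect the keys and sort them longest first, so "---" is tried before "--".
  local sorted = {}
  for k, _ in pairs( substitutions ) do
    table.insert( sorted, k )
  end
  table.sort( sorted, function (a,b) return #a > #b end )
  for _, key in ipairs( sorted ) do
    -- With a table as the replacement, gsub looks up each match in it.
    s = string.gsub( s, escape(key), substitutions )
  end
  return s
end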
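
A quick way to check the longest-first idea is the self-contained sketch below. It is only an illustration under a few assumptions: the names escape_pattern and replace_longest_first are hypothetical, the ASCII placeholders such as "<em>" stand in for the UTF-8 characters in the original table, and "%p" is used as a shortcut to escape all punctuation (Lua accepts "%" in front of any non-alphanumeric character).

```lua
-- Hypothetical helper names, for illustration only.

-- Put "%" in front of every punctuation character so the key is
-- matched literally rather than as a Lua pattern.
local function escape_pattern(s)
  -- Parentheses drop gsub's second return value (the match count).
  return (s:gsub("%p", "%%%0"))
end

local function replace_longest_first(s, subs)
  local keys = {}
  for k in pairs(subs) do
    keys[#keys + 1] = k
  end
  -- Longest keys first, so "---" is handled before "--".
  table.sort(keys, function(a, b) return #a > #b end)
  for _, key in ipairs(keys) do
    -- A function replacement is inserted verbatim, so a "%" in a
    -- user-entered value is not treated specially.
    s = s:gsub(escape_pattern(key), function() return subs[key] end)
  end
  return s
end

-- Placeholder values (assumed) instead of the UTF-8 characters.
local subs = {
  ["---"]  = "<em>",
  ["--"]   = "<en>",
  ["sss"]  = "<S>",
  ["ssss"] = "<SS>",
}

local out = replace_longest_first("a---b--c ssss sss", subs)
print(out)  --> a<em>b<en>c <SS> <S>
```

Keys of equal length never contain one another unless they are identical, so the order among ties should not matter. A replacement value that itself contains a shorter key would still be rewritten by a later pass; the sketch does not try to guard against that.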