--- On Mon, 5/20/13, marbux <marbux@gmail.com> wrote:


> I'm working on an autoreplace script and am hoping for a tip that
> might get me past a problem caused by the unpredictability of key
> names. (Users enter key/value pairs in a GUI to build a table of
> strings that will replace other strings.)
> 
> Consider:
> 
> function Replace_Substrings_in_String(s, tSubs)
>   for k, v in pairs(tSubs) do
>     s = string.gsub(s, k, v)
>   end
>   return s
> end -- function
> 
> tSubs = {
>     ["---"] = "—",   -- em dash
>     ["--"] = "–",    -- en dash
>     ["sss"] = "§",   -- section
>     ["ssss"] = "§§", -- sections
>     ["ppp"] = "¶",   -- paragraph
>     ["pppp"] = "¶¶", -- paragraphs
> }
> 
> s = Replace_Substrings_in_String(s, tSubs)
> 
> (Special characters are UTF-8.)
> 
> Because the order in which Lua returns non-array keys is
> unpredictable, this type of substitution is problematic. For example,
> if Lua returns the key for the en dash (two hyphens) before the key
> for the em dash (three hyphens), the script will produce an en dash
> trailed by a hyphen instead of an em dash.
> 
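
The failure described above can be reproduced without relying on
pairs() at all, by applying the two dash replacements in a fixed order.
This is only an illustrative sketch: the names EN_DASH, EM_DASH, input,
wrong and right are made up for the example, and the hyphens are
escaped as "%-" because "-" is a magic character in Lua patterns.

```lua
local EN_DASH = "\226\128\147"  -- UTF-8 bytes of U+2013
local EM_DASH = "\226\128\148"  -- UTF-8 bytes of U+2014
local input = "a---b"

-- Shorter key first: "--" consumes two of the three hyphens, so the
-- later "---" pass finds nothing left to replace.
local wrong = input:gsub("%-%-", EN_DASH)
wrong = wrong:gsub("%-%-%-", EM_DASH)

-- Longer key first: "---" is replaced as a unit before "--" is tried.
local right = input:gsub("%-%-%-", EM_DASH)
right = right:gsub("%-%-", EN_DASH)

print(wrong)  -- "a", en dash, hyphen, "b"
print(right)  -- "a", em dash, "b"
```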

> So my question is how I can ensure that, when multiple abbreviations
> share the same leading sequence of identical characters, the keys are
> processed in longest-to-shortest order. (I don't anticipate any
> problems if all keys were processed in longest-to-shortest order.)

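-- Escape Lua pattern magic characters so the key is matched literally.
function escape( s )
  -- The outer parentheses discard gsub's second result (the match count).
  return ( string.gsub( s, "[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0" ) )
end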

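function substitute(s, substitutions)
  -- Collect the keys so they can be ordered; pairs() order is unspecified.
  local sorted = {}
  for k, _ in pairs( substitutions ) do
    table.insert( sorted, k )
  end
  -- Longest keys first, so "---" is replaced before "--".
  table.sort( sorted, function (a,b) return #a > #b end )
  for _, key in ipairs( sorted ) do
    -- With a table as the replacement, gsub looks up the matched text in it.
    s = string.gsub(s, escape(key), substitutions)
  end
  return s
end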
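
As a rough, self-contained check of the longest-first idea, the sketch
below runs it over part of the table from the question. The names
escape_pattern and substitute_longest_first are hypothetical stand-ins
so the block runs on its own, the sample input string is invented, and
the alphabetical tie-break is an added assumption whose only purpose is
to make the processing order repeatable.

```lua
-- Escape every Lua pattern magic character so the key matches literally.
local function escape_pattern(key)
  -- Extra parentheses drop gsub's second return value (the match count).
  return (key:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Apply all substitutions, longest key first.
local function substitute_longest_first(s, substitutions)
  local keys = {}
  for key in pairs(substitutions) do
    keys[#keys + 1] = key
  end
  -- Longest first; equal lengths are ordered alphabetically.
  table.sort(keys, function(a, b)
    if #a ~= #b then return #a > #b end
    return a < b
  end)
  for _, key in ipairs(keys) do
    -- A table replacement is indexed by the matched text, and the value
    -- found is inserted as-is (no "%" processing of the replacement).
    s = s:gsub(escape_pattern(key), substitutions)
  end
  return s
end

local tSubs = {
  ["---"]  = "\226\128\148",      -- em dash, U+2014
  ["--"]   = "\226\128\147",      -- en dash, U+2013
  ["sss"]  = "\194\167",          -- section sign, U+00A7
  ["ssss"] = "\194\167\194\167",  -- two section signs
}

local result = substitute_longest_first("1--2 a---b sss ssss", tSubs)
print(result)
```

Each key still gets its own pass over the string, so a replacement
value that itself contains a shorter key would be rewritten again by a
later pass. That does not happen with the table above, but it may
matter for arbitrary user-entered pairs.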