Hi, All,

I'm working on an autoreplace script and am hoping for a tip that
might get me past a problem with the unpredictable order in which Lua
returns table keys. (Users enter key/value pairs in a GUI to build a
table of strings that will replace other strings.)

Consider:

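function Replace_Substrings_in_String(s, tSubs)
  -- pairs() visits the keys of tSubs in no defined order
  for k, v in pairs(tSubs) do
    -- note: gsub treats k as a Lua pattern, so "-" is a magic character here
    s = string.gsub(s, k, v)
  end
  return s
end -- function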

tSubs = {
    ["---"] = "—", -- em dash
    ["--"] = "–",   -- en dash
    ["sss"] = "§", -- section
    ["ssss"] = "§§", -- sections
    ["ppp"] = "¶", -- paragraph
    ["pppp"] = "¶¶", -- paragraphs
}

s = Replace_Substrings_in_String(s, tSubs)

(Special characters are UTF-8.)

Because the order in which Lua returns non-array keys is
unpredictable, this type of substitution is problematic. For example,
if Lua returns the key for the en dash (two hyphens) before the key
for the em dash (three hyphens), the script will produce an en dash
followed by a hyphen instead of an em dash.
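
To make the failure concrete, here is a minimal sketch that forces both
orders by using an explicit list instead of pairs(). The helper names
replace_plain and apply_in_order are hypothetical, not part of my script.
It uses string.find with the plain flag, so the hyphens are matched
literally rather than as pattern characters.

```lua
-- Hypothetical helper: replace every literal occurrence of `from` with `to`.
-- Assumes `from` is a non-empty string.
local function replace_plain(s, from, to)
  local out, pos = {}, 1
  while true do
    -- fourth argument `true` turns off pattern matching
    local i, j = string.find(s, from, pos, true)
    if not i then break end
    out[#out + 1] = string.sub(s, pos, i - 1)
    out[#out + 1] = to
    pos = j + 1
  end
  out[#out + 1] = string.sub(s, pos)
  return table.concat(out)
end

-- Hypothetical helper: apply {from, to} pairs in the order given.
function apply_in_order(s, ordered)
  for _, pair in ipairs(ordered) do
    s = replace_plain(s, pair[1], pair[2])
  end
  return s
end

-- Shorter key first: the em dash can no longer match.
print(apply_in_order("a---b", { {"--", "–"}, {"---", "—"} }))  --> a–-b
-- Longer key first: the intended result.
print(apply_in_order("a---b", { {"---", "—"}, {"--", "–"} }))  --> a—b
```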

I'm particularly concerned with this problem because such
abbreviations are used nearly universally by power users of word
processors and thus the likelihood is high that my users will create
them.

So my question is: how can I ensure that when multiple abbreviations
share the same leading sequence of identical characters, the keys are
processed in longest-to-shortest order?  (I don't anticipate any
problems if all keys are processed in longest-to-shortest order.)
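
One way the longest-to-shortest idea might be sketched: copy the keys
into an array, sort that array by length, and walk it with ipairs(). This
is only an illustration under a few assumptions. The names
Replace_Longest_First and escape_pattern are hypothetical, length is
measured in bytes with the # operator (which still puts a key ahead of
any of its own prefixes), and keys and values are meant to be taken
literally, so pattern magic characters in keys and "%" in values are
escaped before calling gsub.

```lua
-- Hypothetical helper: escape Lua pattern magic characters so a key
-- is matched literally by string.gsub.
local function escape_pattern(key)
  return (key:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0"))
end

-- Hypothetical variant of Replace_Substrings_in_String that processes
-- keys from longest to shortest.
function Replace_Longest_First(s, tSubs)
  -- collect the keys into an array so they can be sorted
  local keys = {}
  for k in pairs(tSubs) do
    keys[#keys + 1] = k
  end

  -- longest first; ties broken by string comparison so the order is stable
  table.sort(keys, function(a, b)
    if #a ~= #b then return #a > #b end
    return a < b
  end)

  for _, k in ipairs(keys) do
    -- "%" is special in gsub replacement strings, so double it
    local repl = tSubs[k]:gsub("%%", "%%%%")
    s = s:gsub(escape_pattern(k), repl)
  end
  return s
end

print(Replace_Longest_First("a---b--c", { ["---"] = "—", ["--"] = "–" }))  --> a—b–c
```

Sorting every key by length is more than strictly needed, since only keys
that share a prefix can conflict, but it is simple and covers that case.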

Thanks in advance,

Paul