String Interpolation

When variables need to be interpolated in strings, the resultant quoting and unquoting can become slightly unwieldy:

print("Hello " .. name .. ", the value of key " .. k .. " is " .. v .. "!")

Compare to Perl, where variables can be embedded in strings:

print "Hello $name, the value of key $k is $v!\n";

The complaint about the Lua version is that the quoting is verbose and makes the code harder to read, since it is difficult to see at a glance which text is inside the quotes and which is outside. Besides using an editor with syntax highlighting, the latter issue might be improved with a bracketed quoting style:

print([[Hello ]] .. name .. [[, the value of key ]] ..
      k .. [[ is ]] .. v .. [[!]])

This might also be made more terse with string.format:

print(string.format("Hello %s, the value of key %s is %s", name, k, v))

possibly using a helper function:

function printf(...) print(string.format(...)) end

printf("Hello %s, the value of key %s is %s", name, k, v)

The new problem is that the variables are identified positionally, which hurts readability and maintainability when the number of variables is large.

The following solutions show how to implement support for interpolating variables into strings in Lua to achieve a syntax somewhat like this:

printf("Hello %(name), the value of key %(k) is %(v)")

Solution: Named Parameters in Table

Here's one simple implementation (-- RiciLake):

function interp(s, tab)
  return (s:gsub('($%b{})', function(w) return tab[w:sub(3, -2)] or w end))
end
print( interp("${name} is ${value}", {name = "foo", value = "bar"}) )

getmetatable("").__mod = interp
print( "${name} is ${value}" % {name = "foo", value = "bar"} )
-- Outputs "foo is bar"
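
One detail of the implementation above: the looked-up value is handed straight back to gsub, so a value of false is treated as missing, and a value of true or a table makes gsub raise an error. A minimal sketch of a variant that converts every non-nil value with tostring follows. The name interp_tostring is hypothetical and not part of the original code.

```lua
-- Hypothetical variant of interp (the name is an assumption for illustration).
-- Every non-nil value is converted with tostring; unknown keys are left as-is.
function interp_tostring(s, tab)
  return (s:gsub('%${([^}]*)}', function(key)
    local value = tab[key]
    if value == nil then
      return nil  -- returning nil makes gsub keep the original "${key}" text
    end
    return tostring(value)
  end))
end

print(interp_tostring("${name} is ${flag}, ${missing} stays",
                      {name = "foo", flag = false}))
-- Outputs "foo is false, ${missing} stays"
```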

Solution: Named Parameters with Formatting Codes

Here's another implementation (-- RiciLake) supporting Pythonic formatting specifications (requires Lua 5.1 or greater):

function interp(s, tab)
  return (s:gsub('%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])',
            function(k, fmt) return tab[k] and ("%"..fmt):format(tab[k]) or
                '%('..k..')'..fmt end))
end
getmetatable("").__mod = interp
print( "%(key)s is %(val)7.2f%" % {key = "concentration", val = 56.2795} )
-- outputs "concentration is   56.28%"
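
To see how the pattern above splits a placeholder, it can help to run it through string.match on its own. The sketch below only illustrates the two captures (the key name and the format code); the variable names are chosen for illustration.

```lua
-- Same pattern as in interp above: "%(" name ")" followed by a format code.
local pattern = '%%%((%a%w*)%)([-0-9%.]*[cdeEfgGiouxXsq])'

-- The first capture is the table key, the second is the format specification.
local key, fmt = ("%(val)7.2f"):match(pattern)
print(key, fmt)

-- The format code is then prefixed with "%" and applied to the value.
local formatted = ("%" .. fmt):format(56.2795)
print("[" .. formatted .. "]")
-- Outputs "[  56.28]"
```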

Solution: Named Parameters and Format String in Same Table

Here's another Lua-only solution (-- MarkEdgar):

function replace_vars(str, vars)
  -- Allow replace_vars{str, vars} syntax as well as replace_vars(str, {vars})
  if not vars then
    vars = str
    str = vars[1]
  end
  return (string.gsub(str, "({([^}]+)})",
    function(whole, i)
      return vars[i] or whole
    end))
end

-- Example:
output = replace_vars{
  [[Hello {name}, welcome to {company}. ]],
  name = name,
  company = get_company_name()
}

Solution: Ruby- and Python-like string formatting with % operator

Both Ruby and Python have a short form for string formatting, using the % operator.

The following snippet adds a similar use of the mod operator to Lua:

getmetatable("").__mod = function(a, b)
        if not b then
                return a
        elseif type(b) == "table" then
                return string.format(a, unpack(b))
        else
                return string.format(a, b)
        end
end

Example usage:


print( "%5.2f" % math.pi )

print( "%-10.10s %04d" % { "test", 123 } )

You might like or dislike this notation; choose for yourself.
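
Note that the global unpack used above exists in Lua 5.1 only; from Lua 5.2 on it lives at table.unpack. A sketch of the same operator that should run on either version, assuming nothing beyond the standard library:

```lua
-- table.unpack in Lua 5.2 and later, the global unpack in Lua 5.1.
local unpack = table.unpack or unpack

getmetatable("").__mod = function(fmt, args)
  if args == nil then
    return fmt                                -- nothing to format
  elseif type(args) == "table" then
    return string.format(fmt, unpack(args))   -- several values
  else
    return string.format(fmt, args)           -- a single value
  end
end

print("%5.2f" % math.pi)
-- Outputs " 3.14"
print("%-10.10s %04d" % {"test", 123})
```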

Hack: Using debug to Access Lexicals

Below is a more complex implementation (-- DavidManura). This makes use of the debug library (particularly debug.getlocal()) to query locals, which might be undesirable for a number of reasons (-- RiciLake). First, it can be used to break into things you shouldn't break into, so it's a bad idea if untrusted code is being run. Second, debug.getlocal() is expensive, since it needs to scan through the entire bytecode to figure out which variables are in scope. Third, it does not capture closed variables (upvalues).

Code:

-- "nil" value that can be stored in tables.
local mynil_mt = {__tostring = function() return tostring(nil) end}
local mynil = setmetatable({}, mynil_mt)

-- Retrieves table of all local variables (name, value)
-- in given function <func>.  If a value is nil, it instead
-- stores the value <mynil> in the table to distinguish a
-- local variable that is nil from the local variable not
-- existing.
-- If a number is given in place of <func>, then it
-- uses that level in the call stack.  Level 1 is the
-- function that called get_locals.
-- Note: this correctly handles the case where two locals have the
-- same name: "local x = 1 ... get_locals(1) ... local x = 2".
-- This function is similar to, and based on, debug.getlocal().
function get_locals(func)
  local n = 1
  local locals = {}
  func = (type(func) == "number") and func + 1 or func
  while true do
    local lname, lvalue = debug.getlocal(func, n)
    if lname == nil then break end  -- end of list
    if lvalue == nil then lvalue = mynil end  -- replace
    locals[lname] = lvalue
    n = n + 1
  end
  return locals
end


-- Interpolates variables into string <str>.
-- Variables are defined in table <table>.  If <table> is
-- omitted, then it uses local and global variables in the
-- calling function.
-- Optional <level> indicates the level in the call stack to
-- obtain local variables from (1 if omitted).
function interp(str, table, level)
  local use_locals = (table == nil)
  table = table or getfenv(2)
  if use_locals then
    level = level or 1
    local locals = get_locals(level + 1)
    table = setmetatable(locals, {__index = table})
  end
  local out = string.gsub(str, '$(%b{})',
    function(w)
      local variable_name = string.sub(w, 2, -2)
      local variable_value = table[variable_name]
      if variable_value == mynil then variable_value = nil end
      return tostring(variable_value)
    end
  )
  return out
end

-- Interpolating print.
-- This is just a wrapper around print and interp.
-- It only accepts a single string argument.
function printi(str)
  print(interp(str, nil, 2))
end

-- Pythonic "%" operator for string interpolation.
getmetatable("").__mod = interp
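
As noted earlier, debug.getlocal() does not see closed variables (upvalues). One way they might be collected is with debug.getinfo() and debug.getupvalue(). The helper name get_upvalues below is hypothetical and not part of the code above; this is a sketch, not a drop-in extension of interp.

```lua
-- Hypothetical helper (the name is an assumption): collects the upvalues of
-- the function running at call-stack level <level>, where level 1 is the
-- function that called get_upvalues. Upvalues holding nil are not stored.
function get_upvalues(level)
  local info = debug.getinfo(level + 1, "f")
  local upvalues = {}
  local n = 1
  while true do
    local name, value = debug.getupvalue(info.func, n)
    if name == nil then break end  -- end of list
    upvalues[name] = value
    n = n + 1
  end
  return upvalues
end

local greeting = "hello"
local function f()
  local used = greeting            -- referencing greeting makes it an upvalue of f
  local ups = get_upvalues(1)      -- not a tail call, so f is still on the stack
  return ups
end
print(f().greeting)
-- Outputs "hello"
```

The call is deliberately not written as "return get_upvalues(1)": a tail call would remove f from the call stack, and the level would then refer to a different function.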

Tests:

-- test globals
x=123
assert(interp "x = ${x}" == "x = 123")

-- test table
assert(interp("x = ${x}", {x = 234}) == "x = 234")

-- test locals (which override globals)
do
  local x = 3
  assert(interp "x = ${x}" == "x = 3")
end

-- test globals using setfenv
function test()
  assert(interp "y = ${y}" == "y = 123")
end
local env = {y = 123}
setmetatable(env, {__index = _G})
setfenv(test, env)
test()

-- test of multiple locals of same name
do
  local z = 1
  local z = 2
  assert(interp "z = ${z}" == "z = 2")
  local z = 3
end

-- test of locals with nil value
do
  z = 2
  local z = 1
  local z = nil
  assert(interp "z = ${z}" == "z = nil")
end

-- test of printi
x = 123
for k, v in ipairs {3,4} do
  printi("${x} - The value of key ${k} is ${v}")
end

-- test of "%" operator
assert("x = ${x}" % {x = 2} == "x = 2")

Various enhancements could be made. For example,

v = {x = 2}
print(interp "v.x = ${v.x}")  -- not implemented
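
A rough sketch of how such dotted names might be resolved when the variables are supplied in a table. The names lookup_path and interp_path are hypothetical, and each path component is treated as a string key.

```lua
-- Hypothetical helper: resolves a dotted name such as "v.x" by walking
-- nested tables starting at <root>. Returns nil if any step is missing.
function lookup_path(root, path)
  local value = root
  for key in path:gmatch("[^%.]+") do
    if type(value) ~= "table" then return nil end
    value = value[key]
  end
  return value
end

-- Hypothetical interpolation function built on lookup_path.
function interp_path(s, tab)
  return (s:gsub('%${([^}]*)}', function(path)
    return tostring(lookup_path(tab, path))
  end))
end

print(interp_path("v.x = ${v.x}", {v = {x = 2}}))
-- Outputs "v.x = 2"
```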

Patch to Lua

One of the features I loved in Ruby and PHP was the ability to include variables inside strings, for example print "Hello ${Name}". The following patch does the same thing, but only for long strings (strings starting with [[ and ending with ]]). It uses the "|" character in place of the opening and closing braces.

To add variables inline, for example:

output = [[Hello |name|, welcome to |get_company_name()|. ]]

What the patch does is quite literally convert the above to:

output = [[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]
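
The rewrite rule can also be illustrated in plain Lua, operating on the text of a long string rather than inside the lexer. The function name expand_bars is hypothetical; this sketch only mirrors the translation shown above and is not the patch itself.

```lua
-- Hypothetical illustration of the rewrite rule: takes the text between
-- [[ and ]] and returns the Lua source of the equivalent concatenation.
function expand_bars(body)
  local parts = {}
  local pos = 1
  while true do
    local s, e, expr = body:find("|([^|]*)|", pos)
    if not s then break end
    parts[#parts + 1] = "[[" .. body:sub(pos, s - 1) .. "]]"  -- literal text
    parts[#parts + 1] = expr                                  -- embedded expression
    pos = e + 1
  end
  parts[#parts + 1] = "[[" .. body:sub(pos) .. "]]"           -- trailing text
  return table.concat(parts, "..")
end

print(expand_bars("Hello |name|, welcome to |get_company_name()|. "))
-- Outputs "[[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]"
```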

The following functions are updated in the llex.c file.

Important Note: I needed another character to represent the closing brace inside the code, and I have arbitrarily chosen ''. This means that if that character appears in your string (especially when you are using a foreign-language encoding), you will get a syntax error. I don't know of a solution to this problem yet.


int luaX_lex (LexState *LS, SemInfo *seminfo) {
  for (;;) {
    switch (LS->current) {

      case '\n': {
        inclinenumber(LS);
        continue;
      }
      case '-': {
        next(LS);
        if (LS->current != '-') return '-';
        /* else is a comment */
        next(LS);
        if (LS->current == '[' && (next(LS), LS->current == '['))
          read_long_string(LS, NULL);  /* long comment */
        else  /* short comment */
          while (LS->current != '\n' && LS->current != EOZ)
            next(LS);
        continue;
      }
      case '[': {
        next(LS);
        if (LS->current != '[') return '[';
        else {
          read_long_string(LS, seminfo);
          return TK_STRING;
        }
      }
      case '=': {
        next(LS);
        if (LS->current != '=') return '=';
        else { next(LS); return TK_EQ; }
      }
      case '<': {
        next(LS);
        if (LS->current != '=') return '<';
        else { next(LS); return TK_LE; }
      }
      case '>': {
        next(LS);
        if (LS->current != '=') return '>';
        else { next(LS); return TK_GE; }
      }
      case '~': {
        next(LS);
        if (LS->current != '=') return '~';
        else { next(LS); return TK_NE; }
      }
      case '"':
      case '\'': {
        read_string(LS, LS->current, seminfo);
        return TK_STRING;
      }

      // added!!!
      //------------------------------
      case '|': {
        LS->current = '';
        return TK_CONCAT;
      }

      case '': {
        read_long_string(LS, seminfo);
        return TK_STRING;
      }
      //------------------------------

      case '.': {
        next(LS);
        if (LS->current == '.') {
          next(LS);
          if (LS->current == '.') {
            next(LS);
            return TK_DOTS;   /* ... */
          }
          else return TK_CONCAT;   /* .. */
        }

        else if (!isdigit(LS->current)) return '.';
        else {
          read_numeral(LS, 1, seminfo);
          return TK_NUMBER;
        }
      }
      case EOZ: {
        return TK_EOS;
      }
      default: {
        if (isspace(LS->current)) {
          next(LS);
          continue;
        }
        else if (isdigit(LS->current)) {
          read_numeral(LS, 0, seminfo);
          return TK_NUMBER;
        }
        else if (isalpha(LS->current) || LS->current == '_') {
          /* identifier or reserved word */
          size_t l = readname(LS);
          TString *ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff), l);
          if (ts->tsv.reserved > 0)  /* reserved word? */
            return ts->tsv.reserved - 1 + FIRST_RESERVED;
          seminfo->ts = ts;
          return TK_NAME;
        }
        else {
          int c = LS->current;
          if (iscntrl(c))
            luaX_error(LS, "invalid control char",
                           luaO_pushfstring(LS->L, "char(%d)", c));
          next(LS);
          return c;  /* single-char tokens (+ - / ...) */
        }
      }
    }
  }
}


static void read_long_string (LexState *LS, SemInfo *seminfo) {
  int cont = 0;
  size_t l = 0;
  checkbuffer(LS, l);
  save(LS, '[', l);  /* save first `[' */
  save_and_next(LS, l);  /* pass the second `[' */
  if (LS->current == '\n')  /* string starts with a newline? */
    inclinenumber(LS);  /* skip it */
  for (;;) {
    checkbuffer(LS, l);
    switch (LS->current) {
      case EOZ:
        save(LS, '\0', l);
        luaX_lexerror(LS, (seminfo) ? "unfinished long string" :
                                   "unfinished long comment", TK_EOS);
        break;  /* to avoid warnings */
      case '[':
        save_and_next(LS, l);
        if (LS->current == '[') {
          cont++;
          save_and_next(LS, l);
        }
        continue;
      case ']':
        save_and_next(LS, l);
        if (LS->current == ']') {
          if (cont == 0) goto endloop;
          cont--;
          save_and_next(LS, l);
        }
        continue;

// added
//------------------------------
      case '|':
        save(LS, ']', l);

        LS->lookahead.token = TK_CONCAT;
        goto endloop;
        continue;
//------------------------------

      case '\n':
        save(LS, '\n', l);
        inclinenumber(LS);
        if (!seminfo) l = 0;  /* reset buffer to avoid wasting space */
        continue;
      default:
        save_and_next(LS, l);
    }
  } endloop:
  save_and_next(LS, l);  /* skip the second `]' */
  save(LS, '\0', l);
  if (seminfo)
    seminfo->ts = luaS_newlstr(LS->L, luaZ_buffer(LS->buff) + 2, l - 5);
}

--Sam Lie

Note: the above patch is broken in 5.1. Make sure that f [[Hello |name|, welcome to |get_company_name()|. ]] translates into f([[Hello ]]..name..[[, welcome to ]]..get_company_name()..[[. ]]). Alternately translate it to f([[Hello ]], name, [[, welcome to ]], get_company_name(), [[. ]]). Perhaps use [[ ]] rather than | | to break out of the string because nested [[ ]] are deprecated in Lua 5.1 and by default raise an error, so we are free to redefine its semantics. --DavidManura

Metalua

For a MetaLua implementation, see "String Interpolation" in MetaLuaRecipes.

Var Expand

VarExpand - Advanced version of bash-like inline variable expanding.

Custom searcher that preprocesses

See [gist1338609] (--DavidManura), which installs a custom searcher function that preprocesses modules being loaded. The example preprocessor given does string interpolation:

--! code = require 'interpolate' (code)

local M = {}

local function printf(s, ...)
  local vals = {...}
  local i = 0
  s = s:gsub('\0[^\0]*\0', function()
    i = i + 1
    return tostring(vals[i])
  end)
  print(s)
end

function M.test()
  local x = 16
  printf("value is $(math.sqrt(x)) ")
end

return M

Other Ideas

Other Possible Applications

Embedding expressions inside strings can have these applications:


Last edited January 14, 2012 4:29 pm GMT