Lua Type Checking

lua-users home
wiki

Many programming languages provide some form of static (compile-time) or dynamic (run-time) type checking, with each form having its own merits [1]. Lua performs run-time type checking on its built-in operations. For example, this code triggers a run-time error:

> x = 5 + "ok"
stdin:1: attempt to perform arithmetic on a string value

However, unlike in languages like C, there is no built-in mechanism in Lua for type checking the parameters and return values of function calls. The types are left unspecified:

function abs(x)
  return x >= 0 and x or -x
end

This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You are allowed to call this function on something other than a number, though the operations inside the function can will trigger a somewhat cryptic error at run-time (where it would be a compile-time error in C).

> print(abs("hello"))
stdin:2: attempt to compare number with string
stack traceback:
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

Solution: Assertions at Top of Function

For improved error reporting, one is typically told to do something like this:

function abs(x)
  assert(type(x) == "number", "abs expects a number")
  return x >= 0 and x or -x
end

> print(abs("hello"))
stdin:2: abs expects a number
stack traceback:
        [C]: in function 'assert'
        stdin:2: in function 'abs'
        stdin:1: in main chunk
        [C]: ?

That's a good suggestion, but one might complain that this imposes additional run-time overhead, it will only detect program errors for code that gets executed instead of all code that is compiled, the types of the function values are entirely in the function's implementation (not available by introspection), and it can involve a lot of repetitive code (especially when the named arguments are passed via a table).

Here is one ways to proactively check named parameters:

local Box = {}
local is_key = {x=true,y=true,width=true,height=true,color=true}
function create_box(t)
  local x = t.x or 0
  local y = t.y or 0
  local width = t.width or 0
  local height = t.height or 0
  local color = t.color
  assert(type(x) == "number", "x must be number or nil")
  assert(type(y) == "number", "y must be number or nil")
  assert(type(width) == "number", "width must be number be number or nil")
  assert(type(height) == "number", "height must be number or nil")
  assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")
  for k,v in pairs(t) do
    assert(is_key[k], tostring(k) .. " not valid key")
  end
  return setmetatable({x1=x,y1=y,x2=x+width,y2=y+width,color=color}, Box)
end

It is relatively a lot of code. In fact, we might want to use error rather than assert to provide an appropriate level parameter for the stack traceback.

Solution: Function Decorators

Another approach is to place the type check code outside of the original function, potentially with a "function decorator" (see DecoratorsAndDocstrings for background).

is_typecheck = true

function typecheck(...)
  return function(f)
    return function(...)
      assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")
      return f(...)
    end
  end
end

function notypecheck()
  return function(f) return f end
end

typecheck = is_typecheck and typecheck or notypecheck

sum = typecheck("number", "number", "->", "number")(
  function(a,b)
    return a + b
  end
)

The advantage is that the type information is outside of the function implementation. We can disable all the type checking by switching a single variable, and no added overhead would remain when the functions are executed (though there is some slight added overhead when the functions are built). The typecheck function could also store away the type info for later introspection.

This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).

Solution: the checks library

The solutions above present some limitations:

* they are rather verbose, and non-trivial verifications can be hard to read;

* the error messages aren't as clear as those returned by Lua's primitives. Moreover, they indicate an error in the called function, at the place where assert() fails, rather than in the calling function which passed invalid arguments.

The checks library offers a terse, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. Additional, arbitrary type-checking functions can also be registered in a dedicated checkers table. For instance, if one wants to check for an IP port number, which must be between 0 and 0xffff, one can define a port type as follows:

function checkers.port(x)
    return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==0 
end

To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, no need to repeat them; for instance, if function f(num, str) needs a number and a string, it can be implemented as follows:

function f(num, str)
    checks('number', 'string')
    --actual function body
end

Types can be combined:

* the vertical bar allows to accept several types, e.g. checks('string|number') accepts both strings and numbers as first parameter.

* a prefix "?" makes a type optional, i.e. also accepts nil. Functionally, it's equivalent to a "nil|" prefix, although it's more readable and faster to check at runtime.

* the question mark can be combined with union bars, e.g. checks('?string|number') accepts strings, numbers and nil.

* finally, a special "!" type accepts anything but nil.

A more detailed description of how the library works can be found in its source's header (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, accessible here: https://github.com/SierraWireless/luasched. For convinience, it's also available here as a standalone rock: https://github.com/fab13n/checks

Hack: Boxed Values + Possible Values

As mentioned, run-time type checking will not detect program errors that don't get executed. Extensive test suites are particularly essential for programs in dynamically typed languages so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) so that the run-time assertions are are sufficiently hit. You can't rely as much on the compiler to do some of this checking for you.

Perhaps we can mitigate this by carrying more complete type information with the values. Below is one approach, though is it more a novel proof-of-concept rather than anything for production use at this point.

-- ExTypeCheck.lua ("ExTypeCheck")
-- Type checking for Lua.
--
-- In this type model, types are associated with values at run-time.
-- A type consists of the set of values the value could have
-- at run-time.  This set can be large or infinite, so we
-- store only a small representative subset of those values.
-- Typically one would want to include at least the boundary
-- values (e.g. max and min) in this set.
-- Type checking is performed by verifying that all values
-- in that set are accepted by a predicate function for that type.
-- This predicate function takes a values and returns true or false
-- whether the value is a member of that type.
--
-- As an alternative to representing types as a set of representative
-- values, we could represent types more formally, such as with
-- first-order logic, but then we get into theorem proving,
-- which is more involved.
--
-- DavidManura, 2007, licensed under the same terms as Lua itself.

local M = {}

-- Stored Expression design pattern
-- ( http://lua-users.org/wiki/StatementsInExpressions )
local StoredExpression
do
  local function call(self, ...)
    self.__index = {n = select('#', ...), ...}
    return ...
  end
  function StoredExpression()
    local self = {__call = call}
    return setmetatable(self, self)
  end
end
 
-- Whether to enable type checking (true/false).  Default true.
local is_typecheck = true

-- TypeValue is an abstract type for values that are typed
-- This holds the both the actual value and a subset of possible
-- values the value could assume at runtime.  That set should at least
-- include the min and max values (for bounds checking).
local TypedValue = {}

-- Check that value x satisfies type predicate function f.
function M.check_type(x, f)
  for _,v in ipairs(x) do
    assert(f(v))
  end
  return x.v
end


-- Type check function that decorates functions.
-- Example:
--   abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(
--     function(x) return x >= 0 and x or -x end
--   )
function M.typecheck(...)
  local types = {...}
  return function(f)
    local function check(i, ...)
      -- Check types of return values.
      if types[i] == "->" then i = i + 1 end
      local j = i
      while types[i] ~= nil do
        M.check_type(select(i - j + 1, ...), types[i])
        i = i + 1
      end
      return ...
    end
    return function(...)
      -- Check types of input parameters.
      local i = 1
      while types[i] ~= nil and types[i] ~= "->" do
        M.check_type(select(i, ...), types[i])
        i = i + 1
      end
      return check(i, f(...))  -- call function
    end
  end
end


function M.notypecheck() return function(f) return f end end
function M.nounbox(x) return x end

M.typecheck = is_typecheck and M.typecheck or M.notypecheck
M.unbox = is_typecheck and M.unbox or M.nounbox

-- Return a boxed version of a binary operation function.
-- For the returned function,
--   Zero, one, or two of the arguments may be boxed.
--   The result value is boxed.
-- Example:
--   __add = boxed_op(function(a,b) return a+b end)
function M.boxed_op(op)
  return function(a, b)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    if getmetatable(b) ~= TypedValue then b = M.box(b) end
    local t = M.box(op(M.unbox(a), M.unbox(b)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      for _,b2 in ipairs(b) do
        local c2 = op(a2, b2)
        if not seen[c2] then
          t[#t + 1] = op(a2, b2)
          seen[c2] = true
        end
      end
    end
    return t
  end
end

-- Return a boxed version of a unary operation function.
-- For the returned function,
--   The argument may optionally be boxed.
--   The result value is boxed.
-- Example:
--   __unm = boxed_uop(function(a) return -a end)
function M.boxed_uop(op)
  return function(a)
    if getmetatable(a) ~= TypedValue then a = M.box(a) end
    local t = M.box(op(M.unbox(a)))
    local seen = {[t[1]] = true}
    for _,a2 in ipairs(a) do
      local c2 = op(a2)
      if not seen[c2] then
        t[#t + 1] = op(a2)
        seen[c2] = true
      end
    end
    return t
  end
end

TypedValue.__index = TypedValue
TypedValue.__add = M.boxed_op(function(a, b) return a + b end)
TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)
TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)
TypedValue.__div = M.boxed_op(function(a, b) return a / b end)
TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)
TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)
TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)
-- TypedValue.__le -- not going to work? (metafunction returns Boolean)
-- TypedValue.__lt -- not going to work? (metafunction returns Boolean)
-- TypedValue.__eq -- not going to work? (metafunction returns Boolean)
TypedValue.__tostring = function(self)
  local str = "[" .. tostring(self.v) .. " in "
  for i,v in ipairs(self) do
    if i ~= 1 then str = str .. ", " end
    str = str .. v
  end
  str = str .. "]"
  return str 
end
-- Convert a regular value into a TypedValue.  We call this "boxing".
function M.box(v, ...)
  local t = setmetatable({v = v, ...}, TypedValue)
  if #t == 0 then t[1] = v end
  return t
end
-- Convert a TypedValue into a regular value.  We call this "unboxing".
function M.unbox(x)
  assert(getmetatable(x) == TypedValue)
  return x.v
end


-- Returns a type predicate function for a given interval over the reals.
-- Example: ranged_real'[0,inf)'
-- Note: this function could be memoized.
function M.ranged_real(name, a, b)
  local ex = StoredExpression()

  if name == "(a,b)" then
    return function(x) return type(x) == "number" and x > a and x < b end
  elseif name == "(a,b]" then
    return function(x) return type(x) == "number" and x > a and x <= b end
  elseif name == "[a,b)" then
    return function(x) return type(x) == "number" and x >= a and x < b end
  elseif name == "[a,b]" then
    return function(x) return type(x) == "number" and x >= a and x <= b end
  elseif name == "(inf,inf)" then
    return function(x) return type(x) == "number" end
  elseif name == "[a,inf)" then
    return function(x) return type(x) == "number" and x >= a end
  elseif name == "(a,inf)" then
    return function(x) return type(x) == "number" and x > a end
  elseif name == "(-inf,a]" then
    return function(x) return type(x) == "number" and x <= a end
  elseif name == "(-inf,a)" then
    return function(x) return type(x) == "number" and x < a end
  elseif name == "[0,inf)" then
    return function(x) return type(x) == "number" and x >= 0 end
  elseif name == "(0,inf)" then
    return function(x) return type(x) == "number" and x > 0 end
  elseif name == "(-inf,0]" then
    return function(x) return type(x) == "number" and x <= 0 end
  elseif name == "(-inf,0)" then
    return function(x) return type(x) == "number" and x < 0 end
  elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then
    local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]
    if left == "(" and right == ")" then
      return function(x) return type(x) == "number" and x > a and x < b end
    elseif left == "(" and right == "]" then
      return function(x) return type(x) == "number" and x > a and x <= b end
    elseif left == "[" and right == ")" then
      return function(x) return type(x) == "number" and x >= a and x < b end
    elseif left == "[" and right == "]" then
      return function(x) return type(x) == "number" and x >= a and x <= b end
    else assert(false)
    end
  else
    error("invalid arg " .. name, 2)
  end
end


return M

Example usage:

-- type_example.lua
-- Test of ExTypeCheck.lua.

local TC = require "ExTypeCheck"
local typecheck = TC.typecheck
local ranged_real = TC.ranged_real
local boxed_uop = TC.boxed_uop
local box = TC.box

-- Checked sqrt function.
local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(
  function(x)
    return boxed_uop(math.sqrt)(x)
  end
)

-- Checked random function.
local random = typecheck('->', ranged_real'[0,1)')(
  function()
    return box(math.random(), 0, 0.999999)
  end
)

print(box("a", "a", "b") .. "z")
print(box(3, 3,4,5) % 4)

math.randomseed(os.time())
local x = 0 + random() * 10 - 1 + (random()+1) * 0
print(x + 1); print(sqrt(x + 1)) -- ok
print(x); print(sqrt(x)) -- always asserts! (since x might be negative)

Example output:

[az in az, bz]
[3 in 3, 0, 1]
[8.7835848325787 in 8.7835848325787, 0, 9.99999]
[2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]
[7.7835848325787 in 7.7835848325787, -1, 8.99999]
lua: ./ExTypeCheck.lua:50: assertion failed!
stack traceback:
        [C]: in function 'assert'
        ./ExTypeCheck.lua:50: in function 'check_type'
        ./ExTypeCheck.lua:78: in function 'sqrt'
        testt.lua:30: in main chunk
        [C]: ?

Note: this approach of values holding multiple values does have some similarities to Perl6 junctions (originally termed "quantum superpositions").

Solution: Metalua Runtime Type Checking

There is a Runtime type checking example in Metalua [2].

Solution: Dao

The Dao language, partly based on Lua, has built-in support for optional typing [3].


--DavidManura

I'd argue that, except for very simple programs or those that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte

The comment above was anonymously deleted with "This isn't a forum for drive-by opinions". On the contrary, this is established wiki style. People can make comments on pages, and for points of contention that's considered courteous versus summarily changing the original text. The original author can decide to integrate such comments into the original text or not. --JohnBelmonte


RecentChanges · preferences
edit · history
Last edited October 4, 2012 3:08 pm GMT (diff)