Lua Power Patches

A power patch is a small patch to a source code distribution that makes some useful change. Power patches are judged based on how few lines of code are changed, general usefulness, and backwards compatibility. By limiting them to localized changes, the chance that several such patches can work together is high. It also keeps maintenance work to a minimum as new versions of the source package are released. Truly good patches will have a short life, as the authors of the original program will incorporate them into their code.

New power patches, ports of existing patches to different Lua versions, and bug fixes are welcome.

If you apply a patch that changes the syntax or semantics of Lua, the resulting language should not be called "Lua".

See LuaPowerPatchesArchive for patches for old versions of Lua. Where a patch exists for multiple versions of Lua including the current one, it will appear on this page.

How to Apply a Patch

Unpack the correct Lua distribution into a clean directory. Do a cd to the top of the directory and run:

patch -p1 < patchfile

Patch Tools

Some versions of patch (for example, the one that comes with Solaris) don't support unified diffs properly (the ones with + and - at the start of each line). GNU patch does.

Patch Guidelines

cd lua
make clean
cd ../lua_new
make clean
cd ..
diff -urN lua lua_new > mychange.patch

References for Patch Writers

In general, documentation on Lua's internals is hard to find -- and current versions of such docs are even rarer [5]. Here's a list of resources that may be helpful:


Lua 5.2 patches

Type Metatables for Table and Userdata (5.2.1)

Lua supports per-object metatables for Table and Userdata types and a per-type metatable for each of the other types. There are some use cases for a type metatable for Table type (fewer for Userdata type). For example, we could expose the Table Library functions as methods of table objects in the same way that the String Library does for strings.

This patch makes use of the convenient fact that there are already type metatable pointers for all the types in the Lua state. When a Table or Userdata is created, the patch copies the type metatable pointer into the new object. This means that the object uses any type metatable automatically until and unless its metatable is explicitly overwritten by setting it in the usual way. The type metatable is accessible via a newly created object from 'C' or Lua ('getmetatable') and can be used in delegation schemes from an object metatable that replaces it.

We need a way of setting the type metatable distinctly from the object metatables. This is only supported from the C API and the Debug Library in line with the existing type metatables. It has always bothered me that you have to supply a 'dummy' object to 'lua_setmetatable' just to specify the type. I have added the API function 'lua_settypemt' which takes a type constant and sets the type metatable for any type. As always, 'lua_setmetatable' is used to set the object metatable of Table and Userdata objects. It is deprecated as a way of setting type metatables, but continues to work in that role for types other than Table and Userdata. From the Debug Library, use for example:

debug.settypemt("table", {__add=table.insert})

Define the symbol 'JH_LUA_TYPEMETA' to compile the patch.

Optionally also define the symbol 'JH_LUA_TABLECLASS' to install a type metatable for Table type which exposes the Table Library in method form.
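
As a rough illustration of what method-form access to the Table Library looks like, the sketch below emulates it in unpatched Lua with an ordinary per-object metatable. The helper name List is hypothetical and not part of the patch; with the patch and 'JH_LUA_TABLECLASS' defined, the description above suggests plain tables would behave this way without the explicit setmetatable call.

```lua
-- Emulation in standard Lua: expose the Table Library as methods of one table.
-- "List" is a hypothetical helper used only for this illustration.
local function List(t)
  return setmetatable(t or {}, { __index = table })
end

local l = List({ 3, 1, 2 })
l:sort()                       -- same as table.sort(l)
l:insert(4)                    -- same as table.insert(l, 4)
local joined = l:concat(",")   -- same as table.concat(l, ",")
```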

Binary Number Literals (5.2)

This patch adds binary literals. This is implemented differently to my 5.1 patch due to extensive changes in the underlying code. I did not support Octal this time as it is rarely used these days (although it could easily be added following the same pattern as this patch). I also added Java/Perl style embedded underscore support as it makes it easier to format long binary strings (suggested by RobHoelz below, though this patch only enables underscores in binary literals).

Valid binary literals: 0b1101; 0B100100; 0b1001_1111_0000_1010; 0b11______01

The underscore character is for readability only and is ignored. A string of any number of underscores may be embedded, but there must be at least one binary digit before the first and after the last.

For this patch to be compiled, the symbol 'JH_LUA_BINCONST' must be defined.
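
The literal rules above can be sketched as a string parser in standard Lua. This is only an illustration of the accepted forms, not the patch's lexer code; the function name parse_binary is an assumption made for the example.

```lua
-- Parse a binary literal such as "0b1001_1111" following the rules above:
-- a 0b/0B prefix, underscores ignored, and at least one binary digit
-- before the first underscore and after the last one.
local function parse_binary(s)
  local digits = s:match("^0[bB]([01][01_]*)$")
  if not digits or digits:sub(-1) == "_" then
    return nil                      -- malformed literal
  end
  digits = digits:gsub("_", "")     -- keep only the first result of gsub
  return tonumber(digits, 2)
end
```

For example, parse_binary("0b11______01") and parse_binary("0b1101") both give 13, while parse_binary("0b_1") and parse_binary("0b1_") give nil.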

Advanced readline support (5.2, 5.1, 5.0)

This patch adds the following features to the existing readline support in Lua 5.x:

After applying the patch start Lua and try these (replace ~ with the TAB key):

~~
fu~foo() ret~fa~end<CR>
io~~~s~~~o~~~w~"foo\n")<CR>

It has been verified to work with Lua 5.0, 5.0.2, 5.1 and 5.2. Compatible readline libraries include GNU readline 2.2.1, 4.0, 4.3, 5.0, 5.1, 6.0, 6.2; Mac OS X libedit 2.11; or NetBSD libedit 2.6.5, 2.6.9. Note that, despite the version number, the 5.2 patch works with all 5.2.x releases.

Better signal handling in the interpreter on POSIX systems (5.2.3, 5.1.5)

Use sigaction instead of signal. This means that, for example, you don't have to press Ctrl-C twice to quit a Lua script blocked on I/O.

Add default module paths for /usr as well as /usr/local (5.2.0)

This patch adds default module paths under /usr as well as the default /usr/local, enabling Lua to find libraries installed as part of the system (e.g. libraries installed by your GNU/Linux distribution).

C/C++-style comments

Allow the use of C-style (/*...*/) and C++-style (//...) comments in Lua source. Either or both styles can be enabled via #defines in llex.h or luaconf.h. Standard Lua comments remain available.

Function fields

Allow the use of special syntax for function fields in tables, such that

tbl = {}
function tbl:my_method(x, y) end
or
tbl = { my_method = function(self, x, y) end }
can instead be written:
tbl = { function my_method(x, y) end }
That is, functions declared with this syntax get methods' automatic "self" parameter.

19 Jan: Fixed a bug that broke declaration of anonymous functions, e.g. t = { function(args) end }

Self-iterating Objects (5.2, 5.1.4)

UPDATE: Following an idea sent to me by Sam Lie from Indonesia, I have extended this patch to establish pairs as the last-resort default iterator for tables. The patch is re-titled to reflect this expanded scope.

This patch adds a new metatable event, __iter. If it is included in a metatable, its value must be a function with the same signature as pairs or ipairs. It provides a default iterator for the object, which simplifies the generic for statement. If there is no such metamethod, the pairs iterator is used (for tables only).

t = {[3]="three",[2]="two",[1]="one"}
for k,v in t do print(k,v) end        -- uses pairs
setmetatable(t, {["__iter"]=ipairs})
for k,v in t do print(k,v) end        -- uses ipairs
for k,v in pairs(t) do print(k,v) end -- uses pairs (per standard Lua)
Of course, a custom function or closure can be used in place of pairs or ipairs. For lists, a closure could be used to avoid the need for the key variable for example.

I offered this as an alternative to the __pairs and __ipairs events which have been implemented in Lua 5.2. I still believe this approach is better and continue to offer it as a patch. The two approaches can coexist, indeed you could specify all three metamethods for the same object so pairs and ipairs would both be available with customised behaviour and one of them would also be specified as the default iterator.

The patch modifies the Lua Virtual Machine to test the type of the first parameter of the TFORLOOP bytecode instruction. If it is a function, the original code is used. If it is other than a function, an attempt is made to reference its __iter metamethod. If this does not yield a function and the object being iterated is a table, an attempt is made to reference the global pairs function. If a function is obtained by either of these methods, that function is called and its three return values overwrite the original three parameters to TFORLOOP. The original code is then used to process the generic for with the iteration parameters provided by the function. Note that this introduces a subtle change in the standard Lua processing: The case of the first parameter not being a function is detected before the first iteration rather than during it, and it is a test for function type rather than for the ability to be called. This will break some subtle code tricks such as using the __call event to "fake" self-iteration.

For this patch to be compiled, the symbol 'JH_LUA_ITER' must be defined.
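
The lookup order described above can be approximated in unpatched Lua with a helper that is called explicitly in the for statement. This is a sketch of the dispatch logic only; the patch does this inside the VM, and the name resolve_iterator is hypothetical.

```lua
-- Pick the iteration triple for the generic for, following the order
-- described above: a function is used as-is, then __iter, then pairs
-- as a last resort for tables.
local function resolve_iterator(x, s, c)
  if type(x) == "function" then
    return x, s, c
  end
  local mt = getmetatable(x)
  local h = type(mt) == "table" and mt.__iter or nil
  if type(h) ~= "function" and type(x) == "table" then
    h = pairs
  end
  if type(h) ~= "function" then
    error("cannot iterate over a " .. type(x) .. " value", 2)
  end
  return h(x)
end

local t = setmetatable({ [3] = "three", [2] = "two", [1] = "one" },
                       { __iter = ipairs })
local seen = {}
for k, v in resolve_iterator(t) do   -- the patch would allow "in t"
  seen[#seen + 1] = k .. "=" .. v
end
```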

Set syntax shortcut for table constructors (5.2, 5.1.4)

This patch adds a new syntax shortcut for constructing set-like tables. If the value of a field is missing it is defaulted to boolean true, so for example {["saturday"], ["sunday"]} constructs the same table as {["saturday"] = true, ["sunday"] = true}.

For this patch to be compiled, the symbol 'JH_LUA_SETINIT' must be defined.
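
For comparison, the same set-like table can be built in unpatched Lua with a small helper. The name set is hypothetical; the sketch only shows the table that the shortcut syntax is described as producing.

```lua
-- Build a set-like table: every listed key maps to true.
local function set(keys)
  local s = {}
  for _, k in ipairs(keys) do
    s[k] = true
  end
  return s
end

local weekend = set({ "saturday", "sunday" })
-- equivalent to { ["saturday"] = true, ["sunday"] = true }
```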

Unpack Tables by Name (5.2, 5.1, 5.0.2)

Enhancement to the assignment statement to unpack named values from tables using the in keyword. (See lua-l message "patch: local a,b from t" [6].)

    local a, b, c in some_table_expression
is syntactic sugar for
    local t = some_table_expression
    local a, b, c = t.a, t.b, t.c

Compound Assignment Operators (5.2)

A patch to allow C-style compound assignment; i.e., statements like "object.counter+=2". The operators supported are +,-,..,/, and *. The details of the syntax are discussed on my personal page (see: SvenOlsen). For a rather different (and 5.1 compatible) approach to the same problem, see [7].

Safe Table Navigation (5.2)

A syntax patch for the safe navigation semantic discussed here [8] -- an indexing operation 't?.v' which suppresses errors on accesses into undefined tables. While there seems to be broad interest in a patch of this type, opinions vary on how best to handle the details. I've posted a simple, light-weight version below; my personal page (SvenOlsen) includes code and docs for a more feature-heavy version.
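
A function-based approximation in unpatched Lua may help show the intended semantics. It assumes that 't?.v' yields nil when t is nil, which is how the description above reads; the helper name safe_get is hypothetical.

```lua
-- Follow a chain of keys, returning nil as soon as an intermediate
-- value is nil instead of raising "attempt to index a nil value".
local function safe_get(t, ...)
  for i = 1, select("#", ...) do
    if t == nil then
      return nil
    end
    t = t[(select(i, ...))]
  end
  return t
end

local cfg = { server = { port = 8080 } }
local port = safe_get(cfg, "server", "port")    -- like cfg?.server?.port
local missing = safe_get(cfg, "client", "port") -- nil, no error
```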

Lua 5.1 patches

These are in order of Lua version, newest to oldest.

Make os.exit close lua_State (5.1.5)

If an optional second argument is true, calls lua_close before exiting, so finalizers will be run. The Lua interpreter already calls lua_close when exiting normally, but other applications embedding Lua may not; and you may want this behavior when using os.exit (to force non-zero exit statuses).

This behavior cannot be provided by a library, but only by patching the Lua source and rebuilding.

Make tables honor __len (5.1.5)

This patch makes #tbl honor a __len metamethod on tbl, as in Lua 5.2. It also provides a new global function, rawlen.

Second Hash (5.1.5)

This patch strengthens Lua 5.1 against Hash DoS attacks. For more details see HashDos.

Print NULs (5.1.5)

Make print print NUL characters, by using fwrite instead of fputs. Someone else tidied it up by patching luaconf.h to let the user supply a luai_puts macro.

Autotoolized Lua (5.1.5)

This patch autotoolizes the Lua distribution, i.e. makes it build with autoconf, automake, and libtool. For Lua 5.1.5.

You must unpack the file before patching:

bunzip2 lua-X.Y.Z-autotoolize-rW.patch.bz2

Note: This patch incorrectly references version 5.1.3 in several places and you may want to fix that.

Next, apply the patch.

After patching, you need to add the executable flag to some files:

chmod u+x autogen.sh config.guess config.sub configure depcomp install-sh missing

Now you are ready to run ./configure.

Emergency Garbage Collector (5.1.5)

This patch is described in EmergencyGarbageCollector.

Yieldable For Loop (5.1.5)

Modifies the code generator so that the iterator in for ... in loops can call yield. Details and test code are available at YieldableForLoops.

Note that the current version of the patch orders op codes so as to maintain binary compatibility with compiled Lua scripts, if LUA_COMPAT_TFORLOOP is defined in luaconf.h. This adds a few instructions to the VM, and quite a bit of complexity to the patch, which would otherwise only be about 25 lines.

string.format %s patched to use __tostring (5.1.5)

Changes string.format %s to apply __tostring to non-string %s arguments.
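
The effect can be approximated in unpatched Lua with a wrapper that converts arguments before formatting. This sketch is cruder than the patch: it converts every argument that is neither a string nor a number, regardless of which directive consumes it. The name format_ts is hypothetical.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ or 5.1

-- Apply tostring (and therefore __tostring) to non-string, non-number
-- arguments before handing them to string.format.
local function format_ts(fmt, ...)
  local n = select("#", ...)
  local args = { ... }
  for i = 1, n do
    local v = args[i]
    if type(v) ~= "string" and type(v) ~= "number" then
      args[i] = tostring(v)
    end
  end
  return string.format(fmt, unpack(args, 1, n))
end

local point = setmetatable({ x = 1, y = 2 }, {
  __tostring = function(p) return "point(" .. p.x .. ", " .. p.y .. ")" end,
})
local msg = format_ts("%s has %d fields", point, 2)
```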

Custom error object support (5.1.5)

This patch improves Lua's support for custom error objects. Changes:

instance of the error (since it will be mutated). Rather than this scheme, the ideal solution would be to have the Lua core manage the location separate from the error object.

See "Exception Patterns in Lua"[4] for more information.

Equality operators that work like arithmetic operators (5.1.5)

This modifies the behavior of the equality operator functions so they are able to handle values with dissimilar types. For instance, in standard Lua if the left operand is a userdata and the right is a number, the equality test will fail. This patch causes the __eq metamethod of the userdata to be used, if available. But note, one reason Lua does not support this is because the __eq, __lt and __le metamethods are used for ~=, > and >= as well, by reversing the operands. Therefore, if both the right and left operands have metamethods, you might be surprised by which one gets chosen. As it is, the left metamethod is preferred. But of course, this is the RIGHT metamethod for the ~=, > and >= tests! A good solution to this might be to add __ne, __gt and __ge metamethods. Then the equality operators would truly behave exactly like the arithmetic operators.
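
The standard behaviour that this patch changes can be seen in a short example. In unpatched Lua the __eq metamethod is not consulted when the operands have different types, so a comparison between a table and a number is simply false.

```lua
-- An __eq metamethod that always reports equality.
local mt = { __eq = function() return true end }
local a = setmetatable({}, mt)
local b = setmetatable({}, mt)

local same_type = (a == b)   -- both tables: __eq is called
local mixed = (a == 1)       -- table vs number: __eq is skipped in standard Lua
```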

Octal and Binary Number Constants (5.1.4)

This simple patch adds octal and binary constants. These will be recognised in source code literals or in string contents converted implicitly or explicitly by tonumber. Binary constants take the form 0b10101. Octal constants take the form 0o176. Upper-case radix specifier is also supported for consistency with the hexadecimal format but for obvious reasons it is not recommended for octal!

For this patch to be compiled, the symbol 'JH_LUA_BINOCTAL' must be defined.

Use NaN packing for TValue (5.1.4)

Use NaN packing for TValue on x86 to reduce memory usage and gain a small performance improvement (the same technique as in LuaJIT 2).

It's fully ABI compatible with standard Lua libraries.

On one test script, memory consumption fell from 28 MB to 21 MB and performance improved by about 3.5-5%.

Multi-dimensional array indexing with comma syntax (5.1.4)

Modifies the parser to support multi-dimensional array indexing with comma syntax, i.e. m[1,2] is treated by the parser as being identical to m[1][2], thus allowing for code such as the following:

   -- test multi-dimensional arrays with comma syntax. also test
   -- constructors and multiple assignment to show they're not broken.
   m = {[1]={}, [2]="foo"}
   m[1,2] = "bar"
   print(m[2], m[1][2], m[1,2])  --> foo  bar bar
   m.foo = {}
   m.foo.bar = "baz"
   print(m["foo","bar"])         --> baz

The virtual machine is unchanged.

C function names in backtrace (5.1.4)

Currently GNU/Linux only, this patch will add the names of Lua C functions to the traceback if debugging information has been compiled into the shared library from which they were loaded.

Allow underscores in numbers (5.1.4)

One thing I like about Perl is the ability to break up large numbers using underscores, so instead of this:

local num = 1000000000000

you can do this:

local num = 1_000_000_000_000

Bitwise operators, integer division and != (5.1.4)

All these features can be disabled by undefining LUA_BITWISE_OPERATORS in luaconf.h.

Bitwise operators first convert a lua_Number to a lua_Integer and then convert the result back to a lua_Number.

Remove auto string<->number conversion (5.1.4)

Prevent auto-conversion between strings and numbers in arithmetic and concatenation. This is good because it prevents bugs; when auto-conversion is a good idea (as in the print function) it can still be done by calling the relevant conversion functions.

The current version does not scrupulously remove all undesirable casting from the libraries; I have preferred correctness over completeness.

Use defaults for LUA_INIT, LUA_PATH and LUA_CPATH (5.1.4)

Adds a -t switch to the standalone interpreter that uses the default values for the above variables, making it easier to run Lua in a controlled way.

No varargs (5.1.4)

Varargs are arguably a needless complication. This patch removes them.

Interpreter Bailout Flag (5.1.3)

At [Sim Ops Studios], we embedded Lua by spawning a kernel-supported thread that called a Lua script using the C API. In that configuration, a script could hang forever, making it impossible to cleanly escape from the thread without killing it outright. This patch provides an alternative: the "bailout" state flag, and two commands (luaL_getbailout and luaL_setbailout) to manage the flag. When the flag is true, every opcode is interpreted as a return operation, forcing the Lua interpreter unconditionally back to the top level. The C code can then check the status of the flag and cleanly exit the thread if the bailout flag is true. Using this flag makes it impossible to write a script that cannot be interrupted (although a script may be terminated in an unexpected place).

Check for undefined globals (5.1.3)

This patch is described in DetectingUndefinedVariables.

Continue Statement (5.1.3)

A "continue" statement is added to the parser. The virtual machine is unchanged. Comes with a small test suite.

Instead of patching Lua, one might consider luaSub, which contains a syntax mod for this same purpose.

Table scope patch (5.1.3)

Allows table constructs to enclose a scope so that variables used inside the table have special meaning. See TableScope.

LNUM - number mode patch ("integer patch") (5.1.3)

Allows Lua built-in numbers to be a combination of any of the following:

LNUM_DOUBLE / LNUM_FLOAT / LNUM_LDOUBLE (long double)
LNUM_INT32 / LNUM_INT64
LNUM_COMPLEX

Uses: 32- or 64-bit integer accuracy internally for any Lua numbers. Greatly boosts Lua performance (by 40-500%) on non-FPU platforms. Totally transparent to the application (script) level, as well as to the existing Lua/C API.

Latest svn (with test suite):

svn export svn://slugak.dyndns.org/public/2008/LuaPatches/LNUM2

Latest release: [LuaForge LNUM Files]

Real-world testing and performance data from applications are still appreciated. The patch is essentially "ready"; apart from LDOUBLE mode there are no known bugs; if you find any, please share your experience.

The patch leaves the integer realm gracefully, falling back to floating point accuracy if results won't fit in integers.

For performance results, there is a spreadsheet and easy to use "make-plain/float/double/ldouble/complex" targets to run on your own system.

Integer ASCII values (5.1.2)

Yet another syntactic bloat for the lexer:
print(#'a',#'\n')
97      10

Note that this could probably be done via a token filter as well, but it is useful for ASCII value mangling (base64, obscure protocol parsing). I just wanted something more expressive than if c>=b2a("A") and c<=b2a("Z"). There was no way to get an ASCII value as a constant in Lua at this point.

This syntax already has a meaning (string length) because strings can be given inside single quotes. --lhf
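
The conflict noted above is easy to check in unpatched Lua, where # on a quoted string is the length operator and string.byte returns the character code at run time.

```lua
local len = #'a'                  -- length operator: the string has 1 character
local code_a = string.byte('a')   -- character code, computed at run time
local code_nl = ('\n'):byte()     -- method form of the same call
```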

Module Execution Proposal (5.1.2)

Provides a new command-line switch (-p) that loads a function with the given package name via the searchers in package.loaders and then executes that function as a script, passing the command-line arguments to the function as arguments. Patch and description are in ModuleExecutionProposal.

Concise anonymous functions (5.1.2)

Some syntactic sugar for writing more concise anonymous functions, very useful when passing arguments to higher-order functions. Completely backwards compatible and only modifies the parser, emits the same bytecode as the old syntax. See [The Readme] for details.

Extend table constructor syntax to allow multiple expansion of multi-return functions (5.1.1)

As discussed several times on the mailing list, and implemented in Aranha, here is a patch which modifies table constructor syntax so that ; allows the preceding list item to be expanded.

With this patch, {foo(); bar()} creates a table with all the return values of foo followed by all the return values of bar, while {foo(), bar()} continues to have the same semantics as current Lua. To be more precise, if a list-item is followed by a comma then it is truncated to one return value; otherwise it represents all the return values (possibly none). Consequently, {foo(),} truncates, but {foo();} and {foo()} do not.
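
The difference between the two forms can be emulated in unpatched Lua with a helper that appends every value it is given. The name append_all is hypothetical, and the sketch ignores nil return values, which the # operator does not handle reliably.

```lua
-- Append all extra arguments to t, so that a multi-return call is
-- expanded in full rather than truncated to one value.
local function append_all(t, ...)
  for i = 1, select("#", ...) do
    t[#t + 1] = (select(i, ...))
  end
  return t
end

local function foo() return 1, 2 end
local function bar() return 3, 4 end

local expanded = append_all(append_all({}, foo()), bar())  -- like {foo(); bar()}
local standard = { foo(), bar() }                          -- foo() truncated
```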

The patch also makes the order of field definitions precisely left to right, unlike the current Lua implementation in which {[3] = "foo", 1, 2, 3} has undefined behaviour. That may lead to performance problems with very large tables which are defined like this:

t = {
  "a", a = 1,
  "b", b = 2,
  "c", c = 3,
  -- etc
}

Other than that, the performance implications are minimal; sometimes it is a bit faster, sometimes it is a bit slower, but there is little difference. Personally, I prefer the precise ordering guarantee, but the patch can be easily modified to come closer to current semantics. Contact me for more details.

The implementation is straight-forward. When a semi-colon is encountered, the compiler emits code to append the current list of expressions, leaving the last one as multi-return. In order to do this, it's necessary to keep the current table array insertion point on the stack, instead of hard-coding it into the vm code, so the opcode to create a new table is modified to use two stack slots, putting the table in the first one and initializing the second one to 1. The opcode which adds array values to a table uses the second stack slot as the starting index, and updates it to the next starting index. Although this uses one extra stack slot, it is roughly the same speed as the existing code.

Let files combined with luac access arguments (5.1.1)

When luac combines multiple files into a single bytecode chunk, the resulting chunk does not accept any arguments. This small patch passes ... into all files in the combined chunk.

__usedindex metamethod (5.1.1)

__usedindex behaves exactly like __newindex, except that it fires when the indexed key already exists (i.e. when a value is overwritten). This allows simple implementation of read-only tables, mirroring of C structures, etc. without slow/lengthy/fragile table proxying constructs. Known to be broken with LuaJIT; fixes are welcome.
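
For contrast, this is roughly the proxy construct that unpatched Lua needs for a read-only table. The helper name readonly is hypothetical. Because the proxy itself stays empty, every write reaches __newindex, but operations such as pairs and # no longer see the real data without further work.

```lua
-- Read-only view of t via an empty proxy table.
local function readonly(t)
  return setmetatable({}, {
    __index = t,
    __newindex = function(_, k)
      error("attempt to modify read-only field " .. tostring(k), 2)
    end,
  })
end

local config = readonly({ port = 8080 })
local ok = pcall(function() config.port = 9090 end)  -- write is rejected
```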

PHP-like 'break N' to break across multiple loops (5.1.1)

Allows syntax like: while foo do while bar do if baz then eek() break 2 end end end

If the condition "foo and bar and baz" holds, 'break 2' escapes both loops immediately. The number counts only breakable scopes (thus "if", "do" etc. are not counted).

Enum/bit operations patch (5.1.1)

Adds C API functions (toenum, isenum, ...) for handling unsigned 32-bit bitfields. Such enum values have overloaded [], () operations for performing bitwise operations on the Lua side. Enums are grouped in 'families', to prevent accidental use of the wrong bitmask in the wrong function.

The implementation uses negative 'tt' (Lua type) values for enums, and they are not garbage collectable (= should be fast). 'type()' function returns two values: "enum" and the family name. Documentation is lacking.

svn cat svn://slugak.dyndns.org/public/lua-bitwise/lua-5.1.1-enum-patch.diff
svn cat svn://slugak.dyndns.org/public/lua-bitwise/test.lua
svn cat svn://slugak.dyndns.org/public/lua-bitwise/README

Go Long Lua! (5.1)

This patch removes floating point operations used by Lua 5.1 by changing the type of Lua numbers from double to long. It implements division and modulus so that x == (x / y) * y + x % y. The exponentiation function returns zero for negative exponents. The patch removes the difftime function, and the math module should not be used. The string.format function no longer handles the floating point directives %e, %E, %f, %g, and %G. By removing the definition of LUA_NUMBER_INTEGRAL in src/luaconf.h, one obtains a Lua number implementation based on doubles.
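
The division identity can be checked in unpatched Lua under one assumption: that division rounds toward negative infinity, which matches the way standard Lua defines %. The text above does not say whether the patch floors or truncates, so idiv here is only a hypothetical stand-in for the patched / operator.

```lua
-- Floored integer division, paired with Lua's floored % operator.
local function idiv(x, y)
  return math.floor(x / y)
end

-- Returns true when x == (x / y) * y + x % y holds for the pair.
local function identity_holds(x, y)
  return x == idiv(x, y) * y + x % y
end

local all_ok = true
for _, p in ipairs({ { 7, 2 }, { -7, 2 }, { 7, -2 }, { -7, -2 } }) do
  all_ok = all_ok and identity_holds(p[1], p[2])
end
```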

CNUMBER patch (5.1)

Provides a more efficient mechanism for accessing numeric C variables from Lua.

Kilobyte/Megabyte Number Suffix (5.1)

Add 'K' and 'M' suffixes for numbers (e.g. 150K or 12M); binary (K=2^10) not metric (K=10^3).

Useful when using Lua as a configuration language in some domains (in our case, a build process).
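
The suffix arithmetic can be sketched as a string parser in standard Lua. This only illustrates the binary multipliers described above, not the patch's lexer changes; the name parse_size is hypothetical.

```lua
-- Binary multipliers: K = 2^10, M = 2^20.
local multipliers = { K = 2 ^ 10, M = 2 ^ 20 }

-- Convert strings such as "150K" or "12M" to plain numbers.
local function parse_size(s)
  local digits, suffix = s:match("^(%d+)([KM]?)$")
  if not digits then
    return nil
  end
  return tonumber(digits) * (multipliers[suffix] or 1)
end
```

With these definitions, parse_size("150K") is 150 * 1024 = 153600 and parse_size("12M") is 12 * 1048576 = 12582912.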

Do patch (5.1 beta)

Makes "= do ... end" be syntactic sugar for "= function() ... end" (handy with simple callbacks).

svn cat svn://slugak.dyndns.org/public/lua-patches/do.patch
svn cat svn://slugak.dyndns.org/public/lua-patches/do.txt

Instead of patching Lua, one might consider luaSub, which contains a syntax mod for this same purpose.

Literals (hex, UTF-8) (5.1 beta)

Allows \x00..\xFFFF (hex) and \u0000..\uFFFF (UTF-8 encoded) characters within strings.

svn cat svn://slugak.dyndns.org/public/lua-patches/literals.patch
svn cat svn://slugak.dyndns.org/public/lua-patches/literals.txt

Mutate Operators (5.1-work6)

This patch adds mutate operators to Lua. Specifically, the ":=" operator can now be used for value assignment (or whatever else you wish) by attaching a "__mutate_asn" metamethod to a Lua object. Adding additional mutate operators (such as +=, or << for example) is straightforward.

Retrieve function arity with debug.getinfo (5.1.5)

Sometimes it's useful to know how many arguments a function expects. This patch allows you to specify 'a' to debug.getinfo (or alternatively to lua_getinfo), which makes the function's arity available in a field named 'arity'.
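
For comparison, Lua 5.2 and later expose similar information without a patch: the 'u' option of debug.getinfo fills in nparams and isvararg. The sketch below uses that standard interface rather than the 'a' option and 'arity' field that this patch adds, so it does not run on unpatched Lua 5.1.

```lua
-- Lua 5.2+: number of fixed parameters and whether the function is vararg.
local function f(a, b, c) end
local function g(a, ...) end

local arity = debug.getinfo(f, "u").nparams
local g_info = debug.getinfo(g, "u")
```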


Last edited February 24, 2014 9:47 pm GMT