[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Proposal for a new construct in Lua
- From: Michael Rose <michael.h.rose@...>
- Date: Sat, 14 May 2011 19:26:18 +0200
Hello,
I've read many of the discussions about 'switch' statements, the
'continue' statement and 'computed goto' in the mailing list as well as
in the wiki.
I have programmed my own 'continue' statement for Lua 5.2-alpha just to
get a feeling for the parser though there is a patch for an earlier
version of Lua available in the WIKI. I won't propose my implementation
here (though its fully functional), because I think its a matter of
taste if this construct is really necessary in the language or just
some avoidable blow up. I think I would appreciate general gotos and
labels in the language as well as line number annotations just for the
purpose to be able to build something like Metalua which outputs direct
Lua code instead of using its own bytecode-emitter with all the
maintainance issues and compatibility problems (Metalua doesn't work
with Luajit), but I understand, that Luajit has effiency rationales
against such a general goto-construct. I agree that new language
constructs must be carefully selected.
But still there is one kind of construct, which I find quite usefull and
therefore I just want to ask the community and especially the core Lua
team, whether this construct might be valuable or not for a later Lua
release. I call it jumptable and it has something in common with
a computed goto (without labels) as well as of a switch statement
(reduced basically to integers). My point of view is that a switch
statement would be pointless (just syntactic sugar) if it is internally
mapped to an if-elseif chain. But often there is a need for state
machines or code selection based on integers or string characters (->
string.format) which leads to very compact data coding. I know that
jumps on e.g. integers can be realized by
function tables in O(1) time, but for inner loops and especially when it
seems to be natural to deal with common local variables in the different
"jump targets", functions have some overhead and are also a bit clumpsy.
I attached a readme file about my rationale as .txt file and also a
zipped patch for Lua 5.2-alpha, which demonstrates this new language
construct with some (admittedly) silly examples just to give some ideas.
Note that this patch seems to work without restrictions but is only
roughly testet by me. It serves just to explain the idea.
Discussion welcome ...
Regards,
Michael Rose
Here is some code snipped, which demonstrates the basics:
=========================================================
-- File: tst_format.lua
-- One example for jump tables in format issues.
local format = "We have %d%% off for %s in %s!\n"
local args = {10,"butter","our nice grocery"}
function my_format(p,fmt,...)
local args = {...}
local start,i,n,argi = 1,1,#fmt,0
while i < n do
if fmt:sub(i,i) == "%" then
if start < i then p(fmt:sub(start,i-1)) end
start = i + 2
i = i + 1
-- NEW JUMP TABLE CONSTRUCT:
jump val = fmt:byte(i) in "sSdDcC" do
"dD" do args[argi+1] = tostring(args[argi+1]) end
"sS" do argi = argi+1; p(args[argi]) break end
"cC" do argi = argi+1; p(args[argi]:sub(1,1)) break end
-- add other values...
do p(string.char(val)) end -- default target
end
-- END OF NEW JUMP TABLE CONSTRUCT.
end
i = i + 1
end
if start <= i then p(fmt:sub(start,i)) end
end
my_format(io.write, format, table.unpack(args))
-- End of File: tst_format.lua
This is a proposal for a jump table construct, which could be
included into Lua.
A roughly tested patch for 'lua-5.2.0-alpha.tar.gz'
is given as attachment 'lua-5.2.0-alpha-jumptable.patch'.
No warranty is given for correctness! Use at your own risk.
In addition this patch inherits the license of Lua.
The current jump table syntax is given by
jump ::= 'jump' name '=' init 'in' rng (',' rng)... 'do' entry... 'end'
entry ::= entrylist 'do' block 'end'
| default
default ::= 'do' block 'end'
entrylist ::= ( entryrange ',' )... entryrange
entryrange ::= ( item '..' )... item
init ::= int_expression
rng,item ::= int_literal | string_literal
Example (with "i" as index into string "str"):
jump chr = str:byte(i) in 0,255 do
"a".."z" do print("lower") break end
"A".."Z" do print("higher") break end
32,"_-:" do print("delimiter") break end
0 .. 31, 127 .. 255 do print("unprintable") break end
do print("other") end
end
Use the 'luac -l <name>.lua' method to get insight in the implementation
with the corresponding opcode listing.
* Each "int_literal" represents the corresponding number.
* Each "string_literal" represants the ordinal ascii numbers of its
characters.
* Ranges of items are represented by the ".." operator.
* Items and ranges of items can be handled by a common block when they
are combined by "," or ";" keyword.
* A bare "do" block "end" represents a unique default action.
* The whole jump table responds to the break statement.
* A break statement at the end of each entry block must be used to
avoid fall through (like the C switch statement).
* The jump table is just as large that the minimum and maximum numbers
specified by 'rng' values are allowable entry items.
Rationale:
==========
* This construct is different than a switch statement, which has been
heavily discussed in the mailing list in the past.
* It is strictly and fast compiled as a one pass construct fitting nicely
into the general compiler construction of Lua.
* The ranges "low" and "high" are needed to specify the ranges and
the size of the table in the opcode list. The table consists of
1 + (high - low + 1) jump opcodes (the first jump corresponds to
the default action. The jump table is constructed first and is
patched correctly while the entry items are parsed.
* A new opcode OP_JUMPTAB has been introduced to perform the indirect
jump in the virtual machine.
* "init" must be a Lua number. All numerical values are excepted,
but fractional values, values out of range and values without entry items
are mapped to the default action.
* If the default action is not present, it is equivalent to a
break statement, leaving the whole construct.
* The jump table need not be fully populated, the default action is
patched to every entry, which is not defined. This allows sparse
jump tables.
* Each entry item must be in the range of the table, otherwise
an error is thrown at compile time.
* Jump tables are faster than if..elseif..elseif.. constructs. They are
most useful in combinatin with integers. Ascii codes are very convenient
in certain applications, therefore they are included in the proposal.
* All other uses of switch like statements can be represented by
the usual if..elseif.. constructs, because this essentially corresponds
to the compiled code if jump tables cannot be used.
* Jump tables have advantages in comparison to function tables, because
the jump table construct avoids closure creation and can use
all local variables of the lexical environment. This was the main
rational to investigate it. If the functions from a function table
needs some common upvalues from the local environment, their closures
have to be constructed in the inner loop of such constructs.
Jump tables are also very compact in the final code size.
Some things which should be discussed:
======================================
* For me the fall through similar to the C switch statement can be very
useful in certain applications.
But to avoid pitfalls, the syntax could be changed, so that the normal
case is no fall through (like an implicit break statement at the end
of each entry block). An explicit fall through could then be
realized by an additional keyword like
entry ... do ... fallthrough end
entry ...
* The keywords are arbitrary and can be changed to anything useful
if this patch is eventually accepted by the community.
I tried to use as much of already available keywords as made sense to me.
* Strings are not accepted as "init" value. This could be changed
eventually (e.g. the first ascii ordinarial would be used there as
integer value). I have not done it, because I need to subtract
an offset from the "init" value before the new opcode takes control.
This was done to avoid additional opcodes etc.
A more general implementation might as well map all other values
to the default clause.
Michael Rose
Attachment:
lua-5.2.0-alpha-jumptable.patch.gz
Description: GNU Zip compressed data