Hi!
Imagine the following situation: students are taking the Lua
exam.
Besides usual questions, this exam has a practical part: a
task to write some Lua program and optimize it for speed.
This is the task:
Write a function which returns a table containing a
fractal-like chain of values described below.
The chain resembles the Cantor set; its definition is
recursive:
This is a table containing a level 0 chain:
{"Lua"}
This is a table containing a level 1 chain
(concatenation of two level 0 chains):
{"Lua", "Lua"}
This is a table containing a level 2 chain (two level
1 chains with a nil in between):
{"Lua", "Lua", nil,
"Lua", "Lua"}
This is a table containing a level 3 chain (two level
2 chains with double nil in between):
{"Lua", "Lua", nil,
"Lua", "Lua", nil, nil, "Lua", "Lua", nil, "Lua", "Lua"}
And so on:
to create a chain of level n, concatenate two chains of
level (n-1) separated by (n-1) nils.
Your function should return a table containing a chain; the
level of the chain is passed to the function as an argument
(a non-negative integer).
The faster your function runs the more exam points you earn.
Four students are solving this task.
----
Student #1 uses simple recursion.
This is his solution:
$ cat student1.lua
local function append(n, t, idx)
-- appends a chain of level n to table t starting from
index idx+1
-- returns the last touched index
if n == 0 then -- append chain of level 0
idx = idx + 1
t[idx] = "Lua"
return idx
else -- recursively append chain of a higher level
idx = append(n-1, t, idx) -- append chain of level
(n-1)
idx = idx + (n-1) -- skip (n-1) nils
return append(n-1, t, idx) -- append chain of level
(n-1)
end
end
function solution1(level)
local t = {}
local last_index = append(level, t, 0)
return t, last_index
end
local level = tonumber(arg[1])
local t, last_index = solution1(level)
----
Student #2 has a deeper knowledge of Lua.
He knows that calling a function in Lua is an expensive
operation.
So he decides to use a loop instead of recursion.
He also knows that Lua has a special function to copy a
continuous range of indices:
table.move() is written in C, so it should work really fast.
This is his solution:
$ cat student2.lua
function solution2(level)
local t = {"Lua"} -- chain of level 0 is already
in the table
local last_index = 1 -- last touched index
local table_move = table.move
for n = 1, level do
-- skip (n-1) nils and append a copy of the current
content of the table
table_move(t, 1, last_index, last_index + n)
last_index = 2*last_index + n-1
end
return t, last_index
end
local level = tonumber(arg[1])
local t, last_index = solution2(level)
Student #2 is absolutely sure that his code would be faster
than the program of student #1.
But a benchmark for level 20 shows that the code written by
student #1 is significantly better.
(A chain of level 20 consists of 2097131 elements: 1048576
strings and 1048555 nils.)
$ lua -v
Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio
$ time lua student1.lua 20
real 0m0.395s
user 0m0.352s
sys 0m0.040s
$ time lua student2.lua 20
real 0m4.498s
user 0m4.432s
sys 0m0.060s
Student #2 says "Damn it, table.move is a slow shit, I'll
never use it again!" and leaves the classroom slamming the
door.
Obviously, he is very disappointed in the benchmark results.
----
Question to Roberto:
How would you explain to student #2 why table.move() is
so slow despite having been implemented in C?
----
Student #3 is a smart guy.
He has noticed that the number of nils after the N-th string
equals the number of trailing zeroes in the binary
representation of the number N.
He also knows the formula of the number of trailing zeroes:
LOG_2(XOR(N,N-1)+1)-1
math.log() is slow in Lua, so he uses a table lookup
instead.
This is his solution:
$ cat student3.lua
function solution3(level)
local step = {}
for j = 1, level do
step[(1<<j)-1] = j
end
local t = {"Lua"}
local last_index = 1
for j = 1, (1<<level)-1 do
last_index = last_index + step[j ~ (j-1)]
t[last_index] = "Lua"
end
return t, last_index
end
local level = tonumber(arg[1])
local t, last_index = solution3(level)
This code runs 1.3 times faster than the solution #1:
$ time lua student3.lua
20
real 0m0.296s
user 0m0.260s
sys 0m0.036s
----
Student #4 is an experienced Lua programmer.
He avoids all pitfalls.
His solution runs twice as fast as the solution #3.
$ time lua student4.lua
20
real 0m0.142s
user 0m0.124s
sys 0m0.020s
----
A mini-exam for the reader:
How has student #4 solved the task?
(You might want to stop here and find
your own candidate for "solution #4"
before continuing reading this thread)