lua-l archive

On 07/08/14 12:24 PM, Mason Mackaman wrote:
I understand why str:len() is faster than str.len(str). What I don’t understand is why str:len() is faster than string.len(str).
On Aug 7, 2014, at 9:39 AM, Thiago L. <fakedme@gmail.com> wrote:

On 07/08/14 10:20 AM, Mason Mackaman wrote:
Why is “get self ‘len’” faster than “get ‘string’, get ‘len’”?
On Aug 7, 2014, at 8:14 AM, Thiago L. <fakedme@gmail.com> wrote:

On 07/08/14 10:10 AM, Mason Mackaman wrote:
for the following, ‘str’ is some string value.

So I just tested the speed of str:len() and string.len(str) to see how much slower str:len() would be, but to my surprise it was actually faster. How is that possible, based on what the reference manual says about how table indexing works? When you call string.len(str), Lua has to get the ‘len’ key in the string table. According to gettable_event(table, key) in the reference manual, Lua will see that ‘string’ is a table, rawget ‘len’, check that it’s not nil, and return it. In str:len(), Lua has to check whether ‘str’ is a table; that fails, so it finds the __index field in the metatable of ‘str’, sees that it’s ‘string’, checks whether it’s nil, checks whether it’s a function, then finally calls string[‘len’]. How can that be faster?! It has to do a whole bunch of steps just to get to the same spot where string.len(str) starts!

While I was writing this it occurred to me that the only possible explanation was the ‘:’ operator (let me know if operator is the right word to use there). So I checked str.len(str), and sure enough it was slower than string.len(str). Reading the reference manual, we see that in str:len() ‘str’ is only evaluated once, and in str.len(str) ‘str’ is evaluated twice, so it makes sense why the former is faster than the latter. But ‘str’ is only evaluated once in string.len(str), and you don’t have the extra steps to get ‘string.len’.
get "string", get "len" from it, get "str", call it
vs
get "str", get self "len" from it, call it
vs
get "str", get "len" from it, get "str" (again), call it
get self "len" is a single VM instruction (SELF) that does both the "get 'len' from it" and "get 'str' (again)" parts in C...
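
The lookup path described above can be checked from Lua itself. This is a minimal sketch, assuming a stock interpreter with the standard string library loaded; the variable names are only for illustration:

```lua
-- Assumption: the standard string library is loaded (the default in the
-- stock interpreter), so strings have a shared metatable.
local str = "something"

-- The shared string metatable's __index field is the `string` table,
-- so str.len and string.len resolve to the same function.
local mt = getmetatable(str)
print(mt.__index == string)   -- true
print(str.len == string.len)  -- true

-- All three call forms return the same result; they differ only in
-- how the function and its argument are fetched.
print(str:len(), str.len(str), string.len(str))
```

So the three forms do the same work once the call happens; any speed difference comes from the lookups before it.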

if you wanna compare str:len() vs str.len(str), you should try:

```
-- global gstr
gstr = "something"
local n = 1e6
local c;
local osclock = os.clock -- store the function itself (not its result); locals are faster

-- test 1
c = osclock()
for i=1,n do
    gstr:len()
end
print(osclock() - c)

-- test 2
-- make sure to include the local declaration in the loop or else it's not the same
c = osclock()
for i=1,n do
    local str = gstr
    str.len(str)
end
print(osclock() - c)
```

the 1st test should output bytecode that gets the global "gstr" (which is a table access), then gets self "len" from it (which is basically a metatabled table get + a register copy), then calls it (or something like that, I don't remember the exact bytecode pattern)

the 2nd test should output bytecode that gets the global "gstr" into the local "str" (table access), then gets "len" from it (metatabled table get), then copies "str" into the next register (register copy), then calls it
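
To put string.len(str) from the original question next to those two tests, something like the following could be used. It is only a sketch: `time_it` is a hypothetical helper, not part of any library, and the numbers will vary with the Lua version and machine, so no expected timings are shown.

```lua
-- Hypothetical helper: runs fn once and returns the elapsed CPU time.
local function time_it(label, fn)
  local start = os.clock()
  fn()
  local elapsed = os.clock() - start
  print(label, elapsed)
  return elapsed
end

gstr = "something"  -- global, as in the two tests above
local n = 1e6

-- get "gstr", get self "len", call
local t_method = time_it("gstr:len()", function()
  for i = 1, n do gstr:len() end
end)

-- get "string", get "len", get "gstr", call
local t_library = time_it("string.len(gstr)", function()
  for i = 1, n do string.len(gstr) end
end)

-- cache the function in a local first: only "gstr" is still a table access
local len = string.len
local t_cached = time_it("len(gstr)", function()
  for i = 1, n do len(gstr) end
end)
```

The cached-local variant is included because it removes both the "string" and the "len" lookups from the loop.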


in 5.2+, string.len(str) is exactly the same as _ENV.string.len(str); in 5.1 it's basically getfenv().string.len(str). In both cases "string.len(str)" does a table access for "string" (the global env is just a table), then another for "len" (and another for "str", if it's a global)

in 5.2+ you could also write it as _ENV["string"]["len"](_ENV["str"])
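
A quick way to see that these spellings are equivalent, as a sketch that assumes "str" is a global and picks the branch for whichever Lua version is running (the name `same` is only for illustration):

```lua
str = "something"  -- global, so it lives in the environment table

local same
if _ENV then
  -- Lua 5.2+: globals are fields of the _ENV table.
  same = _ENV.string == string
     and _ENV["string"]["len"](_ENV["str"]) == string.len(str)
else
  -- Lua 5.1: there is no _ENV; getfenv() returns the environment table.
  local env = getfenv()
  same = env.string == string
     and env.string.len(env.str) == string.len(str)
end
print(same)
```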