lua-l archive


Because string.len(str) has to look up both string and str in the
globals, while str:len() (and str.len(str)) can do a register copy
instead of a second lookup.
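
One way to check the lookup counts is to run each call form in an environment that records every global read. This is only a sketch: `count_global_reads` and `values` are hypothetical names invented for this illustration, and it assumes `str` is a global, as `gstr` is in the quoted test. It uses `setfenv` where available (Lua 5.1, LuaJIT) and `load` with an environment argument otherwise (Lua 5.2 and later).

```lua
-- Hypothetical backing table: the "globals" each snippet can see.
local values = { str = "something" }

-- Hypothetical helper: run src and return a table of global-read counts.
local function count_global_reads(src)
  local reads = {}
  -- The environment stays empty, so every global read goes through __index.
  local env = setmetatable({}, {
    __index = function(_, name)
      reads[name] = (reads[name] or 0) + 1
      local v = values[name]
      if v == nil then v = _G[name] end
      return v
    end,
  })
  local chunk
  if setfenv then                       -- Lua 5.1 / LuaJIT
    chunk = assert(loadstring(src))
    setfenv(chunk, env)
  else                                  -- Lua 5.2+
    chunk = assert(load(src, "snippet", "t", env))
  end
  chunk()
  return reads
end

local a = count_global_reads("return string.len(str)")
local b = count_global_reads("return str:len()")
local c = count_global_reads("return str.len(str)")

print("string.len(str)", a.string, a.str)  -- string read once, str read once
print("str:len()", b.string, b.str)        -- string never read, str read once
print("str.len(str)", c.string, c.str)     -- string never read, str read twice
```

With a global `str`, the dotted form `str.len(str)` reads `str` twice; the second read becomes a plain register copy only when `str` is a local.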

/s/ Adam

On Thu, Aug 7, 2014 at 8:24 AM, Mason Mackaman <masondeanm@aol.com> wrote:
> I understand why str:len() is faster than str.len(str). What I don’t understand is why str:len() is faster than string.len(str).
> On Aug 7, 2014, at 9:39 AM, Thiago L. <fakedme@gmail.com> wrote:
>
>>
>> On 07/08/14 10:20 AM, Mason Mackaman wrote:
>>> Why is “get self ‘len’” faster than “get ‘string’, get ‘len’”?
>>> On Aug 7, 2014, at 8:14 AM, Thiago L. <fakedme@gmail.com> wrote:
>>>
>>>> On 07/08/14 10:10 AM, Mason Mackaman wrote:
>>>>> for the following, ‘str’ is some string value.
>>>>>
>>>>> So I just tested the speed of str:len() and string.len(str) to see how much slower str:len() would be, but to my surprise it was actually faster. How is that possible based on what Lua says about how indexing of tables works? When you call string.len(str) Lua has to get the ‘len’ key in the string table. According to gettable_event(table,key) in the reference manual, Lua will see that ‘string’ is a table, then rawget ‘len’, check to see that it’s not nil and then return it. In str:len() Lua has to check whether ‘str’ is a table; that will fail, so now it finds the __index key in ‘str’s metatable, sees that it’s ‘string’, checks if it’s nil, and checks if it’s a function, then finally calls string[‘len’]. How can that be faster?! It has to do a whole bunch of steps just to get to the same spot string.len(str) starts!
>>>>>
>>>>> While I was writing this it occurred to me that the only possible explanation was the ‘:’ operator (let me know if operator is the right word to use there). So I checked str.len(str) and sure enough it was slower than string.len(str). Reading the reference manual we see that in str:len(), ‘str’ is only evaluated once, and in str.len(str), ‘str’ is evaluated twice, so it makes sense why the former is faster than the latter, but ‘str’ is only evaluated once in string.len(str), and you don’t have the extra steps to get ‘string.len’.
>>>> get "string", get "len" from it, get "str", call it
>>>> vs
>>>> get "str", get self "len" from it, call it
>>>> vs
>>>> get "str", get "len" from it, get "str" (again), call it
>>>
>> get self "len" does both the "get 'len' from it" and "get 'str' (again)" parts in C...
>>
>> if you wanna compare str:len() vs str.len(str), you should try:
>>
>> ```
>> -- global gstr
>> gstr = "something"
>> local n = 1e6
>> local c;
>> local osclock = os.clock -- cache the function itself (no call here); locals are faster
>>
>> -- test 1
>> c = osclock()
>> for i=1,n do
>>    gstr:len()
>> end
>> print(osclock() - c)
>>
>> -- test 2
>> -- make sure to include the local declaration in the loop or else it's not the same
>> c = osclock()
>> for i=1,n do
>>    local str = gstr
>>    str.len(str)
>> end
>> print(osclock() - c)
>> ```
>>
>> the 1st test should output bytecode that gets the global "gstr" (which is a table access), then gets self "len" from it (which is basically a metatabled table get + a register copy), then calls it (or something like that, I don't remember the exact bytecode pattern)
>>
>> the 2nd test should output bytecode that gets the global "gstr" (table access) into the local "str", then gets "len" from it (metatabled table get), then copies "str" into the next register (register copy), then calls it
>>
>
>
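
The comparison in the quoted test can be extended to all three call forms. The following is a minimal sketch, not a rigorous benchmark: it assumes `str` is a local, so reads of `str` are register accesses and only `string` is looked up in the globals. The `results` table and its labels are made up for this illustration, and absolute timings will vary by machine and Lua version.

```lua
-- Sketch only: timings depend on the machine and the Lua version.
local str = "something"   -- local, so reading str is a register access
local n = 1e6
local clock = os.clock    -- cache the function itself; do not call it here

local results = {}        -- hypothetical: label -> elapsed CPU seconds
local t

-- Form 1: global lookup of string, field get of len, then the call.
t = clock()
for i = 1, n do
  string.len(str)
end
results["string.len(str)"] = clock() - t

-- Form 2: method call; the function and the self argument come from str.
t = clock()
for i = 1, n do
  str:len()
end
results["str:len()"] = clock() - t

-- Form 3: index str for len, then pass str again as the argument.
t = clock()
for i = 1, n do
  str.len(str)
end
results["str.len(str)"] = clock() - t

for label, secs in pairs(results) do
  print(label, secs)
end
```

Running each loop several times and comparing the smallest timing per form gives a steadier picture than a single pass.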