- Subject: Re: __pairs metamethod - is it in?
- From: Rici Lake <lua@...>
- Date: Mon, 7 Feb 2005 13:05:34 -0500
On 7-Feb-05, at 12:01 PM, Mike Pall wrote:
I think the real reason why 'for k,v in t' is deprecated is that it's
both ambiguous and slow. At compile time it is not clear whether t is
a table or a function. Therefore a (slow) runtime check in the form of
an extra VM instruction (OP_TFORPREP) is needed for _every_ iterator loop.
TFORPREP is executed once per for loop, not once per iteration.
In any event, the vast majority of for loops (at least in my code)
use pairs, whether explicitly or not. Without OP_TFORPREP, the
explicit invocation would require three op codes: GETGLOBAL, CALL,
JUMP. I fail to see how this is an optimization.
(I do think TFORPREP is ugly, though.)
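To make the comparison concrete, here is a minimal sketch of what the explicit form does: pairs is called once, before the first iteration, and for a plain table it hands back an iterator triple equivalent to (next, t, nil). The table and variable names are illustrative only, and the sketch assumes a table with no metatable.

```lua
local t = { x = 1, y = 2, z = 3 }

-- pairs(t) runs once, before the loop starts, and returns the
-- iterator function, the state (the table itself) and an initial control value.
local f, s, ctl = pairs(t)

-- Explicit form: one global lookup and one call, then the loop proper.
local a = {}
for k, v in pairs(t) do a[k] = v end

-- The same loop with the triple written out by hand, no call to pairs.
local b = {}
for k, v in next, t, nil do b[k] = v end
```

Both loops visit the same key-value pairs; the only difference is the setup performed once before the first iteration.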
And there's an implicit lookup of 'next' in the global table, too
(which is ugly from a VM design standpoint).
Agreed. next is only looked up once but it is called for every
iteration. I experimented with a hack which directly calls the
internal next function, only lua_calling if the iterator function
is not next; this created a measurable speedup on trivial loops
but I doubt whether it is enough to justify the hackery.
Whenever this compatibility check is gone, you can still write
'for k,v in obj' provided you add a __call metamethod to the table
(or userdata). Providing a default (implicit) __call metamethod for
plain tables may be an interesting way to solve the compatibility problem
(alas, one side-effect is that _all_ tables are then callable objects, too).
That would effectively mean that the __call metamethod becomes the
default iterator generating function. This is possible, but a
__pairs metamethod seems to me to better capture the semantics.
The other argument is that most containers do not have just a single
iterator.
That's arguable. I would say that most containers (if not all) do have a
canonical iterator, although some have additional iterators.
There are two standard iterators for plain tables and more could
be conceived. And complex objects may provide dozens of iterators ...
So which one do you expect to be used when you write 'for i in obj'?
Which one do the readers of your code expect to be used?
The canonical one :) That should be documented in the container
description.
It is actually quite rare that both pairs and ipairs are used on the
same table; in fact, if a __pairs metamethod existed, I would
probably make ipairs the canonical iterator generator for my vectors
(i.e. add __pairs = ipairs to the metatable).
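A sketch of that idea, assuming an interpreter in which pairs honours a __pairs metamethod (Lua 5.2 and later do; at the time of this message it was only a proposal). The vector contents and the label field are made up for illustration.

```lua
-- A "vector": an array part plus one bookkeeping field.
local vec = setmetatable({ 10, 20, 30, label = "v" }, { __pairs = ipairs })

local keys = {}
for k, v in pairs(vec) do      -- pairs dispatches to ipairs(vec) via __pairs
  keys[#keys + 1] = k
end
-- keys holds only the array indices, in order; "label" is never visited.
```

With this metatable, a caller that writes pairs(vec) gets array-order traversal without having to know that ipairs is the right choice for this container.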
Going with 'explicit is better than implicit', it's just good programming
practice to spell out _which_ iterator you want to use.
That's debatable, but I could live with it. However, if you deny the
fact that tables have canonical iterators, then you are at risk of
creating a requirement that the consumer of a collection know what
the name of the canonical iterator for the collection is. That would
go against the 'separation of concerns' programming practice. (And
it is really a major pain if you use alternative collection
implementations.)
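The separation-of-concerns point can be sketched as follows, again assuming a pairs that honours __pairs (Lua 5.2 or later). The count function and the new_bag constructor are hypothetical names invented for this example.

```lua
-- Consumer: knows nothing about how the collection is implemented.
local function count(coll)
  local n = 0
  for _ in pairs(coll) do n = n + 1 end
  return n
end

-- Implementation A: a plain table.
local plain = { a = 1, b = 2 }

-- Implementation B (hypothetical): items live in a private field, and
-- __pairs exposes them as the collection's canonical iteration.
local function new_bag(items)
  return setmetatable({ _items = items }, {
    __pairs = function(self) return next, self._items, nil end,
  })
end
local bag = new_bag({ a = 1, b = 2, c = 3 })

local n_plain = count(plain)
local n_bag = count(bag)
```

The consumer never names an iterator, so swapping one implementation for the other requires no change to count.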
Remember that
your code is much more often read than written. Therefore it is more
important to convey your intentions clearly than to save a few
keystrokes.
My intention with 'for k, v in t do' is to iterate over all
the key-value pairs in t in the way that makes the most sense
for that table. Currently I'm forced to rewrite the global
"pairs" function to get this effect, which hides an
important implementation detail from potential readers of my code.
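A rough sketch of that kind of replacement, written in plain Lua so it does not depend on built-in __pairs support. The name meta_pairs is an assumption for this example; the actual code referred to above is not shown in this message.

```lua
-- Look for a __pairs field in the metatable; fall back to the stock behaviour.
local function meta_pairs(t)
  local mt = getmetatable(t)
  local gen = type(mt) == "table" and rawget(mt, "__pairs")
  if gen then
    return gen(t)            -- canonical iterator generator for this object
  end
  return next, t, nil        -- ordinary traversal for a plain table
end

-- Installing it globally would be: pairs = meta_pairs

local vec = setmetatable({ "a", "b", extra = true }, { __pairs = ipairs })
local out = {}
for i, v in meta_pairs(vec) do out[i] = v end

local plain_seen = 0
for _ in meta_pairs({ p = 1, q = 2 }) do plain_seen = plain_seen + 1 end
```

A reader who sees only pairs(vec) at the call site has no way to tell that the global was swapped out, which is the hidden detail complained about above.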
Notwithstanding all of the above, I could actually live with
explicitly using pairs if that function invoked the canonical
iterator generator for its argument by means of a __pairs metamethod.
The current "implicit pairs" semantics is ugly because it does
not allow a table with a __call metamethod to be used
as an iterator (*not* an iterator generator), which means that
tables with __call metamethods cannot consistently be used as
functions. Requiring the explicit use of the global 'pairs'
would solve this minor wart, but at some cost (the extra
opcodes and function call for every for loop). I'm not sure
that it is worth it, but I have no strong objections either.
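For illustration, here is a minimal sketch of a table with a __call metamethod used directly as an iterator. It assumes an interpreter without the implicit-pairs rule (Lua 5.1 and later), where the generic for simply calls whatever value it is given; the counter constructor is a hypothetical name.

```lua
-- Returns a callable table that yields 1, 2, ..., limit and then stops.
local function counter(limit)
  local state = { n = 0, limit = limit }
  return setmetatable(state, {
    -- The generic for calls the object with (state, control); both are
    -- ignored here because the table carries its own state.
    __call = function(self)
      if self.n < self.limit then
        self.n = self.n + 1
        return self.n
      end
      -- Falling off the end returns nothing, which ends the loop.
    end,
  })
end

local seen = {}
for i in counter(3) do seen[#seen + 1] = i end
```

Under the implicit-pairs rule the same loop would instead traverse the fields of the table returned by counter, which is the inconsistency described above.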