On Sat, 16 Aug 2014 08:37:25 -0500
Andrew Starks <andrew.starks@trms.com> wrote:

> On Saturday, August 16, 2014, Jan Behrens
> <jbe-lua-l@public-software-group.org> wrote:
> 
> > On Fri, 15 Aug 2014 10:50:04 -0700
> > Coda Highland <chighland@gmail.com> wrote:
> >
> > > On Fri, Aug 15, 2014 at 10:47 AM,
> > > Doug Currie <doug.currie@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 15, 2014 at 1:16 PM,
> > > > Jan Behrens <jbe-lua-l@public-software-group.org> wrote:
> > > > >
> > > > > On Fri, 15 Aug 2014 13:37:41 -0300
> > > > > Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
> > > > >
> > > > > > If the goal is only to reduce calls to length,
> > > > > > another option would be to modify ipairs so that it
> > > > > > stops in the first nil entry with an index larger
> > > > > > than #t. That would have some nice properties:
> > > > >
> > > > > It sounds like a "hack" to me.
> > > >
> > > > It is very clever. I like it.
> > >
> > > I like it too!
> >
> >
> > Maybe I should explain what I meant with "hack":
> > it helps only if nil is an exception.
> >
> > In all other cases it doesn't speed things up.
> >
> > [...]
> >
> > Running this program with different approaches, I get the following
> > run-time:
> >
> > [...]
> >
> > Here, Roberto's approach would be slowest.
> >
> > [...]
> >
> > But as I previously said: more important than these benchmarks would be
> > (for me) to understand the semantics of ipairs first. In particular:
> >
> > Should
> >
> > * for i, v in ipairs(t) do ... end
> >
> > always do the same as:
> >
> > * for i = 1, #t do local v = t[i]; ... end
> >
> > Or should __ipairs allow specific optimizations and/or iterations over
> > objects that might not even have a length (or where the length can't be
> > determined easily).
> >
> >
> > Regards
> > Jan
> >
> >
> ipairs is the only way to get ordinal retrieval, as far as I can tell.
> That's why it's critical, unless I'm wrong about that.

What do you mean by "ordinal retrieval"? As of Lua 5.3.0-alpha, you
can always write:

  for i = 1, #t do local v = t[i]; ... end

instead of

  for i, v in ipairs(t) do ... end

if #t is defined and #t doesn't change during iteration.

Am I right here? Aren't both "ordinal retrieval"?

Therefore, I'd argue that in Lua 5.3.0-alpha, ipairs is merely
syntactic sugar... unless it's an explicit feature that the ipairs
iterator will evaluate the length multiple times. Is it?

I therefore also can't see why ipairs is "critical" in this case. If
Lua 5.3 removes the __ipairs metamethod, then I'd most probably
always use the arithmetic for (for i = 1, #t) instead of using ipairs.
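
To illustrate the equivalence claimed above, here is a minimal sketch
for a plain table without holes and without a metatable. The helper
names collect_numeric and collect_ipairs are made up for this example.

```lua
-- Collect all values using the arithmetic for loop.
function collect_numeric(t)
  local out = {}
  for i = 1, #t do
    out[#out + 1] = t[i]
  end
  return out
end

-- Collect all values using ipairs.
function collect_ipairs(t)
  local out = {}
  for i, v in ipairs(t) do
    out[i] = v
  end
  return out
end

-- For a proper sequence both loops visit the same items in the same order.
local t = {"a", "b", "c"}
local a, b = collect_numeric(t), collect_ipairs(t)
assert(#a == #b)
for i = 1, #a do
  assert(a[i] == b[i])
end
```

The two loops can only differ once the table has holes or once
metamethods come into play, which is exactly the open question here.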

> Other than that, iterators are not one of those things where I need
> consistency across libraries. I can always write an iterator as part of the
> module.
>
> [...]

There is a problem here: Sometimes you want to write a function
(my_function) that accepts a sequence of values (e.g. to apply
some functions to that sequence's values). There are several
ways of accepting such a sequence that I can think of:

* Require the caller to pass a raw table where integer keys
  from 1 to n must be assigned (i.e. a sequence). In this case,
  feeding something that is not a raw table into that function
  would require a prior cast, e.g.:

  local seq = {}
  for row in sql:rows("SELECT * FROM customer") do
    seq[#seq+1] = row
  end
  my_function(seq)

* Allow the caller to pass a function to my_function that gets
  called to retrieve each item.

  Cool, if sql:rows(...) returns a single function, but what if
  sql:rows(...) returns an iterator triplet, e.g.
  (sql_iteraux, db_result, 0)? Then one would need to write:

  local iter, res, pos = sql:rows("SELECT * FROM customer")
  my_function(function()
    local v
    pos, v = iter(res, pos)  -- advance the control variable
    return v
  end)

* Allow the caller to pass a function *and* arguments that get
  called to retrieve each item. Or even better: simulate the
  behavior of the generic for loop. Here you might write
  something like: 

  -- assume my_function looks something like this:
  function my_function(iter, state, pos)
    for i, v in iter, state, pos do
      -- do something with "v" here
    end
  end

  -- and then call it like:
  my_function(sql:rows("SELECT * FROM customer"))
  -- (passing three arguments at once)

  But this works only as long as the iterator triplet iterates
  over a pair of an integer i and the value v (see implementation
  of my_function above). If sql:rows() returns a single function
  that directly returns a row upon each call (without an integer
  as first return value), then my_function must look different:

  -- assume my_function looks something like this:
  function my_function(iter, state)
    for v in iter, state do
      -- do something with "v" here
    end
  end

  Either way, it's nasty to write (requires passing up to three
  arguments).

* Require the caller to pass an argument that has the length
  operator defined and allow retrieval through numeric keys:

  function my_function(seq)
    for i = 1, #seq do
      local v = seq[i]
      -- do something with "v" here
    end
  end

  This, however, requires translation if sql:rows returns a
  closure that retrieves rows successively and doesn't know
  about the total number of rows. In this case you would
  need to convert first:

  local seq = {}
  for row in sql:rows("SELECT * FROM customer") do
    seq[#seq+1] = row
  end
  my_function(seq)

  (same as in the first example)
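
The conversion loop used in the first and the last option could be
wrapped in a small helper. The following is only a sketch: collect and
fake_rows are hypothetical names, and fake_rows merely stands in for
something like sql:rows(...) that returns a single closure.

```lua
-- Hypothetical helper: drain a generic-for iterator triplet into a
-- plain sequence, keeping the first value of each iteration step.
function collect(iter, state, ctrl)
  local seq = {}
  for v in iter, state, ctrl do
    seq[#seq + 1] = v
  end
  return seq
end

-- Stand-in for sql:rows(...): a closure that yields three fake rows
-- and then nil.
function fake_rows()
  local rows = {"alice", "bob", "carol"}
  local pos = 0
  return function()
    pos = pos + 1
    return rows[pos]
  end
end

local seq = collect(fake_rows())
print(#seq, seq[1], seq[3])
```

This hides the boilerplate, but the caller still has to know that a
conversion is needed, and all rows are held in memory at once.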

So when you say "iterators are not one of those things where I need
consistency across libraries", I would dissent: It's really sad that
there is no consistent interface to pass "iterable" values. The
__ipairs metamethod of Lua 5.2 could be seen as such an interface. Here,
it's also defined whether an incrementing integer is returned as first
value by the iterator triplet, or not:  It is included!  Therefore,
every function knows how to handle the output of the ipairs iterator.
The function my_function in the examples above could be implemented
like this:

  function my_function(seq)
    for i, v in ipairs(seq) do  -- assuming __ipairs is respected
      -- do something with "v" here
    end
  end

And this function could work for a lot of different types (including
raw tables that don't have a metatable set).
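
As a sketch of what that could look like in practice: so that the
example does not depend on whether the running Lua version honours
__ipairs natively, it uses a hypothetical wrapper ipairs_with_meta that
looks up the metamethod itself. The countdown object is likewise made
up for illustration.

```lua
-- Hypothetical wrapper: honour an __ipairs metamethod if present,
-- otherwise fall back to the built-in ipairs.
function ipairs_with_meta(x)
  local mt = getmetatable(x)
  local handler = mt and mt.__ipairs
  if handler then
    return handler(x)
  end
  return ipairs(x)
end

function my_function(seq)
  local out = {}
  for i, v in ipairs_with_meta(seq) do
    out[i] = v
  end
  return out
end

-- An object that stores no items itself but knows how to be iterated.
local countdown = setmetatable({from = 3}, {
  __ipairs = function(self)
    return function(state, i)
      if i < state.from then
        return i + 1, state.from - i
      end
    end, self, 0
  end,
})

print(table.concat(my_function(countdown), " "))   -- 3 2 1
print(table.concat(my_function({"a", "b"}), " "))  -- a b
```

my_function itself does not need to know which kind of value it got.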

But even more is possible:

If we extended ipairs(...) in such a way that if a function is passed as
argument, ipairs would return an iterator triplet that iterates over an
increasing integer i and the values returned by that function, then
this would allow my_function to also accept simple function iterators:

do
  local function my_ipairs_funcaux(f, i)
    local v = f()
    if v ~= nil then  -- stop only on nil, so false is a valid item
      return i + 1, v
    else
      return nil
    end
  end
  function improved_ipairs(x)
    if type(x) == "function" then
      return my_ipairs_funcaux, x, 0
    else
      return ipairs(x)
    end
  end
end

do
  local letter = nil
  function my_iterator()  -- some example iterator
    if letter == nil then
      letter = "a"
    elseif letter == "z" then
      return nil
    else
      letter = string.char(string.byte(letter) + 1)
    end
    return letter
  end
  local abc = {"a", "b", "c"}
  for i, v in improved_ipairs(my_iterator) do
    print(i, v)
  end
  for i, v in improved_ipairs(abc) do
    print(i, v)
  end
  -- or in case of an SQL cursor:
  -- for i, v in improved_ipairs(sql:rows("SELECT * FROM xxx")) do
  --   print(i, v)
  -- end
end


Additionally, improved_ipairs could even be further extended such that
it accepts any form of iterator triplet (not just simple functions):

do
  local function my_ipairs_funcaux(f, i)
    local v = f()
    if v ~= nil then  -- stop only on nil, so false is a valid item
      return i + 1, v
    else
      return nil
    end
  end
  function supercool_ipairs(x, s, i)
    if type(x) == "function" then
      if s == nil and i == nil then
        return my_ipairs_funcaux, x, 0
      else
        local n = 0
        return function()  -- closure is not avoidable here
          n = n + 1
          local j, v, v2, v3, v4 = x(s, i)
          -- needs C implementation for variable number of values v1..vn
          if j == nil then
            return nil
          else
            i = j
            return n, j, v, v2, v3, v4
          end
        end
      end
    else
      return ipairs(x)
    end
  end
end
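
The comment in the code above asks for a C implementation to pass on a
variable number of values. One possible alternative, shown here only as
a sketch, is a vararg helper in plain Lua. The name counted_iter is
hypothetical, and the sketch covers only the triplet case (it does not
dispatch on the argument type like supercool_ipairs does).

```lua
-- Hypothetical: wrap any iterator triplet so that each step is
-- prefixed with an increasing counter n, passing on all values.
function counted_iter(iter, state, ctrl)
  local n = 0
  -- Receives all values of one iteration step as varargs.
  local function step(first, ...)
    if first == nil then
      return nil
    end
    ctrl = first  -- remember the control variable for the next call
    n = n + 1
    return n, first, ...
  end
  return function()
    return step(iter(state, ctrl))
  end
end

-- Works with a triplet such as the one returned by ipairs ...
for n, i, v in counted_iter(ipairs({"x", "y"})) do
  print(n, i, v)
end

-- ... and with a single function such as the one from string.gmatch.
for n, word in counted_iter(string.gmatch("foo bar", "%a+")) do
  print(n, word)
end
```

A closure is still needed per loop, as in the version above.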


Considering all this, my proposal for Lua 5.3 would be to make ipairs a
generic interface for (ordinal) iteration. That would mean:

* Keep the __ipairs metamethod to allow customized behavior

* Allow functions to be passed to the global ipairs function
  (according to one of the implementations above:
  improved_ipairs or supercool_ipairs)

I'm unsure, however, how the default ipairs should work if __ipairs is
undefined but __len and/or __index are defined. I guess it's a
matter of taste. Having the default ipairs respect __len would require
multiple evaluation of the length (as shown before). On the other hand,
if we keep having __ipairs, then this problem could be circumvented if
we really want/need to.


Regards
Jan