Simpler For Iterator


The current model of iterator "for" loops could perhaps be simplified. This page is both an attempt to explain the present model and an introduction to a possibly simpler alternative. It seems the complication is due to:

The current syntax reads:

for A, data in iterator_func, X, Y do block end

Data is the actual data returned by the function and later used in the block. A, X & Y are explained further below. Here is a possible implementation of both a collection iterator and a generator iterator, based on the tutorial example (written to be very explicit, and starting at 1 for a change):

-- collection iterator --
numbers = {1,3,5,7,9,11,13}
function coll_squares(coll)
    local function next_square(coll, index)
        if index > #coll then
            return nil
        end
        local n = coll[index]   -- local, to avoid creating a global
        return index+1, n*n
    end
    return next_square, coll, 1
end
for i, square in coll_squares(numbers) do print (square) end     --> OK

-- generator iterator --
function gen_squares(limit)
    local function next_square(limit, number)
        if number > limit then
            return nil
        end
        return number+1, number*number
    end
    return next_square, limit, 1
end
for n, square in gen_squares(7) do print (square) end     --> OK

So, what are A, X, & Y? In case of a collection:

In case of a generator:

It is difficult to find common ground in order to explain and name A, X & Y meaningfully. X is called 's' in the reference manual, and 'state' in the tutorial. In the reference manual, A is called var_1, while Y is called var. Here is an attempt to make sense of that:

[If anyone finds better names...] In addition to their use in yielding the next data, the mark and the range are also used together to know when to stop iterating. It is not trivial to guess what the iterator and the iterator func are supposed to return, what the func implicitly receives from Lua, and the proper order of all these values.
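To make the order of these values concrete, here is a minimal sketch of the loop written out by hand, following the equivalence given in the Lua 5.1 reference manual. The helper name expanded_loop is made up for this illustration; gen_squares is the three-value generator iterator from the example above.

```lua
-- Three-value generator iterator, as in the example above.
function gen_squares(limit)
    local function next_square(limit, number)
        if number > limit then
            return nil
        end
        return number + 1, number * number
    end
    return next_square, limit, 1
end

-- Hand-written equivalent of: for A, data in f, s, var do ... end
-- (expanded_loop is a hypothetical helper, not part of Lua)
function expanded_loop(f, s, var)
    local collected = {}
    while true do
        local a, data = f(s, var)       -- Lua calls f with X and the current mark
        if a == nil then break end      -- a nil first value ends the loop
        var = a                         -- A becomes the mark for the next call
        collected[#collected + 1] = data
    end
    return collected
end

-- gen_squares(3) returns f, X, Y in exactly the order the loop expects.
print(table.concat(expanded_loop(gen_squares(3)), " "))   --> 1 4 9
```

Read this way, the iterator returns (func, X, Y), and on each step the func receives (X, previous A) and returns (next A, data).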

The two iterators above (coll_squares and gen_squares) may be rewritten as follows:

-- collection iterator --
function coll_squares(coll)
    local index = 1
    local coll = coll       -- just to make things clear
    local function next_square()
        if index > #coll then
            return nil
        end
        local n = coll[index]   -- local, to avoid creating a global
        index = index+1
        return n*n
    end
    return next_square
end
for square in coll_squares(numbers) do print (square) end     -- OK

-- generator iterator --
function gen_squares(limit)
    local number = 1
    local limit = limit     -- ditto
    local function next_square()
        if number > limit then
            return nil
        end
        local n = number        -- local, to avoid creating a global
        number = number+1
        return n*n
    end
    return next_square
end
for square in gen_squares(7) do print (square) end     -- OK

There are a few small differences, all of them simplifications except for the last one:

The last point makes the mark (index or number) a local variable of the iterator, which the nested function (a closure) can reach as an upvalue. The "range" can likewise simply be a local variable or parameter of the iterator, so there is no need to pass it explicitly as an argument to the function. (Please correct anything that is wrong here, including terminology.)
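A small sketch may help to check the upvalue claim: each call to the iterator creates a fresh local mark, so two closures obtained from two calls advance independently. The variable names first, second, a1, a2, b1 and a3 are only for illustration.

```lua
function gen_squares(limit)
    local number = 1            -- the mark: a new local for every call
    return function()
        if number > limit then
            return nil
        end
        local n = number
        number = number + 1     -- updates this closure's own upvalue
        return n * n
    end
end

local first = gen_squares(3)
local second = gen_squares(3)

local a1 = first()      -- 1
local a2 = first()      -- 4
local b1 = second()     -- 1: second has its own `number`
local a3 = first()      -- 9: first was not disturbed by second

print(a1, a2, b1, a3)   -- prints 1, 4, 1 and 9, separated by tabs
```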

We can imagine more complex cases, e.g. specifying the generator interval. The additional data simply become parameters of the iterator:

-- generator iterator --
function gen_squares(start, stop, step)
    local number = start
    local function next_square()
        if number > stop then
            return nil
        end
        local n = number        -- local, to avoid creating a global
        number = number+step
        return n*n
    end
    return next_square
end
for square in gen_squares(3,9,2) do print (square) end     --> OK

The same holds if we make a collection iterator more complex (here rather artificially):

-- collection iterator --
-- no require needed: math is a standard library, available as a global
numbers = {1,3,5,7,9,11,13,15,17}
function coll_squares(coll, modulo)
    local index = 1
    local function number_filter()
        -- return next number in coll multiple of modulo, else nil
        while index <= #coll do     -- <= so that the last element is checked too
            local number = coll[index]
            if math.fmod(number, modulo) == 0 then
                return number
            end
            index = index+1
        end
        return nil
    end
    local function next_square()
        -- yield squares of multiples of modulo in coll
        local n = number_filter()
        if not n then
            return nil
        end
        index = index+1
        return n*n
    end
    return next_square
end
for square in coll_squares(numbers, 3) do print (square) end     --> OK

In all these cases, it seems A, X & Y are not needed. This way of implementing iterators makes good use of basic Lua features: functions as values, nested functions, closures/upvalues. So the question is: can we simplify the interface between the "for" syntax, the iterator, and the iterator func by getting rid of A, X & Y? If yes, a new syntax could be:

for data in iterator_func do block end
While the present one is:
for A, data in iterator_func, X, Y do block end
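It may be worth noting, as the closure examples above already suggest, that the one-variable form runs under the present syntax too: when the iterator returns only a function, X and Y are simply nil. The sketch below logs what Lua passes to the closure on each call. The table names calls and squares, and the parameter names x and mark, are made up for this illustration.

```lua
calls = {}      -- log of the arguments Lua passes to the closure

function gen_squares(limit)
    local number = 1
    return function(x, mark)
        -- Lua still calls the func with (X, previous first value);
        -- the closure does not need either of them.
        calls[#calls + 1] = tostring(x) .. "," .. tostring(mark)
        if number > limit then
            return nil
        end
        local n = number
        number = number + 1
        return n * n
    end
end

squares = {}
for square in gen_squares(3) do
    squares[#squares + 1] = square
end

print(table.concat(squares, " "))   --> 1 4 9
print(table.concat(calls, " "))     --> nil,nil nil,1 nil,4 nil,9
```

So under the present rules the first loop variable still doubles as the mark handed back to the func; a closure-based iterator is free to ignore it.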

As a consequence, the variety of iterators would no longer be covered globally by the syntax itself, in a rather complicated manner, but left to the user's implementation instead. It would certainly be easier to learn and explain both the syntax and the proper way to write an iterator for a given task.

The reference manual states:

<< f, s, and var are invisible variables. The names are here for explanatory purposes only. >>
In the present proposal, they do not exist at all. The necessary data (collection, bounds or whatever) is passed as parameters to the iterator, as is done now.

(first page formulation by DeniSpir)


Last edited November 13, 2009 3:41 pm GMT (diff)