  wiki

## A Need for Better Notation for Sequence Operations

Sequences are sequential streams of values that are produced by an iterator function. In the simplest form, they are represented by a function (or callable object) which returns a number of values each time it is called, with `nil` indicating the end. Sequences can be single-valued (like `io.lines`) or multi-valued (like `pairs`) and mostly they produce values of the same type. It is useful to extract some common operations we use with sequences and tables, and make them available as a library. This is one of the main goals behind the PenlightLibraries.

(Please note that this is an idiosyncratic definition: the Lua manual defines sequence as "a table where the set of all positive numeric keys is equal to {1..n} for some integer n" .)

With pl.seq together with the placeholder expressions provided by pl.func we can say:

```seq.printall(seq.filter(seq.list{1,2,3,4},Gt(_1,2)))
```

Which is fair enough, but awkward; I personally would find it hard to motivate this rather than a simple loop. Of course, this is a toy example, but more complicated combinations become even harder for humans to parse. Sequence adapters make it possible to express the same operation as a method chain:

```S{1,2,3,4}:filter(Gt(_1,2)):printall()
```

This is more readable, for several reasons. The first is that the qualifying `seq.` isn't everywhere, and the second is that we are culturally conditioned to read an operation sequence from left to right, unlike function application which is right to left. (This is also the famous pipeline metaphor made so popular by Unix.) Here is another example, where the length operator is mapped over a sequence of strings:

```S{'one','tw','t'} :map '#' :printall()  --> output: 3 2 1
```

It is more interesting to use a real sequence, such as generated by `io.lines`. This creates a sequence of all unique lines in a file, and then copies it to a table.

```ls = S(io.lines(fname)):unique():copy()
```

Another pattern which would be very convenient is to apply methods to all elements of the sequence:

```S{'[one]','[two]','[three]'}:sub(2,-2):upper():printall() --> output: ONE TWO THREE
```

Here is a more involved example; given a Lua code string, find all the variable names. `lexer.lua` will create a double-valued sequence, with the first value the token type and the second the token value. The first value is used to filter the token stream so that only variables ('iden') are passed through; `map(_2)` only lets the values through, `unique` collects the names, and finally `copy` turns these into a table.

```str = 'for i=1,10 do for j = 1,10 do print(i,j) end end'
ls = S(lexer.lua(str)):filter(Eq(_1,'iden')):map(_2):unique():copy()
print(List(ls))
---> output: {i,print,j}
```

The source is [here].

## Implementation

Since there is no type that uniquely indentifies a sequence (apart from it being callable) we need a wrapper object. The iterator function is put inside a table and a metatable is attached so we can control method lookup.

The `__index` metamethod is called whenever an unknown method is called, and the method is then looked up in the `seq` table. However, this function can't be used directly, since (a) it is expecting a sequence as its first argument and (b) it must return its results in a sequence wrapper so we can keep the method chain going.

```SMT = {
__index = function (tbl,key)
local s = seq[key]
if s then
return function(sw,...) return S(s(sw.iter,...)) end
end
end,
}

function S (iter)
return setmetatable({iter=iter},SMT)
end
```

This demonstrates the concept, but in practice doesn't handle all the necessary cases. Some `seq` functions return plain values (like `reduce`) or tables (like `copy`), and wrapping these would be confusing. Functions like `map` have their arguments the wrong way around (function first, then sequence). It would be convenient that `S` can wrap a table as a sequence, but should not do so for the result of `copy`. And so forth. And there is still the issue of how to do methods upon the sequence values. The key is this function in `seq`:

```function mapmethod (iter,name,arg1,arg2)
local val = iter()
local fn = val and val[name]
return function()
if not val then return end
local res = fn(val,arg1,arg2)
val = iter()
return res
end
end
```

Unlike plain `map`, `mapmethod` is given a function name, which it looks up in the first value generated by the sequence - thereafter returns a sequence which is the result of applying that function to every value. (As a convenience, the first two extra arguments are explicitly captured for the function call). So the strategy in the `__index` metamethod when it cannot look up a function name is to call `mapmethod` with the name of the method.

```
-- seqa.lua
local seq = require 'pl.seq'

-- can't look these directly up in seq because of the wrong argument order...
local overrides = {
map = function(self,fun)
return seq.map(fun,self)
end,
reduce = function(self,fun)
return seq.reduce(fun,self)
end
}

SMT = {
__index = function (tbl,key)
local s = overrides[key] or seq[key]
if s then
return function(sw,...) return SW(s(sw.iter,...)) end
else
return function(sw,...) return SW(seq.mapmethod(sw.iter,key,...)) end
end
end,
__call = function (sw)
return sw.iter()
end,
}

function callable (v)
return type(v) == 'function' or getmetatable(v) and getmetatable(v).__call
end

function S (iter)
if not callable(iter) then
if type(iter) == 'table' then iter = seq.list(iter)
else return iter
end
end
return setmetatable({iter=iter},SMT)
end

function SW (iter)
if callable(iter) then
return setmetatable({iter=iter},SMT)
else
return iter
end
end
```

RecentChanges · preferences
edit · history
Last edited July 4, 2012 9:25 am GMT (diff)