Based on my comment in the thread on nil as a table key, I did a proof-of-concept implementation of the suggested iteration protocol.

The benchmarking I did shows speedups of 6-10%, but one test seems to indicate a speedup of 33%.

The diff is not very big; it can be viewed here: <https://trac.bulix.org/projects/rici/changeset/84> or downloaded as a diff <https://trac.bulix.org/projects/rici/changeset/84?format=diff>. The latter URL includes the benchmarking program I used (test/benchmark.lua).

It does not include my patch for yieldable iterators, but the equivalent patch would be much simpler for the modified iteration protocol since the termination test could be implemented with an unmodified OP_TEST. I didn't spend a lot of time thinking about optimizations or things that could be removed. All I did was change OP_TFORLOOP to use the new protocol, and modify pairs(), string.gmatch() and io.lines(). (ipairs() needs to be modified as well, but I didn't do that. Oops.)

The implementation differs slightly from the description in my earlier message: it stops on any false return from the iterator rather than only on a nil return. Once the state is separated from the iteration variables this change seemed feasible, but it may not be the best choice.

Here are the benchmark results, compiled with -O2 on gcc 3.3.3 running on FreeBSD:

1) Simple for loop; iterates over 10 numeric and 10 non-numeric keys in the same table. Presumably the speedup is simply from not having to copy the control variable in OP_TFORLOOP.
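
The loop being timed looks roughly like this (a sketch of the shape only, not the exact benchmark code; the key names are made up):

local t = {}
for i = 1, 10 do t[i] = i end           -- 10 numeric keys
for i = 1, 10 do t["k" .. i] = i end    -- 10 non-numeric keys

local sum = 0
for k, v in pairs(t) do                 -- the generic for driven by OP_TFORLOOP
  sum = sum + v
end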

beta-5.1: 7.02 seconds (1e6 repetitions)
modified: 6.59 seconds

saving: 6%

2) string.gmatch. Uses the standard idiom str:gmatch("%S+") to iterate over a string with 12 short words. Here the speedup is rather more marked, since we can avoid creating a closure on each loop.
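
The timed loop is just that idiom, roughly (a sketch; the string here is a stand-in, not the one from the benchmark):

local str = "one two three four five six seven eight nine ten eleven twelve"
local n = 0
for word in str:gmatch("%S+") do  -- beta-5.1 builds a fresh closure per str:gmatch() call;
  n = n + 1                       -- the modified version avoids that (see Note 1 below)
end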

beta-5.1: 9.02 seconds (1e6 repetitions)
modified: 8.18 seconds

saving: 9.3%

Note 1: Of course, you can't actually implement gmatch() without creating a closure, even with the suggested iteration protocol. However, you don't have to create a closure every time; you only need one per pattern. The sample implementation provides the API string.gmatcher(pattern), which returns such a closure; the result can be passed to string.gmatch() in place of the pattern argument. (It also makes it possible to give string.gmatch() a starting position in the string.) I didn't spend a lot of time thinking about the API, but the results can be inspected (see below).
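
Roughly, the intended usage is something like this (a sketch of the API as described above; the example data is made up):

local split = string.gmatcher("%S+")      -- one closure per pattern, built once
local lines = { "alpha beta gamma", "delta epsilon" }
for i = 1, #lines do
  for word in lines[i]:gmatch(split) do   -- pass the matcher in place of the pattern
    print(word)
  end
end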

Note 2: The above timings are done with os.clock(), but I also timed the complete benchmark of a million iterations of each of the above tests. The results:

beta-5.1: 16.12 seconds (user)
modified: 14.84 seconds

saving: 7.9%

3) A simple(ish) implementation of insertion-ordered multiple-value tables, the sort of data structure you might use for storing HTTP headers or LDAP entries. This is a fairly complex piece of code, so it ought to give some indication of the impact on a real program. The two implementations differ only in the iterator: the 5.1-beta iterator has to create a closure for each iteration, while the modified-lua iterator does not; a single integer is enough to record the iteration state.
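
For concreteness, the 5.1-beta iterator is shaped roughly like this (a simplified sketch with invented field names; the real code is in test/benchmark.lua):

-- assume entries are kept in insertion order in the array part
-- as {key=..., value=...} records
local function entries(t)
  local i = 0
  return function()               -- a fresh closure for every traversal
    i = i + 1
    local e = t[i]
    if e then return e.key, e.value end
  end
end

-- for k, v in entries(headers) do ... end
-- With the modified protocol the integer index can be carried in the
-- iteration state instead, so one shared iterator function serves every
-- traversal and no closure has to be allocated.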

The test:

a) converts an LDIF record (taken from the RFC) into an insert-ordered table; the record has 12 entries and nine distinct keys;
b) deletes all entries (three of them) sharing a given key;
c) adds two new entries with that same key;
d) makes a shallow copy (which iterates over all key/value pairs);
e) iterates over all values for each of three keys (a total of six key/value pairs).

The results are a bit odd, so I'll just report them. There are two measures: the first is the time reported by os.clock(); the second is the total time to run the benchmark as reported by /usr/bin/time (which therefore includes parsing, test setup, and, most importantly, closing the lua_State at the end of the run, which garbage-collects everything). The sample was executed 100,000 times:

           (os.clock)   (/usr/bin/time)
beta-5.1:    27.78         37.47
modified:    26.32         25.12  (!)

saving:       5.2%         33%

There is a highly noticeable gc pause at the end of the beta-5.1 run, so I presume that this is the result of garbage collecting all the closures which were created. On the other hand, each iteration of the test allocates about 10 tables, which you would think would add up quite a bit, too; furthermore, the memory usage reported by the rusage structure is not significantly different between the two tests. My guess is that closures are small but gc-intensive.