On Wed, May 4, 2011 at 11:39 AM, Henderson, Michael D
<michael.d.henderson@lmco.com> wrote:
> I’d like something friendlier like
>         result[rowNumber][columnB]
>
> I can’t figure out how to write that in Lua. I made the result set look like
>
>         resultSet[1] = {
>                 columnA = [[e]],
>                 columnB = 0,
>                 columnC = nil,
>                 columnD = [[2010/12/25 00:00:00 UTC]]
>         }
>
> I’m stumped with the [x][y] reference. Is this harder than it seems because
> I’m not thinking this through using a Lua idiom?


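With that result set, the user would write
    result[rowNumber].columnB

(the dot form is just shorthand for result[rowNumber]["columnB"], so the
[x][y] reference works too, with the column name as a string key)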

or:
   local row = result[rowNumber]
   .... row.columnB
   .... row.columnD

and this would be the most 'Lua way' to do it, IMHO.
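
To make the access forms concrete, here is a small sketch. The second row and its values are made up for illustration; only the first row's shape comes from the quoted message.

```lua
-- Hypothetical result set shaped like the one quoted above.
local result = {
  { columnA = "e", columnB = 0, columnD = "2010/12/25 00:00:00 UTC" },
  { columnA = "f", columnB = 7, columnD = "2010/12/26 00:00:00 UTC" },
}

local rowNumber = 2
local row = result[rowNumber]

-- Dot syntax and bracket syntax with a string key are the same lookup.
assert(row.columnB == result[rowNumber]["columnB"])

-- A column name held in a variable needs the bracket form.
local name = "columnD"
print(row[name])            --> 2010/12/26 00:00:00 UTC

-- A field assigned nil in a constructor (like columnC above) is simply
-- absent from the table, so reading it gives nil.
print(row.columnC == nil)   --> true
```

Note that a nil column cannot be told apart from a column that was never in the row, so a NULL marker would have to be some other value if that distinction matters.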


> I know that it’s too early to concentrate on optimizing, but I am concerned
> that looking up via a string key will be slower than via a numeric key.
> Since this will be operating in a loop over large data sets (daily runs with
> a set of a million rows), little improvements may help.

String constants are interned when the script is compiled, so a lookup
by string key costs about the same as a numeric _hashtable_ lookup.  But
you're right that array access is slightly faster than hashtable access,
and that difference would apply here: a row indexed by column number
uses the array part of the table, while row.columnB goes through the
hash part.  (Tables with contiguous integer keys starting at 1 are
transparently optimized as arrays.)

Still, the speed difference is unlikely to be significant unless all
your script does is access and modify in-memory records.  As soon as you
do I/O, any difference will be lost in statistical noise.  Quite likely,
table allocation would be a bigger cost (and it's roughly the same for
hashtables and array-like tables).
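
If you want to check this on your own machine rather than take it on faith, a rough timing sketch could look like the following. The loop count N, the helper name 'time' and both field layouts are arbitrary choices for illustration, and os.clock only gives a coarse CPU-time reading, so treat the numbers as indicative at best.

```lua
-- Rough timing sketch; N and the field layouts are arbitrary choices.
local N = 1000000

local byName  = { columnA = "e", columnB = 1 }   -- fields live in the hash part
local byIndex = { "e", 1 }                       -- fields live in the array part

-- Runs f once, prints the elapsed CPU time and f's result, returns the result.
local function time(label, f)
  local start = os.clock()
  local total = f()
  print(label, os.clock() - start, total)
  return total
end

local sumName = time("by name", function()
  local s = 0
  for _ = 1, N do s = s + byName.columnB end
  return s
end)

local sumIndex = time("by index", function()
  local s = 0
  for _ = 1, N do s = s + byIndex[2] end
  return s
end)
```

Both loops do nothing except the lookup, which is the most favorable case for seeing a gap; real per-row work will shrink it further.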

But what could easily become an issue is memory consumption, if you
have to hold all those millions of records in RAM at the same time.
In that case, I think the array-like tables do have an advantage (but
I could be wrong).

If you can process your records stream-like (that is, a single record
or a few contiguous records at a time), an option is to use multiple
return values and local variables:

for fldA, fldB, fldC, fldD in iterator(query) do
   .....
end

where 'iterator(query)' is some iterator expression that returns the
separate fields of each record as separate values, without creating a
table.
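
As a minimal sketch of such an iterator: the version below assumes, purely for illustration, that the data source is a flat Lua array holding four field values per record. A real 'iterator(query)' would pull from a database cursor instead, and that part is not shown here.

```lua
-- Hypothetical stand-in for a database cursor: field values stored flat,
-- four per record.  Returns a closure that yields one record per call
-- as four separate values, without building a table per row.
local function iterator(flat)
  local i = 0
  return function()
    if i >= #flat then return nil end   -- a nil first value ends the loop
    local a, b, c, d = flat[i + 1], flat[i + 2], flat[i + 3], flat[i + 4]
    i = i + 4
    return a, b, c, d
  end
end

-- Made-up sample data: two records of four fields each.
local data = {
  "e", 0, "x", "2010/12/25 00:00:00 UTC",
  "f", 7, "y", "2010/12/26 00:00:00 UTC",
}

local count, total = 0, 0
for fldA, fldB, fldC, fldD in iterator(data) do
  count = count + 1
  total = total + fldB
end
print(count, total)   --> 2   7
```

One caveat with this style: the generic 'for' stops as soon as the first returned value is nil, so the first field must never be nil (put a key column first, or return a row number ahead of the fields).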

But that's unlikely to be needed.  Try the table-per-row approach first.


-- 
Javier