2012/12/17 Billy <spooky_loco_pv@yahoo.com>
Anyone know how i can optimize this? Or if i even wrote this correctly. I wanted
to create a function that would return the table corresponding with the fields
that the csv was created

If you don't need a table with actual named fields, but only a way to access the fields by name, you could set a metatable on each row whose __index does the named-field lookup. Because you don't have to create a second table for each row and copy every field into it, the rows are much faster to create, and in my test case this also uses about half the memory.

function CSVToTableMT(file_in)
  local file=assert(io.open(file_in,"r"))
  local line=assert(file:read("*l"))
  local header=fromCSV(line)
  local headerlookup={}
  for i,field in ipairs(header) do
    headerlookup[field]=i
  end
  local mt={
    __index=function(tbl,key)
      local idx=headerlookup[key]
      if idx==nil then
        return nil
      else
        return tbl[idx]
      end
    end
  }
  local tbl={}
  line=file:read("*l")
  while line~=nil do
    table.insert(tbl,setmetatable(fromCSV(line),mt))
    line=file:read("*l")
  end
  file:close()
  return tbl
end
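
To illustrate how rows built this way behave, here is a minimal sketch that works on in-memory strings instead of a file. Note the assumptions: split_simple is a hypothetical stand-in for fromCSV (it splits on commas only and does not handle quoted fields), the column names and values are invented, and rawget is used in place of tbl[idx] as a small variation so that a missing column never re-enters __index.

```lua
-- Hypothetical stand-in for fromCSV: splits on commas, no quoting support.
local function split_simple(line)
  local fields = {}
  for field in (line .. ","):gmatch("([^,]*),") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Invented header and data row, for illustration only.
local header = split_simple("id,name,city")
local headerlookup = {}
for i, field in ipairs(header) do
  headerlookup[field] = i
end

-- One metatable shared by every row.
local mt = {
  __index = function(tbl, key)
    local idx = headerlookup[key]
    if idx == nil then
      return nil
    end
    return rawget(tbl, idx)
  end
}

local row = setmetatable(split_simple("7,Anna,Wien"), mt)

print(row.name, row[2])   -- Anna    Anna  (same slot, two ways to reach it)
print(row.missing)        -- nil     (unknown column name)

-- pairs() only sees the real (numeric) keys, not the column names.
local count = 0
for _ in pairs(row) do
  count = count + 1
end
print(count)              -- 3
```

Two things to keep in mind with this design: iterating a row with pairs() yields only the numeric keys, and assigning to row.name creates a new string key on that row instead of changing row[2], since there is no __newindex handler.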

Test:
Windows XP with LuaJIT 2.0
30 MB CSV file with 19 columns and 400000 rows
Reading all rows into a table and then writing them all out again, indexing each field of each row by name.

Your approach:
7.54 seconds, 400 MB memory usage

With the optimisations mentioned by David Favro:
7.27 seconds, 398 MB memory usage

Metatable solution:
5.9 seconds, 177 MB memory usage

So, at least under LuaJIT 2.0, it is the fastest approach, and it also uses much less memory.
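
For anyone who wants to repeat this kind of comparison, the following is a rough sketch of a measuring helper, not the harness used for the numbers above. The helper name measure and the dummy workload are made up for illustration. It assumes that os.clock (CPU time) and collectgarbage("count") (Lua heap size in KB) are good enough; the heap size is not the same thing as the process memory usage reported by the operating system.

```lua
-- Hypothetical helper: runs fn(...) and reports CPU time and Lua heap growth.
local function measure(fn, ...)
  collectgarbage("collect")                  -- start from a clean heap
  local mem_before = collectgarbage("count") -- heap size in KB
  local t0 = os.clock()
  local result = fn(...)
  local elapsed = os.clock() - t0
  collectgarbage("collect")                  -- drop temporaries, keep result
  local mem_kb = collectgarbage("count") - mem_before
  return result, elapsed, mem_kb
end

-- Dummy workload standing in for a CSV loader: builds n small row tables.
local function build_rows(n)
  local rows = {}
  for i = 1, n do
    rows[i] = { tostring(i), "name" .. i, "city" .. i }
  end
  return rows
end

local rows, seconds, kb = measure(build_rows, 1000)
print(#rows, seconds, kb)
```

With a real file, the call would look like measure(CSVToTableMT, "data.csv"), where the file name is a placeholder.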

LG,
Michael

PS:
This is the optimized function, using the original approach, that I used:

function CSVToTable2(file_in)
  local file=assert(io.open(file_in,"r"))
  local line=assert(file:read("*l"))
  local header=fromCSV(line)
  local tbl={}
  line=file:read("*l")
  while line~=nil do
    local fields={}
    for i,field in ipairs(fromCSV(line)) do
      fields[header[i]]=field
    end
    table.insert(tbl,fields)
    line=file:read("*l")
  end
  file:close()
  return tbl
end