All right, I couldn't help myself; I went ahead and tried out option 2.
I ended up having to get a little fancier, breaking the data up into
sections of 40,000 lines each and making each section into a bunch of
assignment statements within a function, but in the end I got a total
load time of just about two minutes on my Core i5 @ 2.53 GHz.
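
The chunking idea can be sketched roughly as follows. This is only a
minimal illustration, not the attached massage_data.sh: the names
chunk_data and fill are made up for the example, and it assumes every
input line is already a valid Lua expression (a number, a quoted string,
a table constructor). Splitting the assignments across many small
functions presumably keeps each one under Lua's per-function size limits.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached massage_data.sh.
# Assumption: each line of the input file is a valid Lua expression.
#
# chunk_data FILE [CHUNK_SIZE]
#   Writes Lua code to stdout that fills the global table "data",
#   CHUNK_SIZE assignments per function (default 40000).
chunk_data() {
    awk -v chunk="${2:-40000}" '
        BEGIN {
            print "data = {}"
            print "local fill"       # one local, reused for every chunk
        }
        # At the first line of each chunk, close the previous function
        # (if there is one), run it, and open a new one.
        (NR - 1) % chunk == 0 {
            if (NR > 1) print "end\nfill()"
            print "fill = function()"
        }
        # One assignment statement per input line.
        { printf "data[%d] = %s\n", NR, $0 }
        # Close and run the last chunk.
        END { if (NR > 0) print "end\nfill()" }
    ' "$1"
}
```

After sourcing or pasting the function, usage would be along the lines
of: chunk_data data.in > data.lua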

Attached is a zipfile containing:
1) massage_data.sh, which will turn the input data into executable Lua
code to be "required." This process inflates the data by about 14%. You
should tune it up a bit to improve performance, then call it from within
your Lua code. Usage: sh massage_data.sh data.in > data.lua
2) gen_data.lua, a simple script to generate a test load. Usage: lua
gen_data.lua > data.in
3) main.sh, a script which invokes massage_data.sh and then invokes Lua
on its output; this script is intended as a benchmark. Usage:
time ./main.sh

Consider all attached code to be public domain. Have fun. :)

-Max.

On Sat, 2010-10-16 at 18:52 -0700, Max E wrote:
> Well, here are two ideas off the top of my head. I haven't tried either,
> so your mileage may vary.
> 
> Option 1:
> Write it in C, using the Lua API. Read a line at a time using fgets(),
> then either parse it yourself with strtok() or offload the parsing to
> the Lua functions. This one is pretty much guaranteed to be workable and
> will probably get you the best performance, although it may take the
> longest to get working.
> 
> Option 1.5:
> Use something like this (http://lua-users.org/wiki/LuaGenPlusPlus) to
> convert your existing Lua code into a sequence of Lua API calls
> automatically, then hand-optimize it. Performance might not be as good
> though.
> 
> Option 2:
> Okay, this is kind of cheating, but (assuming a UNIX-like command line
> is available) try something like this:
> echo "data = {" > data.lua
> sed "s/.*/&,/" inputfile >> data.lua
> echo "nil }" >> data.lua
> I imagine there is a DOS/batch equivalent as well. These commands will
> massage your input file into an executable Lua file as if you'd written
> it as a Lua script in the first place. They just surround the whole file
> with braces and put a comma after each line, with the final nil thrown
> in so the comma on the last line can't cause problems (Lua does accept a
> trailing comma in a table constructor, so the nil is just a precaution).
> Then you can require("data") (the module name, without the ".lua"
> extension) and look in the global table "data" for your data.
> I imagine that this will take a very long time during the require()
> call, but depending on how efficiently Lua is implemented, this could
> easily be the fastest you could possibly get it. Or Lua might just choke
> on it.
> 
> Regards,
> -Max.
> 

Attachment: massive_table_load.zip
Description: Zip archive