- Subject: Re: Load large amount of data fast
- From: Max E <maxxedout@...>
- Date: Sat, 16 Oct 2010 18:52:48 -0700
Well, here are a few ideas off the top of my head. I haven't tried any
of them, so your mileage may vary.
Option 1:
Write it in C, using the Lua API. Read a line at a time using fgets(),
then either parse it yourself with strtok() or hand each line to Lua
functions for parsing. This one is pretty much guaranteed to work and
will probably get you the best performance, although it may take the
longest to get working.
Option 1.5:
Use something like this (http://lua-users.org/wiki/LuaGenPlusPlus) to
convert your existing Lua code into a sequence of Lua API calls
automatically, then hand-optimize it. Performance might not be as good
as with Option 1, though.
Option 2:
Okay, this is kind of cheating, but, assuming a UNIX-like command line
is available, try something like this:
echo "data = {" > data.lua
sed "s/.*/&,/" inputfile >> data.lua
echo "nil }" >> data.lua
I imagine there is a DOS/batch equivalent as well. These commands
massage your input file into an executable Lua file, as if you'd written
it as a Lua script in the first place. They just wrap the whole file in
"data = { ... }" and put a comma after each line, so every input line
has to be a valid Lua expression already. The final nil is thrown in so
the comma on the last line doesn't cause problems (Lua does accept a
trailing comma in a table constructor, so a bare "}" would work too).
Then you can require("data") (the module name, without the ".lua"
extension) or dofile("data.lua") and look in the global table "data"
for your data.
I imagine the load itself will take a long time, but depending on how
efficiently Lua's parser is implemented, this could easily be the
fastest you can get. Or Lua might just choke on it.
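To make the result concrete, here is a minimal sketch of the same idea
run on a tiny made-up input. The two sample records are hypothetical,
and it assumes each input line is already a valid Lua expression. It
writes data.lua through a single redirection and leaves out the final
nil, relying on Lua allowing a trailing comma in a table constructor.
Untested against a 3M-line file.

```shell
#!/bin/sh
# Hypothetical sample input: one Lua expression per line.
cat > inputfile <<'EOF'
{ "alice", 30 }
{ "bob", 25 }
EOF

# Wrap the lines in "data = { ... }" and append a comma to each one.
# Grouping the commands means data.lua is opened for writing only once.
{
  echo "data = {"
  sed 's/.*/&,/' inputfile
  echo "}"
} > data.lua

cat data.lua
# prints:
# data = {
# { "alice", 30 },
# { "bob", 25 },
# }
```

After dofile("data.lua"), data[1][1] should be "alice" for this sample.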
Regards,
-Max.
On Sun, 2010-10-17 at 05:15 +0400, Alexander Gladysh wrote:
> On Sun, Oct 17, 2010 at 05:06, Petite Abeille <petite.abeille@gmail.com> wrote:
> > On Oct 17, 2010, at 2:57 AM, Alexander Gladysh wrote:
>
> >> I take it that you are suggesting I write my own Lua parser (or
> >> use a custom one)?
>
> > Hmmm... I doubt that your problem is the parsing itself... more of an overall design issue perhaps... but then again, not enough information about what you are trying to do with that data to offer any concrete help :)
>
> I'm trying to load those 3M entries into a Lua table in memory faster
> than I do it now.
>
> Other ways of solving my original task are out of the scope of this
> question. :-)
>
> Thanks,
> Alexander.
>
> P.S. I've accidentally killed my data crunching process and had to
> start over. :-(
>
> I've added some timing, here it is for the first 900K entries.
>
> at line 100000 : Sun Oct 17 04:47:10 2010
> at line 200000 : Sun Oct 17 04:48:29 2010
> at line 300000 : Sun Oct 17 04:50:18 2010
> at line 400000 : Sun Oct 17 04:53:02 2010
> at line 500000 : Sun Oct 17 04:55:52 2010
> at line 600000 : Sun Oct 17 04:58:55 2010
> at line 700000 : Sun Oct 17 05:01:26 2010
> at line 800000 : Sun Oct 17 05:07:00 2010
> at line 900000 : Sun Oct 17 05:10:18 2010
>