
Hi Shane,
I'm interested in using Lua to make this faster and more efficient.

To be clearer, logfile1.csv has data like:
1166212618.66,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902,3.79,0,318.2,0,0,1,1
1166212618.72,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902,3.79,0,318.2,0,0,1,1
1166212618.78,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902,3.79,0,318.2,0,0,1,1

So the first field is the timestamp (Unix time, in seconds).

logfile10.csv similarly has the same type of data:

1166212618.84 ,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902,3.79,0,318.3 ,0,0,1,1
1166212618.91,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902, 3.79,0,318.3,0,0,1,1
1166212618.97,Fri Dec 15 14:56:58,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902, 3.79,0,318.3,0,0,1,1
1166212619.03,Fri Dec 15 14:56:59,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902, 3.79,0,318.6,0,0,1,1
1166212619.09,Fri Dec 15 14:56:59,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,1,0,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,0,1,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.2,3.2,-3.2,-3.2,0,0,1,-1,1,1,1,0,0,1,1,1,0,1,0,2,82,0,9,4,3902, 3.79,0,318.6,0,0,1,1

One advantage I have is that my log files are named sequentially in the directory (logfile1.csv, logfile2.csv, and so on) and their timestamps increase from file to file: logfile10.csv always has higher timestamps than logfile1.csv. So when I combine all the data between logfile1.csv and logfile10.csv, the result already has sequentially increasing timestamps.

Now, when timestamp1 = 1166212618.72 (a match in logfile1.csv) and timestamp2 = 1166212619.03 (a match in logfile10.csv), I would take all the data after timestamp1's record from logfile1.csv, all the data from logfile2.csv, logfile3.csv, and so on, and the data up to timestamp2's record from logfile10.csv, and write it all into a new CSV file, outlogfile.csv.
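
For illustration, the selection described above could be sketched in Lua roughly as follows. This is only a sketch under assumptions, not a tuned solution: the function names (timestamp_of, copy_range) are made up for this example, both end timestamps are treated as inclusive, every record is assumed to sit on one line starting with its timestamp, and lines without a leading number are skipped.

```lua
-- Return the leading timestamp of a CSV line as a number, or nil if the
-- line does not start with one. Stray spaces around the field are tolerated.
local function timestamp_of(line)
  return tonumber(line:match("^%s*([%d%.]+)"))
end

-- Copy every record with t1 <= timestamp <= t2 to `out`.
--   files: array of file paths in chronological order (hypothetical input)
--   out:   anything with a :write method, e.g. a handle from io.open
-- Returns the number of records copied.
local function copy_range(files, t1, t2, out)
  local copied = 0
  for _, path in ipairs(files) do
    local f = assert(io.open(path, "r"))
    for line in f:lines() do
      local ts = timestamp_of(line)
      if ts then
        if ts > t2 then
          -- Timestamps only increase, so nothing later can match.
          f:close()
          return copied
        elseif ts >= t1 then
          out:write(line, "\n")
          copied = copied + 1
        end
      end
    end
    f:close()
  end
  return copied
end

-- Possible usage (file names taken from the description above):
--   local files = {}
--   for i = 1, 10 do files[i] = "logfile" .. i .. ".csv" end
--   local out = assert(io.open("outlogfile.csv", "w"))
--   copy_range(files, 1166212618.72, 1166212619.03, out)
--   out:close()
```

The sketch reads one line at a time, so memory use stays small regardless of file size, and it stops as soon as it passes timestamp2. It still scans the first file from the beginning until it reaches timestamp1.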

My problem is that the log files are huge: a few of them are 1 to 4 GB in size, so searching for a timestamp and copying the corresponding records has to be fast and efficient. At the moment I'm using a MATLAB script to do this, but it's very slow and inefficient.
Please suggest a way that is fast and efficient.
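
One possible way to avoid scanning a huge file from the start is a binary search over byte offsets, which works here because the timestamps within a file only increase. The following is a minimal sketch of that idea, not a tested solution for multi-gigabyte files: the names (timestamp_of, seek_line_start, lower_bound) are invented for this example, every line is assumed to begin with a timestamp, and whether f:seek can handle offsets beyond 2 GB depends on the Lua version and on how it was built.

```lua
-- Return the leading timestamp of a CSV line as a number, or nil.
local function timestamp_of(line)
  return tonumber(line:match("^%s*([%d%.]+)"))
end

-- Position f at the start of the first line that begins at byte offset
-- p or later, and return that offset.
local function seek_line_start(f, p)
  if p <= 0 then
    f:seek("set", 0)
  else
    f:seek("set", p - 1)
    f:read()  -- discard the rest of the line that contains byte p - 1
  end
  return f:seek("cur")
end

-- Return the byte offset of the first line whose timestamp is >= target,
-- or the file size if there is no such line. Assumes timestamps never
-- decrease within the file. The file should be opened in binary mode
-- ("rb") so that offsets are plain byte counts.
local function lower_bound(f, target)
  local lo, hi = 0, f:seek("end")
  while lo < hi do
    local mid = math.floor((lo + hi) / 2)
    seek_line_start(f, mid)
    local line = f:read()
    local ts = line and timestamp_of(line)
    if line == nil or (ts and ts >= target) then
      hi = mid       -- the wanted line starts at or before this probe
    else
      lo = mid + 1   -- the probed line is still too early
    end
  end
  return seek_line_start(f, lo)
end

-- Possible usage:
--   local f = assert(io.open("logfile1.csv", "rb"))
--   lower_bound(f, 1166212618.72)  -- f is now positioned at that record
--   for line in f:lines() do ... end
```

Each probe reads at most two lines, so locating a record takes a number of reads proportional to the logarithm of the file size instead of a full scan. After the search, the file handle is positioned at the matching record and the copying can proceed sequentially from there.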


Thanks, Shmuel, for the code, but it would really help if someone could come up with a faster way, or tell me whether doing this in Lua is a good idea at all.


Keep those suggestions coming in!
--
Regards,
Chandan