- Subject: Re: feedback on chunk
- From: David Rio <driodeiros@...>
- Date: Tue, 19 Apr 2011 14:50:08 -0500
On Tue, Apr 19, 2011 at 2:10 PM, Doug Currie <doug.currie@gmail.com> wrote:
>
> On Apr 19, 2011, at 11:58 AM, David Rio Deiros wrote:
>
>> I was wondering if I could get some feedback on the following chunk of
>> lua code.
>
> How big is pl?
Very big. It can have millions of keys.
> Can you
> (a) replace 'N' with '.' in the strings in pl
Yes, I could use '.'.
> (b) turn your loop inside out and use gmatch [1] over read
> ?
> Something like:
>
> local function slide_over_read(read, pl)
> for patt, tbl in pairs(pl) do
> for w in read:gmatch(patt) do
> -- do something with the matched substring w and table tbl
> end
> end
> end
pl is very big. Before running that chunk, a big file is hashed into
a table (pl). Then we iterate over another file and run the chunk
above on each line (read). The approach you are suggesting would
take more time to compute.
The input file that fills pl can have between 1 and 3 million entries (keys).
The read file can have hundreds of millions of lines.
In my tests, pl has 1M entries and the read file has 4M lines.
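To make the trade-off concrete, here is a minimal sketch of the sliding-window style of lookup being described (the names, the fixed key length, and the callback are my own illustration, not the actual tool's code from the link below): instead of running gmatch once per pattern in pl, it slides a window over each read and does a single hash probe per position, so the cost per line is proportional to the line length rather than to the millions of keys in pl.

```lua
-- Hypothetical sketch, assuming all keys in pl share a fixed length klen.
-- pl maps probe strings to payload tables; on_hit receives each match.
local function slide_over_read(read, pl, klen, on_hit)
  for i = 1, #read - klen + 1 do
    local w = read:sub(i, i + klen - 1)
    local tbl = pl[w]           -- one hash lookup per window position
    if tbl then
      on_hit(w, tbl)            -- handle the matched substring and its table
    end
  end
end
```

With pl at ~1M keys, this does O(#read) lookups per line, whereas looping over pairs(pl) with gmatch does ~1M pattern scans per line.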
-drd
P.S: full tool's code: goo.gl/PhgGY