- Subject: Re: Reading large files
- From: Wim Couwenberg <w.couwenberg@...>
- Date: Wed, 24 Aug 2005 21:49:54 +0200
> "The current method for determining if a file is binary involves reading
> in the entire file into memory in lua, and then calling a C++
> function to determine if that string is binary. Lua is stunningly
> inefficient at reading in a large file (reading in 100MiB copies 1.4GiB)."
Maybe the last remark hints at the concatenation problem explained in
LTN 9:
http://www.lua.org/notes/ltn009.html
This would occur if you read a file line by line and naively
concatenate the lines into one large string. Anyway, here's a simplistic
script to test binary-ness. Adjust the pattern passed to "find" to
something more sensible, if you like. Usage:
lua isbin.lua <file-name>
---------------
file isbin.lua:
---------------
-- Time the whole check.
local now = os.clock()

-- Open in binary mode so no newline translation takes place.
local input, err = io.open(arg[1], "rb")
assert(input, err)

local isbin = false
local chunk_size = 2^12  -- read 4 KiB at a time
local find = string.find
local read = input.read

-- Scan chunk by chunk; stop at the first byte that is not
-- \f, \n, \r, \t or printable ASCII (decimal 32-126).
repeat
  local chunk = read(input, chunk_size)
  if not chunk then break end  -- end of file
  if find(chunk, "[^\f\n\r\t\032-\126]") then
    isbin = true
    break
  end
until false

input:close()
now = os.clock() - now

if isbin then
  print "this file is binary..."
else
  print "this is a text file..."
end

print(string.format("this took %.3f seconds", now))
-----------
end of file
-----------
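To illustrate the concatenation problem from LTN 9 mentioned above, here
is a minimal sketch contrasting the naive approach with a table used as
a buffer. The function names read_naive and read_buffered are made up
for this example and do not come from the original poster's code. Both
assume the file ends with a newline; otherwise one is appended.

```lua
-- Naive: every ".." builds a new string and copies everything
-- accumulated so far, so total copying grows quadratically.
local function read_naive(path)
  local f = assert(io.open(path, "rb"))
  local s = ""
  for line in f:lines() do
    s = s .. line .. "\n"
  end
  f:close()
  return s
end

-- Buffered: collect the pieces in a table and join them once.
local function read_buffered(path)
  local f = assert(io.open(path, "rb"))
  local parts = {}
  for line in f:lines() do
    table.insert(parts, line)
    table.insert(parts, "\n")  -- lines() strips the newline
  end
  f:close()
  return table.concat(parts)
end
```

Both functions return the same string; only the amount of intermediate
copying differs. If the whole file is wanted in memory anyway, a single
read of the entire file avoids the issue altogether.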
Wim