> "The current method for determining if a file is binary involves reading
> in the entire file into memory in lua, and then calling a C++
> function to determine if that string is binary.  Lua is stunningly 
> inefficient at reading in a large file (reading in 100MiB copies 1.4GiB)."

Maybe the last remark hints at the string concatenation problem
explained in LTN 9.

This occurs if you read a file line by line and naively concatenate
the lines into one large string: each concatenation copies everything
accumulated so far.  Anyway, here's a simplistic script to test
binary-ness.  Adjust the pattern in "find" to something more sensible,
if you like.  Usage:

    lua isbin.lua <file-name>

file isbin.lua:

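local now = os.clock()

-- open in binary mode so no newline translation happens
local input, err = io.open(arg[1], "rb")
assert(input, err)

local isbin = false
local chunk_size = 2^12
local find = string.find
local read = input.read

-- read the file in fixed-size chunks; stop at the first suspicious byte
repeat
        local chunk = read(input, chunk_size)
        if not chunk then break end

        -- anything outside \f \n \r \t and \032-\128 counts as binary
        if find(chunk, "[^\f\n\r\t\032-\128]") then
                isbin = true
                break
        end
until false

input:close()

now = os.clock() - now

if isbin then
        print "this file is binary..."
else
        print "this is a text file..."
end

print(string.format("this took %.3f seconds", now))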

end of file
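
To illustrate the concatenation problem mentioned above, here is a minimal sketch, assuming Lua 5.1 or later (for the # operator). The names slurp_naive and slurp_buffered are hypothetical helpers made up for this example, not anything from the original poster's code. The first rebuilds the accumulated string on every line, so the amount of copying grows much faster than the file size; the second collects the pieces in a table and joins them once with table.concat.

```lua
-- Sketch only: slurp_naive and slurp_buffered are hypothetical names.

-- Naive version: every ".." creates a new string and copies everything
-- accumulated so far, so total copying grows roughly quadratically.
local function slurp_naive(path)
  local input = assert(io.open(path, "rb"))
  local data = ""
  for line in input:lines() do
    data = data .. line .. "\n"
  end
  input:close()
  return data
end

-- Buffered version: collect the pieces in a table, join them once.
local function slurp_buffered(path)
  local input = assert(io.open(path, "rb"))
  local pieces = {}
  for line in input:lines() do
    pieces[#pieces + 1] = line
    pieces[#pieces + 1] = "\n"
  end
  input:close()
  return table.concat(pieces)
end
```

Both functions return the same string (with a newline after the last line, whether or not the file ended with one). If the whole file is really needed in memory, input:read("*a") does it in a single call; for the binary test itself, the chunked loop in isbin.lua avoids holding the file in memory at all.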