> "The current method for determining if a file is binary involves reading
> in the entire file into memory in lua, and then calling a C++
> function to determine if that string is binary.  Lua is stunningly 
> inefficient at reading in a large file (reading in 100MiB copies 1.4GiB)."

Maybe the last remark hints at the concatenation problem explained in
LTN 9:

    http://www.lua.org/notes/ltn009.html

This would occur if you read a file line by line and naively
concatenate the lines into one large string, because every
concatenation copies everything read so far.  Anyway, here's a
simplistic script to test for binary-ness.  Adjust the pattern in
"find" to something more sensible, if you like.  Usage:

    lua isbin.lua <file-name>


---------------
file isbin.lua:
---------------

local now = os.clock()

local input, err = io.open(arg[1], "rb")
assert(input, err)

local isbin = false
local chunk_size = 2^12  -- read 4 KiB at a time
local find = string.find
local read = input.read

repeat
        local chunk = read(input, chunk_size)
        if not chunk then break end

        -- binary if any byte is not \f, \n, \r, \t or printable ASCII (32-126)
        if find(chunk, "[^\f\n\r\t\032-\126]") then
                isbin = true
                break
        end
until false

input:close()

now = os.clock() - now

if isbin then
        print "this file is binary..."
else
        print "this is a text file..."
end

print(string.format("this took %.3f seconds", now))

-----------
end of file
-----------
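
If the whole file really is needed as one string, the concatenation
problem from LTN 9 can be sidestepped by collecting the chunks in a
table and joining them once at the end.  The following is only a
minimal sketch of that idea: the helper name read_file_buffered and
the default chunk size are made up for illustration and are not part
of the script above.

```lua
-- Hypothetical helper (name and default chunk size are assumptions):
-- read a file in fixed-size chunks and join them with one table.concat
-- call instead of growing a string with repeated "..".
local function read_file_buffered(path, chunk_size)
        chunk_size = chunk_size or 4096

        local input, err = io.open(path, "rb")
        if not input then return nil, err end

        local parts = {}
        while true do
                local chunk = input:read(chunk_size)
                if not chunk then break end     -- nil means end of file
                table.insert(parts, chunk)
        end
        input:close()

        -- one join at the end, so earlier chunks are not recopied
        -- on every iteration
        return table.concat(parts)
end
```

For the binary test itself this is unnecessary, since the script above
never keeps more than one chunk in memory and stops at the first
offending byte.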

Wim