Hi everyone,

I have been thinking about, and doing some work on, adapting the zlib
binding to support zlib streams through a file-like interface.

How would this differ from the current implementation? When
decompressing, I would provide a source from which the binding reads
data as needed, and that process would be transparent to whoever is
reading from the zlib stream.

As a more practical example, for decompression, the zlib interface
would be:

  stream = zlib.inflate(
    string | function | table | userdata,
    windowBits, [15]
  )
Values in [...] are defaults.

alternative:

  stream = zlib.inflate{
    source = {
      read = function,
      [peek = function,]
      [close = function,]
    },
    -- or (the table could act as the source)
    read = function,
    [peek = function,]
    [close = function,]
    -- other parameters
    [windowBits = number]
  }

In the first form, the first parameter is a string, a function, a
table or a userdata.
When it is a string, it is the complete compressed stream, available
in full when reading.
When it is a function, it will be called to read blocks of data.
When it is a table or userdata, the binding will look up a "read"
function on it and use that to read blocks of data. The read function
receives an integer parameter that hints at the size of the block of
compressed data to return; the function is free to ignore it. This
would allow a normal file to be passed as the source and read from
transparently.
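
To make the dispatch on the source type concrete, here is a minimal
sketch in plain Lua. The helper name to_reader is hypothetical and not
part of any existing binding; it only shows how a string, a function,
or an object with a "read" function could be reduced to a single
reader function that takes a size hint.

```lua
-- Hypothetical helper: turn any accepted source into a reader function.
-- reader(hint) returns a string chunk, or nil at end of input.
local function to_reader(source)
  local kind = type(source)
  if kind == "string" then
    -- The whole compressed stream is in memory; hand it out in slices.
    local pos = 1
    return function(hint)
      if pos > #source then return nil end
      -- Without a usable hint, return everything that is left.
      local n = (hint and hint > 0) and hint or #source
      local chunk = source:sub(pos, pos + n - 1)
      pos = pos + #chunk
      return chunk
    end
  elseif kind == "function" then
    -- Already a reader; it may ignore the hint if it wants to.
    return source
  elseif kind == "table" or kind == "userdata" then
    local read = source.read
    if type(read) ~= "function" then
      error("source has no 'read' function", 2)
    end
    -- Call it as a method so io.open() file handles work unchanged.
    return function(hint) return read(source, hint) end
  end
  error("unsupported source type: " .. kind, 2)
end
```

With this in place, to_reader(io.open(name, "rb")) and
to_reader(some_string) can be consumed by the same loop.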

Another thought I had was to look for a 'peek' function to get the
block before calling 'read', so that the source can know how much
data was actually consumed during decompression. This matters when
you have an embedded stream and need to know where the compressed
block ends.
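
One way to read the peek idea is sketched below, again in plain Lua
and without zlib. The names string_source, tell and consume_block are
made up for illustration, and a terminator byte stands in for "zlib
reports end of stream". The consumer peeks at a block, decides how
many bytes it really used, and only then reads exactly that many, so
the source is left positioned right after the embedded block.

```lua
-- Hypothetical in-memory source that supports both peek and read.
local function string_source(data)
  local pos = 1
  local src = {}
  -- Return up to n bytes without consuming them.
  function src:peek(n)
    if pos > #data then return nil end
    return data:sub(pos, pos + n - 1)
  end
  -- Consume and return up to n bytes.
  function src:read(n)
    local chunk = self:peek(n)
    if chunk then pos = pos + #chunk end
    return chunk
  end
  -- Offset of the next unread byte (1-based).
  function src:tell() return pos end
  return src
end

-- Stand-in for the binding: "processes" bytes up to and including a
-- terminator byte, then consumes exactly that many from the source.
local function consume_block(src, terminator)
  local out = {}
  while true do
    local block = src:peek(4)
    if not block then break end
    local stop = block:find(terminator, 1, true)
    local used = stop or #block
    out[#out + 1] = src:read(used)
    if stop then break end
  end
  return table.concat(out)
end

local src = string_source("compressed;trailing data")
local block = consume_block(src, ";")
-- The bytes after the terminator are still unread in src.
```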

The provided stream could handle both zlib and gzip headers (auto
detected by zlib itself).

The stream would have a file-like interface, with the following functions:

  stream:read(format,...) -- format is one of number, '*a', '*l', '*n';
                          -- defaults to '*l'
  stream:write() -- raise error
  stream:close() -- close stream
  stream:seek() -- raise error? allow to skip forward?
  stream:flush() -- do nothing?
  stream:lines() -- do nothing? raise error?
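
As a rough illustration of how stream:read could honour the formats
on top of data that arrives in arbitrary blocks, here is a sketch in
plain Lua. new_reader and next_chunk are hypothetical names;
next_chunk stands in for "inflate some more data". Only a byte count,
'*a' and '*l' are handled; '*n' is left out to keep it short.

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 and 5.2+ compatibility

-- next_chunk() returns a string, or nil once the stream is exhausted.
local function new_reader(next_chunk)
  local buffer, eof = "", false
  local self = {}

  -- Pull one more chunk into the buffer; returns false at end of stream.
  local function fill()
    if eof then return false end
    local chunk = next_chunk()
    if chunk == nil then eof = true; return false end
    buffer = buffer .. chunk
    return true
  end

  local function read_one(format)
    if type(format) == "number" then
      while #buffer < format and fill() do end
      if #buffer == 0 and format > 0 then return nil end
      local out = buffer:sub(1, format)
      buffer = buffer:sub(format + 1)
      return out
    elseif format == "*a" then
      while fill() do end
      local out = buffer
      buffer = ""
      return out
    elseif format == "*l" then
      local stop = buffer:find("\n", 1, true)
      while not stop and fill() do
        stop = buffer:find("\n", 1, true)
      end
      if not stop and #buffer == 0 then return nil end
      -- Return the line without its newline; keep the rest buffered.
      local line = buffer:sub(1, (stop or #buffer + 1) - 1)
      buffer = buffer:sub((stop or #buffer) + 1)
      return line
    end
    error("unsupported format: " .. tostring(format), 2)
  end

  -- Like file:read, stop at the first format that fails.
  function self:read(...)
    local n = select("#", ...)
    if n == 0 then return read_one("*l") end
    local results = {}
    for i = 1, n do
      results[i] = read_one((select(i, ...)))
      if results[i] == nil then n = i; break end
    end
    return unpack(results, 1, n)
  end

  return self
end
```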


For compression it would provide a similar interface:
  stream = zlib.deflate(
    function | table | userdata,
    compression level, [Z_DEFAULT_COMPRESSION]
    method, [Z_DEFLATED]
    windowBits, [15]
    memLevel, [8]
    strategy, [Z_DEFAULT_STRATEGY]
  )
[...] represent default values

The first parameter is similar to the one in zlib.inflate(), but
instead of looking for a read function it would use a write function
and act as a 'sink'. Whenever a write to the stream causes zlib to
produce output, that output would be written to the sink.

  stream:read() -- raise error
  stream:write() -- compress data, write output to sink
  stream:close() -- call zlib with Z_FINISH, send output to sink, close stream
  stream:seek() -- raise error?
  stream:flush() -- call zlib deflate with Z_SYNC_FLUSH and write output to sink
  stream:lines() -- similar to io.lines
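
The write side could be wired up roughly as follows. This is only a
sketch in plain Lua: to_writer and new_write_stream are hypothetical
names, and no compression happens. A simple buffer with a block size
stands in for zlib holding data internally until it has output, which
is enough to show when the sink gets called on write, flush and close.
The sketch does not touch the sink's own close function.

```lua
-- Hypothetical helper: turn a function, table or userdata into a writer.
local function to_writer(sink)
  if type(sink) == "function" then return sink end
  local write = sink.write
  assert(type(write) == "function", "sink has no 'write' function")
  -- Call it as a method so io.open() file handles work unchanged.
  return function(data) return write(sink, data) end
end

-- Stand-in for a deflate stream: data is passed through unchanged, but
-- is held back until block_size bytes are pending, much like zlib may
-- produce no output for a given write.
local function new_write_stream(sink, block_size)
  local write_to_sink = to_writer(sink)
  local buffer, closed = "", false
  local stream = {}

  -- Send pending data to the sink once at least min_size bytes exist.
  local function drain(min_size)
    if #buffer > 0 and #buffer >= min_size then
      write_to_sink(buffer)
      buffer = ""
    end
  end

  function stream:write(data)
    assert(not closed, "attempt to use a closed stream")
    buffer = buffer .. data  -- a real binding would call deflate() here
    drain(block_size)
    return self
  end

  function stream:flush()    -- real binding: deflate() with Z_SYNC_FLUSH
    assert(not closed, "attempt to use a closed stream")
    drain(0)
    return self
  end

  function stream:close()    -- real binding: deflate() with Z_FINISH
    if closed then return end
    drain(0)
    closed = true
  end

  function stream:read()
    error("stream is write-only", 2)
  end

  return stream
end
```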

When stream:close() is called, should it also look for a close
function on the provided table/userdata (source or sink) and call it?

This would allow compression/decompression filters to be used on
files or blocks of data without too much trouble.

What can be done using this interface?
The gzip module could be written in pure Lua by opening a file
through io.open and passing it to zlib.inflate or zlib.deflate, while
maintaining compatibility with the current implementation.
A zip module could be written on top of it, taking advantage of the
streaming support (work in progress by Enrico Tassi).

If anyone has comments or ideas about the interface the zlib module
should provide, please share.

Regards,
Tiago