[Date Prev][Date Next][Thread Prev][Thread Next]
- Subject: Re: bit.lshift and performance - luabitop v.s. lua-5.2.0-work4
- From: KHMan <keinhong@...>
- Date: Fri, 15 Oct 2010 09:26:01 +0800
On 10/15/2010 8:54 AM, David Manura wrote:
On Thu, Oct 14, 2010 at 9:50 AM, KHMan wrote:
On 10/14/2010 11:55 AM, David Manura wrote:
now requires two special cases: nbits == 0 and nbits == 32.
This is not for deflate, is it? There is no real reason to be requesting 0
bits or 32 bit in deflate, because the original pkzip architecture is
designed for 64KB segments. Huffman codes are limited in length as well,
etc. The encoding and decoding process has no real reason to be asking for 0
or 32 bits... ???
A good question. ModuleCompressDeflateLua never extracts 32-bits at a
time from the bitstream during Deflate. Rather, in parsing the gzip
header/footer, there are three occasions where it extracts 32-bit
integers from the bitstream:
local mtime = bs:read(32) -- Modification TIME
local crc32 = bs:read(32)
local isize = bs:read(32) -- ignored
Granted, the gzip wrapping could be processed byte-wise instead, but
for consistency and given its triviality, it just reads from the same
bitstream as Deflate.
These are all well-aligned standard header fields. Reading them
the usual way should be fine. The variable bit stream calls are
for the compressed Huffman tree and the LZ encoding, and these are
I think designed to be within 1-16 bit in length, per the original
Just a little puzzled as to why a compression algorithm needs to call to get
0 bits -- which sounds like a no-op. Yet to see any algorithm that does it.
If it's not for deflate, then getting 32 bits can just skip everything and
use a faster call.
It reads a variable number (possibly zero) of "extra" bits:
else -- val> 256 (i.e. extra bits)
local nextrabits = tdecode_len_nextrabits[val]
local extrabits = bs:read(nextrabits)
local len = len_base + extrabits
local dist_nextrabits = tdecode_dist_nextrabits[dist_val]
local dist_extrabits = bs:read(dist_nextrabits, true)
local dist = dist_base + dist_extrabits
Oh that, I see why you would want to optimize that way.
The encoded length codes has a linear distribution (meaning that
with the extra offset bits, the actual distribution falls
exponentially) or more often, exponential (actual distribution
will be a very steep exponential curve). Same for distance codes.
I have very detailed data for LZF, if you want to see some trends.
So, deflate is designed so that many of encoded length values do
not need any extra bits, or "more has fewer bits".
If we use an 'if' test, then the zero extra bits portion should be
a bit faster and will run more often. This might just balance out
having to execute all those extra lines if you were to use
something that works with [0,32].
Furthermore, it skips over a variable number (possibly zero) of bits
to the next byte alignment (e.g. on uncompressed blocks or the gzip
if btype == BTYPE_NO_COMPRESSION then
Of course, these are minor cases, more of convenience than necessity,
and it would be easy enough to handle 0 and 32 specially.
Once per block, not much there. I think using the thing for [1,31]
is fine, but that's just me. :-)
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia