Mike Pall wrote:
> David Given wrote:
>> I can now fit the entire Prime Mover executable, which is 5000 lines of
>> Lua, a patched Lua interpreter source and lposix, into under 64kB of
>> bzipped data. That's not bad...
> 
> Impressive. But you may need to ship it with bunzip2 (esp. for
> Windows). Or does that number include the code for bunzip2?

Win32 does not have gzip by default either, so I guess it is not
meant for a pure Visual Studio environment... If a *nix shell
interface is available, then bzip2 is almost as ubiquitous as gzip
nowadays.
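If a shell is there, a rough (untested) sketch -- assuming bzip2 and
tar are on the PATH, and with a made-up payload file name -- could be
as simple as:

  -- unpack a bzip2-compressed payload by shelling out
  local payload = "pm-payload.tar.bz2"          -- hypothetical name
  local ok = os.execute("bzip2 -dc " .. payload .. " | tar xf -")
  assert(ok == 0, "extraction failed")  -- Lua 5.1: os.execute returns the exit status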

> Maybe other compression systems with less efficiency, but with a
> tiny, standalone decompressor are more attractive. Or go for the
> top contenders and pick the one with the smallest decompressor.
> Here's an exhaustive comparison:
>   http://www.maximumcompression.com/

maximumcompression is cute, but it tends towards a benchmark-jockey
culture. It seems to trivialize time stats, which you can only get
from the single Multiple File Compression (MFC) page, and memory
requirements are not mentioned anywhere. Personally, I prefer to
dissuade overheated benchmarking enthusiasm when it comes to
compression methods.

The choice of compression method depends on the application. We
should never drool over the better stats on maximumcompression, but
instead choose the best method for the job. Often, the practical
'best' does not mean the smallest compressed size.

> Bzip2 is popular, but not even close to the top 10. Check out PAQ:
>   http://cs.fit.edu/~mmahoney/compression/

Haven't seen mainstream adoption of PAQ; its memory and processing
requirements are waaaay too large. Some of those methods have pretty
symmetrical compression/decompression performance, and IIRC PAQ is
one of them. Look at the list ranked by compression time -- PAQ8 can
take over 11 hours to decompress. Ouch. Heh heh.

Yes, PAQ is getting a lot of airtime for its Hutter Prize exploits,
but the Hutter Prize is about some really bizarre tie-in with AI
(which sounds very much like nonsense to me), and not about good
memory/effort performance. Context or predictor models don't sound
like revolutionary AI methods to me. For a normal developer, a
compressed Wikipedia viewer using PAQ is out of the question because
it will be way too slow for many years yet.

Also, maximumcompression does not appear to give any info on
decompressor sizes or availability (as in, e.g., a zlib1.dll). If
decompressor size is important, I recommend studying the choices
made by UPX and NSIS. LZMA has one of the best size/speed trade-offs
for decompression, and its continuing adoption means it is pretty
mature and stable. LZMA also avoids the arithmetic-coding patent
minefield by using range coding, which according to the LZMA people
was not patented when it was described in the 1970s. But if you want
minimal CPU load, then LZO, lzf or UCL barely load the CPU, as I/O
time will dominate -- and a smaller compressed size also reduces
overall I/O time.
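Just to illustrate the shape of such a thing (the 'lzma' module and
its decompress() function below are made up for illustration -- I am
not pointing at any real binding):

  -- hypothetical binding; 'lzma' does not refer to a real library
  local lzma = require("lzma")
  local f = assert(io.open("payload.lzma", "rb"))
  local blob = f:read("*a")
  f:close()
  local source = lzma.decompress(blob)  -- recover the original Lua source
  assert(loadstring(source))()          -- run the unpacked chunk (Lua 5.1)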

Anyway, given David's application, I don't see how we can get away
from using one of the standard archivers.

-- 
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia