- Subject: Re: experimental new dump
- From: "Aaron Brown" <arundelo@...>
- Date: Tue, 26 Jun 2007 06:32:52 -0400
lhf wrote:
> I've written a version of ldump.c that saves only a single
> instance of any string (of course, I've also written the
> corresponding lundump.c):
> http://www.tecgraf.puc-rio.br/~lhf/tmp/dump.tar.gz
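For readers who haven't looked at the tarball, here is a rough sketch of the general idea of saving each string only once. It is not lhf's code or the real luac format: the function names, the little-endian 32-bit layout, and the "table of distinct strings followed by a list of indices" scheme are all assumptions made up for illustration.

```python
import io
import struct


def dump_strings(strings):
    """Serialize a list of strings, storing each distinct value once.

    Hypothetical layout (not the luac format): a table of distinct
    strings, followed by one 32-bit table index per input string.
    """
    index = {}   # string -> position in the table
    table = []   # distinct strings, in first-seen order
    refs = []    # one table index per input string
    for s in strings:
        if s not in index:
            index[s] = len(table)
            table.append(s)
        refs.append(index[s])

    out = io.BytesIO()
    out.write(struct.pack("<I", len(table)))
    for s in table:
        data = s.encode("utf-8")
        out.write(struct.pack("<I", len(data)))
        out.write(data)
    out.write(struct.pack("<I", len(refs)))
    for r in refs:
        out.write(struct.pack("<I", r))
    return out.getvalue()


def undump_strings(blob):
    """Inverse of dump_strings: rebuild the original list of strings."""
    buf = io.BytesIO(blob)

    def read_u32():
        return struct.unpack("<I", buf.read(4))[0]

    table = []
    table_size = read_u32()
    for _ in range(table_size):
        length = read_u32()
        table.append(buf.read(length).decode("utf-8"))

    ref_count = read_u32()
    return [table[read_u32()] for _ in range(ref_count)]
```

Note that in a scheme like this every occurrence still costs an index, so short strings save little per duplicate.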
I've been dealing with some automatically generated data
files with lots of duplicate strings, but also lots of
unique strings. For example, file-b.lua has 1923663
strings, or 283801 not counting duplicates. The 8
most-duplicated distinct strings account for about half of
the total including duplicates; 37590 strings are unique.
(The average length is 7 characters.)
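Figures like these can be gathered with a few lines of code once the strings have been extracted. The sketch below assumes the strings are already in a list (how they are pulled out of the data file is not shown), and it reads "unique" as "occurs exactly once", which is an interpretation of the wording above. The function name and the returned keys are made up for illustration.

```python
from collections import Counter


def string_stats(strings, top=8):
    """Summarize duplication in a list of strings.

    'unique' here means a string that occurs exactly once (an assumed
    reading of the post); 'top_share' is the fraction of all
    occurrences covered by the `top` most frequent distinct strings.
    """
    counts = Counter(strings)
    total = sum(counts.values())
    top_total = sum(c for _, c in counts.most_common(top))
    return {
        "total": total,
        "distinct": len(counts),
        "unique": sum(1 for c in counts.values() if c == 1),
        "top_share": top_total / total if total else 0.0,
    }
```

For example, `string_stats(["a", "a", "a", "b", "c", "c"], top=1)` reports 6 strings in total, 3 distinct, 1 unique, and a top share of 0.5.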
All four .luac files were generated with the -s switch.
There's not a massive decrease in size (sizes in bytes):
33997307 file-a.lua
30841144 file-a-old-dump.luac
30689172 file-a-new-dump.luac
37521826 file-b.lua
35676747 file-b-old-dump.luac
35209112 file-b-new-dump.luac
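To put numbers on "not massive", the arithmetic on the sizes listed above works out to roughly 0.5% for file-a and 1.3% for file-b. The helper name below is made up; the byte counts are copied from the listing.

```python
# Byte counts copied from the listing: (old dump, new dump).
sizes = {
    "file-a": (30841144, 30689172),
    "file-b": (35676747, 35209112),
}


def saving(old, new):
    """Return (bytes saved, percentage saved relative to the old dump)."""
    saved = old - new
    return saved, 100.0 * saved / old


for name, (old, new) in sizes.items():
    saved, percent = saving(old, new)
    print(f"{name}: {saved} bytes saved ({percent:.2f}%)")
```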
Here are some typical timings, with the .lua files included
for comparison. The difference between the two undumps is
lost in the noise:
$ time lua file-a.lua
real 0m16.934s
user 0m9.743s
sys 0m0.996s
$ time lua file-a-old-dump.luac
real 0m9.407s
user 0m3.552s
sys 0m0.798s
$ time src/*lhf*/src/lua file-a-new-dump.luac
real 0m10.202s
user 0m3.692s
sys 0m0.719s
$ time lua file-b.lua
real 0m17.402s
user 0m10.776s
sys 0m0.953s
$ time lua file-b-old-dump.luac
real 0m10.454s
user 0m3.914s
sys 0m0.827s
$ time src/*lhf*/src/lua file-b-new-dump.luac
real 0m9.929s
user 0m3.980s
sys 0m0.840s
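One way to compare the two undumps is to add user and sys time and ignore the wall-clock "real" figure, which moves in opposite directions for the two files. The sketch below does that for the .luac runs above; the helper names are made up, and single runs like these say nothing about run-to-run variance.

```python
import re


def parse_time(text):
    """Convert a `time` field such as '0m9.407s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# (real, user, sys) copied from the .luac runs in this message.
runs = {
    "file-a": {"old": ("0m9.407s", "0m3.552s", "0m0.798s"),
               "new": ("0m10.202s", "0m3.692s", "0m0.719s")},
    "file-b": {"old": ("0m10.454s", "0m3.914s", "0m0.827s"),
               "new": ("0m9.929s", "0m3.980s", "0m0.840s")},
}


def cpu_seconds(run):
    """CPU time (user + sys) of one run, in seconds."""
    _real, user, system = run
    return parse_time(user) + parse_time(system)


for name, pair in runs.items():
    old_cpu = cpu_seconds(pair["old"])
    new_cpu = cpu_seconds(pair["new"])
    change = 100.0 * (new_cpu - old_cpu) / old_cpu
    print(f"{name}: old {old_cpu:.3f}s, new {new_cpu:.3f}s ({change:+.1f}%)")
```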
Hope that's helpful.
--
Aaron
http://arundelo.com/