Hamming Numbers |
|
The problem is frequently trotted out to explain why "x Programming" is better (particularly for the values of x: functional and logic). An O(n) solution (where n is the size of the sequence of numbers) is possible if you ignore the cost of manipulating large numbers. Buried deep in an interesting [thread] on [Lambda the Ultimate] is an analysis which indicates that the cost might be O(n^(4/3)).
The Haskell solution to this problem is delightfully brief:
-- Merges two infinite lists merge :: (Ord a) => [a] -> [a] -> [a] merge (x:xs)(y:ys) | x == y = x : merge xs ys | x < y = x : merge xs (y:ys) | otherwise = y : merge (x:xs) ys -- Lazily produce the hamming sequence hamming :: [Integer] hamming = 1 : merge (map (2*) hamming) (merge (map (3*) hamming) (map (5*) hamming))
although even shorter ones exist. The essence of the above solution is that Haskell evaluates lazily, so the infinite sequence hamming can be self-referential. That's possible to do in Lua, using a sort of promise. The code below implements just enough of a bignum implementation to be able to do the multiplications by 2, 3 and 5, and to print out the results. (The magic number 7305 results from the fact that Hamming number 7305 is 2^52; actually, a slightly larger limit could have been chosen. In practice, the bignum interface only adds about 33% to the cost of the computation so it could have been written as a pure bignum program; the interesting thing is that all of the code to implement Hamming is completely oblivious to the actual implementation of the numbers, provided that the implementation supports <, ==, and multiply-by-a-small-integer.
I checked the performance of the program by computing hamming(k) for values of k from 50,000 to 1,000,000; the performance does appear to be roughly O(n) in both space and time, although I made no attempt to optimise and the space consumption is somewhat porcine.
usec RSS
n sec. Mb /n kb/n value
------- ---- ----- ---- ---- --------------------------------------------------------------------
50000 2.71 3472 54.2 69.4 2379126835648701693226200858624
100000 5.80 5048 58.0 50.5 290142196707511001929482240000000000000
150000 9.05 6060 60.3 40.4 136551478823710021067381144334863695872000000
200000 12.41 7176 62.0 35.9 4479571262811807241115438439905203543080960000000
250000 15.71 8092 62.8 32.4 29168801439715713801701411811093381120000000000000000
300000 19.09 8856 63.6 29.5 62864273786544799804687500000000000000000000000000000000
350000 22.38 9608 63.9 27.5 60133943158606419031489628585123616917982648729600000000000
400000 26.01 13276 65.0 33.2 30774090693237851027531250000000000000000000000000000000000000
450000 29.66 14356 65.9 31.9 9544028161703913537712243143807801346335324481500000000000000000
500000 33.49 14952 67.0 29.9 1962938367679548095642112423564462631020433036610484123229980468750
550000 37.25 15824 67.7 28.8 286188649932207038880389529600000000000000000000000000000000000000000
600000 41.06 16976 68.4 28.3 31118367413797724237140050340541118674764220486395061016082763671875000
650000 44.69 17620 68.8 27.1 2624748786803793305341077210881955201024000000000000000000000000000000000
700000 48.55 18508 69.4 26.4 177288702931066945536000000000000000000000000000000000000000000000000000000
750000 52.38 19696 69.8 26.3 9844116074857088479400896102109753641824661421606444941755765750304761446400
800000 56.15 20196 70.2 25.2 458936503790258814279745536000000000000000000000000000000000000000000000000000
850000 60.03 20620 70.6 24.3 18286806033409508387421738007105886936187744140625000000000000000000000000000000
900000 63.88 21772 71.0 24.2 632306706993546185913146473467350006103515625000000000000000000000000000000000000
950000 68.05 22756 71.6 24.0 19221158232427643481897048859403926759149694825267200000000000000000000000000000000
1000000 73.27 23084 73.3 23.1 519312780448388736089589843750000000000000000000000000000000000000000000000000000000
So here's the code:
local function Seq(t, func)
local meta = {}
function meta:__index(k)
if k == "tail" then
local rv = func()
self.tail = rv
setmetatable(self, nil)
return rv
end
end
return setmetatable(t, meta)
end
local function SMap(seq, func)
return Seq({head = func(seq.head)},
function() return SMap(seq.tail, func) end)
end
local function SMerge(seq1, seq2)
local head1, head2 = seq1.head, seq2.head
local step
if head1 < head2 then
function step() return SMerge(seq1.tail, seq2) end
elseif head2 < head1 then
function step() return SMerge(seq1, seq2.tail) end
head1 = head2
else
function step() return SMerge(seq1.tail, seq2.tail) end
end
return Seq({head = head1}, step)
end
local function times(k)
return function(x) return x * k end
end
function Hamming(seq)
local seq2, seq3, seq5 =
SMap(seq, times(2)), SMap(seq, times(3)), SMap(seq, times(5))
local function step()
return SMerge(seq2, SMerge(seq3, seq5))
end
return Seq(seq, function() return SMerge(seq2, SMerge(seq3, seq5)) end)
end
function hamming(k)
local start = 1
if k > 7305 then start = Bignum(start) end
local h = Hamming({head = start})
for i = 2, k do h = h.tail end
return h.head
end
and here's the simple Bignum implementation (define this first if you want to test the above code):
do
-- very limited bignum stuff; we only implement multiplication
-- (by less than the base) and comparison.
local base = 1e15
local fmt = "%015.0f"
local meta = {}
function meta:__lt(other)
if self.n ~= other.n then return self.n < other.n end
for i = 1, self.n do
if self[i] ~= other[i] then return self[i] < other[i] end
end
end
function meta:__eq(other)
if self.n == other.n then
for i = 1, self.n do
if self[i] ~= other[i] then return false end
end
return true
end
end
function meta:__mul(k)
-- If the base where a multiple of all possible multipliers, then
-- we could figure out the length of the result directly from the
-- first "digit". On the other hand, printing the numbers would be
-- difficult. So we accept the occasional overflow.
local offset = 0
if self[1] * k >= base then offset = 1 end
local carry = 0
local result = {}
local n = self.n
for i = n, 1, -1 do
local tmp = self[i] * k + carry
local digit = tmp % base
carry = (tmp - digit) / base
result[offset + i] = digit
end
if carry ~= 0 then
n = n + 1
if offset == 0 then
for i = n, 2, -1 do
result[i] = result[i - 1]
end
end
result[1] = carry
end
result.n = n
return setmetatable(result, meta)
end
function meta:__tostring()
local tmp = {}
tmp[1] = ("%.0f"):format(self[1])
for i = 2, self.n do
tmp[i] = fmt:format(self[i])
end
return table.concat(tmp)
end
function Bignum(k)
return setmetatable({k, n = 1}, meta)
end
end
For reference, the time/space chart was created (except for the heading) with the following (which will probably only work on FreeBSD or similar):
for ((i=50000; $i<=1000000; i=$i+50000)); \
do /usr/bin/time -apl -o hamming.log ../src/lua \
-e "dofile 'hamming.lua'; print('result', $i, hamming($i))" >> hamming.log; \
done
egrep '(result|user|maximum)' hamming.log | \
lua -e 'local a = io.read"*a"; \
for n, res, time, size in \
string.gfind(a, "(%d+)%s+(%d+)%D+([%d.]+)%s+(%d+)") do \
print(string.format("%8i %8.2f %8i %6.1f %6.1f %s", \
n, time, size, (time*1e6)/n, (size*1e3)/n, res)) \
end' > hamming.res