Hamming Numbers

lua-users home
wiki

Hamming Numbers are numbers whose only prime factors are 2, 3 and 5. They are named after [Richard Hamming] but became famous (or notorious) after [Edsger Dijkstra] posed the question of how to efficiently enumerate them in numeric order.

The problem is frequently trotted out to explain why "x Programming" is better (particularly for the values of x: functional and logic). An O(n) solution (where n is the size of the sequence of numbers) is possible if you ignore the cost of manipulating large numbers. Buried deep in an interesting [thread] on [Lambda the Ultimate] is an analysis which indicates that the cost might be O(n^(4/3)).

The Haskell solution to this problem is delightfully brief:

-- Merges two infinite lists
merge :: (Ord a) => [a] -> [a] -> [a]
merge (x:xs)(y:ys)
  | x == y    = x : merge xs ys
  | x <  y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- Lazily produce the hamming sequence
hamming :: [Integer]
hamming 
  = 1 : merge (map (2*) hamming) (merge (map (3*) hamming) (map (5*) hamming))

although even shorter ones exist. The essence of the above solution is that Haskell evaluates lazily, so the infinite sequence hamming can be self-referential. That's possible to do in Lua, using a sort of promise. The code below implements just enough of a bignum implementation to be able to do the multiplications by 2, 3 and 5, and to print out the results. (Hamming number 7305 is 2^52, so you'll need Bignum support if you want to compute more Hamming numbers than that. In practice, the bignum interface only adds about 33% to the cost of the computation; the interesting thing is that all of the code to implement Hamming is completely oblivious to the actual implementation of the numbers, provided that the implementation supports <, ==, and multiply-by-a-small-integer.

I checked the performance of the program by computing hamming(k) for values of k from 50,000 to 1,000,000; the performance does appear to be roughly O(n) in both space and time, although I made no attempt to optimise and the space consumption is somewhat porcine.

                             usec    RSS
       n     sec.       Mb     /n   kb/n   value
 -------     ----    -----   ----   ----   --------------------------------------------------------------------
   50000     1.99     3768   39.8   75.4   2379126835648701693226200858624
  100000     4.41     5524   44.1   55.2   290142196707511001929482240000000000000
  150000     6.87     6784   45.8   45.2   136551478823710021067381144334863695872000000
  200000     9.42     7932   47.1   39.7   4479571262811807241115438439905203543080960000000
  250000    11.99     8932   48.0   35.7   29168801439715713801701411811093381120000000000000000
  300000    14.57     9944   48.6   33.1   62864273786544799804687500000000000000000000000000000000
  350000    17.18    10768   49.1   30.8   60133943158606419031489628585123616917982648729600000000000
  400000    20.06    14240   50.1   35.6   30774090693237851027531250000000000000000000000000000000000000
  450000    23.10    16104   51.3   35.8   9544028161703913537712243143807801346335324481500000000000000000
  500000    25.98    17392   52.0   34.8   1962938367679548095642112423564462631020433036610484123229980468750
  550000    28.98    18724   52.7   34.0   286188649932207038880389529600000000000000000000000000000000000000000
  600000    32.05    19256   53.4   32.1   31118367413797724237140050340541118674764220486395061016082763671875000
  650000    35.27    20332   54.3   31.3   2624748786803793305341077210881955201024000000000000000000000000000000000
  700000    38.13    21372   54.5   30.5   177288702931066945536000000000000000000000000000000000000000000000000000000
  750000    41.34    21536   55.1   28.7   9844116074857088479400896102109753641824661421606444941755765750304761446400
  800000    44.51    22280   55.6   27.9   458936503790258814279745536000000000000000000000000000000000000000000000000000
  850000    47.59    23896   56.0   28.1   18286806033409508387421738007105886936187744140625000000000000000000000000000000
  900000    50.64    23952   56.3   26.6   632306706993546185913146473467350006103515625000000000000000000000000000000000000
  950000    54.02    26552   56.9   27.9   19221158232427643481897048859403926759149694825267200000000000000000000000000000000
 1000000    57.36    26800   57.4   26.8   519312780448388736089589843750000000000000000000000000000000000000000000000000000000

So here's the code:

do 
  local meta = {}
  function meta:__index(k)
    if k == "tail" then
      local rv = self.gen(self)
      self.tail, self.gen = rv, nil
      return rv
    end
  end
  function Seq(h, gen)
    return setmetatable({head = h, gen = gen}, meta)
  end
end

function SMap(func, seq)
  return Seq(func(seq.head),
             function() return SMap(func, seq.tail) end)
end

function SMerge(seq1, seq2)
  local head1, head2 = seq1.head, seq2.head
  local step
  if head1 < head2 then
    function step() return SMerge(seq1.tail, seq2) end
  elseif head2 < head1 then
    function step() return SMerge(seq1, seq2.tail) end
    head1 = head2
  else
    function step() return SMerge(seq1.tail, seq2.tail) end
  end
  return Seq(head1, step)
end

function Times(k)
  if k then
    return function(x) return x * k end
  else
    return function(x, y) return x * y end
  end
end

function Hamming()
  local init = 1
  if Bignum then init = Bignum(init) end
  local seq = Seq(init)
  local seq2, seq3, seq5 =
        SMap(Times(2), seq), SMap(Times(3), seq), SMap(Times(5), seq)  
  function seq.gen() return SMerge(seq2, SMerge(seq3, seq5)) end
  return seq
end

function SeqTail(seq, k)
  for i = 1, k do
    seq = seq.tail
  end
  return seq
end

if arg then
  local start, finish, inc = tonumber(arg[1]), tonumber(arg[2]), tonumber(arg[3])
  if not start then start, finish, inc = 1, 20, 1 end
  local h = SeqTail(Hamming(), start-1)
  print("hamming", start, h.head)
  if finish then
    while start + inc <= finish do
      start = start + inc
      h = SeqTail(h, inc)
      print("hamming", start, h.head)
    end
  end
end

Here are a few more interesting infinite sequence functions, including a Fibonacci generator (although I wouldn't recommend it in practice, it does run in constant space and more-or-less linear time.)

function SAlways(val)
  return Seq(val, function(seq) return seq end)
end

function SDist(func, init, seq)
  return Seq(init, function(self) return SDist(func, func(self.head, seq.head), seq.tail) end)
end

function SMap2(func, seq1, seq2)
  return Seq(func(seq1.head, seq2.head),
             function() return SMap2(func, seq1.tail, seq2.tail) end)
end

function Plus(k)
  if k then
    return function(x) return x + k end
  else
    return function(x, y) return x + y end
  end
end

Iota = SDist(Plus(), 1, SAlways(1))

function Fib(i, j)
  i, j = i or 1, j or 1
  local seq = Seq(Bignum(i))
  seq.tail = SDist(Plus(), Bignum(j), seq)
  return seq
end

and here's the simple Bignum implementation (define this first if you want to test the above code):

do
  -- very limited bignum stuff; just enough for the examples here.
  -- Please feel free to improve it.
  local base = 1e15
  local fmt = "%015.0f"
  local meta = {}
  function meta:__lt(other)
    if self.n ~= other.n then return self.n < other.n end
    for i = 1, self.n do
      if self[i] ~= other[i] then return self[i] < other[i] end
    end
  end
  function meta:__eq(other)
    if self.n == other.n then
      for i = 1, self.n do
        if self[i] ~= other[i] then return false end
      end
      return true
    end
  end
  function meta:__mul(k)
    -- If the base where a multiple of all possible multipliers, then
    -- we could figure out the length of the result directly from the
    -- first "digit". On the other hand, printing the numbers would be
    -- difficult. So we accept the occasional overflow.
    local offset = 0
    if self[1] * k >= base then offset = 1 end
    local carry = 0
    local result = {}
    local n = self.n
    for i = n, 1, -1 do
      local tmp = self[i] * k + carry
      local digit = tmp % base
      carry = (tmp - digit) / base
      result[offset + i] = digit
    end
    if carry ~= 0 then
      n = n + 1
      if offset == 0 then
        for i = n, 2, -1 do
          result[i] = result[i - 1]
        end
      end
      result[1] = carry
    end
    result.n = n
    return setmetatable(result, meta)
  end
  -- Added so that Fib would work; other must be a Bignum
  function meta:__add(other)
    local result = {}
    if self.n < other.n then self, other = other, self end
    local n, m = self.n, other.n
    local diff = n - m
    local carry = 0
    local result = {}
    for i = m, 1, -1 do
      local tmp = self[i + diff] + other[i] + carry
      if tmp < base then
        carry = 0
      else
        tmp = tmp - base
        carry = 1
      end
      result[i + diff] = tmp
    end
    for i = diff, 1, -1 do
      local tmp = self[i] + carry
      if tmp < base then
        carry = 0
      else
        tmp = tmp - base
        carry = 1
      end
      result[i] = tmp
    end
    if carry > 0 then
      n = n + 1
      for i = n, 2, -1 do
        result[i] = result[i - 1]
      end
      result[1] = carry
    end
    result.n = n
    return setmetatable(result, meta)
  end

  function meta:__tostring()
    local tmp = {}
    tmp[1] = ("%.0f"):format(self[1])
    for i = 2, self.n do
      tmp[i] = fmt:format(self[i])
    end
    return table.concat(tmp)
  end
  function Bignum(k)
    return setmetatable({k, n = 1}, meta)
  end
end

For reference, the time/space chart was created (except for the heading) with the following (which will probably only work on FreeBSD or similar):

for ((i = 50000; $i<=1000000; i = $i + 50000)) do /usr/bin/time -apl -o hamming.log ./hamming.lua $i >> hamming.log; done
egrep '(hamming|user|maximum)' hamming.log | \
 lua -e 'local a = io.read"*a"; \
         for n, res, time, size in string.gfind(a, "(%d+)%s+(%d+)%D+([%d.]+)%s+(%d+)") do \
           print(string.format("%8i %8.2f %8i %6.1f %6.1f   %s", \
                               n, time, size, (time*1e6)/n, (size*1e3)/n, res)) \
         end' > hamming.res

--RiciLake

See Also


RecentChanges · preferences
edit · history
Last edited May 28, 2007 9:42 pm GMT (diff)