lua-l archive


On 3/23/2012 6:44 PM, Alexander Gladysh wrote:
On Fri, Mar 23, 2012 at 13:37, KHMan wrote:
On 3/22/2012 2:37 AM, Luiz Henrique de Figueiredo wrote:

same files (timing in sec, lower of 2 runs)

=========================================
dataset ->        md5     sha1    sha256
--------------------------------------
lua-5.1.5       0.658   0.670   0.715

These are very short run times. I'd say it is better to loop for
1, 100 to get run times of at least a few seconds; this way your
measurements would depend less on various random factors.

Main loop x100...

same files (timing in sec, lower of 2 runs)
=========================================
dataset ->      md5     sha1    sha256
 hex str len    32      40      64
--------------------------------------
lua-5.1.5       2.584   2.599   2.650
lua-5.2.0       2.314   2.332   2.401
lua-5.2.1wk1
 shortlen=16    3.043   3.240   3.462
 shortlen=32    2.505   2.909   3.137
 shortlen=48    2.288   2.313   2.956
 shortlen=64    2.303   2.300   2.364
 shortlen=128   2.326   2.332   2.406
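
For anyone wanting to reproduce "lower of 2 runs" figures from inside a script, here is a minimal sketch. The post does not say how the timings above were taken, so this is only one possible approach; best_of and workload are hypothetical names, and os.clock reports CPU time rather than wall-clock time.

```lua
-- Hypothetical helper, not from the original post: call fn `runs` times
-- and return the lowest CPU time (in seconds) reported by os.clock.
local function best_of(runs, fn)
  local best = math.huge
  for _ = 1, runs do
    local t0 = os.clock()
    fn()
    local dt = os.clock() - t0
    if dt < best then best = dt end
  end
  return best
end

-- Stand-in workload; in practice this would be the x100 main loop.
local function workload()
  local n = 0
  for _ = 1, 100 do
    for j = 1, 1000 do n = n + j end
  end
  return n
end

print(string.format("best of 2: %.3f sec", best_of(2, workload)))
```

Taking the lower of the runs discards a run that was slowed by unrelated system activity.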

The difference in performance between long and short string comparisons is now more pronounced.
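
To read the table, it helps to see which digest strings fall on which side of each shortlen setting. A small sketch follows, assuming a string counts as "short" when its length is less than or equal to shortlen (my understanding of the 5.2.1 work design, where short strings are interned and long strings are compared by content). is_short and digest_len are names made up for this illustration.

```lua
-- Hex digest lengths, taken from the "hex str len" row of the table.
local digest_len = { md5 = 32, sha1 = 40, sha256 = 64 }

-- Assumption: the limit is inclusive, i.e. len <= shortlen means "short".
local function is_short(len, shortlen)
  return len <= shortlen
end

for _, shortlen in ipairs({ 16, 32, 48, 64, 128 }) do
  local row = {}
  for _, name in ipairs({ "md5", "sha1", "sha256" }) do
    local kind = is_short(digest_len[name], shortlen) and "short" or "long"
    row[#row + 1] = name .. "=" .. kind
  end
  print("shortlen=" .. shortlen, table.concat(row, " "))
end
```

Under that assumption, sha256 digests are still long strings at shortlen=48, which matches the slower sha256 figure on that row.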

test-same-files2.lua
====================

-- usage: lua test-same-files2.lua <hashlistA> <hashlistB>
-- each input line has the form: <hex digest> <whitespace> <file path>
local io = require "io"
local string = require "string"
-- map file path -> hash for the first list
local setA = {}
for l in io.lines(arg[1]) do
  local hashA, fpathA = string.match(l, "^(%S+)%s+(.+)$")
  setA[fpathA] = hashA
end
-- keep the second list as parallel arrays so the timed loop does no parsing
local hashesB, fpathsB = {}, {}
for l in io.lines(arg[2]) do
  local h, fp = string.match(l, "^(%S+)%s+(.+)$")
  hashesB[#hashesB + 1] = h
  fpathsB[#fpathsB + 1] = fp
end
local identicaln, changedn = 0, 0
for i = 1, 100 do -- repeated
  local identical, changed = {}, {}
  for j = 1, #fpathsB do
    local hashB = hashesB[j]
    local fpathB = fpathsB[j]
    local hashA = setA[fpathB] -- table lookup keyed by the path string
    if hashA then
      if hashA == hashB then -- the str==str compare being measured
        identical[#identical + 1] = fpathB
      else
        changed[#changed + 1] = fpathB
      end
    end
  end
  identicaln = identicaln + #identical
  changedn = changedn + #changed
end
-- totals are summed over all 100 repetitions
print("Files identical: "..identicaln)
print("Files changed:   "..changedn)

With the "if hashA == hashB then" block deleted, so that only matching fpaths are counted, the effect of removing str==str is as follows:

no str==str (timing in sec, lower of 2 runs)
=========================================
dataset ->      md5     sha1    sha256
 hex str len    32      40      64
--------------------------------------
lua-5.1.5       2.611   2.617   2.666
lua-5.2.0       2.317   2.310   2.379
lua-5.2.1wk1
 shortlen=16    2.733   2.874   2.968
 shortlen=32    2.466   2.510   2.594
 shortlen=48    2.295   2.297   2.331
 shortlen=64    2.281   2.271   2.334
 shortlen=128   2.309   2.319   2.375

When shortlen=48, most setA[fpathB] lookups involve short strings only, i.e. most file paths fit within the short string limit.
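
For reference, the inner loop of the "no str==str" variant could look roughly like the sketch below. The modified script was not posted, so this is a reconstruction from the description; count_common is a hypothetical name.

```lua
-- Hypothetical reconstruction: count the paths in fpathsB that also occur
-- as keys in setA, without ever comparing the hash strings.
local function count_common(setA, fpathsB)
  local common = {}
  for j = 1, #fpathsB do
    local fpathB = fpathsB[j]
    if setA[fpathB] then -- table lookup only, no str==str
      common[#common + 1] = fpathB
    end
  end
  return #common
end

-- Tiny usage example with made-up data.
local setA = { ["a.c"] = "00ff", ["b.c"] = "11ee" }
print(count_common(setA, { "a.c", "x.c", "b.c" }))
```

What remains in this variant is the table lookup keyed by the path string, so the timings mostly reflect how paths are handled rather than how digests are compared.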

--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia