- Subject: Re: Pentium 4 and misaligned doubles
- From: Rici Lake <lua@...>
- Date: Tue, 16 Aug 2005 00:13:43 -0500
Just to confirm the previous message: it occurred to me that in an array,
every sixth value will cross a cache-line boundary. Indeed, empirical
results show that there is an effect:
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] + 1 end
13.51 real 13.44 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] + 1 end
13.54 real 13.49 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] + 1 end
13.51 real 13.44 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] + 1 end
13.52 real 13.47 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] + 1 end
13.51 real 13.45 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] + 1 end
16.05 real 15.97 user 0.03 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] + 1 end
13.61 real 13.55 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] + 1 end
13.63 real 13.58 user 0.00 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] end
10.91 real 10.85 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] end
10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] end
10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] end
10.91 real 10.85 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] end
10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] end
13.41 real 13.35 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] end
10.91 real 10.86 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] end
10.86 real 10.82 user 0.01 sys
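For anyone who wants to repeat the runs above from a single script, here is a rough sketch. It is not the harness used for the numbers in this message: the helper name `time_index` and the use of `os.clock` are assumptions of the sketch. Each loop is compiled from a string so that the index stays a constant in the bytecode, as in the one-liners above, and the iteration count is reduced so the script finishes quickly.

```lua
-- Hypothetical harness, not the one used for the timings above.
-- loadstring exists in Lua 5.1; load accepts strings in Lua 5.2 and later.
local compile = loadstring or load

-- Compiles and runs the increment loop for one constant index and
-- returns the CPU time consumed, in seconds, as reported by os.clock.
local function time_index(index, iterations)
  local src = string.format(
    "local a = {1,2,3,4,5,6,7,8}; for i = 1, %d do a[%d] = a[%d] + 1 end",
    iterations, index, index)
  local chunk = assert(compile(src))
  local start = os.clock()
  chunk()
  return os.clock() - start
end

local ITERATIONS = 1000000  -- the runs above used 1e8
for index = 1, 8 do
  print(string.format("a[%d]: %.2f s", index, time_index(index, ITERATIONS)))
end
```

Absolute times will differ by machine and Lua version, and a table created inside a longer-running script may not land at the same address as in a fresh interpreter, so the slow index (if any) could move.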
Note that the constant 6 is probably also misaligned. Also, there is a
difference between the layout of Lua 5.0.2 and Lua 5.1w6, so although
the periodicity should be the same, it will probably have a different
offset.
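The effect of the offset can be sketched with a little arithmetic. The following assumes, and these are assumptions of the sketch rather than measurements, that each array slot is 12 bytes (an 8-byte double followed by a 4-byte type tag), that cache lines are 64 bytes, and that the array part starts `base` bytes past a line boundary. The function name `straddling_indices` is made up for illustration, and the code needs Lua 5.1 or later.

```lua
-- Assumed layout, not taken from the Lua sources: 12-byte slots holding
-- an 8-byte double first, and 64-byte cache lines.
local SLOT_SIZE, DOUBLE_SIZE, LINE_SIZE = 12, 8, 64

-- Returns the 1-based indices in 1..n whose double spans two cache lines,
-- for an array starting `base` bytes past a cache-line boundary.
local function straddling_indices(n, base)
  base = base or 0
  local hits = {}
  for i = 1, n do
    local first = base + (i - 1) * SLOT_SIZE  -- first byte of the double
    local last = first + DOUBLE_SIZE - 1      -- last byte of the double
    if math.floor(first / LINE_SIZE) ~= math.floor(last / LINE_SIZE) then
      hits[#hits + 1] = i
    end
  end
  return hits
end

print(table.concat(straddling_indices(8), ", "))      -- array starts on a line boundary
print(table.concat(straddling_indices(8, 12), ", "))  -- array starts 12 bytes later
```

With `base` at 0 the only affected index among the first eight is 6, which agrees with the timings above; moving the start by one slot moves the affected index to 5.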
I should also add that I'm doing these tests with FreeBSD, whose
allocator will not artificially cross a 64-byte boundary. (That wasn't a
design goal, as far as I know -- the documentation only talks about
page boundaries -- but it is a consequence of rounding all small
allocations up to a power of two, and then assigning a whole page to
objects of the same size.) I believe the Linux allocator is more
memory-conservative, and may allocate a small object across a 64-byte
boundary.
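To illustrate why power-of-two size classes avoid the problem, here is a minimal sketch that counts how many objects in one page would span two 64-byte lines under two layouts. The 4096-byte page, the 24-byte object size, and the helper names are all assumptions made for the example, and the tightly packed layout is a deliberate simplification, not a model of any real allocator.

```lua
local LINE_SIZE, PAGE_SIZE = 64, 4096  -- assumed sizes

-- Smallest power of two that is >= n.
local function round_up_pow2(n)
  local p = 1
  while p < n do p = p * 2 end
  return p
end

-- True if an object of `size` bytes at `offset` spans two cache lines.
local function crosses_line(offset, size)
  return math.floor(offset / LINE_SIZE)
      ~= math.floor((offset + size - 1) / LINE_SIZE)
end

-- Counts objects of `size` bytes that span two cache lines when one is
-- placed every `stride` bytes from the start of a page.
local function count_crossings(size, stride)
  local count, offset = 0, 0
  while offset + size <= PAGE_SIZE do
    if crosses_line(offset, size) then count = count + 1 end
    offset = offset + stride
  end
  return count
end

local size = 24  -- hypothetical small object
print("size-class layout:", count_crossings(size, round_up_pow2(size)))
print("packed layout:    ", count_crossings(size, size))
```

Because a power-of-two stride no larger than 64 always divides the line size evenly, no slot in the size-class layout can span two lines; the packed layout has no such guarantee.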