Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
> * Because the GC traverses all objects in the system, it is one of the
> main "candidates" to suffer from a memory corruption created somewhere
> else.

There was a recent article linked via Hacker News about one of the
popular web cache applications (maybe it was Varnish?). Anyway, this
program uses a lot of RAM, and they've been seeing unreproducible bugs
occasionally. The authors suspect it is a hardware problem in many
cases, so they've incorporated a memory test as part of their crash
dump. They were doing something clever with multiple XOR passes to
check the memory and still preserve the original data.
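
I don't remember the details of what they actually did, so the following
is only my own rough sketch of the general idea, not their code. The
function name xor_memtest and the two bit patterns are made up for
illustration. Each word is flipped with XOR, read back, then flipped
again to restore it, so the data survives as long as the memory is
healthy:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the article: test `nwords` words
 * starting at `mem` without destroying their contents.
 * Returns the number of failed readback checks (0 on healthy memory). */
size_t xor_memtest(volatile uintptr_t *mem, size_t nwords)
{
    /* Two complementary patterns, so every bit gets flipped by one pass.
     * The constants are truncated to the word size on 32-bit targets. */
    static const uintptr_t patterns[] = {
        (uintptr_t)0x5555555555555555ULL,
        (uintptr_t)0xAAAAAAAAAAAAAAAAULL,
    };
    size_t bad = 0;

    for (size_t p = 0; p < sizeof patterns / sizeof patterns[0]; p++) {
        for (size_t i = 0; i < nwords; i++) {
            uintptr_t orig = mem[i];
            uintptr_t flipped = orig ^ patterns[p];

            mem[i] = flipped;                /* XOR in: flip selected bits */
            if (mem[i] != flipped)
                bad++;                       /* a bit refused to change */

            mem[i] = mem[i] ^ patterns[p];   /* XOR out: restore the word */
            if (mem[i] != orig)
                bad++;                       /* the word did not come back */
        }
    }
    return bad;
}
```

The volatile qualifier only keeps the compiler from optimizing the
accesses away. The CPU caches can still absorb the reads and writes, so
a check like this inside a running process is much weaker than a
dedicated memory tester run on the whole machine.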
At any rate, many hosting providers are not using ECC memory on their
servers, and as memory sizes keep getting larger and larger (and
process size for the memory itself smaller and smaller) the rate of
bit errors due to cosmic radiation keeps going up.
The short version: also try to run a memory test on the server and see
if that turns up anything.
James Graves