On Wed, May 09, 2007 at 04:19:17PM +0200, Philippe Lhoste wrote:
One (frequent?) use of MD5 is to compute a hash value for files and use
it for fast comparison (duplicates, is this file changed?, and so on).
I suppose that for this use, it is still OK, at worst involving a binary
comparison to be sure (for some uses).
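
A minimal sketch of that "hash first, binary compare to be sure" idea,
in Python. The helper names file_md5 and same_content are made up for
illustration; hashlib and filecmp are from the standard library.

```python
import filecmp
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Compare two files by digest, then confirm a match byte for byte."""
    if file_md5(path_a) != file_md5(path_b):
        # Different digests: the contents certainly differ.
        return False
    # Equal digests: very likely identical, but verify to rule out a collision.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

In a real duplicate finder the digests would normally be computed once
and cached, so the byte-for-byte pass only runs on files whose digests
already match.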
Aside: The Plan 9 OS has a storage system called "Venti", based on the idea
that each block of data in a stored file can be indexed by its hash
value. This allows it to store only one copy of the block's data, for
any number of files or copies of files that contain that data. It's
intended for archival storage. This design requires that every unique
block of data, in every file in the filesystem, has a unique hash
value. A collision results in data loss.
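
To make the idea concrete, here is a toy in-memory sketch of a
hash-addressed block store, in Python. It is not Venti's actual
interface or on-disk format; BlockStore, store_data, read_data and the
8192-byte block size are all assumptions chosen for illustration.

```python
import hashlib


class BlockStore:
    """Toy store that keeps each distinct block once, keyed by its SHA-1."""

    def __init__(self):
        self._blocks = {}

    def put(self, data):
        key = hashlib.sha1(data).hexdigest()
        # If the key already exists, the existing block is kept. For identical
        # data this is the deduplication; for a colliding block it would mean
        # the new data is silently lost.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key):
        return self._blocks[key]

    def __len__(self):
        return len(self._blocks)


def store_data(store, data, block_size=8192):
    """Split data into fixed-size blocks and return the list of block keys."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def read_data(store, keys):
    """Reassemble the original bytes from a list of block keys."""
    return b"".join(store.get(k) for k in keys)
```

Storing two "files" that share a block only adds the blocks the store
has not seen before; each file is then just its list of keys.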
"Using the Sha1 hash function, the probability of a collision is
less than 10^-20. Such a scenario seems sufficiently unlikely that
we ignore it [...]"
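
A figure of that size can be sanity-checked with the usual birthday
bound, p <= n(n-1)/2 * 2^-b for n blocks and a b-bit hash. The sketch
below is in Python; the block count of 10^14 is my own illustrative
assumption (roughly an exabyte in 8 KB blocks), and collision_bound is
a made-up helper name.

```python
from fractions import Fraction


def collision_bound(n_blocks, hash_bits):
    """Upper bound on the chance that any two of n_blocks random hashes collide."""
    pairs = n_blocks * (n_blocks - 1) // 2  # number of distinct block pairs
    return Fraction(pairs, 2 ** hash_bits)


# Assumed workload: 10**14 blocks hashed with 160-bit SHA-1.
bound = collision_bound(10 ** 14, 160)
print(float(bound))
```

Under that assumption the bound comes out below 10^-20. It only covers
accidental collisions between unrelated blocks, not collisions that
someone constructs on purpose.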
-Dave Dodge