## Re: [the_gdf] About the Merkle Hash Trees

Expand Messages
• ... You would need the hash tree up to that level. If the root itself comes from a trusted source, the hash tree itself doesn t necessarily have to be from a
Message 1 of 3 , Dec 14, 2005
• 0 Attachment
> I hava some question about the Merkle Hash Trees. that is said,"S1 can be
> combined up to equal the ROOT hash, even without seeing the other
> segments."used the other internal hashes,but how can I get the other
> internal hashes ,from other untrusted source ?

You would need the hash tree up to that level. If the root itself
comes from a trusted source, the hash tree itself doesn't necessarily
have to be from a trusted source because you can verify the tree
using the root hash.

--
Christian
• From: Christian Biere ... True, but only if the hashing function used in to combine nodes in the Merkle Tree is cryptographically
Message 2 of 3 , Dec 14, 2005
• 0 Attachment
From: "Christian Biere" <christianbiere@...>
>> I hava some question about the Merkle Hash Trees. that is said,"S1 can be
>> combined up to equal the ROOT hash, even without seeing the other
>> segments."used the other internal hashes,but how can I get the other
>> internal hashes ,from other untrusted source ?
>
> You would need the hash tree up to that level. If the root itself
> comes from a trusted source, the hash tree itself doesn't necessarily
> have to be from a trusted source because you can verify the tree
> using the root hash.

True, but only if the hashing function used in to combine nodes in the Merkle Tree is cryptographically strong. Otherwise it would extremely easy to create an alternate hash tree that matches the root hash. (For example the simplest addition or XOR function allows this, because these hashing functions is obviously not cyptographically strong, as they are reversible in a small fixed number of algebric operations).

So you need a one-way hash function (which is not easily reversible) and which has no obvious collisions that can be computed in small order of steps face to the order of the number of possible result values with the hash function. A good (cyrptographically strong) hash function should not be reversible and should not allow finding collisions in less that O(2^(N/2)) algebric steps (where the result value, and the computing time or required resources to compute it, are both considered as part of a result) for hash result sizes of N bits (i.e. with about (2^N) possible result values if all values are possible).

In fact it isquite hard to find one-way collision-free hash functions that have this property (and it's also extremely difficult to demonstrate that these functions are collision-free) Most secure hash functions comes from the experience in cryptography researches (notably in cryptanalysis). Often, some collisions arefound that reduce the cryptographic strength of a hash function of N bits by some bits (i.e. binary orders of magnitude). As long as this number of weakbits remains small, the rest of the bits remain strong, but this allows comparing the effective strength of two hashfunctions with the same bit-size.

Today, MD5 is considered much less trong than SHA1 truncated to the same number of bits, Tiger is still stronger than SHA1 but still longer to compute on 32-bit architectures, but will be faster on 64-bit even with alarger bitsize, SHA256 and SHA512 are until now the state of the art for known public algorithms (but they are much longer to compute, and can't be used assimple replacement of SHA1 for small devices, unless the data to hash is kept small, in which case the difference does not matter).

In Merkle hash trees, the data to hash at each node is small, but the total sizeof the data to hash is in the same order of magnitude (but a bit larger than) as the total file size. For Gnutella, sharing lots of files requires expensive hash computing time, so the performance of the hashalgorithm on today's common architectures is definitely a limiting factor.

But one good thing in the Merkle Hash Trees is that you don't need to get the full file data to compute the total file hash (i.e. the root hash in the Merkle Tree) because you cancompute it from a smaller digest consisting in hashes of seprate parts of the file. This partial digest is called the hash tree data. The smallest digest is of course the Merkle root hash value, whose size is simply equal to the size of its associated hashing function (for Tiger-based Merkle trees used in Gnutella, the root hash is 192-bit wide), but this allows verifying only the full file data after you have finished downloading it.