Re: [the_gdf] About the Merkle Hash Trees
> I hava some question about the Merkle Hash Trees. that is said,"S1 can beYou would need the hash tree up to that level. If the root itself
> combined up to equal the ROOT hash, even without seeing the other
> segments."used the other internal hashes,but how can I get the other
> internal hashes ,from other untrusted source ?
comes from a trusted source, the hash tree itself doesn't necessarily
have to be from a trusted source because you can verify the tree
using the root hash.
- From: "Christian Biere" <christianbiere@...>
>> I hava some question about the Merkle Hash Trees. that is said,"S1 can beTrue, but only if the hashing function used in to combine nodes in the Merkle Tree is cryptographically strong. Otherwise it would extremely easy to create an alternate hash tree that matches the root hash. (For example the simplest addition or XOR function allows this, because these hashing functions is obviously not cyptographically strong, as they are reversible in a small fixed number of algebric operations).
>> combined up to equal the ROOT hash, even without seeing the other
>> segments."used the other internal hashes,but how can I get the other
>> internal hashes ,from other untrusted source ?
> You would need the hash tree up to that level. If the root itself
> comes from a trusted source, the hash tree itself doesn't necessarily
> have to be from a trusted source because you can verify the tree
> using the root hash.
So you need a one-way hash function (which is not easily reversible) and which has no obvious collisions that can be computed in small order of steps face to the order of the number of possible result values with the hash function. A good (cyrptographically strong) hash function should not be reversible and should not allow finding collisions in less that O(2^(N/2)) algebric steps (where the result value, and the computing time or required resources to compute it, are both considered as part of a result) for hash result sizes of N bits (i.e. with about (2^N) possible result values if all values are possible).
In fact it isquite hard to find one-way collision-free hash functions that have this property (and it's also extremely difficult to demonstrate that these functions are collision-free) Most secure hash functions comes from the experience in cryptography researches (notably in cryptanalysis). Often, some collisions arefound that reduce the cryptographic strength of a hash function of N bits by some bits (i.e. binary orders of magnitude). As long as this number of weakbits remains small, the rest of the bits remain strong, but this allows comparing the effective strength of two hashfunctions with the same bit-size.
Today, MD5 is considered much less trong than SHA1 truncated to the same number of bits, Tiger is still stronger than SHA1 but still longer to compute on 32-bit architectures, but will be faster on 64-bit even with alarger bitsize, SHA256 and SHA512 are until now the state of the art for known public algorithms (but they are much longer to compute, and can't be used assimple replacement of SHA1 for small devices, unless the data to hash is kept small, in which case the difference does not matter).
In Merkle hash trees, the data to hash at each node is small, but the total sizeof the data to hash is in the same order of magnitude (but a bit larger than) as the total file size. For Gnutella, sharing lots of files requires expensive hash computing time, so the performance of the hashalgorithm on today's common architectures is definitely a limiting factor.
But one good thing in the Merkle Hash Trees is that you don't need to get the full file data to compute the total file hash (i.e. the root hash in the Merkle Tree) because you cancompute it from a smaller digest consisting in hashes of seprate parts of the file. This partial digest is called the hash tree data. The smallest digest is of course the Merkle root hash value, whose size is simply equal to the size of its associated hashing function (for Tiger-based Merkle trees used in Gnutella, the root hash is 192-bit wide), but this allows verifying only the full file data after you have finished downloading it.
With a partial tree of height 8, you can verify the file data partially downloaded in fragments that are between 1/128th and 1/256th of the total filesize, and you can verify all of them individually immediately whenthey are downloaded. The size of the fragments does not matter, because you can still double the downloaded fragment size and get the same result for the total Merkle hash function. This means that you don't need to download the whole hash tree, but only its leaf nodes (the rest can be deduced and computed by the recipient. (This divides by about 2 the size of the download for the tree data, in order to compute the same complete binary Merkle tree).
Note that for small file fragments, it is cryptographically difficult to find a collision for each fragment, but it won't be more difficult to compute for the original file. With more fragments, you easily multiply the strength of your download. So the strength of the upload is not limited by the Merkle tree itself or by the sender, but by the choice of the bas cryptograpic hash function, the bitsize of its generated key, and the n-arity of the Merkle Tree (in Gnutella the trees are binary which results in the highest strength; higher arities would not be significantly faster to compute for uploaders that share their Merkle tree data, because almost all of the computing time is spent in computing the hashes for leaf nodes associated to the smallest file fragments).