I've released the complete (that is, with core deduplication library) source on www.exdupe.com
Problem with rsync and other rolling hash functions is that they scatter their hash table access randomly, creating one expensive L1 cache miss per input byte.
eXdupe computes hash values in another way. For any 8 KB chunk it identifies the first position that fulfills
char((SMALL_PRIME * (653 + src[i] + src[i + 33*percent] + src[i + 66*percent] + src[i + len - slide - 4]))) == 0, where i is incremented in a loop.
Obviously this is fulfilled after 256 bytes on average. From *that* i-position, a quick hash (let's call it Q) is computed from a handfull of bytes.
The Q value is used as index for a stronger hash of the entire 8 KB block, starting from offset 0, so that hashtable[Q] = strong hash.
eXdupe performs two passes on the input data: First it hashes each concecutive 8 KB block. Next it uses an 8 KB sliding window to test for blocks seen in first pass. The inner loop of the second pass is basically just the above loop that searches for a position - it doesn't do any hashtable lookups whatsoever (only after 256 bytes on average).