I'm publishing the latest algorithm I've developed for FreeArc 0.40:
Delta: binary tables preprocessor v1.0 (c) Bulat.Ziganshin@gmail.com 2008-03-13
This algorithm preprocesses data to improve its subsequent compression. It detects tables
of binary records and 1) subtracts successive values in columns, 2) reorders columns, trying
to maximize the gain from subsequent compression.
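To make the subtraction step concrete, here is a minimal sketch in Python. It is not the code from delta10.zip: it assumes the record size is already known, treats every column as a single byte, and skips column reordering entirely. The names delta_encode and delta_decode are made up for illustration.

```python
def delta_encode(data: bytes, record_size: int) -> bytes:
    """Replace each byte by its difference (mod 256) from the byte one record earlier."""
    out = bytearray(data)
    for i in range(record_size, len(data)):
        # Read from the original data so earlier writes do not affect later ones.
        out[i] = (data[i] - data[i - record_size]) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, record_size: int) -> bytes:
    """Inverse of delta_encode: rebuild each byte from the already restored previous record."""
    out = bytearray(data)
    for i in range(record_size, len(out)):
        out[i] = (out[i] + out[i - record_size]) & 0xFF
    return bytes(out)


# Three 4-byte records whose first column counts up by one.
table = bytes([10, 0, 0, 7,
               11, 0, 0, 7,
               12, 0, 0, 7])
encoded = delta_encode(table, 4)
```

After encoding, every record except the first becomes 1, 0, 0, 0, which a general-purpose compressor handles much better than the original counting column.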
The algorithm has 3 phases:
1) Preliminary table detection. It finds 6+ repetitions of the same byte at the same distance,
i.e. anything like a...a...a...a...a...a where '.' denotes any byte except for 'a'.
This is done in delta_compress.
2) Candidates detected in the first phase are then checked by FAST_CHECK_FOR_DATA_TABLE, which looks for
a monotonic sequence of bytes at a fixed distance. Most candidates found in the first phase are
filtered out here.
3) The remaining candidates are tested by slow_check_for_data_table(), which finds the exact table boundaries
and determines which columns should and shouldn't be subtracted. The table is finally processed
only if it is large enough.
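The first two phases can be sketched roughly as follows. This is a Python illustration written from the description above, not a port of the real code: the names find_candidates and looks_monotonic, the max_dist limit, and the rule that a constant column does not count as monotonic are all assumptions of mine.

```python
def find_candidates(data: bytes, min_reps: int = 6, max_dist: int = 64):
    """Phase 1 sketch: report (start, dist) whenever some byte value has occurred
    min_reps times in a row at the same distance dist, with no other occurrence
    of that value in between."""
    last_pos = [-1] * 256   # last position where each byte value was seen
    last_dist = [0] * 256   # gap between its two most recent occurrences
    run = [1] * 256         # occurrences seen so far at that same gap
    candidates = []
    for i, b in enumerate(data):
        if last_pos[b] >= 0:
            dist = i - last_pos[b]
            if dist == last_dist[b] and 1 < dist <= max_dist:
                run[b] += 1
                if run[b] == min_reps:
                    candidates.append((i - (min_reps - 1) * dist, dist))
            else:
                # Gap changed (or is out of range): start a new run of two.
                last_dist[b] = dist
                run[b] = 2
        last_pos[b] = i
    return candidates


def looks_monotonic(data: bytes, start: int, dist: int, count: int = 6) -> bool:
    """Phase 2 sketch: are the bytes at start, start+dist, ... monotonic and not constant?"""
    column = data[start:start + count * dist:dist]
    if len(column) < count or column[0] == column[-1]:
        return False
    pairs = list(zip(column, column[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


# Six 4-byte records: a constant 'a' (97) followed by three counting columns.
table = bytes(x for k in range(6) for x in (97, k, k + 50, k + 100))
candidates = find_candidates(table)
```

Here the repeated 'a' yields a single candidate with distance 4, and the counting column at offset 1 passes the monotonic check, while the constant 'a' column itself does not.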
The algorithm processes 20 MB/sec on a 1 GHz CPU, but I'm sure the speed can be increased 3-fold.
http://www.haskell.org/bz/delta10.zip
The http://www.haskell.org/bz page now includes all the algorithms I've developed for FreeArc so far.

