I created 4 experimental ZPAQ compatible BWT based compressors. The goal is fast but reasonably good compression. Some benchmarks:
Times are process times in seconds with zpaq v2.05 on one of two cores to compress and decompress enwik8 on a 2.0 GHz T3200 with 3 GB memory under 32 bit Windows. The current version sorts the whole file into 1 block so it is limited to files smaller than 128 MB and uses 640 MB memory. These are experimental models. A production version (maybe added to pzpaq) would split the input into smaller blocks to remove the file size limitation and process them in separate threads. Compression would also skip the preprocessor check (26 seconds on enwik after I am sure there are no bugs in it.
config calgary.tar enwik8 Time Memory
------ ----------- ------ ------- ------
original 3152896 100000000 0 0 0 MB
bwt1 844305 21965906 95 69 671
bwt2 800498 21246030 122 93 671
bwtrle1 861300 22823439 82 56 671
bwtrle2 828249 21985593 97 66 672
min 1027299 33460930 41 43 4
fast 806959 24837469 78 86 38
mid 699191 20941558 265 301 111
max 644190 19448650 716 727 246
BWT encoding uses libdivsufsort-lite (30 sec. for enwik and inverse BWT coded in both C++ and ZPAQL for comparison. BWT decoding enwik8 takes about 23 sec. in C++, 26 sec. in automatically converted ZPAQL (both compiled with g++ 4.5.0), and 61 sec. when interpreted. I modified the BWT to include the EOF byte to simplify the decoding.
As an optimization, two of the config files also run length encode after BWT. Runs of 2-257 or more are encoded as the first 2 bytes plus a count byte. RLE is very fast, 2 seconds at most, and speeds up modeling at some cost in compression. The models are more complicated because they have to model the RLE parse state (literal or count) as context.
For each model, I used a fast version (bwt*1.cfg) and slower (bwt*2.cfg) with 1 or 2 ZPAQ components respectively. The fast version is an indirect order 0 model. The slow version is an order 0-1 ISSE chain.
I also tried experimenting with BWT+MTF (move to front) and BWT+MTF+RLE but compression was a lot worse and MTF offers no speedup.. RLE does speed up compression but the size penalty seems larger than I was expecting.
Code: bwt.1.zip from http://mattmahoney.net/dc/zpaq.html#configurations
Technical details in the readme file.