Here is a preview of PAQ9
http://cs.fit.edu/~mmahoney/compression/paq9a.zip
It uses the same archive format and command line interface as lpq1 (based on lpaq1) and has a similar compression ratio, but it is a little slower. For highly redundant files (like fp.log), compression is faster. It lacks models for specific file types like exe, bmp, wav, jpeg, etc. I plan to add those later. I plan to have the file type detected automatically (or set by command line options) and to turn on only the models needed. This should improve speed over paq8, which has most of its models on at once.
paq9a has an LZP preprocessor. It codes a match using one bit and a literal using 9 bits, which are modeled separately. This speeds up compression of highly redundant files but also hurts compression because you have to include the mispredicted byte as context for modeling the literals. As a compromise, only matches of length 12 or more are coded and only longer contexts include the extra byte.
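To make the match/literal split concrete, here is a toy LZP pass in Python. It is only a sketch, not the paq9a source. The minimum length of 12 and the 1-bit/9-bit costs come from the description above. Everything else is an assumption made for illustration: the 4-byte hashed context, the table size, the hash function, counting match length as the number of consecutive correct predictions, and emitting a token list instead of feeding an arithmetic coder.

```python
MIN_LEN = 12     # from the post: only matches of length 12 or more are coded
CTX_LEN = 4      # assumption: bytes hashed to find an earlier occurrence
HASH_BITS = 16   # assumption: size of the toy hash table


def _hash(ctx):
    # Assumed multiplicative hash; any deterministic hash would do here.
    return ((int.from_bytes(ctx, "little") * 2654435761) & 0xFFFFFFFF) >> (32 - HASH_BITS)


class LZP:
    """Predictor state kept identically by encoder and decoder (hypothetical helper)."""

    def __init__(self):
        self.buf = bytearray()
        self.table = [-1] * (1 << HASH_BITS)  # context hash -> index of the byte that followed
        self.match_ptr = -1                   # index in buf of the predicted next byte
        self.match_len = 0                    # consecutive bytes predicted correctly

    def predict(self):
        """Return the predicted byte, or None if the match is shorter than MIN_LEN."""
        if self.match_ptr >= 0 and self.match_len >= MIN_LEN:
            return self.buf[self.match_ptr]
        return None

    def update(self, byte):
        # Extend the current match if it predicted this byte, otherwise drop it.
        if self.match_ptr >= 0 and self.buf[self.match_ptr] == byte:
            self.match_len += 1
            self.match_ptr += 1
        else:
            self.match_len = 0
            self.match_ptr = -1
        self.buf.append(byte)
        if len(self.buf) >= CTX_LEN:
            h = _hash(bytes(self.buf[-CTX_LEN:]))
            if self.match_ptr < 0:
                self.match_ptr = self.table[h]  # may still be -1 (context never seen)
            self.table[h] = len(self.buf)


def encode(data):
    """Turn bytes into tokens: ('match',) is 1 flag bit, ('lit', b) is flag + 8 bits."""
    lzp, out = LZP(), []
    for b in data:
        out.append(("match",) if lzp.predict() == b else ("lit", b))
        lzp.update(b)
    return out


def decode(tokens):
    lzp = LZP()
    for t in tokens:
        b = lzp.predict() if t[0] == "match" else t[1]
        lzp.update(b)
    return bytes(lzp.buf)


def raw_bits(tokens):
    """Bits before any modeling: 1 per match, 9 per literal."""
    return sum(1 if t[0] == "match" else 9 for t in tokens)
```

On repetitive input most tokens become 1-bit matches, which is where the speedup comes from: far fewer bits reach the models. In a real compressor the flag and the literal bits would then be predicted and arithmetic coded, so the raw counts above are not final sizes.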
I also experimented with a different form of context mixing. Instead of mixing all the contexts at once (using a neural network), I used a chain of 2-input mixers (each again a small neural network). I am not sure if this compresses better, though.
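A minimal Python sketch of what a chain of 2-input mixers could look like, assuming the usual logistic mixing (weighted sum in the stretch domain, weights nudged in proportion to the prediction error). The class names, learning rate, initial weights and the order in which predictions enter the chain are assumptions for illustration, not taken from paq9a.

```python
import math


def stretch(p):
    p = min(max(p, 1e-6), 1.0 - 1e-6)   # keep the log finite
    return math.log(p / (1.0 - p))


def squash(x):
    x = min(max(x, -30.0), 30.0)        # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class Mixer2:
    """One 2-input mixer: p = squash(w0*stretch(pa) + w1*stretch(pb))."""

    def __init__(self, lr=0.02):
        self.w = [0.5, 0.5]             # assumed start: plain average in the stretch domain
        self.lr = lr
        self.x = [0.0, 0.0]
        self.p = 0.5

    def mix(self, pa, pb):
        self.x = [stretch(pa), stretch(pb)]
        self.p = squash(self.w[0] * self.x[0] + self.w[1] * self.x[1])
        return self.p

    def update(self, bit):
        err = bit - self.p              # gradient of coding cost w.r.t. the weighted sum
        for i in range(2):
            self.w[i] += self.lr * err * self.x[i]


class MixerChain:
    """Feeds each mixer's output into the next one together with one more prediction."""

    def __init__(self, n_inputs, lr=0.02):
        self.mixers = [Mixer2(lr) for _ in range(n_inputs - 1)]

    def mix(self, preds):
        p = preds[0]
        for m, q in zip(self.mixers, preds[1:]):
            p = m.mix(p, q)
        return p

    def update(self, bit):
        for m in self.mixers:
            m.update(bit)
```

With n predictions the chain holds n-1 mixers of two weights each, and every mixer is trained on the same coded bit using its own output error.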
Also, I reimplemented the APM (SSE) as a 2-input mixer with one input fixed. This saves memory and reduces the number of free parameters, which should help for small files or rapidly changing data.
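As a rough illustration of an APM built from a 2-input mixer, the Python sketch below keeps just two weights per context: one for the stretched input probability and one for a constant input that acts as a bias. The constant value, learning rate, initial weights and names are assumptions, not the paq9a code.

```python
import math

FIXED_INPUT = 1.0   # assumption: value of the constant second input


def stretch(p):
    p = min(max(p, 1e-6), 1.0 - 1e-6)   # keep the log finite
    return math.log(p / (1.0 - p))


def squash(x):
    x = min(max(x, -30.0), 30.0)        # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class MixerAPM:
    """Refines a probability given a small context, using 2 weights per context."""

    def __init__(self, n_contexts, lr=0.02):
        # Start at (1, 0) so the output equals the input until training moves it.
        self.w = [[1.0, 0.0] for _ in range(n_contexts)]
        self.lr = lr
        self.ctx = 0
        self.x = (0.0, FIXED_INPUT)
        self.p = 0.5

    def refine(self, p, ctx):
        self.ctx = ctx
        self.x = (stretch(p), FIXED_INPUT)
        w = self.w[ctx]
        self.p = squash(w[0] * self.x[0] + w[1] * self.x[1])
        return self.p

    def update(self, bit):
        err = bit - self.p
        w = self.w[self.ctx]
        w[0] += self.lr * err * self.x[0]
        w[1] += self.lr * err * self.x[1]
```

A table-based APM stores a whole row of interpolation buckets per context, so two weights per context is a much smaller state and there is less to learn before the output becomes useful.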
paq9a -1 (19 MB)
824,845 a10.jpg.paq9a-1
1,255,208 acrord32.exe.paq9a-1
487,957 english.dic.paq9a-1
3,643,525 FlashMX.pdf.paq9a-1
395,669 fp.log.paq9a-1
1,654,770 mso97.dll.paq9a-1
754,253 ohs.doc.paq9a-1
757,053 rafale.bmp.paq9a-1
499,172 vcfiu.hlp.paq9a-1
470,607 world95.txt.paq9a-1
10,743,059 bytes
paq9a -9 (1585 MB)
823,883 a10.jpg.paq9a-9
1,235,497 acrord32.exe.paq9a-9
457,932 english.dic.paq9a-9
3,633,260 FlashMX.pdf.paq9a-9
392,231 fp.log.paq9a-9
1,610,168 mso97.dll.paq9a-9
727,424 ohs.doc.paq9a-9
739,561 rafale.bmp.paq9a-9
493,760 vcfiu.hlp.paq9a-9
431,508 world95.txt.paq9a-9
10,545,224 bytes
19,974,112 enwik8.paq9a-9 539 + 510 sec. (comp + decomp)
165,193,368 enwik9.paq9a-9 4200 sec.
(still testing decompression)

But I don't get it. Will paq and lpaq be the same, with the only differences being some small features and the models for special file types? Compression now seems to be a little better than lpaq8 (on general files). Without being deep into context mixing, my understanding was that lpaq uses far fewer models (not counting the models for special files). That's the reason why it is faster but compresses less. Is the general-purpose compression not finished yet?

I compensated by adding more models, which is why it is slower than lpaq1. paq9a has some sparse models which improve compression of binary files, but text is a bit worse.




