I just found a way to improve CM compression that applies when bits are modeled using a state machine. The idea came to me while thinking about a (faster) alternative to SSE. Some terminology:
* a "historymap" is a table which maps the bitmodel's state to a prediction.
How it works: instead of mapping a bit history directly to a prediction, select a history map by a small context. I found that a context made of the bit position (0...7, 3 bits) and the previous bit's prediction error (not the error of the corresponding model, but of the final, mixed prediction!), scaled down to 3 bits, works very well.
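To make the selection concrete, here is a minimal sketch. All names and sizes (`kNumStates`, the 12-bit prediction range, the 64-table layout) are my own illustrative assumptions, not taken from cmm:

```cpp
#include <cassert>
#include <cstdint>

// Instead of one table state->prediction, keep 64 tables and pick one by a
// 6-bit context: 3 bits of bit position (0..7) and 3 bits of quantized
// error of the previous final (mixed) prediction.
const int kNumStates = 256;            // assumed size of the bit-history state machine
uint16_t historyMap[64][kNumStates];   // e.g. 12-bit predictions (0..4095)

// Scale the previous prediction error (0..4095, i.e. |p - bit*4095|)
// down to 3 bits; a plain linear scale-down, as described above.
int quantError(int err) {              // err in 0..4095
  return err >> 9;                     // 0..7
}

int selectContext(int bitPos, int prevErr) {  // bitPos in 0..7
  return (bitPos << 3) | quantError(prevErr); // 0..63
}

int predict(int state, int bitPos, int prevErr) {
  return historyMap[selectContext(bitPos, prevErr)][state];
}
```

Memory cost is 64x that of a single direct-mapped table, so the state tables should be kept small.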
I could gain up to 7% on very redundant files (like FP.LOG) vs. an average of 3% normally (measured against the same file compressed with a direct mapping as the baseline).
The idea behind it: using the bit position as a context separates less accurate predictions (e.g. the first bit splits 256 possible values into 2x128) from more accurate ones (e.g. the last bit splits just 2 possible values, 0/1). This can be further improved by using the previous prediction's error as a measure of predictivity.
A nonlinear quantization of the bit position (finer at the first bits) works too, but not as well. I tried the following position-to-code mappings: 0, 1, 23, 4567; 0, 1, 2, 34567; 0, 1, 234, 567.
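Such a grouping is just a lookup table over the bit position. The sketch below (names are mine) implements the first grouping listed above, 0 | 1 | 23 | 4567, which shrinks the position part of the context from 3 bits to 2:

```cpp
#include <cassert>

// Nonlinear quantization of the bit position, finer at the first bits:
// this table implements the grouping 0 | 1 | 23 | 4567.
const int posCode[8] = {0, 1, 2, 2, 3, 3, 3, 3};

// Use posCode[bitPos] in place of the raw 3-bit position when forming
// the history-map selection context (now 5 bits instead of 6).
int quantizedContext(int bitPos, int errCode) {  // errCode in 0..7
  return (posCode[bitPos] << 3) | errCode;       // 0..31
}
```

The other two groupings only require changing the table contents.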
I didn't try a similar quantization of the prediction error. This might give another improvement.
I think I'll finish cmm3 within the next few weeks. It is a combination of LZP and CM (using some tricks, like the one mentioned above). As I said before, my LZP layer is very similar to Matt's in paq9a; and I'll make it open source.
Comments are welcome!