Over the last week I updated the large text benchmark with some candidate programs (not yet in the main table): symbra, lpaq8, lpaq8e, lcssr, fpaqa, hook 1.3, lzpm 0.13, cmm1, cmm2, fpaqb, fpaq0m, bit 0.1. The exception is cmm1: it compresses enwik9 better than cmm2 and is over 30 days old, so it is ranked in the main table.
fpaqa and fpaqb are my experimental implementations of Jarek Duda's asymmetric binary coder, which he proposed as an alternative to arithmetic coding. However, my implementation is slower than arithmetic coding; otherwise I would use it in other programs. As an experiment I tried the coder from fpaqb (my newer version) in lpaq1. It worked with only minor changes, but it was 1% slower and the output was 0.01% larger, so I saw no advantage to using it. An asymmetric coder requires dividing the input into blocks for compression, saving the bit predictions for each block, and then coding them in reverse order. The extra output is a 3 byte header for each 64 KB input block. Compression needs 2.5 MB of extra memory to save the predictions and then reverse the compressed block. Decompression doesn't need to reverse anything, so it doesn't use any extra memory.
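The save-and-reverse scheme can be illustrated with a minimal Python sketch. It is not the fpaqa or fpaqb code. As assumptions for illustration only, it uses the simple "range" form of an asymmetric numeral state update, keeps the whole state in one unbounded Python integer (so there is no block header or byte output), and uses a made-up 12 bit adaptive predictor. The names `update`, `compress_block` and `decompress_block` are hypothetical.

```python
PROB_BITS = 12
SCALE = 1 << PROB_BITS  # probabilities are integers in 1..SCALE-1


def update(p, bit):
    """Hypothetical adaptive predictor: move p (probability of a 1) toward bit."""
    if bit:
        return p + ((SCALE - p) >> 5)
    return p - (p >> 5)


def compress_block(bits):
    """Encode a list of 0/1 values into a single integer state."""
    # Pass 1 (forward): the predictor adapts in input order, so every
    # prediction has to be saved before any coding can happen.
    preds = []
    p = SCALE // 2
    for b in bits:
        preds.append(p)
        p = update(p, b)
    # Pass 2 (reverse): the coder works like a stack, so the last bit is
    # pushed first and the decoder pops the first bit first.
    x = 1
    for b, p in zip(reversed(bits), reversed(preds)):
        freq, start = (p, 0) if b else (SCALE - p, p)
        x = (x // freq) * SCALE + (x % freq) + start
    return x


def decompress_block(x, n):
    """Decode n bits from state x, in forward order, with nothing saved."""
    bits = []
    p = SCALE // 2
    for _ in range(n):
        r = x % SCALE
        b = 1 if r < p else 0
        freq, start = (p, 0) if b else (SCALE - p, p)
        x = freq * (x // SCALE) + r - start
        bits.append(b)
        p = update(p, b)  # same predictor, driven by the decoded bits
    return bits
```

Calling `decompress_block(compress_block(bits), len(bits))` should return the original list. Only `compress_block` needs the `preds` buffer, which mirrors why the extra memory is a compression-side cost.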
fpaqa uses lookup tables and no multiplication or division. In theory it would be faster than arithmetic coding on old hardware, but on newer machines multiplication is faster than a table lookup. fpaqb uses only one small table of inverses, which replaces division with multiplication during compression, and no table during decompression. This is why decompression is faster than compression (but still slower than arithmetic coding). If I could figure out how to make it faster, I would use it in all my programs (paq8, bbb, sr2, lpaq, etc). fpaqb produces smaller output, uses less memory, and is simpler and faster than fpaqa, but it is still not fast enough.
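The general idea of trading a division for a multiplication by a tabulated inverse can be sketched as follows. This is only an illustration of the arithmetic, not fpaqb's actual table: the 12 bit divisor range, the 20 bit bound on the dividend, and the names `INVERSE` and `divide` are all assumptions. With these sizes, `ceil(2**32 / d)` is close enough to the true reciprocal that the shifted product equals the exact quotient.

```python
DIVISOR_BITS = 12  # assumed: divisors d are in 1..4095
STATE_BITS = 20    # assumed: dividends x are in 0..2**20 - 1
SHIFT = DIVISOR_BITS + STATE_BITS

# INVERSE[d] = ceil(2**SHIFT / d); entry 0 is a placeholder and never used.
INVERSE = [0] + [((1 << SHIFT) + d - 1) // d for d in range(1, 1 << DIVISOR_BITS)]


def divide(x, d):
    """Return x // d using one table lookup, one multiply and one shift."""
    return (x * INVERSE[d]) >> SHIFT
```

The rounding error of each table entry contributes less than 2**-12 to the product for any dividend below 2**20, which is smaller than 1/d for every divisor in range, so the result never crosses into the next integer. In C or C++ the product would need a 64 bit intermediate.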