Since some posts here disappeared, I am going to write a brief summary of what I have found out about byte-based compression since I started this thread.
1. By far the best discussions of the issues and relevant ideas for byte-aligned compressors can be found in blog posts by Charles Bloom, esp. his series of posts: https://cbloomrants.blogspot.com/201...nclusions.html
2. I do not have access to Charles Bloom's Oodle compression suite. It is very likely, from the available compression test results, that Oodle's faster packers (Selkie and, possibly, Mermaid) are byte-aligned. Their modern design means that they are very likely to be the best currently available compressors for compression ratio vs decompression speed. They seem to tick all the boxes in the requirements listed in my original post. Nevertheless, since none of my programming projects pay any money, and Oodle is neither publicly available nor cheap, I cannot really discuss it.
3. These are some selected results from my recent tests:
Compressors above the first horizontal line all use mixed bit/byte streams. This flexibility gives them a higher compression ratio. Exomizer and aPLib have Z80 decompressors and both are fairly popular in appropriate applications. All compressors below the first horizontal line use byte-aligned compressed data formats. Crush, funnily enough, has a compression ratio that sits approximately at the borderline of what seems possible with publicly available byte-aligned compressors. Otherwise, Crush trails quite far behind most compressors using mixed compressed streams.
Compressor (version, options)           Compressed size
lzoma 0.2 c,7,100000                    138732
Exomizer 2.11                           139579
Appack (ver. by r57shell, 17/10/2017)   140724
bCrush 0.1.0 --optimal                  154826
Crush 1.1 -9                            155697
Crush 1.00 cx                           157332
lzop 1.03 -9                            158095
LZ5 1.4.1 -15                           158632
lzop 1.03 -7                            158654
LZ5 1.5.0 -15                           163431
LZF 1.03 cx                             167634
Lizard -29                              170601
smallz4 1.2 -9                          173235
4. LZ4 decompression on Z80 is very fast: it takes only about 3x as long as the fastest possible copy, or about 1.5x as long as the "default" copying command LDIR. Thus, I am not really interested in compressors with a compression ratio below that of LZ4. Instead, I am more fascinated by the top end of byte-aligned compression, mainly because this is where, in my opinion, new Pareto-optimal compressors can be made. In this category there are basically three compressors:
- LZOP contains an implementation of LZO1X, the best byte-based compression algorithm in this test. It is interesting; I have partly read it, am still studying it, and would love to discuss it at some future point in time. However, the algorithm is somewhat complex, the source code is seriously unorthodox and the documentation is next to non-existent, so I am not overly keen.
- LZ5 v.1.4.1 is almost as good, but comes with much clearer documentation and a clearly defined format, and is a pleasure to work with. The key to its good compression ratio seems to be its use of a "repeated offset" command. I implemented a simple WIP decompressor for Z80 and its speed is amazing given its compression ratio. This is especially relevant because the part of the compressed data format responsible for long (3-byte offset) matches cannot be used for small Z80 files. Thus, there is clear scope for modifying LZ5 in ways that would benefit a Z80 implementation very significantly. At the same time, other versions of LZ5/Lizard are really unpredictable in terms of compression ratio. The fact that LZ5 1.5 compresses much worse than LZ5 1.4.1 is weird. It seems that the author is mostly concerned with compression/decompression speed, so the compression ratio gets overlooked.
- Doboz also looks neat, but it cannot really compete with LZ5 v.1.4.1.
- LZF offers a fair improvement in compression ratio compared to LZ4 but cannot compete as is with either LZO or LZ5. However, given the difference in compression ratios between LZ4 and LZ5, I am pretty confident that adding repeated offsets to LZF would produce a byte-aligned compressor with a ratio able to compete with Crush.
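
To illustrate what a "repeated offset" command buys in a byte-aligned format, here is a minimal Python sketch of a decoder for a made-up toy format. It is not the actual LZ5, LZF or LZO format: the token layout, the `MIN_MATCH` value, the rule that every stream ends with a literal-only token, and the name `decompress` are all my assumptions for the sake of the example. Each token is one byte: the high nibble is the literal count, bit 3 says "reuse the previous offset" and the low 3 bits hold the match length minus `MIN_MATCH`. When bit 3 is set, the 2 offset bytes are simply not stored.

```python
MIN_MATCH = 3  # assumed minimum match length for this toy format


def decompress(data: bytes) -> bytes:
    """Decode the toy byte-aligned LZ stream described above (hypothetical format)."""
    out = bytearray()
    last_offset = 0  # offset of the previous match; 0 means "none yet"
    i = 0
    while i < len(data):
        token = data[i]
        i += 1

        # High nibble: number of literal bytes copied verbatim.
        lit_len = token >> 4
        if i + lit_len > len(data):
            raise ValueError("truncated literal run")
        out += data[i:i + lit_len]
        i += lit_len

        # Assumption: the stream always ends with a literal-only token,
        # so running out of input here means we are done.
        if i >= len(data):
            break

        if token & 0x08:
            # Repeated offset: no offset bytes are stored at all.
            offset = last_offset
        else:
            # Explicit 2-byte little-endian offset.
            if i + 2 > len(data):
                raise ValueError("truncated offset")
            offset = data[i] | (data[i + 1] << 8)
            i += 2
            last_offset = offset

        if offset == 0 or offset > len(out):
            raise ValueError("invalid offset")

        # Byte-by-byte copy so that overlapping matches behave like LDIR.
        match_len = (token & 0x07) + MIN_MATCH
        for _ in range(match_len):
            out.append(out[-offset])
    return bytes(out)


# "abcd1" as literals, a 4-byte match at offset 5, then the literal "2"
# followed by another 4-byte match that reuses offset 5 for free.
packed = bytes([0x51]) + b"abcd1" + bytes([0x05, 0x00, 0x19]) + b"2" + bytes([0x00])
print(decompress(packed))
```

In this hand-made stream the second match costs a single token byte plus its one literal, instead of a token plus two offset bytes. Data with a regular stride (tables, sprites, repeated structures) tends to reuse the same distance a lot, which is presumably where the gain comes from.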