I don't know which program was used to create the original GIF, but here are the results of some tests and conclusions:
- We can see from the PCF sizes and by looking at their content that all 3 GIF files contain the same content, so it seems there is no loss involved.
Code:
1_original.pcf            8.501.861
2_imagemagic.pcf          8.502.353
3_gifsicle.pcf            8.502.771
1_original.pcf.srep       8.501.933
12.pcf.dat               17.004.214  (concatenation of 1_original.pcf and 2_imagemagic.pcf)
12.pcf.dat.srep           8.505.164
1_original.pcf.7z         3.379.092
1_original.pcf.ccm2       2.990.794
1_original.pcf.zpaq_5     2.901.642
1_original.pcf.paq8p_3    2.879.986
1_original.pcf.paq8p_4    2.771.220
- When concatenating the first two PCF files and processing the result with SREP, there is no big difference compared to processing only one of the files with SREP, so most of the content is identical.
- Concatenation+SREP doesn't work for any other combination of two PCF files, but I think that's only because the gifsicle file has the same content with a changed palette order. A small illustration of the idea behind this test is sketched below.
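To make the deduplication argument concrete, here is a minimal sketch using Python's lzma module as a stand-in for SREP (SREP's long-range matching is far more capable, but the effect is the same): compressing the concatenation of two nearly identical files costs barely more than compressing one of them. The stand-in data is made up for illustration.
Code:
import lzma, os

a = os.urandom(1 << 20)                  # incompressible stand-in for 1_original.pcf
b = bytearray(a); b[0] ^= 0xFF           # nearly identical stand-in for 2_imagemagic.pcf

print(len(lzma.compress(a)))             # ~1 MiB: random data doesn't compress
print(len(lzma.compress(a + bytes(b))))  # still ~1 MiB: the second copy is just one long match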
- The compression of the original GIF file is indeed very good - it surprises me that it beats 7-Zip.
I first thought the original GIF would be the result of some usual optimization like palette reordering, using the transparent color, or optimizing the size of the boxes in each frame where the image changed, but this doesn't seem to be the case; it looks like this is just the result of a very good LZW strategy.
Original.gif contains plenty of dithering and is still so compact. Maybe we are finally seeing an image quantized from a compression perspective, something like lossy LZW.
I modified Precomp to not apply the GIF differences it stores; the result is this GIF file, 3,549,090 bytes in size. Outputting the diff codes shows that they are all clear codes, exactly 518 of them, so it seems the strategy was simply to make heavy use of clear codes (the GIF consists of 86 frames, so one was used about 6 times per frame) so that the LZW process doesn't expand the data because of the growing dictionary. A sketch of this strategy follows below.
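For illustration, here is a minimal sketch (Python, not GIF-specific) of an LZW encoder that emits clear codes to reset its dictionary early. The reset threshold is an assumed tuning knob, not a value recovered from the original file; resetting keeps codes short at the cost of relearning strings, which is exactly the trade-off such a strategy tunes.
Code:
def lzw_encode(data: bytes, reset_threshold: int = 1024):
    CLEAR = 256                       # code reserved for "reset the dictionary"

    def fresh_dict():
        return {bytes([i]): i for i in range(256)}

    dictionary = fresh_dict()
    next_code = CLEAR + 1             # new entries start after the clear code
    out, w = [], b""
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc
        else:
            out.append(dictionary[w])
            if next_code >= reset_threshold:
                # Heavy use of clear codes: throw the dictionary away
                # before its codes grow long, as the original GIF does.
                out.append(CLEAR)
                dictionary = fresh_dict()
                next_code = CLEAR + 1
            else:
                dictionary[wc] = next_code
                next_code += 1
            w = bytes([byte])
    if w:
        out.append(dictionary[w])
    return out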
Lossy LZW is possible, too, but not really useful, as it can be noticed very soon. Here are some results from a lossy LZW transform I made when working for Ocarina:
Image URLs (EDIT: Tolerance 50 link fixed):
Code:
tolerance  size
35         2,888,736
40         2,760,815
50         2,540,638
75         2,143,709
As you can see, at least the approach I used there (replacing parts of the image with previously seen parts to increase matches) creates clearly visible artifacts when the tolerance is set too high.
From what I see, these artifacts are basically single-color runs with a maximum length of about 30. It looks like an RLE optimization rather than an LZW one.
Maybe single-color runs can be limited in length, or treated separately.
P.S.: the tolerance 50 link is messed up.
AFAIR, the algorithm replaces pixel blocks with one of these 3 possibilities:
- Previous part of the image
- Single color runs
- Transparent color runs
All of the optimizations are limited by the set tolerance, which was basically an RGB delta.
For this image, the first one doesn't seem to occur very often, or perhaps not at all, so the other two are used. You can see the last one used often in the tolerance 75 image; lots of data from the previous frames "shines through". A sketch of the tolerance test is below.
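For clarity, here is a minimal sketch of the tolerance test described above; the function names and block-based structure are illustrative assumptions, not the actual Ocarina code.
Code:
def within_tolerance(p1, p2, tolerance):
    # Two pixels "match" if every RGB channel differs by at most the tolerance.
    return all(abs(a - b) <= tolerance for a, b in zip(p1, p2))

def can_replace_with_run(pixels, color, tolerance, max_len=None):
    # Check whether a block of pixels can be replaced by a single-color run
    # (one of the three substitution types listed above). Capping max_len
    # would implement the run-length limit suggested earlier.
    if max_len is not None and len(pixels) > max_len:
        return False
    return all(within_tolerance(p, color, tolerance) for p in pixels)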
When developing a lossy GIF encoder, I noticed that regular encoders are unable to efficiently re-encode a lossy LZW file.
I suspect this file has been encoded using a slightly lossy LZW compressor - the dithering is noisier than that of a typical ordered or Floyd-Steinberg diffusion, so it could have been done by taking advantage of the lossy encoder's distortion for dithering.
Hm, it is corrupt indeed -- ImageMagick agrees. I guess that's what you get from an experimental image.