Results 1 to 8 of 8

Thread: Research question: Difference between "implode" vs. "deflate"?

  1. #1
    Member
    Join Date
    May 2008
    Location
    USA
    Posts
    42
    Thanks
    0
    Thanked 2 Times in 2 Posts

    Question Research question: Difference between "implode" vs. "deflate"?

    Sorry to post such an irrelevant question for today's compression research, but I'm trying to compare older compression standards and I ran across PKZIP 1.x's "implode" method and am wondering if there is a description of it anywhere. I've RE'd it for close to an hour and, unless I'm mistaken, it's very close to deflate except it uses a 4K dictionary instead of deflate's 32K. Is this correct, or am I missing something? The PKWARE APPNOTE.TXT and the Deflate RFCs don't mention PKZIP 1.x "implode" anywhere

    (In case you're wondering why I'm looking at ancient history, it's because I wrote a decompression routine for 8088/8086 CPUs as part of my hobby and I want to compare it to the pkware DCL's performance. And yes, I believe my code to be the fastest of its kind ever written for the 808x but of course I want to back that up with results before making everything available

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Quote Originally Posted by unzip source
    Explode an imploded compressed stream. Based on the general purpose
    bit flag, decide on coded or uncoded literals, and an 8K or 4K sliding
    window. Construct the literal (if any), length, and distance codes and
    the tables needed to decode them (using huft_build() from inflate.c),
    and call the appropriate routine for the type of data in the remainder
    of the stream. The four routines are nearly identical, differing only
    in whether the literal is decoded or simply read in, and in how many
    bits are read in, uncoded, for the low distance bits.
    see explode.c in http://switch.dl.sourceforge.net/pro...unzip60.tar.gz
    or ImplodeDecoder.cpp in http://downloads.sourceforge.net/sevenzip/7z920.tar.bz2

  3. #3
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    Quoted from http://www.tylogix.com/Articles/PKZI...Techniques.htm :

    PKZIP's usage of Probabilistic, Lempel-Ziv-Welch, Shannon-Fano and Huffman Coding Methods:

    The following excerpt comes from Technical Support Magazine. It shows what methods are used behind the scenes when PKZIP yields messages such as "Reducing", "Imploding" and "Deflating" as it compresses a file. It is apparent that compounding different compression methods produce better results.

    The Reducing Algorithm:

    The Reducing algorithm is actually a combination of two distinct algorithms. The first algorithm compresses repeated byte sequences. The second algorithm takes the compressed stream from the first and applies a probabilistic compression method.

    The Imploding algorithm:

    The Imploding algorithm is also a combination of two distinct algorithms. The first algorithm compresses repeated byte sequences using a sliding dictionary (as explained in the LZW algorithm earlier). The second algorithm is used to compress the encoding of the sliding dictionary output, using multiple Shannon-Fano trees.

    The Deflating Algorithm:

    The Deflate algorithm is similar to the Implode algorithm using a sliding dictionary of up to 32 KB with a secondary compression from Huffman/Shannon-Fano codes. The compressed data is stored in blocks with a header describing the block and the Huffman codes used in the data block.

    Quoted from http://www.binaryessence.com/dct/en000220.htm :

    Shrunk
    Modified LZW (Lempel-Ziv-Welch) compression method with partial clearing and dynamical addressing of the table entries.

    Reduced (1 to 4)
    A dictionary based algorithm used for addressing byte sequences from former contents. Finally a variable length coding is applied according to the probability distribution of the codes. This method is scalable in four steps.

    Imploded
    A dictionary based algorithm accessing a 4 or 8 KByte large dictionary. Sequence lengths, addresses and uncompressed symbols (optional) use Shannon Fano codes.

    Tokenizing (reserviert)
    Tokenizing is only part of the parameter list. The implementation is not specified and it is not used by PKWARE.

    Deflate
    A dictionary based algorithm accessing an up to 32 KByte large dictionary. Additionally a Huffman coding will be used for redundancy reduction.
    Last edited by roytam1; 5th January 2013 at 09:15.

  4. #4
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Trixter, afair, Implode is fully described in pkzip's appnote.txt, and then Deflate is introduced via differences to Implode. Main differences are 4/8kb->32kb dictionary and multiple Huffamn tables (Implode compressed entire file with one Huffman table computed on first few kilobytes of the file). They may have other, smaller differences such as representation of Huffman tables and minimum match length (does Implode had minlen=2/3 for dict=4/8kb respectively? i don't remember)

    if you need good old appnote, get http://freearc.org/download/research/lzh.7z

  5. #5
    Member
    Join Date
    May 2008
    Location
    USA
    Posts
    42
    Thanks
    0
    Thanked 2 Times in 2 Posts
    All of these responses have been extremely helpful, thanks everyone!! (It also confirms for me that, for my tests, the implode DCL code I have from pkware is ok for me to use as a comparison to my own because most of the relevant samples I'm giving it wouldn't exceed a 4K dictionary.)

    Once I've got my research in a digestible format, I'll post a link here. Thanks again.

  6. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    i'm not sure but pkware DCL may have nothing common to their implode/deflate format

  7. #7
    Member
    Join Date
    May 2008
    Location
    USA
    Posts
    42
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Once I've got my research in a digestible format, I'll post a link here.
    Apologies; I completely forgot to do that. Here's the link with some informal test results and source: http://www.oldskool.org/pc/lz4_8088

    The goal of the project was to write a "best in class" decompressor for 808x systems, which I think I succeeded at. I haven't compared to LZO or other 808x-compatible libraries (Rob Northen's RDC was pointed out to me), but looking at the complexity of their decompression assembler code, as well as their compression efficiency, I would be shocked if they were substantially faster than the LZ4 decomp code I wrote.

  8. #8
    Member
    Join Date
    May 2015
    Location
    Italy
    Posts
    46
    Thanks
    0
    Thanked 11 Times in 8 Posts
    On my site
    http://digilander.libero.it/borsa77
    you can find at the link
    LZ-Welch
    a Pascal 16-bit version of the PKWare Shrink and an 8086-asm of the LZW method
    by Storer's book "Data Compression", originally written in Pascal. Both do not include
    any entropy stage and that could give a better case for comparisons.

Similar Threads

  1. "Extreme" compression of DNS domains
    By nickety in forum Data Compression
    Replies: 20
    Last Post: 22nd October 2011, 01:20
  2. File "Type" identification tool
    By soor in forum Data Compression
    Replies: 4
    Last Post: 6th June 2011, 04:04
  3. PAQ8 C++ precedence bug (or "-Wparentheses is annoying")
    By Rugxulo in forum Data Compression
    Replies: 13
    Last Post: 21st August 2009, 20:36
  4. LZ77 speed optimization, 2 mem accesses per "round"
    By Lasse Reinhold in forum Forum Archive
    Replies: 4
    Last Post: 11th June 2007, 21:53
  5. Freeware "Send To" interface for CCM and QUAD
    By LovePimple in forum Forum Archive
    Replies: 2
    Last Post: 20th March 2007, 17:22

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •