Results 1 to 6 of 6

Thread: understanding of deflate

  1. #1
    Member
    Join Date
    Jul 2017
    Location
    Moscow
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    understanding of deflate

    Good day! I try to understand some principles of Compression with dynamic Huffman Codes (Deflate). I made a ZIP with txt file inside and started to analyse compressed data. Here it is.
    ed c2 01 0d 00 00 0c 02 a0 ac bf fd 3b 98 c3 0d c6 7d 54 55 55 55 55 55 55 75 6e 01 (in hex format)

    11101101 11000010 0000000 1 00001101 00000000 00000000 00001100 00000010 10100000 10101100 10111111 11111101 00111011 10011000 11000011 00001101 11000110 01111101 01010100 01010101 01010101 01010101 01010101 01010101 01010101 01110101 01101110 00000001 (in binary)

    here are my reasoning. (i use RFC1951)

    First 3 bits - 101: 1 - last block; 10 - Compression with dynamic Huffman codes.
    So 11101 = HLIT - 257; HLIT = 286; 00010 = HDIST - 1; HDIST = 3; 1110 = HCLEN - 4; HCLEN = 18.
    After that i need to read HCLEN*3 bits. In my example i need to read 18*3 bits and build Huffman tree. So lets go.

    000 – 0
    000 – 0
    010 – 2
    011 – 3
    000 – 0
    001 - 1
    000 – 0
    000 – 0
    000 – 0
    000 – 0
    000 – 0
    110 – 6
    000 – 0
    100 – 4
    000 – 0
    000 – 0
    000 – 0
    100 - 4

    and finally
    0 - 110
    1 - 1110
    2 - 0
    3 - 1111
    4 - 1000000 (i cant get length equal 6 )
    5 - 0
    6 - 0
    7 - 0
    8 - 0
    9 - 0
    10 - 0
    11 - 0
    12 - 0
    13 - 0
    14 - 0
    15 - 0
    16 - 0
    17 - 0
    18 - 10

    Now i have two main questions:
    1) Is everything above correct?
    2) What should i do next? I cant understand reading only RFC1951 without examples.

    Please, help me to understand this algorithm.

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > I try to understand some principles of Compression with dynamic Huffman Codes (Deflate).

    Note that deflate's huffman is actually static, ie not adaptive.
    Deflate's "dynamic huffman" just means that length table is stored in the block's header (RLE + huffman-coded itself).

    > Please, help me to understand this algorithm.

    I'd suggest looking at sources in http://nishi.dreamhosters.com/u/defl_matches_v0.rar (raw2dec.cpp mainly)

    What do you want to do with it, anyway?

  3. #3
    Member
    Join Date
    Jul 2017
    Location
    Moscow
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for your answer!
    Well i just need to decompress not all compressed data, but only a few first bytes of it and so i need to write my own function, which will be able to do it. I cant find in the net suitable code, unfortunately.

  4. #4
    Member
    Join Date
    Jul 2017
    Location
    Moscow
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    After wasting some time on the reading raw2dec.cpp i still don`t understand how to decode just a few bytes from compressed data. Please help me to find out how to do it.

  5. #5
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    1. You can extract a .raw file (deflate stream without headers) using rawdet utility

    2. Bits are processed in bit0..bit7 order, in other words, first bit read from a byte 0x80 would be 0. (See bits() function).

    3. 3 header bits are read from stream (last and type fields), then dynamic() is called to process type=2 header.

    4.
      nlen  = bits(5) + 257;
    ndist = bits(5) + 1;
    ncode = bits(4) + 4;

    // read secondary huffman code-length table, which is used to decompress primary huffman table
    const word order[19] = {16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15};
    for( index=0; index<ncode; index++ ) lengths[order[index]] = bits(3);
    // construct huffman code from length table
    err = lencode.construct( lengths, 19 );


    5. This is followed by decoding of primary length table (which is compressed by RLE+huffman).
    Really, there's too much to describe in english, so you have to learn to read C somehow.

    If you don't like my coding style, you can look at Adler's sources:
    http://nishi.dreamhosters.com/u/puff1a_v0.rar
    http://nishi.dreamhosters.com/u/puff1a_trace.txt

    In type=2 mode you can't "just decode a few bytes" - you'd have to include half of raw2dec source,
    because to decode these bytes, you need to determine which huffman codes are assigned to them.
    And for that, you need to decode the primary length table and build the huffman code from it.

  6. #6
    Member
    Join Date
    Jul 2017
    Location
    Moscow
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thank you for answer. I will read the code more attentively.

Similar Threads

  1. Optimal Deflate
    By FunkyBob in forum Data Compression
    Replies: 14
    Last Post: 15th June 2017, 15:46
  2. Help understanding Recompression
    By rexferal0009 in forum Data Compression
    Replies: 11
    Last Post: 28th November 2016, 10:16
  3. Deflate stream question
    By Razor12911 in forum Data Compression
    Replies: 5
    Last Post: 20th September 2016, 00:08
  4. I'm looking for the best free implementation of deflate
    By caveman in forum Data Compression
    Replies: 2
    Last Post: 22nd November 2010, 09:27
  5. deflate model for paq8?
    By kaitz in forum Data Compression
    Replies: 2
    Last Post: 6th February 2009, 21:48

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •