Results 1 to 15 of 15

Thread: Optimal Deflate

  1. #1
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts

    Optimal Deflate

    Please go easy one me, this is my first posting, though i've been casually studying compression since I was a teen.

    In my daytime life I spend a lot of time working on web sites, and find the need for two kinds of deflate tools:

    1. Fast, for on-the-fly compression of dynamic content
    2. Tight, for compress-once / serve-many content

    For case 2, it seems the current best at hand is AdvanceCOMP, which I believe uses the 7-zip and zopfli engines.
    In order to eek out the last few bytes, it can be set to an iterative mode, which I assume experiments with different parsings to see how it effects the huffman stage.

    For case 1, currently zlib at "default" level with max window / memory seems common practice.

    Now, having seen the performance of tools like zstd, I wonder how much of that expertise could be put to use on building "the best" deflate for the two common uses.

    I realise a lot of work is going into trying to supplant deflate in the general world, but look how long it took for brotli to get in "all" the browsers...

    I guess my questions are:
    - What deflate libs are there that can compress (as fast but tighter) or (as tight but faster) than zlib?
    - Is there an "optimal" deflate lib for offline compression that isn't just an iterative guess?

    --
    Curtis

    P.S. I have looked around at various "optimised" zlibs and alternatives... none of them seem to have retained their competitive edge, though I'm quite happy to be shown to be wrong

  2. #2
    Member jibz's Avatar
    Join Date
    Jan 2015
    Location
    Denmark
    Posts
    114
    Thanks
    91
    Thanked 69 Times in 49 Posts
    If your use case allows having the entire input and output in memory at once, then libdeflate might be worth a look.

  3. #3
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    For comparison you might have a look at the web compression benchmark or
    make your own benchmark with the TurboBench Compression Benchmark

    - Fast, for on-the-fly compression of dynamic content
    igzip, brotli 4
    - Tight, for compress-once / serve-many content
    brotli 11, zopfli
    Last edited by dnd; 12th June 2017 at 13:53.

  4. The Following User Says Thank You to dnd For This Useful Post:

    Jyrki Alakuijala (12th June 2017)

  5. #4
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    I have tried with it before, and it didn't seem any better.

    So I tried again just now:

    At default levels, it compresses my sample 190k JSON file...
    gzip -9k : 38839 bytes
    libdeflate/gzip -9k : 38873 bytes

    Time (using time repeatedly) :
    gzip : ~8ms
    libdeflate/gzip: ~5ms

    So, in this case it's ~37% faster, which I guess is nothing to sneeze at.

    Would it be even faster with a pre-loaded memory buffer?

    At level 9 compression:
    gzip : 12ms / 38522 bytes
    libdeflate: 23ms / 37472 bytes

    And level 12:
    libdeflate : ~45ms / 37042 bytes

    advzef -z4: Additional .82 seconds / 36691 bytes

    So, yeah, libdeflate looks like a potential contender.... any others?

    --
    C

  6. #5
    Member
    Join Date
    Aug 2015
    Location
    Urbana, IL
    Posts
    144
    Thanks
    10
    Thanked 151 Times in 80 Posts
    ECT (of which I am author) offers more compression than zopfli when using high levels, and is much faster than zopfli at the same compression ratio. It is based on zopfli and also iterative.
    For case 1, libdeflate will be the best solution.

  7. The Following User Says Thank You to fhanau For This Useful Post:

    Jyrki Alakuijala (12th June 2017)

  8. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    try gzip/libdefalte in -1 mode if you really look for speed

  9. #7
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    Quote Originally Posted by fhanau View Post
    ECT (of which I am author) offers more compression than zopfli when using high levels, and is much faster than zopfli at the same compression ratio. It is based on zopfli and also iterative.
    For case 1, libdeflate will be the best solution.
    Indeed, ECT did slightly better than AdvanceCOMP for maximum compression, achieving 36657 compared to 36691 on my JSON test sample.

  10. The Following User Says Thank You to FunkyBob For This Useful Post:

    encode (12th June 2017)

  11. #8
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    For reference, here's the same file at -1 to -9 using stock gzip:

    38522 index.9.gz
    38654 index.8.gz
    38717 index.7.gz
    38839 index.6.gz
    39434 index.5.gz
    40898 index.4.gz
    43492 index.3.gz
    44951 index.2.gz
    46964 index.1.gz

    190217 index.html



    I'm not looking for pure speed, per se... of course it's the ol' "best compression for the time".

    Certainly, I need to solidify my boundaries to decide how much time is worth how much compression.

    But after reading Yan's work on huff0 / FSE ... and, I believe, cbloom also did work on making the fastest huffman he could.

    Has any of this work found its way into a deflate, where possible?

  12. #9
    Member
    Join Date
    Nov 2014
    Location
    Earth
    Posts
    38
    Thanks
    0
    Thanked 77 Times in 19 Posts
    For compression, libdeflate currently has different advantages over zlib at different levels:

    Level 1: libdeflate is 1.5-2 times as fast and compresses significantly better.
    Level 6: libdeflate is 2-3 times as fast and compresses about the same or very slightly better.
    Level 9: libdeflate is about the same speed but compresses significantly better.
    Level 12: libdeflate is slower than zlib level 9, but compresses significantly better.

    There is no compressor, even Zopfli or ECT, that can perform optimal DEFLATE compression for arbitrary inputs. But it doesn't really matter because you can get very close with heuristics.

    As far as fast Huffman coding optimizations, that's primarily a concern for decompression, not compression. While both sides have been optimized in libdeflate, and libdeflate's decompressor is significantly faster than zlib's, it is fundamentally limited by the DEFLATE format. There are changes that could be made to the format to make it faster to decompress; the Zstandard format, for example, is somewhat better designed in this regard.

  13. #10
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    For comparison, here are the sizes for libdeflate/gzip :

    37042 index.html.12.gz
    37064 index.html.11.gz
    37186 index.html.10.gz
    37472 index.html.9.gz
    38053 index.html.8.gz
    38661 index.html.7.gz
    38873 index.html.6.gz
    39178 index.html.5.gz
    40239 index.html.4.gz
    40821 index.html.3.gz
    41563 index.html.2.gz
    43322 index.html.1.gz

  14. #11
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    First of all you must clarify how and where you want use these libraries.


    1. Fast, for on-the-fly compression of dynamic content
    A module (nginx,apache,...) for this library must be installed on your web server.


    2.- Tight, for compress-once / serve-many content
    In this case you can only recompress your files with zopfli,AdvanceComp,...
    You can also use brotli when supported at the server side.
    Only gzip,deflate, brotli (also bzip2 and xz) compression is supported by the major browsers.
    No chance to use other compression format like zstd.


    For http-transfer the decompression speed is not so important as supposed to be.
    It is also implemented on the browser (user side).

  15. #12
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    i wonder why there are only two deflate libraries mentioned here?

    intel 300 MB/s deflate compressor: https://software.intel.com/en-us/art...ion-algorithms

    afair, intel has one more defalte compressor. there is also compressor by cloudflare. so on, so on

    look at https://www.snellman.net/blog/archiv...b-patches.html

  16. #13
    Member SolidComp's Avatar
    Join Date
    Jun 2015
    Location
    USA
    Posts
    222
    Thanks
    89
    Thanked 46 Times in 30 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    i wonder why there are only two deflate libraries mentioned here?

    intel 300 MB/s deflate compressor: https://software.intel.com/en-us/art...ion-algorithms

    afair, intel has one more defalte compressor. there is also compressor by cloudflare. so on, so on

    look at https://www.snellman.net/blog/archiv...b-patches.html
    And SLZ: http://www.libslz.org/

  17. #14
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    Quote Originally Posted by dnd View Post
    First of all you must clarify how and where you want use these libraries.
    I plan to use this in a web server... or on content to be served.


    Quote Originally Posted by dnd View Post
    1. Fast, for on-the-fly compression of dynamic content
    A module (nginx,apache,...) for this library must be installed on your web server.
    This is there already, however I'm looking to see what can be done about reducing the overhead.

    My next planned step is to add some infrastructure so I can see just how much that overhead is.

    Quote Originally Posted by dnd View Post
    2.- Tight, for compress-once / serve-many content
    In this case you can only recompress your files with zopfli,AdvanceComp,...
    You can also use brotli when supported at the server side.
    Only gzip,deflate, brotli (also bzip2 and xz) compression is supported by the major browsers.
    No chance to use other compression format like zstd.

    For http-transfer the decompression speed is not so important as supposed to be.
    It is also implemented on the browser (user side).
    Yes, I currently use AdvanceComp, however I was curious if there were any better tools available.
    I've also written a patch to support offline compressed brotli assets.

    I'm curious which browser(s) support bzip2 and/or xz ?

    --
    Curtis

  18. #15
    Member FunkyBob's Avatar
    Join Date
    Jun 2017
    Location
    Australia
    Posts
    24
    Thanks
    1
    Thanked 4 Times in 3 Posts
    After looking more at the various options, and at the deflate.c from zlib, ISTM I can improve performance considerably by removing options.

    If the code is retuned for assuming, for instance, -6 all the time, fixed memory size, fixed strategy, etc etc... I wonder what difference it would make.

    So, I'll toy with that on my own time...

    As for replacing zlib with libdeflate, the program I'm trying to do this on [uwsgi] currently looks to rely on the streaming interface, which cuts down my options somewhat.

    --
    Curtis

Similar Threads

  1. Optimal Preprocessing/Parsing for LZ?
    By comp1 in forum Data Compression
    Replies: 68
    Last Post: 11th February 2015, 20:27
  2. Replies: 5
    Last Post: 15th July 2014, 20:24
  3. [LZ] Optimal parsing
    By Bulat Ziganshin in forum Data Compression
    Replies: 14
    Last Post: 19th March 2014, 21:56
  4. Optimal ordered bitcodes (or smth like this)
    By Piotr Tarsa in forum Data Compression
    Replies: 29
    Last Post: 26th July 2011, 13:18
  5. optimal parsing
    By flounder in forum Data Compression
    Replies: 10
    Last Post: 3rd August 2008, 13:07

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •