
Thread: Intel's fast OSS deflate implementation

  1. #1
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts

    Intel's fast OSS deflate implementation

    https://software.intel.com/en-us/art...r-genomic-data

    https://software.intel.com/sites/def.../igzip_042.zip

    http://www.intel.com/content/dam/www...sion-paper.pdf

    these are just libraries providing the API described in ReleaseNotes.txt; they lack main() code. for benchmarking purposes i've attached demo executables from IPP itself. it may be a good alternative to lz4/zstd


    ADD: two more optimized zlib implementations are mentioned and benchmarked at http://www.htslib.org/benchmarks/zlib.html (note that intel zlib != igzip)

    Attached Files
    Last edited by Bulat Ziganshin; 27th May 2015 at 12:59.

  2. The Following 5 Users Say Thank You to Bulat Ziganshin For This Useful Post:

    Cyan (25th May 2015),encode (1st March 2016),lorents17 (6th August 2015),nemequ (25th May 2015),Paul W. (27th May 2015)

  3. #2
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts
    Is anyone able to assemble/compile the source code for Windows 32 bit?

    It's a shame that the compression ratio is much lower than the output of zlib
    Last edited by just a worm; 26th May 2015 at 10:39.

  4. #3
    Programmer Bulat Ziganshin's Avatar
    it's 3-4x faster, so why would you expect the same compression ratio?

  5. #4
    Programmer Bulat Ziganshin's Avatar
    Quote Originally Posted by just a worm View Post
    Is anyone able to assemble/compile the source code for Windows 32 bit?
    these are just libraries providing the API described in ReleaseNotes.txt. it may be a good alternative to lz4/zstd

    i've attached demo executables from IPP itself to the first post
    Last edited by Bulat Ziganshin; 26th May 2015 at 12:22.

  6. The Following User Says Thank You to Bulat Ziganshin For This Useful Post:

    just a worm (27th May 2015)

  7. #5
    Member JamesB
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    437
    Thanks
    137
    Thanked 152 Times in 100 Posts
    I've looked into these before. While the BAM specific one is good, it is perhaps not so helpful long term. It gets the speed by having precomputed Huffman tables and simply working on the (reasonable) assumption that what is good for one BAM file is good for another. The compression ratio suffers a little bit, but not as much as you'd think, while the speed gain is highly significant.
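
To get a feel for the trade-off described above (skipping per-file table construction in exchange for a slightly worse ratio), here is a minimal Python sketch. It is only an analogy: it uses stock zlib's `Z_FIXED` strategy, which forces the fixed code tables from the deflate spec, not igzip's own precomputed BAM tables. The sample data and the helper name `raw_deflate` are made up for illustration.

```python
import zlib

def raw_deflate(data: bytes, strategy: int) -> bytes:
    """Compress `data` to a raw deflate stream at level 6 with the given strategy."""
    # wbits=-15 selects a bare deflate stream (no zlib or gzip wrapper).
    comp = zlib.compressobj(6, zlib.DEFLATED, -15, 8, strategy)
    return comp.compress(data) + comp.flush()

# Hypothetical stand-in input; a real comparison would use actual BAM blocks.
sample = b"ACGTTGCAAGCTTAGGCTA" * 2000 + bytes(range(256)) * 20

dynamic_stream = raw_deflate(sample, zlib.Z_DEFAULT_STRATEGY)  # tables chosen per block
fixed_stream = raw_deflate(sample, zlib.Z_FIXED)               # tables known in advance

print(f"per-block tables: {len(dynamic_stream)} bytes")
print(f"fixed tables:     {len(fixed_stream)} bytes")

# Either stream decodes with an ordinary inflater.
assert zlib.decompress(fixed_stream, -15) == sample
```

How much the fixed-table output grows depends entirely on the input, so treat the printed sizes as an illustration of the idea rather than a prediction for BAM data.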

    However you can get better speed and compression out of using things like zstd. That's no good for actual "external" BAM files, but for temporary intermediate files (such as generated during a merge sort) we don't need to stick to deflate at all. Indeed some tools are using Snappy or LZ4 already for such internal temporary files.

    Intel also have a more general purpose zlib speed up, which isn't BAM specific. I did some benchmarks, related to BAM, at http://www.htslib.org/benchmarks/zlib.html to compare vanilla zlib, intel zlib and also the optimised CloudFlare one. As expected Intel wins out at their level 1 compression as they chose speed over ratio (a useful tradeoff to have), but otherwise CloudFlare look to be a solid contender too.

  8. The Following 3 Users Say Thank You to JamesB For This Useful Post:

    Bulat Ziganshin (26th May 2015),Cyan (26th May 2015),Paul W. (27th May 2015)

  9. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    thank you, but it's a different library. just compare code at https://github.com/jtkukunas/zlib and https://software.intel.com/sites/def.../igzip_042.zip

  10. #7
    Member JamesB
    I'm aware of that, but it's primarily designed for genomic data compression and it is in that context that I tested it. The other bit was "Intel *also* have...". I thought I'd even commented on this library here at some stage.

    Intel have a pull request to make Samtools (the primary C code for manipulating SAM/BAM) deal with igzip. Unfortunately the complicated build environment has stalled it. Our hope is it can be achieved via a plugin system so that maintaining and building the code is punted back to the authors!

    https://github.com/samtools/htslib/pull/112
    https://github.com/samtools/htslib/pull/115

    There are some benchmarks I did in the discussions there. What I hadn't noticed until looking at it again is that they have a static Huffman table for non-genomic data too. Presumably it's a "best guess" thing which works on some data and just expands other data.

  11. #8
    Programmer Bulat Ziganshin's Avatar
    you are probably still talking about intel zlib from github, which is a separate project

  12. #9
    Member JamesB
    I'm talking about igzip, not zlib. igzip is not zlib compatible and has a different API, although it matches the deflate RFC.
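
The distinction between "same stream format" and "same API" can be sketched with Python's stock `zlib` module (used here only as a stand-in; nothing below calls igzip). The same deflate body can travel bare (RFC 1951), inside a zlib wrapper (RFC 1950) or inside a gzip wrapper (RFC 1952), and any conforming inflater can read it back regardless of which library wrote it. The helper name `compress_with` and the payload are made up for illustration.

```python
import gzip
import zlib

payload = b"Deflate streams are portable even when the producing APIs differ. " * 100

def compress_with(wbits: int, data: bytes) -> bytes:
    """Compress at level 1; `wbits` selects the container around the deflate body."""
    comp = zlib.compressobj(1, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()

raw_stream = compress_with(-15, payload)   # bare deflate, no header or checksum
zlib_stream = compress_with(15, payload)   # zlib header + Adler-32 trailer
gzip_stream = compress_with(31, payload)   # gzip header + CRC-32/length trailer

# Each container is read back by a decoder that knows nothing about the writer.
assert zlib.decompress(raw_stream, -15) == payload
assert zlib.decompress(zlib_stream) == payload
assert gzip.decompress(gzip_stream) == payload

print(len(raw_stream), len(zlib_stream), len(gzip_stream))
```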

  13. #10
    Programmer Bulat Ziganshin's Avatar
    oh, sorry, i misparsed your post. but still, you benchmarked only intel zlib, which has much less asm-optimized code. i just hope that people don't take this as a benchmark of igzip. and thank you - i just added a link to your post to the header. once again, we need the Compression Wiki :)

    btw, there's also an interesting download: https://01.org/intel%C2%AE-storage-a...source-version - Intel open-sourced an asm-optimized implementation of erasure codes that can be used to add a recovery record to an archiver. it implements Reed-Solomon codes in GF(2^8), meaning that the number of cooperating original+recovery blocks is limited to 256
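
The block-count limit follows from the size of the field: GF(2^8) has only 256 elements, so a Reed-Solomon code over it runs out of distinct field elements to assign to blocks. A minimal Python sketch of the field arithmetic is below. The reduction polynomial 0x11D and the generator 2 are assumptions picked for illustration (a common choice for GF(2^8)), not something stated in the post.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements: carry-less multiply, reduced modulo `poly`."""
    result = 0
    while b:
        if b & 1:
            result ^= a       # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:         # degree reached 8: subtract (XOR) the polynomial
            a ^= poly
        b >>= 1
    return result

# Walk the successive powers of the generator: 2^0, 2^1, ..., 2^254.
powers = []
x = 1
for _ in range(255):
    powers.append(x)
    x = gf_mul(x, 2)

distinct_nonzero = len(set(powers))
print("distinct non-zero elements:", distinct_nonzero)
print("field size including zero: ", distinct_nonzero + 1)
```

Going past that limit needs a larger field such as GF(2^16), at the cost of more expensive arithmetic.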
    Last edited by Bulat Ziganshin; 27th May 2015 at 13:00.

  14. #11
    Member JamesB
    The pull request I linked to has some benchmarks too, but admittedly rather poorly done (which is entirely my own fault as it was just a noddy quick test and I couldn't be bothered with a full blown analysis). Stupidly I didn't even write down the vanilla zlib timings and sizes, nor which file I used. Duh.

    Basically on that 1GB BAM file:

    Intel zlib -1: 553508373 bytes, 13.1s round trip
    Intel zlib -2: 414335449 bytes, 22.4s
    igzip1c: 430955563 bytes, 16.9s
    igzip0c: 446580855 bytes, 11.3s
    ZSTD: 405947800 bytes, 10.8s

    So igzip0c beats their optimised deflate on both speed and size, but ZSTD beats both. For temporary files, therefore, Deflate is the wrong algorithm to be trying to optimise. They were also trying to get us to just merge in a prebuilt .a file and store that in github. Umm, no thanks! Our (the samtools development team) stance on this is to just facilitate this via a plugin so people can use prebuilt run-time libraries if they wish, while also absolving ourselves from having to maintain an assembly/C++ hybrid in an otherwise pure-C library.
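
The comparison above can be checked mechanically. The sketch below copies the five figures from the post and reports which entries are not beaten on both size and time by some other entry. It assumes every timing is a round trip like the first one, which the post only states explicitly for Intel zlib -1; the helper name `is_dominated` is made up.

```python
# (label, compressed size in bytes, seconds), copied from the figures above.
results = [
    ("Intel zlib -1", 553508373, 13.1),
    ("Intel zlib -2", 414335449, 22.4),
    ("igzip1c", 430955563, 16.9),
    ("igzip0c", 446580855, 11.3),
    ("ZSTD", 405947800, 10.8),
]

def is_dominated(entry, others):
    """True if some other entry is no larger and no slower, and differs in at least one."""
    _, size, secs = entry
    return any(
        o_size <= size and o_secs <= secs and (o_size, o_secs) != (size, secs)
        for _, o_size, o_secs in others
    )

pareto = [entry[0] for entry in results if not is_dominated(entry, results)]

# Express everything relative to the first row (Intel zlib -1).
baseline_size, baseline_secs = results[0][1], results[0][2]
for name, size, secs in results:
    print(f"{name:14s} size x{size / baseline_size:.2f}  time x{secs / baseline_secs:.2f}")
print("not beaten on both axes:", pareto)
```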

    I see there is definite potential here for tools that *must* use deflate as it's clearly the fastest out there and if you can accept lower compression ratios then it rocks.

    Edit: I also notice that the htslib.org benchmarks I originally referred to have an igzip benchmark too. It showed igzip being around 13% larger than the default zlib on the BAM file, while being ~4 times as fast.
    Last edited by JamesB; 27th May 2015 at 19:34.

  15. #12
    Member JamesB
    Ugh: This table was produced with "Go Advanced". Why did it bork its own system? Well the numbers are in there somewhere...

    Enwiki9 timings on a local 64-bit machine.

     
    Prog: igzip0c, igzip1c, lz4, lz4 -9, zstd, FSE, rANS -o0, rANS -o1, Vanilla gzip -1, Intel minigzip -1

    I didn't check decompression time as igzip doesn't do this from what I can tell (you'd need one of the other zlib implementations). I expect lz4 to beat it considerably on decompression speed.

    rANS here is my hack of Ryg's 32-bit one (so not the fastest implementation out there) with an order-1 model added.

    Based on this, igzip0c is pretty solid for text files although igzip1c doesn't add much as it's similar speed to zstd but beaten on ratio. Like I say, I've no idea how decompression speed compares.

    I tried it on a tarball of /usr/bin for this linux system and I get around:

    igzip0c: 163887608 bytes, 1.81s
    igzip1c: 156849910 bytes, 2.69s
    lz4: 176497040 bytes, 1.47s
    zstd: 145205961 bytes, 2.45s

    So in the right ballpark for binary data too. Not too shabby.

  16. #13
    Member Jaff's Avatar
    Join Date
    Oct 2012
    Location
    Dracula's country
    Posts
    100
    Thanks
    112
    Thanked 20 Times in 16 Posts
    The table above... as picture

    [Attached image: Clipboard03.png]

  17. The Following 5 Users Say Thank You to Jaff For This Useful Post:

    Bulat Ziganshin (30th May 2015),Cyan (26th April 2017),JamesB (31st May 2015),just a worm (31st May 2015),lorents17 (6th August 2015)

  18. #14
    Member
    Join Date
    Apr 2011
    Location
    Russia
    Posts
    168
    Thanks
    163
    Thanked 9 Times in 8 Posts
    Please, compile igzip for Windows

  19. #15
    Programmer Bulat Ziganshin's Avatar
    isn't it attached to the first post?

  20. #16
    Member
    Bulat Ziganshin
    When I start ipp_gzip it gives me an error message. Can you tell me, are ipp_gzip and igzip one and the same?
    Attached Thumbnails: Снимок1.PNG, Снимок2.PNG, Снимок3.PNG

  21. #17
    Member
    Join Date
    Apr 2009
    Location
    here
    Posts
    202
    Thanks
    165
    Thanked 109 Times in 65 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    this needs SSE4 it seems
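
If SSE4 support is the suspect, one quick check on a Linux box is to look at the CPU flags the kernel reports. The sketch below parses text in the `/proc/cpuinfo` format and looks for the `sse4_1` and `sse4_2` flags. The helper names and the sample text are made up, and this does not cover Windows, where a CPU identification tool would be needed instead.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the set of CPU flags from /proc/cpuinfo-style text (first CPU only)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()

def has_sse4(cpuinfo_text: str) -> tuple:
    """Return (has SSE4.1, has SSE4.2) for the given cpuinfo text."""
    flags = cpu_flags(cpuinfo_text)
    return ("sse4_1" in flags, "sse4_2" in flags)

# Made-up sample in the same layout the Linux kernel uses.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2\n"
print(has_sse4(sample))

# On a real Linux machine (assumption: /proc is mounted):
# with open("/proc/cpuinfo") as f:
#     print(has_sse4(f.read()))
```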
