
Thread: XPACK - experimental compression format (LZ77+FSE)

  1. #1
    Member
    Join Date
    Nov 2014
    Location
    Earth
    Posts
    38
    Thanks
    0
    Thanked 77 Times in 19 Posts

    XPACK - experimental compression format (LZ77+FSE)

    https://github.com/ebiggers/xpack

    XPACK is an experimental compression format. It is intended to have better performance than DEFLATE as implemented in the zlib library and also produce a notably better compression ratio on most inputs. The format is not yet stable.

    XPACK has been inspired by the DEFLATE, LZX, and Zstandard formats, among others. Originally envisioned as a DEFLATE replacement, it won't necessarily see a lot of additional development since other solutions such as Zstandard seem to have gotten much closer to that goal first (great job to those involved!). But I am releasing the code anyway for anyone who may find it useful.

    Format overview

    Like many other common compression formats, XPACK is based on the LZ77 method (decomposition into literals and length/offset copy commands) with a number of tricks on top. Features include:

    • Increased sliding window, or "dictionary", size (like LZX and Zstd)
    • Entropy encoding with finite state entropy (FSE) codes, also known as table-based asymmetric numeral systems (tANS) (like Zstd)
    • Minimum match length of 2 (like LZX)
    • Lowest three bits of match offsets can be entropy-encoded (like LZX)
    • Aligned and verbatim blocks (like LZX)
    • Recent match offsets queue with three entries (like LZX)
    • Literals packed separately from matches, and with two FSE streams (like older Zstd versions)
    • Literal runs (like Zstd)
    • Concise FSE header (state count list) representation
    • Decoder reads in forwards direction, encoder writes in backwards direction
    • Optional preprocessing step for x86 machine code (like LZX)
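
To make the LZ77 part of the list above concrete, here is a minimal decoding sketch with literal runs, length/offset copies, and a three-entry recent-offsets queue. It is not the XPACK bitstream: the `lz_command` layout, the initial queue contents, and the move-to-front queue update are all assumptions chosen for illustration, and the entropy-coding layer is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already entropy-decoded command (not the XPACK layout). */
struct lz_command {
    size_t   literal_run; /* literal bytes to copy before the match */
    size_t   match_len;   /* 0 means no match follows the literal run */
    uint32_t offset;      /* explicit offset, used when recent_idx < 0 */
    int      recent_idx;  /* 0..2 selects a queue entry, -1 = explicit */
};

/* Decodes the commands into out. Returns the number of bytes written,
 * or (size_t)-1 if the input is invalid or out is too small. */
size_t lz_decode(const struct lz_command *cmds, size_t ncmds,
                 const uint8_t *literals, size_t nliterals,
                 uint8_t *out, size_t out_cap)
{
    uint32_t recent[3] = {1, 1, 1}; /* assumed initial queue contents */
    size_t out_pos = 0, lit_pos = 0;

    assert(cmds != NULL && literals != NULL && out != NULL);

    for (size_t i = 0; i < ncmds; i++) {
        const struct lz_command *c = &cmds[i];
        uint32_t offset;

        /* Literal run: bytes copied verbatim from the literal stream. */
        if (c->literal_run > nliterals - lit_pos ||
            c->literal_run > out_cap - out_pos)
            return (size_t)-1;
        for (size_t j = 0; j < c->literal_run; j++)
            out[out_pos++] = literals[lit_pos++];

        if (c->match_len == 0)
            continue;

        if (c->recent_idx >= 0) {
            /* Reuse a recent offset and move it to the queue front
             * (assumed policy; real formats differ in the details). */
            if (c->recent_idx > 2)
                return (size_t)-1;
            offset = recent[c->recent_idx];
            for (int k = c->recent_idx; k > 0; k--)
                recent[k] = recent[k - 1];
            recent[0] = offset;
        } else {
            /* Explicit offset: push it, dropping the oldest entry. */
            offset = c->offset;
            recent[2] = recent[1];
            recent[1] = recent[0];
            recent[0] = offset;
        }

        if (offset == 0 || offset > out_pos ||
            c->match_len > out_cap - out_pos)
            return (size_t)-1;

        /* Byte-by-byte copy so overlapping matches (offset < length)
         * replicate the data, as LZ77 requires. */
        for (size_t j = 0; j < c->match_len; j++, out_pos++)
            out[out_pos] = out[out_pos - offset];
    }
    return out_pos;
}
```

For example, the literals "abcX" with the commands {3 literals, match of length 6 at explicit offset 3} and {1 literal, match of length 3 at recent offset 0} decode to "abcabcabcXbcX".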


    Implementation overview

    libxpack is a library containing an optimized, portable implementation of an XPACK compressor and decompressor. Features currently include:

    • Whole-buffer compression and decompression only
    • Multiple compression levels
    • Fast hash chains-based matchfinder
    • Greedy and lazy parsers
    • Decompressor automatically uses Intel BMI2 instructions when supported


    In addition, the following command-line programs using libxpack are provided:

    • xpack (or xunpack), a program which behaves like a standard UNIX command-line compressor such as gzip (or gunzip). The command-line interface should be compatible enough that xpack can be used as a drop-in gzip replacement in many cases --- though the on-disk format is incompatible, of course.
    • benchmark, a program for benchmarking in-memory compression and decompression


    Note that currently, all the programs internally use "chunks", as the library does not yet support streaming. This will worsen the compression ratio slightly, compared to what is possible.
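
As a rough illustration of what "chunks" means here, the sketch below splits an input into fixed-size chunks and compresses each one independently. The container layout (two 4-byte little-endian lengths per chunk), the `compress_fn` callback type, and the `store_chunk` stand-in are hypothetical and are not taken from the xpack programs. Because each chunk is compressed on its own, matches cannot reach back into earlier chunks, which is where the small ratio loss comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical whole-buffer compressor callback: returns the compressed
 * size, or 0 if the output does not fit. */
typedef size_t (*compress_fn)(const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_cap);

/* Stand-in "compressor" that only stores the data, so the sketch runs
 * without any real library. */
size_t store_chunk(const uint8_t *in, size_t in_len,
                   uint8_t *out, size_t out_cap)
{
    if (in_len > out_cap)
        return 0;
    memcpy(out, in, in_len);
    return in_len;
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Compresses in[] as independent chunks. Each chunk is written as
 * [uncompressed length][compressed length][payload]. Returns the total
 * number of bytes written, or 0 on failure (or for empty input). */
size_t compress_chunked(compress_fn compress,
                        const uint8_t *in, size_t in_len, size_t chunk_size,
                        uint8_t *out, size_t out_cap)
{
    size_t in_pos = 0, out_pos = 0;

    assert(compress != NULL);
    assert(chunk_size > 0 && chunk_size <= UINT32_MAX);

    while (in_pos < in_len) {
        size_t n = in_len - in_pos;
        size_t avail, csize;

        if (n > chunk_size)
            n = chunk_size;
        if (out_cap - out_pos < 8)
            return 0; /* no room for the chunk header */
        avail = out_cap - out_pos - 8;

        /* The compressor sees only this chunk, never earlier data. */
        csize = compress(in + in_pos, n, out + out_pos + 8, avail);
        if (csize == 0 || csize > avail || csize > UINT32_MAX)
            return 0;

        put_le32(out + out_pos, (uint32_t)n);
        put_le32(out + out_pos + 4, (uint32_t)csize);
        out_pos += 8 + csize;
        in_pos += n;
    }
    return out_pos;
}
```

A real caller would pass a wrapper around the library's whole-buffer compression routine in place of `store_chunk`.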

    All source code is available at https://github.com/ebiggers/xpack.

    Building

    See the README.md file for build instructions. Alternatively, Windows binaries may be downloaded from https://github.com/ebiggers/xpack/releases. As usual, the 64-bit binaries are faster and should be preferred.

  2. The Following 15 Users Say Thank You to Zyzzyva For This Useful Post:

    Bulat Ziganshin (14th May 2016),Christoph Diegelmann (15th May 2016),comp1 (15th May 2016),Cyan (14th May 2016),encode (16th May 2016),inikep (16th May 2016),JamesB (17th May 2016),Jarek (14th May 2016),jibz (14th May 2016),lorents17 (17th May 2016),Matt Mahoney (18th May 2016),Mike (14th May 2016),schnaader (14th May 2016),Stephan Busch (15th May 2016),Turtle (15th May 2016)

  3. #2
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    872
    Thanks
    457
    Thanked 175 Times in 85 Posts

    Question

    First results are pretty good: with the standard chunk size, the 9 test sets of SqueezeChart are compressed in about 500 seconds.

    How can the preprocessor for x86 code be activated?
    Do you also plan to add a delta filter to XPACK (like the one in CSC, https://github.com/fusiyuan2010/CSC)?

  4. #3
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    xpack now included in TurboBench

  5. #4
    Member
    Join Date
    Nov 2014
    Location
    Earth
    Posts
    38
    Thanks
    0
    Thanked 77 Times in 19 Posts
    Quote Originally Posted by Stephan Busch
    How can the preprocessor for x86 code be activated?
    Do you also plan to add a delta filter to XPACK (like the one in CSC, https://github.com/fusiyuan2010/CSC)?
    For now, you have to recompile with make ENABLE_PREPROCESSING=yes to enable the x86 preprocessing. I just borrowed the x86 preprocessing code from my LZX implementation. Preprocessing is one of several areas where more work/research is needed. Ideally, there would be several preprocessing algorithms and a fast heuristic to choose between them automatically.
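
For readers unfamiliar with x86 preprocessing, the general idea can be sketched as follows: the 32-bit relative operand after each 0xE8 (CALL) opcode is rewritten as an absolute value, so that repeated calls to the same target produce identical byte sequences that the matchfinder can pick up. This is a simplified illustration only. It is not the code from libxpack or from any LZX implementation, it omits the range checks that real filters apply, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Shared scan: at each 0xE8 byte, translate the following 4-byte operand
 * and skip over it. Both directions skip the operand bytes, so they visit
 * exactly the same opcode positions and the transform is reversible. */
static void x86_filter(uint8_t *data, size_t size, int encode)
{
    size_t i = 0;

    assert(data != NULL || size == 0);

    while (size >= 5 && i <= size - 5) {
        uint32_t v, pos;

        if (data[i] != 0xE8) {
            i++;
            continue;
        }
        v = get_le32(&data[i + 1]);
        pos = (uint32_t)i; /* wraps modulo 2^32, like the arithmetic below */
        put_le32(&data[i + 1], encode ? v + pos : v - pos);
        i += 5;
    }
}

/* Run before compression: relative operands become absolute. */
void x86_preprocess(uint8_t *data, size_t size)
{
    x86_filter(data, size, 1);
}

/* Run after decompression: absolute operands become relative again. */
void x86_postprocess(uint8_t *data, size_t size)
{
    x86_filter(data, size, 0);
}
```

In this sketch, two calls at positions 1 and 7 with relative operands 0x10 and 0x0A both end up with the operand 0x11 after preprocessing, and postprocessing restores the original bytes.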

  6. #5
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    872
    Thanks
    457
    Thanked 175 Times in 85 Posts
    This is what the CSC source code offers.

  7. #6
    Member
    Join Date
    Nov 2015
    Location
    Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    When compressing the attached file, I get a crash:
    Code:
    ➜  afl-1.94b xpack/xpack-notrap -1 /tmp/4848
    lib/xpack_compress.c:614:35: runtime error: shift exponent 64 is too large for 64-bit type 'machine_word_t' (aka 'unsigned long')
    SUMMARY: AddressSanitizer: undefined-behavior lib/xpack_compress.c:614:35 in
    BTW,
    Code:
    --- a/Makefile
    +++ b/Makefile
    @@ -8,8 +8,8 @@
     # TODO: ENABLE_PREPROCESSING option
     #
     
    -CC := gcc
    -AR := ar
    +CC ?= gcc
    +AR ?= ar
     
     STATIC_LIB_SUFFIX := .a
     SHARED_LIB_SUFFIX := .so
    Attached Files
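
Some background on the sanitizer report in this post: in C, shifting a value by an amount greater than or equal to the width of its (promoted) type is undefined behavior, so a shift count of 64 on a 64-bit word must be special-cased. The sketch below shows one generic way to guard such shifts. It says nothing about what line 614 of xpack_compress.c actually does or how it was fixed; the `machine_word_t` typedef here is a stand-in and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t machine_word_t; /* stand-in for the library's word type */
#define WORDBITS 64

/* Left shift that is defined for any count: bits shifted past the top of
 * the word are dropped, so counts >= WORDBITS yield 0. */
machine_word_t shl_safe(machine_word_t w, unsigned n)
{
    return n < WORDBITS ? w << n : 0;
}

/* Right shift with the same guarantee. */
machine_word_t shr_safe(machine_word_t w, unsigned n)
{
    return n < WORDBITS ? w >> n : 0;
}

/* Mask with the low n bits set, valid for 0 <= n <= WORDBITS.
 * The naive ((machine_word_t)1 << n) - 1 is undefined for n == WORDBITS. */
machine_word_t low_bits_mask(unsigned n)
{
    assert(n <= WORDBITS);
    return n == WORDBITS ? ~(machine_word_t)0
                         : ((machine_word_t)1 << n) - 1;
}
```

The extra comparison costs a little, so hot loops in a bit writer often prefer to restructure the code so that the count can never reach the word width, rather than calling helpers like these.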

  8. #7
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Zyzzyva
    XPACK has been inspired by the DEFLATE, LZX, and Zstandard formats, among others. Originally envisioned as a DEFLATE replacement, it won't necessarily see a lot of additional development since other solutions such as Zstandard seem to have gotten much closer to that goal first (great job to those involved!). But I am releasing the code anyway for anyone who may find it useful.
    I think that ZSTD is the first codec that has a real chance of replacing Deflate. There are several reasons for that:
    * the lz-ans generation of codecs that it represents really raises the bar on the efficiency front, beating deflate much more handsomely than previous codecs did
    * Yann has successfully fielded a codec that nearly eradicated all competition in its area before
    * ZSTD has nearly all that is needed: efficiency, good QA, permissive license w/ no patents, a community of followers. A large corporate backer would make the list full.

    You say that it got better results faster. Frankly, that's no surprise, as Yann is really good at what he does and he gets community help that your codec lacked. You probably made the mistake of opening the code so late, though late is certainly better than never. Combining the lateness with the bullet points above, I agree with you that the chances of XPACK replacing Deflate are minuscule. But that definitely doesn't make your codec worthless. You seem to be giving up, but I can tell you that there's value in persisting. First and foremost, by keeping at it you'll learn a thing or two. Second, you'll provide an alternative codec. According to Stephan Busch, you already have a well-working one. You may re-tweak it so that it makes tradeoffs significantly different from ZSTD's and scores nice wins in some area of your choice.

  9. #8
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Yann works at FB, and it seems that Zstd is his main duty at work.

  10. #9
    Member
    Join Date
    Nov 2014
    Location
    Earth
    Posts
    38
    Thanks
    0
    Thanked 77 Times in 19 Posts
    @m^3

    Thanks - I've fixed both of those things.

    @m^2

    I didn't really mean to give the impression that I'm "giving up"; I'll likely keep updating libxpack and making improvements when I have time. I just see the project as being more on the experimental side and a way to provide some ideas to the community. I also work on other projects.

    As far as Yann supposedly working at FB on Zstd, I think it's great. I see open source compression libraries/algorithms as often being subject to a "tragedy of the commons" effect --- almost everyone uses them, but few contribute. So assuming no ulterior motives, it's nice to see some investment by a company.

  11. #10
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    xpack is interesting and well-written code, sharing common roots with zstd.

    There might be a few interesting things to learn from your experiment, which could prove useful for zstd too.
    Last edited by Cyan; 18th May 2016 at 18:20.

  12. #11
    Member
    Join Date
    May 2008
    Location
    Kuwait
    Posts
    301
    Thanks
    26
    Thanked 22 Times in 15 Posts
    Did you know about BIX (http://www.softpedia.com/get/Compres...Archiver.shtml)? It had similar features but was replaced by 7-zip.
