
Thread: gipfeli

  1. #1
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts

    gipfeli

    Gipfeli is a high-speed compression/decompression library aiming at slightly higher compression ratios (around 30 % less bytes produced for text) than other high-speed compression libraries. On a single core of a Core i7 processor in 64-bit mode, gipfeli compresses at about 180 MB/s or more and decompresses at about 300 MB/s, typically 5x faster than zlib, but does not quite achieve its compression density.
    https://code.google.com/p/gipfeli/

    Looks like an open-source LZHUFF competitor (Apache 2 license).

  2. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I tried to use it to get some numbers.
    First, the API is wrong.
    Code:
      Status Compress(const char* input, size_t input_length, std::string* output);
      Status Uncompress(const char* compressed, size_t compressed_length, std::string* uncompressed);
    The functions take a char* as input and a std::string* as output. Internally, the first thing they do with the output is resize it to the size they need, and then they use it just like a byte array. So why use strings at all? It only makes users convert things back and forth.
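
    To illustrate the back-and-forth conversion, here is a minimal sketch of the kind of adapter a buffer-based caller ends up writing around a string-output API. StringApiCompress is a hypothetical stand-in that only mirrors the shape of the signatures quoted above (it returns bool instead of Status and merely copies its input); it is not gipfeli's code, and CompressToBuffer is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in with the same shape as the quoted signatures:
// raw bytes in, std::string out. It only copies the input (no real compression).
bool StringApiCompress(const char* input, std::size_t input_length,
                       std::string* output) {
    assert(output != nullptr);
    output->assign(input, input_length);
    return true;
}

// Adapter for callers that work with plain byte buffers.
// Returns the number of bytes written to dst, or 0 if the codec fails
// or dst is too small (so an empty result is indistinguishable from failure).
std::size_t CompressToBuffer(const char* src, std::size_t src_len,
                             char* dst, std::size_t dst_capacity) {
    std::string tmp;  // temporary the string-based API forces on the caller
    if (!StringApiCompress(src, src_len, &tmp)) return 0;
    if (tmp.size() > dst_capacity) return 0;
    std::memcpy(dst, tmp.data(), tmp.size());  // the extra copy complained about
    return tmp.size();
}
```

    The temporary string and the final memcpy are pure overhead that an API writing directly into a caller-supplied buffer would avoid.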

    Second, I got it working, but only on calgary.tar; on anything else it returns decompression errors. I've asked the author about it and am waiting for a reply.

    For now, calgary result:
    Code:
    Codec                                   version      args
    C.Size      (C.Ratio)        C.Speed   D.Speed      C.Eff. D.Eff.
    gipfeli                                 2011-10-19
        1337335 (x 2.358)        48 MB/s   78 MB/s      1439e3 2302e3
    miniz                                   1.11         1
        1372044 (x 2.298)        30 MB/s   68 MB/s       895e3 1967e3
    miniz                                   1.11         2
        1149652 (x 2.742)        18 MB/s   78 MB/s       614e3 2540e3
    zlib                                    1.2.7        1
        1205442 (x 2.616)        16 MB/s   76 MB/s       511e3 2431e3
    LZ4                                     r73          12
        1620566 (x 1.946)       101 MB/s  319 MB/s      2532e3 7959e3
    Shrinker                                r6
        1482747 (x 2.126)        66 MB/s  233 MB/s      1816e3 6346e3
    QuickLZ                                 1.5.1b6      1
        1496892 (x 2.106)        79 MB/s   80 MB/s      2133e3 2159e3
    QuickLZ                                 1.5.1b6      2
        1302087 (x 2.421)        38 MB/s   68 MB/s      1167e3 2070e3
    Codec                                   version      args
    C.Size      (C.Ratio)        C.Speed   D.Speed      C.Eff. D.Eff.
    done... (3x20 iteration(s)).
    Take the results with a big grain of salt: calgary is not great benchmark data, and it's simply too small for measuring performance.

  3. #3
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts

    enwik7 (10000000 bytes) compressed to 4634903 bytes in 0.3 sec (AMD-E350, 1.6GHz x2).
    But decompression always fails?

    I'm confused by the API. (Why not use std::vector or something other than std::string?)

  4. #4
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Found another data set on which it works: one of my standard text benchmarks, *cut into 128K pieces*:
    Code:
    e:\projects\benchmark04\tst>fsbench gipfeli miniz,1 miniz,2 zlib,1 lz4 shrinker quicklz quicklz,2 -b131072 ..\nbbs.tar
    memcpy: 154 ms, 30289408 bytes = 750 MB/s
    Codec                                   version      args
    C.Size      (C.Ratio)        C.Speed   D.Speed      C.Eff. D.Eff.
    gipfeli                                 2011-10-19
       10858699 (x 2.789)        90 MB/s  144 MB/s        14e6   23e6
    miniz                                   1.11         1
       11190309 (x 2.707)        47 MB/s  111 MB/s      7697e3   17e6
    miniz                                   1.11         2
        9960869 (x 3.041)        23 MB/s  117 MB/s      4088e3   19e6
    zlib                                    1.2.7        1
       10384961 (x 2.917)        21 MB/s  107 MB/s      3595e3   17e6
    LZ4                                     r73          12
       11754726 (x 2.577)       145 MB/s  477 MB/s        22e6   73e6
    Shrinker                                r6
       11066709 (x 2.737)        75 MB/s  304 MB/s        12e6   48e6
    QuickLZ                                 1.5.1b6      1
       11598552 (x 2.611)       104 MB/s  126 MB/s        16e6   19e6
    QuickLZ                                 1.5.1b6      2
       11011252 (x 2.751)        66 MB/s  146 MB/s        10e6   23e6
    Codec                                   version      args
    C.Size      (C.Ratio)        C.Speed   D.Speed      C.Eff. D.Eff.
    done... (3x4 iteration(s)).
    No, it doesn't work on all small files.
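
    For reference, a small sketch of the arithmetic behind this run, assuming that -b131072 is the block size in bytes (128 * 1024, matching "128K pieces") and that C.Ratio is simply original size divided by compressed size. SplitIntoBlocks and CompressionRatio are illustrative names, not fsbench functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Split the range [0, total) into consecutive (offset, length) blocks of at
// most block_size bytes; only the last block may be shorter.
std::vector<std::pair<std::size_t, std::size_t>> SplitIntoBlocks(
        std::size_t total, std::size_t block_size) {
    assert(block_size > 0);
    std::vector<std::pair<std::size_t, std::size_t>> blocks;
    for (std::size_t off = 0; off < total; off += block_size) {
        const std::size_t remaining = total - off;
        blocks.emplace_back(off, remaining < block_size ? remaining : block_size);
    }
    return blocks;
}

// Compression ratio as original size over compressed size.
double CompressionRatio(std::uint64_t original, std::uint64_t compressed) {
    assert(compressed > 0);
    return static_cast<double>(original) / static_cast<double>(compressed);
}
```

    With the 30289408 bytes reported on the memcpy line, this gives 232 blocks (231 full ones plus a final 11776-byte block), and 30289408 / 10858699 is about 2.789, which agrees with the ratio printed for gipfeli.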

  5. #5
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    872
    Thanks
    457
    Thanked 175 Times in 85 Posts
    Is there a working version I could use for the SqueezeChart corpora?

  6. #6
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I didn't get a reply and there's been no update recently, so I guess not. You may try your luck contacting the authors.

  7. #7
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    After nearly 2 years I have received an answer to my inquiry.
    The problems appear to have been corrected and the project has been moved to:
    https://github.com/google/gipfeli

    Initial results in fsbench on core2:
    Note that gipfeli has an API based on std::string that doesn't match what fsbench expects, so my adaptation layer has to memcpy back and forth. A project designed to use gipfeli would get better performance than what I show below. Also, fsbench is compiled with SSE2, and intel-zlib's special quick mode is enabled (unlike in the results I've shown before).
    Code:
    katmacadapc% ./fsbench zlib,1 shrinker gipfeli miniz,1 quicklz,2 files/_FOSSIL_ 
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    zlib                                    2014-06-16 Intel 1
         144872 (x 6.877)     91.0 MB/s  359 MB/s        77e6  306e6
    Shrinker                                r6           
         169186 (x 5.889)      293 MB/s  802 MB/s       243e6  665e6
    gipfeli                                 2014-06-30   
         188905 (x 5.274)      221 MB/s  432 MB/s       179e6  350e6
    miniz                                   1.11         1
         154118 (x 6.465)      186 MB/s  306 MB/s       157e6  258e6
    QuickLZ                                 1.5.1b6      2
         158442 (x 6.288)      170 MB/s  416 MB/s       143e6  350e6
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    done... (4*X*1) iteration(s)).
    katmacadapc% ./fsbench zlib,1 shrinker gipfeli miniz,1 quicklz,2 files/scc1.tar 
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    zlib                                    2014-06-16 Intel 1
        5829409 (x 2.161)     38.5 MB/s  162 MB/s        20e6   86e6
    Shrinker                                r6           
        6691068 (x 1.882)      144 MB/s  570 MB/s        67e6  267e6
    gipfeli                                 2014-06-30   
        6244270 (x 2.017)      124 MB/s  250 MB/s        62e6  125e6
    miniz                                   1.11         1
        6245122 (x 2.017)     73.7 MB/s  136 MB/s        37e6   68e6
    QuickLZ                                 1.5.1b6      2
        6141358 (x 2.051)     90.5 MB/s  162 MB/s        46e6   82e6
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    done... (4*X*1) iteration(s)).
    katmacadapc% ./fsbench zlib,1 shrinker gipfeli miniz,1 quicklz,2 files/calgary.tar 
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    zlib                                    2014-06-16 Intel 1
        1281280 (x 2.549)     37.5 MB/s  158 MB/s        22e6   95e6
    Shrinker                                r6           
        1540611 (x 2.120)      133 MB/s  501 MB/s        70e6  264e6
    gipfeli                                 2014-06-30   
        1387112 (x 2.355)      117 MB/s  201 MB/s        67e6  115e6
    miniz                                   1.11         1
        1425257 (x 2.292)     70.0 MB/s  123 MB/s        39e6   69e6
    QuickLZ                                 1.5.1b6      2
        1354211 (x 2.412)     87.7 MB/s  170 MB/s        51e6   99e6
    Codec                                   version      args
    C.Size      (C.Ratio)        E.Speed   D.Speed      E.Eff. D.Eff.
    done... (4*X*1) iteration(s)).
    
    Overall, it's on the Pareto frontier, but I expected more.
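
    The Pareto claim can be checked mechanically. Below is a minimal sketch that uses the calgary.tar figures from the last table and considers only two axes, compression ratio and compression speed (E.Speed); decompression speed is ignored, which is a simplification. The Result struct and the function names are illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Result {
    std::string codec;
    double ratio;       // C.Ratio column
    double speed_mb_s;  // E.Speed column, MB/s
};

// a dominates b if it is at least as good on both axes and strictly
// better on at least one of them.
bool Dominates(const Result& a, const Result& b) {
    return a.ratio >= b.ratio && a.speed_mb_s >= b.speed_mb_s &&
           (a.ratio > b.ratio || a.speed_mb_s > b.speed_mb_s);
}

// Names of all codecs that no other codec dominates, in input order.
std::vector<std::string> ParetoFrontier(const std::vector<Result>& results) {
    std::vector<std::string> frontier;
    for (const Result& candidate : results) {
        bool dominated = false;
        for (const Result& other : results) {
            if (Dominates(other, candidate)) { dominated = true; break; }
        }
        if (!dominated) frontier.push_back(candidate.codec);
    }
    return frontier;
}

// calgary.tar figures copied from the table above.
std::vector<Result> CalgaryResults() {
    return {
        {"zlib,1",    2.549, 37.5},
        {"Shrinker",  2.120, 133.0},
        {"gipfeli",   2.355, 117.0},
        {"miniz,1",   2.292, 70.0},
        {"QuickLZ,2", 2.412, 87.7},
    };
}
```

    On these numbers the frontier is zlib,1, Shrinker, gipfeli and QuickLZ,2; miniz,1 is the only entry that drops out, because gipfeli beats it on both ratio and compression speed.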
