
Thread: Search compressor for PyPyJS, better than deflate

  1. #1
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts

    Search compressor for PyPyJS, better than deflate

    PyPy.js is PyPy compiled into JavaScript. (PyPy is a Python implementation written in RPython.)

    Some Links:


    Current show stopper of PyPyJS: the load time.
    To init PyPyJS, two big files must be loaded, together ~20MB.

    I'm searching for a usable compression scheme, better than the HTTP server/client default gzip/deflate.

    My first try was LZMA: https://github.com/pypyjs/pypyjs.git...ment-111549828
    I compress with Python (LZMA is there since Python 3.3) and use https://github.com/nmrugg/LZMA-JS to decompress in JavaScript.
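
    Roughly, the Python side looks like this (just a sketch, assuming LZMA-JS accepts the legacy .lzma "alone" container that lzma.FORMAT_ALONE produces):

    Code:
    # Sketch: compress pypy.vm.js with Python's lzma module (Python 3.3+).
    # Assumption: LZMA-JS in the browser accepts the legacy .lzma "alone"
    # container, which is what FORMAT_ALONE produces.
    import lzma

    with open("pypy.vm.js", "rb") as f:
        raw = f.read()

    compressed = lzma.compress(raw, format=lzma.FORMAT_ALONE, preset=9)

    with open("pypy.vm.js.lzma", "wb") as f:
        f.write(compressed)

    print("%.2f MB -> %.2f MB" % (len(raw) / 1e6, len(compressed) / 1e6))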

    The ~20MB are compressed down to ~2.9MB.

    But this is only usable on fast machines, where the decompression is done in ~5 sec.
    On slower machines it takes >12 sec. (On a Raspberry Pi it will probably take forever.)

    So I'm trying to find a better compressor...

    I found this nice benchmark: https://quixdb.github.io/squash-benchmark/
    I chose the dataset "Tarred source code of Samba 2-2.3", and here are the "cleaned" results for the "beagleboard-xm" machine:

    [Attached image: b42ac87a-12d0-11e5-9847-76923e62afa7.png]

    IMHO, compressors with an interesting and fair ratio of compression to decompression speed are:
    • doboz
    • lzham
    • zlib:deflate
    • lzo


    But I didn't find a JavaScript implementation of "doboz" or "lzham"...
    zlib:deflate seems to be available here: https://github.com/imaya/zlib.js

    For lzo I only found "miniLZO" in JavaScript, here: https://github.com/abraidwood/minilzo-js

    I tried to compile "lzham" with emscripten to JavaScript, without success: https://github.com/richgel999/lzham_...ment-112318038

    So I used zlib:deflate for the next tests:
    The compressed size is ~1.93MB, and on the same machine where the LZMA decompression needs ~12 sec, zlib.js is done in ~350ms.
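
    For reference, the Python side of that test (a sketch; imaya/zlib.js ships both inflate.min.js for zlib-wrapped streams and rawinflate.min.js for raw deflate, so the container has to match on both ends):

    Code:
    # Sketch: produce the stream that zlib.js inflates in the browser.
    # zlib.compress() emits a zlib-wrapped stream (RFC 1950) for inflate.min.js;
    # for rawinflate.min.js, emit raw deflate instead, e.g. via
    # zlib.compressobj(9, zlib.DEFLATED, -15).
    import zlib

    with open("pypy.vm.js", "rb") as f:
        data = zlib.compress(f.read(), 9)  # level 9 = best compression

    with open("pypy.vm.js.deflate", "wb") as f:
        f.write(data)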


    Now I found http://mattmahoney.net/dc/text.html with more compression programs...
    Programs there with a fair ratio of compression to decompression speed are:


    • mcm 0.83 -x11 (closed source)
    • nanozipltcb 0.09
    • pcompress 3.1 -c libbsc -l14 -s1000m
    • lzturbo 1.1 -49 -b1000 -p0


    But I need the decompression in JavaScript...


    Does anybody have ideas?

  2. #2
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    I tried zopfli but it seems it's not worth it:

    Code:
    Compress 'pypy.vm.js' with zlib level=9 to './pypy.vm.js_level9.deflate'
    uncompressed.......:  12.00 MBytes
    compression time...:   0.96 sec.
    compressed.........:   1.00 MBytes
    compression ratio..:  14.82 %
    
    Compress 'pypy.vm.js.mem' with zlib level=9 to './pypy.vm.js.mem_level9.deflate'
    uncompressed.......:   6.00 MBytes
    compression time...:   1.67 sec.
    compressed.........:   1.00 MBytes
    compression ratio..:  29.34 %
    
    ===============================================================================
    
    Compress 'pypy.vm.js' with zopfli to './pypy.vm.js_zopfli'
    uncompressed.......:  12.00 MBytes
    compression time...:  91.86 sec.
    compressed.........:   1.00 MBytes
    compression ratio..:  14.29 %
    
    Compress 'pypy.vm.js.mem' with zopfli to './pypy.vm.js.mem_zopfli'
    uncompressed.......:   6.00 MBytes
    compression time...:  39.36 sec.
    compressed.........:   1.00 MBytes
    compression ratio..:  28.82 %
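
    The zlib level=9 half of such a comparison is easy to script; a rough Python sketch (not the exact script used above; the zopfli half just shells out to the zopfli binary):

    Code:
    # Sketch: compress a file with zlib level=9 and report size, time and ratio.
    import time
    import zlib

    def compress_report(src_path, dst_path):
        with open(src_path, "rb") as f:
            raw = f.read()
        start = time.time()
        compressed = zlib.compress(raw, 9)
        duration = time.time() - start
        with open(dst_path, "wb") as f:
            f.write(compressed)
        print("Compress %r with zlib level=9 to %r" % (src_path, dst_path))
        print("uncompressed.......: %6.2f MBytes" % (len(raw) / 1e6))
        print("compression time...: %6.2f sec." % duration)
        print("compressed.........: %6.2f MBytes" % (len(compressed) / 1e6))
        print("compression ratio..: %6.2f %%" % (len(compressed) * 100.0 / len(raw)))

    compress_report("pypy.vm.js", "pypy.vm.js_level9.deflate")
    compress_report("pypy.vm.js.mem", "pypy.vm.js.mem_level9.deflate")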

  3. #3
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    No comments? No suggestions?

  4. #4
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Have a look at my TarsaLZP: https://github.com/tarsa/TarsaLZP
    It has implementations for plain C, Java, JavaScript and Python. All implementations produce identical files.
    It's tested on LTCB: http://mattmahoney.net/dc/text.html#2088

    It's symmetrical, which means it doesn't have very fast decompression, but you can still try it. There is one trick to speed it up. Set:
    LZP Low Context Length == LZP High Context Length
    LZP Low Mask Size == LZP High Mask Size
    Then it will use a single LZP model, enabling higher speed. Reducing the context length and mask size makes compression quicker at the cost of compression ratio.


    It turns out that saving from the web browser is broken. A few years ago everything worked fine, though.
    Last edited by Piotr Tarsa; 20th June 2015 at 00:01.

  5. #5
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Have you compared the values with zlib:deflate?

  6. #6
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Have you looked at the LTCB link I've posted?

    It compresses textual data better than zlib:deflate but it's much slower at decompression.

  7. #7
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    I have looked at https://github.com/tarsa/TarsaLZP, cloned it and tried to use it, but without success... (btw. the source file layout is strange )

    But I rather need an asymmetrical one. Decompression speed is the bottleneck.

  8. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Cloning: git clone https://github.com/tarsa/TarsaLZP.git
    Then open the file javascript/TarsaLZP/Main.html in a browser
    C & Java versions are NetBeans projects.
    JavaScript & Python versions are IntelliJ IDEA projects.

  9. #9
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Quote Originally Posted by jedie View Post
    • mcm 0.83 -x11 (closed source)
    • nanozipltcb 0.09
    • pcompress 3.1 -c libbsc -l14 -s1000m
    • lzturbo 1.1 -49 -b1000 -p0
    Just a small correction:
    mcm and pcompress are open source projects.
    nanozip and lzturbo are closed source projects.

  10. #10
    Member snowcat's Avatar
    Join Date
    Apr 2015
    Location
    Vietnam
    Posts
    27
    Thanks
    36
    Thanked 11 Times in 8 Posts
    In my opinion, SR2 might fit your requirements. But you would need to rewrite it from C to JavaScript.

  11. #11
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Tree ( http://mattmahoney.net/dc/text.html#1672 ) has very good compression ratio and very high decompression speed. Memory usage during decompression is much lower than during compression. But there's no JavaScript decompressor.

  12. #12
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Skymmer View Post
    Just a small correction:
    mcm and pcompress are open source projects.
    nanozip and lzturbo are closed source projects.
    I only marked mcm as closed source because I found that information here: http://mattmahoney.net/dc/text.html#1449 (So is it outdated?)

    I just copied & pasted the interesting and suggested compressors from http://mattmahoney.net/dc/text.html,
    sorted by decompression time:
    Code:
                    Compression                      Compressed size      Decompresser  Total size   Time (ns/byte)
    Program           Options                       enwik8      enwik9     size (zip)   enwik9+prog  Comp Decomp  Mem  Alg Note
    -------           -------                     ----------  -----------  -----------  -----------  ----- -----  ---  --- ----
    lzham 1.0         -d29 -x                     25,002,070  202,237,199    191,600 s  202,428,799   1096   6.6 7800  LZ77 70
    lzturbo 1.1       -49 -b1000 -p0              24,416,777  194,681,713    110,670 x  194,792,383   1920     9 14700 LZ77 59
    tornado 0.6       -16                         25,768,105  217,749,028     83,694 s  217,832,722   1482     9 1290  LZ77 48
    rh5_x64           -window:27 c6               29,078,552  254,220,469     36,744 x  254,257,213    196   9.4  145  ROLZ 48
    lza 0.82b         -mx9 -b7 -h7                26,396,613  222,808,457    285,766 x  223,094,223    449   9.7 2000  LZ77 48
    cabarc 1.00.0601  -m lzx:21                   28,465,607  250,756,595     51,917 xd 250,808,853   1619    15   20  LZ77
    glza 0.2                                      20,806,740  167,274,338     15,218 sd 167,289,556   4713  16.4 6027  Dict 67
    xz 5.2.1--lzma2=preset=9e,dict=1GiB,lc=4,pb=0 24,703,772  197,331,816     36,752 xd 197,368,568   5876    20 6000  LZ77 73
    lzip 1.14-rc3     -9 -s512MiB                 24,756,063  199,410,543     21,682 s  199,432,225   2409    21 5632  LZ77 57
    csarc 3.3         -m5 -d1024m                 24,516,202  203,995,005     69,848 s  204,064,853    621    22 2463  LZ77 48
    nanozipltcb 0.09                              20,537,902  161,581,290    133,784 x  161,715,074     64    30 3350  BWT  40
    7zip 9.20                                     25,895,909  227,905,645    518,536 x  228,424,181   1031    42       LZMA  26
    mcm 0.83          -x11                        18,233,295  144,854,575     79,574 s  144,934,149    394   281 5961  CM   72
    nanozip 0.09a     w32c -cc -m3g -nm           18,723,846  150,037,341          0 xd 150,037,341   1110  1084 2693  CM   40
    I tested LZMA and it's quite slow, so I think decompression times higher than that won't fit.
    So mcm and nanozip are no choice...

    What I miss on http://mattmahoney.net/dc/text.html is plain zlib:deflate. It would be interesting as a reference...


    Looking at this copied table... I think glza is really interesting: better compression, a smaller decompressor and slightly less decompression memory... I don't know if a decompression time of 16.4 ns/byte is acceptable...

    lzturbo seems to use too much RAM.

    I will search for JavaScript decompression implementations...

  13. #13
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    I just tested GLZA:

    * GLZA by Kennon Conrad
    * licensed under the Apache License, Version 2.0
    * Download: http://encode.ru/threads/1909-Tree-a...ll=1#post44023

    Compression via Windows batch file:

    Code:
    @echo off
    call:compress pypy.vm.js
    call:compress pypy.vm.js.mem
    pause
    
    :compress
        echo on
        GLZAformat %1 %1.glzf
        GLZAcompress %1.glzf %1.glzc
        GLZAencode %1.glzc %1.glze
        GLZAdecode %1.glze %1.glzd
        @echo off
    goto:eof
    Compressed file sizes:
    pypy.vm.js:

    Code:
    zlib level=5     2.02MB
    bzip2 level=1    1.66MB
    lzma preset=3    1.42MB
    GLZA             1.13MB
    pypy.vm.js.mem:

    Code:
    zlib level=5    1.98MB
    bzip2 level=1   1.94MB
    lzma preset=3   1.47MB
    GLZA            1.68MB
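
    For reference, the zlib / bzip2 / lzma sizes above can be reproduced directly with the Python standard library; a minimal sketch (the GLZA size comes from the GLZA tools in the batch file above):

    Code:
    # Sketch: reproduce the zlib / bzip2 / lzma reference sizes with the stdlib.
    import bz2
    import lzma
    import zlib

    def report(path):
        with open(path, "rb") as f:
            raw = f.read()
        print(path)
        print("  zlib level=5 .: %.2f MB" % (len(zlib.compress(raw, 5)) / 1e6))
        print("  bzip2 level=1 : %.2f MB" % (len(bz2.compress(raw, 1)) / 1e6))
        print("  lzma preset=3 : %.2f MB" % (len(lzma.compress(raw, preset=3)) / 1e6))

    report("pypy.vm.js")
    report("pypy.vm.js.mem")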

    decompress speed:
    Code:
    GLZAdecode pypy.vm.js.glze pypy.vm.js.glzd
    Decompressed 13293637 bytes in 1609 msec
    
    GLZAdecode pypy.vm.js.mem.glze pypy.vm.js.mem.glzd
    Decompressed 6911400 bytes in 388 msec

    It seems to me that the decompression speed is too slow, and that is on my i7-4790K with the natively compiled GLZAdecode.exe... Compiled via emscripten to JavaScript it will only get slower...

    From the first post: zlib.js decompresses both files in ~350ms.

    Another look on the benchmark results:
    Code:
                    Compression                      Compressed size      Decompresser  Total size   Time (ns/byte)
    Program           Options                       enwik8      enwik9     size (zip)   enwik9+prog  Comp Decomp  Mem  Alg Note
    -------           -------                     ----------  -----------  -----------  -----------  ----- -----  ---  --- ----
    lzham 1.0         -d29 -x                     25,002,070  202,237,199    191,600 s  202,428,799   1096   6.6 7800  LZ77 70
    lzturbo 1.1       -49 -b1000 -p0              24,416,777  194,681,713    110,670 x  194,792,383   1920     9 14700 LZ77 59
    tornado 0.6       -16                         25,768,105  217,749,028     83,694 s  217,832,722   1482     9 1290  LZ77 48
    rh5_x64           -window:27 c6               29,078,552  254,220,469     36,744 x  254,257,213    196   9.4  145  ROLZ 48
    lza 0.82b         -mx9 -b7 -h7                26,396,613  222,808,457    285,766 x  223,094,223    449   9.7 2000  LZ77 48
    cabarc 1.00.0601  -m lzx:21                   28,465,607  250,756,595     51,917 xd 250,808,853   1619    15   20  LZ77
    glza 0.2                                      20,806,740  167,274,338     15,218 sd 167,289,556   4713  16.4 6027  Dict 67
    Maybe only lzham, lzturbo, tornado, rh5 and lza are fast enough?
    But lzturbo is closed source and needs too much RAM.
    Last edited by jedie; 22nd June 2015 at 13:47.

  14. #14
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Next is tornado:

    Code:
    tor64.exe -%%a -o -qh -cpu pypy.vm.js
    
    -1: compressed 13,293,637 -> 3,566,090: 26.83%, time 0.031 secs, speed 405.690 mb/sec
    -2: compressed 13,293,637 -> 3,116,018: 23.44%, time 0.031 secs, speed 405.690 mb/sec
    -3: compressed 13,293,637 -> 2,447,626: 18.41%, time 0.047 secs, speed 270.460 mb/sec
    -4: compressed 13,293,637 -> 2,400,941: 18.06%, time 0.047 secs, speed 270.460 mb/sec
    -5: compressed 13,293,637 -> 1,971,624: 14.83%, time 0.141 secs, speed 90.153 mb/sec
    -6: compressed 13,293,637 -> 1,825,296: 13.73%, time 0.203 secs, speed 62.414 mb/sec
    -7: compressed 13,293,637 -> 1,598,698: 12.03%, time 0.375 secs, speed 33.807 mb/sec
    -8: compressed 13,293,637 -> 1,533,542: 11.54%, time 0.531 secs, speed 23.864 mb/sec
    -9: compressed 13,293,637 -> 1,509,208: 11.35%, time 0.781 secs, speed 16.228 mb/sec
    -10: compressed 13,293,637 -> 1,483,479: 11.16%, time 0.906 secs, speed 13.989 mb/sec
    -11: compressed 13,293,637 -> 1,650,052: 12.41%, time 0.484 secs, speed 26.174 mb/sec
    -12: compressed 13,293,637 -> 1,496,130: 11.25%, time 0.828 secs, speed 15.309 mb/sec
    -13: compressed 13,293,637 -> 1,429,882: 10.76%, time 1.344 secs, speed 9.435 mb/sec
    -14: compressed 13,293,637 -> 1,395,415: 10.50%, time 1.703 secs, speed 7.444 mb/sec
    -15: compressed 13,293,637 -> 1,357,953: 10.22%, time 2.938 secs, speed 4.316 mb/sec
    -16: compressed 13,293,637 -> 1,296,442: 9.75%, time 4.469 secs, speed 2.837 mb/sec

    Code:
    tor64.exe -%%a -o -qh -cpu pypy.vm.js.mem
    
    -1: compressed 6,911,400 -> 3,275,175: 47.39%, time 0.016 secs, speed 421.838 mb/sec
    -2: compressed 6,911,400 -> 2,831,794: 40.97%, time 0.016 secs, speed 421.838 mb/sec
    -3: compressed 6,911,400 -> 2,163,411: 31.30%, time 0.047 secs, speed 140.613 mb/sec
    -4: compressed 6,911,400 -> 2,122,102: 30.70%, time 0.094 secs, speed 70.306 mb/sec
    -5: compressed 6,911,400 -> 1,879,148: 27.19%, time 0.156 secs, speed 42.184 mb/sec
    -6: compressed 6,911,400 -> 1,845,531: 26.70%, time 0.203 secs, speed 32.449 mb/sec
    -7: compressed 6,911,400 -> 1,790,498: 25.91%, time 0.344 secs, speed 19.174 mb/sec
    -8: compressed 6,911,400 -> 1,782,496: 25.79%, time 0.547 secs, speed 12.053 mb/sec
    -9: compressed 6,911,400 -> 1,779,100: 25.74%, time 0.719 secs, speed 9.170 mb/sec
    -10: compressed 6,911,400 -> 1,780,917: 25.77%, time 1.031 secs, speed 6.391 mb/sec
    -11: compressed 6,911,400 -> 1,766,211: 25.56%, time 0.438 secs, speed 15.066 mb/sec
    -12: compressed 6,911,400 -> 1,722,855: 24.93%, time 0.641 secs, speed 10.289 mb/sec
    -13: compressed 6,911,400 -> 1,708,699: 24.72%, time 0.828 secs, speed 7.959 mb/sec
    -14: compressed 6,911,400 -> 1,704,007: 24.66%, time 1.125 secs, speed 5.859 mb/sec
    -15: compressed 6,911,400 -> 1,696,814: 24.55%, time 1.547 secs, speed 4.261 mb/sec
    -16: compressed 6,911,400 -> 1,687,355: 24.41%, time 1.594 secs, speed 4.136 mb/sec

    Code:
    tor64.exe -16 -opypy.vm.js.tor pypy.vm.js
    Compressing 13,293,637 bytes with optimal parser fb512, 1gb+128mb:128 bt5 + 4mb:
    8 exhash4 + 256kb hash3 + 16kb hash2, buffer 128mb, aricoder
    -16: compressed 13,293,637 -> 1,297,123: 9.76%, time 5.005 secs, speed 2.533 mb/
    sec
    
    tor64.exe -16 -opypy.vm.js.mem.tor pypy.vm.js.mem
    Compressing 6,911,400 bytes with optimal parser fb512, 1gb+128mb:128 bt5 + 4mb:8
     exhash4 + 256kb hash3 + 16kb hash2, buffer 128mb, aricoder
    -16: compressed 6,911,400 -> 1,687,355: 24.41%, time 1.750 secs, speed 3.767 mb/
    sec

    Code:
    tor64.exe -t pypy.vm.js.tor
    Unpacked 1,297,123 -> 13,293,637: 9.76%, time 0.072 secs, speed 176.857 mb/sec
    
    tor64.exe -t pypy.vm.js.mem.tor
    Unpacked 1,687,355 -> 6,911,400: 24.41%, time 0.074 secs, speed 89.109 mb/sec
    Code:
    pypy.vm.js:
    
    zlib level=5     2.02MB
    bzip2 level=1    1.66MB
    lzma preset=3    1.42MB
    GLZA             1.13MB
    tornado          1.23MB
    
    pypy.vm.js.mem:
    
    zlib level=5    1.98MB
    bzip2 level=1   1.94MB
    lzma preset=3   1.47MB
    GLZA            1.68MB
    tornado         1.60MB

  15. #15
    Member
    Join Date
    Dec 2012
    Location
    japan
    Posts
    149
    Thanks
    30
    Thanked 59 Times in 35 Posts
    This program is a simple BWT compressor. It doesn't use Typed Arrays.
    Attached Files

  16. #16
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Quote Originally Posted by jedie View Post
    I only marked that mcm is close source. Because i found the information here: http://mattmahoney.net/dc/text.html#1449 (So it's out-dated?)
    No, it's not outdated. It's just written for people who read the additional lines after the first one. So, especially for you:
    [Attached image: mcm_is_open_source.png]

    Quote Originally Posted by jedie View Post
    I just copy&paste the interesting and suggested compressors form http://mattmahoney.net/dc/text.html ...
    I tested LZMA and it's so slow. So i think decompressing times more than this doesn't fit ...
    So mcm and nanozip it no choice ...
    Don't know if 16.4 decompression-time is acceptable...
    lzturbo seems to use to mutch RAM ...
    From my point of view you're going in the wrong direction. You're trying to analyze the results from LTCB and extrapolate them to your situation.
    LTCB's main table shows the best result for a given program, so in most cases it means the compressor/archiver was either configured for maximum compression or specially tuned for the ENWIK8/ENWIK9 data sets.
    They can behave differently on your data. Also, you say that for example nanozip is not a choice. Do you actually know that nanozip supports a wide range of algorithms and supports parallel compression and memory tuning? Have you tried to take it and experiment with it? Do you know that lzturbo can be set up to use a 1MB block size instead of 1GB?
    And, well, if you don't know for yourself whether a decompression time of 16.4 ns/byte is acceptable, then what kind of help do you expect?

    Quote Originally Posted by jedie View Post
    What i miss on http://mattmahoney.net/dc/text.html is normal zlib:deflate This would be interesting as a reference...
    You can test it by yourself. Test files are available for downloading.
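
    For example, a rough Python sketch for a zlib:deflate reference point (assuming enwik8 has been downloaded from the LTCB page):

    Code:
    # Sketch: zlib:deflate reference numbers on enwik8, in LTCB-like ns/byte.
    import time
    import zlib

    with open("enwik8", "rb") as f:
        raw = f.read()

    compressed = zlib.compress(raw, 9)
    print("enwik8: %d -> %d bytes" % (len(raw), len(compressed)))

    start = time.time()
    zlib.decompress(compressed)
    duration = time.time() - start
    print("decompression: %.1f ns/byte" % (duration * 1e9 / len(raw)))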

    Quote Originally Posted by jedie View Post
    Maybe only lzham, lzturbo, tornado, rh5 and lza are fast enough?
    No, not only them )

  17. #17
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by xezz View Post
    This program is simple bwt compresser. It doesn't use Typed Array.
    Can you say more about the sources?!? Where do they come from? Who is the author? Released under which license?


    Quote Originally Posted by Skymmer View Post
    No, it's not outdated. It's just written for people who read the additional lines after the first one. So, especially for you:
    [Attached image: mcm_is_open_source.png]
    Oh, sorry. Yes, I didn't read all the text

    Quote Originally Posted by Skymmer View Post
    From my point of view you're going in the wrong direction. You're trying to analyze the results from LTCB and extrapolate them to your situation.
    LTCB's main table shows the best result for a given program, so in most cases it means the compressor/archiver was either configured for maximum compression or specially tuned for the ENWIK8/ENWIK9 data sets.
    They can behave differently on your data. Also, you say that for example nanozip is not a choice. Do you actually know that nanozip supports a wide range of algorithms and supports parallel compression and memory tuning? Have you tried to take it and experiment with it? Do you know that lzturbo can be set up to use a 1MB block size instead of 1GB?
    And, well, if you don't know for yourself whether a decompression time of 16.4 ns/byte is acceptable, then what kind of help do you expect?
    Yes, most programs have lots of parameters to tune.

    Maybe I'd get better results if I changed parameters. But currently all this makes no sense as long as there is no decompressor in JavaScript.

    Currently I use inflate.min.js from https://github.com/imaya/zlib.js/

    I don't know if another solution (converted via emscripten) would get a better compression ratio with similar decompression time and a similar footprint...

  18. #18
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Skymmer View Post
    You can test it by yourself. Test files are available for downloading.
    But I would need the exact same machine to get comparable results.

  19. #19
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    GLZA has substantially lower decompression speed than Tree, yet it doesn't always have a substantially higher compression ratio. So I suggest you test e.g. Tree 0.19.


    A very important point regarding LTCB: timings and other measurements are done on many different machines. If you follow the values in the "note" column, you can find that one compressor was tested on a weak single-core CPU while another was tested on a strong multi-core CPU. Therefore you need to look at the system configurations before filtering the results.


    Another thing is that bzip2 compression and decompression speed does not vary much with compression level. Decompression of data compressed with bzip2 -9 should achieve comparable speed to decompression of data compressed with bzip2 -1, unless you're doing multithreaded (de)compression and/ or your CPU caches are pretty small.
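
    That is easy to check on the actual files; a small sketch using Python's (single-threaded) bz2 module:

    Code:
    # Sketch: show that bzip2 decompression speed barely depends on the
    # compression level (single-threaded, via Python's bz2 module).
    import time
    import bz2

    with open("pypy.vm.js", "rb") as f:
        raw = f.read()

    for level in (1, 9):
        compressed = bz2.compress(raw, level)
        start = time.time()
        bz2.decompress(compressed)
        duration = time.time() - start
        print("bzip2 -%d: %8d bytes, decompressed in %.3f sec"
              % (level, len(compressed), duration))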

  20. #20
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    LTCB uses lots of different hardware because I don't want to put any limits on CPU, memory, etc. when people submit a top ranked result that only runs on a 10,000 core supercomputer. If you want speed comparison, a better choice is http://mattmahoney.net/dc/10gb.html

  21. #21
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts

    Try with brotli

    Seems like you could try Brotli. For large files it gives a little less compression than LZMA/LZHAM, but more for shorter files. Decompression speed is often a little better than that of LZHAM, and a lot better than LZMA.
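
    The compression side is easy to try from Python, e.g. with the Brotli Python bindings (a sketch; assumes the brotli package from the Brotli repository is installed, quality ranges from 0 to 11; the browser-side decompressor remains the open question):

    Code:
    # Sketch: compress with the Brotli Python bindings (assumed installed).
    # quality=11 is the densest / slowest setting.
    import brotli

    with open("pypy.vm.js", "rb") as f:
        raw = f.read()

    for quality in (5, 9, 11):
        compressed = brotli.compress(raw, quality=quality)
        print("quality %2d: %.2f MB" % (quality, len(compressed) / 1e6))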

  22. #22
    Member
    Join Date
    Dec 2012
    Location
    japan
    Posts
    149
    Thanks
    30
    Thanked 59 Times in 35 Posts
    repair and crepair have high-speed decompression.
    Attached Files

  23. #23
    Member
    Join Date
    Jun 2015
    Location
    DE
    Posts
    15
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    Seems like you could try Brotli. For large files it gives a little less compression than LZMA/LZHAM, but more for shorter files. Decompression speed is often a little better than that of LZHAM, and a lot better than LZMA.
    Brotli is listed in https://quixdb.github.io/squash-benchmark/ :


    • data: Tarred executables of Mozilla 1.0 (Tru64 UNIX edition)
    • machine: beagleboard-xm



    [Attached image: df578568-18d1-11e5-91a3-6fd37f556f0f.PNG]

    So decompression is faster than LZMA, but lzham is faster than brotli and compresses a little bit better...

    But I can't find a brotli decompressor for browser usage.
    There is https://github.com/devongovett/brotli.js but that is only for Node.js...


    Quote Originally Posted by xezz View Post
    repair and crepair has high speed decompression.
    I had to do some searching before finding the original source: https://code.google.com/p/re-pair/

    Do you have any comparison values? Compression ratio / decompression speed compared to zlib:deflate / LZMA?




    btw. I have made three online test cases here: https://github.com/jedie/pypyjs_test_compression :



    Some results are here: https://github.com/pypyjs/pypyjs.git...ment-115233587
    Conclusion: it seems that .zip plus decompression with github.com/Stuk/jszip is currently the best choice...
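
    For completeness, the server-side half of that pipeline is only a few lines of Python (a sketch; the archive name pypyjs.zip is made up here, jszip then unpacks it in the browser):

    Code:
    # Sketch: build the .zip that jszip unpacks in the browser.
    # ZIP_DEFLATED means the entries are ordinary deflate streams.
    import zipfile

    with zipfile.ZipFile("pypyjs.zip", "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write("pypy.vm.js")
        archive.write("pypy.vm.js.mem")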

  24. #24
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    have you tried this compressor?

    https://code.google.com/p/jslzjb/

    it is JavaScript...

    https://en.wikipedia.org/wiki/LZJB

    LZJB is a lossless data compression algorithm invented by Jeff Bonwick to compress crash dumps and data in ZFS.
    It includes a number of improvements to the LZRW1 algorithm ...
    The name LZJB is derived from its parent algorithm and its creator—Lempel Ziv Jeff Bonwick. Bonwick is also one of two architects of ZFS

    You should also have a look at the new squash benchmark results: https://quixdb.github.io/squash-benchmark/#results

    best regards

  25. #25
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    LZJB is very bad, especially for such a use case, because:
    * it's very weak on large files, probably twice as big as gzip
    * decompression will be slower than gzip's, because the latter is implemented in the browser in C

  26. The Following User Says Thank You to m^2 For This Useful Post:

    joerg (22nd August 2015)

  27. #26
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @m^2: "LZJB is very bad"
    -------
    * twice as big as gzip
    * decompression will be slower than that of gzip

    I thought it would be fast ...

    thank you very much for your comment

  28. #27
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts

    check out the latest squash benchmark results

    Quote Originally Posted by jedie View Post
    Brotli is listed in https://quixdb.github.io/squash-benchmark/ :


    • data: Tarred executables of Mozilla 1.0 (Tru64 UNIX edition)
    • machine: beagleboard-xm



    [Attached image: df578568-18d1-11e5-91a3-6fd37f556f0f.PNG]

    So decompression is faster than LZMA, but lzham is faster than brotli and compresses a little bit better...
    Brotli decompresses faster and compresses more densely in the latest squash benchmark results. Also, faster compression modes are now available via the quality settings.

    [Attached image: brotli2.png]

  29. #28
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by joerg View Post
    @m^2: "LZJB is very bad"
    -------
    * twice as big as gzip
    * decompression will be slower than that of gzip

    I thought it would be fast ...

    thank you very much for your comment
    LZJB in C is fast, though not nearly as much as ones like LZ4 or Nakamichi.
    LZJB in JS will be fast for a JS codec, but slow compared to fast codecs in C. Unlikely to match gzip in C.

