
Thread: HFCB: Huge Files Compression Benchmark

  1. #31 Black_Fox (Tester)
    I agree about the real world, but look at what is actually used in compression benchmarks. The Large Text Compression Benchmark primarily uses enwik9 (1 GB), which is one of the larger files used for testing so far. There are also SqueezeChart, Sportman's benchmark, Sami's benchmark, UCLC and Monster of Compression; AFAIK none of them uses files > 10 GB. Larger files are a bit hard to host somewhere...

    And yes, with the upload speeds here even a 100 MB file is too large

  2. #32 Bulat Ziganshin (Programmer)
    HFCB: added rar -m5 and pigz.

    pbzip2 is incompatible with my scripts, so I can show only compression times ATM:

    PBZip2 1.0.5
    -1 148.363
    -2 151.183
    -3 154.825
    -5 186.288
    -9 191.154

  3. #33 Skymmer (Member)
    My new personal record: 781 MiB (819 614 056 bytes). The compression chain is now the following:

    PreComp 0.3.8 * -> 7z -m0=BCJ2 -> SREP -> 7z -mx=9
    * - command line given above

    I've also experimented a little with the SREP-processed file. Details:
    Code:
    FA  lzma:d112m:h128m:mfhc4      835 767 443
    FA  lzma:d512m:h128m:mfht4      833 928 958
    7z  lzma:d=64m:mf=bt2           822 886 343
    7z  lzma:d=64m:mf=bt3           820 173 975
    It seems that SREP leaves no chance for the other match finders, though bt3 can be an exception here; for example, it gave me better results in another test with DDS files. And yes, it's surely possible to improve this result with a larger dictionary or with NZ, but it's enough for me at the moment.

  4. #34 Bulat Ziganshin (Programmer)
    Why not use a 128/256 MB dictionary? It needs just 1400-2500 MB of RAM to compress.

  5. #35 AiZ (Member)
    Hello,

    Quote Originally Posted by Bulat Ziganshin View Post
    decompression time = testing time. My HDD is slow and my CPU is fast, so extraction time has much more overhead than on an average system
    Don't you think that having a slow HDD disadvantages archivers with very high compression/decompression speeds? Even for testing, if the archiver is multithreaded, couldn't your HDD be a bottleneck?

    Do you use 3 partitions to make your tests reproducible? 3 HDDs? A mix of that?


    AiZ

  6. #36 Skymmer (Member)
    Quote Originally Posted by Bulat Ziganshin View Post
    Why not use a 128/256 MB dictionary? It needs just 1400-2500 MB of RAM to compress.
    I have only 1 GB of RAM, and a quarter of it is eaten by MS stuff.

  7. #37 schnaader (Programmer)
    Some more Precomp + 7-Zip results from me. As I said, I'm using a developer version, 0.4.1dev, which is in fact not that different from 0.4; the output should even be the same except for the header. The command line used is "precomp -slow -t-j -v".

    The data was split into 700 MB pieces, stored with 7-Zip, processed by SREP and compressed using 7-Zip Ultra (GUI settings); a small reproduction sketch follows at the end of this post.

    Code:
    Total PCF size: 5183575996
    SREP:           3460922640
    SREP + 7z:      842807881
    I didn't try any tuning so far, and as the result is only 3 MB smaller than Skymmer's, it doesn't make much sense.

    By the way, I also tested PCF + 7z without the SREP stage, which gives a much worse result: 869,294,969 bytes.
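
    For anyone who wants to script a chain like this, here is a minimal Python sketch of the Precomp -> 7-Zip (store) -> SREP -> 7-Zip pipeline. It assumes precomp, 7z and srep are on the PATH; the Precomp switches are the ones quoted above, while the intermediate file names, the store step and the final -mx=9 pass (standing in for the Ultra GUI preset) are my own placeholders, not the exact settings used for the results in this post.
    Code:
    # Hedged sketch of a Precomp -> 7-Zip (store) -> SREP -> 7-Zip chain.
    # Assumes precomp, 7z and srep are installed and on the PATH; all file
    # names below are placeholders.
    import subprocess

    INPUT = "vm_image"   # the ~4 GB test file (placeholder name)

    def run(cmd):
        print(">", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Precomp with the switches quoted above; by default it should write
    #    INPUT + ".pcf" (the output name is an assumption).
    run(["precomp", "-slow", "-t-j", "-v", INPUT])

    # 2. Store the .pcf data into a 7z container without compression
    #    (-mx=0 = store); splitting into 700 MB volumes would add -v700m.
    run(["7z", "a", "-mx=0", "stage1.7z", INPUT + ".pcf"])

    # 3. Long-range deduplication with SREP (srep <infile> <outfile>).
    run(["srep", "stage1.7z", "stage2.srep"])

    # 4. Final LZMA pass; -mx=9 stands in for the "Ultra" GUI preset.
    run(["7z", "a", "-mx=9", "final.7z", "stage2.srep"])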

  8. #38 Skymmer (Member)
    Quote Originally Posted by schnaader View Post
    I didn't try any tuning so far, and as the result is only 3 MB smaller than Skymmer's, it doesn't make much sense.
    3 MB? You probably haven't seen my latest achievement (post #33).

    Anyway, I have a new one. Now it's 777 MiB (815 006 956 bytes).
    The chain is now:
    PreComp 0.3.8 -> 7z -m0=BCJ2 -> SREP -> 7z -m0=lzma:d=64m:lc=3:fb=273:mf=bt4:mc=10000

  9. #39 schnaader (Programmer)
    Quote Originally Posted by Skymmer View Post
    3 MB? You probably haven't seen my latest achievement (post #33).
    I've seen it - actually, that's why I stopped trying - you are doing very well and I'd just be wasting my time for results that are only 3 MB better.

    Quote Originally Posted by Skymmer View Post
    Anyway, I have a new one. Now it's 777 MiB (815 006 956 bytes).
    The chain is now:
    PreComp 0.3.8 -> 7z -m0=BCJ2 -> SREP -> 7z -m0=lzma:d=64m:lc=3:fb=273:mf=bt4:mc=10000
    Nice - the perfect size for 7-Zip, until it gets down to 777 777 777 bytes.

  10. #40 Bulat Ziganshin (Programmer)
    Quote Originally Posted by AiZ View Post
    Don't you think that having a slow HDD disadvantages archivers with very high compression/decompression speeds? Even for testing, if the archiver is multithreaded, couldn't your HDD be a bottleneck?

    Do you use 3 partitions to make your tests reproducible? 3 HDDs? A mix of that?
    My main interest was to show the optimal decompressors, and I reached this goal by comparing testing times.

    Using HDDs makes tests highly unrepeatable, which is why I don't push the comparison of the fastest modes too far. Nevertheless, I moved all test data to a dedicated partition that is just 1/10 of my HDD. It seems that this dramatically reduced seek times, and now compressors are limited only by linear HDD read/write speeds (which still seriously limit some compressors).

    The best way to benchmark the fastest compressors is to use a RAM drive, but I don't have the budget to buy more RAM.


    So, I've updated the results of most compressors using the new testing environment. Precomp, CSC and 7-zip are still waiting in the queue. As you can see, fast compressors now show much better results.

    http://freearc.org/HFCB.aspx

  11. #41 m^2 (Member)
    Bulat, can you add an option to download the results as a CSV?
    I'd like to load them into a spreadsheet, so I can sort them by various parameters.
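
    As an illustration of what I mean, once such a CSV exists it could be sorted offline with a few lines of Python; the file name and column names below ("compressor", "ctime", "dtime", "size") are hypothetical placeholders, not the actual format of any benchmark export.
    Code:
    # Hedged sketch: sort benchmark results from a CSV by an arbitrary column.
    # The file name and column names are hypothetical placeholders.
    import csv

    with open("hfcb_results.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Sort by compressed size, smallest first; change the key to sort by
    # compression or decompression time instead.
    rows.sort(key=lambda r: int(r["size"]))

    for r in rows:
        print(r["compressor"], r["size"], r["ctime"], r["dtime"])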

  12. #42 AiZ (Member)
    Hi!

    Quote Originally Posted by Bulat Ziganshin View Post
    So, I've updated the results of most compressors using the new testing environment. Precomp, CSC and 7-zip are still waiting in the queue. As you can see, fast compressors now show much better results.
    Many thanks for your time and your dedication. And... Yeah, those 0.60 experimental results look really really promising!

    Have a nice day,


    AiZ

  13. #43 Member (USA)
    Quote Originally Posted by schnaader View Post
    Interesting, I didn't know 7-Zip had some detection based on the file extension. It doesn't feel right though; it would be better to depend on the actual content rather than on filenames or extensions... How big is the difference?
    I reported this as a bug a while back, but Igor preferred to call it a "missing feature" and said that he'll implement it eventually.

  14. #44 Bulat Ziganshin (Programmer)
    Quote Originally Posted by m^2 View Post
    Bulat, can you add an option to download the results as a CSV?
    I'd like to load them into a spreadsheet, so I can sort them by various parameters.
    updated http://freearc.org/download/testing/benchmarking.zip

  15. #45 m^2 (Member)
    Quote Originally Posted by Bulat Ziganshin View Post
    updated http://freearc.org/download/testing/benchmarking.zip
    Thanks.

  16. #46 Bulat Ziganshin (Programmer)
    Updated the CSC/7-zip results (faster I/O due to the 50 GB partition) and added more 7-zip modes (multithreaded LZMA2 and bzip2).

  17. #47 Cyan (Member)
    Am I the only one having problems downloading the VM image file?
    It always ends up being a corrupted archive on my laptop.

    Maybe an alternate source?

  18. #48 Bulat Ziganshin (Programmer)
    Just downloaded and tested - it's OK for me.

    But still - http://freearc.org/download/testing/vm.7z

  19. #49 Cyan (Member)
    Still no luck.

    Maybe it's the proxy that ruins the end of the file...

  20. #50 Shelwien (Administrator)
    @Bulat:
    maybe post a torrent for it or some recovery data like
    http://en.wikipedia.org/wiki/Par2 ?
    HTTP is not really that reliable for large files.
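
    For what it's worth, with par2cmdline the recovery data could be created and used roughly like this (a sketch only, assuming the par2 executable is installed; the 10% redundancy is an arbitrary example value):
    Code:
    # Hedged sketch: create and use PAR2 recovery data via par2cmdline.
    # Assumes "par2" is on the PATH; 10% redundancy is an arbitrary example.
    import subprocess

    # Server side: create vm.7z.par2 (plus recovery volumes) next to vm.7z.
    subprocess.run(["par2", "create", "-r10", "vm.7z"], check=True)

    # Downloader side: verify the file; par2 returns non-zero if the data is
    # damaged, in which case a repair can be attempted.
    if subprocess.run(["par2", "verify", "vm.7z.par2"]).returncode != 0:
        subprocess.run(["par2", "repair", "vm.7z.par2"], check=True)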

  21. #51 Bulat Ziganshin (Programmer)
    Quote Originally Posted by Cyan View Post
    Still no luck.

    Maybe it's the proxy that ruins the end of the file...
    Are you using any download manager? I think that using a good download manager should solve the problem - for example, the free http://www.freedownloadmanager.org/download.htm

  22. #52 Bulat Ziganshin (Programmer)
    HFCB updated:
    • added 7-zip results with -m0=bzip2 (+bcj/bcj2)
    • all results updated with testing on the 50 GB partition
    • improved FreeArc results due to faster CRC calculation
    • added results with precomp:slow (797 MB in 38 hours!!!)

  23. #53 PiPPoNe92 (Member)
    Yes!!! Finally FreeArc beats NanoZIP 0.7!!!!! Well done, Bulat Ziganshin!!
    Can you tell me the command line that you used for it?

    "-m=precomp:slow:t-j:v+srep+delta+7z:x9:d256m"

    I don't understand...
    Did you use precomp? Then srep? delta? And at the end, 7z set to "-x9 -d256m"?

    Excuse me, Bulat.

  24. #54 Skymmer (Member)
    Quote Originally Posted by Bulat Ziganshin View Post
    HFCB updated:
    • added results with precomp:slow (797 MB in 38 hours!!!)
    What? 38 hours??? Hmmm... Bulat, I presume there's something wrong with your system. In my tests with my chain, PreComping took about 2 hours on my (slower) system, and all the other chain parts also took no more than 2 hours each.

  25. #55 Skymmer (Member)
    And by the way, I have a new personal record. Now it's 779 709 433 bytes.
    The chain is:
    Code:
    PreComp 0.3.8 -> 7z -m0=BCJ2 -> SREP -> NanoZIP v0.07 -nm -cO -m680m

  26. #56 schnaader (Programmer)
    Quote Originally Posted by Skymmer View Post
    What? 38 hours??? Hmmm... Bulat, I presume there's something wrong with your system. In my tests with my chain, PreComping took about 2 hours on my (slower) system, and all the other chain parts also took no more than 2 hours each.
    The difference between Bulat's result (Precomp 0.4) and yours (Precomp 0.3.8) can have several reasons:

    - Slow HDD (I made a test on an external USB 2.0 drive today; it was 20 times slower (!) than on the internal drive)
    - Debug mode (especially if the output isn't piped to a file)
    - Precomp 0.4 uses recursion

    As I also tested with a 0.4.1 version and had to wait very long too (definitely longer than 5 hours) - although my PC wasn't idle and it's not a very fast one - I think Bulat's timing can be correct.

    I think the third point (recursion) is the most important. To understand it, consider what Precomp has to do when slow mode and recursion are combined: it won't only search for zLib streams everywhere in the original file, but also in their decompressed variants, so instead of processing 4 GB it will process ~8-10 GB, and it'll slow down even more because of the additional streams that are found.

    So, 38 hours is very extreme and I agree that something could have gone wrong there (the system wasn't idle all the time, or some weird errors occurred), but I could also understand it if it's correct. I hope I'll get rid of the temporary files soon so that things like this will improve.

  27. #57 Skymmer (Member)
    schnaader, thanks for the clarification!

    Quote Originally Posted by schnaader View Post
    I hope I'll get rid of the temporary files soon so that things like this will improve.
    Nice idea, but I think it should be optional. For example, I have one map file from Modern Warfare 2; when I Precomp it, the overall size of the temp files is about 1 GB, so it can lead to memory overflow for users with low memory. I suggest putting the temp files into memory by default, but also introducing an option to disable it or to control the maximum memory consumption.

  28. #58 schnaader (Programmer)
    Quote Originally Posted by Skymmer View Post
    Nice idea, but I think it should be optional. For example, I have one map file from Modern Warfare 2; when I Precomp it, the overall size of the temp files is about 1 GB, so it can lead to memory overflow for users with low memory. I suggest putting the temp files into memory by default, but also introducing an option to disable it or to control the maximum memory consumption.
    I also thought about this problem; it can be avoided by letting the user set a maximum memory size (default 64 MB) and switching to a temporary file only if this size is exceeded.

    Anyway, for most streams like zLib, this perhaps won't be necessary as they are... well... streams, and we can always keep only a small part of them in memory, try to decompress, recompress and then proceed to the next part without exceeding 1 MB of memory usage, regardless of how big the complete stream is. This should work for zLib and bZip2 streams, perhaps for GIF, but won't be possible for PackJPG.
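
    Incidentally, the "keep it in memory up to a limit, then spill to a temporary file" behaviour can be sketched in a few lines; this is just an illustration of the idea (using Python's SpooledTemporaryFile and the 64 MB default mentioned above), not Precomp's actual implementation.
    Code:
    # Hedged sketch of the memory-then-temp-file idea from the post above.
    # The 64 MB threshold mirrors the suggested default; this is an
    # illustration, not Precomp code.
    import tempfile

    MAX_IN_MEMORY = 64 * 1024 * 1024   # spill to disk above 64 MB

    def buffer_stream(chunks):
        """Collect decompressed chunks, transparently spilling to a temporary
        file once the in-memory limit is exceeded."""
        buf = tempfile.SpooledTemporaryFile(max_size=MAX_IN_MEMORY)
        for chunk in chunks:
            buf.write(chunk)
        buf.seek(0)   # rewind so the data can be re-read for recompression
        return buf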

  29. #59 Cyan (Member)
    At last!
    I could get the vm.7z file fully downloaded and have a try with it.

    Zhuff compressed it to 1 552 485 185 bytes,
    which is within range of zip -1 (1,519,313,253 bytes).
    Regarding time, gzip -1 needs 105 s on my rig, while Zhuff needs 30 s.

    This looks consistent with other results (using other files).

  30. #60 Sportman (Member)
    I used this huge-file compression benchmark's test file to compare the same quick archivers I tested in the Zhuff thread (a timing sketch follows the table).

    Input file: vm ubuntu 4,244,176,896 bytes

    archiver - comp. time - decomp. time - output size

    tor64 -1 - 18.976 sec - 17.565 sec - 2,012,529,981 bytes
    lzbw1 -c - 21.250 sec - 26.743 sec - 1,957,122,960 bytes
    6pack -1 - 26.121 sec - 15.530 sec - 1,934,108,140 bytes
    6pack -2 - 25.230 sec - 15.555 sec - 1,866,119,327 bytes
    qpress -L1 - 11.836 sec - 8.800 sec - 1,843,834,143 bytes
    lz4 -c - 23.108 sec - 8.374 sec - 1,815,887,459 bytes
    lzop -1 - 42.104 sec - 13.257 sec - 1,760,786,279 bytes
    lzp2 -c - 19.013 sec - 12.083 sec - 1,757,474,323 bytes
    blz c - 76.565 sec - 40.131 sec - 1,753,754,620 bytes
    qpress -L2 - 24.908 sec - 10.026 sec - 1,740,120,303 bytes
    thor e1 - 23.008 sec - 44.452 sec - 1,718,282,972 bytes
    tor64 -2 - 23.026 sec - 19.417 sec - 1,712,507,022 bytes
    lzturbo -11 -p1 - 18.230 sec - 13.283 sec - 1,700,061,916 bytes
    qpress -L3 - 63.014 sec - 6.550 sec - 1,684,399,514 bytes
    slug c (1.1b) - 24.361 sec - 28.995 sec - 1,650,988,996 bytes
    lzturbo -21 -p1 - 22.736 sec - 19.508 sec - 1,630,226,554 bytes
    thor e2 - 41.006 sec - 41.377 sec - 1,582,829,280 bytes
    zhuff c - 31.923 sec - 17.614 sec - 1,552,485,185 bytes
    lzturbo -12 -p1 - 28.692 sec - 12.990 sec - 1,522,579,476 bytes
    nz64 a -cf -m32m - 31.318 sec - 32.601 sec - 1,493,046,406 bytes
    nz64 a -cf -m3072m - 31.994 sec - 33.894 sec - 1,493,043,977 bytes
    lzturbo -31 -p1 - 31.623 sec - 31.618 sec - 1,470,987,594 bytes
    lzturbo -22 -p1 - 34.542 sec - 19.622 sec - 1,454,793,129 bytes
    lzturbo -13 -p1 - 77.845 sec - 12.513 sec - 1,451,031,642 bytes - compare fail
    thor e3 - 51.360 sec - 50.527 sec - 1,443,965,860 bytes
    7z a -tzip -mx=1 - 158.512 sec - 55.109 sec - 1,431,120,618 bytes
    arc a -m1 -mt1 - 47.489 sec - 48.467 sec - 1,391,007,213 bytes
    tor64 -3 - 48.971 sec - 33.472 sec - 1,391,006,982 bytes
    lzturbo -23 -p1 - 86.731 sec - 19.138 sec - 1,378,822,094 bytes - compare fail
    thor e5 - 160.854 sec - 43.731 sec - 1,365,175,924 bytes
    rar a -m1 -mt1 - 127.551 sec - 55.853 sec - 1,354,555,696 bytes
    slug c - 45.772 sec - 51.386 sec - 1,349,861,645 bytes
    lzturbo -32 -p1 - 46.709 sec - 28.868 sec - 1,342,743,214 bytes
    tor64 -4 - 54.959 sec - 31.820 sec - 1,317,844,770 bytes
    thor e4 - 149.559 sec - 62.067 sec - 1,296,040,932 bytes
    sx c - 51.213 sec - 62.529 sec - 1,292,083,492 bytes
    nz64 a -cF -m3072m - 55.059 sec - 78.609 sec - 1,277,538,548 bytes
    lzturbo -33 -p1 - 118.328 sec - 27.857 sec - 1,276,226,424 bytes - compare fail
    tor64 -5 - 105.116 sec - 43.117 sec - 1,239,697,601 bytes
    nz64 a -cd -m32m - 82.566 sec - 30.363 sec - 1,212,994,979 bytes
    nz64 a -cd -m1024m - 140.081 sec - 30.266 sec - 1,174,859,420 bytes
    tor64 -6 - 191.066 sec - 41.219 sec - 1,166,288,638 bytes
    nz64 a -cD -m32m - 168.116 sec - 42.925 sec - 1,148,230,331 bytes
    arc a -m2 -mt1 - 187.358 sec - 56.424 sec - 1,147,751,803 bytes
    flashzip a -m0 -c7 -b8 - 245.518 sec - 211.985 sec - 1,123,527,144 bytes
    nz64 a -cD -m1024m - 291.452 sec - 41.538 sec - 1,098,183,262 bytes
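
    For reference, here is a minimal sketch of the kind of wall-clock measurement behind a table like this: time an external compressor and decompressor and record the output size. The command lines are placeholders (gzip -1, as used elsewhere in this thread), and this is not the harness actually used for the numbers above.
    Code:
    # Hedged sketch: wall-clock timing of an external (de)compressor plus
    # output size, in the spirit of the table above.  Command lines are
    # placeholders; substitute the real archiver invocations.
    import os
    import subprocess
    import time

    def timed(cmd):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        return time.perf_counter() - start

    comp_time = timed(["gzip", "-1", "-k", "vm_image"])            # -k keeps the input (gzip >= 1.6)
    size = os.path.getsize("vm_image.gz")                          # output size in bytes
    decomp_time = timed(["gzip", "-d", "-k", "-f", "vm_image.gz"]) # -f overwrites the kept original

    print(f"comp. {comp_time:.3f} sec - decomp. {decomp_time:.3f} sec - {size:,} bytes")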
