Thread: zpaq updates

  1. #1
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts

    zpaq updates

    http://mattmahoney.net/dc/#zpaq

    zpaq version 1.03 updates:
    - Uses mid.cfg as a default configuration file.
    - Does not store path names by default (just the file name). Use "r" command to override. 1.02 and earlier stored full paths by default.
    - Will not extract to absolute paths or paths containing "../" or "..\" unless you specify the destination during extraction. (Safety feature suggested by Yuri Grille).
    - No longer tries to recover from file open errors when compressing or extracting more than one file.
    - Won't trash an archive if you try to compress a nonexistent file.
    - Fixed "s" command to dump the whole header, which is more convenient only if you are writing a compressor.
    - Supports splitting a file into separately compressed segments ("k" command). Added support for this to unzpaq 1.03. It was described as a recommendation in the reference but never implemented. (I am planning a use for this.)
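
    A minimal sketch of the kind of check the extraction-safety item above describes, written in Python rather than taken from the zpaq source. The function name is_safe_member_path is hypothetical, and the rule used here (reject absolute paths, drive letters, and any ".." component) is an assumption that is slightly stricter than the "../" and "..\" wording in the release note.

```python
def is_safe_member_path(name: str) -> bool:
    """Return True if a stored name looks safe to extract without an explicit destination."""
    # Treat both separators alike, since archives may come from Windows or Unix.
    normalized = name.replace("\\", "/")
    # Absolute Unix-style path.
    if normalized.startswith("/"):
        return False
    # Windows drive letter such as "C:".
    if len(normalized) >= 2 and normalized[0].isalpha() and normalized[1] == ":":
        return False
    # Any ".." component could climb out of the extraction directory.
    return ".." not in normalized.split("/")


for candidate in ["book1", "docs/book1", "/etc/passwd", "..\\evil.txt", "C:\\x"]:
    print(candidate, is_safe_member_path(candidate))
```

    An extractor built this way would refuse the unsafe names unless the user supplies the destination explicitly, which matches the behaviour described above.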

    There is no change in the ZPAQ spec, compression, or configuration file format, so there is no need to update any benchmarks. However, I did manage to equal paq8px on the generic benchmark with a simple config file. http://mattmahoney.net/dc/uiq/

  3. #2
    Programmer Jan Ondrus's Avatar
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    Can't find compiled version. zpaq.exe missing in http://mattmahoney.net/dc/zpaq103.zip

  4. #3
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts

  5. #4
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Wow, I had never tested zpaq before! Today I tested it. Fantastic compression! It is better than 7z and takes only 278 MB of RAM. Good. Why not increase the amount of RAM it uses?

  6. #5
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Thanks Matt!

  7. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    zpaq is configurable. You describe the compression algorithm in a config file which is stored in the archive. You can do all the usual tradeoffs between speed, memory, and size, and use a custom algorithm for each file. I'm still playing around with it. I haven't quite beaten paq8k2 on the generic test yet.
    http://mattmahoney.net/dc/uiq/

    The real purpose is to have a standard format that won't break every time you find a better algorithm. It is based on a PAQ-like architecture where you can arrange the components how you like and specify arbitrary contexts and preprocessing steps in a hard-to-program language called ZPAQL. Your new algorithm will still decompress with older versions of the decompressor.

    Of course, a good compressor should do this automatically: examine the file and pick the best algorithm for it. But I'm not quite there yet.
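
    For readers new to PAQ-style designs: each component outputs a probability that the next bit is 1, and a mixer combines those predictions. The sketch below is generic logistic mixing in floating point, not ZPAQ's own mix component. The class name, learning rate, and initial weights are assumptions chosen for illustration.

```python
import math


def stretch(p: float) -> float:
    """Map a probability in (0, 1) to the logistic domain."""
    return math.log(p / (1.0 - p))


def squash(x: float) -> float:
    """Inverse of stretch: map a real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class LogisticMixer:
    """Combine several bit predictions with weights that adapt online."""

    def __init__(self, n_inputs: int, rate: float = 0.05):
        self.weights = [1.0 / n_inputs] * n_inputs
        self.rate = rate
        self.inputs = [0.0] * n_inputs
        self.p = 0.5

    def predict(self, probs):
        # Weighted sum in the stretched domain, then back to a probability.
        self.inputs = [stretch(p) for p in probs]
        self.p = squash(sum(w * x for w, x in zip(self.weights, self.inputs)))
        return self.p

    def update(self, bit: int) -> None:
        # Move each weight in the direction that reduces the prediction error.
        err = bit - self.p
        self.weights = [w + self.rate * err * x
                        for w, x in zip(self.weights, self.inputs)]
```

    Calling predict() with the component outputs and then update() with the actual bit, once per coded bit, shifts weight toward whichever components have been predicting well.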

  8. #7
    Programmer Jan Ondrus's Avatar
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    I modified the max.cfg configuration file:
    SFC result:
    max.cfg: 10427843 bytes (used 278 MB memory)
    max2.cfg: 10355775 bytes (used 282 MB memory)
    Attached Files

  9. #8
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Yes! More RAM! Thanks, Jan Ondrus!

  10. #9
    Programmer Jan Ondrus's Avatar
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    Quote: Originally posted by PiPPoNe92
    Yes! More RAM! Thanks, Jan Ondrus!
    Here is max3.cfg using 551 MB (I can't test this one, my PC has only 512MB RAM)
    Attached Files

  11. #10
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Wow! Compression improved with 551 MB of RAM! The program doesn't crash, Jan! All fine.

  12. #11
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Nice news! Matt, thanks for the update! Jan, thanks for the new configs!
    Here are a couple of tests.
    Code:
    mp_military_2.db from S.T.A.L.K.E.R.: Clear Sky game
    
    min     21 746 902      22.616       4.264 MB
    mid     15 807 269     186.676     111.494 MB
    max     13 968 889     536.135     278.474 MB
    max2    13 952 512     614.461     281.620 MB
    max3    13 700 708     620.041     550.778 MB
    Code:
    UnTARed files from TestBed.tar
    
              default        post p         post x
    ------------------------------------------------
    min      9 670 834     9 670 834     10 143 255
    mid      7 718 294     7 755 997     ----------
    max      7 140 115     7 201 914      7 211 659
    max2     7 090 131     7 140 455      7 163 297
    max3     7 090 369     7 140 010      7 163 715
    In the last test, max3.cfg produced slightly worse results than max2.cfg in the default and post x columns (it is marginally better with post p).

  13. #12
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post

    Talking

    In my test with max3.cfg (I compressed a .7z file), compression improved...
    Last edited by PiPPoNe92; 11th September 2009 at 13:10.

  14. #13
    Member
    Join Date
    Feb 2009
    Location
    Cicero, NY
    Posts
    9
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I've posted some zpaq configs I've made for three compression benchmarks.

    uiq2

    I've made small changes to Matt's u2.cfg file which was tuned for the UIQ2 benchmark.

    PPMonstr Compressed uiq2_data = 3005994 bytes
    zpaq compressed uiq2_data = 2864442 bytes
    Ratio = 2864442/3005994 = .9529

    Run the following with the included config:
    zpaq.exe cuiq2_ver2.cfg uiq2_ver2.zpaq uiq2_data v

    I will probably continue to push on this config a little longer to see if it can close in on the top of the uiq2 benchmark, ratio = .9460.

    I made a config for enwik8 and enwik9 months ago that improved the compression of max.cfg. Someone may find these useful, included in post.

    enwik8
    Uses 2002.732 MB, 1201.12 sec. on:
    CPU Intel Q6800 @ 2.93 GHz RAM = 3.5 GB
    enwik8 100000000 -> 18238435, Compression = (1-18238435/100000000)*100 = 81.76%

    enwik9
    Uses 1994.376 MB, 11961.43 sec. on:
    CPU Intel Q6800 @ 2.93 GHz RAM = 3.5 GB
    enwik9 1000000000 -> 149376058, Compression = (1-149376058/1000000000)*100 = 85.06%
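
    The two figures quoted in this post (the size ratio against PPMonstr and the compression percentage) can be rechecked with a few lines of Python. The function names are made up for this sketch; the byte counts are the ones given above.

```python
def size_ratio(candidate_bytes: int, baseline_bytes: int) -> float:
    """Compressed size of the candidate relative to a baseline compressor."""
    return candidate_bytes / baseline_bytes


def space_saving_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Percentage of the original size removed by compression."""
    return (1 - compressed_bytes / original_bytes) * 100


print(round(size_ratio(2864442, 3005994), 4))                      # 0.9529
print(round(space_saving_percent(100_000_000, 18_238_435), 2))     # 81.76
print(round(space_saving_percent(1_000_000_000, 149_376_058), 2))  # 85.06
```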

    Let me know if these results are not repeatable on another machine.

    Thanks,

    Mike
    Attached Files
    Last edited by russelms; 11th September 2009 at 15:36.

  15. #14
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Your config files are good only for enwik8 and enwik9; for other files the compression ratio is worse than with the original cfg.

    P.S. On my PC your configs run without crashing. I have a quad-core 2.5 GHz with 2 GB RAM.

  16. #15
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    zpaq has an E8E9 transform. I changed max3.cfg:
    Code:
    post
      0
    end
    to
    Code:
    post
      x
    end
    With these results:

    Code:
    1,184,197 acrord32.exe.max3
    1,112,307 acrord32.exe.max3x
    1,548,925 mso97.dll.max3
    1,506,743 mso97.dll.max3x
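
    For anyone wondering what an E8E9 transform does: x86 CALL (E8) and JMP (E9) instructions store a relative target, so calls to the same function look different at every call site. Rewriting the operand as an absolute address makes them repeat, which helps the model. Below is a generic Python sketch of the idea, not the zpaq postprocessor itself; the 24-bit arithmetic and the 00/FF top-byte test are assumptions of this sketch.

```python
def _is_candidate(buf: bytearray, i: int) -> bool:
    # E8/E9 opcode followed by a 32-bit operand whose top byte is 00 or FF,
    # i.e. a plausible short-range call or jump.
    return buf[i] in (0xE8, 0xE9) and buf[i + 4] in (0x00, 0xFF)


def e8e9_encode(data: bytes) -> bytes:
    """Replace relative operands with position-based absolute ones (mod 2^24)."""
    buf = bytearray(data)
    # Walk from back to front so the decoder can undo the steps front to back.
    for i in range(len(buf) - 5, -1, -1):
        if _is_candidate(buf, i):
            rel = buf[i + 1] | buf[i + 2] << 8 | buf[i + 3] << 16
            absolute = (rel + i) & 0xFFFFFF
            buf[i + 1:i + 4] = absolute.to_bytes(3, "little")
    return bytes(buf)


def e8e9_decode(data: bytes) -> bytes:
    """Inverse of e8e9_encode: subtract the position again."""
    buf = bytearray(data)
    for i in range(0, len(buf) - 4):
        if _is_candidate(buf, i):
            absolute = buf[i + 1] | buf[i + 2] << 8 | buf[i + 3] << 16
            rel = (absolute - i) & 0xFFFFFF
            buf[i + 1:i + 4] = rel.to_bytes(3, "little")
    return bytes(buf)
```

    Each step only changes the three middle operand bytes and never the bytes it tests, so undoing the steps in the opposite order restores the input exactly, even when candidate patterns overlap.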

  17. #16
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Fantastic! Where can I download it so I can test it?

  18. #17
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Quote: Originally posted by PiPPoNe92
    Fantastic! Where can I download it so I can test it?
    Quote: Originally posted by Matt Mahoney
    zpaq has an E8E9 transform. I changed max3.cfg:
    Code:
    post
      0
    end
    to
    Code:
    post
      x
    end
    ...
    Last edited by Simon Berger; 11th September 2009 at 22:37.

  19. #18
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    Download max2.cfg or max3.cfg above and change the 0 to x in the POST section.

  20. #19
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    OK, done. I increased the required RAM to 1500 MB and compression improved greatly. Thanks.

  21. #20
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    Bill Pettis has raised the bar again. http://mattmahoney.net/dc/uiq/

    Edit: compression takes 3 hours and 1460 MB with paq8k3 -9u, 5 hours with -9.
    Last edited by Matt Mahoney; 14th September 2009 at 15:36.

  22. #21
    Member
    Join Date
    Aug 2009
    Location
    Bari
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Where is paq8k3, so I can test it? Thanks in advance.

  23. #22
    Member
    Join Date
    May 2009
    Location
    China
    Posts
    36
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Angry

    How do I download paq8k3? Where?

  24. #23
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    http://ilovemyking.googlepages.com/paq8k3.cpp
    http://ilovemyking.googlepages.com/paq8k3.exe

    Bill Pettis posted the links on facebook. Didn't see them anywhere else.

  25. #24
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    After 6 months I finally got around to benchmarking zpaq. http://mattmahoney.net/dc/text.html#1652

    max.cfg has about the same compression as paq9a except that it is 3 times slower and uses 1/6 as much memory. For now, zpaq is on the memory Pareto frontier, but that will probably change when I start testing models with more memory (max2, max3, max_enwik9, some I will write myself). Lots more to do.

    It will also be nice to add a dictionary transform to improve text compression. The ZPAQL code should not be too hard, but encoding the dictionary for best compression is another matter.
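
    To make the dictionary-transform idea concrete, here is a toy Python sketch: the most frequent words in an ASCII text are replaced by single bytes in the range 128-255. It is not DRT or xwrt and does not address how to store the dictionary compactly. The function names, the 128-word limit, and the letters-only word pattern are all assumptions of this sketch.

```python
import re
from collections import Counter

WORD = re.compile(rb"[A-Za-z]+")


def build_dictionary(text: bytes, max_words: int = 128):
    """Pick the most frequent words of two or more letters."""
    counts = Counter(WORD.findall(text))
    common = [w for w, _ in counts.most_common() if len(w) > 1]
    return common[:max_words]


def encode(text: bytes, words) -> bytes:
    """Replace dictionary words with one byte each (128 + index)."""
    if any(b >= 128 for b in text):
        raise ValueError("this toy transform only accepts 7-bit ASCII input")
    codes = {w: bytes([128 + i]) for i, w in enumerate(words)}
    return WORD.sub(lambda m: codes.get(m.group(0), m.group(0)), text)


def decode(data: bytes, words) -> bytes:
    """Expand every byte >= 128 back into its dictionary word."""
    out = bytearray()
    for b in data:
        if b >= 128:
            out += words[b - 128]
        else:
            out.append(b)
    return bytes(out)
```

    The transformed text would then be fed to the compressor, and the decoder side needs the same word list, which is where the "encoding the dictionary" problem mentioned above comes in.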

  26. #25
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    I wasn't able to get max_enwik8.cfg or max_enwik9.cfg to run on my computer (out of memory, I have 3 GB). However I did modify max3.cfg to max4.cfg using the same model with about 1.4 GB. I also improved the compression using a dictionary preprocessor (drt from lpaq9m).

    http://mattmahoney.net/dc/text.html#1572

    The drt result is not listed in the main table. I added a rule that each compressor can only be listed once, so there are no separate listings for xwrt|ppmonstr even though it gets better compression than either program by itself. drt is already combined with lpaq9m, so I didn't add drt|zpaq to the main table. It would go from 9th to 5th if I did.

    My next projects will probably be to remove a bit of memory from max_enwik9 and build in a dictionary preprocessor. It looks to me like max_enwik8 and max_enwik9 are the same?

    Edit: I added your results. zpaq still moves to 5th.
    Last edited by Matt Mahoney; 17th September 2009 at 18:57.

  27. #26
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts

    ZPAQ results!

    I believe that ZPAQ is a great improvement compared to PAQ!
    Nevertheless, LPAQ is also a great project!

  28. #27
    Member
    Join Date
    May 2008
    Location
    Earth
    Posts
    115
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote: Originally posted by Matt Mahoney
    My next projects will probably be to remove a bit of memory from max_enwik9 and build in a dictionary preprocessor.

    Possibly a better (but much more laborious) way is to make a ZPAQL backend for some C++ compiler.

  29. #28
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    Optimizing ZPAQL is a hard problem. One idea I had in designing the language was to use a restricted subset of x86 that could be executed directly but could be checked to have no instructions accessing code or data outside its sandbox. But then do I use 32 or 64 bit code? And you still need an interpreter for non x86 machines.

    I am also thinking about languages that would have longer instructions, so fewer to interpret for better speed. It would be nice to have a language easier to write code in, maybe a subset of C that could also be compiled. Not an easy problem.

  30. #29
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    Some more test results on max_enwik9.

    Original: enwik8 -> 18,238,435, 2002 MB
    Reduced memory -> 18,249,266, 1952 MB, 1900 sec.
    enwik8.drt -> 18,106,404, 1133 sec.

    I reduced the memory by changing 22 to 21 in max_enwik9 here: 6 mix 21 0 6 49 255
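
    A rough check of why that one-digit change saves about 50 MB. The sketch assumes the arguments of "mix 21 0 6 49 255" are size bits, first input, number of inputs, rate, and mask, and that each weight takes 4 bytes; both are assumptions here, so treat the numbers as an estimate.

```python
def mix_weight_bytes(size_bits: int, n_inputs: int, bytes_per_weight: int = 4) -> int:
    """Estimated size of a mixer's weight table: 2^size_bits weight sets."""
    return (1 << size_bits) * n_inputs * bytes_per_weight


before = mix_weight_bytes(22, 6)
after = mix_weight_bytes(21, 6)
print(before, after, (before - after) / 1e6)  # 100663296 50331648 50.331648
```

    Under those assumptions the saving is about 50.3 MB, in line with the drop from 2002 MB to 1952 MB reported above.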

  31. #30
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,254
    Thanks
    305
    Thanked 775 Times in 484 Posts
    I improved enwik8.drt to 17,935,357 bytes. DRT encodes words using bytes in the range 128-255 so I modified the code to compute the order 0 and 1 word hashes (components 13-14) to accept all characters 65 or higher and made it case sensitive. Here is the config file.

    Code:
    comp 5 9 0 3 26 (hh hm ph pm n)
      0 const 158
      1 icm 5
      2 isse 12 1
      3 isse 17 2
      4 isse 19 3
      5 icm 20
      6 mix 21 0 6 49 255
      7 isse 21 6
      8 isse 21 7
      9 isse 22 8
      10 isse 22 9
      11 icm 22
      12 match 24 25
      13 icm 20
      14 isse 22 13
      15 icm 9
      16 icm 9
      17 mix 17 0 17 13 255
      18 icm 12
      19 mix 16 0 19 6 255
      20 mix 8 0 19 41 255
      21 mix2 0 19 20 73 0
      22 sse 8 20 8 255
      23 mix2 8 21 22 36 255
      24 sse 21 23 17 255
      25 mix2 0 23 24 85 0
    hcomp
      c++ *c=a b=c a=0 (save in rotating buffer)
      d= 2 hash *d=a b-- hash
      d++ hash *d=a b--
      d++ hash *d=a b--
      d++ hash *d=a b--
      d++ hash hash *d=a (6 mix)
      d++ hash *d=a b--
      d++ hash *d=a b--
      d++ hash b-- hash *d=a b--
      d++ hash b-- hash *d=a b--
      d++ hash b-- hash *d=a b--
      d++ hash *d=a b-- (12 match)
      d++ a=*c (words)
        a< 65 jt 10
        d++ hashd d-- (14 update order 1 word hash)
        *d<>a a+=*d a*= 19 *d=a jmp 9  (13 order 0 word hash)
        a=*d a== 0 jt 3 (order 1 word)
           d++ *d=a d--
        *d=0 d++
      d++ b=c hash *d=a (sparse 2)
      d++ hash *d=a (sparse 3)
      d++ b-- hash *d=a (sparse 4)
      d++ a=0 hash *d=a
      d++ a=*c a<<= 8 *d=a (19 mix)
      d++ a=*c a<<= 14 *d=a (20 mix)
      d++ hash 
      d++ 
      d++ 
      d++ *d=a (24 sse)
      halt
    post
      0  (may be 0 for PASS or x for EXE/DLL (E8E9))
         (if x, set ph=0, pm=3)
    end
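
    The order 0 word hash in the hcomp section above (the line marked "13 order 0 word hash") can be read as the recurrence h = (h + c) * 19, reset to 0 on any byte below 65. Here is a Python sketch of that reading. It covers only the order 0 hash, and the 32-bit wraparound is an assumption about the register width.

```python
MASK32 = 0xFFFFFFFF


def order0_word_hashes(data: bytes):
    """Return the running word hash after each input byte."""
    h = 0
    hashes = []
    for c in data:
        if c >= 65:
            # Word character (letters and DRT codes 128-255): extend the hash.
            h = ((h + c) * 19) & MASK32
        else:
            # Space, digit, punctuation: the word has ended.
            h = 0
        hashes.append(h)
    return hashes


print(order0_word_hashes(b"ab c"))  # [1843, 36879, 0, 1881]
```

    Because no case folding is applied, "A" and "a" hash differently, which is the case-sensitive behaviour described in the post.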
