Thread: mod_CM - another paq submodel

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts

    mod_CM - another paq submodel

    I made a new CM for another purpose (logistic mix of o0..o5, no SSE, 205,997 on book1)
    and decided to test it with paq, in the hope that it's sufficiently different.

    http://nishi.dreamhosters.com/u/paq8p_pmd_CM_v0.rar (source included)
    (also see https://github.com/Shelwien/mod_CM/b...lder_m.inc#L30
    https://encode.ru/threads/2515-mod_ppmd)

    Code:
             book1             wcc386
    paq8p_-- 192227  28.734s   194069  23.469s
    paq8p_-C 191819  33.656s   193960  27.344s
    paq8p_P- 191293  33.968s   193781  27.235s
    paq8p_PC 191151  38.641s   193763  30.828s
    "P" = mod_ppmd present
    "C" = mod_CM present

    The model uses precise 8-byte counters, a binary mixer tree,
    and a relatively new trick of "probability extrapolation", which
    is also applied to mixer weights (a binary mixer weight is actually
    the probability of one of its inputs being better for coding).
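
    To make the "weight as a probability" idea concrete, here is a minimal sketch of a two-input logistic mixer. It is not taken from mod_CM: the struct name Mix2, the learning rate and the plain gradient update are assumptions for illustration, and the "probability extrapolation" trick is not modelled.

```cpp
#include <cassert>
#include <cmath>

// stretch(p) = ln(p/(1-p)); squash is its inverse (the logistic function).
static double stretch(double p) { return std::log(p / (1.0 - p)); }
static double squash(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Hypothetical two-input mixer: w weights input a, (1-w) weights input b.
struct Mix2 {
    double w = 0.5;      // starts undecided between the two inputs
    double rate = 0.02;  // illustrative learning rate, not a tuned value
    double sa = 0.0, sb = 0.0, p = 0.5;

    // Mix two probabilities of "next bit is 1" in the logistic domain.
    double mix(double a, double b) {
        assert(a > 0.0 && a < 1.0 && b > 0.0 && b < 1.0);
        sa = stretch(a);
        sb = stretch(b);
        p = squash(w * sa + (1.0 - w) * sb);
        return p;
    }

    // One gradient step on the coding cost -ln P(bit) with respect to w.
    // w is left unclamped here; how the real model constrains it is not shown.
    void update(int bit) { w += rate * (bit - p) * (sa - sb); }
};
```

    With w = 0.5, mixing 0.9 and 0.5 gives 0.75; if the coded bits keep agreeing with input a, w drifts upward, which is the sense in which w tracks "input a is the better one".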

  3. #2
    Member
    Join Date
    Mar 2011
    Location
    USA
    Posts
    223
    Thanks
    106
    Thanked 102 Times in 63 Posts
    Thanks Shelwien. Can you expand on what you mean by "binary mixer tree"? From the two files you posted it looks like this CM does better than the paq one. Does "mod_CM present" mean that you replaced the paq CM, or did you use both CMs and combine them together? Do you think it makes sense to switch to using this CM in all paq branches? Are there any known issues or tradeoffs?

  4. #3
    Administrator Shelwien's Avatar
    > Thanks Shelwien. Can you expand on what you mean by "binary mixer tree"?

    https://github.com/Shelwien/mod_CM/b...odelC.cpp#L104
    Code:
    p = mix[o1&0xC0FF](  
      mix[o1]( 0.5, o0 ),
      mix[o1]( 
        mix[o2]( 
          mix[o3]( 
            mix[o4]( o5, o4 ), 
            o3 
          ), 
          o2 
        ), 
        o1 
      )
    )
    It's convenient, because each binary mix can have its own context and parameters,
    and I can be certain about the nature of its weight.
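
    A rough sketch of how such a tree could be evaluated, assuming each node is a table of weights indexed by a context value. The names MixNode and MixTree, the 16-bit table size and the initial weight 0.5 are assumptions, hashing of the higher-order contexts is replaced by pre-reduced indices, and weight updates are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

static double stretch(double p) { return std::log(p / (1.0 - p)); }
static double squash(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One tree node: a table of weights, one weight per context value.
struct MixNode {
    std::vector<double> w;
    explicit MixNode(std::size_t n) : w(n, 0.5) {}
    double operator()(uint32_t ctx, double a, double b) const {
        assert(ctx < w.size());
        const double wa = w[ctx];
        return squash(wa * stretch(a) + (1.0 - wa) * stretch(b));
    }
};

// Same shape as the expression above: a chain o5,o4 -> o3 -> o2 -> o1,
// a side branch mixing 0.5 with o0, and a final mix of the two.
struct MixTree {
    MixNode x4{1u << 16}, x3{1u << 16}, x2{1u << 16}, x1{1u << 16};
    MixNode x0{1u << 16}, top{1u << 16};

    // p[k]: prediction of the order-k counter (k = 0..5).
    // c[k]: order-k context already reduced to 16 bits (k = 1..4, c[0] unused).
    double predict(const double p[6], const uint32_t c[5]) const {
        double hi = x4(c[4], p[5], p[4]);
        hi = x3(c[3], hi, p[3]);
        hi = x2(c[2], hi, p[2]);
        hi = x1(c[1], hi, p[1]);
        const double lo = x0(c[1], 0.5, p[0]);
        return top(c[1] & 0xC0FF, lo, hi);
    }
};
```

    Because every node has exactly two inputs, each weight table can be given its own context and its own update parameters without affecting the others.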

    > From the two files you posted it looks like this CM does better than the paq one.

    Yes. At first I actually added the individual submodel predictions to the paq8p mixer,
    and got +5k on the book1 compressed size.

    > Does "mod_CM present" mean that you replaced the paq CM,
    > or did you use both CMs and combine them together?

    Just added another submodel, same as with mod_ppmd.

    > Do you think it makes sense to switch to using this CM in all paq branches?

    Replacing the paq orderN submodel?
    Maybe, but this CM is order5, so it needs some more contexts for that (o16?).

    Switching to this mixing method in general?
    Maybe, but it's completely different from the paq design,
    so it would be hard to port everything.

    > Are there any known issues or tradeoffs?

    My models are explicitly parametric.
    I also have an optimizer script for them and all.
    But that requires more work (and computing time) than the paq method.

    Both faster and stronger though.

    One problem with this CM is that it uses 8-byte counters,
    so it needs more than 8x paq's memory (my hashtable is also more precise, so it has more overhead).
    The idea was to design the most precise counter, then find a way to build FSMs from it,
    but here I just used it as is.

  6. #4
    Administrator Shelwien's Avatar
    Manually adding higher orders was annoying, so I finally made a generator that builds the model from a custom description syntax.
    The current model is order7, 201272 on book1 (paq8p order12 is 202518), still no SSE - just counters and binary mixing.

    This is the script:
    Code:
    Value Cm: 0,             X()
    Count C7: o7_ctx^(z<<8), HT(64),         M_f7
    Count C6: o6_ctx^(z<<8), HT(64),         M_f6
    Count C5: o5_ctx^(z<<8), HT(64),         M_f5
    Count C4: o4_ctx^(z<<8), HT(64),         M_f4
    Count C3: o3_ctx^(z<<8), HT(64),         M_f3
    Count C2: o2_ctx^(z<<0), D(256*256*256), M_f2
    Count C1: o1_ctx^(z<<0), D(256*256),     M_f1
    Count C0: o0_ctx^(z<<0), D(256),         M_f0
    Mixer X7: o6_ctx^(z<<8), HT(64)
    Mixer X6: o5_ctx^(z<<8), HT(64)
    Mixer X0: o4_ctx^(z<<8), HT(64)
    Mixer X1: o3_ctx^(z<<8), HT(64)
    Mixer X2: o2_ctx^(z<<0), D(256*256*256), M_X2
    Mixer X3: o1_ctx^(z<<0), D(256*256),     M_X3
    Mixer X4: o1_ctx^(z),    D(256*256),     M_X4
    Mixer X5: (o1_ctx&0xC000)+z, D(256*256), M_X5
    =====
    X5
      X4(Cm,C0)
      X3
        X2
          X1
            X0
              X6
                X7(C7,C6)
                C5
              C4
            C3
          C2
        C1
    =====
    Stuff like this is generated (there are 3 more files).
    (Note that a Mixer and a Counter with the same context are automatically placed into the same table cell, for better caching.)
    Code:
    C_Hasher0 tC0;
    CM_Hasher<C_Hasher0,M_Hasher0>  tC1;
    CM_Hasher<C_Hasher0,M_Hasher0>  tC2;
    CM_Hasher<C_Hasher,M_Hasher>  tC3;
    CM_Hasher<C_Hasher,M_Hasher>  tC4;
    CM_Hasher<C_Hasher,M_Hasher>  tC5;
    CM_Hasher<C_Hasher,M_Hasher>  tC6;
    C_Hasher tC7;
    M_Hasher0 tX4;
    M_Hasher0 tX5;
    
    C_Holder H_C0( tC0, M_f0P0,M_f0P1, M_f0mw, M_f0C, M_f0wr0, M_f0wr1 );
    C_Holder H_C1( tC1, M_f1P0,M_f1P1, M_f1mw, M_f1C, M_f1wr0, M_f1wr1 );
    C_Holder H_C2( tC2, M_f2P0,M_f2P1, M_f2mw, M_f2C, M_f2wr0, M_f2wr1 );
    C_Holder H_C3( tC3, M_f3P0,M_f3P1, M_f3mw, M_f3C, M_f3wr0, M_f3wr1 );
    C_Holder H_C4( tC4, M_f4P0,M_f4P1, M_f4mw, M_f4C, M_f4wr0, M_f4wr1 );
    C_Holder H_C5( tC5, M_f5P0,M_f5P1, M_f5mw, M_f5C, M_f5wr0, M_f5wr1 );
    C_Holder H_C6( tC6, M_f6P0,M_f6P1, M_f6mw, M_f6C, M_f6wr0, M_f6wr1 );
    C_Holder H_C7( tC7, M_f7P0,M_f7P1, M_f7mw, M_f7C, M_f7wr0, M_f7wr1 );
    
    M_Holder H_X4( tX4, M_X4W0,M_X4WC, M_X4PC, M_X4wr,&Cm,&H_C0.p0 );
    M_Holder H_X7( tC6, M_X7W0,M_X7WC, M_X7PC, M_X7wr,&H_C7.p0,&H_C6.p0 );
    M_Holder H_X6( tC5, M_X6W0,M_X6WC, M_X6PC, M_X6wr,&H_X7.p3,&H_C5.p0 );
    M_Holder H_X0( tC4, M_X0W0,M_X0WC, M_X0PC, M_X0wr,&H_X6.p3,&H_C4.p0 );
    M_Holder H_X1( tC3, M_X1W0,M_X1WC, M_X1PC, M_X1wr,&H_X0.p3,&H_C3.p0 );
    M_Holder H_X2( tC2, M_X2W0,M_X2WC, M_X2PC, M_X2wr,&H_X1.p3,&H_C2.p0 );
    M_Holder H_X3( tC1, M_X3W0,M_X3WC, M_X3PC, M_X3wr,&H_X2.p3,&H_C1.p0 );
    M_Holder H_X5( tX5, M_X5W0,M_X5WC, M_X5PC, M_X5wr,&H_X4.p3,&H_X3.p3 );
    
    for( i=0; i<DIM(hC2); i++ ) hC2[i]=tC2;
    for( i=0; i<DIM(hC1); i++ ) hC1[i]=tC1;
    for( i=0; i<DIM(hC0); i++ ) hC0[i]=tC0;
    for( i=0; i<DIM(hX4); i++ ) hX4[i]=tX4;
    for( i=0; i<DIM(hX5); i++ ) hX5[i]=tX5;
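
    As an illustration of how the tree part of the script (between the ===== lines) can be read, here is a small sketch that turns the indented lines into a nested expression. This is not the actual generator: the function names are made up, and the only rule assumed is that a node's inputs are the following lines with deeper indentation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of leading spaces on a line.
static std::size_t indentOf(const std::string& s) {
    std::size_t n = 0;
    while (n < s.size() && s[n] == ' ') ++n;
    return n;
}

// Parse the node at lines[i]; every following line that is indented deeper
// becomes one of its inputs. Advances i past the whole subtree.
static std::string parseNode(const std::vector<std::string>& lines, std::size_t& i) {
    assert(i < lines.size());
    const std::size_t ind = indentOf(lines[i]);
    const std::string name = lines[i].substr(ind);
    ++i;
    std::string kids;
    while (i < lines.size() && indentOf(lines[i]) > ind) {
        if (!kids.empty()) kids += ",";
        kids += parseNode(lines, i);
    }
    // Nodes written inline, like X7(C7,C6), have no indented inputs and pass through.
    return kids.empty() ? name : name + "(" + kids + ")";
}

// Convert the whole tree section into one nested expression.
static std::string treeToExpr(const std::vector<std::string>& lines) {
    std::size_t i = 0;
    return parseNode(lines, i);
}
```

    For the tree above this yields X5(X4(Cm,C0),X3(X2(X1(X0(X6(X7(C7,C6),C5),C4),C3),C2),C1)), which matches the input pairs passed to the M_Holder declarations in the generated code.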

  7. #5
    Administrator Shelwien's Avatar
    http://nishi.dreamhosters.com/u/CMo8_v0.rar
    > x64 SSE4 IntelC19 executable
    + added o6-o8
    + made a code generator for new model definition syntax

    Usage:
    CMo8.exe c book1 book1.cm (default is c6)
    CMo8.exe d book1.cm book1.unp
    or
    CMo8.exe c12 book1 book1.cm
    CMo8.exe d12 book1.cm book1.unp (non-default memory size has to be specified in decoder too)

    Code:
    c#  enwik8       mem    c.time    d.time
     6: 22,120,027   545M 1181.172s 1120.485s  
     7: 21,552,909   929M 1271.235s 1212.125s  
     8: 21,081,153  1697M 1104.907s 1068.500s  
     9: 20,730,979  3233M  912.860s  888.359s  
    10: 20,523,217  6305M  711.360s  698.328s  
    11: 20,447,681 12449M  544.594s  533.266s  
    12: 20,444,900 24731M  478.250s  444.046s  
     
        book1
     6: 200,596      545M    3.031s    3.063s
    The main problem is that it requires about 14x the memory of paq,
    mostly because of the 8-byte counters and strict hashtable collision resolution.
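
    The mem column in the table above follows a simple pattern, which can be handy when choosing a c# level for a given machine. The formula below is only an empirical fit to the listed numbers, not something read from the CMo8 source; a plausible reading (an assumption) is a fixed part for the direct tables plus six hashed tables of 2^c MB each.

```cpp
#include <cassert>
#include <cstdint>

// Empirical fit to the "mem" column (in M, as printed): mem(c) ~ 161 + 6 * 2^c.
static uint64_t estimateMemMB(unsigned c) {
    assert(c < 32);
    const uint64_t fixedMB = 161;  // part that does not change with c (fitted)
    const uint64_t tables = 6;     // fitted multiplier of 2^c
    return fixedMB + tables * (uint64_t{1} << c);
}
```

    This reproduces the listed values for c6 to c11 exactly (for example estimateMemMB(6) is 545 and estimateMemMB(11) is 12449) and gives 24737 for c12, 6 above the listed 24731.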

    Plans:
    - modify model generator to enable compiler auto-vectorization
    - modify model generator to move model code into .idx class
    - replace hashtable with a tree
    - add SSE/matchmodel since o8+ gives very little on book1
    - add wordmodel since it's what changes the paq8p result from 202518 to 192872
    - mod_CM update
    - re-optimization to enwik6 target

  9. #6
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Some comparisons of paq8p vs. paq8p_ppmd and paq8p_PC.

    I don't know why, but my version of paq8p_ppmd has an older version of the WAV model, so on my testset the 0.WAV and L.PAK files lose to the original paq8p and paq8p_PC.

    The differences are biggest for textual files: for enwik8 and enwik8.drt the improvements from paq8p_ppmd (paq8p_P) are 1.3% and 0.8% respectively - it's a very good gain!
    Attached thumbnails: paq8p_PC_4_Corpuses.jpg, paq8p_PC_DBA.jpg, paq8p_PC_enwik8.jpg
    Last edited by Darek; 25th March 2019 at 22:58.

  11. #7
    Administrator Shelwien's Avatar
    > my version of paq8p_ppmd have older version of WAV model

    That might be due to "Transform fails at 0" - somehow the detector fails when compiled with IntelC.
    I'll try to pay attention to it next time.

    > it's a very good gain!

    Hopefully the next one will be even better, since on book1 there's a 5k difference.

  12. #8
    Member
    Location
    Poland, Warsaw
    I wonder if it could be implemented in the latest paq versions, CMV and cmix.

  13. #9
    Administrator Shelwien's Avatar
    The mod_CM that I posted can be integrated very easily, same as mod_ppmd which is already used in cmix and paq8pxd.
    CMo8 still has to be modified for that.

    https://github.com/Shelwien/mod_CM/b...aq8p.cpp#L2732

  15. #10
    Member
    Location
    USA
    Quote Originally Posted by Shelwien
    > The mod_CM that I posted can be integrated very easily, same as mod_ppmd which is already used in cmix and paq8pxd.
    Great, I would like to try using it in cmix. I just made an attempt at adding it into the PAQ8 model in cmix, but had trouble getting it to compile. Does it only work in Windows? I see an error: "undefined reference to `VirtualAlloc'", which looks like part of the Windows API.

  16. #11
    Administrator Shelwien's Avatar
    It's this: https://github.com/Shelwien/mod_CM/b.../CM/valloc.inc
    Just replace VirtualAlloc with "calloc(1,s);" and VirtualFree with "free(p);"

    Or maybe add this to gcc options:
    -DMEM_COMMIT=0 -D'VirtualAlloc(a,s,b,c)=calloc(1,s)' -D'VirtualFree(p,a,b)=free(p)'
    and likely -fpermissive
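
    For a source-level alternative to the -D macros, a portable wrapper could look roughly like this. It is a sketch, not the contents of valloc.inc: mem_alloc and mem_free are hypothetical names, and the only property assumed to matter is that the block comes back zero-filled, which both VirtualAlloc (for committed pages) and calloc provide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: allocate a zero-filled block of `size` bytes.
static void* mem_alloc(std::size_t size) {
    assert(size > 0);
#ifdef _WIN32
    // Reserve and commit in one call; committed pages are zero-initialized.
    return VirtualAlloc(nullptr, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
#else
    return std::calloc(1, size);
#endif
}

// Hypothetical helper: release a block obtained from mem_alloc.
static void mem_free(void* p) {
    if (!p) return;
#ifdef _WIN32
    VirtualFree(p, 0, MEM_RELEASE);  // size must be 0 with MEM_RELEASE
#else
    std::free(p);
#endif
}
```

    The macro route needs no source changes, while a wrapper like this keeps the Windows build on VirtualAlloc and avoids redefining Windows API names on other platforms.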

  17. #12
    Administrator Shelwien's Avatar
    Built paq8p with mod_CM on a Linux server with gcc 4.8.4; it seems to work:
    http://nishi.dreamhosters.com/u/_paq8p2a.rar

    Final build cmdline:
    g++ -O3 paq8p.cpp -DNOASM CM/modelC.cpp -std=gnu++11 -D'__cdecl=' -DMEM_COMMIT=0 -D'VirtualAlloc(a,s,b,c)=calloc(1,s)' -D'VirtualFree(p,a,b)=free(p)' -fpermissive
    Attached thumbnail: image_2019_03_29T13_32_22_121Z.png

  19. #13
    Member
    Location
    USA
    I added mod_CM into the paq8 model in cmix and tested on enwik8:

    before: 14874726 (23455604 KiB memory)
    after: 14874548 (23823440 KiB memory)

    Unfortunately it doesn't seem to do significantly better. I ran some other tests and verified that CM_Model is working (by disabling all other cmix models).

  20. The Following User Says Thank You to byronknoll For This Useful Post:

    Shelwien (31st March 2019)

  21. #14
    Administrator Shelwien's Avatar
    I guess we'll try again later with added improvements.
    Although I wonder if this result is a mixing artifact somehow.
    Maybe the paq8 model itself has a low weight now?

    One thing that I wanted to try is generated contexts - generate a few bytes ahead using current stats,
    then compute the prediction using the generated bytes as context.
    I need "right contexts" for contrepl anyway, so it would be useful for model optimization and testing.

    Btw, another thing that I want to try is mod_lzma. Using it to compute predictions is actually pretty simple (though slow) -
    decode stuff from 10-16 bits of arithmetic code (all possible values), then integrate the symbol probabilities.
    Or integrate the distance probabilities for symbols corresponding to a range of distances (the literal model is already a CM).
    It could be interesting, because paq currently doesn't have any models with parsing optimization.
    The actual problem is unrolling the changes - LZs work with lookahead, so it would be necessary to simulate EOF at the current pos
    and have the lzma encoder parse the data, then discard the changes from the last 4k or so, add a symbol, and try again.
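
    The "decode from all possible code values, then integrate" step can be shown with a toy decoder. This is only a sketch of the integration idea: decodeFirstSymbol is a hypothetical stand-in driven by a static cumulative-frequency table, whereas a real lzma decoder uses adaptive bit models and literal/match decisions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for "decode the next symbol from a k-bit code value".
// cum holds cumulative frequencies: cum[0] == 0, cum.back() == total.
static std::size_t decodeFirstSymbol(uint32_t code, unsigned k,
                                     const std::vector<uint32_t>& cum) {
    const uint64_t total = cum.back();
    // Scale the code value into [0,total), as a range decoder would.
    const uint64_t target = (uint64_t(code) * total) >> k;
    std::size_t s = 0;
    while (cum[s + 1] <= target) ++s;
    return s;
}

// Try every possible k-bit code value and accumulate how often each
// symbol is decoded; the resulting fractions are the symbol probabilities.
static std::vector<double> integrate(unsigned k, const std::vector<uint32_t>& cum) {
    assert(k >= 1 && k <= 16 && cum.size() >= 2 && cum[0] == 0);
    std::vector<double> prob(cum.size() - 1, 0.0);
    const uint32_t n = 1u << k;
    for (uint32_t code = 0; code < n; ++code)
        prob[decodeFirstSymbol(code, k, cum)] += 1.0 / n;
    return prob;
}
```

    With cum = {0, 1, 3, 4} and k = 10 the result is 0.25, 0.5, 0.25, i.e. the decoder's implied distribution is recovered purely by enumerating code values, at a cost of 2^k decode attempts per prediction.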
