
Thread: CCM/LPAQ competitor

  1. #1
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    307
    Thanks
    68
    Thanked 166 Times in 63 Posts

    CCM/LPAQ competitor

    I believe that many people are waiting for a fast open-source CM library (it can be commercial, but free for non-commercial use) that will replace Shkarin's widely-used PPMd. For me it's obvious that LPAQ won't replace PPMd, as it's too slow. But a CM compressor with similar or the same speed as PPMd and better compression has a great chance.

    So the goal is, e.g., to achieve 70-86 MB (10-27% better than PPMd.J1) on Maximum Compression's MFC (http://www.maximumcompression.com/data/summary_mf.php) at a similar or the same speed (172-277 s). CCM achieved this goal, but it's still not open-source (we are waiting, Christian).

    Eugene has accepted my "challenge" and in a single day (maybe two) created a preliminary version of a CM compressor called Mix v0:

    http://ctxmodel.net/files/MIX/mix_v0.rar

    The compression ratio achieved in such a short time is very good (http://ctxmodel.net/rem.pl?-24); however, the speed should be improved.
    We are waiting for your comments.

  2. #2
    Programmer osmanturan
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I believe that Shelwien is the one able to do this best. Maybe it's time to start explaining in more depth what's inside CCM? Do you hear me, Christian?

  3. #3
    Programmer osmanturan
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I think we are looking in the wrong place. CCM has filters, and well-designed filters can speed up the actual compression (of course, not all of them do). Also, I haven't even mentioned the compression gain yet. I think we should compare LPAQ/CCM with Shelwien's new compressor on non-typical files, so we can see CCM's pure power and speed. That seems fair to me. Bulat said that he has some files which fit this case.

    Also, we could apply some modifications to well-known files, like Matt does (for example, removing a column from the PIC file). He described some modifications of the Calgary corpus in his paper. For example, RK loses compression on some of those modifications.
    Last edited by osmanturan; 5th July 2008 at 20:36.

  4. #4
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 150 Times in 18 Posts
    Phew, it seems to be really hard these days to write free closed-source software.

    Yes, I do hear you, Osmanturan. But why should I give out a blueprint of CCM's algorithm? I mean, I worked on it for quite some time. But giving all the ideas away so that someone can create an open-source copy of CCM before I find the time to release it myself? Nah, most people wouldn't do that either. And frankly speaking, all those demands make me lose interest in releasing it. Additionally, I don't think someone as skillful as Eugene wants to copy other people's ideas.

  5. #5
    Programmer toffer
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Improving a PAQish architecture

    As I said on ctxmodel.net, I've made some experiments to simulate a linear counter in the logistic domain. For a PAQish architecture you need to transform all probabilities to the logistic domain, e.g.:

    Code:
     
    ...
    using math::Stretch; // less typing
    nn.Input(5) = Stretch( histMaps[5].P1( bitHists[5][x].Touched(t) )  );
    ...
    which creates more or less random memory accesses, hence it is cache-inefficient. I've experimented with this:

    Code:
    ...
    //  nn.Input(5) = histMaps[5].L( bitHists[5][x].Touched(t) );
    ...
    The experimental bit model:

    Code:
    class bitmodel_log
    {
    public:
      static const uint LD_SCALE = 16;
      static const int SCALE = 1<<LD_SCALE;
    
    protected:
      static const uint LD_Q = 6;             // log2 of the lookup resolution (64 segments)
      static const int L0 = -SCALE/2;         // lowest representable stretched value
      sint16 l;                               // current estimate in the stretched (logistic) domain
    
      static const sint16 lu0[(1<<LD_Q)+1];   // piecewise-linear update lookup
    
    public:
      inline void Set( const int m )
      { l=m; }
    
      // Update the stretched estimate directly, by linear interpolation
      // between the two nearest lookup vertices (symmetric in y).
      inline bitmodel_log& Update( const int y )
      {
        static const int S = LD_SCALE-LD_Q;
        const int cl = y==0 ? l : -l;         // mirror so only the y=0 update is tabulated
        const short* lu = &lu0[(cl-L0)>>S];
        const int w = (cl-L0) & ((1<<S)-1);   // interpolation weight
        const int nl = (((lu[1]-lu[0])*w)>>S) + lu[0];
    
        l = y==0 ? nl : -nl;
        return *this;
      }
    
      inline int L() const                    // stretched probability for the mixer
      {
        return l >> (LD_SCALE-math::LD_SCALE);
      }
    };
    
    const sint16 bitmodel_log::lu0[] = {
      -32768, -31760, -30736, -29712, -28688, -27664, -26640, -25616,
      -24592, -23568, -22544, -21520, -20496, -19472, -18448, -17424,
      -16400, -15376, -14352, -13328, -12304, -11281, -10257,  -9233,
       -8210,  -7186,  -6163,  -5140,  -4117,  -3095,  -2073,  -1052,
         -32,    987,   2005,   3022,   4036,   5048,   6056,   7060,
        8059,   9050,  10033,  11005,  11962,  12902,  13821,  14714,
       15576,  16400,  17181,  17913,  18591,  19208,  19764,  20255,
       20684,  21052,  21363,  21623,  21837,  22012,  22154,  22268,
       22359
    };
    Compression dropped (my approximation only uses 64 vertices; one could do better), resulting in a loss of 80 kB on SFC. I expected a larger speed gain due to the increased cache efficiency; surprisingly, this wasn't the case.
    It looks like the linear counter's probability output clusters near
    P(y=1) = 0 and 1, so a "Stretch(.)" lookup is mostly served from the cache anyway.
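
    For completeness: the Stretch() above is just the logit function ln(p/(1-p)). A generic, table-based version in the spirit of lpaq (12-bit probabilities, output clamped to +-2047; this is only an illustrative sketch, not my actual math::Stretch) could be built like this:

    Code:
    #include <cmath>
    
    // Generic table-based stretch in the lpaq style: maps a 12-bit probability
    // p in [0,4096) to ln(p/(1-p)) in fixed point (x256), clamped to [-2047,2047].
    class StretchTable {
      short t[4096];
    public:
      StretchTable() {
        for (int p=0; p<4096; ++p) {
          double x = (p+0.5)/4096.0;              // avoid p=0 and p=1
          double s = std::log(x/(1.0-x))*256.0;   // fixed-point scale
          if (s<-2047.0) s = -2047.0;
          if (s> 2047.0) s =  2047.0;
          t[p] = (short)(s<0 ? s-0.5 : s+0.5);    // round to nearest
        }
      }
      int operator()( const int p ) const { return t[p]; }
    };
    Since the counter outputs cluster near 0 and 1, only a small part of such a table is actually touched, which is why it stays hot in the cache.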

    I've created the lookup like this:

    Code:
    #include <cstdio>
    #include <cmath>
    
    template<typename T> T Min( const T& a, const T& b ) { return a<b?a:b; }
    template<typename T> T Max( const T& a, const T& b ) { return a>b?a:b; }
    
    template<typename T> T Bound( const T& a, const T& min, const T& max )
    { if (a<min) return min; else if (a>max) return max; else return a; }
    
    const double L_MIN = -8.0;
    const double L_MAX = 8.0;
    const double C = 1.0/256.0;   // linear counter learning rate
    const int N = 64;             // number of lookup segments
    
    double P( const double l )    // logistic -> probability
    { return 1.0/(1.0+std::exp(-l)); }
    
    double L( const double p )    // probability -> logistic
    { return std::log(p) - std::log(1.0-p); }
    
    // The logistic update function (mimics a linear counter)
    static const int SCALE = 1<<12;
    static const int L0 = -8*SCALE;
    static const int L1 =  8*SCALE-1;
    static const int BITS = 16;
    short lu0[N+1];
    
    inline int LogUpdate( const int y, int l )
    {
      l = y==0 ? l : -l;
      const short* lu = &lu0[(l-L0)>>(BITS-6)];
      const int w = (l-L0) & ((1<<(BITS-6))-1);
      const int nl = (((lu[1]-lu[0])*w)>>(BITS-6)) + lu[0];
    
      return y==0 ? nl : -nl;
    }
    
    int main( int, char** )
    {
      // print the lookup table as a C initialiser
      for (int i=0; i<=N; ++i) {
        const double l = (L_MAX-L_MIN)/N*i + L_MIN;
        const double p = P(l);
        const double l0 = Bound( L((1.0-C)*p), L_MIN, L_MAX );
        const double l1 = Bound( L((1.0-C)*p + C), L_MIN, L_MAX );
    
        lu0[i] = (short)(SCALE*l0);
    
    //    printf( "%.4f -> 0: %.4f (%d) 1: %.4f\n", l, l0, lu0[i], l1 );
    
        printf( "%6d, ", lu0[i] );
        if ((i%8)==7) putchar('\n');
      }
    
      // quick sanity check of the interpolated update
      int l = 0;
      printf( "\n%d\n", l=LogUpdate(0,l) );
      printf( "%d\n", l=LogUpdate(1,l) );
      printf( "%d\n", l=LogUpdate(1,l) );
      printf( "%d\n", l=LogUpdate(0,l) );
    
      getchar();
    
      return 0;
    }
    Maybe someone can tune it and insert it into LPAQ to see how good things could be. This is mostly a speed optimization.

    Another counter approach I'll use for my next project:

    For a future project I plan to use delayed counters and a secondary model (per primary model) which estimates probabilities based on a quantisation of the primary model's counter state. The final estimate is a static mix of both probabilities (a rough sketch is below). This should make it more than competitive with the FSM approach at higher orders.
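
    An untested sketch of that idea (the constants and names are arbitrary, just to illustrate the structure): a primary probability p1 from the delayed counter, a small secondary table indexed by a quantisation of the counter state, and a static mix of both:

    Code:
    #include <cstdint>
    
    // Untested sketch: a secondary estimation table per primary model, indexed
    // by a quantised counter state, statically mixed with the primary probability.
    // All probabilities are 12-bit (0..4095); the constants are arbitrary.
    class QuantSSE {
      static const int Q = 16;             // number of quantised counter states
      uint16_t p2[Q];                      // secondary estimates
    public:
      QuantSSE() { for (int i=0; i<Q; ++i) p2[i] = 2048; }
    
      // p1: primary estimate, state: primary counter state (assumed 8 bits here)
      int Predict( const int p1, const int state ) const {
        const int q = state>>4;            // quantise the 8-bit state to Q buckets
        return (3*p1 + p2[q])>>2;          // static mix: 3/4 primary, 1/4 secondary
      }
    
      void Update( const int state, const int y ) {
        const int q = state>>4;
        const int target = y ? 4095 : 0;
        p2[q] += (target - (int)p2[q])>>5; // simple linear adaptation
      }
    };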

  6. #6
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    307
    Thanks
    68
    Thanked 166 Times in 63 Posts
    Quote Originally Posted by Christian View Post
    But giving all the ideas away so that someone can create an open-source copy of CCM before I find the time to release it myself?
    I think it would be hard to create a copy of CCM even if you gave more details, since the compression ratio depends on small implementation details.

    Quote Originally Posted by Christian View Post
    And frankly speaking, all those demands make me lose interest in releasing it.
    I understand your point of view. You're afraid that people would take ideas from CCM to create their own compressors, but:
    1. CCM can be decompiled/disassembled to steal some ideas anyway.
    2. I believe that PAQ became the most powerful compressor because it was open-source. Without PAQ you wouldn't have heard about CM and you wouldn't have created CCM.
    3. Open source drives progress in data compression. Someone could help you improve CCM, just like Matt got help from many people.
    4. In several years, if you stop creating new versions of CCM, people will forget about it. I've tested hundreds of compressors that nobody remembers today. But PPMd is widely used because it was fast, powerful (just like CCM) and open-source.

    Please think about it. All my programs are open-source and I don't regret it.

  7. #7
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 150 Times in 18 Posts
    Please check your PM, inikep.

  8. #8
    Programmer osmanturan
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Christian View Post
    Phew, it seems to be really hard these days to write free closed-source software.

    Yes, I do hear you, Osmanturan. But why should I give out a blueprint of CCM's algorithm? I mean, I worked on it for quite some time. But giving all the ideas away so that someone can create an open-source copy of CCM before I find the time to release it myself? Nah, most people wouldn't do that either. And frankly speaking, all those demands make me lose interest in releasing it. Additionally, I don't think someone as skillful as Eugene wants to copy other people's ideas.
    I'm not concerned about your source code or CCM itself. If I really cared about it, I would at least decompile it (believe me, that would be easier than getting some of your ideas!). We are trying to write a new compressor that competes with the PAQ series (at least some of us are trying to provide motivation). So, personally, I'm just waiting for your ideas, if you want to share them. That's all. Otherwise, it's your choice; it doesn't matter to me.

  9. #9
    Programmer toffer
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    What a pity that it looks like nobody in this forum is willing to discuss anything about CM...

  10. #10
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    I made a new version:

    Mix v1:
    - new improved (and stream-oriented) E8 filter (a generic sketch of the basic E8 transform follows the links below)
    - buffered i/o from fpaq0pv4B instead of caching the whole input file
    - new optimization target, based on SFC (concatenated samples of each SFC file)

    o5mix, o6mix     - v0 coders for comparison, just tuned to the new target
    o5mix-a, o6mix-a - the same coders with the new i/o and E8
    o6mix1a          - a simplified o6 with hashed o2 and without dual o0-o2 counters
    o6mix1b          - o6mix1a with alternate parameters

    http://ctxmodel.net/files/MIX/mix_v1.htm
    http://ctxmodel.net/files/MIX/mix_v1.rar
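
    (For anyone unfamiliar with E8 filtering: below is a minimal, generic sketch of the classic x86 CALL transform, not the stream-oriented filter actually used in Mix v1. It replaces the relative 32-bit offset after each E8 opcode with an absolute target, so repeated calls to the same function produce repeated byte patterns; it assumes a little-endian host.)

    Code:
    #include <cstdint>
    #include <cstddef>
    #include <cstring>
    
    // Forward E8 transform: relative CALL offsets -> absolute targets.
    void e8_forward( uint8_t* buf, size_t size ) {
      for (size_t i=0; i+5<=size; ) {
        if (buf[i]==0xE8) {
          int32_t off; memcpy( &off, buf+i+1, 4 );  // little-endian rel32
          int32_t abs = off + (int32_t)(i+5);       // target = offset + position of next instruction
          memcpy( buf+i+1, &abs, 4 );
          i += 5;                                   // skip the rewritten offset
        } else ++i;
      }
    }
    
    // Inverse transform: subtract the position again.
    void e8_inverse( uint8_t* buf, size_t size ) {
      for (size_t i=0; i+5<=size; ) {
        if (buf[i]==0xE8) {
          int32_t abs; memcpy( &abs, buf+i+1, 4 );
          int32_t off = abs - (int32_t)(i+5);
          memcpy( buf+i+1, &off, 4 );
          i += 5;
        } else ++i;
      }
    }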

    But, mysteriously, overall results are worse in my benchmark - speed
    only increased by 20% in o6mix1 (where 3 submodels were removed),
    and compression was better in v0.

    I still have to test durilca'light and Mix v0 and v1 on SFC (because
    v1 was supposedly tuned to it) before posting all this on ctxmodel.net,
    but these tests are really slow, so here it is for now.

    Guess I'd have to retune these coders back to book1+wcc386 and
    maybe restore the dual counters at least at o0 and o1.

    Edit:
    Durilca'light results added; its compression is comparable to CCM,
    but with 30% faster decoding. As to compression speed, there are
    just a lot of filters separately processing the data, so I think it could
    be significantly optimized. But still, the durilca filters are not open-source.

    And also this is the SFC benchmark:
    http://ctxmodel.net/files/MIX/mix_v1_SFC.htm
    So I guess it got successfully tuned to SFC, but that tuning is not
    optimal for other data.

    ...Also I just realized that there are 6 unnecessary hash function
    calls in the o6mix model %)
    Last edited by Shelwien; 8th July 2008 at 13:04.

  11. #11
    Programmer toffer
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    The horrible speed of these implementations is caused by the model alignment in memory. If you use a flat decomposition, I can't imagine any faster hashed data structure (for a complete bit tree) than Matt's.

    You should consider making a nibble fit into a <= 64-byte chunk (e.g. reduce your T to just 1 byte, or use only one T per nibble) and merging your mixers into this memory chunk (if possible). I would do something like this:

    Code:
    struct chunk {
      static const uint N = 15;
      word P[N];   // 30 bytes
    
      // Variant 1:
      byte T[N];   // 15 bytes
    
      // Variant 2:
      word T;      // only one T; maybe you can group these, e.g. one T for each 2-bit subtree within this 4-bit tree
    
      // The remaining bytes:
      // * information for collision handling?
      // * a mixer (since your mixers use the same contexts)
      // * symbol ranking / match model stuff...
    };
    
    // sizeof(struct chunk) == 64

    The benefit of your T at orders 0 and 1 is negligible.

    Your current data structures result in random access right after the first nibble (which might be cached). You might also consider splitting the collision domains for the low and high nibbles (two hash tables).
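
    To make that concrete, here is a minimal sketch of such a layout (generic placeholders, not Mix's actual structure; Chunk, NibbleHash and the multiplicative hash are mine): one 64-byte, cache-line-aligned bucket per nibble, holding the 15 bit-tree probabilities, a check byte for collisions and spare room for mixer/match state, with a single hash lookup per nibble:

    Code:
    #include <cstdint>
    #include <cstring>
    
    // One bucket = one cache line: a complete bit tree over one nibble, a check
    // byte for collision detection, and spare bytes for extra per-context state.
    // Note: the over-aligned new below needs C++17 to guarantee the alignment.
    struct alignas(64) Chunk {
      uint16_t p[15];      // probabilities of the nibble's bit tree (30 bytes)
      uint8_t  check;      // small context fingerprint for collision handling
      uint8_t  spare[33];  // room for mixer weights / match-model state, etc.
    };
    static_assert( sizeof(Chunk)==64, "one chunk per cache line" );
    
    class NibbleHash {
      Chunk*   table;
      uint32_t mask;       // table size is a power of two
    public:
      explicit NibbleHash( unsigned bits )
        : table( new Chunk[size_t(1)<<bits]() ), mask( (1u<<bits)-1 ) {}
      ~NibbleHash() { delete[] table; }
    
      // One hash lookup per nibble; the following four bit predictions
      // then hit the same cache line.
      Chunk& Lookup( const uint32_t ctx ) {
        const uint32_t h = ctx*2654435761u;   // multiplicative hash (placeholder)
        Chunk& c = table[(h>>8)&mask];
        const uint8_t chk = uint8_t(h&0xFF);
        if (c.check!=chk) {                   // collision: simplest policy, reset the slot
          memset( &c, 0, sizeof(Chunk) );
          c.check = chk;
        }
        return c;
      }
    };
    Splitting the collision domains for the low and high nibbles then simply means keeping two such tables.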

  12. #12
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts

  13. #13
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts

  14. #14
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
