Thread: CTW/CTS compressor by Joel Veness

  1. #1
    Administrator Shelwien

    CTW/CTS compressor by Joel Veness

    http://jveness.info/software/cts-v1.zip (x86+x64+source)

    It's slow and not especially good compared to existing coders.
    But at least it's something new :)

  2. #2
    Member nimdamsk
    Code:
    Compressing obj2 (246814 bytes)
    
    compressor   size    time   work.set
    ----------   -----   ----   --------
    ctw          98378   69     677
    facctw       88702   74     1266
    cts          80043   9      274
    gz           78176
    bz2          76441
    faccts       72642   10     409
    7z           61525
    lpaq8 6      57961   0.5    178
    fp8 -5       50001   3      92
    uda          48483   4      152
    paq8px -5    44334   21     170
    Wanted to test on something more relevant than obj2, but this thing ate ~4 GiB of RAM on a 6 MiB file and I killed the beast.
    Don't know why the hell anyone would use this. Tonnes of math, hundreds of pages of science - but the result?..
    Last edited by nimdamsk; 12th April 2012 at 07:12.

  3. #3
    Programmer toffer
    I just had a beer with Joel yesterday

    Cheers
    M1, CMM and other resources - http://sites.google.com/site/toffer86/ or toffer.tk

  4. #4
    Member jveness
    Thanks for the mention. Yes, it's very much just a proof-of-concept implementation. Just to be clear, the paper/code is about comparing the recursive weighting scheme in CTW with the one used in CTS, while keeping similar theoretical guarantees. It's not meant to be a practical compression tool. A more memory-friendly version is of course possible, but that wasn't my goal for the first release. I hope this clarifies things.
    Last edited by jveness; 10th April 2012 at 19:53.

  5. #5
    Member nimdamsk
    I've looked at CTS as an end user, not as a programmer or researcher. And I wouldn't understand 90% of the papers anyway.
    But I am just curious - what is the reason to work on such ideas? CTW itself never had any comparatively good results in benchmarks.
    So improving it enough to make it usable... Nevertheless, as Shelwien said - at least it's something new.
    "Something new" is a good enough reason to try. Still, the work seems to show that the CTx way is a dead end.

  6. #6
    Administrator Shelwien
    @jveness:
    Anyway, do you have any idea about the potential compression improvement for your model?
    The pure PPMII (ppmonstr) / CM (ash) results for book1 are about 2.11bpc.
    AFAIR paq8's results are similar too, if one hacks it up to disable the "unfair" text models.
    The model would still have some uses (like HP) even if it's slow, but only if it's better than
    existing implementations.

    To be specific,
    ppmonstr is http://compression.ru/ds/ppmdj1.rar (book1=203247 with defaults)
    ash is http://www.ctxmodel.net/files/ASH/ash07.rar (book1=203361 with defaults)
    or http://www.ctxmodel.net/files/ASH/ash04a.rar (book1=202932 with /s12)
    (v07 has lower counter precision to reduce memory usage)

  7. #7
    Member jveness
    @nimdamsk: As a researcher, I find CTW interesting since it is one of the few methods that provides really strong theoretical guarantees. So with CTS, my motivation was to see how far I could push the performance while still keeping good theoretical guarantees. I can understand, though, if this motivation seems a bit weird.

  8. #8
    Member jveness
    @Shelwien:

    My latest development version of CTS (which is marginally better than the source associated with the paper) gives:
    book1, 207727 bytes, 2.16bpc
    So it is worse, but maybe it is possible to catch up or do a little better with more work. I do have some more ideas, so I haven't completely given up hope just yet.

    Can you tell me what you mean by HP? Sorry, I am unfamiliar with the acronym.

  9. #9
    Administrator Shelwien
    I thought you'd know it, of all people - http://prize.hutter1.net/

    207727 is better, but still not good enough, as that's still in the range of BWT compressors
    (see http://compressionratings.com/bwt.html#book1);
    bwtmix with alphabet reordering also reaches 207130.

    Btw, also for reference there's http://encode.ru/threads/541-Simple-...xt-mixing-demo
    It gives about 214k, and I think that any real CM/PPM/BWT should be better than that.

    And as to "more ideas", are you aware of the secondary estimation / probability mapping idea?
    It's the main difference between ppmonstr and ppmd, and ash's result with disabled SSE is 205417.

  10. #10
    Member jveness
    Re: Hutter prize: Oh, yeah, that.

    Re: secondary estimation / probability mapping
    I think Chris told me about this idea a couple days ago. Is this the idea where you maintain some kind of 2d table, which you use to improve the output from your model by incorporating information from a match model?

    By the way, I very much appreciate these suggestions.
    Last edited by jveness; 13th April 2012 at 05:12.

  11. #11
    Administrator Shelwien
    > Is this the idea where you maintain some kind of 2d table, which you
    > use to improve the output from your model by incorporating
    > information from a match model?

    Yes, but that's not the point at all.

    The main idea is that to improve some probability estimation,
    we can collect secondary statistics in the context of that probability.
    Such components are usually called "SSE" (secondary symbol estimation) -
    the idea was first used by C. Bloom in ppmz as "SEE" (secondary escape estimation),
    then extended to all symbols by D. Shkarin in ppmonstr... then I decompiled
    ppmonstr and it propagated further :).

    So the initial version was like this:
    Code:
    Counter X[Q]; // secondary stats
    p = 1..SCALE-1; // prediction
    q = p*Q/SCALE;
    p' = X[q]; // secondary prediction
    X[q].Update( bit );
    then it was improved by adding linear interpolation:
    Code:
    Counter X[Q+1]; // secondary stats
    p = 1..SCALE-1; // prediction
    q = p*Q/SCALE; // lower grid point
    w = (p*Q - q*SCALE)/SCALE; // fractional position of p between grid points, 0..1
    p' = (1-w)*X[q+0] + w*X[q+1]; // interpolated secondary prediction
    X[q+0].Update( bit );
    X[q+1].Update( bit );
    (see bsc_mixer.inc in http://nishi.dreamhosters.com/u/bsc240_qlfc_v2.rar )

    then by "indirect update"
    (update the mixed value as a counter, then back-propagate the difference)
    Code:
    Counter X[Q+1]; // secondary stats
    p = 1..SCALE-1; // prediction
    q = p*Q/SCALE; // lower grid point
    w = (p*Q - q*SCALE)/SCALE; // fractional position of p between grid points, 0..1
    p' = (1-w)*X[q+0] + w*X[q+1]; // interpolated secondary prediction
    Counter tmp = p'; // update the mixed value as a counter
    tmp.Update( bit );
    d = tmp - p'; // back-propagate the difference
    X[q+0] += d;
    X[q+1] += d;
    Of course, it's helpful to use extra context for secondary statistics -
    it's possible to add things to the secondary context that wouldn't fit
    as primary context, mainly because probability mappings can be initialized
    as an "identity transformation".

    In the same way, it's possible to extend the approach to multiple dimensions
    and mix submodel predictions with it.
    The results of that are fairly good, but it's relatively slow.
    (see http://encode.ru/threads/1158-SSE2(o...ll=1#post22841 )
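
    Again just a sketch (reusing SCALE and Q from the snippet above, names are mine): a 2D variant that bilinearly interpolates a table indexed by two quantized submodel predictions, so the SSE stage doubles as the mixer:
    Code:
    // bilinear interpolation of a (Q+1)x(Q+1) table over two quantized predictions
    int Map2D(const uint16_t T[Q + 1][Q + 1], int p1, int p2) {
      int q1 = p1 * Q / SCALE, w1 = p1 * Q - q1 * SCALE;  // grid point + fraction
      int q2 = p2 * Q / SCALE, w2 = p2 * Q - q2 * SCALE;
      int64_t s =
          (int64_t)T[q1    ][q2    ] * (SCALE - w1) * (SCALE - w2) +
          (int64_t)T[q1    ][q2 + 1] * (SCALE - w1) * w2 +
          (int64_t)T[q1 + 1][q2    ] * w1 * (SCALE - w2) +
          (int64_t)T[q1 + 1][q2 + 1] * w1 * w2;
      return int(s / ((int64_t)SCALE * SCALE));  // mixed secondary prediction
    }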

    The most important point is that SSE can at the same time capture estimation errors
    (e.g. learn to map prior estimations to posterior) _and_ actual data correlations
    (specific patterns in the context history) - but the latter only works with static
    functions of histories. This is important for understanding why linear_counter+SSE
    compresses better than adaptive_rate_counter+SSE, even though a standalone
    adaptive_rate_counter is significantly better than a linear_counter
    (it's also why it's rarely a good idea to apply SSE multiple times, or after mixing).
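
    To make that last distinction concrete, here is how I'd describe the two counter types in code (an illustration of the terminology only, not code from ppmonstr or ash; SCALE as above):
    Code:
    struct LinearCounter {
      uint16_t p = SCALE / 2;
      void Update(int bit) {
        // fixed rate: the value is a time-invariant function of the recent
        // bit history, which is what lets SSE learn stable correlations from it
        p += ((bit ? SCALE - 1 : 0) - p) / 32;
      }
    };

    struct AdaptiveRateCounter {
      uint16_t p = SCALE / 2;
      uint8_t  n = 0;  // updates seen so far
      void Update(int bit) {
        // the rate shrinks over time, so the same recent history can map to
        // different values depending on when it occurred; stronger standalone,
        // but a worse input for SSE
        int rate = (n < 30) ? n + 2 : 32;
        p += ((bit ? SCALE - 1 : 0) - p) / rate;
        if (n < 255) n++;
      }
    };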
