
Thread: Simple Variable Order Rangecoder

  1. #1
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts

    Simple Variable Order Rangecoder

    I recently made a variable order rangecoder. It uses the last n bits as context, where n is the order of the model. It counts the number of one and zero bits seen after each occurrence of the context and rangecodes the next bit using the resulting probability. The coder outputs 32-bit integers, but I also included code for 8-bit output. It's slow and doesn't have good compression ratios, but I thought I'd post it anyway for anyone interested.
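    For illustration, here is a minimal sketch of such a counting model (hypothetical code, not the attached implementation):

    Code:
    // Hypothetical sketch: the last n bits of history index a table of
    // zero/one counters, and P(next bit = 1) is estimated from the counts.
    #include <cstdint>
    #include <vector>
    
    struct BitCounter {
        uint32_t n0 = 0, n1 = 0;            // zero/one bits seen in this context
        uint32_t p1() const {               // P(next bit = 1), 16-bit fixed point
            return uint32_t(((uint64_t(n1) * 2 + 1) << 15) / (n0 + n1 + 1));
        }
        void update(int bit) { bit ? ++n1 : ++n0; }
    };
    
    struct OrderNModel {
        uint32_t order;                     // number of context bits
        uint32_t ctx = 0;                   // the last 'order' bits
        std::vector<BitCounter> table;
        explicit OrderNModel(uint32_t n) : order(n), table(size_t(1) << n) {}
        uint32_t p1() const { return table[ctx].p1(); }
        void update(int bit) {
            table[ctx].update(bit);
            ctx = ((ctx << 1) | uint32_t(bit)) & ((uint32_t(1) << order) - 1);
        }
    };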

    Also, it seems that CompressionRatings.com can't get it tested due to an error. I can't figure it out, so if anyone has any idea what is going wrong please let me know. Thanks in advance.
    Attached Files

  2. #2
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    You can download the test files from http://compressionratings.com/download.html
    and test it yourself first.

  3. #3
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    It works for me on all test data and the checksums match for the decompressed data. The reason for disqualification was described as "program outputs error message then exits". I've tested it on multiple computers from the command line and launched via another program without fail. That's why I'm confused as to what error it's giving.

  4. #4
    Programmer Gribok
    Join Date
    Apr 2007
    Location
    USA
    Posts
    159
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by david_werecat
    It works for me on all test data and the checksums match for the decompressed data. The reason for disqualification was described as "program outputs error message then exits". I've tested it on multiple computers from the command line and launched via another program without fail. That's why I'm confused as to what error it's giving.
    Do you "return 0" or call "exit(0)" when your program terminates?
    Enjoy coding, enjoy life!

  5. #5
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    If your code depends on a shared library, such as MSVCRT, it is likely that this library is not installed on the CompressionRatings testbed.
    The solution is to statically link such dependencies into the binary.

  7. #7
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Thanks for your suggestions. My code requires no CRT library at all: I wrote a very small built-in CRT so that I would not need the external dependency. It also keeps the executable small, because statically linking the CRT adds a lot to the size of the compiled binaries. As for the exit code, I just end the program with a void return statement. In my test runs it exited with code zero every time, but I'm not sure that holds in all cases, so I'm going to look into the ExitProcess function. Once again, thanks for the suggestions.
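    (For reference, a minimal hypothetical sketch of the ExitProcess approach in a CRT-less Win32 build; not code from the attached project:)

    Code:
    #include <windows.h>
    
    int main(void)
    {
        /* ... compress or decompress ... */
        ExitProcess(0);   // report exit code 0 explicitly, independent of
                          // whatever the custom startup code would return
    }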

  8. #8
    Member kampaster
    Join Date
    Apr 2010
    Location
    ->
    Posts
    55
    Thanks
    4
    Thanked 6 Times in 6 Posts
    Hi david_werecat,
    the RangeCoder binary being 64-bit probably disqualified it.
    Last edited by kampaster; 17th November 2011 at 19:21.

  9. #9
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Thanks for the suggestion. In the description, Sami says "Some software has been updated to latest versions (64-bit versions if available)". It is also mentioned that the test system is running Windows XP 64-bit, so there shouldn't be a problem with a 64-bit submission. I also checked that all of the suggested configurations are within the memory limits. I don't have access to a copy of Windows XP 64-bit, so it will be difficult to test compatibility. I'll try submitting a 32-bit version just in case, though.

  10. #10
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I can't run the 64-bit build in 32-bit Windows, so I tried compiling:

    C:\tmp>g++ -O3 rangecoder.c -o rangecoder.exe
    rangecoder.c:2:21: fatal error: MiniCRT.h: No such file or directory
    compilation terminated.

    c:\tmp>cl /O2 RangeCoder.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
    Copyright (C) Microsoft Corporation. All rights reserved.

    RangeCoder.c
    RangeCoder.c(2) : fatal error C1083: Cannot open include file: 'MiniCRT.h': No such file or directory

  11. #11
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    I just did compiles for both 32-bit and 64-bit for testing purposes. The problem with the 32-bit builds is that the compiler imports _aulldiv and _allmul, so I linked in the MASM object files for those functions. Once I'm done organizing the project files I'll repost the full Visual Studio 2010 project.
    Attached Files

  12. #12
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Here is the Visual Studio 2010 project for anyone interested.

    Also, I'm experimenting with an indirect model, but it seems to have worse ratios. The indirect model is also included in the project files in the file 'IndirectBitModel.h'.
    Attached Files

  13. #13
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Some results on enwik8. I guess that "order" is bitwise mod 32. Memory usage looks like 2^(order+4) bytes. Time is 60-140 seconds on a 2 GHz T3200 (32 bit version) depending on order. The highest I could test was order 26 (1 GB memory).

    Code:
    33,761,533 enwik8.26
    34,635,889 enwik8.25
    35,625,897 enwik8.24
    35,625,897 enwik8.56
    46,805,877 enwik8.48
    46,805,877 enwik8.16
    73,016,189 enwik8.40
    73,016,189 enwik8.8
    80,009,581 enwik8.7
    87,831,925 enwik8.6
    94,154,825 enwik8.5
    95,670,157 enwik8.4
    96,963,829 enwik8.3
    97,987,717 enwik8.2
    99,660,153 enwik8.1
    99,801,301 enwik8.0
    99,801,301 enwik8.32
    99,801,301 enwik8.64
    Edit: tests on enwik9. http://mattmahoney.net/dc/text.html#3209
    Last edited by Matt Mahoney; 24th November 2011 at 22:53.

  14. #14
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Thanks for testing! It's good to see that it works.

  15. #15
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    I've finished version 1.3. Memory usage for the standard model is halved, and speed has increased. It also includes two additional models (stored in separate executables): the indirect model and the double model. The indirect model is adaptive, using bit histories instead of counts. The double model uses a main model to select from several submodels, improving compression at the expense of memory and speed. The standard model is compatible with version 1.2. Once again I've posted the full Visual Studio 2010 project for all those interested.
    Attached Files

  16. #16
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    http://mattmahoney.net/dc/text.html#2852

    I couldn't test the indirect version because there wasn't a 32 bit Windows version and my attempts to compile were unsuccessful.

  17. #17
    Programmer schnaader
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts
    Tested on a 2.4 GHz dual core, 4 GB RAM, Windows 7 64-bit, using Igor's timer for timing and the Windows task manager for memory usage:

    Code:
    File     Bits   Model     Order  Time (m:s)  Mem (MB)  Size (bytes)
    
    enwik8   32     Double    26     03:57       1,116     30,934,993
    enwik8   64     Double    27     03:10       2,233     30,554,077
    enwik8   64     Indirect  27     04:24       1,183     36,108,281
    
    enwik9   32     Double    26     38:13       1,116     285,258,957
    enwik9   64     Double    27     30:00       2,233     279,848,057
    enwik9   64     Indirect  27     37:52       1,183     326,318,969
    EDIT: Edited for enwik9 results and accurate timing
    Last edited by schnaader; 27th November 2011 at 13:20.
    http://schnaader.info
    Damn kids. They're all alike.

  18. #18
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Here are some comparable ZPAQ results for single-component models using 1 GB of memory on enwik8, timed on a 2 GHz T3200 with 1 thread using zpaq v4.01 (released today).
    You can remove the parentheses to uncomment a model.

    Code:
    comp 0 2 0 0 1 (hh hm ph pm n)
       0 icm 24   (26731771, 62 sec)
      (0 cm 28 4  (27312337, 57 sec))
      (0 cm 28 8  (27195780, 57 sec))
      (0 cm 28 16 (27266000, 58 sec))
    hcomp
      *b=a hash b++ hash b++ hash b++ hash *d=a halt
    post
      0
    end
    The context is a bytewise order 4 hash, so kinda cheating.
    (The M buffer pointed to by B wraps around at 4).
    ICM is an indirect context model: context -> bit history -> prediction.
    CM is a direct context model: context -> prediction.
    The first parameter in each case selects 1 GB of memory. The second parameter for CM is the learning rate: 1/count, with count bounded to 4n.
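    For illustration, a rough sketch of such a direct context model update (my reading of the description, not the ZPAQ source):

    Code:
    #include <cstdint>
    
    struct CMCounter {
        int32_t p16 = 1 << 15;   // prediction, 16-bit fixed point, starts at 1/2
        int32_t count = 0;
        int32_t predict() const { return p16; }
        void update(int bit, int32_t limit) {          // limit = the "4n" bound
            if (count < limit) ++count;
            p16 += ((bit << 16) - p16) / (count + 1);  // learning rate ~ 1/count
        }
    };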

  19. #19
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    From the looks of the ZPAQ results, I still have a lot to learn.

    I have made yet another new version of RangeCoderC. This version removes the file size from the header and adds support for stdin/stdout coding, which unfortunately breaks compatibility with older versions. It also adds the hashed model, where the context is a hash instead of a raw bitstream. Once again, I'm posting the full Visual Studio 2010 project.

    I also have a question: is there a more accepted name for the double model? I'm almost certain that it isn't anything new and that there is a better name for it.
    Attached Files

  20. #20
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    275
    Thanks
    6
    Thanked 23 Times in 16 Posts
    Quote Originally Posted by david_werecat
    I also have a question; is there a more accepted name for the double model? I'm almost certain that it isn't anything new and that there is a better name for it.
    There are different techniques for achieving what you want:

    you can try to select the best model among several with a switching scheme, or you can weight or mix different models.

    Switching is difficult because your prediction is based entirely on past performance and on the weighting you use to select the best model. You could be completely wrong... more wrong than weighting the different predictions.

    For example, you can use a linear convex combination of different models, say
    p = a_1*p_1 + a_2*p_2 + ... + a_n*p_n, where the p_i are different models and the coefficients sum to 1.

    When the p_i are cumulative probability distributions you can also view this as a mix.

    Finding the optimal coefficients is a standard problem which can be solved by matrix inversion.

    And there are countless other ways to simplify or extend this idea.

  21. #21
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    You mean mixing predictions from different models. It is better to do this in the logistic domain, so that you give more weight to predictions near 0 or 1, which presumably have a high degree of certainty. So you first transform each p to ln(p/(1-p)), average the transformed values, and transform back with the inverse function 1/(1+exp(-x)). In PAQ and ZPAQ these are called the stretch and squash functions.

    You should see better compression with fixed weights, and better still if you adaptively adjust the weights to reduce prediction errors. You compute error = bit - p and adjust each weight as a_i = a_i + stretch(p_i)*error*rate, where rate is usually around 0.001 to 0.01.
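    For illustration, a floating-point sketch of this scheme (PAQ and ZPAQ use fixed point and lookup tables; the names here are hypothetical):

    Code:
    #include <cmath>
    #include <vector>
    
    static double stretch(double p) { return std::log(p / (1.0 - p)); }
    static double squash(double x)  { return 1.0 / (1.0 + std::exp(-x)); }
    
    struct LogisticMixer {
        std::vector<double> w, st;   // weights and saved stretched inputs
        double rate;                 // typically 0.001 to 0.01
        LogisticMixer(size_t n, double r) : w(n, 0.0), st(n), rate(r) {}
    
        double mix(const std::vector<double>& p) {
            double sum = 0.0;
            for (size_t i = 0; i < w.size(); ++i) {
                st[i] = stretch(p[i]);         // into the logistic domain
                sum += w[i] * st[i];
            }
            return squash(sum);                // back to a probability
        }
        void update(int bit, double p_mixed) {
            double err = bit - p_mixed;        // error = bit - p
            for (size_t i = 0; i < w.size(); ++i)
                w[i] += rate * st[i] * err;    // a_i += stretch(p_i)*error*rate
        }
    };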

    In ZPAQ, squash() and stretch() are implemented using lookup tables. Stretch() takes a linear probability as a 15-bit number representing an odd multiple of 2^-16 and outputs a 12-bit number in the range -32 to 32 with 6 bits of precision after the decimal point. Weights are 20-bit signed numbers in the range -8 to 8 with 16 bits after the decimal point. These bounds are needed to prevent 32-bit overflow, but there is no need to make the weights add up to 1 as with linear mixing.

    ZPAQ also uses ISSE mixing chains where each component maps a context to a bit history, and the history selects a pair of weights to mix the output of the previous component and a fixed constant 1. The components are chained in increasing order starting with an order 0 indirect context model (context -> bit history -> prediction). In the -m3 (mid) model, it uses both mixing techniques. First it uses an order 0..5 chain, then it mixes them all plus an order 7 match model.
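    A hedged sketch of one such chain component, reusing stretch() and squash() from the sketch above (the bit-history computation is omitted; this follows the description, not the ZPAQ source):

    Code:
    #include <cstdint>
    
    struct ISSE {
        double w0[256], w1[256];   // one weight pair per 8-bit history state
        double rate = 0.002;       // assumed learning rate
        // start as identity: predict(h, p) == p before any training
        ISSE() { for (int i = 0; i < 256; ++i) { w0[i] = 1.0; w1[i] = 0.0; } }
        double predict(uint8_t hist, double p_prev) const {
            // mix the previous component's prediction with a constant 1
            return squash(w0[hist] * stretch(p_prev) + w1[hist]);
        }
        void update(uint8_t hist, double p_prev, double p, int bit) {
            double err = (bit - p) * rate;
            w0[hist] += err * stretch(p_prev);
            w1[hist] += err;       // the constant input's weight
        }
    };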

  22. #22
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Thanks for the explanations, I think I now better understand CM. I have an unreleased model from a while ago that mixes different predictions with adaptive weights, but not in the logistic domain. It uses an average of the error rates of each model to mix the predictions. The accuracy of the prediction is calculated by using Confidence[i] = Confidence[i] + Abs(bit - prediction[i]), and incrementing a global bit count. The weight of each model is Weight[i] = Confidence[i] / BitCount, and they are mixed to make the prediction P=((prediction[0] * Weight[0]) + (prediction[1] * Weight[1]) + ... + (prediction[n] * Weight[n])) / (Weight(0) + Weight(1) + ... + Weight(N)). I did a test of accuracy for logistic mixing versus the linear model by testing their average error rates. The logistic model seems better for large files or changing data, whereas the linear model seems better for small files or similar data. As soon as I figure out how to work with floating point arithmetic without the CRT I'll include both types of mixer with the models.

    The double model doesn't really use special model switching, it uses the last n bits of the context to index millions of order 1 bitwise models. Think of how long it would take to mix that many models!

  23. #23
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Your "double" model is what I call an indirect model (context -> bit history -> prediction). ZPAQ uses an 8 bit state to represent a bit history, but instead of just the last 8 bits, it represents a pair of counters (n0,n1) and the value of the last bit. The counters are variably bounded, so that one count can be large if the other is small. The second stage of the model is just a regular context model with a low, fixed learning rate (error/1024) initialized to (n1+0.5)/(n0+n1+1).

    I don't use any floating point arithmetic in modeling because it can introduce compatibility errors: results can depend on compiler optimizations (for example, maybe 0.1 * 10 != 1). I do use it to initialize the squash and stretch tables, but I checked that the table entries are far enough from integer values that rounding won't be a problem, and then I checksum the tables to make sure.

  24. #24
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    I've made yet another version of RangeCoderC; I hope the update frequency isn't too fast. Version 1.5 merges all models into one executable and significantly improves the speed and memory usage of the indirect model. Also, the double model was renamed to the 'indexed bitwise model array'. Starting with version 1.5, the model number is stored with the order in the file header, so this once again breaks compatibility with earlier versions.

    Also, I have included a beta version of a linear mixer in the file 'LinearMixer.h'. Next version I hope to include both linear and logistic mixing of models. I will also work on cleaning up the code; sorry to anyone who finds it messy.
    Attached Files

  25. #25
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    The hashed model is a nice improvement. http://mattmahoney.net/dc/text.html#2713

  26. #26
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    275
    Thanks
    6
    Thanked 23 Times in 16 Posts
    Although the thread title says "logistic mixing", there are different versions of 2-input mixers in the linear domain:

    http://encode.ru/threads/1154-Logistic-mixing

    update_ls minimizes (p-bit)^2
    update_log minimizes coding cost

    Code:
    #include <cstdint>
    
    // Assumed support definitions, not part of the original snippet:
    // probabilities are PBITS-bit fixed point, weights WBITS-bit fixed point.
    enum {PBITS=12, PSCALE=1<<PBITS, PSCALEm=PSCALE-1};
    enum {WBITS=16, WSCALE=1<<WBITS, WSCALEh=WSCALE>>1};
    static inline int clamp(int v,int lo,int hi) {return v<lo?lo:(v>hi?hi:v);}
    
    static struct tdiv32tbl { // 32-bit 1/i (i=1..PSCALEm) div-tbl
      tdiv32tbl() {for (int i=1;i<PSCALE;i++) tbl[i]=0xffffffff/(uint32_t)(i);}
      uint32_t& operator[](int i) {return tbl[i];}
      uint32_t tbl[PSCALE];
    } div32tbl;
    
    class LinearMix2 {
      public:
        int w,p1,p2,pm;
        LinearMix2() {Init(WSCALEh);}
        void Init(int iw) {w=iw;}
        int Predict(int _p1,int _p2)
        {
          p1=_p1;p2=_p2;
          pm = p1+idiv_signed((p2-p1)*w,WBITS);
          pm = clamp(pm,1,PSCALEm);
          return pm;
        }
        void update_ls(int bit,int rate)   // minimizes (p-bit)^2
        {
          int e=(bit<<PBITS)-pm;
          int d=idiv_signed((p2-p1)*e,PBITS);
          int wd=idiv_signed(rate*d,PBITS);
          w=clamp(w+wd,0,int(WSCALE));
        }
        void update_log(int bit,int rate)  // minimizes coding cost
        {
          int wd,s;
          if (bit) {s=(((p2-p1)<<PBITS)*uint64_t(div32tbl[pm]))>>32;}
          else {s=(((p1-p2)<<PBITS)*uint64_t(div32tbl[PSCALE-pm]))>>32;}
          wd=idiv_signed(rate*s,PBITS);
          w=clamp(w+wd,0,int(WSCALE));
        }
      protected:
        // round-to-nearest signed division by 2^s
        inline int idiv_signed(int val,int s) {return val<0?-(((-val)+(1<<(s-1)))>>s):(val+(1<<(s-1)))>>s;}
    };
    Last edited by Sebastian; 29th November 2011 at 23:43.

  27. #27
    Member
    Join Date
    Aug 2011
    Location
    Canada
    Posts
    113
    Thanks
    9
    Thanked 22 Times in 15 Posts
    Thanks for the link. I'm still working on fully understanding the math, but it looks a lot more effective than the linear mixer I made a while ago. I tested the mixer included with the version 1.5 code, using enwik9 as the input file and mixing the Indexed Bit Model Array and the Hashed Bit Model, both at order 26. It compressed to 266,062,709 bytes, taking 1.6 GB of memory and about 45 minutes. By the looks of it, the mixers described here will give much better ratios than the one I made.

    It also seems that the actual rangecoder component of RangeCoderC is fairly inefficient: the output bytes are biased toward 0x80 in a smooth curve. The upper byte of each integer could be encoded with a secondary coder, but that would probably be too slow. The linear mixer seems to amplify the bias, and would probably achieve much higher ratios if the bias were eliminated. I'm still working on fixing this, but any suggestions for improving the rangecoder component are much appreciated.

  28. #28
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    275
    Thanks
    6
    Thanked 23 Times in 16 Posts
    The math for the 2-input mixer is simple.

    Say you wish to minimize Q = sum(x_i) from i=1 to N, where

    x_i = (b_i - p_i)^2 = (e_i)^2, with N the number of bits, p_i the i-th prediction and b_i the i-th input bit; thus we are minimizing "least squares".

    The i-th prediction has the form

    p_i = (1-w_i)*p1_i + w_i*p2_i (a convex combination between the two models p1 and p2).

    One way to minimize Q is stochastic gradient descent, where the gradient of Q is approximated by a single example x_i.

    Taking the partial derivative of one example x_i with respect to w_i then yields

    w_(i+1) = w_i + alpha*(p2_i - p1_i)*e_i, where e_i = b_i - p_i is the error of the prediction,

    alpha is a constant learning rate (which absorbs the factor of 2 from the derivative), and w_(i+1) is the new weight for the next prediction.
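    In plain floating point, the derived update looks like this (a sketch; the clamp keeps the combination convex, and LinearMix2 above is the fixed-point equivalent):

    Code:
    double mix2(double p1, double p2, double w) {
        return (1.0 - w) * p1 + w * p2;      // convex combination
    }
    double update_ls(double w, double p1, double p2, int bit, double alpha) {
        double e = bit - mix2(p1, p2, w);    // e_i = b_i - p_i
        w += alpha * (p2 - p1) * e;          // w_(i+1) = w_i + alpha*(p2_i-p1_i)*e_i
        return w < 0.0 ? 0.0 : (w > 1.0 ? 1.0 : w);
    }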

    Notes:

    - Stacking 2-input mixers yields an order-M mixer, though not the one from my previous post (that weighting is not linear in its coefficients). Let p1,p2,...,pM be the predictions for the different orders, where M is the maximum order; then

    p = mix(p1,p2)
    p = mix(p,p3)
    ...
    p = mix(p,pM)

    is the desired weighting. You can expand the cascade and see its beautiful nature.

    - I am only using this approach ("update_ls") in my "experiments with orderN-CM" thread.

    - While in theory minimizing with respect to coding cost (the "update_log" in my code snippet) is better, you run into precision problems with 32-bit fixed point.

    - Most current CM coders use something like this but vary:
    * x_i, to minimize a different norm (coding cost for example)
    * p_i, the form of the prediction (the logit model in ZPAQ)

    - The logit model is better for compressible data, because you expect the model to commit to one direction, either "0" or "1".

    cheers
    Last edited by Sebastian; 30th November 2011 at 10:08.

  29. #29
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Yes, but you don't want to minimize the squared error; you want to minimize the coding cost. In the logit domain you end up with a neural network, but with a simpler update formula than back propagation. To minimize RMS error, the update rule would be:

    w_i := w_i + (x_i * rate * (desired - output) * output * (1-output))

    To minimize coding cost it is:

    w_i := w_i + (x_i * rate * (desired - output)).

    You can derive this by taking the partial derivative of the coding cost with respect to w_i to find the gradient. rate is around 0.001 to 0.01, with smaller values for stationary sources (uniform statistics) and higher values for mixed data types.
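    (Spelling out that derivative step: with output = squash(sum_i w_i*x_i), the coding cost is -ln(output) when desired = 1 and -ln(1-output) when desired = 0. Since squash'(z) = output*(1-output), the chain rule gives d(cost)/d(w_i) = -(desired - output)*x_i; the output*(1-output) factor cancels against the 1/output or 1/(1-output) from the logarithm, which is why the rule is simpler than back propagation.)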

    PAQ and ZPAQ also select weight vectors using a small context for better compression. In zpaq -m3, the mixer uses an order 1 context (16 bits).

  30. #30
    Member
    Join Date
    Oct 2010
    Location
    Germany
    Posts
    275
    Thanks
    6
    Thanked 23 Times in 16 Posts
    I already wrote this:

    While in theory minimizing with respect to coding cost (the "update_log" in my code snippet) is better, you run into precision problems with 32-bit fixed point.

    The difference is very small nonetheless:

    update_ls gets around 11.3 MB on SFC
    update_log gets around 11.5 MB

    My mixer contexts are usually also around 16 bits wide.
