
Thread: Minimal Ashford arithmetic-coder termination

  1. #1
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post

    Minimal Ashford arithmetic-coder termination

    I could use some help solving a minimal-termination problem I have with the Ashford-coder. The coder is based on the CACM-coder, but its underflow handling differs "significantly" (the 5Cent-coder is what you can reach with CACM). CACM-style coders can be terminated with at most 2 disambiguating bits:

    Code:
        /* Output two bits that select the quarter that
         * the current code range contains.
         */
        underflow += 1;
        clearUnderflow(!(low < firstRange));
    The number of bits flushed is "1 + outstanding" [clearUnderflow] + "1" [+= 1].
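
    For reference, a minimal sketch of how that flush looks in the classic Witten/Neal/Cleary (CACM) formulation; put_bit() is a hypothetical raw bit output, and low/firstRange/underflow are the names from the snippet above, not from the Ashford code:

    Code:
        /* Sketch of the CACM-style flush (done_encoding / bit_plus_follow),
         * assuming 'underflow' counts the pending opposite bits. */
        static void bit_plus_follow(int bit)
        {
          put_bit(bit);
          while (underflow > 0) {      /* flush the outstanding bits, inverted */
            put_bit(!bit);
            underflow -= 1;
          }
        }

        static void done_encoding(void)
        {
          underflow += 1;              /* the "+= 1" above */
          if (low < firstRange)        /* pick the quarter the range lies in */
            bit_plus_follow(0);        /* emits 0 followed by >= 1 one-bits  */
          else
            bit_plus_follow(1);        /* emits 1 followed by >= 1 zero-bits */
        }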

    In the Ashford-coder, clearUnderflow is always "1 + (outstanding - 1)", and there are more differences in the renormalization (like minRange being 0x4001). The interval decoding, though, is regular CACM-style:

    Code:
        target = (((((value - low) + 1) * total) - 1) / (range + 1))
    What I'd like to know, in terms of interval-fit, is the smallest possible number of bits that has to be put out in the CACM case. The DCC-coder has an extremely intuitive algorithm for determining the number of bits:

    Code:
        /* output the nbits integer bits (1, 2, 3) */
        for (nbits = 1; nbits <= regsze; nbits++) {
          roundup = (1 << (regsze - nbits)) - 1;
          bitfield = (low + roundup) >> (regsze - nbits);
          value = bitfield << (regsze - nbits);
    
          /* the value-interval (value + whatever bits)
           *   [value,value+roundup)
           * falls now always between the interval
           *   [low,low+range)
           * =>
           *  low <= value <= value+roundup <= low+range
           */
          if (((value + 0) >= (low + 0)) && ((value + roundup) <= (low + (range - 1))))
            break;
        }
    But that is a result of its specific implementation and doesn't apply to CACM-style coders. For the moment I haven't come far; maybe you have an idea. Charles just throws out 0x80.

    Ashford-coder:
    Attached Files
    Last edited by Ethatron; 14th January 2011 at 22:50.

  2. #2
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    While fiddling around with arithmetic coders in general I also noticed that it's possible to raise the precision of any coder implementation you may have (even carry-less ones), when and if you have a varying incoming "total". All coders, when the range-underflow occurs, either maintain an underflow-queue or set the range to a safe value. All coders (that I know) say range-underflow occurs when range goes below a static range magnitude (0x4000 or 0x8000 or whatever, which is the maximum "total" you can feed to the coder). When you build up real statistics by counting, "total" is not static and the underflow case won't really occur when it's assumed to. So instead of:

    1. range-contraction
    2. flush known bits
    3. raise range for underflow (against 0x4000)

    You do the underflow-check before range-contraction and you check against the current total:

    1. raise range for underflow (against total)
    2. range-contraction
    3. flush known bits

    This has two effects: in the general case (with or without an underflow-queue) you output fewer terminating bits (you allow range to go below the safety margin after the last interval has been coded, which is completely fine; just assume the next interval you want to code is [0,1] with total 1, then range only needs to be above 1 to code that symbol); in the case of carry-less coders, besides having the underflow case itself occur less frequently, you can set the forced range to a lower value. The famous Subbotin-coder becomes:

    ((ast.range = (-(ast.low)) % total), 1)

    instead of

    ((ast.range = (-(ast.low)) & (minRange - 1)), 1)

    The Subbotin-coder (as an example) has 850 bytes of waste on an example source; applying this while running a normal order-0 rescaling ((freq + 1) >> 1) model lowers the waste to approximately half, 450 bytes. If you use Charles' range-determination, it goes down to 260; if you consider that you also have a shift after setting range (range can be >> 8 smaller), you pretty much end up with just 15 bytes of waste (on a 10 MB file, vs. the DCC-coder). One of Eugene's carryless range coders already takes the last "trick" into account (setting 0xFF, then <<= 8).
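
    To make that concrete, here is a sketch (not Subbotin's exact code; encodeRenorm() and the state variables are placeholders) of the reordered loop, with the underflow check moved in front of the contraction and tested against the current total (the "delayed underflow" variant measured in the next post):

    Code:
        /* Sketch of the "delayed underflow" ordering, assuming a Subbotin-style
         * carry-less coder with 32-bit coder state (low, range) and byte-wise
         * renormalization. cumLow/cumHigh/total describe the next symbol. */
        void encodeSymbol(uint32_t cumLow, uint32_t cumHigh, uint32_t total)
        {
          /* 1. raise range for underflow, but only when the *next* interval
           *    really doesn't fit anymore (range < total instead of < 0x10000) */
          if (range < total)
            range = 0x10000 - (low & 0xFFFF);   /* carry-less reset, recycling low */

          /* 2. range contraction */
          uint32_t r = range / total;
          low  += r * cumLow;
          range = r * (cumHigh - cumLow);

          /* 3. flush known bits */
          encodeRenorm();
        }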

    I just want to note we're not talking about percent gains here; this is counting the gained bits on two hands.
    But as I'm trying to find minimum-waste coders (resetting the range-coder every 256 intervals does lead to enormous waste), it is quite interesting that this goes unmentioned in the entire literature. It's not even in the Carpinelli/Neal-coder (aka the DCC-coder), and that one tries not to waste even a single bit.

  3. #3
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Ethatron View Post
    The famous Subbotin-coder becomes:

    ((ast.range = (-(ast.low)) % total), 1)

    instead of

    ((ast.range = (-(ast.low)) & (minRange - 1)), 1)
    Ah, well, that doesn't work out that easily (e.g. "total - (low % total)"). I think there should be a valid solution adjusting low in addition (e.g. "low += 0x10000 - total" works but is counter-productive). It's a bit complex.
    Anyway, here are measurements of the verified combinations:

    enwik7 with order-0/-1 fenwick-tree and the Subbotin-coder:

    1412 bytes waste with regular coding "if (range < 0x10000) range = 0x10000 - (low & 0xFFFF)"
    0570 bytes waste with Charles' coding "if (range < 0x10000) range = maximum(), low = low - (low & 0xFFFF) + (0x10000 - range)"
    0340 bytes waste with delayed underflow coding "if (range < total) range = 0x10000 - (low & 0xFFFF)"
    0315 bytes waste with delayed underflow and Charles' coding "if (range < total) range = maximum(), low = low - (low & 0xFFFF) + (0x10000 - range)"
    ~272 bytes waste with an expected working total-adjustment of range
    Last edited by Ethatron; 12th January 2011 at 00:07.

  4. #4
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    1. Please try making your posts more comprehensible, or are you really talking to yourself?
    2. In theory full arithmetic coders (with variable "total" and divisions) are more interesting,
    but many optimizations only apply to bitwise coders with 2^n fixed total, and that's what
    most modern compressors use, eg. lzma, bsc.
    3. My best coder optimized for short strings is in http://ctxmodel.net/files/guess/bit_guess_v1.rar
    It implies that all the bytes after EOF are 0xFF, and is thus able to cut off any trailing FFs from the code.
    Imho its quite obvious how to implement it (although not simple at all) - the last "code" value in the decoder
    has to be in the last symbol's interval, so we have to put enough bits for that into the stream.
    To that end, it may be important to ensure that the most probable symbol takes the predefined outermost interval
    ([0..freq) or [1-freq;1)), so that it would be easier to assign shorter codes.
    Although with a non-bitwise alphabet, I suspect, it might be best to reorder the symbols
    by Huffman code.
    But EOF, if it's available, can be interpreted as any fixed bitstring and provide additional savings.
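
    The decoder-side half of that trick can be as small as this (a sketch; in_ptr/in_end are assumptions for whatever input buffer the coder reads from):

    Code:
        /* Once the stored code runs out, pretend the remaining bytes are 0xFF,
         * so the encoder is free to drop trailing FFs from the code. */
        static unsigned get_code_byte(void)
        {
          return (in_ptr < in_end) ? *in_ptr++ : 0xFF;
        }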

  5. #5
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    About arithmetic coders

    Quote Originally Posted by Ethatron View Post
    Ah, well, that doesn't work out that easily (e.g. "total - (low % total)"). I think there should be a valid solution adjusting low in addition (e.g. "low += 0x10000 - total" works but is counter-productive). It's a bit complex.
    Anyway, here are measurements of the verified combinations:

    enwik7 with order-0/-1 fenwick-tree and the Subbotin-coder:
    I am curious as far as general arithmetic file coders go. It's really hard to beat
    the many nonstationary coders that are commonly used, since they are
    designed and optimized using what is hopefully typical data, something
    I am not sure you are concerned with. However, if you want to check
    out an optimal arithmetic coder, I suggest you try arb255; it will compress
    any permutation of a file to some fixed number of bytes. Due to
    the bijective nature of things it will either compress to a given n for all
    permutations, or to n or n+1 bytes for other files when testing all
    permutations. The arithmetic coders that are nonstationary will generally
    compress real, useful, targeted files smaller, but permutations of the
    file will not compress as small.

  6. #6
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Shelwien View Post
    1. Please try making your posts more comprehensible, or are you really talking to yourself?
    You may read "Bounds on Redundancy in Constrained Delay Arithmetic Coding" or "The Delay-Redundancy Tradeoff in Lossless Source Coding". Otherwise there is no specific in-depth material about how to write the least amount of disambiguating bits for generic arithmetic-encoder termination. It's apparently a not much attended topic, BTW; or how else do you explain that your coders mostly terminate with five shiftCarry? There is not a single range-coder implementation out there with a decent termination (well, I gave the algorithm in the first post, so now one could terminate e.g. Subbotin's coder decently).

    Quote Originally Posted by Shelwien View Post
    2. In theory full arithmetic coders (with variable "total" and divisions) are more interesting
    Ah, well .... I'm actually interested in minimum arithmetic-coder termination; you can terminate shift/add-coders as well, I think ...
    It may be that nobody gives a damn about the Ashford-coder in particular, but I know Scott has some love for the termination thing.

    And here is a little jewel showing that you don't need a pow2 total (the carry-less BIAC cross-over), just as you don't need a pow2 total for Barney's coder ... still, they are pretty cracked down; don't think you need a pow2 total.

  7. #7
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by biject.bwts View Post
    I am curious as far as general arithmetic file coders go. It's really hard to beat
    the many nonstationary coders that are commonly used, since they are
    designed and optimized using what is hopefully typical data, something
    I am not sure you are concerned with.
    You're right, I'm not concerned with any model, file, distribution or anything; I'm just looking for some exactness which can be applied to the arithmetic coders; not this wishy-washy "delete ari; I'm done" stuff. Not even the Guazzo/F0-coder has a correct termination. Of the non-stone-age coders, apparently only some highly industrial coders (like CABAC, Q) even bother going through the theory. There is no explanation of why the CACM-coders can terminate with 2 disambiguating bits, at least not as an iterative process in terms of interval-fit, like what Radford Neal did with the DCC-coder.

    Quote Originally Posted by biject.bwts View Post
    However, if you want to check
    out an optimal arithmetic coder, I suggest you try arb255; it will compress
    any permutation of a file to some fixed number of bytes. Due to
    the bijective nature of things it will either compress to a given n for all
    permutations, or to n or n+1 bytes for other files when testing all
    permutations. The arithmetic coders that are nonstationary will generally
    compress real, useful, targeted files smaller, but permutations of the
    file will not compress as small.
    I'm not sure I can find an explanation of how you terminate your coder, but I can take a look.

  8. #8
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Quote Originally Posted by Ethatron View Post
    You're right, I'm not concerned with any model, file, distribution or anything; I'm just looking for some exactness which can be applied to the arithmetic coders; not this wishy-washy "delete ari; I'm done" stuff. Not even the Guazzo/F0-coder has a correct termination. Of the non-stone-age coders, apparently only some highly industrial coders (like CABAC, Q) even bother going through the theory. There is no explanation of why the CACM-coders can terminate with 2 disambiguating bits, at least not as an iterative process in terms of interval-fit, like what Radford Neal did with the DCC-coder.



    I'm not sure I can find an explanation of how you terminate your coder, but I can take a look.

    I thought I wrote a lot about how I did it. However, I am noted for a very poor
    choice of words. There are several ways to do it. Actually, the problem mentioned
    in one of the papers you quoted for a 3-symbol code is really no problem at all
    when you use the carry methods. My coder arb255 is very close to Matt
    Timmermans' coder, except it's slightly different. I actually started with a very
    short high and low, like 3 bits each, and went through every termination
    possibility. Look at

    http://bijective.dogma.net/compres11.htm

    Once I got that working I extended it to longer high and low;
    at some point no more cases appeared. I went to 64 bits, however
    if I do it again I will use 128-bit registers, since Shelwien
    showed me how MinGW 64 can have them. Of course it
    will not compress smaller and will run slower, but I don't
    care.

    I hope you can follow the code itself; I tried to really not follow the
    course that most would take. For example, everything is actually only
    binary coders. Sometimes the one symbol occupies the top part of the
    interval, sometimes not. Sometimes the most probable symbol occupies
    the top part of the interval, sometimes not. I really don't like the way
    others did it. I have had several variations over the years; I code them,
    play with them and then lose them. Some are simple changes like a KT
    update or Laplace or some other weird way. I also made forms where each
    symbol has the same weight, but so what. There is also my bit-IO stuff;
    I have lost track of the versions over the years, but it works. I worried
    about regular files blowing up on uncompress, so I have some form of
    stability output transform to prevent this.

    But I would be happy to have someone who can actually write take a look.
    The code is a product of my views. Shelwien called me a bijective maniac
    at one time on the web; maybe so. I think I am a frustrated crypto person
    who realizes the future of crypto is very much related to compression,
    especially bijective compression. And since it appears Americans no longer
    have the freedom to express their views on encryption in code, compression
    is about as close as one can come. I hope someday that America can be free
    again so that it will once again shine. When that happens I will write more
    crypto. But until that time I think I will play with obscure bijective
    compression methods.

  9. #9
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    While poring over Andrew Polar's carry-less range-coder design, which is an extrapolation of the Guazzo/F0-coder, I became aware of a very good way to explain what a carry-less coder is and why the topic of minimum coder termination is about the same thing.

    Here is the basic coder-loop:

    Code:
        range = high - low;
    
        /* We're doing limited precision arithmetic until ... */
        high = (range * ivl->high) / ivl->total + 0;
        low += (range * ivl->low ) / ivl->total + 1;
    
        /* ... we don't have enough range to code the next symbol. */
        if ((high - low) < minRange)
          /* Therefore we make the coder reset by ... */
          high = low;
    
        /* ... executing the loop until low and high are different on enough digits. */
        while ((low ^ high) < topRange) {
          consumeHiSample(low);
    
          /* expand high with trailing 1s */
          high = (high << 8) | 0xFF;
          low  = (low  << 8) | 0x00;
        }
    What happens here is that in case we run out of precision, a straight reset of the coder is issued. Let's imagine our coder's initial state is low=0, high=FFFFFFFF, minRange=0000FFFF, and we encounter the precision underflow on something like low=0000F000, high=00010000, not enough for the upcoming interval; the code sets low and high equal, which forces the loop to shift the entire "low" register out, and FFs into the "high" register; voila: low=0, high=FFFFFFFF.

    In effect I could also write this:

    Code:
        range = high - low;
    
        /* We're doing limited precision arithmetic until ... */
        high = (range * ivl->high) / ivl->total + 0;
        low += (range * ivl->low ) / ivl->total + 1;
    
        /* ... we don't have enough range to code the next symbol. */
        if ((high - low) < minRange) {
          /* Therefore we make the coder reset by first fullstop() ... */
          consumeAllHiSamples(low);    // 4 bytes
    
          /* expand high with trailing 1s */
          low  = (low  << 4 * 8) | 0x00000000;
          high = (high << 4 * 8) | 0xFFFFFFFF;
    
          /* ... then start() */
          low  = 0x00000000;
          high = 0xFFFFFFFF;
        }
        else {
          while ((low ^ high) < topRange) {
            consumeHiSample(low);
    
            /* expand high with trailing 1s */
            low  = (low  << 8) | 0x00;
            high = (high << 8) | 0xFF;
          }
        }
    Now what becomes apparent is that if one could find a less wasteful way to do a stop(), one could make a more efficient carry-less coder. The task is to find a valid "low" on reset which still enables the decoder to uniquely decode the previously coded interval (emitting "low" entirely is always the safe bet), but which is as short as possible (to have less waste):

    Code:
          /* Therefore we make the coder reset by first minimalstop() ... */
          X = consumeAsMuchSamplesAsNecessary(low);    // 0-4 bytes
    
          /* expand high with trailing 1s */
          low  = (low  << X * 8) | (0x00 + 1 << X * 8 - 1);
          high = (high << X * 8) | (0xFF + 1 << X * 8 - 1);
    We then have the possibility to search for a specific "low" in which the maximum amount of trailing digits is "irrelevant"; they can be 0 or 1 or whatever, and it would still enable the decoder to decode the interval. But as we currently have the possibility to reset the coder to any (!) new state we want, we could also utilize a specific "low":

    Code:
        while ((high - low) < minRange) {
          /* Therefore we make the coder reset by first astop() ... */
          consumeHiSample(low);    // 1 byte
    
          /* expand high with trailing 1s */
          low  = (low  << 8) | 0x00;
          high = (high << 8) | 0xFF;
    
          /* ... then restart() */
          low    = 0x00000000 + low;
          range = 0xFFFFFFFF - low;
          high   = low + range;
        }
    What this code does is recycle the previous "low" (which we know will always allow the decoder to uniquely decode the next interval), which of course requires us to readjust our high accordingly (high, i.e. low + range, must stay at or below the maximum value the precision allows: 0xFFFFFFFF in the example).

    And that is actually exactly what this line in the Shkarin/Subbotin implementation means:

    Code:
          range = ((-low) & (RANGE_BOT - 1)) << 8;
    It means "we reset the range-coder with some small amount of digits, taking into acount that we can recycle the previous low, adjusting just high".

    Now it must have become utterly clear that once you find a sound and effective way to stop an arithmetic-coder design, you are also inherently able to make it an effective carry-less implementation; because carry-less/underflow-free is nothing other than resetting your coder once it can no longer function.

    Okay, I just wanted to draw out this relationship; maybe it's something new for you, maybe it's just a slightly nicer fairy-tale explanation of what's happening there; I also wanted to make clear I'm not nuts.

    Addendum, advanced modifications:

    The mechanism above operates on multiples of 8 bits, and makes no attempt to figure out the exact position in the "low" register where the rest of the digits is "irrelevant" (makes no difference for the decoder). This means the Subbotin-coder can be made more efficient by finding the cutoff point in "low" and setting the irrelevant bits of "low" to zero, which in effect makes the renewed "range" bigger, which delays the next point of reset a little bit longer:

    Code:
        if ((high - low) < minRange) {
          low = low & (0xFFFFFFFF << (32 - minimumHiDigits(low)));
          high = 0xFFFFFFFF - low;
        }
    
        while ((high - low) < minRange)
          ...
    And that is then pretty much what Charles (not Ashford; Bloom, that is) is doing when he takes the choice between adjusting "low" (we can do that, we're currently resetting the coder!) or rather "high", whichever of the two choices leaves us with the bigger range ("high - low"), always considering that we have to make sure that our chosen "low" still allows the decoder to decode the previous interval.
    Last edited by Ethatron; 13th January 2011 at 20:34.

  10. #10
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > I also wanted to make clear I'm not nuts.

    Frankly, you're still barely comprehensible.
    Fortunately you don't seem to have problems with english at least.
    Please understand the fact that you're not some VIP,
    so people won't try to decrypt your messages
    (which don't make any sense at all out of context), at least
    unless there's something interesting in the obvious parts.
    There's a good reason for all these "abstracts", "introductions"
    and "conclusions" in academic papers.
    Actually I only paid attention because there're related
    (to "delays" and "termination") issues in my vectorized rc.
    But I still don't understand why you need a carryless rc,
    or why somebody would be concerned about "delays" with
    a single rc instance.

    > How do you explain your coders mostly terminate with five shiftCarry?

    Not all, for example the one which I linked in my previous post doesn't.

    And there're reasons for simple flush and init, even if they produce
    5 bytes of redundancy per file - its easy to prove that there're no bugs
    in these parts, and its fast and easy to maintain.
    ("fast" mainly applies to extra zero byte at start of stream, "maintain"
    implies reading/understanding, porting to various projects etc).
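
    For comparison, the "simple flush" defended here is essentially just pushing the whole low register out, one renorm step per byte; a sketch, not any particular coder's code:

    Code:
        /* Dumb but obviously correct flush: emit all of low, 4 bytes for a
         * 32-bit low (5 with a carry-cache byte in front of it). */
        void flush(void)
        {
          int i;
          for (i = 0; i < 4; i++) {
            *ptr++ = low >> 24;
            low <<= 8;
          }
        }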

    > very good way to explain what a carry-less coder is

    Actually there's no need to explain what's a carryless rangecoder,
    because its name tells enough already - its a rangecoder that
    doesn't produce a carry.

    > The task is to find a valid "low" on reset which still enables the
    > decoder to uniquely decode the previously coded interval ("low"
    > entirely is always the safe bet), but which is as short as possible
    > (to have less waste):

    Actually this is where range cutting to avoid carry and "coder termination"
    are different - in range cutting we need to preserve maximum possible range
    such that low's MSBs won't be affected by carry, while in coder flush we
    need to find the subinterval [x..x+2^y) in the [low..low+range) interval,
    such that y is maximal, and x is aligned. For example:

    Code:
    for( y=0; y<32; y++ ) { 
      uint mask = (1<<y)-1;
      uint l = (low+range-1 - mask) & (~mask);
      uint h = l | mask;
      if( (l<low) || (h>low+range) ) break;
    }
    Note that I don't normally use "termination", because coder flush can
    be used for multiple purposes and doesn't really mean the end of data.

    Sure, carryless rangecoder can be implemented using a flush, but the
    only benefit in that known to me is the possibility to somewhat simplify
    the decoder, comparing to range-cutting carryless.

    And a flush-based carryless with precise shortest-possible flush just
    doesn't make sense because its (relatively) complex and decoder would
    have to resync to input stream, which is plain awful (ie slow).

    > line in the Shkarin/Subbotin implementation

    It's purely Subbotin's:
    This is the original post (google didn't save it for some reason):
    http://compression.ru/fido/ru.compress.0027.htm#26
    And this is the updated version:
    http://groups.google.com/group/fido7...3e331f61748c2b

    > Now it must have become utterly clear, that once you find a sound
    > and effective way to stop an arithmetic-coder design, you are also
    > inherently able to make it an effective carry-less implementation

    flush-based carryless is more redundant than range-cutting carryless
    by definition - because range-cutting tries to maximize the range
    and flush doesn't.

    > And that is then pretty much what Charles is doing when he takes the
    > choice between adjusting "low" or better "high"

    1. Actually Bloom's version is incomplete. It doesn't handle the
    low+range overflow anywhere, which means that another branch is
    necessary. And actually even Subbotin's version, which is much
    simpler than Bloom's, still has slower decoding than full rangecoder.

    Using stats from http://compression.ru/sh/coders6a.rar (2001):
    Code:
                   C.Size   C-Time   D-Time
    Subbotin   10,682,917    5.55     6.92  // Subbotin's rangecoder
    Subb-LB    10,682,917    5.55     6.97  // L.Broukhis mod which adds a cache byte
    CL-R       10,680,668    5.55     7.08  // my mod with 64-bit low
    CL-Rf      10,680,351    5.44     6.70  // my flush-based carryless
    Shcoder    10,680,642    6.97     8.34  // my mod of Schindler's rangecoder
    Shindlet   10,680,348    5.77     6.48  // -//- with 32-bit multiplication and 64-bit low
    So 6% faster encoding with 3.5% slower decoding is the best result for carryless,
    but we don't need faster encoding with slower decoding.

    2. As you can see in CLR vs Subbotin's, its really easy to improve its precision
    by extending low to 64 bits, so complicated handling with multiple branches is unnecessary.
    And btw, extra operations in rc are really bad even if they're not really executed at all -
    lzma has 15 rc calls per loop iteration, now imagine what happens if all of these are inlined.

    3. low in CLRF is 64-bit too, but it only does flushes if low>>32==0xFFFFFFFF, which
    is normally very rare. Also it appeared possible to implement decoder with 32-bit
    "code" and w/o low (almost like rc with carries), but it still has to track FFs,
    which is inevitably slower.

  11. #11
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Shelwien View Post
    > I also wanted to make clear I'm not nuts.

    Frankly, you're still barely comprehensible.
    Fortunately you don't seem to have problems with english at least.
    Please understand the fact that you're not some VIP,
    so people won't try to decrypt your messages
    (which don't make any sense at all out of context) ...
    What's the point of saying this?

  12. #12
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Subbotin's coder is instructive. You can see two places where you lose a little coding space. One is due to division roundoff discarding a little of the top of the range in:

    low+= cum_freq*(range/=tot_freq);

    To minimize this, you want tot_freq to be small, but this reduces the possible precision of representing probabilities. range is kept in the range 2^24 to 2^32, so tot_freq should probably not be bigger than about 2^16, which means 8 bits for symbol frequencies for low redundancy data. Probability errors of e add redundancy of e^2 on average, so this is only a problem when coding highly redundant data with probabilities near 1 and the relative error is large due to quantization. If tot_freq is 64K then the highest probability you could have is 255/256 leaving space for the rest of the alphabet. This wastes 1 bit every 256 bytes.

    The other inefficiency is in normalization. If the range falls below 16 bits then the range is split on a 64K boundary and the top half is thrown away. This costs 1 bit on average. The range will be 24 to 32 bit, so 24 bits about 1/8 of the time. If the input has low redundancy then each byte will drop 8 bits from the range, so this range splitting can be fairly common.

    You can avoid these problems with 64-bit arithmetic or by using a bit model like PAQ instead of a byte model like PPM. In a bit model the minimum range is 1 and no range ever has to be discarded. Such a small range is rare (probability 2^-32) and only results in the bit probability being rounded to 1/2, so it costs at most 1 bit.
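
    A minimal sketch of one such bit-model coding step (PAQ-like; p is P(bit=1) in 1/4096 units, x1/x2 are the interval bounds, put_byte() is assumed):

    Code:
        /* One encode step of a binary arithmetic coder, PAQ-style sketch. */
        void encode_bit(int bit, unsigned p /* 0..4095 */)
        {
          unsigned xmid = x1 + ((x2 - x1) >> 12) * p;
          if (bit) x2 = xmid; else x1 = xmid + 1;

          /* renormalize: emit bytes while the leading byte is settled */
          while (((x1 ^ x2) & 0xFF000000) == 0) {
            put_byte(x2 >> 24);
            x1 <<= 8;
            x2 = (x2 << 8) | 0xFF;
          }
        }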

    The disadvantage of a bit model is you need 8 coding operations per byte. But byte level modeling has the same complexity when you consider calculating the cumulative and total frequencies using trees. (Using the obvious linear operations would be slower).
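
    The trees Matt mentions are typically Fenwick (binary indexed) trees; a minimal sketch for cumulative byte frequencies (names are hypothetical, not from any coder in this thread). Both operations take O(log 256) = 8 steps instead of a linear scan over the alphabet:

    Code:
        enum { N = 256 };
        static unsigned tree[N + 1];          /* 1-based Fenwick tree */

        void freq_add(int sym, int delta)     /* freq[sym] += delta */
        {
          int i;
          for (i = sym + 1; i <= N; i += i & -i)
            tree[i] += delta;
        }

        unsigned cum_freq(int sym)            /* sum of freq[0 .. sym-1] */
        {
          unsigned sum = 0;
          int i;
          for (i = sym; i > 0; i -= i & -i)
            sum += tree[i];
          return sum;
        }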

  13. #13
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > What's the point of saying this?

    That's how your attitude looks to me, sorry if you didn't want to know.

    Anyway, if you want somebody to really read your posts, you should consider making them readable,
    by adding enough information so that people won't have to look up stuff in other places to understand what you're trying to say.

    As a good example, now we have Matt's post here - imho its pretty clear that Matt didn't get what you're talking about,
    except for a few keywords.
    And actually I doubt that I got 100% either, because you never explained your context/environment details, like what you're
    trying to do and why known solutions don't work.

  14. #14
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Matt Mahoney View Post
    Subbotin's coder is instructive ...
    Thanks, Matt. I know all that. (no sarcasm)

    Quote Originally Posted by Matt Mahoney View Post
    If the range falls below 16 bits then the range is split on a 64K boundary and the top half is thrown away.
    Which means you agree on the terms "coder termination" and "range/coder reset" for that operation? What you describe is resetting the coder/range. One could also initialize the Subbotin-coder at the very beginning with [0,0x0000FFFF] or [0,0xFFFF0000] if one prefers symmetry.

    ----------------------

    But back to the Ashford coder. I have a guess why the regular CACM procedure doesn't work. Charles does a smart thing and adds the low-remainder to the range:

    Code:
        t = ivl->low * (range + 1);
        x = (t / total);
        y = (t % total);
        low += x;

        t = (high - low) * (range + 1);
        t += y;
        range = (t / ivl->total) - 1;
    I believe he cannot terminate with a code between [low,low+range]. On the decoder side:

    Code:
        target = (((((code - low) + 1) * total) - 1) / (range + 1));
    range will be slightly "too big", and one should choose some smaller code between [low-unknown, low-unknown+range]. Empirically, all failing decodings are one too small, as expected. But this is really complicated; having no real access to the previous (last) range, I don't think it can be calculated. What do you think?
    Last edited by Ethatron; 14th January 2011 at 22:52.

  15. #15
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by Shelwien View Post
    > very good way to explain what a carry-less coder is

    Actually there's no need to explain what's a carryless rangecoder,
    because its name tells enough already - its a rangecoder that
    doesn't produce a carry.
    Well, it's not that simple. Arithmetic coders have precision-underflow problems because the calculation is finite. There is no single universal way to solve that; whether you call it carry-over, underflow-queue, virtual queueing or bit-stuffing just means that you're talking about a specific school/implementation of handling the problem (and Howard correctly says in his paper that they are all sides of the same thing). The carry idea comes specifically from Rubin (if I'm not mistaken) and its renormalization approach, and you could not make just any implementation carry-free, because in some algorithms no carry occurs. That's why it's sometimes ugly to name all of them "carry-free/underflow-free/queueless", or why one sometimes just says "carry-free" but means all of them.
    They are all sides of the same thing; they all try to solve the precision-underflow problem. You can solve that with the specific ideas (underflow-queue, bit-stuffing, virtual queue, there is even the possibility to not use low==0) or you just reset the coder (see below what specifically I mean by termination). I didn't say much more than that.

    Quote Originally Posted by Shelwien View Post
    > The task is to find a valid "low" on reset which still enables the
    > decoder to uniquely decode the previously coded interval ("low"
    > entirely is always the safe bet), but which is as short as possible
    > (to have less waste):

    Actually this is where range cutting to avoid carry and "coder termination"
    are different - in range cutting we need to preserve maximum possible range ...
    I know what you mean, but it's not very well expressed! (I don't want to mock you, just to hint that difficulty of understanding and difficulty of expression are a mutual task/failure, right?)

    Quote Originally Posted by Shelwien View Post
    ... such that low's MSBs won't be affected by carry, while in coder flush we
    need to find the subinterval [x..x+2^y) in the [low..low+range) interval,
    such that y is maximal, and x is aligned. For example:

    Code:
    for( y=0; y<32; y++ ) {
      uint mask = (1<<y)-1;
      uint l = (low+range-1 - mask) & (~mask);
      uint h = l | mask;
      if( (l<low) || (h>low+range) ) break;
    }
    Look into the first post ... you just repeated what I pasted there (I have now added the code-tags, much better I hope; sorry for that). You only need to be aware that that is a function specific to an implementation; it won't work with the Ashford-coder, for example. It also does not "explain" how it interacts with pending underflow bits. (The paper is "Arithmetic Coding Revisited".)

    Quote Originally Posted by Shelwien View Post
    Note that I don't normally use "termination", because coder flush can
    be used for multiple purposes and doesn't really mean the end of data.
    Well, I use the word "termination" to say that I "terminate" the current consecutive, dependent interval-state the coder is actually in. After "termination" the next succession of interval-states the coder is in has no relationship with the intervals from before the "termination" point. There is no norm for coder initialization; I can define any initialization interval which is just a valid, big enough interval; I can initialize it with random (deterministic) big enough intervals if I want to. The sole action of setting the interval to some uncorrelated value is re-initialization.

    Quote Originally Posted by Shelwien View Post
    Sure, carryless rangecoder can be implemented using a flush, but the
    only benefit in that known to me is the possibility to somewhat simplify
    the decoder, comparing to range-cutting carryless.

    And a flush-based carryless with precise shortest-possible flush just
    doesn't make sense because its (relatively) complex and decoder would
    have to resync to input stream, which is plain awful (ie slow).
    Think outside your box ... I get none (!) of the disadvantages you quote: it's not less efficient, it's not slower, and it's not really complex either. But that's not the point; you have your constraints, I have mine, and others have theirs; you maybe want 50 lines of code, I don't mind 40k lines of code. There is no universal truth about "best" or even "simple" here (e.g. by not handling it nicely in the coder I may complicate the system occupying the coder beyond sanity).

    Quote Originally Posted by Shelwien View Post
    flush-based carryless is more redundant than range-cutting carryless
    by definition - because range-cutting tries to maximize the range
    and flush doesn't.
    It's the same thing, man. Why don't we agree that we disagree? I can adjust to your terminology and stop talking about "termination"; it's just that the sentences are going to be longer (case this, case that, including those and there) ...

    Quote Originally Posted by Shelwien View Post
    > And that is then pretty much what Charles is doing when he takes the
    > choice between adjusting "low" or better "high"

    1. Actually Bloom's version is incomplete. It doesn't handle the
    low+range overflow anywhere, which means that another branch is
    necessary.
    No, because it's implicit. Think about it.
    Last edited by Ethatron; 15th January 2011 at 01:17.

  16. #16
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > No, because it's implicit. Think about it.

    Ok, what am I doing wrong?

    Code:
    low = 0xFFFFFFFF;
    range = 0x0000FFFF
    
    while( range<(1<<16) ) { // true
        
      int above = ( low + range) & 0xFFFF; // (0xFFFFFFFF+0x0000FFFF)&0xFFFF = 0xFFFE
      int below = (~low) & 0xFFFF; // (~0xFFFFFFFF)&0xFFFF = 0x0000
      
      if( above > below ) { // true
        low += range - above; // 0xFFFFFFFF + 0xFFFF-0xFFFE = 0x00000000 !!! OVERFLOW !!!
        range = above; // 0xFFFE
      } else {
        range = below;
      }
      
      *ptr++ = low>>(RANGE_SIZE-8);
      range <<= 8;
      low <<= 8;
    }

  17. #17
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    The sum of low and range must be less than or equal to 0xFFFFFFFF. The interval is [0,1) or [low,low+range) or [0xFFFF0000,0xFFFFFFFF]. If you violate that, it won't work. Do you have 0x200000000 as the interval boundary?
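
    In code the invariant could be checked like this (a sketch; the subtracted form avoids the 32-bit overflow that a naive low + range would hit):

    Code:
        /* low + range <= 0xFFFFFFFF, written so the check itself cannot overflow */
        assert(range <= 0xFFFFFFFFu - low);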

  18. #18
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Subbotin's coder: http://codepad.org/MEL3YReu

    Well, ok, I guess its a misunderstanding again.
    I implied that Bloom's trick would be used kinda like that - http://codepad.org/TMZrdI8C
    But actually that just doesn't work - without ((low^low+range)>=TOP) that "above" can
    be larger than range etc, so its plain incorrect.

    So I guess its supposed to look like http://codepad.org/Uspvykm9
    In which case low+range<2^32 is maintained alright, but it doesn't look like an improvement at all.

  19. #19
    Member
    Join Date
    Apr 2010
    Location
    El Salvador
    Posts
    43
    Thanks
    0
    Thanked 1 Time in 1 Post
    You have to put it into the decoder renorm too ...
    Take this version, it's nicer than Subbotin's own version (out of MRP); you'll also immediately see where Charles' code fits:
    Attached Files
    • File Type: rar rc.rar (905 Bytes, 286 views)
    Last edited by Ethatron; 15th January 2011 at 15:44.

