Thread: Counting 1 or 0?

  1. #1
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I have noticed that some programmers use "zero" for the probability calculation (e.g. Igor Pavlov, Shelwien and me) while others use "one" (e.g. Matt Mahoney). From my point of view it's only a design decision. But what are the facts under the hood? Which choice does information theory suggest? Which one do you use, and why?
    BIT Archiver homepage: www.osmanturan.com

  2. #2
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Here's my opinion:

    The coding cost ld(1/p) of an event with probability p approaching 0 approaches infinity, so coding such an event would require an infinite number of bits (ignoring that a division by zero isn't defined).

    But how bad is "infinity" in practice? The worst output that can actually happen is a complete flush of your arithmetic coding registers when they become equal (hi = lo, i.e. range = 0). Implementations normally use 32-bit registers, so "infinity" is effectively 32 bits.

    Maybe there's something else caused by rounding or by symmetry in the arithmetic code space - I never bothered to check...
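    The numbers above are easy to check: the ideal coding cost ld(1/p) grows without bound as p shrinks, but a 32-bit coder caps what a single event can actually cost at a full register flush. A minimal sketch (function names are mine, not from any coder discussed here):

    ```python
    import math

    def coding_cost_bits(p):
        """Ideal cost in bits of coding an event with probability p: ld(1/p)."""
        return math.log2(1.0 / p)

    # The cost grows without bound as p approaches 0 ...
    costs = [coding_cost_bits(2.0 ** -k) for k in (1, 8, 31, 40)]

    # ... but a coder with 32-bit registers can emit at most ~32 bits
    # for a single event before its state is completely flushed.
    REGISTER_BITS = 32
    effective = [min(c, REGISTER_BITS) for c in costs]
    ```

    So a p of 2^-40 would ideally cost 40 bits, but the worst case in a real 32-bit implementation stays at 32.
    
    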
    M1, CMM and other resources - http://sites.google.com/site/toffer86/ or toffer.tk

  3. #3
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks! But my question is about modelling the data, not the coding itself. I mean: if I have an entropy coder which expects P(0) and I want to use P(1) instead, I will implement my model based on P(1) and implement my coder to encode (1 - Symbol). In this explanation, assume that Symbol is the bit which is actually coded.

    On the other hand, we can extend my question based on your answer: what about the entropy coder side?
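    The trick described above - reusing a P(0)-based coder for a P(1)-based model by handing it P(1) and coding (1 - Symbol) - costs nothing in theory. A small sanity check (function names hypothetical):

    ```python
    import math

    def cost_p0_coder(p0, bit):
        """Ideal code length of a coder parameterised by P(0)."""
        return -math.log2(p0 if bit == 0 else 1.0 - p0)

    def cost_via_p1_model(p1, bit):
        """P(1)-based model reusing the same coder: pass P(1) and code 1-Symbol."""
        return cost_p0_coder(p1, 1 - bit)

    # Coding bit 1 with P(1)=0.8 costs -log2(0.8) either way:
    assert cost_via_p1_model(0.8, 1) == -math.log2(0.8)
    ```

    The ideal code length is identical in both views; any real difference comes only from finite-precision implementation details.
    
    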
    BIT Archiver homepage: www.osmanturan.com

  4. #4
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    You are talking about binary arithmetic coding (modelling data bit by bit), right? In that case the above applies too.

    You have to consider that your model should output "stuff" - here, probabilities - which your coder can handle. So modelling is tied to your entropy coder; that's why I answered your question in terms of coding.

    With arithmetic coding this means avoiding an overflow where hi < lo. This might happen if P(1) - P(0) >= S, where the probability range [0...1) is mapped to integers [0...S-1). So when modelling P(1) and coding P(0), make sure the above can't happen.

    But again, this is implementation specific...

    Maybe some of the experts here can give you a better answer.

    BTW: my CM coders always code an EOF flag (either 0 or 1) preceding the next symbol, with the probability of EOF=1 set to zero (so an actual EOF flushes the arithmetic state, outputting 32 bits). Previously I coded and modelled P(0); now I use P(1), since bit models can be implemented in a simpler manner.
    M1, CMM and other resources - http://sites.google.com/site/toffer86/ or toffer.tk

  5. #5
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yes, I'm talking about modelling data bit by bit. To put my question in other words:

    In bit models (data modelled bit by bit), which is the suitable symbol to count: 1 or 0? Don't worry about the entropy coding phase - just focus on the modelling.
    BIT Archiver homepage: www.osmanturan.com

  6. #6
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Do you mean which symbol you should model - S0 (the zero bit) or S1 (the one bit)? You confused me - I thought you meant allowing a probability P(Sx) to have the value zero.

    In that case, the simple answer is: it doesn't matter.

    If you use bit models which are updated proportionally to the coding error 1 - P(Sx):

    P(Sx) += (1 - P(Sx)) / R

    you can gain a *tiny* amount of speed when modelling S1:

    P(S1) += (y - P(S1)) / R

    where y is the coded bit. If you base your models on S0 instead:

    P(S0) += ((y ^ 1) - P(S0)) / R

    You see - one more instruction, the xor, which flips y=0 to y=1. (A plain ~ would invert all bits, so for a single bit it has to be y ^ 1 or 1 - y.)

    When you use table-based updates, it doesn't matter.
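    In exact arithmetic the two update rules above are perfect mirrors: pulling P(S1) toward y and P(S0) toward 1-y keeps P(S0) + P(S1) = 1 at every step, so the choice of modelled symbol changes nothing about the model itself. A quick sketch (R chosen arbitrarily):

    ```python
    R = 32  # hypothetical update rate

    def update_p1(p1, y):
        """Update for a model of the one bit: P(S1) += (y - P(S1)) / R."""
        return p1 + (y - p1) / R

    def update_p0(p0, y):
        """Mirror rule for the zero bit: P(S0) += ((1 - y) - P(S0)) / R."""
        return p0 + ((1 - y) - p0) / R

    p1, p0 = 0.5, 0.5
    for y in [1, 1, 0, 1, 0, 0, 1]:
        p1, p0 = update_p1(p1, y), update_p0(p0, y)
        assert abs((p0 + p1) - 1.0) < 1e-12  # the two stay complementary
    ```
    
    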
    M1, CMM and other resources - http://sites.google.com/site/toffer86/ or toffer.tk

  7. #7
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    So sorry for confusing you I've been talking about exactly which symbol we should model. Sorry again. Thanks for replies
    BIT Archiver homepage: www.osmanturan.com

  8. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Quote Originally Posted by toffer
    Simple answer: it doesn't matter.
    exactly.

    If the sum of the probabilities of 0 and 1 is constant (as in fpaq0p), then you don't model the probability of 0 or the probability of 1 - you model the ratio of the two probabilities.

    If the sum of the probabilities of 0 and 1 is not constant (as in fpaq0), then you must model two probabilities - that of 0 and that of 1.
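    The two regimes named above can be sketched as two tiny models. This is a deliberately simplified illustration, not the actual fpaq0p/fpaq0 source: one stores a single state so P(0) + P(1) = 1 by construction (only the ratio is modelled), the other stores a count per symbol:

    ```python
    # Style 1 (as in fpaq0p): one state; P(0) + P(1) == 1 by construction,
    # so effectively the ratio of the two probabilities is modelled.
    class RatioModel:
        def __init__(self, rate=32):
            self.p1 = 0.5
            self.rate = rate

        def update(self, y):
            self.p1 += (y - self.p1) / self.rate

        def p(self, bit):
            return self.p1 if bit == 1 else 1.0 - self.p1

    # Style 2 (as in fpaq0, simplified): one count per symbol,
    # each probability modelled from its own state.
    class CountModel:
        def __init__(self):
            self.n = [1, 1]  # virtual initial counts for 0 and 1

        def update(self, y):
            self.n[y] += 1

        def p(self, bit):
            return self.n[bit] / (self.n[0] + self.n[1])

    m1, m2 = RatioModel(), CountModel()
    for y in [1, 1, 0, 1]:
        m1.update(y)
        m2.update(y)
    ```

    In both cases the probabilities read out sum to 1; the difference is what state is stored and updated.
    
    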

  9. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    It doesn't really matter for which value you store/update the probability, as P1 = 1 - P0.
    Though optimal P1-based and P0-based implementations of the same models would
    have slightly different formulas for updates and such - mostly due to rounding.
    Also, there are no real speed- or source-simplicity-based preferences, though Matt's
    counters and such look more complex when directly converted to a P0 version.

    So I think that Matt's choice was based on a "common sense" association of
    0 with a zero increment and 1 with a non-zero one.

    As for me (and the whole compression.ru group, I guess), there wasn't any conscious
    choice. It's just that for some time the "mainstream" was models with a byte alphabet
    and P0..P255 distributions updated all at once (it's still quite useful btw, especially
    if you want a fast compressor). In that case it's obvious that one should increase
    the probability of the encountered byte and decrease the others. And then there just
    wasn't any sense in changing anything after switching from alphabet size 256 to 2
    (for unary coding etc.), as the counters worked well without any change.
    By now it's also a habit, so a P0-based model is easier for me to understand.
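    The "slightly different formulas... mostly due to rounding" point is easy to see in fixed point: a P0 version obtained by directly mirroring a P1-style shift update is not an exact mirror, because the arithmetic right shift rounds toward minus infinity. A sketch with hypothetical parameters (12-bit probabilities, rate 5):

    ```python
    BITS, RATE = 12, 5
    ONE = 1 << BITS  # probability scale: 4096 represents 1.0

    def update_p1(p1, y):
        # paq-style shift update for P(1)
        return p1 + (((y << BITS) - p1) >> RATE)

    def update_p0(p0, y):
        # the "same" update converted directly to P(0)
        return p0 + ((((1 - y) << BITS) - p0) >> RATE)

    p1 = p0 = ONE // 2
    for y in [1, 0, 1]:
        p1 = update_p1(p1, y)
        p0 = update_p0(p0, y)
    # In exact arithmetic p0 + p1 would stay 4096; with >> (floor division
    # toward minus infinity) the two drift apart slightly.
    ```

    After the three bits above, p0 + p1 is already 4095 rather than 4096 - harmless, but it shows why the optimal P0-based and P1-based formulas end up slightly different.
    
    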
