Results 1 to 5 of 5

Thread: DMC - turning it from binary PPM to binary CM

  1. #1
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts

    DMC - turning it from binary PPM to binary CM

    Classical Dynamic Markov Coding can be seen as a binary Prediction by Partial Matching. This is because at any given time you hold reference (call it cursor here) to one state. This state contains probability of next bit being 0 (or 1 in alternative implementations) and also pointers to next states. One cursor * one prediction pointed per cursor = 1 prediction per bit. If we change it and start a new cursor at each input byte boundary we will have multiple cursors traversing the DMC binary tree during modelling. Multiple cursors * one prediction pointed per cursor = multiple predictions per bit. After that you do Context Mixing on them. What do you think? Has anyone tried that?

  2. The Following User Says Thank You to Piotr Tarsa For This Useful Post:

    Gonzalo (25th July 2018)

  3. #2
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Hungary
    Posts
    343
    Thanks
    235
    Thanked 226 Times in 123 Posts
    I tried it with 2-8 contexts when enhancing DMC in paq8px. Didn't work out. Most probably I messed it up.

    There are some things to keep in mind: sometimes these cursors will unite/collide. Whenever you detect a collision, you must reset one of the cursors (to point to node 0). Since a collision may happen in the middle of a byte you'll need to wait until you reach a byte boundary to "relaunch" the cursor from node 0.
    Initially, at the very beginning all the cursors are pointing to node 0, so they are in a "collision" state. You'll need to let one of them advance to the next node and reset all the others.
    I also know that the more cursors you have the more often a node cloning might occur. Cloning costs memory, and DMC is already very memory hungry. For 2 cursors (2 contexts) you'll need a bit less then 2x RAM. When I realized that I gave up. You may circumvent it by using probabilistic cloning: the more the difference between the count and the threshold value is the more probably you would clone it.

    I got this far.

    ps. I don't consider DMC and PPM similar. However, they are related, that's true.

  4. #3
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    216
    Thanks
    97
    Thanked 128 Times in 92 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    Has anyone tried that?
    Yes, when I tried to improve the dmcModel on paq8px: I added a second "cursor", but I don't remember if the gain was very small or if it was sometimes better and sometimes worse.
    Quote Originally Posted by Piotr Tarsa View Post
    start a new cursor at each input byte boundary
    (Gotty already answered to this point, however I leave what I wrote offline)
    Check if there is already a cursor in the same position, because a cursor can jump to the first node of the DMC tree and if you also start a new cursor, you will have 2 identical cursor (at least in the first few bytes).
    The cursors can also be used to update the model other than predict a bit.

    Some my comments about DMC (so they may be wrong):
    - I guess that a mixer can't learn exactly how good is a DMC model because the prediction can vary for a specific context, e.g.: if data is "abcdefgh." and we must predict ".", DMC can return a prediction based on "h" or "fgh" or "abcdefgh" even if it already saw "abcdefgh" more times.
    - We can try to limit the depth of the DMC tree, it reaches (IIRC) > 200 bytewise order in (IIRC) world.txt.
    - Ocamyd is an old compressor and was good at that time, it mixes 3 different DMC model + other models, I tried its update conditions on paq8px unsuccessfully.
    New site: https://sites.google.com/site/ocamyd/Home
    Old site: http://www.geocities.ws/ocamyd/
    You'll find also DMC in Java + brief info about Suzanne Bunton (in her Ph.D. thesis she used a 256-ary alphabet DMC and merged it with PPM).
    - Years ago I tried a DMC compressor where the input was an Huffman code instead of 8 bit, IIRC the result was bit worse (maybe it was buggy).
    - It was demonstrate that DMC is worse than PPM because PPM is greedy and works with many contexts simultaneously and DMC only 1 per time.
    - Today I found this paper: "An Exploration of Dynamic Markov Compression" https://ir.canterbury.ac.nz/bitstrea...pdf?sequence=1 with bit, character and word DMC.

  5. The Following User Says Thank You to Mauro Vezzosi For This Useful Post:

    Gotty (25th July 2018)

  6. #4
    Member Gotty's Avatar
    Join Date
    Oct 2017
    Location
    Hungary
    Posts
    343
    Thanks
    235
    Thanked 226 Times in 123 Posts
    Extra:

    Here is a paper which contains some really interesting ideas, like "getting free nodes by merging nodes", and crosslink: "before allocating a new node, we look for the closest prior history in already-allocated nodes, and if it's "close enough" (as set by a threshold), we make that the next node in our chain." [...] "This produces Markov models with both acyclic joins and repeating loops, and effectively prevents the model from growing without bound, as the model reuses nodes quite effectively (sometimes more than a 300:1 reuse!)."

    However couldn't find implementation / more details / results.

    http://crm114.sourceforge.net/docs/BitEntropy.txt

  7. The Following User Says Thank You to Gotty For This Useful Post:

    Mauro Vezzosi (25th July 2018)

  8. #5
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    216
    Thanks
    97
    Thanked 128 Times in 92 Posts
    Quote Originally Posted by Gotty View Post
    However couldn't find implementation / more details / results.
    Take a look here:
    http://crm114.sourceforge.net/wiki/d...id=bit_entropy
    https://github.com/pmundkur/libcrm11..._bit_entropy.c
    ( http://crm114.sourceforge.net/wiki/doku.php?id=download )

Similar Threads

  1. Simple binary rangecoder demo
    By Shelwien in forum Data Compression
    Replies: 35
    Last Post: 17th June 2019, 16:21
  2. Binary sequence compression
    By smjohn1 in forum Data Compression
    Replies: 23
    Last Post: 8th December 2017, 02:48
  3. Crook, a new binary PPM compressor
    By valdmann in forum Data Compression
    Replies: 25
    Last Post: 19th March 2012, 17:12
  4. Delta: binary tables preprocessor
    By Bulat Ziganshin in forum Forum Archive
    Replies: 14
    Last Post: 1st April 2008, 09:43
  5. Bytewise vs Binary
    By Shelwien in forum Forum Archive
    Replies: 9
    Last Post: 30th March 2008, 16:51

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •