
Thread: Move-to-Front Implementation

  1. #1
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts

    Move-to-Front Implementation

    Hello


    I'm currently considering a candidate MTF algorithm, applied to a large number of hash codes.

    The most important issue is performance (I'm planning on 32K hash codes, potentially more, so this is obviously much larger than byte-alphabet MTF).

    I haven't found any fast MTF implementation for such a situation so far. Hence this post.

    Here are some ideas, open for comments:

    Fast sorting, slow ranking retrieval:
    The fastest MTF update seems to be achieved with a doubly-linked list, where insertion/removal is trivial.
    However, finding the "rank" of a hash within the list afterwards is slow at best, awkward, and quite frankly wasteful.

    Slow sorting, instantaneous ranking retrieval:
    Keeping a table of that many items sorted is also prohibitive, since each update may shift thousands of elements. But at least, once sorted, the table provides the rank instantly.

    So here we are: two mechanisms, neither of them really suitable.

    A trivial work-around I came up with is a simple time-ranking counter, which still achieves the MTF property and near-instantaneous ranking retrieval, but the resulting ladder has a lot of holes. Thinking about it, this seems very similar to what I read about LZRW2 some time ago.

    Any better ideas?

    Regards
    Last edited by Cyan; 22nd January 2010 at 17:10.

  2. #2
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts
    I thought about a faster MTF method too, once. I came up with an approximation:

    output = counter - hash[input];   // how long ago 'input' was last seen
    hash[input] = counter;            // remember the current time stamp
    counter++;                        // advance the global counter

    Basically, it outputs how long ago the input was last seen. But again, it's a bad approximation. I'm not sure if it's the same as the one you mentioned.
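    For Cyan's setting (MTF over roughly 32K hash codes), a minimal self-contained C++ sketch of this time-stamp idea could look as follows; all names and sizes here are illustrative, not taken from any existing code:

    Code:
        #include <cstddef>
        #include <cstdint>
        #include <vector>

        // Time-stamp "MTF" approximation: the output is the time since the last
        // occurrence of the input, which is an upper bound on its true MTF rank.
        struct TimeStampRank {
            std::vector<uint32_t> lastSeen;   // hash code -> counter value at its last occurrence
            uint32_t counter;

            explicit TimeStampRank(std::size_t nHashes = 32 * 1024)
                : lastSeen(nHashes, 0), counter(1) {}   // 0 means "not seen yet"

            uint32_t update(uint32_t hashCode) {
                uint32_t age = counter - lastSeen[hashCode];  // for unseen codes this is just 'counter'
                lastSeen[hashCode] = counter++;               // remember when we saw it
                return age;                                   // ranks are not compacted ("holes in the ladder")
            }
        };
    The ranks are only approximate because every occurrence in between increments the counter, including repeats of symbols that don't change the relative order, so many rank values are skipped.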

    I have my own personal philosophy that the need for MTF is a symptom of a suboptimal algorithm in the prior steps (which is not to claim that an optimal algorithm isn't sometimes nearly impossible to find).

  3. #3
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    What about two balanced trees? One holds the MTF ordering - a symbol's MTF index equals the number of elements before it, so if we store a subtree size in every node, retrieving the MTF index takes logarithmic time once we start from the symbol's node. The second tree holds all symbols in sorted order, and its nodes carry pointers to the corresponding nodes in the first tree.

    So the entire algorithm for retrieving a symbol's rank would be:
    1. Look the symbol up in the symbol tree map (logarithmic time).
    2. Follow the pointer to reach its node in the MTF tree (constant time).
    3. Walk from that node to the root of the MTF tree:
    - after each step up, check whether we came from the left or the right subtree. If we came from the right subtree, add the size of the left subtree to the result (logarithmic time).

    Updates are also logarithmic time.

    The MTF tree should be ordered by the symbols' offsets of last appearance (see the sketch below).
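    A rough sketch of this rank-by-recency idea (not the exact two-tree layout described above): it uses the GCC-specific __gnu_pbds order-statistic tree keyed by last-occurrence time stamps, with a plain array standing in for the second tree (symbol -> key). Everything here is illustrative:

    Code:
        #include <ext/pb_ds/assoc_container.hpp>   // GCC-specific policy trees
        #include <ext/pb_ds/tree_policy.hpp>
        #include <vector>

        // Order-statistic tree over "time of last occurrence"; order_of_key() gives
        // the number of smaller keys, so the MTF rank is the number of larger keys.
        typedef __gnu_pbds::tree<long long, __gnu_pbds::null_type, std::less<long long>,
                                 __gnu_pbds::rb_tree_tag,
                                 __gnu_pbds::tree_order_statistics_node_update> RecencyTree;

        struct TreeMTF {
            RecencyTree recency;            // first tree: ordered by last-occurrence time
            std::vector<long long> stamp;   // stand-in for the second tree: symbol -> its key
            long long clock;

            explicit TreeMTF(int nsym) : stamp(nsym), clock(nsym) {
                for (int s = 0; s < nsym; ++s) {            // initial rank equals symbol code
                    stamp[s] = nsym - 1 - s;
                    recency.insert(stamp[s]);
                }
            }

            // Returns the MTF rank of 'sym' and moves it to the front - O(log n).
            int rankAndUpdate(int sym) {
                long long t = stamp[sym];
                int rank = (int)(recency.size() - 1 - recency.order_of_key(t));
                recency.erase(t);                           // drop the old occurrence
                stamp[sym] = clock++;                       // newest time stamp so far
                recency.insert(stamp[sym]);                 // reinsert at the front
                return rank;
            }
        };
    Both the rank query and the update are logarithmic, as described above; a hand-rolled balanced tree with explicit parent pointers would allow the leaf-to-root walk instead of order_of_key().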

  4. #4
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Yes, Lasse, indeed it is the same.
    It's only once completed that it looked very much like classic LZ references with the match sequences skipped, hence the analogy with Ross Williams' LZRW2.

    You're right about the approximation: a very large number of positions are wasted in the process, and only the relative ranking remains. But this is sub-optimal in terms of compression.

    I have my own personal philosophy that the need for MTF is a symptom of a suboptimal algorithm in the prior steps
    Interesting; I tend to consider it much the same as BWT: a transformation into a more compressible format, in this case aimed at helping entropy coding by exploiting local similarities.

  5. #5
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    What about two balanced trees?
    Thanks Piotr, that looks like an interesting suggestion.
    From a high level, it's two reference systems linked to each other.
    Balanced trees are a bit above my current skills, and I'll need to read your suggestion several times to properly understand it, but I'll make sure to have a look.

    Regards

  6. #6
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    http://en.wikipedia.org/wiki/Splay_tree this should be worth looking at.

  7. #7
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Let me remind you of a simple reference implementation:
    http://nishi.dreamhosters.com/u/gmtf_v1.rar

    In fact, for reasonable alphabet sizes (even 1024 or so)
    it would probably be faster than any complicated method,
    because of its simplicity and the possibility of vectorization (see the sketch below).
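    To illustrate the point (this is only a sketch of the flat-table approach, not gmtf's actual code; it assumes a 256-symbol alphabet and GCC/Clang builtins):

    Code:
        #include <cstring>
        #include <cstdint>
        #include <emmintrin.h>   // SSE2

        struct FlatMTF {
            uint8_t table[256];                              // table[r] = symbol currently at rank r
            FlatMTF() { for (int i = 0; i < 256; ++i) table[i] = (uint8_t)i; }

            int encode(uint8_t sym) {
                __m128i key = _mm_set1_epi8((char)sym);
                int pos = 0;
                for (int i = 0; i < 256; i += 16) {          // scan the table 16 symbols at a time
                    __m128i v = _mm_loadu_si128((const __m128i*)(table + i));
                    int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(v, key));
                    if (mask) { pos = i + __builtin_ctz(mask); break; }
                }
                memmove(table + 1, table, pos);              // shift ranks 0..pos-1 down by one
                table[0] = sym;                              // move the symbol to the front
                return pos;                                  // its previous rank
            }
        };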

    Also, in theory, the idea with trees may be better with
    some larger alphabets, but I suspect that with contextual
    symbol ranking, especially high-order, it may be hard
    to construct such trees.

  8. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    In a splay tree, the most recently accessed nodes are closest to the root, so it should be very fast for large-alphabet MTF.

  9. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    I mean, what if the alphabet size is 256, but you have 256^3 such alphabets? (i.e. in an order-3 context)

  10. #10
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Maybe I don't understand. Contexts are separate, so there would be 256^3 separate trees.

    Tree-based MTF would be efficient if we coded entire words at once. There are many words, most of them repeat, and those that repeat have an average length above three IMO, so that should be more efficient than the simple approach.

  11. #11
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Well, it doesn't make sense to use something like dictionary ranking:
    1) it'd be slower than many methods of adaptive entropy coding;
    2) there'd be redundancy if it's possible to encode the same
    data with different word sequences.

    Now, contextual symbol ranking does make some sense
    (it's something like sequential BWT/ST in theory),
    but again there's no point in using complex data structures
    because of the speed trade-off - there'd be better compression algos
    available if you make it that slow.

  12. #12
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Thanks to Shelwien's gmtf, which works on characters, I just ran a few tests with MTF applied to byte inputs which I expected *should* benefit from it.

    These inputs are outputs of either BWT or order-N sorting.
    Looking at them in a simple hex editor, local repetitions are obvious.

    I've compared MTF followed by entropy coding against direct entropy coding alone.

    And the results are somewhat disappointing.
    MTF sometimes helps, sometimes not.
    I would say it influences the compression ratio by somewhere between -1% and +2%. On average, it is more positive than negative, but by a rather small margin (less than 1%).

    I'm really surprised; I was expecting something much better given the nature of the inputs. It seems a good entropy coder already does most of the job.
    As a consequence, I'm wondering: given that MTF is very slow compared to every other part of the algorithm (sorting and entropy coding), is MTF really worth considering, even for BWT?

  13. #13
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Mmmh, never mind.
    It seems I did not use gmtf correctly.

    Using the "C" (uppercase) command gave better results than the "c" (lowercase) command.

    Now the gain is better, in the +2% / +4% range.

    It's not great but definitely noticeable.
    As a nice side note, results seem to improve as the order of the source increases, with BWT output among those benefiting the most.
    I was, however, expecting even better results when starting this experiment.

    So the question remains: could another method exploit the local context better? Something like a fast-adapting entropy coder, for example?

    Edit: Using a very strong order-0 adaptive entropy coder such as Eugene's newbrc_1, MTF does not bring any benefit. So it seems a good entropy coder can do without MTF...
    Last edited by Cyan; 23rd January 2010 at 18:57.

  14. #14
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    If I remember correctly, fpaq0f(2) was really good at compressing BWT data. And WFC is one of the strongest methods for encoding BWT data. I think it could be done very quickly using GPGPU.

    IIRC:
    Francesco used Huffman coding before arithmetic coding in his Rings compressor. That increased both the speed and the strength of compression.
    Last edited by Piotr Tarsa; 23rd January 2010 at 19:50.

  15. #15
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    1. Here's the gMTF explanation, in case you missed it:
    http://encode.dreamhosters.com/showthread.php?t=509
    There's a "COEF" constant in MTFg.inc; it affects the
    ranking (COEF=0 is equivalent to plain MTF, higher values
    are supposedly better for stationary data).

    2. MTF (or any other symbol-ranking transform) only improves
    the compression of stationary (or completely static) memoryless
    (i.e. order-0) coders.

    3. Here's a real optimized order1 BWT postcoder for reference:
    http://ctxmodel.net/files/mix_test/mix_test_v9.rar

  16. #16
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    Here's a piece of my Java implementation, in case it's useful:

    Code:
        int buffer[] = new int[MAX_CHAR + 2], i;
    
        // initialise
        for (i = 1; i <= MAX_CHAR; buffer[i - 1] = i++); 
        buffer[MAX_CHAR] = buffer[MAX_CHAR + 1] = 0;
    
    
        // For each new character: the output is 'position', the rank of 'character'
        // in the MTF list. buffer[s] holds the symbol that follows s in the list,
        // and buffer[MAX_CHAR + 1] holds the symbol currently at the front.
    
        position = 1; 
        if ((i = buffer[MAX_CHAR + 1]) != character) { 
          while (true) {  
            if (buffer[i] == character) {position++; break;}
            if (buffer[i = buffer[i]] == character) {position+=2;  break;}
            if (buffer[i = buffer[i]] == character) {position+=3;  break;}
            if (buffer[i = buffer[i]] == character) {position+=4;  break;}
            if (buffer[i = buffer[i]] == character) {position+=5;  break;}
            if (buffer[i = buffer[i]] == character) {position+=6;  break;}
            if (buffer[i = buffer[i]] == character) {position+=7;  break;}
            if (buffer[i = buffer[i]] == character) {position+=8;  break;} 
            i = buffer[i]; 
            position += 8; 
          } 
          buffer[i] = buffer[character];            // unlink 'character' (i is its predecessor)
          buffer[character] = buffer[MAX_CHAR + 1]; // link it in front of the old front symbol
          buffer[MAX_CHAR + 1] = character;         // make it the new front
        }

  17. #17
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Here's a draft implementation with 3*(maxbit+1) steps per symbol
    (maxbit=31 in the current version)
    http://nishi.dreamhosters.com/u/umtf_v0.rar

    The algorithm is independent of the alphabet size, but to use
    it with larger alphabets, the node allocation has to be fixed
    (e.g. put more nodes into the pool, or replace the allocation
    with new/delete).

    The idea is "a little" simpler than the suggestions here
    with balanced trees and such.
    There's a simple binary tree indexed by the symbol's offset in the stream;
    for each node the child counts are maintained, and the leaf
    nodes store the actual symbols.

    Init:
    - The tree is initialized with symbols CNUM-1..0 at offsets 0..CNUM-1
    (so initially a symbol's rank is equal to its code).

    Forward transform:
    - Read the pointer to the leaf node corresponding to the symbol
    and the offset of the symbol's last occurrence
    - Traverse the tree from leaf to root and count the number
    of symbols with greater offsets
    (mtf rank = number of symbols with greater offsets)
    - Remove the node(s) only used by the current symbol (and uncount the symbol)
    - Reinsert the symbol with the current offset (the actual first symbol goes in at offset CNUM)

    Backward transform:
    - Traverse the tree from root to leaf using the known rank value and the counts
    - The symbol's code can be found in the leaf node
    - Update the tree (the same way as in the forward transform)

    Note that the complexity only depends on the number of bits in the offset,
    not on the alphabet size.
    In particular, to improve the speed we can reduce maxbit to 15
    (16 bits per offset) and add offset recomputation after every
    64K-CNUM symbols (sorting the symbol offsets, determining
    their ranks, and reinitializing the tree; it can also be reinitialized
    statically, which may be simpler and faster, but is not exactly MtF).
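    For illustration, here's a hedged sketch of the same counting idea; it is not the umtf code - it replaces the explicit binary tree with a Fenwick tree over offsets and assumes the total input length is known up front, so no reallocation or offset recomputation is shown:

    Code:
        #include <vector>

        // Each symbol contributes a single 1 at the offset of its last occurrence;
        // mtf rank = number of 1s at greater offsets. Forward transform only.
        struct FenwickMTF {
            std::vector<int> bit;        // 1-based Fenwick array over offsets
            std::vector<int> lastOfs;    // symbol -> offset of its last occurrence
            int cnum, nextOfs;

            FenwickMTF(int alphabetSize, int maxSymbols)
                : bit(alphabetSize + maxSymbols + 1, 0),
                  lastOfs(alphabetSize), cnum(alphabetSize), nextOfs(alphabetSize) {
                for (int s = 0; s < cnum; ++s) {        // init: rank equals symbol code
                    lastOfs[s] = cnum - 1 - s;
                    add(lastOfs[s], +1);
                }
            }
            void add(int ofs, int d) {
                for (int i = ofs + 1; i < (int)bit.size(); i += i & -i) bit[i] += d;
            }
            int countUpTo(int ofs) const {              // marked offsets in [0, ofs]
                int s = 0;
                for (int i = ofs + 1; i > 0; i -= i & -i) s += bit[i];
                return s;
            }
            int rankAndUpdate(int sym) {
                int ofs = lastOfs[sym];
                int rank = cnum - countUpTo(ofs);       // symbols with greater offsets
                add(ofs, -1);                           // uncount the old occurrence
                lastOfs[sym] = nextOfs++;               // reinsert at the current offset
                add(lastOfs[sym], +1);
                return rank;
            }
        };
    The backward transform would additionally need a find-by-rank descent over the Fenwick array, and the periodic offset recomputation described above would keep the array bounded for long inputs.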

    ----------
    Update:

    Here's a bit more simplified version:
    http://nishi.dreamhosters.com/u/umtf_v1.rar
    - remove .up and leaf-to-root traversing
    - remove custom allocation
    Last edited by Shelwien; 18th April 2010 at 06:26.

  18. #18
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    I doubt that with smaller alphabets the tree could generally be better than my test array / linked lists (I'm running the tests under Wine).

    Isn't there a crucial performance loss when the tree is not balanced?
    Last edited by quadro; 18th April 2010 at 15:52.

  19. #19
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > I doubt that with smaller alphabets the tree could generally be
    > better than my test array / linked lists
    > (I'm running the tests under Wine).

    My opinion is the same; it's just that after talking a lot,
    we didn't find a working implementation for large alphabets here,
    so I decided to suggest one :)
    I'm not completely sure that this idea can't be faster though,
    when properly optimized. As I said, it's possible to reduce the
    number of address bits used, which would also allow discarding
    the dynamic allocation and shrinking the tree nodes.
    + the recursive calls can be completely unrolled
    + maybe 2-bit steps would work better
    + search and remove can be merged
    It would still be tough to compete against the vectorized
    implementation which I posted above (gmtf), but I'm not sure
    about yours :)

    Btw, can you post timings for umtf/gmtf vs. your Java code
    on enwik8/9 or something?
    Just to evaluate Java performance.

    > Isn't there a crucial performance loss when the tree is not balanced?

    The number of steps is always the same and independent of the alphabet size,
    so balancing the tree won't help at all here.

  20. #20
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    Yes, good luck with bigger alphabets.

    I think a comparison of Wine vs. Java on my hardware won't tell much. I should try to compile from source when I have time. It seems that Wine is too slow on large files, and enwik9 would take forever.

    Code:
    Linux Mint 7 Gloria - Main Edition @ Kernel 2.6.28-11-generic, EXT3, wine-1.0.1
    
    java version "1.6.0_16" Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
    Java HotSpot(TM) Client VM (build 14.2-b01, mixed mode, sharing)
    
         *-memory
              size: 512MiB; capacity: 3GiB
            *-bank:2
                 description: DIMM DRAM Synchronous
                 slot: DDR 3; size: 512MiB; width: 64 bits
         *-cpu
              product: AMD Athlon(TM) XP 2500+
              version: 6.10.0; slot: SOCKET A; size: 1100MHz
              capacity: 2250MHz; width: 32 bits; clock: 100MHz
            *-cache:0
                 description: L1 cache
                 size: 128KiB; capacity: 128KiB
            *-cache:1
                 description: L2 cache
                 size: 512KiB; capacity: 8MiB
         *-disk
              description: ATA Disk
              product: ST380011A; vendor: Seagate
              bus info: scsi@0:0.0.0; version: 3.04; size: 74GiB (80GB)
    Code:
    time wine gmtf.exe c enwik8 x
    
    real    2m0.655s
    user    1m49.907s
    sys     0m1.484s
    
    
    time wine mtf.exe c enwik8 x
    
    real    4m58.289s
    user    3m46.278s
    sys     0m49.251s
    
    
    time java -jar mtf0.08.jar c enwik8 x
    
    real    0m18.600s
    user    0m13.289s
    sys     0m2.396s
    Last edited by quadro; 19th April 2010 at 00:56.

  21. #21
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Ok, I'll keep in mind that Java is actually 10x faster than C++.
    You could try compiling my versions from source, though.

  22. #22
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    I see - getc()/putc() vs. the read buffer I used; that was not a fair measure.

    Or I don't know what's wrong - the times seem the same even when recompiled.
    Last edited by quadro; 19th April 2010 at 23:51.

  23. #23
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Quadro:
    Could you post a link to your program? (The JAR package, I mean.)

    Shelwien:
    Your code doesn't compile under GCC 4.4.3 on my Ubuntu Lucid x64.

    BTW:
    Has anyone tried to create a Unicode-aware BWT or PPMd compressor?

  24. #24
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > Your code doesn't compile under GCC 4.4.3 on my Ubuntu Lucid x64.

    Sorry, but it compiles with anything I tried.
    Try with -m32? Or be more specific? :)

    > Has anyone tried to create a Unicode-aware BWT or PPMd compressor?

    Nothing widely known, I guess.
    It's a hard problem for both methods to work with large alphabets,
    so it's normally avoided by some kind of preprocessing... like
    converting Unicode to UTF-8 and then compressing :)

    A real Unicode CM could be cool though (it doesn't really make
    sense for BWT and PPM for performance reasons).
    Last edited by Shelwien; 20th April 2010 at 02:35.

  25. #25
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    200
    Thanks
    41
    Thanked 36 Times in 12 Posts
    -m32 does the trick:

    will:~/Code/Temp/umtf_v1$ g++ -o mtf mtf.cpp

    mtf.cpp: In member function 'uint mtf::symbol(int, int, node*)':
    mtf.cpp:58: error: cast from 'node*' to 'uint' loses precision

    will:~/Code/Temp/umtf_v1$ g++ -o mtf mtf.cpp -m32

    will:~/Code/Temp/umtf_v1$

  26. #26
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    2Piotr:

    Here is my archive; just rename it to .jar if you wish. There's some additional RLE in there, but it doesn't hurt the overall performance much.
    Attached Files

  27. #27
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    I tried the same algorithm on the decoder part as well, and it seems about 20~30% faster.

    Maybe some kind of tree structure can rapidly improve this part?

    Code:
    petr@petr-desktop ~/Desktop/406/mtf0.09 $ java -jar rle0.09.zip d xxx enwik8
    Decompression time: 16.81 s.
    
    petr@petr-desktop ~/Desktop/406/mtf0.09 $ java -jar rle0.08.zip d xxx enwik8
    Decompression time: 25.707 s.
    Decoder code
    Code:
        int input, output, i, b[] = new int[MAX_BYTE + 2];
    
        // initialise
        for (i = 1; i <= MAX_BYTE; b[i - 1] = i++);
        b[MAX_BYTE] = b[MAX_BYTE + 1] = 0;
    
        // for each new character, new input
    
        if (input > 0) {
          i = b[MAX_BYTE + 1];
          while (input > 1) {
            i = b[i];
            input--;
            if (input > 4) {
              i = b[b[b[b[i]]]];
              input-=4;
            }
          }
          b[i] = b[i = b[i]];       // unlink the target symbol (i becomes the decoded symbol)
          b[i] = b[MAX_BYTE + 1];   // link it in front of the old front symbol
          b[MAX_BYTE + 1] = i;      // make it the new front
    
          output = (byte)i;
        } else output = (byte)b[MAX_BYTE + 1];
    Attached Files
    Last edited by quadro; 25th July 2010 at 18:48.

  28. #28
    Programmer Gribok's Avatar
    Join Date
    Apr 2007
    Location
    USA
    Posts
    159
    Thanks
    0
    Thanked 1 Time in 1 Post
    After some experiments with MTF, I found the version which works best for BWT data:

    Code:
            unsigned char previousChar = currentChar, rank = 0;
            while (true)
            {
                unsigned char temporaryChar0 = MTFTable[rank + 0]; MTFTable[rank + 0] = previousChar;
                if (temporaryChar0 == currentChar) {rank += 0; break; }
    
                unsigned char temporaryChar1 = MTFTable[rank + 1]; MTFTable[rank + 1] = temporaryChar0;
                if (temporaryChar1 == currentChar) {rank += 1; break; }
    
                unsigned char temporaryChar2 = MTFTable[rank + 2]; MTFTable[rank + 2] = temporaryChar1;
                if (temporaryChar2 == currentChar) {rank += 2; break; }
    
                unsigned char temporaryChar3 = MTFTable[rank + 3]; MTFTable[rank + 3] = temporaryChar2;
                if (temporaryChar3 == currentChar) {rank += 3; break; }
    
                rank += 4; previousChar = temporaryChar3;
            }
    Enjoy coding, enjoy life!

  29. #29
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    it looks like trivial loop unrolling?

  30. #30
    Member
    Join Date
    Apr 2010
    Location
    CZ
    Posts
    81
    Thanks
    5
    Thanked 7 Times in 5 Posts
    It seems to be twice as fast as the classic array version (joining the search and insert parts).

    Whereas I am using a slightly different data structure, which makes it about 30% faster (in Java).

    Code:
    petr@petr-desktop ~/Desktop/406/mtf0.09 $ java -jar mtf_array.jar c o.mp4 xxx
    Compression time: 75.692 s.
    
    petr@petr-desktop ~/Desktop/406/mtf0.09 $ java -jar mtf_bsc.jar c o.mp4 xxx
    Compression time: 37.409 s.
    
    petr@petr-desktop ~/Desktop/406/mtf0.09 $ java -jar rle_0.09.jar c o.mp4 xxx
    Compression time: 24.186 s.
