+ Reply to Thread
Page 4 of 5 FirstFirst ... 2345 LastLast
Results 91 to 120 of 123

Thread: Ultra-fast LZ

  1. #91
    Member
    Join Date
    Jun 2009
    Location
    Cracov, Poland
    Posts
    713
    Encode, since you're into LZ lately, I would be interested if you play with Morphing Match Chains (by Cyan & me) or try to convert BWT output or Suffix Trees to some simple trees that would help greatly exhaustive searches.

  2. #92
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Basically I do plan to release something that will strike. A program that will be very practical, unique, people will compare it against own software, refer to it, and so on. I need an explosion. So, I already put many work on CRUSH and tested countless number of ideas, including Morphing Chains (Guess the name is not that correct), Binary Tree structures, etc. A key feature of the program is a Fast compression mode - 50 MB/sec on modern PC with a compression on par with Deflate's best modes. And probably the best data structure for that mode is Hash Chains. The main benefit of such solution is extremely fast update of a structure - we just skip to next position briefly. Many more advanced algorithms need a special handling, a special update that is slower. Anyway, I continue experimenting...

  3. #93
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    400
    hi encode

    The main benefit of such solution is extremely fast update of a structure - we just skip to next position briefly. Many more advanced algorithms need a special handling, a special update that is slower.
    A quick clarification :
    MMC starts being a classic Hash Chain.
    The main idea driving its development was to remove from the chain elements which we know are no longer useful (ie cannot produce better length) in the current search context.

    Consequently, its update method is exactly the same as classic Hash chain : elements are quickly added without further handling. This is, as you say, in stark contrast with other complex data structure, such as BST, which requires sorting data on insertion. MMC doesn't bring such penalty (neither classic Hash Chain).

    In Greedy parsing strategy, it makes a huge difference, since a lot of data will never be sorted, or will only be partly sorted later on, just as needed.
    This conclusion still holds true with lazy match strategy.

    However, for Optimal parsing, BST is very likely to take the lead, since each and every position is searched and inserted. I haven't test it yet, but it sounds logical comparing both behaviors.


    That being said, MMC is still about a full "no compromise" search strategy. For faster speed, a Hash Table with short rows is going to provide faster performance, at the cost of reduced match search capability.


    Regards

  4. #94
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302

    Question

    Actually, It's hard to read other's code. Probably I'm not completely understood the MMC idea. Can you provide a few-line pseudo-code. For hash chains, the code will look like:

    Code:
    // Search
    
    ptr=head[hash];
    
    while (ptr)
    {
    
        // compare string at ptr
    
        ptr=prev[ptr&W_MASK];
    }
    
    // Update
    
    // if literal, len=1
    
    while (len--)
    {
        prev[pos&W_MASK]=head[hash];
        head[hash]=pos;
    
        ++pos;
    
        hash=UpdateHash(pos);
    }
    How to transform it to MMC?

  5. #95
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    400
    You need 2 pointers per position :
    The "bad one" will simply go to next position, as usual.
    The "good one" means i'm happy with the current position, which improves match length. The "good pointer" will link directly to another "good position", skipping a lot of "bad" positions in the process.

    So, in pseudo code, it looks like this :

    Code:
    // Search
    
    ptr=head[hash];
    
    while (ptr)
    {
    
        // compare string at ptr
    
        if (matchlength>currentlevel) { 
            if (goodprev[ptr&W_MASK] == NULL) {
                ptr=badprev[ptr&W_MASK]; 
                rememberGoodPrev = &goodprev[ptr&W_MASK]; 
            } else {
                *rememberGoodPrev = ptr;
                ptr=goodprev[ptr&W_MASK]; 
                currentlevel++;
            }
        }
        else ptr=badprev[ptr&W_MASK];
    }
    
    // Update
    
    // if literal, len=1
    
    while (len--)
    {
        goodprev[pos&W_MASK]=NULL;
        badprev[pos&W_MASK]=head[hash];
        head[hash]=pos;
    
        ++pos;
    
        hash=UpdateHash(pos);
    }
    As you see, there is a lot in common with regular hash chain.

    The open-source implementation proposed at google code (http://code.google.com/p/mmc/) is actually more complex, because it integrates a few more tricks to improve speed (such as Secondary promotions, proposed by Piotr).

    There is a more complete explanation of algorithm and tricks used at this page :
    http://fastcompression.blogspot.com/...tch-chain.html

    Regards
    Last edited by Cyan; 23rd April 2011 at 01:26. Reason: Modified source code for more completeness

  6. #96
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    In your pseudo code above, you never update goodprev[]! Meaning it is always NULL.
    Is "currentlevel" the best match length found so far?

  7. #97
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    400
    Indeed. The goodprev update is done during the search.

    I've omitted it into pseudo code, for better clarity. It's explained in the picture below :



    We need these concepts :

    ==> Linking : The first time we find a "good position", by construction its pointer is NULL. We keep that in memory. On the next "good position" found, we populate the "good pointer".

    ==> Promotion : When a position is linked by a "good pointer", it is no longer necessary to keep it linked by "bad ones". So we remove it from the chain of bad pointers.

    [Edit] :

    Is "currentlevel" the best match length found so far?
    "currentlevel" is a guaranteed minimum number N of common characters we'll get by continuing on the current chain, using bad pointers. Following a "good pointer" guarantees to find chains with at least "N+1" characters from now on. It makes the search more and more precise.
    Last edited by Cyan; 22nd April 2011 at 16:01.

  8. #98
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    BWT, why listed mmc.h only? (http://code.google.com/p/mmc/downloads/list?can=1&q=)

    Also, can you please update the pseudo-code? Many users will unable to read source code and it's important that users will read a small pseudo code to understand the idea. The picture is not that clear to understand what's going on...

  9. #99
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    400

    Thumbs up

    Mmmh, that's a part of google code that i don't understand completely.
    "Downloads" category seems to be about files to download on top of source files. Typically binary. I will have a look into it again.

    The correct place to read and download the source code is here :
    http://code.google.com/p/mmc/source/browse/#svn%2Ftrunk

    I will update the front page so that it's easier to find.

    Also, can you please update the pseudo-code?
    Sure, edited.
    Last edited by Cyan; 2nd May 2011 at 19:09.

  10. #100
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302

    Talking

    Thanks a lot!

  11. #101
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Slightly stuck in a middle with a Match Finder experiments. Just because a bigger dictionary = better compression with no decompression speed loss and the Match Finder speed/performance is the only actual thing that limits the dictionary size.

    In addition, we should define what we want get - a Deflate competitor or an LZMA competitor with LZOP decompression speeds. Anyway, CRUSH will have at least 1 MB dictionary which is definitely not a Deflate level.

    Did many optimizations and experiments with my Optimizer. It's my first LZ that was optimized/tested thru my Parameter Optimizer!

    Another question, should I add an integrity checking natively? As example ADLER32 or CRC32. Currently, CRUSH has some simple LZ-stream checking that most likely may detect corrupted files.

  12. #102
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    1,895
    > CRUSH will have at least 1 MB dictionary which is definitely not a Deflate level

    There's also rar/lzh and cab/lzx

    > should I add an integrity checking natively?

    Its very easy to add if there's such a requirement, so it doesn't seem as much of a benefit.
    And its certainly bad for benchmark ratings.

  13. #103
    Member
    Join Date
    Jun 2009
    Location
    Cracov, Poland
    Posts
    713
    I think archivers should support any hash/ checksum algorithm user want. Today's CPUs are getting hardware crypto engines, like SPARC T3 - look at: http://en.wikipedia.org/wiki/SPARC_T3 . It mentions that T3 has hardware to compute: DES, 3DES, AES, RC4, SHA1, SHA256/384/512, Kasumi, Galois Field, MD5, RSA to 2048 key, ECC, CRC32. Some hashes are also easily computable on GPGPU.

  14. #104
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    2,732
    are you own T3?

  15. #105
    Member
    Join Date
    Jun 2009
    Location
    Cracov, Poland
    Posts
    713
    Not really, but VIA Nano has support for AES encryption, SHA-1 and SHA-256 hashing

  16. #106
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    149
    QuickLZ encodes a hash table position (level 1 and level 2) or an offset to previous data (level 3). What about ULZ v0.02?

  17. #107
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    ULZ 0.02 is a pure LZ77. It encodes Match Length/Match Offset pair. The compression levels just determine how deep the string searching will be... Decompressor is the same for all modes, as expected.

  18. #108
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Still have no luck with MMC. Will continue trying...

    Anyway, some testing results with current match finder:

    ENWIK8

    Normal, 1 MB - 100,000,000 -> 33,942,665 in 15.600 sec
    Normal, 2 MB - 100,000,000 -> 33,297,420 in 30.127 sec
    Normal, 4 MB - 100,000,000 -> 32,837,273 in 54.194 sec

    Max, 1 MB - 100,000,000 -> 32,983,850 in 61.455 sec
    Max, 2 MB - 100,000,000 -> 32,260,938 in 142.421 sec
    Max, 4 MB - 100,000,000 -> 31,659,247 in 396.854 sec


    bookstar

    Normal, 1 MB - 35,594,240 -> 13,612,045 in 6.688 sec
    Normal, 2 MB - 35,594,240 -> 13,390,039 in 13.071 sec
    Normal, 4 MB - 35,594,240 -> 13,245,085 in 24.075 sec

    Max, 1 MB - 35,594,240 -> 13,208,069 in 23.649 sec
    Max, 2 MB - 35,594,240 -> 12,969,102 in 57.161 sec
    Max, 4 MB - 35,594,240 -> 12,780,865 in 156.668 sec



    Taking into account that these timings are for Sandybridge madness...

    Well, since, like I said, at any compression mode/dictionary size, the decompression speed is on par with LZOP or even faster, the only question is how much time we may spend on compression vs compression gained. A better match finder may help here for sure.

  19. #109
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    563
    If you're processing the data blockwise, why don't you use a BWT-based match finder? On the other hand, you may use a data structure called prefix-list, which allows you to build a sorted list of all strings appearing in the input (similar to BWT). Some modifications to that data structure even allow a dynamic update mechanism (you can delete and insert strings, i.e., implement a sliding window).

    http://callisto.nsu.ru/documentation...yooko_2000.pdf
    M1, CMM and other resources - http://sites.google.com/site/toffer86/ or toffer.tk

  20. #110
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Well, probably the most advantageous is a Fast mode (not Normal or Max) that compresses enwik8 to 37,xxx,xxx bytes within 2 secs. Most likely I will unable go that fast with BWT... In addition, we must search strings on the fly to obtain specific optimizations - i.e. longest match might be not that best - a shorter match with a smaller offset can be preferable.

  21. #111
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    400
    Still have no luck with MMC. Will continue trying...
    Did you have trouble using MMC source code as available at Google Code (http://code.google.com/p/mmc/)
    or do you mean you're currently trying to code your own MMC implementation ?

  22. #112
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Kce, PL
    Posts
    1,037
    encode, did you actually test decompression speed recently?
    I wonder how does large dictionary affect cache efficiency.

  23. #113
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Unfortunately, these days I have a small amount of spare time. So, I just briefly tested/implemented the MMC idea from the pseudo code above. The speed is not improved, most likely something wrong. I need time to get what's going on inside the source, that's why I asked for proper pseudo code. As a note, the hash chains' pseudo code will work!

  24. #114
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Quote Originally Posted by m^2 View Post
    encode, did you actually test decompression speed recently?
    I wonder how does large dictionary affect cache efficiency.
    Bigger dictionary = extra matches = better compression = faster decompression.

    CRUSH is too fast to catch it - bigger dictionary makes it just faster at decompression.
    That's why I'm trying keep the largest dictionary acceptable by match finder.

  25. #115
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302

    Talking

    Moving towards. At some point I should stop fighting for the biggest dictionary possible, since CRUSH is not about the max.compression possible - I even abandoned the Huffman for a faster decompression. Anyway, here are the new results (CRUSH with 1 MB window):

    ENWIK8
    100,000,000 -> 32,687,373

    ENWIK9
    1,000,000,000 -> 289,163,252

    world95.txt
    2,988,578 -> 698,153

    3200.txt
    16,013,962 -> 5,636,735

    calgary.tar
    3,152,896 -> 999,605


  26. #116
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    169
    loving your work, following your posts closely

    can you also include compression and decompression timings when noting the ratios?

  27. #117
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302

    Talking

    For a long time the compression/decompression time is the same since I keep the same match finder and my integer codes are closely to be transparent at decompression. Many recent compression improvements came from better match parsing/optimization and more optimized integer codes.

    BTW, thanks for your words, this gives me more energy and enthusiasm to work more and harder on CRUSH!

  28. #118
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,302
    Improved variable-length integer codes. Now I use prefix codes similar to static Huffman ones.

    New results:

    calgary.tar
    3,152,896 -> 988,685

    world95.txt
    2,988,578 -> 693,449

    ENWIK8
    100,000,000 -> 32,577,338

    ENWIK9
    1,000,000,000 -> 287,333,602

    Nice improvement, taking into account that only the match length coding was changed. Also, the complexity of such new codes are similar to previous variable-length integer codes, thus decompression speed is the same. Also, such codes are faster than static Huffman and equally efficient in terms of compression, thanks to my brute-force optimizer. And it's interesting direction, since such codes may replace static Huffman in many situations...

  29. #119
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    169
    Do I understand this correctly? The compression comes purely from LZ coding, right? The literals are not further compressed, and the literals are all byte-aligned?

    Do you have stats on how many literal bytes are in the output of, to pick one, ENWIK8?

  30. #120
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    382
    Faster VLCs than Huffman sound pretty good! When do you plan to release?
    I am... Black_Fox... my discontinued benchmark

+ Reply to Thread
Page 4 of 5 FirstFirst ... 2345 LastLast

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts