
Thread: RCM - Fast byte-aligned dictionary compressor

  1. #1
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts

    RCM - Fast byte-aligned dictionary compressor

    Hello folks,

    This is my first release of RCM (realtime compression madness), a fast byte-aligned dictionary compressor with a 64 KB window. I plan to post the source after cleaning up a few parts.

    Tests with enwik8 and enwik9 (semi-optimal parse, executable attached):
    enwik8: 41,251,611
    enwik9: 366,945,096

    The format is very simple. A 4-byte int (the compressed block length; worst case is < 1.01 * 64 KB) is followed by zero or more packets. Each packet starts with a token whose low 2 bits give the packet type:
    0 - Literal run. The number of literals is in the next 6 bits: 0 to 63. If the count is 63, the following bytes extend the length (each 255 byte continues the run). Example:
    Code:
    int sym_lit = token >> 2;   // 6-bit literal count from the token
    
    int num_literals = sym_lit;
    if (sym_lit == 63)
    {
        // Continuation bytes: each 255 extends the run further.
        while (sym_lit >= 0)
        {
            sym_lit = input[rp++];
            num_literals += sym_lit;
            sym_lit -= 255;     // loop continues only while bytes are 255
        }
    }
    1 - Short match. Length is the next 3 bits: min length (4) to min length + 7. Distance is the next 3 bits plus the following byte: 11 bits, 1 to 2048.
    2 - Medium match. Length is the next 6 bits: min length to min length + 63. Distance is the next two bytes.
    3 - Long match. Same as a medium match, but the length also includes the next byte (6 + 8 = 14 bits, 16 K max match length).

    Example for these:
    Code:
    int len, dist;
    if (op == 1)
    {
        // Short match: 3-bit length, 3-bit distance plus one byte (11 bits).
        len = (token >> 2) & 7;
        dist = (token >> 5) & 7;
    
        int sym_dist = input[rp++];
        dist += sym_dist << 3;
    }
    else if (op & 0x2)
    {
        // Medium/long match: 6-bit length, 16-bit distance.
        len = (token >> 2) & 63;
    
        dist = input[rp++];
        dist += input[rp++] << 8;
    
        if (op == 3)
        {
            // Long match: one more byte extends the length to 14 bits.
            len += input[rp++] << 6;
        }
    }
    
    len += MATCH_MIN;
    dist++;
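    For contrast, the encoder side of the literal-run coding above might look like this sketch (illustrative names, not from the RCM source), paired with the decoder from the post for a roundtrip check:

```cpp
#include <vector>

// Emit a literal-run token: op = 0 in the low 2 bits, a 6-bit count,
// then 255-valued continuation bytes when the count is 63 or more.
static void put_literal_run(std::vector<unsigned char>& out, int num_literals)
{
    int sym = num_literals < 63 ? num_literals : 63;
    out.push_back((unsigned char)(sym << 2));
    if (sym == 63)
    {
        int rest = num_literals - 63;
        while (rest >= 255) { out.push_back(255); rest -= 255; }
        out.push_back((unsigned char)rest);
    }
}

// Decoder logic from the post, reading the token and any continuation bytes.
static int get_literal_run(const unsigned char* input, int& rp)
{
    int sym_lit = input[rp++] >> 2;
    int num_literals = sym_lit;
    if (sym_lit == 63)
    {
        while (sym_lit >= 0)
        {
            sym_lit = input[rp++];
            num_literals += sym_lit;
            sym_lit -= 255;
        }
    }
    return num_literals;
}
```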
    The executable I have attached uses a sliding window, so the input buffer is 2 * 64 KB (reading 64 KB at a time) and the output buffer is 1.01 * 64 KB. Reading such small I/O blocks is bad for performance, so the timings are not great. There are two compression modes, fast and semi-optimal (greedy on long matches, to stop wasting time in long repetitive regions). Larger window sizes would improve compression; I chose 64 KB to get 16-bit addresses and because the fast compressor (hash chains) has an overhead of only ~80 KB (I needed low overhead and on-stack memory for my purposes).
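    The 2 * 64 KB sliding arrangement described above could be sketched roughly like this (buffer management only, no compression; an assumed layout, not the actual RCM code):

```cpp
#include <cstring>

const int WINDOW = 64 * 1024;

struct SlidingBuffer {
    unsigned char buf[2 * WINDOW];
    int filled = 0; // bytes currently valid in buf

    // Append up to WINDOW fresh bytes; returns the offset of the new block.
    // Once the buffer is full, slide the most recent WINDOW bytes down so
    // matches can still reference the previous 64 KB of history.
    int feed(const unsigned char* src, int n)
    {
        if (filled + n > 2 * WINDOW)
        {
            memmove(buf, buf + filled - WINDOW, WINDOW);
            filled = WINDOW;
        }
        memcpy(buf + filled, src, n);
        int block_start = filled;
        filled += n;
        return block_start;
    }
};
```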

    Commands:
    Code:
    c [input] [output] - Compress input file to output file (fast)
    co [input] [output] - Compress input file to output file (best)
    d [input] [output] - Decompress input file to output file
    h [input] - CRC32 for input file
    Disclaimer: I use the same algorithm in a pure in-memory compressor for a commercial project (my IP). However, I believe others can benefit from these results, so I will share the reference source (once I clean it up) free for non-commercial purposes.
    Attached Files

  2. The Following 2 Users Say Thank You to cade For This Useful Post:

    Gonzalo (27th September 2016),Nania Francesco (27th September 2016)

  3. #2
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @cade: thank you for uploading your nice new program

    RCM - Written by Nauful
    Options:
    c [input] [output] - Compress input file to output file (fast)
    co [input] [output] - Compress input file to output file (best)
    d [input] [output] - Decompress input file to output file
    h [input] - CRC32 for input file

    1. the program compresses and decompresses correctly

    2. the program compresses and decompresses fast
    - but not very fast (zstandard compresses better in less time)
    -
    rcm c ..\DTA\backup.dat ..\result\rcm-c
    Completed in 1089.536 sec

    rcm c0 ..\DTA\backup.dat ..\result\rcm-co
    Completed in 1070.057 sec

    rcm d ..\result\rcm-c backup.dat
    Completed in 806.829 sec
    -
    source file: backup.dat has 65.885.952.000 bytes
    result file: rcm-c has 17.540.168.342 bytes
    -
    3. in my test, option "c" and option "co" produce identical archive files

    4. rcm cannot store/restore the date & time stamp

    for a first release I think it is very good

    - maybe if you publish the source, someone can further optimize the runtime ...

    best regards

  4. #3
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    Thanks for testing. Compression and decompression use 64 KB I/O blocks. This is slow for I/O; preferably I would read/write 512 KB or more per call, but I don't do that here to keep the memory overhead minimal. There are no speed or platform optimizations right now, this is just portable C++. I would be curious to see what others might do to make it faster. Source is coming after I try a few other ideas and clean it up.

    c and co are different paths that produce output compatible with the same decompressor. Try it on enwik8: the compression time is slower, the memory usage is different, and the resulting size is different. Decompression is the same because there is only one format. Dates and other metadata are not included in the format. This only compresses a byte sequence; there is no container (no information about files).

  5. #4
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    now I have tested RCM with a shorter input file (an Oracle dump file)

    in this test, option "c" and option "co" do not produce identical archive files

    exp_db.dmp has 4.498.292.736 bytes
    rcm-exp-c has 1.144.301.184 bytes
    rcm-exp-co has 1.019.097.033 bytes

    ---
    rcm c D:\DATA\exp_db.dmp ..\result\rcm-exp-c
    Completed in 71.085 sec

    rcm co D:\DATA\exp_db.dmp ..\result\rcm-exp-co
    Completed in 1085.859 sec

    rcm d ..\result\rcm-exp-co exp_db.dmp
    Completed in 60.644 sec
    ---
    @cade: can you please try to increase the window size (to 128 KB or 256 KB)?

    it would be interesting to see the effect on the compression result and the necessary compression/decompression times

    best regards

  6. #5
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    I am very curious and interested in testing the program for the WCC benchmark, but the program continuously asks me for missing DLLs.

  7. #6
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    You probably need to install these (MSVC 2015 runtimes):
    https://www.microsoft.com/en-us/down....aspx?id=48145

    If that does not solve this, please let me know what DLL is missing.

  8. #7
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Even after installing the package you recommended and manually installing the DLLs, it does not work. Perhaps it would be better to compile everything with GCC.

  9. #8
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    I cannot easily change the window size (offsets are at most 16-bit), but I have changed the I/O buffer size to 1 MB. Don't use this build to benchmark memory, though: it takes more than necessary (in exchange for better I/O speed). For more speed, I would have to replace the 64 KB sliding window with the entire file in memory (different compressor logic). I compiled with gcc this time.
    Attached Files

  10. The Following 2 Users Say Thank You to cade For This Useful Post:

    Bilawal (4th October 2016),Nania Francesco (29th September 2016)

  11. #9
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    Uploaded source code: http://nauful.com/dump/dc/RCM.cpp

    Compile with -Dx64 if 64-bit. Compile with -DDEBUG for debug sanity checks.

  12. #10
    Member
    Join Date
    Dec 2012
    Location
    japan
    Posts
    149
    Thanks
    30
    Thanked 59 Times in 35 Posts
    RCM always crashes when "co" is used. The compiler is MSVC 2010 Express.

  13. #11
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    Compile with a larger stack size (try 8 MB; the MSVC default is 1 MB). You can see that the binaries I compiled work (MSVC and gcc).
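    For anyone building it themselves, the stack reserve is set at build time; a hedged sketch of the flags (8 MB, spellings as I recall them for MSVC and MinGW gcc):

```shell
# MSVC: /F sets the stack reserve size in bytes (default is 1 MB)
cl /O2 /F8388608 RCM.cpp /Fercm.exe

# MinGW gcc: pass the stack size through to the PE linker
g++ -O2 -Wl,--stack,8388608 RCM.cpp -o rcm.exe
```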

  14. #12
    Member
    Join Date
    Dec 2012
    Location
    japan
    Posts
    149
    Thanks
    30
    Thanked 59 Times in 35 Posts
    What about the code below, at line 816?
    int skip = path[rp].max_length - RCM_MATCH_SEMIOPTIMAL_MAX - RCM_MATCH_MIN+1>>1;

  15. #13
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    I'm not sure what you mean. Is that a speed optimization? A long match is a rare case, so the semi-optimal parser skips to the end of all but the last part of the match. It doesn't cause overflow because an operation mixing a uint16 and two ints is promoted to int.
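    The promotion behaviour mentioned above can be checked in isolation (a standalone illustration; the constants are made up for the example, not RCM's actual values):

```cpp
#include <cstdint>

const int SEMIOPT_MAX = 1024; // illustrative, not RCM's constant
const int MATCH_MIN = 4;      // illustrative, not RCM's constant

int compute_skip(uint16_t max_length)
{
    // The usual arithmetic conversions promote max_length to int before
    // any subtraction, so the intermediate value cannot wrap around even
    // though max_length itself is only 16 bits wide.
    return (max_length - SEMIOPT_MAX - MATCH_MIN + 1) >> 1;
}
```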

  16. #14
    Member
    Join Date
    Dec 2012
    Location
    japan
    Posts
    149
    Thanks
    30
    Thanked 59 Times in 35 Posts
    Maybe compression would be a little better, but slower.

  17. #15
    Member
    Join Date
    Nov 2013
    Location
    US
    Posts
    131
    Thanks
    31
    Thanked 29 Times in 19 Posts
    It's a small optimization to prevent the semi-optimal parser from repeatedly testing very long matches (e.g. sparse files). You can disable this part of the source for a very small compression improvement, but it would take a long time to go through regions of very long matches.

