
Thread: RAZOR - strong LZ-based archiver

  1. #1
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 149 Times in 18 Posts

    RAZOR - strong LZ-based archiver

    Hi everyone,

    I want to share with you my latest archiver-project called "RAZOR".

    RAZOR's key properties are:
    • strong compression ratio & highly asymmetrical
    • fast decompression speed & low memory footprint (1.66N)
    • slow compression speed & high memory footprint
    • rolz/lz compression engine
• Unicode support & solid archiving
    • block-based deduplication (BLAKE2b)
    • ensures integrity for compressed data & archived files (CRC32)
    • special processing for x86/x64, structures, some image-/audio-types

    Some random stuff:
    • 1.x is a technology demo.
    • Please report bugs (if you find any) to my e-mail address.
    • Please do not ask for new features.
    • This is a test version - please verify your archives.
• There is no x86 or Linux executable.
• 2.x is in development and will be done when it's done.
    • 2.x and 1.x won't be compatible.

    Have fun & happy crunching!

    Changelog
    • 2017.09.08 - 1.00 - removed
    • 2017.09.15 - 1.01 - removed
    • 2018.03.11 - 1.03.6 - see readme.txt
    • 2018.03.22 - 1.03.7 - see readme.txt

See this post for an example Double Commander integration.
Attached Files
    Last edited by Christian; 23rd March 2018 at 01:18.

  2. The Following 39 Users Say Thank You to Christian For This Useful Post:

    78372 (8th September 2017),algorithm (8th September 2017),Amsal (8th September 2017),Bekk (1st April 2018),boxerab (9th September 2017),Bulat Ziganshin (11th September 2017),Chirantan (14th September 2017),comp1 (8th September 2017),dado023 (15th September 2017),danlock (10th November 2018),Darek (8th September 2017),diskzip (13th March 2018),encode (8th September 2017),ffmla (23rd September 2017),hexagone (11th September 2017),hunman (8th September 2017),inikep (10th September 2017),Jarek (12th March 2018),khavish (11th September 2017),Lucas (8th September 2017),mhajicek (22nd April 2018),Mike (8th September 2017),m^3 (8th September 2017),Nania Francesco (25th September 2017),olokelo (23rd November 2018),oltjon (10th September 2017),RamiroCruzo (8th September 2017),Razor12911 (9th September 2017),Samantha (9th September 2017),Simorq (8th September 2017),Skymmer (8th September 2017),spark (16th September 2017),Stephan Busch (8th September 2017),WinnieW (16th October 2017),xinix (29th September 2017),xpk (4th July 2018),YuriTC (21st December 2018),Zeokat (17th August 2018),_Bluesman_ (22nd March 2018)

  3. #2
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    60
    Thanks
    28
    Thanked 18 Times in 12 Posts
WOW, maybe the strongest LZ ever regarding compression ratio. Maybe ROLZ and a lot of magic?

    Code:
    enwik8 22042545
    silesia.tar 43252795
Compression speed is about 0.3-0.4 MB/s on my machine; decompression speed is about 80 MB/s.
I tested it under Wine, so the numbers may not be very accurate.

  4. #3
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    860
    Thanks
    440
    Thanked 169 Times in 80 Posts
    wow Christian .. you did it again
    amazing compression so far

  5. #4
    Member
    Join Date
    Nov 2015
    Location
Śląsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    Good to see you back.

  6. #5
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
Many days have passed since my last post, but I didn't want to miss Christian's return. A legendary comeback!

  7. #6
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    I hope to return again with my benchmark to celebrate Christian's return

  8. The Following 3 Users Say Thank You to Nania Francesco For This Useful Post:

    Bekk (1st April 2018),encode (8th September 2017),Stephan Busch (8th September 2017)

  9. #7
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    860
    Thanks
    440
    Thanked 169 Times in 80 Posts
    Hi Francesco.. I wrote you a pm

  10. #8
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Any chances to see a x86 compile and sources ?
    What kind of executables are filtered ?
    Can we get a full list of specially treated data ?
    What algorithm is used for creating the hashes during deduplication steps ?

  11. #9
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Ok. 279 different extensions are found inside the RAZOR. They are (separated with ;):
    Code:
    *.7z;*.xz;*.lzma;*.ace;*.arc;*.arj;*.bz;*.tbz;*.bz2;*.tbz2;*.cab;*.deb;*.gz;*.tgz;*.ha;*.lha;*.lzh;*.lzo;*.lzx;*.pak;*.rar;*.rpm;*.sit;*.zoo;*.zip;*.jar;*.ear;*.war;*.msi;*.3gp;*.avi;*.mov;*.mpeg;*.mpg;*.mpe;*.wmv;*.aac;*.ape;*.fla;*.flac;*.la;*.mp3;*.m4a;*.mp4;*.ofr;*.ogg;*.pac;*.ra;*.rm;*.rka;*.shn;*.swa;*.tta;*.wv;*.wma;*.wav;*.swf;*.chm;*.hxi;*.hxs;*.gif;*.jpeg;*.jpg;*.jp2;*.png;*.tiff;*.bmp;*.ico;*.psd;*.psp;*.awg;*.ps;*.eps;*.cgm;*.dxf;*.svg;*.vrml;*.wmf;*.emf;*.ai;*.md;*.cad;*.dwg;*.pps;*.key;*.sxi;*.max;*.3ds;*.iso;*.bin;*.nrg;*.mdf;*.img;*.pdi;*.tar;*.cpio;*.xpi;*.vfd;*.vhd;*.vud;*.vmc;*.vsv;*.vmdk;*.dsk;*.nvram;*.vmem;*.vmsd;*.vmsn;*.vmss;*.vmtm;*.inl;*.inc;*.idl;*.acf;*.asa;*.h;*.hpp;*.hxx;*.c;*.cpp;*.cxx;*.m;*.mm;*.go;*.swift;*.rc;*.java;*.cs;*.rs;*.pas;*.bas;*.vb;*.cls;*.ctl;*.frm;*.dlg;*.def;*.f77;*.f;*.f90;*.f95;*.asm;*.s;*.sql;*.manifest;*.dep;*.mak;*.clw;*.csproj;*.vcproj;*.sln;*.dsp;*.dsw;*.class;*.bat;*.cmd;*.bash;*.sh;*.xml;*.xsd;*.xsl;*.xslt;*.hxk;*.hxc;*.htm;*.html;*.xhtml;*.xht;*.mht;*.mhtml;*.htw;*.asp;*.aspx;*.css;*.cgi;*.jsp;*.shtml;*.awk;*.sed;*.hta;*.js;*.json;*.php;*.php3;*.php4;*.php5;*.phptml;*.pl;*.pm;*.py;*.pyo;*.rb;*.tcl;*.ts;*.vbs;*.text;*.txt;*.tex;*.ans;*.asc;*.srt;*.reg;*.ini;*.doc;*.docx;*.mcw;*.dot;*.rtf;*.hlp;*.xls;*.xlr;*.xlt;*.xlw;*.ppt;*.pdf;*.sxc;*.sxd;*.sxi;*.sxg;*.sxw;*.stc;*.sti;*.stw;*.stm;*.odt;*.ott;*.odg;*.otg;*.odp;*.otp;*.ods;*.ots;*.odf;*.abw;*.afp;*.cwk;*.lwp;*.wpd;*.wps;*.wpt;*.wrf;*.wri;*.abf;*.afm;*.bdf;*.fon;*.mgf;*.otf;*.pcf;*.pfa;*.snf;*.ttf;*.dbf;*.mdb;*.nsf;*.ntf;*.wdb;*.db;*.fdb;*.gdb;*.exe;*.dll;*.ocx;*.vbx;*.sfx;*.sys;*.tlb;*.awx;*.com;*.obj;*.lib;*.out;*.o;*.so;*.pdb;*.pch;*.idb;*.ncb;*.opt

  12. #10
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    60
    Thanks
    28
    Thanked 18 Times in 12 Posts
    Quote Originally Posted by Skymmer View Post
    Any chances to see a x86 compile and sources ?
    What kind of executables are filtered ?
    Can we get a full list of specially treated data ?
    What algorithm is used for creating the hashes during deduplication steps ?
    I think it uses Blake2

  13. #11
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    848
    Thanks
    483
    Thanked 333 Times in 246 Posts
Really strong compression ratios, especially for multimedia files like BMP, TGA. And it's very fast indeed!

  14. #12
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    2,802
    Thanks
    125
    Thanked 712 Times in 342 Posts
It's MT/vector-rANS. Please compare compression with rzm or something.
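For readers who haven't met rANS: below is a minimal scalar rANS round trip in Python. The vector/MT variants interleave several such states and run them in parallel, but the per-state math is the same. The 3-symbol alphabet, 12-bit frequencies and byte-wise renormalization are toy assumptions for illustration, not RAZOR's actual coder.

Code:
PROB_BITS = 12
PROB_SCALE = 1 << PROB_BITS          # frequencies sum to this
RANS_L = 1 << 23                     # renormalization lower bound

# toy static model over a 3-symbol alphabet (an assumption for the demo)
FREQ = {'a': 2048, 'b': 1024, 'c': 1024}
CUM = {'a': 0, 'b': 2048, 'c': 3072}

def encode(symbols):
    x, out = RANS_L, bytearray()
    for s in reversed(symbols):               # rANS encodes in reverse
        f, c = FREQ[s], CUM[s]
        x_max = ((RANS_L >> PROB_BITS) << 8) * f
        while x >= x_max:                     # renormalize: shift bytes out
            out.append(x & 0xFF)
            x >>= 8
        x = (x // f) * PROB_SCALE + c + (x % f)
    return x, bytes(out[::-1])                # final state + byte stream

def decode(x, stream, n):
    it, out = iter(stream), []
    for _ in range(n):
        slot = x & (PROB_SCALE - 1)
        s = next(k for k in FREQ if CUM[k] <= slot < CUM[k] + FREQ[k])
        out.append(s)
        x = FREQ[s] * (x >> PROB_BITS) + slot - CUM[s]
        while x < RANS_L:                     # renormalize: pull bytes in
            x = (x << 8) | next(it)
    return ''.join(out)

msg = 'abacabacbbaaac'
state, payload = encode(msg)
assert decode(state, payload, len(msg)) == msg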

  15. #13
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 149 Times in 18 Posts
    Quote Originally Posted by algorithm View Post
WOW, maybe the strongest LZ ever regarding compression ratio. Maybe ROLZ and a lot of magic?

    Code:
    enwik8 22042545
    silesia.tar 43252795
Compression speed is about 0.3-0.4 MB/s on my machine; decompression speed is about 80 MB/s.
I tested it under Wine, so the numbers may not be very accurate.
    Yes, you guessed right - ROLZ. You can use a bigger dictionary (default is 64M) for slightly better results.
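For context, here is a minimal sketch of the ROLZ idea in Python: instead of coding a full window offset, a match is coded as a small slot index into a per-context table of recent positions, which is what makes large windows cheap to reference. SLOTS, MIN_MATCH and the 1-byte context are illustrative assumptions; RAZOR's real engine is unpublished.

Code:
SLOTS = 16        # offsets remembered per context (assumption)
MIN_MATCH = 3     # shortest match worth a token
MAX_MATCH = 255

def _ctx(buf, i):
    # context = previous byte (real ROLZ coders use richer contexts)
    return buf[i - 1] if i > 0 else 0

def rolz_compress(data):
    table = [[] for _ in range(256)]   # context -> recent positions
    tokens, i = [], 0
    while i < len(data):
        c = _ctx(data, i)
        best_len, best_slot = 0, -1
        for slot, p in enumerate(table[c]):
            l = 0
            while (l < MAX_MATCH and i + l < len(data)
                   and data[p + l] == data[i + l]):
                l += 1
            if l > best_len:
                best_len, best_slot = l, slot
        table[c] = ([i] + table[c])[:SLOTS]    # record this position
        if best_len >= MIN_MATCH:
            # a match is (slot, length): the slot index replaces the
            # full offset, which is the whole point of ROLZ
            tokens.append(('M', best_slot, best_len))
            i += best_len
        else:
            tokens.append(('L', data[i]))
            i += 1
    return tokens

def rolz_decompress(tokens):
    table = [[] for _ in range(256)]
    out = bytearray()
    for t in tokens:
        i = len(out)
        c = _ctx(out, i)
        if t[0] == 'L':
            out.append(t[1])
        else:
            p = table[c][t[1]]
            for k in range(t[2]):       # byte-wise copy allows overlap
                out.append(out[p + k])
        table[c] = ([i] + table[c])[:SLOTS]    # mirror the encoder
    return bytes(out)

s = b'the quick brown fox jumps over the quick brown dog ' * 4
assert rolz_decompress(rolz_compress(s)) == s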

    Quote Originally Posted by Stephan Busch View Post
    wow Christian .. you did it again
    amazing compression so far
    Thank you for your support and ideas.

    Quote Originally Posted by Nania Francesco View Post
    I hope to return again with my benchmark to celebrate Christian's return
Thank you, Nania. But I'm just stopping by; I'm going to be a father soon.

    Quote Originally Posted by Skymmer View Post
    Any chances to see a x86 compile and sources ?
    What kind of executables are filtered ?
    Can we get a full list of specially treated data ?
    What algorithm is used for creating the hashes during deduplication steps ?
Sources: I've talked this over thoroughly with Stephan. Maybe in 2018, after things have settled down for me. It also depends on how this develops.
exe: All data which looks like x86/x64. No extensions needed.
Specially treated data: exe, some uncompressed audio and image formats, some structured types; extensions are just used for sorting
    Dedupe: Blake2b

  16. The Following 8 Users Say Thank You to Christian For This Useful Post:

    Bekk (1st April 2018),Cyan (25th April 2018),encode (9th September 2017),Mike (8th September 2017),Nania Francesco (25th September 2017),oltjon (10th September 2017),schnaader (9th September 2017),Stephan Busch (9th September 2017)

  17. #14
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Thanks for answering and goodluck with your fatherhood !

  18. The Following 2 Users Say Thank You to Skymmer For This Useful Post:

    Bekk (1st April 2018),Christian (9th September 2017)

  19. #15
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
I understand you can't wait to become a father. I have been one for many years, and I can guarantee that nothing bad happens if you keep attending this crazy world of data compression, which certainly needs your contribution! We'll always be waiting for you

  20. The Following 2 Users Say Thank You to Nania Francesco For This Useful Post:

    Bekk (1st April 2018),Christian (9th September 2017)

  21. #16
    Member Jaff's Avatar
    Join Date
    Oct 2012
    Location
    Dracula's country
    Posts
    100
    Thanks
    112
    Thanked 20 Times in 16 Posts
+1 for an x86 version of the executable

  22. #17
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
Any chance of a multithreaded version?
Any chance of bigger dictionaries?
Any limitations we should know of? (aka any hidden date-based kill switch)
Does it do any kind of data integrity checking?


-- edit --
it seems to be at least multithreaded, with up to 2 heavy CPU threads

  23. #18
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
    just a little feedback

    Code:
    Quake 3 arena + team arena
    7zip (pcf del rep rzm) 828 MB (869,031,293 bytes)
    Razor                  832 MB (872,699,935 bytes)
    
    
    Worms Armageddon
    7zip (nanozip)       397 MB (417,004,999 bytes)
    Razor                379 MB (398,304,752 bytes)
    
    Get Medieval
    7-zip (m7repacker)    193 MB (202,452,790 bytes)
    Razor                 181 MB (190,626,607 bytes)

  24. The Following 5 Users Say Thank You to SvenBent For This Useful Post:

    78372 (10th September 2017),Bekk (1st April 2018),Christian (9th September 2017),Razor12911 (10th September 2017),Simorq (10th September 2017)

  25. #19
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 149 Times in 18 Posts
    Quote Originally Posted by SvenBent View Post
Any chance of a multithreaded version?
Any chance of bigger dictionaries?
Any limitations we should know of? (aka any hidden date-based kill switch)
Does it do any kind of data integrity checking?


-- edit --
it seems to be at least multithreaded, with up to 2 heavy CPU threads
    Threading:
When I started rz a couple of years ago, the goals were: fast decompression, a moderate memory footprint, and strong compression.
I experimented a lot with near-optimal parsing. In the end, parsing is not even done on chunks/blocks; I sacrificed chunking for the sake of ratio.
Therefore, only part of the matchfinder remained for threading. There are other components of the archiver which could be threaded, but the gain would be small.
To speed up compression, the parser would be my first starting point (the easiest approach would be chunking).

    Dictionaries:
You can use lz-dictionaries up to 1023M, if you have memory in spades. But at some point, the returns diminish. The default size of 64M is probably too small. Because of deduplication you don't need extremely large lz-dictionaries, though.

    Limitations:
    No.

    Integrity:
    -The compressed data-streams are verified.
    -The decompressed files are verified.

    CRC32 is used. I decided to use it in order to be consistent with other tools.
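A minimal sketch of such a two-layer check in Python (zlib stands in for the rz codec; the framing is an assumption for illustration, not RAZOR's format): one CRC32 guards the compressed stream, a second guards the restored file.

Code:
import zlib

def crc32(data: bytes) -> int:
    return zlib.crc32(data) & 0xFFFFFFFF

def pack(raw: bytes):
    comp = zlib.compress(raw)                 # stand-in for the rz codec
    return comp, crc32(comp), crc32(raw)      # stream CRC + file CRC

def unpack(comp: bytes, stream_crc: int, file_crc: int) -> bytes:
    if crc32(comp) != stream_crc:             # catches damaged archive data
        raise ValueError("compressed stream corrupt")
    raw = zlib.decompress(comp)
    if crc32(raw) != file_crc:                # catches codec/extraction bugs
        raise ValueError("decompressed file corrupt")
    return raw

blob, scrc, fcrc = pack(b"hello " * 100)
assert unpack(blob, scrc, fcrc) == b"hello " * 100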

  26. The Following 9 Users Say Thank You to Christian For This Useful Post:

    78372 (9th September 2017),Bekk (1st April 2018),danlock (10th November 2018),Mike (9th September 2017),RamiroCruzo (9th September 2017),Razor12911 (10th September 2017),spark (16th September 2017),Stephan Busch (9th September 2017),SvenBent (11th September 2017)

  27. #20
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    464
    Thanks
    202
    Thanked 81 Times in 61 Posts
    Congrats on the baby!!! And thank you for sharing this with the world.

Simply unbelievable! I think Razor plays in its own league: CM ratios with LZMA decompression speeds.

Really, I keep throwing things at it and it always outperforms 7z, FA and even Nanozip, while being at least as fast as them in decompression, and sometimes 2x-5x faster. You did the impossible again. I'm switching my long-term backup format to .raz. Just letting you know.

I can confirm that it works just fine on Linux under Wine and, to date, has compressed some 4 GB without incident.

    BTW... You don't have any other sorcery like this 'buried in your backups', do you?

PS: My 2 cents for the source code. Imagine what this could become combined with a multithreaded precomp and a few more tricks like TTA for audio...

  28. The Following 2 Users Say Thank You to Gonzalo For This Useful Post:

    Bekk (1st April 2018),Christian (10th September 2017)

  29. #21
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    317
    Thanks
    168
    Thanked 51 Times in 37 Posts
    Quick test:

HFCB vm.dll
    Code:
    PROGRAM      SIZE           COMP TIME                        DECOMP TIME
    =========================================================================================
       original  4,244,176,896                                                              
    rz -d 1024k    979,333,615  6815.528 (CPU), 6051.363 (Wall)  45.131 (CPU), 47.376 (Wall)
             rz    883,196,199  6854.762 (CPU), 5756.198 (Wall)  41.855 (CPU), 44.602 (Wall)
    Test Machine: Windows 7, i7-3770k (No OC), 32GB RAM, SSD drive

    PS: At 1024k dictionary, only 12MB is needed for decompression! Christian, please consider sharing an x86 compile!

  30. The Following 3 Users Say Thank You to comp1 For This Useful Post:

    78372 (10th September 2017),Bekk (1st April 2018),Christian (10th September 2017)

  31. #22
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 149 Times in 18 Posts
    Quote Originally Posted by Gonzalo View Post
    Congrats on the baby!!! And thank you for sharing this with the world.

Simply unbelievable! I think Razor plays in its own league: CM ratios with LZMA decompression speeds.

Really, I keep throwing things at it and it always outperforms 7z, FA and even Nanozip, while being at least as fast as them in decompression, and sometimes 2x-5x faster. You did the impossible again. I'm switching my long-term backup format to .raz. Just letting you know.

I can confirm that it works just fine on Linux under Wine and, to date, has compressed some 4 GB without incident.

    BTW... You don't have any other sorcery like this 'buried in your backups', do you?

PS: My 2 cents for the source code. Imagine what this could become combined with a multithreaded precomp and a few more tricks like TTA for audio...
    Wow, thank you very much.

Compression of uncompressed audio is already OK. With 7Z/LZ4/ZSTD/... becoming widespread, recompression is a rapidly moving target. Tools like precomp and reflate are very impressive.

During development, I used 7z, quark, nanozip-co, rzm and ccmx as reference points. Especially nanozip and ccmx can be stronger. rz's text compression is its weak spot (in comparison to nanozip/ccmx). This was the last thing I was working on. It can be a bit stronger with a better/slower parser. At the moment, rz uses one-arrival, but multi-token parsing. With multiple arrivals and multiple tokens the compression gets stronger (enwik8 -> 21.7), but the compression speed drops significantly. Then, I prototyped a complicated syntax-extension involving some dynamic dictionary-stuff - this worked really great (enwik8 -> 20.9), but messed up my code-base and architecture. My todo-list for razor was getting longer and longer, so I decided to release it now (JPG and ZIP recompression were planned, too).
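To illustrate the parsing terminology: a one-arrival parser keeps a single cheapest "arrival" per position, like a shortest-path DP over literal and match edges; a multiple-arrival parser keeps k candidates per position and explores more paths at a large speed cost. Here is a toy Python sketch with assumed costs (not rz's price model):

Code:
def one_arrival_parse(n, matches, lit_cost=9, match_cost=20):
    """matches: dict pos -> list of available match lengths at pos.
    Returns (total_cost, tokens) for one cheapest parse of n symbols."""
    INF = float('inf')
    cost = [INF] * (n + 1)
    arrival = [None] * (n + 1)      # the single arrival kept per position
    cost[0] = 0
    for i in range(n):
        if cost[i] == INF:
            continue
        if cost[i] + lit_cost < cost[i + 1]:          # literal edge
            cost[i + 1] = cost[i] + lit_cost
            arrival[i + 1] = (i, ('lit',))
        for l in matches.get(i, ()):                  # match edges
            j = min(i + l, n)
            if cost[i] + match_cost < cost[j]:
                cost[j] = cost[i] + match_cost
                arrival[j] = (i, ('match', j - i))
    tokens, j = [], n                                 # backtrack
    while j > 0:
        i, tok = arrival[j]
        tokens.append(tok)
        j = i
    return cost[n], tokens[::-1]

# two literals then one length-8 match beats taking the first match:
print(one_arrival_parse(10, {0: [4], 2: [8]}))
# -> (38, [('lit',), ('lit',), ('match', 8)])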

    No comment on buried stuff.

    Quote Originally Posted by comp1 View Post
    Quick test:

HFCB vm.dll
    Code:
    PROGRAM      SIZE           COMP TIME                        DECOMP TIME
    =========================================================================================
       original  4,244,176,896                                                              
    rz -d 1024k    979,333,615  6815.528 (CPU), 6051.363 (Wall)  45.131 (CPU), 47.376 (Wall)
             rz    883,196,199  6854.762 (CPU), 5756.198 (Wall)  41.855 (CPU), 44.602 (Wall)
    Test Machine: Windows 7, i7-3770k (No OC), 32GB RAM, SSD drive

    PS: At 1024k dictionary, only 12MB is needed for decompression! Christian, please consider sharing an x86 compile!
Thank you. Does rz perform better with a bigger dictionary (like 256M or 512M)? rz's deduplication ability scales with the size of the selected lz-dictionary. Btw, you have a very nice machine.

About the small dictionary: Yes, it's funny, I understand. But a 1M dictionary + x86 is not really rz's use case, is it?

  32. #23
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
    Quote Originally Posted by Gonzalo View Post
    And I'm switching my long-term backups format to .raz. Just letting you know.
    INFIDEL the extension is clearly supposed to be .rz and pronounced JIFRZ

  33. #24
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
    Quote Originally Posted by Christian View Post
    It can be a bit stronger with a better/slower parser.
Would this break future compatibility?
Is the decompression stage pretty much frozen, or should I keep the RZ.exe around for decompressing each archive? Aka, will future Razor be able to decompress current archives?

    Quote Originally Posted by Christian View Post
    rz's deduplication ability scales with the size of the selected lz-dictionary.
Any info on how long/effective this window is, in rough and simple terms?



also... any way to get ECM prefilters into Razor >.>

  34. #25
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    464
    Thanks
    202
    Thanked 81 Times in 61 Posts
    Quote Originally Posted by Christian View Post
    No comment on buried stuff.
    Damn! I knew it

    With 7Z/LZ4/ZSTD/... getting spread, recompression is a rapidly moving target. Tools like precomp and reflate are very impressive.
Right. That's exactly why I'd rather let other people have access to Razor's source in case they decide to improve it, say by including precomp as a preprocessing stage. I know you don't have the time to do it yourself. Of course, it is your decision to make and I don't mean to push.

rz's text compression is its weak spot (in comparison to nanozip/ccmx). This was the last thing I was working on. It can be a bit stronger with a better/slower parser. At the moment, rz uses one-arrival, but multi-token parsing. With multiple arrivals and multiple tokens the compression gets stronger (enwik8 -> 21.7), but the compression speed drops significantly. Then, I prototyped a complicated syntax-extension involving some dynamic dictionary-stuff - this worked really great (enwik8 -> 20.9), but messed up my code-base and architecture.
    There is GLZA for a strong asymmetric solution (the latter scheme you describe sounded a lot like it - enwik8 -> 20.4) and ppmd for a fast symmetric approach (enwik8 -> 21.3).

    One more thing: What method do you use for deduplication? I tried Bulat's rep filter and it helped. To ratio, a little; to speed, a hell of a lot

    Quote Originally Posted by SvenBent View Post
    INFIDEL the extension is clearly supposed to be .rz and pronounced JIFRZ
    I'm sure I am going to hell now...

  35. #26
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    317
    Thanks
    168
    Thanked 51 Times in 37 Posts
    Quote Originally Posted by Christian View Post
Thank you. Does rz perform better with a bigger dictionary (like 256M or 512M)? rz's deduplication ability scales with the size of the selected lz-dictionary. Btw, you have a very nice machine.

About the small dictionary: Yes, it's funny, I understand. But a 1M dictionary + x86 is not really rz's use case, is it?
    It may not be the intended use case, but I find advances in data compression very interesting, more so when the same hardware limitations are used for testing. For example, cmix achieves amazing levels of compression, but only because it uses amazing amounts of RAM.

    Here are some more benchmarks - I hope these are helpful:


    WINDOWS 3.0 MULTIMEDIA EDITION
    Code:
    PROGRAM  SIZE         C.TIME   D.TIME
                          (CPU)    (CPU)
    ======================================
        org  330,078,282  
      rz 1m   92,542,512  497.082  6.131
     rz 64m   90,398,131  626.874  5.819
    rz 128m   90,351,664  653.597  5.600
    rz 256m   90,344,006  648.309  5.897
    FIREFOX PROGRAM FILES DIRECTORY
    Code:
    PROGRAM  SIZE         C.TIME   D.TIME
                          (CPU)    (CPU)
    ======================================
        org  245,683,256  
      rz 1m   75,949,354  305.216  3.775
     rz 64m   65,149,620  425.898  3.853
    rz 128m   65,104,141  443.402  3.557
    rz 256m   65,070,761  432.310  3.526
    ENCODE'S COMPRESSION CORPUS (ENCCC)
    Code:
    PROGRAM  SIZE         C.TIME   D.TIME
                          (CPU)    (CPU)
    ======================================
        org  124,458,915  
      rz 1m   33,254,389  166.765  1.903
     rz 64m   31,925,561  174.035  1.903
    rz 128m   31,926,575  169.448  1.825
    rz 256m   31,926,575  172.490  1.872

  36. #27
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
just a few more test results on compression ratios

    Code:
    Age of mythology Gold ( 2 CD image in CloneCD format of 3 files each)
    Uncompressed      1.38 GB (1,493,080,807 bytes)
    ECM prefiltered   1.21 GB (1,309,862,685 bytes)
    m7repacker +ECM   0.99 GB (1,068,985,764 bytes)
    
    Razor 64MB        1.12 GB (1,206,599,484 bytes)
    Razor 64MB +ECM    984 MB (1,032,796,729 bytes)
    Razor 128MB +ECM   984 MB (1,032,037,408 bytes)
    Razor 1024MB +ECM  984 MB (1,032,788,859 bytes)
    All ECM prefiltered files include unecm.exe

* The ECM prefilter still helps Razor's compression.
* Razor beats brute-forced 7-Zip compression.
* Not much compression was gained by increasing the dictionary.
* There was actually a regression in compression with too big a dictionary.

  37. #28
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    854
    Thanks
    45
    Thanked 104 Times in 82 Posts
A quick feature request (besides ECM prefilters): would it be possible to make it so that if you just drag and drop a .rz archive onto rz.exe, it will automatically decompress it (creating a folder with the name of the archive)?
It would help a lot for people who are not command-line savvy.

  38. #29
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    2,802
    Thanks
    125
    Thanked 712 Times in 342 Posts
    @SvenBent:
    Just create some rzx.bat with
    Code:
@echo off
rem %1 is the dropped archive; %~n1 is its name without path/extension,
rem so the files are extracted into a folder named after the archive.
rz.exe -y -o "%~n1" x "%1" *
    and drop archives on it instead.

  39. The Following 4 Users Say Thank You to Shelwien For This Useful Post:

    Bekk (1st April 2018),dado023 (11th September 2017),PSHUFB (27th September 2017),Simorq (30th January 2018)

  40. #30
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 149 Times in 18 Posts
    Quote Originally Posted by SvenBent
Would this break future compatibility?
Is the decompression stage pretty much frozen, or should I keep the RZ.exe around for decompressing each archive? Aka, will future Razor be able to decompress current archives?
    Better keep it around.

    Deduplication:
First of all, Bulat's tools are great. rep seems to be different from rz's deduplication: rep has a sliding window and is therefore limited by the system's RAM. Increase rz's sliding window to get similar results.
rz's deduplication does not have a sliding window, but its memory consumption is tied to the selected window size: deduplication takes 1.5N, so for 64M this is 96M. rz's dedupe is block-level, and you can see its range during compression ("Window : 65536K (512M..128G)") - 512M with the minimum block size, 128G with the maximum block size.
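A minimal sketch of block-level deduplication with BLAKE2b in Python. The fixed 64K block size and the in-memory index are assumptions for illustration; RAZOR's block-size policy and data structures are unpublished.

Code:
import hashlib

BLOCK = 64 * 1024                       # assumed fixed block size

def dedupe(data: bytes):
    seen = {}                           # BLAKE2b digest -> block index
    blocks, refs = [], []
    for i in range(0, len(data), BLOCK):
        blk = data[i:i + BLOCK]
        h = hashlib.blake2b(blk, digest_size=32).digest()
        if h in seen:
            refs.append(seen[h])        # duplicate: store a reference
        else:
            seen[h] = len(blocks)
            refs.append(len(blocks))
            blocks.append(blk)          # unique block kept for the codec
    return blocks, refs

def rehydrate(blocks, refs):
    return b''.join(blocks[r] for r in refs)

data = (b'A' * BLOCK) * 3 + b'tail'
blocks, refs = dedupe(data)
assert rehydrate(blocks, refs) == data and len(blocks) == 2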

    Quote Originally Posted by Gonzalo
    There is GLZA for a strong asymmetric solution (the latter scheme you describe sounded a lot like it - enwik8 -> 20.4) and ppmd for a fast symmetric approach (enwik8 -> 21.3).
    Kennon Conrad did stellar work on GLZA - as did Jarek Duda on ANS, btw. GLZA is something new and it is remarkable for asymmetric text compression, indeed.
Now, lz/rolz can be parsed nicely. My concept is still based on rolz. I think we can push rolz a little further without touching speed or versatility.

    Quote Originally Posted by comp1
    ...
    Here are some more benchmarks - I hope these are helpful:
    ...
Thank you. Yes, they are. 64M seems reasonable as a default.

    Quote Originally Posted by SvenBent
* There was actually a regression in compression with too big a dictionary.
    There's not much I can do about that. The parser is pretty complex and sometimes it goes wrong.


