
Thread: Dzo - Compression with Smart Deduplication

  1. #1
    Member
    Join Date
    Sep 2011
    Location
    Bangalore
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Dzo - Compression with Smart Deduplication

    Dzo is a unique offering from Essenso Labs, built on an innovative combined compression and deduplication approach. It delivers 2x to 3x better compression at 3x better speed than
    other compression products (WinZip, 7-Zip, WinRAR) on most user-generated volume data. It is an evolutionary product that bridges the gap between enterprise-level
    deduplication and end-user-level compression. Dzo aims to bring the power of deduplication to the masses, targeting data management in cloud storage and volume transmission.

    For more information visit www.essenso.com
    and for the beta version http://essensolabs.com/invitation-trial

    A trial version will soon be available, without the intermediate beta step and with backup features. Looking for feedback.

  2. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    That is: 2x-3x better compression than WinZip and 3x better speed than 7-zip?

    ADDED:
    I think I've seen more appealing websites:

    [attached screenshot: the site in Firefox 3.6.x]

  3. #3
    Member
    Join Date
    Sep 2011
    Location
    Bangalore
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Yep, it's 2x-3x better than 7-Zip on volume data. You can check the website for more details. The way we look at data is different, and on big folders Dzo performs better.

  4. #4
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by veer View Post
    Yep, it's 2x-3x better than 7-Zip on volume data. You can check the website for more details. The way we look at data is different, and on big folders Dzo performs better.
    I see. Weirdly, the website works OK now.
    I have a suggestion:
    Can you benchmark it on some public data? Publishing your own test files would work OK.

  5. #5
    Member
    Join Date
    Sep 2011
    Location
    Bangalore
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    You can find good samples in the Google Code repository. Dzo removes repetitions of bigger chunks much faster and then applies compression.
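
    (For readers new to the idea, a minimal sketch of what such chunk-level deduplication could look like. This is a guess at the technique, not Dzo's actual format: fixed-size chunks keyed by a strong hash, with repeats replaced by back-references.)

    Code:
    import hashlib

    CHUNK = 64 * 1024  # hypothetical chunk size; Dzo's real granularity is not public

    def dedup(data: bytes):
        """Keep the first copy of each fixed-size chunk; replace repeats
        with a reference to the first occurrence."""
        seen = {}                 # chunk digest -> index into `unique`
        unique, refs = [], []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            digest = hashlib.sha256(chunk).digest()
            if digest in seen:
                refs.append(seen[digest])      # duplicate: store the index only
            else:
                seen[digest] = len(unique)
                refs.append(len(unique))
                unique.append(chunk)           # first occurrence: store the data
        return unique, refs

    def restore(unique, refs) -> bytes:
        return b"".join(unique[i] for i in refs)

    The stream handed to the back-end compressor (unique chunks plus a short reference list) then no longer contains long-range repeats, which a window-limited compressor such as zip would otherwise miss entirely.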

  6. #6
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    What google code repository?
    I don't see any mention of it on your website.

  7. #7
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I benchmarked a beta version about a month ago. http://mattmahoney.net/dc/text.html#2356

    At this point Dzo is really a solid-mode archiver, because it lacks incremental backup and incremental restore. It deduplicates a folder to a temporary file and then compresses it with LZMA (7-Zip default compression level). I did a separate test with the maximum compression files, creating maxcomp.cat (a concatenation of all the files), maxcomp.tar, and maxcomp.zip, and putting them in the same folder as the original files. It correctly deduplicated all of the files except maxcomp.zip.
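
    (To make that flow concrete, a rough sketch of a dedup-then-LZMA pipeline of the kind described here; the chunking and container details are my assumptions, not Dzo's actual format. dedup() is the sketch from post #5 above.)

    Code:
    import lzma
    import pathlib

    def pack_folder(folder: str) -> str:
        """Solid-mode packing: concatenate all files, deduplicate the
        stream, then LZMA-compress it to <folder>.dzo one level up."""
        root = pathlib.Path(folder)
        blob = bytearray()
        for path in sorted(p for p in root.rglob("*") if p.is_file()):
            blob += path.read_bytes()         # one big stream: solid mode
        unique, refs = dedup(bytes(blob))     # stage 1: deduplicate
        stream = b"".join(unique)             # (reference list omitted for brevity)
        # A real tool writes this intermediate stream to a temporary file
        # inside the source folder -- the disk-space concern raised below.
        out = root.parent / (root.name + ".dzo")
        out.write_bytes(lzma.compress(stream, preset=6))  # stage 2: ~7-Zip default level
        return str(out)

    Because everything goes through one solid stream, restoring a single file means decompressing the whole archive, which matches the complaint later in the thread.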

    Ideally you would want to back up to or restore from a separate disk, but I did not see a way to do that. It just creates a .dzo file with the same name as the folder, in the folder above it. It also creates a temporary file in the same folder, which could be a problem if the folder takes up more than 1/3 of the disk space (or more, depending on compression).

  8. #8
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    It was a long wait for your patents; I hope you can release a public trial soon.

    1. LOSSLESS COMPRESSION

    Inventor: KORATAGERE VEERESH RUDRAPA [IN]
    Applicant:
    EC: H03M7/30V H03M7/30Z (+1)
    IPC: H03M7/34
    Publication info: US2011181448 (A1) 2011-07-28
    Priority date: 2008-10-15

    2. LOSSLESS CONTENT ENCODING

    Inventor: KORATAGERE VEERESH RUDRAPPA [IN]
    Applicant:
    EC: H03M7/30V H03M7/40 (+1)
    IPC: H03M7/30
    Publication info: US2010321218 (A1) 2010-12-23
    Priority date: 2008-10-15


    3. CONTENT ENCODING

    Inventor: KORATAGERE VEERESH RUDRAPPA [IN]
    Applicant:
    EC:
    IPC: H03M7/30
    Publication info: US2010321217 (A1) 2010-12-23
    Priority date: 2008-10-15
    Attached Files
    Last edited by Sportman; 14th October 2011 at 19:09. Reason: Added PDF files

  9. #9
    Member kampaster's Avatar
    Join Date
    Apr 2010
    Location
    ->
    Posts
    55
    Thanks
    4
    Thanked 6 Times in 6 Posts
    veer: Dzo deletes repeating data units... in the same way as rep/srep/rzip?
    Last edited by kampaster; 15th October 2011 at 04:40.

  10. #10
    Member
    Join Date
    Sep 2011
    Location
    Bangalore
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks Matt, your feedback helped us a lot. We are working on removing the intermediate stage, and then on adding some backup features.

  11. #11
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Quote Originally Posted by Sportman View Post
    It was a long wait for your patents; I hope you can release a public trial soon.

    1. LOSSLESS COMPRESSION

    Inventor: KORATAGERE VEERESH RUDRAPA [IN]
    Applicant:
    EC: H03M7/30V H03M7/30Z (+1)
    IPC: H03M7/34
    Publication info: US2011181448 (A1) 2011-07-28
    Priority date: 2008-10-15

    2. LOSSLESS CONTENT ENCODING

    Inventor: KORATAGERE VEERESH RUDRAPPA [IN]
    Applicant:
    EC: H03M7/30V H03M7/40 (+1)
    IPC: H03M7/30
    Publication info: US2010321218 (A1) 2010-12-23
    Priority date: 2008-10-15


    3. CONTENT ENCODING

    Inventor: KORATAGERE VEERESH RUDRAPPA [IN]
    Applicant:
    EC:
    IPC: H03M7/30
    Publication info: US2010321217 (A1) 2010-12-23
    Priority date: 2008-10-15
    This looks like 3 versions of the same patent application. It has nothing to do with DZO. Rather, it describes a "universal" compression algorithm using binomial codes to encode the sequences of occurrences of each symbol in the alphabet. It does not compress because the code lengths add up to the same length as the original data.

    It looks like quite a bit of effort for something useless.
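
    (A quick way to see the counting argument; my own illustration, not from the patent text. Enumerating which positions each byte value occupies costs about log2 of a multinomial coefficient, and for random data that is already nearly 8 bits per byte, before the 256 symbol counts are even transmitted.)

    Code:
    import math
    import os

    def enumerative_bits(data: bytes) -> float:
        """log2( n! / (k_0! * k_1! * ... * k_255!) ): the bits needed just
        to say how the symbols are arranged, given their counts."""
        counts = [0] * 256
        for b in data:
            counts[b] += 1
        nats = math.lgamma(len(data) + 1)     # ln(n!)
        for k in counts:
            nats -= math.lgamma(k + 1)        # minus ln(k_i!)
        return nats / math.log(2)             # convert nats to bits

    data = os.urandom(10_000)        # 10,000 random bytes = 80,000 bits
    print(enumerative_bits(data))    # close to 80,000: essentially no saving

    On skewed data the multinomial term shrinks, but then the counts must be sent as well; over all possible inputs the totals can never come out shorter than the data, which is the point above.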

  12. #12
    Member
    Join Date
    Sep 2011
    Location
    Bangalore
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts
    These applications have nothing to do with Dzo.

  13. #13
    Member
    Join Date
    Oct 2011
    Location
    aus
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts


    Downloaded Dzo and tested it on my backup files (209 MB). It worked very well. Dzo (16 MB, 20 secs) beat WinRAR maximum (110 MB, 197 secs) and 7-Zip ultra (62 MB, 131 secs) hands down, both in compression ratio and time taken. I liked the concept, and at least I see LZMA working. But the main problem with Dzo: I have to extract all the data even when I need only a few files. I expect some backup support too.


    Are these patents linked to Dzo in some way? I could not find much. Are they meant to protect some hidden concept?

  14. #14
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    I tested Dzo. It gave a start error right away when I selected a 50 GB+ test file, the same one I used for this benchmark: http://encode.ru/threads/1354-Data-deduplication.
    After unpacking this test file and then running Dzo on the unpacked directory I get:

    52,592.75 MB (input)
    29,208.52 MB (55%) after the first stage, 7,684 seconds
    8,203.92 MB (15%) after the second stage, 11,606 seconds = 8,602,429,279 bytes, 5 hours 21 min 30 seconds in total

    Compression good, speed not. I saw around 20% CPU usage on the 8-core test machine; peak memory use was around 1.5 GB.

  15. #15
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Can somebody test its output with http://nishi.dreamhosters.com/u/lzmarec_v4b_bin.rar ?

  16. #16
    Member
    Join Date
    Oct 2011
    Location
    aus
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Can it be just a one-step process? I think LZMA is working with multiple threads, but the first step is not.

  17. #17
    Member chornobyl's Avatar
    Join Date
    May 2008
    Location
    ua/kiev
    Posts
    153
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Code:
    C:\test>rec c 54.tar.dzo 54.tar.dzo.rec
    lzmaopt=5D (lc3 lp0 pb2) start=0000002F end=00E13946 size=00E13917 f_flush=5
    
    56,897,536 54.tar
    55,253,620 54.tar.dp
    14,761,606 54.tar.dp.n16.7z
    14,760,262 54.tar.dzo
    14,351,320 54.tar.dzo.rec
