
Thread: The best way to compress 2 mostly identical VMs?

  1. #1
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post

    Question: The best way to compress 2 mostly identical VMs?

    Hi all.

    I need to compress 2 VMs to upload them to a remote location.
    The VMs are Windows Server 2008 R2 with the latest updates: the 1st is Standard edition, the 2nd Enterprise. The files are from VMware Workstation.
    The host that compresses the files will be a 32-core 128GB RAM Xeon. Decompression will be done on a 24-core 56GB RAM Xeon.
    I made the archive this way: zpaq -m0 (to dedup) | srep -m3 | (nanozip -D) or (tor -16). Compression was done on an old Pentium E2200 overnight.
    LzTurbo -39 and -49 and tor -7 gave bigger files, so I skipped them. tor -16 took 11.6 times as long as nanozip to make a 3.7% smaller archive.
    Anyway, I already had the tor -16 file, it was smaller, and decompression (I thought) would be faster, so I used the archive from tornado.
    From 34210MB of VMs zpaq made a 20930MB file, then srep squeezed it to 13325MB and tornado to 3735MB.
    By the way: export from VMware Workstation to compressed OVF produced 13570MB of files (3.5 times bigger than zpaq+nanozip) and took slightly more time.
    WinRAR made a ~9GB archive.
    So, here is why this thread was started.
    On the receiver side (32-core 128GB RAM Xeon) decompression of tor and srep took a reasonable amount of time. But 64-bit zpaq took >60 min to decompress the files. On a 32-core machine.
    I also tried zpaq alone with the -m 4 option; it took 40 min, all 32 cores and 14GB RAM to create a 4647MB file.
    And the question is: is there a way to compress these VMs to a <=4GB file with pack/unpack speed >=30 MB/s and without using temporary files?
    I wanted to use shar | srep | any fast archiver capable of using 16+ cores, but shar | srep gives an error "input file is larger than file specified" (shar alone works OK). Tornado, nanozip, LzTurbo, 4x4, fazip: all of them either don't support stdin/stdout for data on both ends or don't use all cores on pack/unpack.

    Any help is appreciated, thanks.

  2. #2
    Expert · Matt Mahoney · Melbourne, Florida, USA · Joined May 2008 · Posts: 3,255 · Thanks: 306 · Thanked 778 Times in 485 Posts
    What about just zpaq -m 1? For -m 3 and higher, decompression speed is similar to compression (slower). -m 2 decompression should be as fast as -m 1 but compression is slower. -m 1 is sometimes faster than -m 0 because the compression is faster than disk I/O.

    You can also try smaller fragment sizes to improve dedup like -fragment 2 instead of the default -fragment 6 (4 KB average size instead of 64 KB). But you have to use the same -fragment option for all updates for dedup to work. Smaller fragments may increase the archive size if there are not a lot of small fragments to dedup, and also increase memory usage. The default needs about 0.1% of the archive size in memory for the fragment tables.

    I found that speed doesn't scale to a very large number of cores like 24 or 32 for the faster methods. It might be faster with -threads 8 or even -threads 4. You can also try larger blocks like -method 16 or 28. The default for 1 is 14 (16 MB). For 2 and higher it is 26 (64 MB).
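
    Putting those suggestions together, a hedged example of how the options above could be combined (the archive and directory names are placeholders, not from the thread):

    zpaq add vms.zpaq VM1-dir VM2-dir -method 1 -fragment 2 -threads 8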


  4. #3
    Programmer · Bulat Ziganshin · Uzbekistan · Joined Mar 2007 · Posts: 4,497 · Thanks: 733 · Thanked 659 Times in 354 Posts
    > not using temporary files
    I think you can't do it with srep. BTW, "fazip 4x4:t32:tor" should use all cores and compress stdin to stdout.

  5. #4
    Member · Kraków, Poland · Joined Nov 2013 · Posts: 645 · Thanks: 205 · Thanked 196 Times in 119 Posts
    Quote Originally Posted by nimdamsk View Post
    The best way to compress 2 mostly identical VMs
    To compress two "mostly identical" sources, in theory you need
    H(X,Y) = H(X) + H(Y) - I(X;Y)
    where H is Shannon entropy (the sizes of the separate archives) and I is mutual information: the capacity you could save thanks to their similarity.
    So you could encode one of them, then encode the "difference between them", which in the simplest case (no synchronization errors) is just the XOR of the two sources (mostly zeros - can be well compressed).
    However, there probably are synchronization errors in your case, so you need a smarter way to find and represent the difference between these two files: something like "delete block from a to b, insert block '...' in position c" etc.

    In the above case the encoder would need to have both files.
    It turns out that theoretically we could get the same capacity even if the encoders of the two files cannot communicate (no way to find "the difference") - this is the so-called distributed source coding problem, like Slepian-Wolf or Wyner-Ziv.
    However, it becomes more complicated and costly ... and adding synchronization errors would make it a nightmare ...
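
    A hedged illustration of the "encode one, then encode the difference" idea, using xdelta3 (which nimdamsk mentions later in the thread); the file names are placeholders, and very large images may need a bigger source window (-B):

    xdelta3 -e -s vm_standard.vmdk vm_enterprise.vmdk enterprise.xd3    (keep vm_standard plus the small delta)
    xdelta3 -d -s vm_standard.vmdk enterprise.xd3 vm_enterprise.vmdk    (rebuild the second image on the receiver)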

  6. #5
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post
    I'll try fazip and zpaq again, as suggested. And maybe Bulat's fb too.
    Theoretically the best way to compress these VMs would be to parse the internal NTFS and make links to repeated files, or to make a backup from inside the powered-on VMs to dedup storage. Right now their virtual disk images are not clean: there is unused but not zeroed data, fragmented data, maybe compressed/sparse files, maybe a pagefile. Sometimes there is a hibernation file too. All these points could be handled if a specialized backup utility were made. For now, the rolling dedup from zpaq can eliminate some of the problems, but not all. And it takes time.

  7. #6
    Programmer · Bulat Ziganshin · Uzbekistan · Joined Mar 2007 · Posts: 4,497 · Thanks: 733 · Thanked 659 Times in 354 Posts
    fazip by itself doesn't provide deduplication, and fb ATM doesn't decompress at all, so your only choice is zpaq/exdupe. And you can use the VMware command to clean the VM files prior to compression.
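
    If "the VMware command" means the bundled disk tool (an assumption on my part), vmware-vdiskmanager can defragment and then shrink a virtual disk after free space has been zeroed inside the guest; the file name is a placeholder:

    vmware-vdiskmanager -d "Win2008R2-Std.vmdk"    (defragment the virtual disk)
    vmware-vdiskmanager -k "Win2008R2-Std.vmdk"    (shrink it, reclaiming zeroed blocks)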

  8. #7
    Member · San Diego · Joined Feb 2013 · Posts: 1,057 · Thanks: 54 · Thanked 71 Times in 55 Posts
    Quote Originally Posted by Jarek View Post
    To compress two "mostly identical" sources, in theory you need
    H(X,Y) = H(X) + H(Y) - I(X;Y)
    where H is Shannon entropy (the sizes of the separate archives) and I is mutual information: the capacity you could save thanks to their similarity.
    So you could encode one of them, then encode the "difference between them", which in the simplest case (no synchronization errors) is just the XOR of the two sources (mostly zeros - can be well compressed).
    However, there probably are synchronization errors in your case, so you need a smarter way to find and represent the difference between these two files: something like "delete block from a to b, insert block '...' in position c" etc.

    In the above case the encoder would need to have both files.
    It turns out that theoretically we could get the same capacity even if the encoders of the two files cannot communicate (no way to find "the difference") - this is the so-called distributed source coding problem, like Slepian-Wolf or Wyner-Ziv.
    However, it becomes more complicated and costly ... and adding synchronization errors would make it a nightmare ...
    bsdiff+bspatch is what I would use for the diff+patch.
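
    For reference, a minimal sketch of that route (file names are placeholders; note that bsdiff is memory-hungry, roughly max(17n, 9n+m) bytes for old size n and new size m, so multi-GB images are demanding):

    bsdiff vm_standard.vmdk vm_enterprise.vmdk enterprise.patch             (create the patch on the sender)
    bspatch vm_standard.vmdk vm_enterprise_rebuilt.vmdk enterprise.patch    (apply it on the receiver)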

  9. #8
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post
    I don't really need a diff now. When needed (e.g. the receiver has Windows SP1 and I need to upload Windows SP2), I have successfully used xdelta3.

  10. #9
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post
    Hmm... Somehow the zpaq -m 0 archive was unpacked in 14 min on the 24-core 56GB RAM Xeon, instead of >60 min on the 32-core 128GB RAM Xeon.
    Maybe NUMA issues, maybe LUN throughput, maybe I/O latency. I'll try to find the problem if I have some time.
    But the current scheme (zpaq -m 0 | srep -m3 | tornado or nanozip or fazip) seems acceptable.

    Thanks again to all for the excellent software.

    BTW, is it possible to build a self-made EMC Data Domain using open source?

  11. #10
    Member · San Diego · Joined Feb 2013 · Posts: 1,057 · Thanks: 54 · Thanked 71 Times in 55 Posts
    Quote Originally Posted by Jarek View Post
    To compress two "mostly identical" sources, in theory you need
    H(X,Y) = H(X) + H(Y) - I(X;Y)
    where H is Shannon entropy (the sizes of the separate archives) and I is mutual information: the capacity you could save thanks to their similarity.
    So you could encode one of them, then encode the "difference between them", which in the simplest case (no synchronization errors) is just the XOR of the two sources (mostly zeros - can be well compressed).
    However, there probably are synchronization errors in your case, so you need a smarter way to find and represent the difference between these two files: something like "delete block from a to b, insert block '...' in position c" etc.

    In the above case the encoder would need to have both files.
    It turns out that theoretically we could get the same capacity even if the encoders of the two files cannot communicate (no way to find "the difference") - this is the so-called distributed source coding problem, like Slepian-Wolf or Wyner-Ziv.
    However, it becomes more complicated and costly ... and adding synchronization errors would make it a nightmare ...
    Giving this problem another look --
    I don't think Slepian-Wolf is relevant. The OP apparently has both files on the machine used for compression, so he could just concatenate them and use any compressor. Some kind of LZ with unbounded match distances seems appropriate. From a theory standpoint, concatenating doesn't sacrifice the possibility of achieving optimal compression.

    "However, there probably are synchronization errors in your case, so you need a smarter way for find and represent the difference between these two files: something like "delete block from a to b, insert block '...' in position c" etc."

    If you assume the files are concatenated, long-range LZ provides a solution to that problem. Rather than a sequence of edits (insert, delete, substitute), it's a sequence of prior matches to copy from, which ultimately isn't a big difference.
    Last edited by nburns; 9th October 2014 at 14:31.
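
    A hedged sketch of that concatenation idea with tools already in this thread (srep as the long-range matcher; file names are placeholders, and this is essentially what Bulat suggests below with tar):

    copy /b vm_standard.vmdk+vm_enterprise.vmdk both.img    (concatenate on Windows)
    srep -m3 both.img both.srep                             (long-range matches now reach across both images)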

  12. #11
    Member · San Diego · Joined Feb 2013 · Posts: 1,057 · Thanks: 54 · Thanked 71 Times in 55 Posts
    Quote Originally Posted by nimdamsk View Post
    I'll try fazip and zpaq again, as suggested. And maybe Bulat's fb too.
    Theoretically the best way to compress these VMs would be to parse the internal NTFS and make links to repeated files, or to make a backup from inside the powered-on VMs to dedup storage. Right now their virtual disk images are not clean: there is unused but not zeroed data, fragmented data, maybe compressed/sparse files, maybe a pagefile. Sometimes there is a hibernation file too. All these points could be handled if a specialized backup utility were made. For now, the rolling dedup from zpaq can eliminate some of the problems, but not all. And it takes time.
    Cleaning the images is probably the lowest-hanging fruit for making smaller files. Making repeated files into links inside NTFS would not necessarily help much -- you'd just be doing the same thing as the compressor, but manually and using NTFS. However, zeroing out the unallocated parts of the disk is absolutely a good idea, since it's 100% waste, yet it can take up large amounts of space in the compressed images. Defragmenting the filesystem is a good idea, because it's likely to help and it's automatic -- the machine does all the work. Pagefiles/swap space should be deleted and zeroed out.
    The way compression works, at a high level, is by finding segments that are repeated in multiple places and therefore take up x*n bytes, and compressing them down to something like x+n bytes, by keeping one copy and replacing the others with references. That means you shouldn't worry too much about extraneous extra copies of important data; the biggest savings come from removing unimportant data completely -- especially audio, video, and already-compressed data, which can't be compressed further.
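
    A hedged sketch of that cleanup, run inside each guest before compressing (standard Windows tools plus Sysinternals SDelete; the drive letter is illustrative):

    powercfg /hibernate off    (drop hiberfil.sys)
    defrag C: /U /V            (defragment the system volume)
    sdelete -z C:              (zero free space so it compresses to almost nothing)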

    Quote Originally Posted by nimdamsk View Post
    I don't really need diff now. When needed (i.e. receiver has Windows SP1, need to upload Windows SP2) i successfully used xdelta3.
    When you have two files that are almost identical, you can replace one of them with a diff. Concatenating them and using a long-range compressor should have a similar effect, but bsdiff is highly optimized and might do better.

    Quote Originally Posted by nimdamsk View Post
    Hmm... Somehow the zpaq -m 0 archive was unpacked in 14 min on the 24-core 56GB RAM Xeon, instead of >60 min on the 32-core 128GB RAM Xeon.
    Maybe NUMA issues, maybe LUN throughput, maybe I/O latency. I'll try to find the problem if I have some time.
    But the current scheme (zpaq -m 0 | srep -m3 | tornado or nanozip or fazip) seems acceptable.
    I'd start by looking at the cpu architectures. It's hard to tell from the info you included. All I see is that both machines have more than enough cores and that memory isn't the problem.

  13. #12
    Programmer · Bulat Ziganshin · Uzbekistan · Joined Mar 2007 · Posts: 4,497 · Thanks: 733 · Thanked 659 Times in 354 Posts
    > zpaq -m 0 | srep -m3 |

    Why zpaq? If you allow tempfiles, just tar the data together and compress it with srep, i.e. "tar ... | srep -m3 | ..."
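
    A hedged, concrete form of that pipeline; the directory names and the final stage are placeholders, srep is assumed to accept "-" for stdin (it already read stdin in the shar | srep test above), and the fazip argument order is an assumption:

    tar cf - VM1-dir VM2-dir | srep -m3 - vms.srep    (dedup long-range matches; tempfiles allowed)
    fazip 4x4:t32:tor vms.srep vms.packed             (multi-threaded final compression stage)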

  14. #13
    Member · Sweden · Joined Jun 2013 · Posts: 150 · Thanks: 9 · Thanked 25 Times in 23 Posts
    You can try EXDUPE (www.exdupe.com).

    Use it with -stdout to apply compression from another program if you like.

    exdupe -g16 -x0 -t1 *.WMs -stdout | rar a -m5 -md1g -ma5 -si FILES.RAR
    rar p -inul FILES.RAR | exdupe -R -stdin EXTRACTED\

  15. #14
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post
    Quote Originally Posted by Bulat Ziganshin View Post
    > zpaq -m 0 | srep -m3 |

    Why zpaq? If you allow tempfiles, just tar the data together and compress it with srep, i.e. "tar ... | srep -m3 | ..."
    You are right. srep made a 2% smaller file and seemingly took less time, but I'm not sure, I had no time to look carefully.
    Maybe I'll have some more time for tests. Unfortunately those systems are not ours and have some production software running.
    But it would be interesting to test all the suggestions here.

  16. #15
    Expert · Matt Mahoney · Melbourne, Florida, USA · Joined May 2008 · Posts: 3,255 · Thanks: 306 · Thanked 778 Times in 485 Posts
    zpaq computes SHA1 hashes for dedup and data integrity and checks them on extraction. Otherwise with -m 0 it is just copying data. You can try -fragile to turn off the integrity checks (but not dedup).
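
    A hedged example of combining that with the store-only mode used earlier in the thread, assuming -fragile is given at archive-creation time (names are placeholders):

    zpaq add vms.zpaq VM1-dir VM2-dir -method 0 -fragile -threads 8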

  17. #16
    Member · Moscow · Joined Jan 2007 · Posts: 239 · Thanks: 0 · Thanked 3 Times in 1 Post
    Quote Originally Posted by Matt Mahoney View Post
    zpaq computes SHA1 hashes for dedup and data integrity and checks them on extraction. Otherwise with -m 0 it is just copying data. You can try -fragile to turn off the integrity checks (but not dedup).
    Thank you Matt, I'll try. BTW I made a mistake: the 1st machine (32-core 128GB) is not a Xeon but an AMD Opteron. This may explain why it is slower than the 24-core Xeon in the 2nd machine. I'll rerun the tests on the 24-core Xeon in a few days.

  18. #17
    Member · San Diego · Joined Feb 2013 · Posts: 1,057 · Thanks: 54 · Thanked 71 Times in 55 Posts
    Quote Originally Posted by nimdamsk View Post
    Thank you Matt, I'll try. BTW I made a mistake: the 1st machine (32-core 128GB) is not a Xeon but an AMD Opteron. This may explain why it is slower than the 24-core Xeon in the 2nd machine. I'll rerun the tests on the 24-core Xeon in a few days.
    Unfortunately, AMD got left behind in terms of technology. That's the likely culprit, especially if the AMD CPU is older.

  19. #18
    Member · Italy · Joined Dec 2013 · Posts: 342 · Thanks: 12 · Thanked 34 Times in 28 Posts
    If you are very, very brave, you can take a rather naive approach.

    1) use snapshot.exe to take images of the VMs (all partitions)
    2) keep only the first one (the "boot" partition) as an image
    3) extract the files from 1) file-by-file into two directories, then deduplicate & pack

    To restore, first remake the partition table & boot (from 2), then the system disk (empty).
    Then extract the files file-by-file (from 3).

    Not so straightforward, but I think this should be the most efficient (in space terms), if you don't care about ACLs.

    Only a Gedankenexperiment...

    Or you can mount an iSCSI ZFS volume (from a Solaris box, for example) as the VM disk, then try ZFS's send.
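
    A hedged sketch of the ZFS route (pool/dataset names are placeholders): keep the VM files on a ZFS dataset, snapshot it, and ship the serialized snapshot through any compressor; later transfers can be incremental:

    zfs snapshot tank/vms@base                                  (snapshot the dataset holding both images)
    zfs send tank/vms@base | gzip > vms-base.zfs.gz             (serialize the snapshot; pipe to any compressor)
    zfs send -i @base tank/vms@next | gzip > vms-incr.zfs.gz    (later, send only the changes)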
