Thread: Remote diff demo

  1. #1
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts

    Remote diff demo

    I remembered my old remote diff/patch kit based on anchor hashing
    (source: http://nishi.dreamhosters.com/u/fma-diff_v0ss.rar)
    and decided to make a new test for it.
    "Remote diff" differs from a normal diff in that it syncs files on different machines
    without transmitting a whole file in either direction.
    An example looks roughly like this:
    Code:
    We have machines A and B, there's B\7z.dll with size 1,388,032
    and A\7z.dll with size 1,376,768. We need to sync the (newer) file from A to B,
    but we don't know exactly which version B currently has.
    // B: generate a fingerprint
    fma-hash.exe B\7z.dll b.hash // b.hash = 42,249 bytes
    // pass b.hash from B to A (it's also compressible to ~34k or so)
    // A: generate a patch file
    fma-diff.exe A\7z.dll b.hash b.patch // b.patch = 1,060,947
    // pass b.patch from A to B
    // B: apply the patch
    fma-patch.exe B\7z.dll b.patch 7z.updated.dll
    And here's the demo itself - http://nishi.dreamhosters.com/u/remote-diff-demo_v0.rar
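For readers who haven't seen the idea before, here's a minimal Python sketch of the hash / diff / patch round trip described above. It only illustrates the general anchor-hashing technique (content-defined block boundaries plus per-block digests). It is not the fma-* code or file format: the constants, the CRC32 window hash, the BLAKE2 digests and all function names are assumptions made up for illustration.

```python
import hashlib
import zlib

WINDOW = 16        # bytes hashed at each position (assumed value)
MASK = 0x3F        # a position is an anchor when (hash & MASK) == 0 (assumed)
MAX_BLOCK = 65535  # force a cut if no anchor shows up (assumed)

def split_blocks(data):
    """Cut data at anchors, so boundaries depend on content, not on offsets."""
    blocks, start = [], 0
    for i in range(len(data)):
        length = i + 1 - start
        # A real tool would use a rolling hash; recomputing CRC32 over the
        # window is slower but keeps the sketch short.
        at_anchor = (length >= WINDOW and
                     (zlib.crc32(data[i + 1 - WINDOW:i + 1]) & MASK) == 0)
        if at_anchor or length == MAX_BLOCK:
            blocks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        blocks.append(data[start:])
    return blocks

def digest(block):
    return hashlib.blake2b(block, digest_size=8).digest()

def make_hash(old):
    """Runs on B: the fingerprint is (length, digest) for every block."""
    return [(len(b), digest(b)) for b in split_blocks(old)]

def make_patch(new, fingerprint):
    """Runs on A: reference blocks B already has, send the rest literally."""
    known, pos = {}, 0
    for length, d in fingerprint:
        known.setdefault(d, (pos, length))
        pos += length
    patch = []
    for block in split_blocks(new):
        hit = known.get(digest(block))
        if hit is not None and hit[1] == len(block):
            patch.append(("copy", hit[0], hit[1]))
        else:
            patch.append(("data", block))
    return patch

def apply_patch(old, patch):
    """Runs on B: rebuild A's file from B's own bytes plus the literals."""
    out = bytearray()
    for op in patch:
        if op[0] == "copy":
            out += old[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Calling `apply_patch(old, make_patch(new, make_hash(old)))` should return `new`. Because the cut points depend on content, an insertion in the middle of the file only disturbs the blocks around it, and everything after it lines up with B's blocks again.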
    And the results (only b.hash is transmitted from B to A, then only b.patch from A to B)
    Code:
    1,609,216 A\7z.dll // 04-10-2016 http://www.7-zip.org/a/7z1604-x64.exe
    1,609,216 B\7z.dll // 28-09-2016 http://www.7-zip.org/a/7z1603-x64.exe
    
    b.hash b.patch
    50,439 1,424,862 // 1-test-patch.bat 
    50,539 1,405,189 // 2-test-delta-patch.bat
    51,509 1,150,133 // 5-test-courgette-patch.bat
    51,599 1,134,370 // 6-test-courgette-delta-patch.bat
    53,949   808,362 // 3-test-x64flt-patch.bat
    53,909   801,765 // 4-test-x64flt-delta-patch.bat
    Somehow, courgette -dis didn't help much, while x64flt3 (my new x64flt|bcj2 filter) actually worked.
    Unfortunately dispack doesn't have x64 support, so it wasn't tested.

  3. #2
    Member RamiroCruzo
    Join Date
    Jul 2015
    Location
    India
    Posts
    15
    Thanks
    137
    Thanked 10 Times in 7 Posts
    Tested on an 82.3 MB file (UI.sb): it took 1053 ms, which is faster than xdelta.

    Code:
    B: generate a fingerprint
    sumblklen=86345712
    b.hash = 2714829
    pass b.hash from B to A (its also compressible)
    A: generate a patch file
    271482 anchors loaded. hashed file length = 86345712
    Bulding the anchor index. Done.
    b.patch = 1446
    pass b.patch from A to B
    B: apply the patch
    7777a7d91ef4b7c6552d75b40213e337  A/UI.sb
    7777a7d91ef4b7c6552d75b40213e337  B/UI.sb.updated
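The numbers in that log allow a quick back-of-the-envelope check of what actually crossed the wire. The sketch below only does arithmetic on the logged values. Reading the result as "10 bytes per anchor plus a 9-byte header" is a guess, since the log doesn't describe the b.hash layout.

```python
file_len = 86_345_712    # "hashed file length" from the log
hash_len = 2_714_829     # b.hash size
anchors = 271_482        # "anchors loaded"
patch_len = 1_446        # b.patch size

# b.hash bytes per anchor, and what is left over (possibly a header).
per_anchor, leftover = divmod(hash_len, anchors)   # -> (10, 9)

# Average distance between anchors in the hashed file.
avg_block = file_len / anchors

# Total traffic in both directions, as a fraction of the file size.
traffic = hash_len + patch_len
ratio = traffic / file_len

print(per_anchor, leftover)
print(f"average block: {avg_block:.1f} bytes")
print(f"traffic: {traffic:,} bytes = {ratio:.2%} of the file")
```

So in this run the fingerprint dominates the traffic: the patch itself is tiny, and the total comes to roughly 3% of the file, with an anchor about every 318 bytes on average.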

  4. #3
    Administrator Shelwien
    It actually has a weird config for some reason (anchor.inc):
    Setting winsize=32, maxblklen=65535 (and maybe a different anchor mask) could probably improve the results...

    Okay, here's a reuploaded archive with a new config: http://nishi.dreamhosters.com/u/remote-diff-demo_v0.rar
    Code:
      config( void ) {
        winsize = 16;
        minblklen = winsize;
        maxblklen = 65535;
        anchormask = 0x3F;
        checkbits = 16;
      }
    Results:
    Code:
    1,609,216 A\7z.dll // 04-10-2016 http://www.7-zip.org/a/7z1604-x64.exe
    1,609,216 B\7z.dll // 28-09-2016 http://www.7-zip.org/a/7z1603-x64.exe
    
    b.hash    b.patch    total
    204,319 1,151,069 1,355,388 // 1-test-patch.bat 
    201,169 1,087,380 1,288,549 // 2-test-delta-patch.bat 
    196,309   732,497   928,806 // 3-test-courgette-patch.bat 
    193,559   669,461   863,020 // 4-test-courgette-delta-patch.bat 
    212,649   556,982   769,631 // 5-test-x64flt-patch.bat 
    209,579   497,850   707,429 // 6-test-x64flt-delta-patch.bat 
    209,589   498,034   707,623 // 7-test-x64flt-delta-delta-patch.bat 
    199,829   498,237   698,066 // 8-test-x64flt-delta-rep1-patch.bat
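One way to read the jump in b.hash size (about 50k in the first post, about 200k here) is through the average block length the config implies. The sketch below rests on two assumptions that the thread doesn't confirm: that a position counts as an anchor when `(hash & anchormask) == 0`, and that b.hash stores about 10 bytes per anchor (which is what the UI.sb log in post #2 works out to).

```python
# Config values from the post above.
winsize = 16
minblklen = winsize
anchormask = 0x3F

# Assumption: an anchor fires when (hash & anchormask) == 0, i.e. at about
# one position in 64 once a block is at least minblklen bytes long.
# maxblklen is ignored here, since 65535 is far above the expected length.
expected_block = minblklen + (anchormask + 1)

# Assumption: about 10 bytes of b.hash per anchor, from the log in post #2
# (2,714,829 bytes for 271,482 anchors).
bytes_per_anchor = 2_714_829 // 271_482

file_len = 1_609_216                   # 7z.dll
old_hash, new_hash = 50_439, 204_319   # 1-test-patch.bat, old and new config

old_block = file_len / (old_hash / bytes_per_anchor)
new_block = file_len / (new_hash / bytes_per_anchor)

print(f"expected with the new config: about {expected_block} bytes per block")
print(f"implied by b.hash: old {old_block:.0f}, new {new_block:.0f} bytes per block")
```

Under those assumptions the old config averaged roughly 319 bytes per block and the new one roughly 79, close to the 80 the mask would predict. Finer blocks mean a bigger fingerprint, but they let more of the file be matched, which fits the smaller b.patch sizes in the table.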

  6. #4
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    How would it compare with large window brotli using the previous version as a custom dictionary? In one experiment we got density results similar or slightly better than bsdiff+brotli, with far superior decoding speed.
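To illustrate what "previous version as a custom dictionary" means, here is a small sketch that uses zlib's preset dictionary (`zdict`) from the Python standard library as a stand-in. This is deflate, not brotli: deflate can only reach back 32 KB, which is exactly why a large window matters for real binaries. The function names are made up for the example.

```python
import zlib

def compress_with_dict(new: bytes, old: bytes) -> bytes:
    """Compress `new`, letting the compressor reference bytes of `old`."""
    c = zlib.compressobj(level=9, zdict=old)
    return c.compress(new) + c.flush()

def decompress_with_dict(blob: bytes, old: bytes) -> bytes:
    """The receiver must supply the same `old` bytes as the dictionary."""
    d = zlib.decompressobj(zdict=old)
    return d.decompress(blob) + d.flush()
```

Usage is symmetric: the sender calls `compress_with_dict(new_file, old_file)` and the receiver calls `decompress_with_dict(blob, old_file)`. Note that this needs both versions on the sending machine, so it is a local delta like bsdiff rather than a remote diff.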

  7. #5
    Administrator Shelwien
    Well, here I tested it for you:
    Code:
    http://nishi.dreamhosters.com/u/bsdiff_sh2.rar
    
      lzma zstd22  bro11  bro11d   bro9d   bro1d 
    60,628 68,509 63,557 122,992 132,991 813,005 // 7z.dll                 
    43,940 48,110 45,627  99,030 108,318 761,958 // 7z.dll.bcj2            
    43,976 48,174 45,710 100,572 108,354 775,161 // 7z.dll.dis             
    67,393 72,033 67,308  64,417  69,417 712,128 // 7z.dll.bcj2.delta      
    61,977 68,934 61,628  64,535  69,307 712,198 // 7z.dll.bcj2.delta.delta
           
    lzma: bsdiff_sh2 + lzma.exe e -a1 -d22 -fb273 -mc999 -lc8 -lp0 -pb2 -mfbt4 -mt2 
    zstd22: bsdiff_sh2 + zstd level 22, dictionary 22
    bro11: bsdiff_sh2 + bro --quality 11 --window 22
    bro11d: bro --quality 11 --window 22 --custom-dictionary B\%name%
    But actually this is about _remote_ diffs, i.e. when you don't have both files on one machine and want to minimize network traffic.
    Conclusions:
    1. bsdiff does a better delta than this delta filter
    2. no point in using anything other than lzma, because it has the best compression and reasonable decoding speed (especially if you're decoding during download).
    3. brotli q11 encodes slowly, and q9 compresses worse than zstd
    4. courgette helps less than x64flt3
    5. bro11d is improved by an external delta
    6. the difference between .delta and .delta.delta is weird - maybe some alignment shift?
    7. I forgot to mention that bsdiff_sh2 actually has a built-in custom E8 filter (with special tweaks for bsdiff);
    I tested without it: only the plain 7z.dll results got worse, while the best results stayed exactly the same, so there's no point in updating the table.
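For context on point 7, this is roughly what a plain "E8" filter does: it rewrites the relative operand of x86 CALL instructions (opcode 0xE8) into an absolute target, so repeated calls to the same function become identical byte strings. The sketch is the textbook variant only. It is not the tweaked filter in bsdiff_sh2, nor bcj2 or x64flt3, and the function name is made up.

```python
import struct

def e8_transform(data: bytes, encode: bool = True) -> bytes:
    """Convert CALL rel32 operands to absolute targets (or back again)."""
    buf = bytearray(data)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:
            (value,) = struct.unpack_from("<I", buf, i + 1)
            next_insn = i + 5  # rel32 is relative to the next instruction
            if encode:
                value = (value + next_insn) & 0xFFFFFFFF
            else:
                value = (value - next_insn) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, value)
            i += 5  # skip the operand so the decoder visits the same positions
        else:
            i += 1
    return bytes(buf)
```

The transform is applied before diffing or compressing and undone with `encode=False` afterwards. It treats every 0xE8 byte as a CALL, which is wrong for data bytes but still reversible, because the opcode bytes themselves are never changed.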
