Thread: CHK v1.10 is here!

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts

    CHK v1.10 is here!

    CHK v1.10 has been released! Please enjoy the new release!



    http://encode.narod.ru/


  2. #2
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Timing tests for enwik9 (1 GB) from cache on a 2.0 GHz T3200, 2 cores, 3 GB, Win32. Nice improvement from v1.03.

    crc16 6.5 sec
    crc32 6.5 sec
    crc64 10.5 sec
    md4 8.4-8.5 sec
    md5 13.8-13.9 sec
    sha1 15.3 sec
    sha256 21.4 sec
    sha512 48.9 sec
    sha3 93.0 sec

    Times are as reported by the program rounded to 0.1 seconds. I ran each test twice and reported both times if different. To make sure that enwik9 was all in disk cache I watched the disk light (off) and CPU usage in task manager (50%). I ended up having to run a program that allocates all memory until it runs out and exits in order to force other programs to page out enough memory to keep all of enwik9 in cache.
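
    To compare these timings across machines, it can help to convert them to throughput. The following is a minimal sketch, assuming enwik9 is exactly 10^9 bytes; the helper names `mbPerSec` and `printThroughput` are made up for this example and are not part of CHK.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of CHK): convert the time taken to hash
// `bytes` bytes into throughput in MB/s, with 1 MB = 10^6 bytes.
double mbPerSec(double bytes, double seconds) {
  assert(seconds > 0);
  return bytes / seconds / 1e6;
}

// Print one row of a throughput table for an enwik9 timing,
// e.g. printThroughput("crc32", 6.5).
void printThroughput(const char* name, double seconds) {
  const double ENWIK9_BYTES = 1e9;  // assumption: enwik9 is 10^9 bytes
  printf("%-8s %6.1f MB/s\n", name, mbPerSec(ENWIK9_BYTES, seconds));
}
```

    With the times above, crc32 at 6.5 sec works out to roughly 154 MB/s and sha3 at 93.0 sec to roughly 11 MB/s.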

  3. #3
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Matt, you can install RAM drive software and increase the RAM disk size before running the test from it. At least one, from SuperSpeed LLC, can do it without rebooting.

    Ilya, 7-zip includes open-source crc32 asm code that runs at about 2 GB/sec on a 2600K @ 4.6 GHz.
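
    For context on how software CRC-32 can reach speeds like this: a common technique is "slicing-by-4", which consumes four input bytes per step instead of one. The sketch below illustrates that general technique only. It is not 7-zip's code, and the names `crcTable`, `crc32Init` and `crc32Slice4` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative slicing-by-4 CRC-32 (reflected polynomial 0xEDB88320).
// Hypothetical names; not taken from 7-zip or CHK.
static uint32_t crcTable[4][256];

// Build the lookup tables. Must be called once before crc32Slice4().
void crc32Init() {
  for (uint32_t i = 0; i < 256; ++i) {
    uint32_t c = i;
    for (int k = 0; k < 8; ++k)
      c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
    crcTable[0][i] = c;  // classic one-byte-at-a-time table
  }
  // Table j is table j-1 advanced through one more zero byte.
  for (uint32_t i = 0; i < 256; ++i)
    for (int j = 1; j < 4; ++j)
      crcTable[j][i] = (crcTable[j-1][i] >> 8)
                     ^ crcTable[0][crcTable[j-1][i] & 255];
}

uint32_t crc32Slice4(const unsigned char* p, size_t n) {
  uint32_t c = 0xFFFFFFFFu;
  while (n >= 4) {  // four bytes per iteration, assembled little-endian
    c ^= (uint32_t)p[0] | (uint32_t)p[1] << 8
       | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    c = crcTable[3][c & 255] ^ crcTable[2][(c >> 8) & 255]
      ^ crcTable[1][(c >> 16) & 255] ^ crcTable[0][c >> 24];
    p += 4;
    n -= 4;
  }
  while (n--)  // remaining 0..3 bytes, one at a time
    c = crcTable[0][(c ^ *p++) & 255] ^ (c >> 8);
  return ~c;
}
```

    The four table lookups in the main loop do not depend on each other, which is what lets the CPU overlap them. The standard check input "123456789" should give 0xCBF43926.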

  4. #4
    Expert
    Matt Mahoney's Avatar
    Also note times for SlavaSoft fsum 2.51 on the same hardware:

    crc32 5.1 sec
    md4 4.0 sec
    md5 5.2 sec
    sha1 5.6 sec
    sha256 13.3 sec
    sha512 141.6 sec

  5. #5
    The Founder encode's Avatar
    Compiled CHK's code using Visual C++ 2012.

    Hardware:
    CPU: Intel Core i7-3770K @ 4.7 GHz
    RAM: 16 GB Corsair Dominator Platinum @ ~1900 MHz
    SSD: 240 GB Corsair Force GT

    SHA1, ENWIK9
    CHK -> 4.4 sec
    sha1sum -> 3.3 sec
    CHK Command-line -> 3.1 sec
    fsum -> 1.9 sec

    Dunno how they did that! I think fsum uses SSE/AVX/GPU-based computation or very smart ASM optimizations...

  6. #6
    Expert
    Matt Mahoney's Avatar
    I wonder too. fsum is twice as fast as what I could write.

  7. #7
    Expert
    Matt Mahoney's Avatar
    Small improvement in libzpaq SHA1 on enwik9: from 11.4 sec to 9.5 sec when compiled with g++ -O3 -msse2, or 9.9 sec with cl /O2 /arch:SSE2.

    Code:
    // sha1.cpp - compute SHA1 hashes of filename arguments
    // Written by Matt Mahoney. Public domain.
    
    #define _CRT_DISABLE_PERFCRIT_LOCKS
    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>
    
    typedef uint32_t U32;
    
    // For computing SHA-1 checksums
    class SHA1 {
    public:
      void put(int c) {  // hash 1 byte
        U32& r=w[len0>>5&15];
        r=(r<<8)|(unsigned char)c;
        if (!(len0+=8)) ++len1;
        if ((len0&511)==0) process();
      }
      double size() const {return len0/8+len1*536870912.0;} // size in bytes
      const char* result();  // get hash and reset
      SHA1() {init();}
    private:
      void init();      // reset, but don't clear hbuf
      U32 len0, len1;   // length in bits (low, high)
      U32 h[5];         // hash state
      U32 w[16];        // input buffer
      char hbuf[20];    // result
      void process();   // hash 1 block
    };
    
    // Start a new hash
    void SHA1::init() {
      len0=len1=0;
      h[0]=0x67452301;
      h[1]=0xEFCDAB89;
      h[2]=0x98BADCFE;
      h[3]=0x10325476;
      h[4]=0xC3D2E1F0;
      memset(w, 0, sizeof(w));
    }
    
    // Return old result and start a new hash
    const char* SHA1::result() {
    
      // pad and append length
      const U32 s1=len1, s0=len0;
      put(0x80);
      while ((len0&511)!=448)
        put(0);
      put(s1>>24);
      put(s1>>16);
      put(s1>>8);
      put(s1);
      put(s0>>24);
      put(s0>>16);
      put(s0>>8);
      put(s0);
    
      // copy h to hbuf
      for (int i=0; i<5; ++i) {
        hbuf[4*i]=h[i]>>24;
        hbuf[4*i+1]=h[i]>>16;
        hbuf[4*i+2]=h[i]>>8;
        hbuf[4*i+3]=h[i];
      }
    
      // return hash prior to clearing state
      init();
      return hbuf;
    }
    
    // Hash 1 block of 64 bytes
    void SHA1::process() {
      U32 a=h[0], b=h[1], c=h[2], d=h[3], e=h[4];
      static const U32 k[4]={0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6};
      #define f(a,b,c,d,e,i) \
        if (i>=16) \
          w[(i)&15]^=w[(i-3)&15]^w[(i-8)&15]^w[(i-14)&15], \
          w[(i)&15]=w[(i)&15]<<1|w[(i)&15]>>31; \
        e+=(a<<5|a>>27)+k[(i)/20]+w[(i)&15] \
          +((i)%40>=20 ? b^c^d : i>=40 ? (b&c)|(d&(b|c)) : d^(b&(c^d))); \
        b=b<<30|b>>2;
      #define r(i) f(a,b,c,d,e,i) f(e,a,b,c,d,i+1) f(d,e,a,b,c,i+2) \
                   f(c,d,e,a,b,i+3) f(b,c,d,e,a,i+4)
      r(0)  r(5)  r(10) r(15) r(20) r(25) r(30) r(35)
      r(40) r(45) r(50) r(55) r(60) r(65) r(70) r(75)
      #undef f
      #undef r
      h[0]+=a; h[1]+=b; h[2]+=c; h[3]+=d; h[4]+=e;
    }
    
    int main(int argc, char** argv) {
      SHA1 sha1;
      for (int i=1; i<argc; ++i) {
        FILE* in=fopen(argv[i], "rb");
        if (!in)
          perror(argv[i]);
        else {
          const int BUFSIZE=4096;
          int n;
          unsigned char buf[BUFSIZE];
          while ((n=fread(buf, 1, BUFSIZE, in))>0) {
            for (int j=0; j<n; ++j)  // j, so the outer argv index i is not shadowed
              sha1.put(buf[j]);
          }
          fclose(in);
          double sz=sha1.size();
          const char* p=sha1.result();
          for (int j=0; j<20; ++j) printf("%02x", p[j]&255);
          printf(" %1.0f %s\n", sz, argv[i]);
        }
      }
      return 0;
    }

    This will go in the next version of libzpaq if I can't find any further improvements. The main improvements are reducing the message schedule from w[80] to a 16-word buffer w[16] that is updated as needed, and rewriting the round functions (the f expressions) to use fewer operations, as suggested on Wikipedia. About half of the speedup comes from replacing getc() with fread(), so that part won't carry over to zpaq.
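
    The two changes to the hash core mentioned here can be checked in isolation. The sketch below compares the textbook 80-word SHA-1 message schedule with a 16-word ring buffer, and the textbook round functions with the shorter forms used in the code. The helper names are made up for this example and are not from libzpaq.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t U32;

static U32 rol1(U32 x) { return x << 1 | x >> 31; }

// Textbook schedule: expand the 16 input words into all 80 at once.
void schedule80(const U32 in[16], U32 out[80]) {
  for (int i = 0; i < 16; ++i) out[i] = in[i];
  for (int i = 16; i < 80; ++i)
    out[i] = rol1(out[i-3] ^ out[i-8] ^ out[i-14] ^ out[i-16]);
}

// Rolling schedule: when word i is needed, w[i&15] still holds word i-16,
// so it can be updated in place. out[] only records words for comparison.
void schedule16(const U32 in[16], U32 out[80]) {
  U32 w[16];
  for (int i = 0; i < 16; ++i) out[i] = w[i] = in[i];
  for (int i = 16; i < 80; ++i) {
    w[i&15] ^= w[(i-3)&15] ^ w[(i-8)&15] ^ w[(i-14)&15];
    w[i&15] = rol1(w[i&15]);
    out[i] = w[i&15];
  }
}

// True if both schedules produce the same 80 words for a sample block.
bool schedulesAgree() {
  U32 in[16], a[80], b[80];
  for (int i = 0; i < 16; ++i) in[i] = 0x9E3779B9u * (i + 1);  // arbitrary words
  schedule80(in, a);
  schedule16(in, b);
  for (int i = 0; i < 80; ++i)
    if (a[i] != b[i]) return false;
  return true;
}

// True if the shorter round functions match the textbook ones. The three
// constants cover all 8 combinations of input bits across bit positions.
bool roundFunctionsAgree() {
  const U32 b = 0xF0F0F0F0u, c = 0xCCCCCCCCu, d = 0xAAAAAAAAu;
  bool ch  = ((b & c) | (~b & d)) == (d ^ (b & (c ^ d)));
  bool maj = ((b & c) | (b & d) | (c & d)) == ((b & c) | (d & (b | c)));
  return ch && maj;
}
```

    Both checks should return true. The ring buffer works because word i depends only on words i-3, i-8, i-14 and i-16, all of which are still within the last 16.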

    I suspect that further gains would come from using SSE2 for scheduling, unrolled to w[32]. The main round function has to be done sequentially.

  8. #8
    The Founder encode's Avatar
    New version will be released soon!

    What's new:

    + Added ED2K hash support
    + UTF-8 output of hashes (Checksums.txt), changed output format to "hash *file"
    + Changed "Copy to clipboard" format to "file, hashtype: hash"
    + CHK will always highlight all equal hashes, not only when you "Sort By Hash"
    + Some small GUI fixes - more correct DPI scaling, corrected appearance under Windows XP

  9. #9
    The Founder encode's Avatar
    [Attached image: chk112.png]

  10. #10
    Expert
    Matt Mahoney's Avatar
    Will ED2K be multithreaded?
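
    Some background on why the question comes up: as commonly described, the ED2K hash of a file larger than one chunk is built by splitting the file into 9,728,000-byte chunks, taking the MD4 of each chunk, and then taking the MD4 of the concatenated chunk digests. The chunks are independent, so they could in principle be hashed in parallel. The sketch below only computes the chunk boundaries. The names are hypothetical, it says nothing about how CHK does it, and it does not address files whose size is an exact multiple of the chunk size, which implementations are known to handle differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One independently hashable piece of the file.
struct Chunk {
  uint64_t offset;
  uint64_t length;
};

const uint64_t ED2K_CHUNK = 9728000;  // bytes per ED2K chunk

// Split a file of fileSize bytes into ED2K-sized chunks.
// Each returned chunk could be handed to a separate worker for MD4.
std::vector<Chunk> ed2kChunks(uint64_t fileSize) {
  std::vector<Chunk> chunks;
  for (uint64_t off = 0; off < fileSize; off += ED2K_CHUNK) {
    Chunk c;
    c.offset = off;
    c.length = fileSize - off < ED2K_CHUNK ? fileSize - off : ED2K_CHUNK;
    chunks.push_back(c);
  }
  return chunks;
}
```

    For a 10^9-byte file such as enwik9 this gives 103 chunks, the last one 7,744,000 bytes long.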

  11. #11
    The Founder encode's Avatar
    Not this time. For now I'm focused on more basic things...

  12. #12
    The Founder encode's Avatar
    As a note, CHK's output is 100% compatible with such hash tools as RHASH, including UTF-8 encoding.
    Just save hash list as TXT file and run "rhash -c checksums.txt"
    RHASH will correctly detect the hash type: CRC-32/ED2K/MD4/MD5/SHA-1/SHA-256/SHA-512
    [Attached image: chk112a.png]

  13. #13
    The Founder encode's Avatar
    Added the Uppercase option, removed the mostly useless List View mode.
    [Attached images: uc1.png, uc0.png]

  14. #14
    The Founder encode's Avatar
    Just tested a 64-bit compile of CHK. First of all, the 64-bit executable is really fat: 11.7 MB vs 3.3 MB for the 32-bit compile. And it looks like UPX can't pack 64-bit executables.

    Anyway, timings are (ENWIK9 again):

    Hash      CHK 32-bit   CHK 64-bit
    CRC-16    2.7s         1.6s
    CRC-32    2.7s         1.6s
    CRC-64    4.2s         1.6s
    ED2K      4.4s         2.6s
    MD4       3.1s         2.8s
    MD5       3.6s         3.2s
    SHA-1     4.4s         3.1s
    SHA-256   8.5s         6.3s
    SHA-512   17.8s        4.4s [!]
    SHA-3     30.1s        8.5s [!]

  15. #15
    Programmer Bulat Ziganshin's Avatar
    7-zip's 32-bit crc32 code does it in 0.5 seconds (2600K @ 4.6 GHz).

  16. #16
    The Founder encode's Avatar
    Replaced an old putz input box with a proper hash-input dialog:
    [Attached images: dialog1.png, dialog2.png]

  17. #17
    The Founder encode's Avatar
    During the long CHK v1.12 evaluation and testing, I found a bug in RHASH's SHA-512 implementation...
    As you can see, this time I decided to do extra testing. I can tell ya, this new version is just awesome - really easy to use with improved readability and user interface - I just really enjoy testing it...

  18. #18
    Member
    Join Date
    Nov 2012
    Location
    Bangalore
    Posts
    114
    Thanks
    9
    Thanked 37 Times in 22 Posts

    Update

    The last SHA3 entry in your enwik9 performance table is interesting. Which implementation are you using?
    SSE-optimized Keccak 256 (part of the reference implementation) is 4x faster than Skein 256 on my laptop, a Core i5 430M at 2.27 GHz. It processes at 1 GB/s while Skein manages 243 MB/s. Of course, the Skein implementation is optimized x64 assembly but does not use SSE.

    In both cases I am using (in Pcompress) the optimized reference implementations that were part of the NIST submissions. Intel also has an SSE/AVX-optimized ASM version of the core SHA 256 block function here: http://download.intel.com/embedded/processor/whitepaper/327457.pdf
    I am using this in Pcompress as well. It really shines on an AVX-enabled processor.
    Last edited by moinakg; 29th December 2012 at 22:10.

  19. #19
    Member
    My previous claim about Keccak performance was wrong. I was still testing it with the test vectors and found an error. It is now more realistic and in line with the table above.

  20. #20
    The Founder encode's Avatar
    Quote Originally Posted by moinakg View Post
    The last SHA3 entry in your enwik9 performance table is interesting. Which implementation are you using?
    I'm using my own implementation. It's based on the original KECCAK code and features some optimizations similar to what I'm using with my MD4...SHA-512!

  21. #21
    The Founder encode's Avatar
    And an order-0 histogram for ENWIK9 - an idea for an upcoming CHK feature. Additionally, I have an idea about an order-1 histogram/graph/map.
    [Attached image: histogram.png]
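
    For readers unfamiliar with the terms: an order-0 histogram counts how often each byte value occurs, and an order-1 histogram counts how often each byte value follows each other byte value. A minimal sketch of both follows. The function names are made up for this example, and this is not CHK's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Order-0: h[b] is the number of occurrences of byte value b.
std::vector<uint64_t> order0(const unsigned char* p, size_t n) {
  std::vector<uint64_t> h(256, 0);
  for (size_t i = 0; i < n; ++i) ++h[p[i]];
  return h;
}

// Order-1: h[prev*256 + cur] is the number of times byte value cur
// directly follows byte value prev. The first byte has no predecessor
// and is not counted.
std::vector<uint64_t> order1(const unsigned char* p, size_t n) {
  std::vector<uint64_t> h(65536, 0);
  for (size_t i = 1; i < n; ++i) ++h[p[i-1] * 256 + p[i]];
  return h;
}
```

    The order-1 table has 65,536 cells, which is why it is usually shown as a 256x256 map rather than a bar chart.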

  22. #22
    Member chornobyl's Avatar
    Join Date
    May 2008
    Location
    ua/kiev
    Posts
    153
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Looks good. Waiting to see the order-1 histogram, and maybe also a logarithmic scale switch.
