Results 1 to 2 of 2

Thread: Adler32 on Large Blocks

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts

    Adler32 on Large Blocks

    Happy New Year folks!

    A small delirium idea of the first morning of 2015. Original Adler32 optimization idea is to perform MOD of Adler and Sum2 each 5552 bytes (modulo delay count), to avoid 32-bit arithmetic overflow.
    But since we are heavily on 64-bit now, why tiny 5552 byte blocks? We easily can do MOD once per a few megabytes! And still keep Adler variable as a 32-bit integer.
    And hence the code:

    const int BUF_SIZE=1<<21; // 2 MB

    UINT32 Adler=1;
    UINT64 Sum2=0;

    int n;
    while ((n=fread(buf, 1, BUF_SIZE, f))>0)
    {
    for (int p=0; p<n; ++p)
    {
    Adler+=buf[p];
    Sum2+=Adler;
    }
    Adler%=65521;
    Sum2%=65521;
    }

    return (Sum2<<16)|Adler;

    And we are confidently safe from a 64-bit integer overflow here...

  2. The Following 2 Users Say Thank You to encode For This Useful Post:

    Cyan (3rd January 2015),Matt Mahoney (2nd January 2015)

  3. #2
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    And with some loop-unrolling, this code is notable faster than super-fast Slice-by-8 CRCs. An upcoming CHK v1.80 will feature this accelerated Adler32 for sure!
    The code (just keep an eye on n variable - it must not exceed 16 MB, in this case max l is (255<<24), 64-bit h withstand even further):

    void Update(UINT32& Adler, UINT8* s, int n)
    {
    UINT32 l=UINT16(Adler);
    UINT64 h=Adler>>16;
    int p=0;
    for (; p<(n&7); ++p)
    h+=(l+=s[p]);
    for (; p<n; p+=8)
    {
    h+=(l+=s[p]);
    h+=(l+=s[p+1]);
    h+=(l+=s[p+2]);
    h+=(l+=s[p+3]);
    h+=(l+=s[p+4]);
    h+=(l+=s[p+5]);
    h+=(l+=s[p+6]);
    h+=(l+=s[p+7]);
    }
    Adler=(UINT32(h%65521)<<16)|(l%65521);
    }


Similar Threads

  1. MTF and DC on large alphabets via sorting
    By nburns in forum Data Compression
    Replies: 12
    Last Post: 11th June 2013, 19:28
  2. memstat: memory status and large page blocks available
    By Bulat Ziganshin in forum Download Area
    Replies: 2
    Last Post: 8th March 2013, 04:41
  3. 2G+ memory blocks
    By Shelwien in forum Data Compression
    Replies: 0
    Last Post: 6th March 2009, 03:13
  4. LZSS with a large dictionary
    By encode in forum Data Compression
    Replies: 31
    Last Post: 31st July 2008, 21:15
  5. Replies: 0
    Last Post: 26th July 2007, 18:47

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •