Thread: libzling updates!

  1. #1 RichSelian (Member, Shenzhen, China)

    libzling updates!

    https://github.com/richox/libzling/tree/20140219

    libzling now has an easy-to-use interface, but it still needs more documentation.
    A simple and fast lazy match has also been added, and the compression ratio is better (so it beats cade's RH2 on both compression ratio and speed).

  2. #2 cade (Member, US)
    Very nice.

    Edit: c1 (depth 32): 30789 KB, c2 (max depth): 30730 KB here for enwik8 if I ignore a match when the one at +1 or +2 is equal or longer; 30551 KB with an arithmetic coder (close to ideal entropy).
    Last edited by cade; 19th February 2014 at 16:55.

  3. #3 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    Very nice.

    Edit: c1 (depth 32): 30789 KB, c2 (max depth): 30730 KB here for enwik8 if I ignore a match when the one at +1 or +2 is equal or longer; 30551 KB with an arithmetic coder (close to ideal entropy).
    depth=32 is too deep for a fast compressor... libzling uses depth=8.
    And my lazy parser is not precise, but it is fast: I just do a partial match instead of a full match (when s1[maxlen-3 .. maxlen] == s2[maxlen-3 .. maxlen], I assume s1 == s2).

    I won't use an arithmetic coder because it decodes too slowly.

    Well, I don't think compression ratio is that important for a ROLZ compressor, since you can't compete with a BWT or PPM/CM one. But its speed is very competitive with normal LZ77, which gives me a chance to reach Matt's Pareto frontier.
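    A minimal sketch of that partial-match trick as I read it (the function name and surrounding parser are illustrative, not libzling's actual code): instead of verifying a candidate byte-by-byte from the start, only the few bytes around the current best length are probed, and the candidate is assumed to be at least that long if the probe succeeds.

    Code:
        // Sketch only: cheap filter used instead of a full comparison.
        // s1 = candidate match, s2 = current position, maxlen = best length so far.
        static inline bool ProbablyAtLeastAsLong(const unsigned char* s1,
                                                 const unsigned char* s2,
                                                 int maxlen)
        {
            // Check just the 4 bytes ending at maxlen; if they agree, assume the
            // whole prefix s1[0 .. maxlen] matches s2[0 .. maxlen].
            return s1[maxlen - 3] == s2[maxlen - 3]
                && s1[maxlen - 2] == s2[maxlen - 2]
                && s1[maxlen - 1] == s2[maxlen - 1]
                && s1[maxlen]     == s2[maxlen];
        }
    This rejects most losing candidates with four byte comparisons, at the cost of occasionally accepting a match that is really shorter than assumed.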

  4. #4 cade (Member, US)
    If you want another mode, depth=1 is LZP. There's no offset buffer anymore, just 0-255 + 256-511.

    Another idea: when you do a full match check at +1 or +2, you can save whether it was a hit or a miss. If it's a hit, you can skip the LZ search for all bytes until there, since you already know the match index and length. If it's a miss, that position skips the LZ test.

    Edit: You are right about PPM and CM though; they're too slow for what I need, and they normally need too much memory. The best for what I need (low RAM, okay speed) is modified BALZ or my CM4 with 16 MB of memory (dynamic mixing with o2 + o4 + a match model). BWT is still possible, but sorting speed is unreliable, even after LZP.

    Of course, you want more speed, so except for LZP, none of this is helpful.
    Last edited by cade; 20th February 2014 at 06:14.
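    A minimal sketch of the depth=1 / LZP mode described above, under my reading that symbols 0-255 are literals and 256-511 are match lengths (an illustration only, not cade's or libzling's actual code): with a single remembered position per context there is no match index to code at all.

    Code:
        #include <cstdint>
        #include <cstring>
        #include <vector>

        // Parse a buffer into LZP symbols: 0-255 = literal byte, 256-511 = match length.
        std::vector<int> LzpParse(const uint8_t* buf, int len)
        {
            std::vector<int> symbols;
            if (len <= 0)
                return symbols;

            int last_pos[256];                       // one candidate position per order-1 context
            memset(last_pos, -1, sizeof(last_pos));

            symbols.push_back(buf[0]);               // first byte is always a literal
            int pos = 1;
            while (pos < len)
            {
                int ctx = buf[pos - 1];
                int cand = last_pos[ctx];
                last_pos[ctx] = pos;

                int match = 0;
                if (cand >= 0)
                    while (match < 255 && pos + match < len && buf[cand + match] == buf[pos + match])
                        match++;

                if (match >= 4)                      // arbitrary minimum useful match length
                {
                    symbols.push_back(256 + match);  // match length, no index needed
                    pos += match;
                }
                else
                {
                    symbols.push_back(buf[pos]);     // literal
                    pos++;
                }
            }
            return symbols;
        }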

  5. #5 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    If you want another mode, depth=1 is LZP. There's no offset buffer anymore, just 0-255 + 256-511.

    Another idea: when you do a full match check at +1 or +2, you can save whether it was a hit or a miss. If it's a hit, you can skip the LZ search for all bytes until there, since you already know the match index and length. If it's a miss, that position skips the LZ test.

    Edit: You are right about PPM and CM though; they're too slow for what I need, and they normally need too much memory. The best for what I need (low RAM, okay speed) is modified BALZ or my CM4 with 16 MB of memory (dynamic mixing with o2 + o4 + a match model). BWT is still possible, but sorting speed is unreliable, even after LZP.

    Of course, you want more speed, so except for LZP, none of this is helpful.
    o1 LZP compression ratio is poor, high-order LZP doesn't work well with o0 Huffman, and decompression is slow.

  6. #6 cade (Member, US)
    Do you have any ideas on decreasing memory?

    These are my structures:
    Code:
    const static int TABLE_BITS = 12;
    const static int HASH_BITS = 12;
    
    const static int HASH_SIZE = 1 << HASH_BITS;
    const static int HASH_MASK = HASH_SIZE - 1;
    
    const static int TABLE_SIZE = 1 << TABLE_BITS;
    const static int TABLE_MASK = TABLE_SIZE - 1;
    
        struct MatchModel
        {
            struct Table
            {
                uint32 offsets[TABLE_SIZE]; // Upper bits 12 = hash, lower bits 24 = offset
                int16 head;
            };
    
            struct EncoderTable
            {
                uint32 next[TABLE_SIZE];
                uint32 root[HASH_SIZE];
            };
    
            Table tables[0x100];
            EncoderTable enc_tables[0x100];
    
            void Reset()
            {
                for (int i = 0; i < 0x100; i++)
                {
                    tables[i].head = 0;
                    memset(tables[i].offsets, 0, sizeof(tables[i].offsets));
    
                    memset(enc_tables[i].root, -1, sizeof(enc_tables[i].root));
                    memset(enc_tables[i].next, -1, sizeof(enc_tables[i].next));
                }
            }
        };
    
        struct MatchModel_Decoding
        {
            MatchModel::Table tables[0x100];
    
            void Reset()
            {
                memset(tables, 0, sizeof(tables));
            }
        };
    The size for decoding is fine, but the hash tables for encoding take up too much memory. Reducing HASH_BITS (depth=8 becomes less effective) or TABLE_BITS (any depth becomes less effective) still leaves it fairly large. The alternative is to switch to 64k windows and use 16-bit offsets, but I would much rather keep 4 MB blocks.
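    For reference, these are the rough sizes implied by the structures above (assuming 4-byte uint32, 2-byte int16, and ignoring padding), which back up the point about encoding memory:

    Code:
        // Table        = 4096 * 4 B + 2 B          ~= 16 KB  -> 256 tables ~=  4 MB
        // EncoderTable = 4096 * 4 B + 4096 * 4 B    = 32 KB  -> 256 tables  =  8 MB
        //
        // decoder (tables only)                              ~=  4 MB
        // encoder (tables + enc_tables)                      ~= 12 MB per block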

  7. #7 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    Do you have any ideas on decreasing memory?

    These are my structures: [struct definitions quoted in the previous post]

    The size for decoding is fine, but the hash tables for encoding take up too much memory. Reducing HASH_BITS (depth=8 becomes less effective) or TABLE_BITS (any depth becomes less effective) still leaves it fairly large. The alternative is to switch to 64k windows and use 16-bit offsets, but I would much rather keep 4 MB blocks.
    I don't understand why you use an EncodeTable and a DecodeTable at the same time.
    Maybe you would like to read the source of libzling; its ROLZ implementation is very clear.

  8. #8 cade (Member, US)
    It's not at the same time. The encoder uses only MatchModel. The decoder only uses MatchModel_Decoding.

    libzling is clear, but it doesn't store this part any more efficiently, so I was wondering if you had other ideas.

  9. #9 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    It's not at the same time. The encoder uses only MatchModel. The decoder only uses MatchModel_Decoding.

    libzling is clear, but it doesn't store this part any more efficiently, so I was wondering if you had other ideas.
    Code:
        struct MatchModel
        {
            struct Table
            {
                uint32 offsets[TABLE_SIZE]; // Upper bits 12 = hash, lower bits 24 = offset
                int16 head;
            };

            struct EncoderTable
            {
                uint32 next[TABLE_SIZE];
                uint32 root[HASH_SIZE];
            };

            Table tables[0x100];
            EncoderTable enc_tables[0x100];
            // here, why do you use two tables in one model?

            void Reset()
            {
                for (int i = 0; i < 0x100; i++)
                {
                    tables[i].head = 0;
                    memset(tables[i].offsets, 0, sizeof(tables[i].offsets));

                    memset(enc_tables[i].root, -1, sizeof(enc_tables[i].root));
                    memset(enc_tables[i].next, -1, sizeof(enc_tables[i].next));
                }
            }
        };
    At least there is one more optimization in libzling: uint16 root[HASH_SIZE]; is enough.
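    A rough figure for the saving that suggestion implies, with HASH_SIZE = 4096 as defined above (my arithmetic, not a measured number); 16 bits suffice because root only needs to index a slot inside one 4096-entry bucket rather than hold an absolute byte position:

    Code:
        // root[4096] as uint32:  4096 * 4 B * 256 buckets = 4 MB
        // root[4096] as uint16:  4096 * 2 B * 256 buckets = 2 MB   (saves 2 MB)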

  10. #10 cade (Member, US)
    Table only has the position and the hash for that position; EncoderTable has the indices for the hash table (root and next). They don't hold the same contents.

    Root refers to the first index, so a 16-bit root would mean a maximum index of 64k. I might be misunderstanding this, so I will take a closer look at zling, thanks.

  11. #11 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    Table only has the position and the hash for that position; EncoderTable has the indices for the hash table (root and next). They don't hold the same contents.

    Root refers to the first index, so a 16-bit root would mean a maximum index of 64k. I might be misunderstanding this, so I will take a closer look at zling, thanks.
    here is the match struct for encoding in libzling:

    Code:
        struct ZlingEncodeBucket {
            uint16_t suffix[kBucketItemSize];
            uint32_t offset[kBucketItemSize];
    
            uint16_t head;
            uint16_t hash[kBucketItemHash];
        };
        ZlingEncodeBucket m_buckets[256];
    hash, suffix and offset have the same meaning as your root, next and offsets.
    Here hash and suffix are both uint16, because for libzling kBucketItemSize == 4096. I can't imagine you have used a bucket size >= 65536; it would be totally unnecessary.
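    My reconstruction of how such a bucket is typically used (a sketch based on the struct above, not code taken from libzling; kMatchDepth, the hash function, and the zero-initialization convention are assumptions):

    Code:
        const int kMatchDepth = 8;                     // placeholder search depth

        // Record the current byte position under hash h (0 <= h < kBucketItemHash).
        void Insert(ZlingEncodeBucket& b, uint32_t pos, uint32_t h)
        {
            b.head = (b.head + 1) % kBucketItemSize;   // ring buffer of recent entries
            b.offset[b.head] = pos;                    // absolute position of this entry
            b.suffix[b.head] = b.hash[h];              // chain to the previous entry with hash h
            b.hash[h] = b.head;                        // newest entry for hash h
        }

        // Visit up to kMatchDepth candidate positions for hash h, newest first.
        // Assumes the bucket was zero-initialized, so offset == 0 marks an unused slot.
        void FindCandidates(const ZlingEncodeBucket& b, uint32_t h)
        {
            uint16_t i = b.hash[h];
            for (int d = 0; d < kMatchDepth && b.offset[i] != 0; d++)
            {
                uint32_t candidate = b.offset[i];      // compare the input here with the current position
                (void)candidate;                       // ... match/length check would go here ...
                i = b.suffix[i];                       // older entry with the same hash
            }
        }
    Everything the chain touches (hash, suffix, head) is an index in 0..4095, which is why uint16 is enough; only offset has to hold a byte position.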

  12. #12 cade (Member, US)
    Oh... you get the position from offsets[hash idx]; I was storing the hash index as an actual byte index. My bucket size is also 4k. Nice, thanks for the help.

    I don't think comparing with zlib is a good idea... it has a 32k window at most. That explains why its memory requirement is low.

  13. #13 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by cade View Post
    Oh... you get the position from offsets[hash idx]; I was storing the hash index as an actual byte index. My bucket size is also 4k. Nice, thanks for the help.

    I don't think comparing with zlib is a good idea... it has a 32k window at most. That explains why its memory requirement is low.
    zlib sucks, but it works well in small environments like a photo camera. And many programs and libraries use zlib anyway, even though they don't care about memory usage.

    bzip2 uses a 900 KB block. It sucks, too.

  14. #14 cade (Member, US)
    What don't you like about bzip2, apart from speed? BWT (1 MB) + o1 CM is my third option right now.

    It's strange that even RAR's LZH with a 64 KB dictionary compresses small files (2 MB) worse than zlib, yet does marginally better on enwik8. I wouldn't say zlib sucks; it works well for what it does with a 32 KB window.

  15. #15 Matt Mahoney (Expert, Melbourne, Florida, USA)
    Last edited by Matt Mahoney; 3rd March 2014 at 00:18.

  16. #16 RichSelian (Member, Shenzhen, China)
    Quote Originally Posted by Matt Mahoney View Post
    Thanks. I think I will provide some compression levels. It now compresses better, but much slower.
