Page 3 of 3 FirstFirst 123
Results 61 to 80 of 80

Thread: lzpm 0.10 is here!

  1. #61
    Tester

    Join Date
    May 2008
    Location
    St-Petersburg, Russia
    Posts
    182
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    Again vote for numbers!
    Quote Originally Posted by encode
    With 64 MB buffer the user must have at least 1 GB of RAM.
    16K + 32MB buffer
I think compression speed is not as important as decompression speed.

  2. #62
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by encode
    32 MB buffer (to keep memory on a leash)
    Quote Originally Posted by Squxe
    16K + 32MB buffer

    Most people have at least 1GB RAM these days. I vote for the 64MB buffer.

  3. #63
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    OK, will keep 16K index and 64 MB buffer as a base. At least all benchmarkers have 1 GB RAM. LZPM goes large!

Now I will play with hashing and string searching - I have a few ideas.

    Malcolm Taylor in his ROLZ2 uses hash tables for the fast mode and a binary tree for the higher ones.

    Hash tables have a fast update, but at the same time the search is poorly bounded - that's why they're not efficient with optimal parsing.

    A binary tree is better with optimal parsing since the string search becomes faster. Currently I have no idea how exactly to use binary trees for string searching - probably I should look at the LZMA sources.
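    The hash-table tradeoff mentioned above can be sketched roughly like this. This is my own minimal Python illustration, not LZPM's actual code, and all names (`hash4`, `find_longest_match`, `max_depth`) are hypothetical: the table update is O(1), but the search walks a chain of unbounded length, so real implementations cap the traversal depth and may miss the longest match - which is exactly why it hurts optimal parsing.

    ```python
    # Minimal hash-chain match finder (illustrative sketch, not LZPM's code).
    # Update is O(1); search walks the chain and is typically depth-capped.

    def hash4(data, pos):
        """Hash the 4 bytes starting at pos (little-endian, mod a prime)."""
        return (data[pos] | data[pos+1] << 8 | data[pos+2] << 16 | data[pos+3] << 24) % 4093

    def insert(data, pos, head, prev):
        """O(1) hash-table update: link pos into the chain for its hash."""
        h = hash4(data, pos)
        prev[pos] = head.get(h, -1)
        head[h] = pos

    def find_longest_match(data, pos, head, prev, max_depth):
        """Longest match at pos, visiting at most max_depth chain entries."""
        best_len, best_pos = 0, -1
        cur = head.get(hash4(data, pos), -1)
        depth = 0
        while cur != -1 and depth < max_depth:
            length = 0
            while pos + length < len(data) and data[cur + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_pos = length, cur
            cur = prev[cur]    # step backwards to the previous position
            depth += 1
        return best_len, best_pos

    data = b"abcdefabcdefabcdef"
    head, prev = {}, [-1] * len(data)
    for i in range(6):          # index the first occurrence of "abcdef"
        insert(data, i, head, prev)
    best_len, best_pos = find_longest_match(data, 6, head, prev, max_depth=32)
    print(best_len, best_pos)   # prints "12 0"
    ```

    With a small `max_depth` the finder is fast but may return a shorter match than exists; a binary tree keeps the candidates ordered so the search itself stays cheap even without such a cap.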


  4. #64
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by encode
    OK, will keep 16K index and 64 MB buffer as a base. At least all benchmarkers have 1 GB RAM. LZPM goes large!
    Thanks Ilia!

  5. #65
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Some results with new LZPM:

    ENWIK8: 26,513,271 bytes
    ENWIK9: 229,206,234 bytes

    gimp-2.0.0.tar: 9,477,271 bytes

    Memory needed for compression: 512 + 128 + 64 = 704 MB


  6. #66
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    The debug-pre-release of LZPM 0.11 is here:
    lzpm011dbg.zip (47 KB)

    (for personal testing only - not to be shared outside this forum)

    A powerhouse-like release. If you like it, I'll finally release v0.11 very soon!

    Check the fast modes - IT'S SICK!!

    Happy testing!

  7. #67
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Thanks Ilia!

    I'm sure this release will be a true PowerHouse!!!

  8. #68
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Very quick test with FP.log...

    (1) 778 KB (796,905 bytes)
    Elapsed Time: 00:00:38.14 (38.14 Seconds)


    (8) 736 KB (754,389 bytes)
    Elapsed Time: 00:02:57.64 (177.64 Seconds)


    Good compression speed!

  9. #69
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quick test with SFC at lowest (1) setting...

    A10.jpg > 832,155
    AcroRd32.exe > 1,661,229
    english.dic > 1,043,341
    FlashMX.pdf > 3,768,997
    FP.LOG > 796,905
    MSO97.DLL > 2,030,758
    ohs.doc > 839,929
    rafale.bmp > 1,126,286
    vcfiu.hlp > 737,812
    world95.txt > 644,182

    Total = 13,481,594 bytes

    Compression is still good even at the lowest setting.

    I'll run more tests tomorrow.

  10. #70
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    More tests...

    LZPM (1)

    A10.jpg > 832,155
    AcroRd32.exe > 1,661,229
    english.dic > 1,043,341
    FlashMX.pdf > 3,768,997
    FP.LOG > 796,905
    MSO97.DLL > 2,030,758
    ohs.doc > 839,929
    rafale.bmp > 1,126,286
    vcfiu.hlp > 737,812
    world95.txt > 644,182

    Total = 13,481,594 bytes

    LZPM (2)

    A10.jpg > 832,164
    AcroRd32.exe > 1,642,634
    english.dic > 1,022,625
    FlashMX.pdf > 3,761,541
    FP.LOG > 776,765
    MSO97.DLL > 2,013,842
    ohs.doc > 835,722
    rafale.bmp > 1,102,195
    vcfiu.hlp > 726,469
    world95.txt > 613,538

    Total = 13,327,495 bytes

    LZPM (3)

    A10.jpg > 832,173
    AcroRd32.exe > 1,634,885
    english.dic > 1,015,535
    FlashMX.pdf > 3,756,691
    FP.LOG > 768,875
    MSO97.DLL > 2,009,222
    ohs.doc > 835,194
    rafale.bmp > 1,092,312
    vcfiu.hlp > 710,894
    world95.txt > 598,423

    Total = 13,254,204 bytes

    LZPM (4)

    A10.jpg > 832,173
    AcroRd32.exe > 1,633,323
    english.dic > 1,010,246
    FlashMX.pdf > 3,755,575
    FP.LOG > 764,299
    MSO97.DLL > 2,007,172
    ohs.doc > 834,961
    rafale.bmp > 1,089,177
    vcfiu.hlp > 710,058
    world95.txt > 592,372

    Total = 13,229,356 bytes

    LZPM (5)

    A10.jpg > 832,173
    AcroRd32.exe > 1,631,500
    english.dic > 1,000,441
    FlashMX.pdf > 3,754,544
    FP.LOG > 761,176
    MSO97.DLL > 2,004,878
    ohs.doc > 834,432
    rafale.bmp > 1,087,340
    vcfiu.hlp > 705,505
    world95.txt > 589,635

    Total = 13,201,624 bytes

    LZPM (6)

    A10.jpg > 832,173
    AcroRd32.exe > 1,630,453
    english.dic > 989,329
    FlashMX.pdf > 3,753,560
    FP.LOG > 756,963
    MSO97.DLL > 2,003,865
    ohs.doc > 834,205
    rafale.bmp > 1,083,531
    vcfiu.hlp > 703,909
    world95.txt > 588,135

    Total = 13,176,123 bytes

    LZPM (7)

    A10.jpg > 832,173
    AcroRd32.exe > 1,629,416
    english.dic > 979,415
    FlashMX.pdf > 3,753,148
    FP.LOG > 754,918
    MSO97.DLL > 2,002,878
    ohs.doc > 834,104
    rafale.bmp > 1,078,426
    vcfiu.hlp > 701,594
    world95.txt > 586,875

    Total = 13,152,947 bytes

    LZPM (8)

    A10.jpg > 832,173
    AcroRd32.exe > 1,629,139
    english.dic > 973,003
    FlashMX.pdf > 3,752,812
    FP.LOG > 754,389
    MSO97.DLL > 2,002,410
    ohs.doc > 833,986
    rafale.bmp > 1,075,903
    vcfiu.hlp > 700,590
    world95.txt > 585,630

    Total = 13,140,035 bytes

    LZPM (9)

    A10.jpg > 832,173
    AcroRd32.exe > 1,626,230
    english.dic > 962,032
    FlashMX.pdf > 3,750,555
    FP.LOG > 614,714
    MSO97.DLL > 2,000,508
    ohs.doc > 828,583
    rafale.bmp > 1,056,836
    vcfiu.hlp > 687,544
    world95.txt > 573,036

    Total = 12,932,211 bytes


    Awesome compressor!

    Good work Ilia!

  11. #71
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    I've found that this 5-byte string match finder is not efficient in terms of binary compression - we lose many short matches. So, I decided to use the old 4-byte string searching. This leads to higher binary and slightly higher text compression, at the cost of extra compression time (a few times slower). It took a few hours to compress ENWIK9 on my PC... However, max is max, right? OK, some results with the new modification:
    ENWIK8: 26,498,178 bytes
    ENWIK9: 229,062,912 bytes
    gimp-2.0.0.tar: 9,466,969 bytes

    On small and binary files the gain is much more notable.
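    The tradeoff described above can be illustrated with a toy sketch (my own Python illustration, not LZPM's code; `find_matches` is a hypothetical brute-force stand-in for the real finder): a match finder that indexes N-byte strings can only discover matches of at least N bytes, so raising the minimum from 4 to 5 makes every exactly-4-byte repetition invisible, and those bytes fall back to literals.

    ```python
    # Illustration of the minimum-match-length tradeoff (my own sketch,
    # not LZPM's code): a finder indexing N-byte strings only sees
    # matches of at least N bytes.

    def find_matches(data, min_match):
        """Brute force: all (pos, src, length) matches of >= min_match bytes."""
        matches = []
        for pos in range(len(data)):
            for src in range(pos):
                length = 0
                while pos + length < len(data) and data[src + length] == data[pos + length]:
                    length += 1
                if length >= min_match:
                    matches.append((pos, src, length))
        return matches

    # "WORD" repeats with length exactly 4: a 4-byte finder sees it,
    # a 5-byte finder does not.
    data = b"WORD....WORD"
    print(len(find_matches(data, 4)))   # prints "1"
    print(len(find_matches(data, 5)))   # prints "0"
    ```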


  12. #72
    Tester

    Join Date
    May 2008
    Location
    St-Petersburg, Russia
    Posts
    182
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by encode
    However, the max is max, right?
    Right!

  13. #73
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Quote Originally Posted by README.TXT
    LZPM file compressor v0.11

    LZPM is an experimental file compressor based on the ROLZ algorithm. ROLZ is an improved version of the LZ77 compression algorithm featuring fast decompression and a high compression ratio.

    Usage:
    To compress with the fast compression level:
    lzpm 1 test.txt test.lzpm

    To decompress:
    lzpm d test.lzpm test.txt

    There are nine compression levels, ranging from one (fast) to nine (max). Note that the compression level does not affect the decompression speed or memory usage.

    The compression levels description:
    1 – The fast level. LZPM disables any optimized parsing, which leads to the fastest compression.
    2 to 8 – Normal levels. LZPM uses improved parsing with some search restrictions to reduce the compression time. With level two, the compressor uses 1-byte lookahead; accordingly, with level eight it uses 7-byte lookahead. Please note that level three is in most cases the most efficient in terms of compression ratio versus compression time.
    9 – The max level; it provides the best compression at any cost. It uses fully optimized parsing with no search limits, so in some cases LZPM running in this mode can be very slow. However, as noted before, the compression level does not affect the decompression speed or memory usage. Thus, if you need extra compression, use this level.

    System requirements:
    At least 1 GB of RAM installed.

    You can contact the author via the following email:
    --------------- at yahoo.com

    Check for new versions at: http://lzpm.encode.ru

    History:
    New to v0.11 (? Sep 2007):
    + Added nine compression levels
    + Increased memory usage in favor of improved compression

    New to v0.10 (23 Aug 2007):
    + Added an optimal parsing

    New to v0.09 (15 Aug 2007):
    + Improved the parsing scheme

    New to v0.08 (8 Aug 2007)
    + Added a further improved parsing scheme

    New to v0.07 (6 Aug 2007)
    + Improved the parsing strategy

    New to v0.06 (19 May 2007)
    + Fixed a bug with the hash chains

    New to v0.05 (17 May 2007)
    + Improved the parsing scheme and string searching

    New to v0.04 (4 May 2007)
    + Improved the string searching using the hash chains

    New to v0.03 (28 Apr 2007):
    + Improved the match finder

    New to v0.02 (18 Apr 2007):
    + Improved an arithmetic encoder

    New to v0.01 (15 Apr 2007):
    + The first release, it represents an improved version of the QUAD file compressor

  14. #74
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    When will version 0.11 be released?

  15. #75
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    LZPM 0.11 is almost written - now I'm just testing it. It will be released within one week, maybe even tomorrow - it depends on my tests. Anyway, the release is very soon!

  16. #76
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    I have an idea for a hash chains improvement.

    The data structures are accessed in the following manner:

    head -> prev -> prev -> prev ...

    i.e. we start from "head", then we traverse backwards through "prev".

    To the head we have random access (defined by a hash).

    The prev entries we access sequentially, but backwards. The idea is a cheap trick that lets us traverse forward instead, which is assumed to be faster. To do so, we just invert each "pos":
    pos = prev[N - pos];

    So each time we go forward; this may lead to faster memory access.

    Simple, isn't it? Will test this idea tomorrow.
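    The trick can be sketched like this (my own Python illustration with hypothetical names `iprev`/`ihead`, not the actual code): store every position mirrored as `N - pos`, so that following the chain reads the array at increasing indices while still recovering the same positions.

    ```python
    # Sketch of the chain-inversion trick from the post above (my own
    # illustration, hypothetical names). A normal hash chain is walked
    # toward earlier positions, i.e. backwards through memory; storing
    # every position as N - pos mirrors the array, so the same walk
    # touches increasing indices instead.

    N = 16  # buffer size (tiny, for illustration)

    # Normal chain for one hash bucket: head -> prev -> prev -> ...
    prev = [-1] * N
    head = -1
    for pos in [2, 5, 9, 13]:        # positions sharing one hash value
        prev[pos] = head
        head = pos

    # Mirrored chain: both the index and the stored value are inverted.
    iprev = [-1] * (N + 1)
    ihead = -1
    for pos in [2, 5, 9, 13]:
        iprev[N - pos] = ihead
        ihead = N - pos

    # Walking the normal chain visits decreasing positions 13, 9, 5, 2...
    normal_walk = []
    cur = head
    while cur != -1:
        normal_walk.append(cur)
        cur = prev[cur]

    # ...while the mirrored walk reads iprev at increasing indices
    # (3, 7, 11, 14) yet yields the same positions via pos = N - index.
    mirror_walk = []
    cur = ihead
    while cur != -1:
        mirror_walk.append(N - cur)
        cur = iprev[cur]

    print(normal_walk)   # [13, 9, 5, 2]
    print(mirror_walk)   # [13, 9, 5, 2]
    ```

    Note the accesses ascend but are still scattered, not contiguous - which is the objection raised in the next post.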

  17. #77
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by encode
    traverse towards, which assumed to be faster.
    it's random access anyway. forward/backward access may have different speed when it's *sequential*

  18. #78
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Indeed, this gives no improvement. At least I tested the idea...

  19. #79
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Some time ago, Uwe proposed an idea for a parsing improvement:

    With my FP, each time we compare the combination of two matches:

    a + b

    with

    shorter a + b

    If a + b = (MAXMATCH * 2) - 1, we cannot outperform such a value inside the parsing scheme, so we just eliminate the search.

    My idea: if b == MAXMATCH, then we cannot outperform the current combination, and we completely eliminate further optimization.

    The case a + b = (MAXMATCH * 2) - 1 is not frequent; on some files it will never happen. b == MAXMATCH is far more frequent - as a result we can skip many match searches while providing the SAME compression.

    This is already added to LZPM 0.11. However, even with this improvement, LZPM with "9" is damn slow. Anyway, with this technique we can save some time with no compression loss!
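    The early exit can be sketched as follows (my own Python illustration with hypothetical names, not the actual LZPM 0.11 code): candidates pair a first match `a` with the best follow-up match `b`, and since later candidates use shorter first matches, a combination whose `b` already equals MAXMATCH can never be beaten, so the remaining (expensive) match searches are skipped.

    ```python
    # Sketch of the b == MAXMATCH early exit (my own illustration with
    # hypothetical names, not LZPM's actual code).

    MAXMATCH = 255

    def best_combination(candidates):
        """candidates: (a_len, b_len) pairs in order of successively
        shorter first matches. Returns (best_total, searches_done)."""
        best = 0
        searches = 0
        for a_len, b_len in candidates:
            searches += 1
            best = max(best, a_len + b_len)
            if b_len == MAXMATCH:
                # Later candidates have a_len no larger than this one,
                # so a_len + MAXMATCH cannot be outperformed: stop.
                break
        return best, searches

    # Ten candidate pairs; the third one hits b == MAXMATCH, so seven
    # match searches are skipped with no change in the result.
    pairs = [(40, 10), (39, 30), (38, 255)] + [(37 - i, 20) for i in range(7)]
    best, searches = best_combination(pairs)
    print(best, searches)   # prints "293 3"
    ```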


  20. #80
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    In practice, we can calculate such an upper limit dynamically, and it can be less than MAXMATCH, getting a further speed-up, especially on highly redundant files like fp.log.
