Thread: TC 5.0dev6 released!

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    What's new:
    + Added fully-featured order-2-0 PPMC coder

    Link:
    Download TC 5.0dev6 (28 KB)
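For readers unfamiliar with the term, here is a rough sketch of what an "order-2-0 PPMC" model could look like. It assumes "order-2-0" means an order-2 context that escapes straight to order-0, and it uses the standard PPMC escape estimate (escape count = number of distinct symbols seen in the context). The class name `Order20PPMC` and the helper `total_bits` are made up for illustration. This is not TC's code: it has no arithmetic coder, no symbol exclusion and no hashtable, and only adds up ideal code lengths.

```python
import math
from collections import defaultdict


class Order20PPMC:
    """Toy order-2 -> order-0 model with PPMC (method C) escape estimation."""

    def __init__(self):
        # context (2 bytes, or b"" for order-0) -> {symbol: count}
        self.tables = defaultdict(dict)

    def _contexts(self, history):
        # Try the order-2 context first, then order-0 (the empty context).
        ctxs = []
        if len(history) >= 2:
            ctxs.append(bytes(history[-2:]))
        ctxs.append(b"")
        return ctxs

    def cost_bits(self, history, symbol):
        """Ideal code length in bits for `symbol` following `history`."""
        bits = 0.0
        for ctx in self._contexts(history):
            counts = self.tables.get(ctx)
            if not counts:
                continue  # context never seen: fall through at no cost
            total = sum(counts.values())
            distinct = len(counts)
            if symbol in counts:
                # PPMC: P(symbol) = count / (total + distinct)
                return bits - math.log2(counts[symbol] / (total + distinct))
            # PPMC: P(escape) = distinct / (total + distinct)
            bits -= math.log2(distinct / (total + distinct))
        return bits + 8.0  # order -1: uniform over 256 byte values

    def update(self, history, symbol):
        # Simplification: update every order (no update exclusion).
        for ctx in self._contexts(history):
            table = self.tables[ctx]
            table[symbol] = table.get(symbol, 0) + 1


def total_bits(data):
    """Sum of ideal code lengths for `data` (bytes) under the toy model."""
    model = Order20PPMC()
    bits = 0.0
    for i, sym in enumerate(data):
        bits += model.cost_bits(data[:i], sym)
        model.update(data[:i], sym)
    return bits
```

Calling `total_bits(b"abab" * 50)` gives far fewer than 8 bits per byte, because the order-2 contexts quickly become deterministic.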


  2. #2
    The Founder encode's Avatar
    This is just a DRAFT test version. In upcoming versions I'll:
    + Improve the literal/match length coding mechanism, leading to higher compression
    + Speed up the hashtable


  3. #3
    The Founder encode's Avatar
    TC 5.0dev6 on Calgary Corpus:

    bib: 30,729 bytes
    book1: 263,369 bytes
    book2: 177,833 bytes
    geo: 62,232 bytes
    news: 122,664 bytes
    obj1: 10,747 bytes
    obj2: 79,953 bytes
    paper1: 17,560 bytes
    paper2: 27,173 bytes
    pic: 53,819 bytes
    progc: 13,110 bytes
    progl: 15,541 bytes
    progp: 10,830 bytes
    trans: 17,318 bytes

    total: 902,878 bytes


  4. #4
    The Founder encode's Avatar
    TC 5.0dev6 on SFC (maximumcompression.com):

    A10.jpg: 859,025 bytes
    acrord32.exe: 1,692,298 bytes
    english.dic: 844,988 bytes
    FlashMX.pdf: 3,810,493 bytes
    fp.log: 677,336 bytes
    mso97.dll: 2,058,718 bytes
    ohs.doc: 866,934 bytes
    rafale.bmp: 1,092,785 bytes
    vcfiu.hlp: 731,924 bytes
    world95.txt: 661,748 bytes

    total: 13,296,249 bytes


  5. #5
    The Founder encode's Avatar
    TC 5.0dev6 on Canterbury Corpus:

    alice29.txt: 48,148 bytes
    asyoulik.txt: 42,770 bytes
    cp.html: 7,855 bytes
    fields.c: 3,120 bytes
    grammar.lsp: 1,232 bytes
    kennedy.xls: 129,290 bytes
    lcet10.txt: 122,069 bytes
    plrabn12.txt: 162,492 bytes
    ptt5: 53,819 bytes
    sum: 13,226 bytes
    xargs.1: 1,747 bytes

    total: 585,768 bytes


  6. #6
    The Founder encode's Avatar
    TC 5.0dev6 on Large Text Compression Benchmark:

    ENWIK8: 29,544,971 bytes
    ENWIK9: 257,416,397 bytes (compression: 279 sec, decompression: 279 sec)

    P4 3.0 GHz, 1 GB RAM, Windows XP SP2
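To put these numbers in more familiar units, here is a small sketch that converts them to bits per character and throughput. It relies on the known sizes of the test files (enwik8 is 10^8 bytes, enwik9 is 10^9 bytes). The function names are just illustrative.

```python
ENWIK8_SIZE = 100_000_000      # enwik8 is the first 10^8 bytes of the Wikipedia dump
ENWIK9_SIZE = 1_000_000_000    # enwik9 is the first 10^9 bytes


def bits_per_char(compressed_bytes, original_bytes):
    """Average number of output bits per input byte."""
    return 8 * compressed_bytes / original_bytes


def throughput_mb_s(original_bytes, seconds):
    """Processing speed in MB/s, with 1 MB taken as 10^6 bytes."""
    return original_bytes / seconds / 1_000_000


print(f"ENWIK8: {bits_per_char(29_544_971, ENWIK8_SIZE):.3f} bpc")   # about 2.36 bpc
print(f"ENWIK9: {bits_per_char(257_416_397, ENWIK9_SIZE):.3f} bpc")  # about 2.06 bpc
print(f"ENWIK9: {throughput_mb_s(ENWIK9_SIZE, 279):.2f} MB/s")       # about 3.58 MB/s
```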


  7. #7
    Guest
    Thanks!

    Will this version perform better on the SFC test?

  8. #8
    The Founder encode's Avatar
    Without a doubt, this version achieves higher compression than TC 5.0dev5.


  9. #9
    Guest
    My previous post should have read:

    Will this version perform better on the MFC test?


    Apologies for the error!

  10. #10
    The Founder encode's Avatar
    ...99.9 percent sure it will! Let's wait and see!

  11. #11
    Guest
    OK!

  12. #12
    The Founder encode's Avatar
    Well, after a few days of round-the-clock testing and experimenting, I've found that the current TC 5.0dev6 is good enough. It can achieve higher compression with a larger PPM hashtable and/or with different scaling, but the difference will be about 1-2% or less, if any. So for now I'm just collecting ideas...


  13. #13
    The Founder encode's Avatar
    Results for TC 5.0dev6x on the 'enwik' files. Note that, compared to dev6, this version uses a 16 MB PPM hashtable instead of 4 MB.

    ENWIK8
    TC 5.0dev6x: 28,990,965 bytes
    TC 5.0dev6: 29,544,971 bytes

    ENWIK9
    TC 5.0dev6x: 251,876,150 bytes
    TC 5.0dev6: 257,416,397 bytes
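As a quick check on how much the larger hashtable buys, the relative saving can be worked out directly from the four sizes above. The function name is only illustrative.

```python
def relative_saving_percent(old_bytes, new_bytes):
    """How much smaller new_bytes is than old_bytes, in percent."""
    return (old_bytes - new_bytes) / old_bytes * 100


enwik8_gain = relative_saving_percent(29_544_971, 28_990_965)
enwik9_gain = relative_saving_percent(257_416_397, 251_876_150)

print(f"ENWIK8: {enwik8_gain:.2f}% smaller")  # about 1.88%
print(f"ENWIK9: {enwik9_gain:.2f}% smaller")  # about 2.15%
```

So going from 4 MB to 16 MB saves roughly 2% on both files.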

    Also note that in some cases a larger hashtable can give lower compression. This is because a smaller hashtable can efficiently drop 'outdated' contexts, unlike a large one.
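The dropping effect can be illustrated with a toy direct-mapped context table. This is only a sketch under assumptions: the thread does not say how TC resolves collisions, so the replace-on-collision policy, the trivial "hash" (context value modulo table size) and all names here are hypothetical.

```python
class HashedContextTable:
    """Direct-mapped table: a colliding new context replaces the old one."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # each slot: (context, counts) or None

    def _index(self, context):
        # Toy hash: the context bytes read as an integer, modulo the table size.
        return int.from_bytes(context, "big") % len(self.slots)

    def update(self, context, symbol):
        i = self._index(context)
        entry = self.slots[i]
        if entry is None or entry[0] != context:
            # Empty slot or collision: drop the old statistics and start fresh.
            entry = (context, {})
            self.slots[i] = entry
        counts = entry[1]
        counts[symbol] = counts.get(symbol, 0) + 1

    def holds(self, context):
        entry = self.slots[self._index(context)]
        return entry is not None and entry[0] == context


def surviving_old_contexts(num_slots):
    """Count early contexts still in the table after later data was seen."""
    table = HashedContextTable(num_slots)
    old = [bytes([0, b]) for b in range(256)]  # contexts from the start of the input
    new = [bytes([1, b]) for b in range(256)]  # contexts from later, different data
    for ctx in old:
        table.update(ctx, ord("x"))
    for ctx in new:
        table.update(ctx, ord("y"))
    return sum(table.holds(ctx) for ctx in old)


print(surviving_old_contexts(256))     # 0: every old context was overwritten
print(surviving_old_contexts(65536))   # 256: every old context is still stored
```

In the small table the later contexts push out all of the early ones, which works as a crude form of forgetting. In the large table the early statistics stay around, which helps when the data is uniform but may not help when its character changes partway through.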

