
Thread: TC 5.1dev7x is here!

  1. #1
    encode
    The Founder
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Okay, here is a very special version of the TC file compressor. This one has a huge dictionary (512 MB). I mainly made this version for Squeeze Chart 2007...

    Enjoy!

    Link:
    tc-5.1dev7x.zip (41 KB)


  2. #2
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    That is awesome!!! Can't wait to try this one myself!

    Thanks Ilia!

  3. #3
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Very quick first test:

    A10.jpg > 830,453
    AcroRd32.exe > 1,305,803
    english.dic > 825,510
    FlashMX.pdf > 3,694,659
    FP.log > 585,245
    MSO97.dll > 1,712,306
    ohs.doc > 783,943
    rafale.bmp > 974,423
    vcfiu.hlp > 600,208
    world95.txt > 579,138

    Total = 11,891,688 Bytes

  4. #4
    encode
    The Founder
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    On small files (<64 MB), performance will be the same. This version has some advantage on large files! Malcolm Taylor describes the ROLZ algorithm as a fast, large-dictionary LZ. Indeed, this technique can cover a large dictionary both quickly and memory-efficiently. For example, a standard LZ match finder needs roughly:
    dictionary_size * 10
    bytes of memory. ROLZ, by contrast, can cover really large dictionaries with a small memory footprint. For standard LZ, just try multiplying 512 MB by 10 or 12...
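    The memory arithmetic above can be sketched as follows. This is a rough back-of-the-envelope estimate using the figures quoted in this thread (the 10x-12x multiplier and the "+16 MB" table mentioned later), not measured values:

    ```python
    # Rough memory estimates: classic large-dictionary LZ vs a ROLZ-style scheme.
    # The per-position overhead and the fixed table size are the figures quoted
    # in this thread, treated here as assumptions.

    MB = 1 << 20

    def standard_lz_memory(dict_size, bytes_per_pos=10):
        # Classic LZ77 match finders keep hash chains / binary trees:
        # several bytes of index structures per dictionary position.
        return dict_size * bytes_per_pos

    def rolz_memory(dict_size, table_size=16 * MB):
        # ROLZ keeps only a small table of recent offsets per context,
        # so memory is roughly the data buffer plus a fixed table.
        return dict_size + table_size

    d = 512 * MB
    print(f"standard LZ: {standard_lz_memory(d) // MB} MB")  # 5120 MB
    print(f"ROLZ-style : {rolz_memory(d) // MB} MB")         # 528 MB
    ```
    
    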

  5. #5
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by encode
    On small files (<64 MB), performance will be the same
    True. Not even a byte difference from dev7. But there is a small speed drop (about 25kB/s).

  6. #6
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Black_Fox
    True. Not even a byte difference from dev7. But there is a small speed drop (about 25kB/s).
    Are you going to compare it with dev7 on some very large (1 Gig +) files?

    I will test it myself over the next few days.

  7. #7
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Yep, I am... I don't know yet which files, but I will find some... maybe a text file of 80 million 'a' letters for a start
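    For anyone who wants to reproduce this, a trivial generator for such a test file might look like the following (the file name is hypothetical; writing in chunks keeps memory usage low):

    ```python
    # Generate a file of `size` repeated 'a' bytes, written chunk by chunk
    # so we never hold the whole 80 MB in memory at once.
    def make_test_file(path="aaa.txt", size=80_000_000, chunk=1 << 20):
        written = 0
        with open(path, "wb") as f:
            while written < size:
                n = min(chunk, size - written)
                f.write(b"a" * n)
                written += n
        return written
    ```
    
    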

  8. #8
    encode
    The Founder
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    One important thing about Squeeze Chart - it has many large sets with very similar files - DriversXP, FreeDB, etc.

    I'm also testing how far this new TC engine can really look. For example, I make a TAR with a large number of copies of one large file (>20 MB) and check the result - if the LZ engine is able to find the previous copy inside this TAR file, the compression becomes awesome!
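    The duplicate-copy experiment described above is easy to set up. A minimal sketch, with hypothetical file names (pack N copies of one file into a TAR, then compress the TAR; if the match finder can reach back a full copy, the archive should compress to roughly the size of one compressed copy):

    ```python
    # Build a TAR containing `copies` identical copies of one source file,
    # for testing how far back a compressor's match finder can reach.
    import os
    import tarfile

    def make_duplicate_tar(src, tar_path, copies=4):
        name = os.path.basename(src)
        with tarfile.open(tar_path, "w") as tar:
            for i in range(copies):
                # Distinct archive paths, identical file contents.
                tar.add(src, arcname=f"copy{i}/{name}")
    ```
    
    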


  9. #9
    Guest
    Why don't you create a compressor with a really big dictionary (512 MB or bigger) using ROLZ, when ROLZ needs only 2*dictionary of memory?

  10. #10
    encode
    The Founder
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Quote Originally Posted by thometal
    Why don't you create a compressor with a really big dictionary (512 MB or bigger) using ROLZ, when ROLZ needs only 2*dictionary of memory?
    TC 5.1dev7x has a 512 MB dictionary and uses a ROLZ-like algorithm. Also, for ROLZ, a memory usage of 2*dictionary is not mandatory. For example, in my implementations ROLZ uses dictionary + 16 MB...
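    To see why the extra memory is a small fixed table rather than a multiple of the dictionary, here is a minimal sketch of the ROLZ idea (not encode's actual code; the table depth and 1-byte context are assumptions): for each context, only the last few positions where that context occurred are remembered, and matches are searched only at those positions.

    ```python
    # Minimal ROLZ-style match finder sketch. Memory = data buffer plus a
    # fixed table of recent positions per 1-byte context, independent of
    # how far back matches may reach.
    from collections import deque

    POSITIONS_PER_CONTEXT = 16  # assumed table depth

    def make_tables():
        # One small deque of recent positions per possible context byte.
        return [deque(maxlen=POSITIONS_PER_CONTEXT) for _ in range(256)]

    def rolz_update(data, pos, tables):
        # Record `pos` under the context of its preceding byte.
        ctx = data[pos - 1] if pos > 0 else 0
        tables[ctx].appendleft(pos)

    def rolz_find_match(data, pos, tables, min_len=3):
        """Return (table_index, length) of the longest match at `pos`, or None."""
        ctx = data[pos - 1] if pos > 0 else 0
        best = None
        for idx, cand in enumerate(tables[ctx]):
            length = 0
            while (pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length >= min_len and (best is None or length > best[1]):
                best = (idx, length)
        return best
    ```

    Because a match is encoded as a small index into the context's table (plus a length) instead of a full offset, the offsets themselves can reach arbitrarily far back without growing the search structures.
    
    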


  11. #11
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quick test with ENWIK8...

    TC 5.1 dev7 > 27,934,960 bytes

    TC 5.1 dev7x > 27,888,899 bytes

  12. #12
    Guest
    Your ROLZ implementation does not really use a 512 MB dictionary - that is the block size. It is somewhat similar to LZRW. As the dictionary, only the last N positions for the currently active contexts are remembered.
    The test: coll.tar (1 file (12 MB), 2 files (20 MB), 1 file (12 MB))
    7-Zip (16 MB) > 43.7 MB
    7-Zip (64 MB) > 31.8 MB
    TC 5.1dev7x > 43.5 MB

    P.S. ROLZ detects this and compresses it down to 31.2 MB

  13. #13
    encode
    The Founder
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    I guess I know how my program works...

    Actually, how far my ROLZ can look depends on the data type. With hardly compressible data the distance is smallest; with well compressible data it is largest. Theoretically, the whole dictionary can be covered; in practice, however, the distance is smaller. Approximate values:
    16...32 MB with already compressed data
    32...64 MB with common data
    64...512 MB with highly compressible data


