
Thread: Paq8o10t

  1. #1
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    377
    Thanks
    139
    Thanked 198 Times in 108 Posts

    Paq8o10t

    Text detection (partial UTF-8).
    nestModel from paq8g.
    Modified contextModel2 like this:
    Code:
        switch (filetype)
        {
        case TXTUTF8:
        case TEXT: {
                sparseModel(m,ismatch,order);
                nestModel(m);
                wordModel(m);
                indirectModel(m);
                dmcModel(m);
                break;
            }
        case EXE: {
                sparseModel(m,ismatch,order);
                indirectModel(m);
                dmcModel(m);
                exeModel(m);
                break;
            }
        case BMPFILE1: break;
        default: {
                sparseModel(m,ismatch,order);
                distanceModel(m);
                picModel(m);
                recordModel(m);
                indirectModel(m);
                dmcModel(m);
                break;
            }
        }
    Speed is better in most cases. Compression is the same on some data and worse on other data.

    Word model modifications are based on paq8hp12.
    Memory usage is increased.
    Last edited by kaitz; 11th June 2008 at 14:35.
    KZo


  2. #2
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts


    Thanks kaitz!

    Mirror: Download
    Last edited by LovePimple; 11th June 2008 at 13:52.

  3. #3
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    377
    Thanks
    139
    Thanked 198 Times in 108 Posts


    Code:
    paq8o10t.exe -4 A10.jpg      842468 -> 640447   Time 27.69 sec,   used 130692920 MEM
    paq8o9.exe   -4 A10.jpg      842468 -> 640447   Time 27.25 sec,   used 147485800 MEM
    paq8o10t.exe -4 AcroRd32.exe 3870784 -> 942896  Time 357.25 sec,  used 135288361 MEM
    paq8o9.exe   -4 AcroRd32.exe 3870784 -> 931760  Time 463.66 sec,  used 134210601 MEM
    paq8o10t.exe -4 english.dic  4067439 -> 391480  Time 435.09 sec,  used 128946859 MEM
    paq8o9.exe   -4 english.dic  4067439 -> 390358  Time 423.70 sec,  used 127873259 MEM
    paq8o10t.exe -4 FlashMX.pdf  4526946 -> 3563114 Time 454.69 sec,  used 148559391 MEM
    paq8o9.exe   -4 FlashMX.pdf  4526946 -> 3563934 Time 519.47 sec,  used 147485791 MEM
    paq8o10t.exe -4 FP.LOG       20617071 -> 269979 Time 2309.97 sec, used 128946868 MEM
    paq8o9.exe   -4 FP.LOG       20617071 -> 275173 Time 2354.98 sec, used 127873268 MEM
    paq8o10t.exe -4 MSO97.DLL    3782416 -> 1340660 Time 350.36 sec,  used 149620515 MEM
    paq8o9.exe   -4 MSO97.DLL    3782416 -> 1328473 Time 510.41 sec,  used 148546915 MEM
    paq8o10t.exe -4 ohs.doc      4168192 -> 490089  Time 143.16 sec,  used 148559399 MEM
    paq8o9.exe   -4 ohs.doc      4168192 -> 487629  Time 156.78 sec,  used 147485799 MEM
    paq8o10t.exe -4 rafale.bmp   4149414 -> 551463  Time 59.91 sec,   used 116595869 MEM
    paq8o9.exe   -4 rafale.bmp   4149414 -> 551466  Time 64.48 sec,   used 133388749 MEM
    paq8o10t.exe -4 vcfiu.hlp    4121418 -> 413225  Time 373.06 sec,  used 128946863 MEM
    paq8o9.exe   -4 vcfiu.hlp    4121418 -> 405263  Time 467.00 sec,  used 127873263 MEM
    paq8o10t.exe -4 world95.txt  2988578 -> 367188  Time 368.76 sec,  used 128946859 MEM
    paq8o9.exe   -4 world95.txt  2988578 -> 370766  Time 357.42 sec,  used 127873259 MEM
    Encode's Compression Corpus (EncCC)
    Code:
    paq8o10t.exe -4 Doom3.exe   5427200 -> 1060739  Time 496.59 sec, used 130007983 MEM
    paq8o9.exe   -4 Doom3.exe   5427200 -> 1040972  Time 631.25 sec, used 128934383 MEM
    paq8o10t.exe -4 Reaktor.exe 14446592 -> 1212197 Time 1195.41 sec, used 130007978 MEM
    paq8o9.exe   -4 Reaktor.exe 14446592 -> 1185980 Time 1495.48 sec, used 128934378 MEM
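
    To read the size and time trade-off off the tables above, the relative change per file can be computed as below. This is only a sketch of the arithmetic; the struct and function names are made up for illustration and are not part of paq8.

```cpp
#include <cstdio>

// One benchmark row: compressed size in bytes, time in seconds.
struct Run {
    double compressedBytes;
    double seconds;
};

// Percentage change of `now` relative to `before`.
// Negative means smaller (for sizes) or faster (for times).
double percentChange(double before, double now) {
    return (now - before) / before * 100.0;
}

// Print size and time deltas of paq8o10t relative to paq8o9 for one file.
void report(const char* file, Run paq8o9, Run paq8o10t) {
    std::printf("%-14s size %+6.2f%%  time %+6.2f%%\n", file,
                percentChange(paq8o9.compressedBytes, paq8o10t.compressedBytes),
                percentChange(paq8o9.seconds, paq8o10t.seconds));
}
```

    For example, `report("AcroRd32.exe", {931760, 463.66}, {942896, 357.25});` shows paq8o10t producing output about 1.2% larger while taking about 23% less time on that file.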
    Last edited by kaitz; 11th June 2008 at 14:32.
    KZo


  4. #4
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    I think we should do some advanced custom model set switching. i.e. detecting file-type dynamically and choosing model sets accordingly. For example, as with PAQ6 we may check for recent 0x00 (zero byte) or 0x20 (space character) to determine TEXT/BINARY data.
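
    A minimal sketch of what such dynamic detection could look like, assuming a sliding window over the most recent bytes. The class name, window size and thresholds are hypothetical choices for illustration; this is not the actual PAQ6 or paq8 logic.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical illustration: guess TEXT vs BINARY from the last `window` bytes
// by counting zero bytes (0x00) and spaces (0x20). Not taken from PAQ6 or paq8.
enum class DataKind { Text, Binary };

class RecentByteDetector {
public:
    explicit RecentByteDetector(std::size_t window = 256) : window_(window) {}

    // Feed one byte and keep the counters in sync with the sliding window.
    void update(std::uint8_t byte) {
        recent_.push_back(byte);
        count(byte, +1);
        if (recent_.size() > window_) {
            count(recent_.front(), -1);
            recent_.pop_front();
        }
    }

    // Assumed rule: any recent zero byte suggests binary data; otherwise
    // at least one space per 16 recent bytes suggests text.
    DataKind guess() const {
        if (zeros_ > 0) return DataKind::Binary;
        if (!recent_.empty() && spaces_ * 16 >= static_cast<long>(recent_.size()))
            return DataKind::Text;
        return DataKind::Binary;
    }

private:
    void count(std::uint8_t byte, int delta) {
        if (byte == 0x00) zeros_ += delta;
        if (byte == 0x20) spaces_ += delta;
    }

    std::size_t window_;
    std::deque<std::uint8_t> recent_;
    long zeros_ = 0;
    long spaces_ = 0;
};
```

    A compressor could call `update()` for every byte and consult `guess()` at block boundaries to pick a model set, at the cost of the decoder having to make the same decision from already-decoded data.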

  5. #5
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thumbs up

    Here's my own speed-optimised compile for Pentium Pro or later processors.

    ENJOY!

  6. #6
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    377
    Thanks
    139
    Thanked 198 Times in 108 Posts
    Code:
    paq8o10t.exe -4 Mech8.s3m 747600 -> 296673 Time 69.67 sec, 128946864 MEM
    paq8o9.exe   -4 Mech8.s3m 747600 -> 295853 Time 83.09 sec, 127873264 MEM
    paq8o10t.exe -4 PariahInterface.utx 24375895 -> 3814862 Time 2093.17 sec, 111080362 MEM
    paq8o9.exe   -4 PariahInterface.utx 24375895 -> 3803631 Time 2650.56 sec, 127873242 MEM
    KZo


  7. #7
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    hi lovepimple

    unfortunately F-Secure AntiVirus & AntiSpy
    detects your implementation "paq8o10tlp"
    as W32/Suspicious U.gen (virus)

    the original "paq8o10t" has no such effect.

    can you tell us what the secret of your "implementation"/"compile" is?

    which modifications have you made?

  8. #8
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    377
    Thanks
    139
    Thanked 198 Times in 108 Posts
    Quote Originally Posted by joerg View Post
    hi lovepimple

    unfortunately F-Secure AntiVirus & AntiSpy
    detects your implementation "paq8o10tlp"
    as W32/Suspicious U.gen (virus)

    the original "paq8o10t" has no such effect.

    can you tell us what the secret of your "implementation"/"compile" is?

    which modifications have you made?
    My Bitdefender did not detect anything.
    W32/Suspicious U.gen (virus)
    KZo


  9. #9
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by joerg View Post
    hi lovepimple

    unfortunately F-Secure AntiVirus & AntiSpy
    detects your implementation "paq8o10tlp"
    as W32/Suspicious U.gen (virus)

    the original "paq8o10t" has no such effect.

    can you tell us what the secret of your "implementation"/"compile" is?

    which modifications have you made?
    It's a false positive. F-Secure AntiVirus & AntiSpy detects it as a Upack-compressed file but can't decompress it, so it cheats by reporting "W32/Suspicious U.gen (virus)".

    See this thread for more compile info.
    http://www.encode.ru/forum/showthread.php?t=65

  10. #10
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    thank you lovepimple

    if i understand correctly:
    you don't know which compiler switches were used
    for the resulting compile?

  11. #11
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Correct!

  12. #12
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    hi lovepimple

    can you please help me to avoid confusion
    about compression level and memory usage:

    without compression: -0 = ???? MB

    fast compression: -1 = 35 MB, -2 = 48 MB, -3 = 59 MB, -4 = 133 MB

    standard mode: -5 = 233 MB

    better compression: -6 = 435 MB, -7 = 837 MB, -8 = 1643 MB, -9 = ???? MB

    a) is this right ?
    b) which amount of memory is used for -9 ?
    c) which amount of memory is used for -0 ?

  13. #13
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    a) Yes, but memory usage is slightly increased for this latest version.

    b) The -9 option would need about 3290 MB.

    c) About 0.5 MB.

  14. #14
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    AFAIK there is no -9 option.

    Anyway, paq8o10t now has the best enwik8 compression for a non-dictionary based program (paq8hp* and durilca* use dictionaries).

    http://cs.fit.edu/~mmahoney/compression/text.html#1323
    http://cs.fit.edu/~mmahoney/compression/#paq

    I didn't test enwik9. A test would take over 3 days. enwik8 ran overnight (8 hours to compress and decompress).

  15. #15
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Matt Mahoney View Post
    AFAIK there is no -9 option.
    That's what I thought until I tried the -9 option and it returned with an 'out of memory' error.

    From the paq8o10t source code:
    Code:
     
    int main(int argc, char** argv) {
      bool pause=argc<=2;  // Pause when done?
      try {
    
        // Get option
        bool doExtract=false;  // -d option
        if (argc>1 && argv[1][0]=='-' && argv[1][1] && !argv[1][2]) {
          if (argv[1][1]>='0' && argv[1][1]<='9')
            level=argv[1][1]-'0';
          else if (argv[1][1]=='d')
            doExtract=true;
          else
            quit("Valid options are -0 through -9 or -d\n");
          --argc;
          ++argv;
          pause=false;
        }
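
    To make the accepted options easy to check, here is a self-contained restatement of the test in the excerpt above. `parseOption` and `Option` are hypothetical names, and unlike the original (which treats a non-matching first argument as a file name and calls quit() only for an unknown two-character option) this sketch simply returns an empty result for anything that is not a valid option.

```cpp
#include <optional>

// Hypothetical, self-contained restatement of the check in the excerpt above:
// a valid option is exactly two characters, '-' followed by a digit or 'd'.
struct Option {
    int level;       // compression level, meaningful when doExtract is false
    bool doExtract;  // true for -d
};

// Returns the parsed option, or std::nullopt if `arg` is not a valid option.
std::optional<Option> parseOption(const char* arg) {
    if (!arg || arg[0] != '-' || !arg[1] || arg[2]) return std::nullopt;
    if (arg[1] >= '0' && arg[1] <= '9') return Option{arg[1] - '0', false};
    if (arg[1] == 'd') return Option{0, true};
    return std::nullopt;
}
```

    With this, `parseOption("-9")` yields level 9, which matches the behaviour described above: the parser accepts -9 and the failure only shows up later as an out-of-memory error.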

  16. #16
    Member
    Join Date
    May 2008
    Location
    Estonia
    Posts
    377
    Thanks
    139
    Thanked 198 Times in 108 Posts


    BTW Matt, you have a mistake on your webpage in the History section.
    Code:
     paq8o10z is by KZ, June 11, 2008. Compression .........
    It is t, not z.
    KZo


  17. #17
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @lovepimple
    "That's what I thought until I tried the -9 option and
    it returned with an 'out of memory' error."

    microsoft says:
    win32 - The virtual address space of processes and applications
    is still limited to 2 GB unless the /3GB switch is used in the Boot.ini file.

    Maybe that is why the program reports "out of memory"?
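
    A rough way to check this against the memory figures quoted earlier in the thread (-1 through -8, plus the estimate of about 3290 MB for -9). The function names are made up for illustration, and the check is optimistic because it ignores the executable image, heap overhead and address space fragmentation.

```cpp
// Approximate memory use in MB per level, as quoted earlier in the thread.
// Index 0 is unused here (the thread gives "about 0.5 MB" for -0).
constexpr int kLevelMB[10] = {0, 35, 48, 59, 133, 233, 435, 837, 1643, 3290};

// True if the model for `level` (1..9) fits in `addressSpaceMB` of user
// address space. Optimistic: nothing else in the process is accounted for.
constexpr bool fitsInAddressSpace(int level, int addressSpaceMB) {
    return level >= 1 && level <= 9 && kLevelMB[level] <= addressSpaceMB;
}

// Highest level (1..9) that fits, or 0 if none does.
constexpr int highestLevelThatFits(int addressSpaceMB) {
    int best = 0;
    for (int level = 1; level <= 9; ++level)
        if (fitsInAddressSpace(level, addressSpaceMB)) best = level;
    return best;
}
```

    With these figures, `highestLevelThatFits(2048)` and `highestLevelThatFits(3072)` both give 8, so on 32-bit Windows -9 would not fit even with the /3GB switch.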

    the compression-ratio-result is awesome

    an Oracle dump of 648,331,264 bytes
    is compressed (with -7) to 9,714,384 bytes
    7zip compresses it (with -mx=9) to 35,151,362 bytes
    rings 1.5c compresses it to 26,580,783 bytes

    but paq8o10tlp needs 24 hours
    7zip needs 0.5 hours
    rings 1.5c needs 2 minutes

    for me it runs on Windows Server 2003 with 2x Xeon 2.8 GHz and 4 MB

    in practice i use 7zip because it has full directory support

    @kz
    the resulting compression-ratio is awesome

    especially i want to remark that this program
    does not block the whole system
    - that is wonderful
    that means i can work on the system
    with another program with lower requirements at the same time

    1. is there any chance to modify paq8o10t
    to use two processors or two cores?
    2. is there any chance to modify paq8o10t
    to compress a complete directory with subdirectories,
    including storing the paths and filenames?

    best regards Joerg

  18. #18
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts


    This build coaxes a little more speed from paq8o10t. I have also removed the obsolete -9 option from this release.

    ENJOY!

  19. #19
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Some timings from my AMD Sempron 2400+ machine:

    paq8o10t -4 world95.txt
    Time 988.53 sec, used 128946859 bytes of memory
    text 2988578

    paq8o10tlp -4 world95.txt
    Time 923.33 sec, used 128946859 bytes of memory
    text 2988578

    paq8o10tlp2 -4 world95.txt
    Time 891.58 sec, used 128946859 bytes of memory
    text 2988578

  20. #20
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by kaitz View Post
    BTW Matt, you have a mistake on your webpage in the History section.
    Code:
     paq8o10z is by KZ, June 11, 2008. Compression .........
    It is t, not z.
    He probably got confused between paq8o8z and your paq8o10t. (BTW, his whole site seems down, ATM. Weird.)

  21. #21
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LovePimple View Post
    This build coaxes a little more speed from paq8o10t. I have also removed the obsolete -9 option from this release.

    ENJOY!
    Dare I ask, but what compiler did you use? It's not MinGW nor OpenWatcom (unless you used some kind of external tool). BTW, MinGW stuff seems to hate running on non-admin cpus (e.g. Vista, "tmpfile: access denied"). Bleh.

  22. #22
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    Even though the main site is down, you can use the (usually outdated, but not now) mirror.

    Also, I have tested PAQ8o10t - nice speedup.
    Last edited by Black_Fox; 16th June 2008 at 00:03.
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time? I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  23. #23
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts


    Thanks BF!

  24. #24
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Black_Fox View Post
    Even though the main site is down, you can use the (usually outdated, but not now) mirror.
    His main site is back up now.

    BTW, no answer re: my compiler question, LovePimple? Also, you didn't mirror paq8o8z (bah). Anyways, don't forget that Geocities has an hourly bandwidth limit for downloads (5 MB, IIRC). You may wish to get a Google Pages site instead (100 MB vs. wimpy 15 MB storage, anyways).

  25. #25
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Rugxulo View Post
    BTW, no answer re: my compiler question, LovePimple?
    That's because this question has already been answered in my reply to joerg's post above. For the record, it's GCC 4.3.0.


    Quote Originally Posted by Rugxulo View Post
    Also, you didn't mirror paq8o8z (bah).
    I only mirror the latest versions. AFAIC paq8o8 is history.


    Quote Originally Posted by Rugxulo View Post
    Anyways, don't forget that Geocities has an hourly bandwidth limit for downloads (5 MB, IIRC). You may wish to get a Google Pages site instead (100 MB vs. wimpy 15 MB storage, anyways).
    I'm well aware of the bandwidth limit, but I wouldn't swap to Google Pages ATM because I have been using Geocities for many years, and it's always been a very reliable service. I will probably swap to GP sometime in the future.

  26. #26
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Quote Originally Posted by Rugxulo View Post
    He probably got confused between paq8o8z and your paq8o10t. (BTW, his whole site seems down, ATM. Weird.)
    OK, it is fixed now, and site is back up.

    Also I ran some tests of durilca4linux_3 v3 with 2 GB. It still beats paq8hp12.

  27. #27
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LovePimple View Post
    That's because this question has already been answered in my reply to joerg's post above. For the record, it's GCC 4.3.0.
    Sorry, I didn't read the other thread.

    Quote Originally Posted by LovePimple View Post
    I only mirror the latest versions. AFAIC paq8o8 is history.
    And paq8o9 isn't? What about PKZIP 2.50/DOS ? Bzip 1.0.4 ? Whatever, do what you want, it's your site. (BTW, there's a DOS/DJGPP port of p7zip or you could run Win32's faster 7ZA under HXRT.)

    Quote Originally Posted by LovePimple View Post
    I'm well aware of the bandwidth limit, but I wouldn't swap to Google Pages ATM because I have been using Geocities for many years, and it's always been a very reliable service. I will probably swap to GP sometime in the future.
    They aren't mutually exclusive. I think you can have both. (Mirror that mirror!) BTW, Geocities has been owned by Yahoo! for quite a while, and I don't think they've increased the space storage in 10 years!

  28. #28
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Rugxulo View Post
    And paq8o9 isn't? What about PKZIP 2.50/DOS ? Bzip 1.0.4 ?
    PKZIP v2.50 is the latest version for DOS. The other (older) files are there for good reason, but I don't intend sending everyone to sleep with the explanation.


    Quote Originally Posted by Rugxulo View Post
    Whatever, do what you want, it's your site.
    Exactly!

  29. #29
    Member
    Join Date
    Jun 2008
    Location
    USA
    Posts
    111
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Rugxulo View Post
    BTW, MinGW stuff seems to hate running on non-admin cpus (e.g. Vista, "tmpfile: access denied"). Bleh.
    Apparently, MS's implementations (in MSVCRT.DLL ??) of tmpfile() and tmpfile_s() both place their files in the root directory, which a non-admin user typically cannot write to. You have to use something else like tmpnam_s() instead. (Everybody here using MinGW or similar should be aware of this.)
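
    One possible workaround, sketched with C++17 std::filesystem rather than the tmpnam_s() route mentioned above: open the scratch file under the system temp directory yourself. `openScratchFile` and its naming scheme are hypothetical, and unlike tmpfile() the file is not deleted automatically.

```cpp
#include <cstdio>
#include <filesystem>
#include <string>

// Hypothetical helper: open a read/write scratch file under the system temp
// directory instead of relying on tmpfile(). The file name is built from a
// caller-supplied tag. Unlike tmpfile(), the file is NOT removed on close,
// so the caller should delete the path reported through `outPath`.
std::FILE* openScratchFile(const std::string& tag, std::filesystem::path* outPath) {
    std::filesystem::path p =
        std::filesystem::temp_directory_path() / ("scratch_" + tag + ".tmp");
    if (outPath) *outPath = p;
    return std::fopen(p.string().c_str(), "w+b");
}
```

    The fixed name means two processes using the same tag would collide, so a real program would want to mix in something unique such as a process ID or a counter.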

  30. #30
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts


    Thanks for the info.
