Thread: PAQ8Q

  1. #61
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Simon!

    Compiled...

    EDIT: Attachments "paq8q_v12.zip" and "paq8q_v12_sse2.zip" removed.

  2. #62
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Hi,
    I think something is not right with some kinds of files in paq8qv11.
    My WAV test files, and files that contain WAV data, are compressed worse than in v10. Examples:

    file: 0.wav, 8 bit mono wave, original 2762044
    paq8qv10 - 1332422
    paq8qv11 - 1332436 - small difference but...

    file: l.pak, container with different kinds of data, contains about 90 wav files - mostly 8 bit mono, original 7764079
    paq8qv10 - 2696259
    paq8qv11 - 2818825 - a 4.5% difference

    Darek

  3. #63
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Do both produce a bit-identical output file?

  4. #64
    Programmer Jan Ondrus
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    A wrong block size was written for image/audio blocks.

    paq8q_v13
    - fixed wrong block size
    - some changes in comparing
    - percentage for decompression/comparing
    Attached Files
    Last edited by Jan Ondrus; 8th June 2009 at 21:22.

  5. #65
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Jan!

    Compiled...
    Attached Files

  6. #66
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Hi!
    Thanks for fixing this!
    Now it is OK.

    I have a question about the exe filter used in this version (I tried asking on the paq8px forum, but nobody answered): should the exe filter recognise any kind of exe file, or only selected types, exe/dll headers, or some special cases?
    In my testbed, 3 of the 4 exe files are not recognised by the exe filter, not even in part, and are compressed entirely as default data, which is less effective.

    Darek

  7. #67
    Programmer Jan Ondrus
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    Quote: Originally posted by Darek
    Hi!
    Thanks for fixing this!
    Now it is OK.

    I have a question about the exe filter used in this version (I tried asking on the paq8px forum, but nobody answered): should the exe filter recognise any kind of exe file, or only selected types, exe/dll headers, or some special cases?
    In my testbed, 3 of the 4 exe files are not recognised by the exe filter, not even in part, and are compressed entirely as default data, which is less effective.

    Darek
    Headers are not used for exe detection. It actually detects x86 code: it looks for CALL (0xe8), JMP (0xe9) and 0x0f80..0x0f8f (conditional jump) instructions and tries to guess whether the relative jump address (the next 4 bytes) would occur more often in the file if converted to absolute.
    I don't know why your 3 test files aren't detected. Aren't they compressed with some executable packer (e.g. UPX)?
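
    To illustrate the relative-to-absolute conversion described above, here is a minimal sketch. It is not the paq8 source: the helper name convertE8E9 is made up for this example, it handles only CALL (0xe8) and JMP (0xe9), and it converts every match unconditionally, whereas the real filter first guesses whether the conversion pays off.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from paq8): rewrites the 32-bit operand of every
// CALL (0xE8) / JMP (0xE9) between relative and absolute form, in place.
void convertE8E9(std::vector<std::uint8_t>& buf, bool toAbsolute) {
    for (std::size_t i = 0; i + 5 <= buf.size(); ) {
        if (buf[i] != 0xE8 && buf[i] != 0xE9) { ++i; continue; }
        // Read the little-endian 32-bit operand that follows the opcode.
        std::uint32_t v = static_cast<std::uint32_t>(buf[i + 1])
                        | static_cast<std::uint32_t>(buf[i + 2]) << 8
                        | static_cast<std::uint32_t>(buf[i + 3]) << 16
                        | static_cast<std::uint32_t>(buf[i + 4]) << 24;
        // x86 relative targets are measured from the next instruction (i + 5).
        const std::uint32_t next = static_cast<std::uint32_t>(i + 5);
        v = toAbsolute ? v + next : v - next;
        buf[i + 1] = static_cast<std::uint8_t>(v);
        buf[i + 2] = static_cast<std::uint8_t>(v >> 8);
        buf[i + 3] = static_cast<std::uint8_t>(v >> 16);
        buf[i + 4] = static_cast<std::uint8_t>(v >> 24);
        i += 5;  // skip the operand so both directions scan the same positions
    }
}
```

    Two calls to the same function have different relative operands but the same absolute one, so after conversion the repeated 4-byte pattern is easier for the model to predict. Calling the helper again with toAbsolute set to false restores the original bytes.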

  8. #68
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Jan Ondrus wrote:
    Aren't they compressed with some executable packer (UPX)?

    Thanks for the answer. I don't know whether my files are internally compressed. It's possible. These files are original executables of old applications.

    I am posting one example.
    Darek
    Attached Files
    • File Type: rar H.rar (725.2 KB, 318 views)

  9. #69
    Programmer Jan Ondrus
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    Quote: Originally posted by Darek
    I post one example.
    Darek
    This file uses 16-bit addresses. Only 32-bit addresses can be converted to absolute by the paq exe filter.

  10. #70
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Thanks.
    Is converting only 32-bit addresses a general principle of exe compression, or is it specific to the filter used in paq?

    Darek

  11. #71
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    It's a pity that paq8q isn't being used or tested (at the moment). There is a general bug in all modes except -m6. It is fixed for the next version.
    Last edited by Simon Berger; 10th June 2009 at 15:58.

  12. #72
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Here is an improvement to XML/HTML compression. While the gain in compression ratio is only small, the time improvement should be noticeable.
    It is an element precompression and works for all elements, with or without keys/parameters.

    These work:
    Code:
    <id>9</id>
    <id key="0">9</id>
    <id key="1" />
    Left untouched (no problems there, just no improvement):
    Code:
    <id >9<id>
    <id key=0>8</id>
    <id key="1"/>
    So, as you can see, it precompresses only well-formatted elements.
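
    As a rough illustration of that rule, the sketch below classifies a single tag as "well formatted" or not, based only on the six examples above. It is not the paq8q_v14 code: the function names and the exact grammar (name characters, a single space before each key, a quoted value, an optional " />" ending) are assumptions made for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace {
// Consumes one or more name characters starting at pos; false if there are none.
bool readName(const std::string& s, std::size_t& pos) {
    const std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) ||
            s[pos] == '_' || s[pos] == '-' || s[pos] == ':'))
        ++pos;
    return pos > start;
}
}  // namespace

// Hypothetical check: accepts <name>, </name>, <name key="v"> and <name key="v" />.
bool isWellFormattedTag(const std::string& s) {
    std::size_t pos = 0;
    if (s.empty() || s[pos++] != '<') return false;
    if (pos < s.size() && s[pos] == '/') {  // closing tag: </name>
        ++pos;
        return readName(s, pos) && pos + 1 == s.size() && s[pos] == '>';
    }
    if (!readName(s, pos)) return false;
    // Zero or more attributes, each written exactly as: space, key, =, "value"
    while (pos + 1 < s.size() && s[pos] == ' ' && s[pos + 1] != '/') {
        ++pos;
        if (!readName(s, pos)) return false;            // rejects "<id >"
        if (s.compare(pos, 2, "=\"") != 0) return false; // rejects key=0
        pos += 2;
        const std::size_t close = s.find('"', pos);
        if (close == std::string::npos) return false;
        pos = close + 1;
    }
    // The tag must end with ">" or with " />" (the space is required here).
    return s.compare(pos, std::string::npos, ">") == 0 ||
           s.compare(pos, std::string::npos, " />") == 0;
}
```

    Anything the check rejects would simply be passed through unchanged, which matches the "no problems, just no improvement" behaviour described above.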

    Benchmark on enwik8:
    Code:
    original: 100,000,000 bytes
    precompressed: 98,125,260 bytes
    paq8q_v13: 17,733,056 bytes // without the precompression, of course
    paq8q_v14_beta: 17,718,092 bytes // including precompression
    No timings here. They should be done later with a fast build that is the same for both.

    I have to clean the code before releasing and make it more robust against malformed elements.

    If someone outside paq is interested in this, I could create a general class. It shouldn't break any context and should work well for LZ-based compressors too. It would be interesting to compare it to other XML preprocessors too, but I haven't stumbled across anything similar yet.

    EDIT:
    By the way, thanks for your additional information and corrections to the supported formats, Jan. I added them for the next release. I really wasn't sure about the P(X)M part until now. Somehow I missed your post previously.
    Last edited by Simon Berger; 20th June 2009 at 20:32.

  13. #73
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    I found a very nice project that automatically generates valid XML files http://www.xml-benchmark.org/.
    With it I have now tested a 30 MB "real" XML file. "Real" means XML used as a database, with mostly short values and only some longer ones. The results are much better.

    This generator created tags like

    Code:
    <id key="1"/>
    which I didn't understand. But I decided to support this form in addition to the one with a space.

    Benchmark:
    Code:
    original: 35,702,033 bytes
    precompressed: 28,145,890 bytes
    paq8px_v60: 4,736,835 bytes
    paq8q_v14_test: 4,205,636 bytes
    
    (Decompressed output was verified to be bit identical)
    I used this command to generate the file:

    Code:
    win32.exe /f 0.3 /o outfile.xml
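
    For anyone who wants to turn the byte counts above into relative savings, here is a small helper. The name savingPercent is made up for this example; it is only a reporting aid and has nothing to do with the compressor itself.

```cpp
#include <cstdint>

// Relative size reduction of `after` compared with `before`, in percent.
double savingPercent(std::uint64_t before, std::uint64_t after) {
    return 100.0 * (static_cast<double>(before) - static_cast<double>(after)) /
           static_cast<double>(before);
}
```

    With the numbers from this thread, savingPercent(4736835, 4205636) comes out at roughly 11.2% for outfile.xml, while savingPercent(17733056, 17718092) is only about 0.08% for enwik8.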

  14. #74
    Member
    Join Date
    Aug 2008
    Location
    Saint Petersburg, Russia
    Posts
    215
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Code:
    paq8px_v60: 4,736,835 bytes
    paq8q_v14_test: 4,205,636 bytes
    Wow.
    That's a lot.

  15. #75
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    For those looking forward to XML/text improvements: I only work on this for a few minutes per day. One problem was proper XML tag parsing/validation without using many lines of code.

    I have tested some XML compressors and preprocessors, but none of them showed great results.
    Somehow I missed the well-known XML-WRT, which beats my precompressor by a wide margin. The biggest reason is its dynamic dictionary, which also works on normal text files.

    Here are the results of WRT + paq8px_v60 on both files I tested in the previous posts.

    Code:
    Enwik8:
    wrt+paq8px_v60: 17,372,979 bytes
    
    outfile.xml:
    paq8px_v60: 4,736,835 bytes
    paq8q_v14_test: 4,205,636 bytes
    wrt+paq8px_v60: 3,640,092 bytes // !!!!!!!!!!!!!!!!
    I have added some ideas to my implementation, but it was still too far behind WRT. Because this dynamic dictionary reduces file size HEAVILY, I decided to include WRT or a self-made version in paq. Compression of enwik8 should be reduced by ~80%. I don't have the exact preprocessed file size, but I seem to remember it was under 50 MB (~44 MB).

  16. #76
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    @Simon:

    Do you plan to release any builds of paq8q_v14_test for testing purposes?

    Darek

  17. #77
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Yes, I will once I have included WRT. WRT is a really massive addition, with half as many lines of code as the whole paq project. But I am going to delete all code not needed for paq and change the coding style.
    In the end paq8q_v14 will have two source files. In my opinion it is no longer feasible to let the paq source grow and grow. On the other hand, anyone who wants a single file can still get one simply by leaving this text preprocessing out.

    I can't write my own XML preprocessing when I know there is something this much better out there, even if it is so much bigger. It will make PAQ compression much faster, and I have another idea, similar to LZP preprocessing (paq9), that shouldn't have the bad side effect of reduced compression while still being much faster.

  18. #78
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    909
    Thanks
    531
    Thanked 359 Times in 267 Posts
    Thanks Simon.
    Then I'll wait for the work to be finished and released, and of course I'll follow the progress.
    Regards, Darek

  19. #79
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    I am currently not working on paq8 or on the things I want to do for paq8q, but some time ago I fixed some serious problems. I had wanted to include those fixes in the XML precompression update, which will come at a later time.

    I hope that this is a stable release and that it will be tested in some benchmarks (maximumcompression...).
    It is also at the feature level of paq8px_v60.

    Changelog

    Code:
    Fixed a bug that kept modes 2-5 from working at all
    Fixed a bug in comparing
    Changed and fixed small cosmetic things
    Attached Files

  20. #80
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Simon!

    Compiled...
    Attached Files

  21. #81
    Programmer Jan Ondrus
    Join Date
    Sep 2008
    Location
    Rychnov nad Kněžnou, Czech Republic
    Posts
    278
    Thanks
    33
    Thanked 137 Times in 49 Posts
    Bug: comparing shows the "Files are equal. No difference found." message when the compressed file is shorter (the file being compared against is longer).

    paq8q:
    Code:
    if (mode==FCOMPARE && !diffFound)
      printf("Files are equal. No difference found.\n");
    else if (mode==FCOMPARE)
      printf("First difference found at file offset: %u\n", diffFound-1);
    else
      printf("done        \n");
    paq8px_v60:
    Code:
    if (mode==FCOMPARE && !r && getc(f)!=EOF)
      printf("file is longer\n");
    else if (mode==FCOMPARE && r)
      printf("differ at %d\n",r-1);
    else if (mode==FCOMPARE)
      printf("identical\n");
    else
      printf("done   \n");
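
    The point of the paq8px_v60 variant is that "no differing byte found" and "same length" are two separate checks. A minimal, self-contained sketch of that idea follows. It works on in-memory strings rather than on the archive and a FILE*, and the type and function names are invented for this example, so it is not a drop-in patch for either program.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

enum class CompareStatus { Identical, Differ, ReferenceLonger, ReferenceShorter };

struct CompareResult {
    CompareStatus status;
    std::size_t offset;  // first differing offset, or the length of the common prefix
};

// `decoded` is the decompressed data, `reference` the file it is compared against.
CompareResult compareData(const std::string& decoded, const std::string& reference) {
    const std::size_t n = std::min(decoded.size(), reference.size());
    for (std::size_t i = 0; i < n; ++i)
        if (decoded[i] != reference[i]) return {CompareStatus::Differ, i};
    // All common bytes match: only now do the lengths decide the outcome.
    if (reference.size() > n) return {CompareStatus::ReferenceLonger, n};
    if (decoded.size() > n) return {CompareStatus::ReferenceShorter, n};
    return {CompareStatus::Identical, n};
}
```

    Only the Identical status should lead to a "files are equal" message; the case reported above would come back as ReferenceLonger.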
    Last edited by Jan Ondrus; 11th July 2009 at 14:44.

  22. #82
    Member Skymmer
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    Simon, LP, thank you! Here is a test of paq8q_14 which brings unpleasant news. Maybe the bug pointed out by Jan causes it.
    Tested on PAQ_TestBed.tar, -6 -m6 level.
    Code:
     compile                    time         size          CRC
    ---------------------------------------------------------------
    paq8q                      1134.122     5 350 117     E2962AC8
    paq8q_speed_optimised      1096.594     5 350 124     0698B712
    paq8q_sse2_amd             1138.415     5 350 117     02E0A874
    paq8q_sse2_intel           1122.463     5 350 117     6A6437FC
    All output files are different!
    Last edited by Skymmer; 12th July 2009 at 15:39.

  23. #83
    Tester Black_Fox
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    The bug pointed out by Jan is about reporting differences properly, not about the differences themselves.

  24. #84
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Thank you Jan.

    Thank you Skymmer too. Maybe the wavModel isn't up to date. I will look into it and release an update.

  25. #85
    Member Skymmer
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    I've tracked down the problematic file in PAQ_TestBed.tar. It's Sine_generator_65Hz.wav.
    All the others, when compressed separately, give identical output for all compiles. Strange that only this file is problematic. The other 3 WAVs and 4 AIFFs are packed normally.

  26. #86
    Member
    Join Date
    Oct 2007
    Location
    Germany, Hamburg
    Posts
    408
    Thanks
    0
    Thanked 5 Times in 5 Posts
    Didn't you find out that it is exactly the same for paq8px?

  27. #87
    Member Skymmer
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    37
    Thanked 168 Times in 84 Posts
    No, I didn't. Here are the checksums for Sine_generator_65Hz.wav at -6\-6 -m6 level.
    Code:
    paq8q_no_opt              78645217
    paq8q_sse2_no_opt         78645217
    paq8q_speed_optimised     78645217
    
    paq8q                     BFA453E3
    paq8q_sse2_amd            BFA453E3
    paq8q_sse2_intel          BFA453E3
    Code:
    paq8px_no_opt              0C5B87B8
    paq8px_sse2_no_opt         0C5B87B8
    
    paq8px                     31A36660
    paq8px_speed_optimised     31A36660
    paq8px_sse2_amd            31A36660
    paq8px_sse2_intel          31A36660
    paq8px_fast_wav            31A36660
    paq8px_spopt_fast_wav      31A36660
    paq8px_fastpaq2            31A36660
    paq8px_fastpaq2_so         31A36660
    paq8px_turbo               31A36660
    
    paq8px_v60_AMD             42C47800
    paq8px_v60_Intel_SSE2      42C47800
    
    paq8px_v60_MMX             F91880B7

  28. #88
    Member
    Join Date
    Jul 2009
    Location
    Moscow
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I think you should not use DOUBLE in a file compressor!
    Any floating point arithmetic is dangerous, because even (a+b)+c is not necessarily the same as a+(b+c), so it's very likely to lose binary compatibility
    when compiling with different optimization options.
    Better to use fixed-point, as in train and dot_product!
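
    A small sketch of both halves of that argument. The function names are made up for this example, and dotProductFixed is not the paq8 dot_product; it only shows the general fixed-point idea of accumulating integer products.

```cpp
#include <cstddef>
#include <cstdint>

// Floating point: the grouping changes the rounded result.
bool doubleSumDependsOnGrouping() {
    // volatile keeps the compiler from simplifying the expressions away.
    volatile double a = 1e30, b = -1e30, c = 1.0;
    const double left = (a + b) + c;   // 0 + 1 gives 1
    const double right = a + (b + c);  // c is rounded away next to -1e30, giving 0
    return left != right;
}

// Fixed point: integer multiply-accumulate. Integer addition is associative,
// so any summation order yields the same bits as long as nothing overflows.
std::int64_t dotProductFixed(const std::int16_t* x, const std::int16_t* w, std::size_t n) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<std::int32_t>(x[i]) * w[i];
    return sum;
}
```

    A compiler that vectorises or reorders a double sum can therefore change the compressed output, while the integer version gives the same result in any order.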
