
Thread: ASH 05

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts

    ASH 05

    http://ctxmodel.net/files/ASH/ash05.rar

    It was really annoying to see ASH "disqualified" on Sami's benchmark site,
    so I finally added some memory overflow handling. It now uses a fixed
    buffer instead of loading the whole file at the start, and reparses the "init window"
    at the end of the block to rebuild the tree for further data processing.
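
To illustrate the overflow handling described above, here is a toy sketch in Python. It is not ASH's code: the dictionary only stands in for the context tree, `max_nodes` stands in for the memory limit, and every name is made up for the example. When the store fills up, it is cleared and rebuilt by reparsing only the last `window` bytes, and processing then continues from where it stopped.

```python
def train_with_rebuild(data, order, max_nodes, window):
    """Toy context store that is rebuilt from the last `window` bytes when full.

    Hypothetical illustration only; the real ASH tree and its limits differ.
    """
    contexts = {}   # context bytes -> occurrence count (stand-in for the tree)
    rebuilds = []   # input positions at which a rebuild happened

    def update(pos):
        ctx = data[max(0, pos - order):pos + 1]
        contexts[ctx] = contexts.get(ctx, 0) + 1

    for pos in range(len(data)):
        update(pos)
        if len(contexts) > max_nodes:          # "memory overflow"
            contexts.clear()                   # drop all statistics
            start = max(0, pos + 1 - window)   # beginning of the init window
            for p in range(start, pos + 1):    # reparse the window only
                update(p)
            rebuilds.append(pos + 1)
    return contexts, rebuilds


data = bytes(range(256)) * 4
contexts, rebuilds = train_with_rebuild(data, order=1, max_nodes=100, window=20)
print(len(contexts), rebuilds[:3])
```

A larger window keeps more statistics across the reset but costs more time at each rebuild, which is the trade-off the /w option controls.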

    ash /s255 /o40 /m2900 /b100000 /w20000 compresses enwik8 into 19376816 bytes.
    Btw, it's kind of surprising, but after adding the LARGEADDRESSAWARE flag,
    it was able to use up to 3G of memory without any modifications.

    Can't say that I still like its approach, but it's my only universal compressor
    comparable to others, and at least it has some historical value, as there's
    no other compressor with floating-point statistics and this kind of mixing...
    and there are other unique features. It also has better results than ppmonstr
    on some stationary data.

  2. #2
    Administrator Shelwien's Avatar
    Well, sorry, seems it was buggy, reuploaded.

    http://ctxmodel.net/files/ASH/ash05.rar
    + FIX: 3 bytes for Block and WInit values in .ash header (was 2)
    + FIX: synchronous memory overflow detection in a special case
    + encoding and decoding functions merged
    + book1+wcc386 parameter optimization results imported (no visible gain)

    ash /s255 /o32 /b100000 /w30000 enwik8 -> 19308831

  3. #3
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Shelwien!

  4. #4
    Administrator Shelwien's Avatar
    ash /m2925 /s255 /o33 /b75000 /w70000 enwik8 -> 19215977
    ash /m2925 /s255 /o33 /b75000 /w70000 enwik9 -> 162783185

  5. #5
    Administrator Shelwien's Avatar
    http://ctxmodel.net/files/ASH/ash06.rar
    (includes source)

    + 16-bit (4:12) custom "floats" used for statistics
    + Explicit block size parameter removed, block size is determined by
    the statistics volume
    + /wK now means that InitWindowSize=BlockSize*K/(100+K)
    (as BlockSize is variable)
    + Size of data buffer now is included into amount controlled by /m option
    + E8 filter added (can be disabled by /e)

    ash06 /o34 enwik8 -> 19,184,614
    ash06 /o14 /w50 enwik9 -> 162,789,635
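
The /wK formula above can be checked with a few lines of Python. This is only a sketch: `init_window_size` and `window_fraction` are hypothetical helper names, and integer (truncating) division is an assumption, since the post does not say how the result is rounded.

```python
def init_window_size(block_size: int, k: int) -> int:
    """InitWindowSize = BlockSize * K / (100 + K), assuming integer division."""
    if k < 0:
        raise ValueError("K must be non-negative")
    return block_size * k // (100 + k)


def window_fraction(k: int) -> float:
    """Share of the block taken by the init window for a given /wK."""
    return k / (100 + k)


for k in (2, 25, 50, 400, 500):
    print(f"/w{k}: {window_fraction(k):.1%} of the block")

# Block size taken from the enwik8 log posted later in this thread.
print(init_window_size(90317277, 25))
```

With blocksize=90317277 and the 25% setting from the enwik8 log posted later in this thread, this gives 18063455, which matches the initwindow value printed there. So K is the window size as a percentage of the rest of the block: /w25 is 20% of the whole block and /w400 is 80%.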

  6. #6
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Quote (originally posted by Shelwien):
    > (includes source)
    Excellent! Thanks Shelwien!

  7. #7
    Member
    Join Date
    May 2008
    Location
    Antwerp , country:Belgium , W.Europe
    Posts
    487
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Quote (originally posted by Shelwien):
    > http://ctxmodel.net/files/ASH/ash06.rar
    > (includes source)
    >
    > ash06 /o34 enwik8 -> 19,184,614
    > ash06 /o14 /w50 enwik9 -> 162,789,635
    I got a somewhat different result on enwik8...
    Any idea why?

    Code:
    G:\test\enwik8>timer ash /o34 enwik8 enw8_ash06_o34.ash
    
    Timer 3.01  Copyright (c) 2002-2003 Igor Pavlov  2003-07-10
    ASH 06 [10.03.2009 07:11] experimental CM compressor (c) Eugene D. Shelwien
     ■ Encoding...
     ■  Input file: "enwik8"
     ■ Output file: "enw8_ash06_o34.ash"
     ■ Using model order 33.
     ■ Using up to 3100M of RAM.
     ■ Init Window setting = 25%
     ■ SSE Depth = 255
     ■ E8 enabled
    <<< blockend=90317277 blocksize=90317277 initwindow=18063455 >>>
    Inp=100000008/100000000 Out=19296230 M96/774/2717
    You'd just wasted 1308.926s; 880k of reserved memory left unused.
    
    Kernel Time  =     3.744 = 00:00:03.744 =   0%
    User Time    =  1269.941 = 00:21:09.941 =  96%
    Process Time =  1273.685 = 00:21:13.685 =  97%
    Global Time  =  1309.285 = 00:21:49.285 = 100%
    Last edited by pat357; 10th March 2009 at 21:24.

  8. #8
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Eugene may have had more memory,

    or just disabled E8.

  9. #9
    Administrator Shelwien's Avatar
    E8 shouldn't affect texts (except for a slight speed difference maybe).

    And yeah, in my case there was 300M more memory... it's because of Vista, I guess.
    So instead you can check these:
    ash /o28 enwik8 -> 19193981
    ash /o29 enwik8 -> 19191674
    ash /o30 enwik8 -> 19189950

    Also, can you run 1-ML-LAE.exe from http://shelwien.googlepages.com/maxmem_v0.rar
    and post what it prints? (and your OS version)

  10. #10
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Quote (originally posted by Shelwien):
    > E8 shouldn't affect texts (except for a slight speed difference maybe).
    Russian texts might be affected...

  11. #11
    Administrator Shelwien's Avatar
    Well, the check there is strict enough for normal texts, I think, but
    something like "ияяяя" in the 1251 codepage might really trigger the filter...
    (It's "и...я" actually... or "й...я". Is there such a word?)
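
A minimal sketch of why such text can trigger the filter, in Python. This is a generic E8 transform written for illustration, not the filter from ash or ppmd_sh: the function names, the "top byte is 00 or FF" test, and the 25-bit wrap (a trick used in other E8/E9 filters so that the test gives the same answer before and after the transform) are all assumptions. In codepage 1251, "и" is byte E8 (the x86 CALL opcode) and "я" is byte FF, so "и...я" looks like a CALL with a small negative offset.

```python
import struct


def _wrap25(v):
    """Wrap to a signed 25-bit value so the top byte stays 0x00 or 0xFF."""
    v &= (1 << 25) - 1
    return v - (1 << 25) if v & (1 << 24) else v


def _candidate(buf, p):
    # Assumed detection rule: CALL opcode followed by a "small" 32-bit offset.
    return buf[p] == 0xE8 and buf[p + 4] in (0x00, 0xFF)


def e8_encode(data):
    """Relative CALL offsets -> absolute targets. Scans backward."""
    buf = bytearray(data)
    for p in range(len(buf) - 5, -1, -1):
        if _candidate(buf, p):
            rel = struct.unpack_from("<i", buf, p + 1)[0]
            struct.pack_into("<i", buf, p + 1, _wrap25(rel + p + 5))
    return bytes(buf)


def e8_decode(data):
    """Inverse transform. Scans forward, undoing the steps in reverse order."""
    buf = bytearray(data)
    for p in range(0, len(buf) - 4):
        if _candidate(buf, p):
            absolute = struct.unpack_from("<i", buf, p + 1)[0]
            struct.pack_into("<i", buf, p + 1, _wrap25(absolute - (p + 5)))
    return bytes(buf)


english = "plain ASCII text never contains byte E8".encode("cp1251")
russian = "ияяяя".encode("cp1251")       # bytes E8 FF FF FF FF
print(e8_encode(english) == english)     # untouched
print(e8_encode(russian).hex())          # offset bytes rewritten
print(e8_decode(e8_encode(russian)) == russian)
```

The transform stays lossless either way; the only cost for text is that the rewritten bytes no longer match the rest of the text, which can slightly hurt compression.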

  12. #12
    Member
    Join Date
    May 2008
    Location
    Antwerp , country:Belgium , W.Europe
    Posts
    487
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Quote (originally posted by Shelwien):
    > E8 shouldn't affect texts (except for a slight speed difference maybe).
    >
    > And yeah, in my case there was 300M more memory... it's because of Vista, I guess.
    > So instead you can check these:
    > ash /o28 enwik8 -> 19193981
    > ash /o29 enwik8 -> 19191674
    > ash /o30 enwik8 -> 19189950
    Thanks, I'll try to run these over night..
    The available memory was the first thing I thought about myself to explain the difference, but ASH printed something like "800Mb unused..."

    > Also, can you run 1-ML-LAE.exe from http://shelwien.googlepages.com/maxmem_v0.rar
    > and post what it prints? (and your OS version)
    The result from 1-ML-LAE:

    Code:
    G:\test\maxmem_v0\maxmem_v0>1-ML-LAE
    block [00790000..779AFFFF]  size = 1906M = 1951872K = 1998716928
    block [7FFF0000..A8BAFFFF]  size = 651M = 667392K = 683409408
    block [77BE0000..7F6EFFFF]  size = 123M = 126016K = 129040384
    block [7F7F0000..7FFDFFFF]  size = 7M = 8128K = 8323072
    block [00140000..0024FFFF]  size = 1M = 1088K = 1114112
    block [00260000..002AFFFF]  size = 0M = 320K = 327680
    block [003B0000..003FFFFF]  size = 0M = 320K = 327680
    block [77AE0000..77AFFFFF]  size = 0M = 128K = 131072
    block [00020000..0002FFFF]  size = 0M = 64K = 65536

    OS = Vista Prem. 32 bit SP1 / 4 GB RAM installed.

  13. #13
    The Founder encode's Avatar
    Quote (originally posted by Shelwien):
    > Well, the check there is strict enough for normal texts, I think, but
    > something like "ияяяя" in the 1251 codepage might really trigger the filter...
    > (It's "и...я" actually... or "й...я". Is there such a word?)
    Just tested my filter with bookstar...

  14. #14
    Administrator Shelwien's Avatar
    > Thanks, I'll try to run these over night..

    I think that can wait until a bugfix version (I found 3 bugs already).
    I just hoped that somebody would test it while I sleep, but no luck.

    > The available memory was the first thing I thought about
    > myself to explain the difference, but ASH printed
    > something like "800Mb unused..."

    Like 800k probably. There's always some small amount of wasted space
    due to internal heap fragmentation.

    > The result from 1-ML-LAE :

    Thanks. But it's weird for the address space to just end at 0xA8BB0000.
    You probably have the userva value set to something like 2700.
    http://msdn.microsoft.com/en-us/library/ms791558.aspx
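
The arithmetic behind that guess can be reproduced from the 1-ML-LAE output. The sketch below (Python; the regular expression and function name are made up for the example) parses the block list, checks each printed size against its address range, and converts the highest address to megabytes.

```python
import re

LOG = """\
block [00790000..779AFFFF]  size = 1906M = 1951872K = 1998716928
block [7FFF0000..A8BAFFFF]  size = 651M = 667392K = 683409408
block [77BE0000..7F6EFFFF]  size = 123M = 126016K = 129040384
block [7F7F0000..7FFDFFFF]  size = 7M = 8128K = 8323072
block [00140000..0024FFFF]  size = 1M = 1088K = 1114112
block [00260000..002AFFFF]  size = 0M = 320K = 327680
block [003B0000..003FFFFF]  size = 0M = 320K = 327680
block [77AE0000..77AFFFFF]  size = 0M = 128K = 131072
block [00020000..0002FFFF]  size = 0M = 64K = 65536
"""

BLOCK_RE = re.compile(
    r"block \[([0-9A-F]{8})\.\.([0-9A-F]{8})\]\s+size = \d+M = \d+K = (\d+)"
)


def parse_blocks(log):
    """Return (start, end_exclusive) pairs, verifying the printed byte sizes."""
    blocks = []
    for m in BLOCK_RE.finditer(log):
        start, last, size = int(m[1], 16), int(m[2], 16), int(m[3])
        if last - start + 1 != size:
            raise ValueError(f"size mismatch in line: {m[0]}")
        blocks.append((start, last + 1))
    return blocks


blocks = parse_blocks(LOG)
top = max(end for _, end in blocks)
total_free = sum(end - start for start, end in blocks)
print(f"address space ends at {top:#010x} = {top / 2**20:.2f} MB")
print(f"total free = {total_free / 2**20:.2f} MB in {len(blocks)} blocks")
```

The highest free block ends at 0xA8BB0000, about 2699.7 MB, which is just under 2700 MB (0xA8C00000) and therefore consistent with a userva setting of 2700 rather than the full 3G.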

  15. #15
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Quote (originally posted by Shelwien):
    > http://ctxmodel.net/files/ASH/ash06.rar
    > (includes source)
    >
    > + 16-bit (4:12) custom "floats" used for statistics
    > + Explicit block size parameter removed, block size is determined by
    > the statistics volume
    > + /wK now means that InitWindowSize=BlockSize*K/(100+K)
    > (as BlockSize is variable)
    > + Size of data buffer now is included into amount controlled by /m option
    > + E8 filter added (can be disabled by /e)
    >
    > ash06 /o34 enwik8 -> 19,184,614
    > ash06 /o14 /w50 enwik9 -> 162,789,635
    What were the compression/decompression times, the memory used, and the processor? I will post to LTCB. (I am behind on testing.)

  16. #16
    Administrator Shelwien's Avatar
    @Matt: These are not final results... I'll do more tests and then send the results to you.

    Btw, in an attempt to speed up the enwik9 tests I installed the extra 4G of
    RAM which was lying around, configured the ramdrive to
    use the 4.7G of memory unused by XP, and put the swapfile there.
    After that, I tried to run both compression and decompression at once.
    Well, I guess it is still much faster than swapping to HDD, but it seems that
    Windows' swapping strategy is completely wrong in this case: if
    there are two processes which both use nearly 3G of RAM, and one of
    them tries to allocate more memory, then _most_ of the other process'
    memory is swapped out at once, and only about 100M is left.

    http://shelwien.googlepages.com/ash1.png
    Last edited by Shelwien; 11th March 2009 at 11:27.

  17. #17
    Member
    Join Date
    May 2008
    Location
    Antwerp , country:Belgium , W.Europe
    Posts
    487
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Quote (originally posted by Shelwien):
    > @Matt: These are not final results... I'll do more tests and then send the results to you.
    >
    > Btw, in an attempt to speed up the enwik9 tests I installed the extra 4G of
    > RAM which was lying around, configured the ramdrive to
    > use the 4.7G of memory unused by XP, and put the swapfile there.
    > After that, I tried to run both compression and decompression at once.
    > Well, I guess it is still much faster than swapping to HDD, but it seems that
    > Windows' swapping strategy is completely wrong in this case: if
    > there are two processes which both use nearly 3G of RAM, and one of
    > them tries to allocate more memory, then _most_ of the other process'
    > memory is swapped out at once, and only about 100M is left.
    >
    > http://shelwien.googlepages.com/ash1.png
    Have you tried a 64-bit version of Windows?
    You could also try a 64-bit Linux distro and run ASH in DOSBox... maybe this would give better results?

  18. #18
    Administrator Shelwien's Avatar
    Code:
    ash06 /o14 /w400 /e enwik9 -> 159,653,893
    Q6600 @ 3.3GHz, WinXP SP3
    Memory required: 2996M
    Encoding time:   20312.656s
    Decoding time:   20374.313s

    Now running /o15...

    I don't have a local Linux machine, and remote ones won't allow such memory usage.
    Also, even though compiling a 32-bit Linux version shouldn't be a problem,
    I'm not sure how to deal with 64-bit systems, as there are pointers in ash's tree,
    so at best 30-40% more memory would be required for the same tree,
    and at worst it might be impossible to port at all.
    It's also possible to replace the pointers with custom 32-bit offsets, but
    that requires using a single contiguous memory block, which is a significant
    restriction on 32-bit systems.
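
The "32-bit offsets instead of pointers" idea can be sketched as follows, in Python for brevity. This is not ash's tree layout: the node fields (symbol, first child, next sibling, count) and the class name are invented for the example. The point is only that every link is an index into one flat array, so node size does not grow on a 64-bit system, at the price of needing a single contiguous block.

```python
from array import array


class OffsetTrie:
    """Context trie whose links are indices into one flat array, not pointers.

    Hypothetical layout: 4 unsigned ints per node. array("I") is 4 bytes per
    item on common platforms, though the C standard only guarantees 2.
    """
    FIELDS = 4   # symbol, first_child, next_sibling, count
    NIL = 0      # index 0 is the root, so it can double as "no link"

    def __init__(self):
        self.pool = array("I", [0, 0, 0, 0])   # root node

    def _new(self, symbol):
        self.pool.extend((symbol, self.NIL, self.NIL, 0))
        return len(self.pool) // self.FIELDS - 1

    def _child(self, node, symbol, create):
        base = node * self.FIELDS
        cur = self.pool[base + 1]
        while cur != self.NIL:                       # walk the sibling chain
            if self.pool[cur * self.FIELDS] == symbol:
                return cur
            cur = self.pool[cur * self.FIELDS + 2]
        if not create:
            return self.NIL
        new = self._new(symbol)
        self.pool[new * self.FIELDS + 2] = self.pool[base + 1]  # link in front
        self.pool[base + 1] = new
        return new

    def add(self, context):
        node = 0
        for symbol in context:
            node = self._child(node, symbol, True)
            self.pool[node * self.FIELDS + 3] += 1

    def count(self, context):
        node = 0
        for symbol in context:
            node = self._child(node, symbol, False)
            if node == self.NIL:
                return 0
        return self.pool[node * self.FIELDS + 3]


trie = OffsetTrie()
trie.add(b"abc")
trie.add(b"abd")
print(trie.count(b"ab"), trie.count(b"abc"), trie.count(b"x"))
```

Because indices stay valid when the array is reallocated, growing the pool is safe here; with raw pointers into a fixed arena that would not be the case.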

    Instead, it might be interesting to try Windows' new extended memory API
    and implement some custom swapping with it... or maybe swap to a file as well.
    But for now I don't have a clear picture of how to implement that efficiently.

    And anyway, instead of spending time on ash support, something new would
    be much more useful...

  19. #19
    Administrator Shelwien's Avatar
    http://ctxmodel.net/files/ASH/ash06.rar
    (another bugfix, with source)

    15.03.2009 v06 bugfix
    + FIX: Memory size value stored in the header was wrong sometimes
    + FIX: Filesize when E8 is used
    + Improved memory management
    + SSE memory is taken into account

    Code:
    Q6600 @ 3.3GHz, WinXP SP3
                                         comp.size     mem.  enc.time    dec.time
    ash /o15 /w400 /s110 /e enwik9  ->  159,484,805    3060M 20669.484s  21049.360s
    ash /o15 /w500 /s110 /e enwik9  ->  159,408,612    3060M 25223.718s  25687.294s
    ash /o14 /w200 /s110 /e enwik9  ->  159,936,540    3060M 12893.578s  13130.543s
    ash /o14 /w2 /s110 /e   enwik9  ->  163,570,952    3060M  5453.844s   5554.078s
    ash /s110 /o40          enwik8  ->   19,181,497    3054M   558.938s    559.859s
    ash /s130 /o40 /e       enwik8  ->   19,181,383    3058M   573.125s    573.468s

  20. #20
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts

    Thanks Shelwien!

  21. #21
    Member
    Join Date
    May 2008
    Location
    Antwerp , country:Belgium , W.Europe
    Posts
    487
    Thanks
    1
    Thanked 3 Times in 3 Posts

    Quote (originally posted by Shelwien):
    > http://ctxmodel.net/files/ASH/ash06.rar
    > (another bugfix, with source)
    >
    > 15.03.2009 v06 bugfix
    > + FIX: Memory size value stored in the header was wrong sometimes
    > + FIX: Filesize when E8 is used
    > + Improved memory management
    > + SSE memory is taken into account
    Thanks for the "fixes"!

    I tried to get an ASH/PPMonstr ratio < 1 on UIQ-generated test data, but I didn't succeed...
    (http://www.cs.fit.edu/~mmahoney/compression/uiq/)

    Here are some data for a generated set :
    original : 6.522.569
    PPMonstr : 3.005.255
    3.124.911 testdata_ash06b_o40.ash
    3.124.911 testdata_ash06b_o64.ash
    3.143.157 testdata_ash06b_s130_o40_e.ash
    3.114.255 testdata_ash06b_w100_o128_e.ash
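
For reference, the ratios implied by the sizes listed above can be worked out with a few lines of Python (the dictionary keys are just the option strings taken from the file names):

```python
ppmonstr = 3005255   # PPMonstr output size in bytes, from the list above

ash_results = {      # ASH 06 output sizes in bytes, keyed by options used
    "/o40": 3124911,
    "/o64": 3124911,
    "/s130 /o40 /e": 3143157,
    "/w100 /o128 /e": 3114255,
}

ratios = {opts: size / ppmonstr for opts, size in ash_results.items()}
for opts, ratio in ratios.items():
    print(f"ash {opts:<15} ASH/PPMonstr = {ratio:.4f}")
```

All four ratios come out between roughly 1.036 and 1.046, so the best of these runs is still about 3.6% larger than PPMonstr's output.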

    Because ASH can be considered a "general" compressor, not optimized for specific kinds of data, such a test would give an idea of ASH's "pure compression" power.
    I'm not sure about it, but I guess ASH uses techniques similar to PPMonstr's (including SSE, ...), so I would expect a ratio < 1 to be possible.

    Can you please point me to more optimal parameters for compressing such data?


    Complete log below:

    Code:
    G:\test\Test data from UIQ (Mahony)>ash /o64 testdata.bin testdata_ash06b_o64.ash
    ASH 06 [15.03.2009 05:09] experimental CM compressor (c) Eugene D. Shelwien
     ■ Encoding...
     ■  Input file: "testdata.bin"
     ■ Output file: "testdata_ash06b_o64.ash"
     ■ Using model order 63.
     ■ Using up to 7M+3137M+56M=3200M of RAM.
     ■ Init Window setting = 25%
     ■ SSE Depth = 255
     ■ E8 enabled
    Inp=6522569/6522569 Out=3124911 M7/90/153
    You'd just wasted 105.051s; 4260k of reserved memory left unused.
    
    
    G:\test\Test data from UIQ (Mahony)>ash /o40 testdata.bin testdata_ash06b_o40.ash
    ASH 06 [15.03.2009 05:09] experimental CM compressor (c) Eugene D. Shelwien
     ■ Encoding...
     ■  Input file: "testdata.bin"
     ■ Output file: "testdata_ash06b_o40.ash"
     ■ Using model order 39.
     ■ Using up to 7M+3137M+56M=3200M of RAM.
     ■ Init Window setting = 25%
     ■ SSE Depth = 255
     ■ E8 enabled
    Inp=6522569/6522569 Out=3124911 M7/90/153
    You'd just wasted 105.176s; 4260k of reserved memory left unused.
    
    
    G:\test\Test data from UIQ (Mahony)>ash /s130 /o40 /e testdata.bin testdata_ash06b_s130_o40_e.ash
    ASH 06 [15.03.2009 05:09] experimental CM compressor (c) Eugene D. Shelwien
     ■ Encoding...
     ■  Input file: "testdata.bin"
     ■ Output file: "testdata_ash06b_s130_o40_e.ash"
     ■ Using model order 39.
     ■ Using up to 7M+3164M+29M=3200M of RAM.
     ■ Init Window setting = 25%
     ■ SSE Depth = 130
    Inp=6522569/6522569 Out=3143157 M7/90/126
    You'd just wasted 85.145s; 4222k of reserved memory left unused.
    
    
    G:\test\Test data from UIQ (Mahony)>ash /w100 /o128 /e testdata.bin testdata_ash06b_w100_o128_e.ash
    ASH 06 [15.03.2009 05:09] experimental CM compressor (c) Eugene D. Shelwien
     ■ Encoding...
     ■  Input file: "testdata.bin"
     ■ Output file: "testdata_ash06b_w100_o128_e.ash"
     ■ Using model order 127.
     ■ Using up to 7M+3137M+56M=3200M of RAM.
     ■ Init Window setting = 100%
     ■ SSE Depth = 255
    Inp=6522569/6522569 Out=3114255 M7/90/153
    You'd just wasted 104.552s; 4222k of reserved memory left unused.
    Last edited by pat357; 16th March 2009 at 00:28.

  22. #22
    Administrator Shelwien's Avatar
    1. The order is too high for this type of data. If you pay attention,
    /o5 /s9 was optimal for ash 04a, and something similar is expected from ash 06.
    Also, /w only applies when there are memory overflows during processing.

    2. The model in ash 06 is the same as in ash 04a (I wasn't even able to find a
    better parameter profile, though using uiq data for that might be a good
    idea), but generally slightly worse compression is expected from ash 06,
    as it uses the same model with reduced counter precision.

    3. There's a sparse submodel in ppmonstr, which might be applicable to
    uiq data (because of zero terminators), so I doubt that ash would get
    <1 results, even if specially optimized.

    Wonder if I should optimize some Mix version for uiq data
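
To give a feel for what "reduced counter precision" means with 16-bit (4:12) values, here is a sketch in Python. The actual ash 06 format is not described in this thread, so everything about the layout is an assumption: 4 exponent bits, 12 mantissa bits, an implicit leading one, a bias of 15, and positive values only.

```python
import math

EXP_BITS, MANT_BITS, BIAS = 4, 12, 15   # assumed layout, not taken from ash


def encode412(x: float) -> int:
    """Pack a positive float into 16 bits: 4-bit exponent, 12-bit mantissa."""
    if x <= 0.0:
        raise ValueError("only positive values are representable in this sketch")
    frac, k = math.frexp(x)              # x = frac * 2**k, 0.5 <= frac < 1
    e = k - 1 + BIAS                     # x = (2*frac) * 2**(k-1), 1 <= 2*frac < 2
    m = round((2.0 * frac - 1.0) * (1 << MANT_BITS))
    if m == 1 << MANT_BITS:              # mantissa rounded up to 2.0: carry
        m, e = 0, e + 1
    if e < 0:                            # below range: clamp to the smallest value
        e, m = 0, 0
    elif e >= 1 << EXP_BITS:             # above range: clamp to the largest value
        e, m = (1 << EXP_BITS) - 1, (1 << MANT_BITS) - 1
    return (e << MANT_BITS) | m


def decode412(code: int) -> float:
    """Unpack a 16-bit code produced by encode412."""
    e, m = code >> MANT_BITS, code & ((1 << MANT_BITS) - 1)
    return (1.0 + m / (1 << MANT_BITS)) * 2.0 ** (e - BIAS)


for x in (0.001, 0.1, 0.3, 0.75, 1.9):
    y = decode412(encode412(x))
    print(f"{x:<6} -> {encode412(x):#06x} -> {y:.6f} (rel. error {abs(y - x) / x:.1e})")
```

With 12 mantissa bits the relative rounding error per stored value is at most 2^-13, compared with about 2^-24 for a 32-bit float, which is one plausible reason for the slightly worse compression mentioned in point 2.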

  23. #23
    Administrator Shelwien's Avatar
    http://ctxmodel.net/files/ASH/ash06.rar
    + FIX: E8 class replaced with fixed version
    + executable built with PGO for better benchmarking (30% speedup)

  24. #24
    Administrator Shelwien's Avatar
    http://ctxmodel.net/files/ASH/ash07.rar
    + Additional compiler option tweaks (+speed)
    + Some value limit checks removed (+speed)
    + E8 filter update: E9 disabled

    Mainly for bugfix accumulation... It's 3-5% faster than the last ash06,
    and its compression might be the same or slightly better for most files,
    excluding some exes (SFC/acrord32 too, unfortunately).

    Also ash01-03 uploaded: http://ctxmodel.net/files/ASH/
    For completeness and archaeology fans, I guess.

  25. #25
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks! But why did you disable the offset conversion for e9?

  26. #26
    Administrator Shelwien's Avatar
    Also, here's a comparison with toffer's standalone E8 filter from the M1 thread.
    ppmd_sh and ash include a similar "transparent" filter, with the same effect
    as compressing the filtered file. /e is used to disable the internal filter.
    Code:
    16255160 // mcpcom.exe from IC 11.0.74 
    16255160 // mcpcom.e8e9 from e8e9.exe c mcpcom.exe mcpcom.e8e9
     5121140 // ash mcpcom.exe
     5135257 // ash /e mcpcom.e8e9
     5658514 // ppmd /m512 mcpcom.exe
     5671888 // ppmd /m512 /e mcpcom.e8e9
     6024066 // ppmd /o32 /m512 mcpcom.exe
     6042217 // ppmd /o32 /m512 /e mcpcom.e8e9
    Btw, ppmd's default is /o16, so it really dislikes higher orders sometimes.

  27. #27
    Administrator Shelwien's Avatar
    As for E9, my tests showed that, except for SFC's acrord32, E9 hurts
    compression most of the time, both for ppmd_sh and ash.
    Obviously, it also hurts the compression of any non-exe files,
    as the detection is not smart at all.
    Last edited by Shelwien; 15th April 2009 at 22:41.

  28. #28
    Programmer toffer's Avatar
    In some of my tests (well, I have no numbers though) it turned out that transforming conditional relative jumps hurt. 7-Zip does that too, doesn't it? But I never tested E8 only. Did you test on much data?

  29. #29
    Administrator Shelwien's Avatar
    Well, I guess it all depends on how the tested executables were compiled.
    If there are many jumps to functions for some reason, then E9 with
    absolute offsets would be better.
    But if most of the jumps are local, then leaving them
    relative would be better, as there might even be some matches.
    However, this doesn't affect E8, as CALL is almost always used
    only on functions, except for the occasional CALL $+5 in viruses and
    unpackers.

  30. #30
    Administrator Shelwien's Avatar
    My filter versions in ppmd_sh and ash are different btw.
    Ash has additional conversion of absolute offsets to
    big-endian, and somehow it wasn't helpful for ppmd.
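
A small Python sketch of the two conversions mentioned here: relative to absolute, then little-endian to big-endian. The helper names and the example addresses are made up, and this is not the code from ash; it only shows the effect. Two CALLs to the same function from different places have different relative offsets, but identical absolute targets, and storing the target big-endian puts its more stable high-order bytes right after the opcode.

```python
import struct


def call_bytes(pos: int, target: int) -> bytes:
    """x86 near CALL at address `pos`: E8 + little-endian offset relative
    to the next instruction (pos + 5)."""
    return b"\xE8" + struct.pack("<i", target - (pos + 5))


def to_absolute_be(pos: int, insn: bytes) -> bytes:
    """Rewrite the operand as the absolute target, stored big-endian."""
    rel = struct.unpack_from("<i", insn, 1)[0]
    return b"\xE8" + struct.pack(">I", (rel + pos + 5) & 0xFFFFFFFF)


# Two hypothetical call sites that both call a function at 0x4010.
first = call_bytes(0x1000, 0x4010)
second = call_bytes(0x2345, 0x4010)

print(first.hex(), second.hex())                 # relative: different bytes
print(to_absolute_be(0x1000, first).hex(),
      to_absolute_be(0x2345, second).hex())      # absolute: identical bytes
```

Whether the byte order helps is model-dependent, which fits the observation that it helped ash but not ppmd.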
