Activity Stream

  • encode's Avatar
    Today, 20:45
    Up! :_dance:
    5 replies | 201 view(s)
  • Shelwien's Avatar
    Today, 19:55
    Here's my attempt to make a decoder. I think it at least correctly parses the masks, and the 5/11 split for len/dist also seems to work.
    > I didn't get this part. Can you elaborate it a bit?
    There are two bytes per match. I tried to visualize the bitfield layout: "l" is a length bit, "d" is a distance bit, "D" is a high distance bit. But the actual layout seems to be more like lllllDDD dddddddd (see the sketch below). I wonder if there's still a chance that it's actually not a distance, but an absolute position with some offset, like in Okumura LZSS.
    3 replies | 219 view(s)
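    The following is a minimal Python sketch of the layout hypothesized in this thread: a big-endian 32-bit mask (bit31 first, 0 = literal, 1 = match) followed by the items, each match being two bytes with a 5/11 len/dist split. The split, the minlen handling and the function name are working assumptions from the discussion, not a confirmed decoder.

      import struct

      def parse_group(data, pos):
          """Parse one 32-item group: 4-byte big-endian mask, then literals/matches.
          Returns (items, new_pos); items are ('lit', byte) or ('match', len_bits, dist)."""
          (mask,) = struct.unpack_from(">I", data, pos)
          pos += 4
          items = []
          for bit in range(31, -1, -1):              # bit31 first, bit0 last
              if (mask >> bit) & 1 == 0:             # 0 = literal byte
                  items.append(("lit", data[pos])); pos += 1
              else:                                  # 1 = two-byte match token
                  b0, b1 = data[pos], data[pos + 1]; pos += 2
                  len_bits = b0 >> 3                     # "lllll" (minlen offset still unknown)
                  dist = ((b0 & 0x07) << 8) | b1         # "DDD" + "dddddddd" = 11 bits
                  items.append(("match", len_bits, dist))
          return items, pos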
  • Shelwien's Avatar
    Today, 16:22
    fixed?
    5 replies | 201 view(s)
  • schnaader's Avatar
    Today, 16:12
    When using the "Thanks" button there: Not Found The requested URL /post_thanks.php was not found on this server.
    5 replies | 201 view(s)
  • Shelwien's Avatar
    Today, 15:04
    ok, I'll try to back it up too. ENCODE.RU has either 3 days left (if my money transfer fails) or a month (unless somebody can pay 73 EUR per month to keep it going). Also, please test https://encode.su and report forum bugs. The webmaster's server is on Windows and the new one is on Linux - everything is very different, so there are problems on that side too.
    5 replies | 201 view(s)
  • schnaader's Avatar
    Today, 12:17
    There's http://packjpg.encode.ru/ - I'm not sure if it could/should be moved; it hasn't been updated in the last few years anyway, but the PackJPG GitHub repository links to it.
    5 replies | 201 view(s)
  • rhysling's Avatar
    Today, 12:05
    Thank you for your reply. It is indeed exactly as you said; I kind of figured out the 4-byte mask while the forums were down :D The only remaining problem seems to be the match length - I can't understand how that is coded. I have decompressed part of the data more properly now using the mask and the given offsets. It goes like this: EXPSKIN X:\DATAS\_MASTER\G_DATA\CHARACTERS\DEMONIC\PLAYER\MESHPWP\PLAYER_SKIN.TSKIN I think the minlen should be 3 in this case, because "00 10" copies "TER" and "00 20" copies "ER\". Still, I have noticed some inconsistencies with the first byte (assumed match length). Both the words "DATA" and "SKIN" have matches. For the first one it is coded as (length, distance) "40 0F" and for the second one it is "08 45". So the distance is correct for both, but the lengths are different. I mean, both of them are supposed to copy 4 bytes, but I don't get why they have different values. That is the only mystery remaining to me; I couldn't make much sense of that 1st byte. Thanks for your help. I didn't get this part. Can you elaborate it a bit?
    3 replies | 219 view(s)
  • encode's Avatar
    Today, 11:57
    Subj: https://encode.su/ Please update your bookmarks! encode.ru will stay alive, but I can't say for how long, so please help us inform members - share this info and update the links on your web pages! Thank you! Stay tuned! :_thx:
    5 replies | 201 view(s)
  • Jarek's Avatar
    Today, 11:01
    Congratulations! I see you use Huffman - have you maybe tried ANS? It would not only allow you to get a bit better compression ratio (especially o1), but e.g. James' implementations are also faster: https://github.com/jkbonfield/rans_static
    12 replies | 831 view(s)
  • Jarek's Avatar
    Today, 10:52
    I wanted to test the approach - I learned the basics of Python, then went to TensorFlow ... and got a bit frustrated; I have to focus on something else now and will return in a few weeks. But I have prepared a more accessible paper this cycle, focused on the simplest case, d=1: an SGD momentum method with an online parabola model in the momentum direction to choose the step size - finding the linear trend of the gradients in this direction using linear regression, which can be done by updating four (exponential moving) averages: of theta, g, theta * g, theta^2, for theta, g being 1D projections of the position and gradient (see the sketch below): https://arxiv.org/pdf/1907.07063
    22 replies | 762 view(s)
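    Below is a rough Python sketch of the bookkeeping described in the post above: four exponential moving averages of theta, g, theta*g, theta^2 (the 1D projections onto the momentum direction), a linear-regression slope for the gradient trend, and the point where that trend crosses zero as the parabola's minimum. The function name, the decay constant beta and the fallback when the estimated curvature is not positive are illustrative paraphrases of the abstract, not the paper's exact algorithm.

      def parabola_step(theta, g, avgs, beta=0.99, eps=1e-12):
          """Update the four exponential moving averages and return the 1D position
          where the fitted linear trend of gradients crosses zero, i.e. the minimum
          of the modelled parabola. theta and g are scalars (projections onto the
          momentum direction); avgs is the tuple of running averages."""
          m_t, m_g, m_tg, m_tt = avgs
          m_t  = beta * m_t  + (1 - beta) * theta
          m_g  = beta * m_g  + (1 - beta) * g
          m_tg = beta * m_tg + (1 - beta) * theta * g
          m_tt = beta * m_tt + (1 - beta) * theta * theta
          # linear regression of g on theta: slope = cov(theta, g) / var(theta)
          slope = (m_tg - m_t * m_g) / max(m_tt - m_t * m_t, eps)
          # minimum of the modelled parabola: where the linear gradient trend hits zero;
          # if the estimated curvature is not positive, fall back to the running mean
          theta_star = m_t - m_g / slope if slope > eps else m_t
          return theta_star, (m_t, m_g, m_tg, m_tt)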
  • Jyrki Alakuijala's Avatar
    16th July 2019, 12:37
    Could it be that you are comparing compressors with different backward reference window size?
    12 replies | 831 view(s)
  • Shelwien's Avatar
    16th July 2019, 12:28
    Both samples start with 26 literals, followed by 4 bytes of mask (big-endian; bit31 to bit0; 0=literal, 1=match). Matches look like 16-bit, "00 10" means "length=4, distance=16" so maybe something like DDDDllll dddddddd layout (minlen=4). Needs further investigation though.
    3 replies | 219 view(s)
  • dnd's Avatar
    15th July 2019, 20:46
    New: TurboPFor is now available with SIMD for ARMv8 NEON. See: TurboPFor Integer Compression + Floating Point Compression
    40 replies | 18386 view(s)
  • xcrh's Avatar
    15th July 2019, 19:56
    LZSA has proven to be a rather interesting thing. What I like about it:
    - Still a "simple LZ" kind of thing.
    - Still in the "requires no memory for decompression" league (with some notes).
    - Very deliberate about the stream format, trying to balance speed/simplicity vs ratio; LZSA2 looks nice in this regard.
    - Not yet an overgrown mammoth monster like brotli, zstd and suchlike.
    - LZSA2 gets a very reasonable ratio on small data sets.
    - One can find rather advanced techniques in action here - rep-match, nibble alignment - which looks really nice.
    What I don't like about it:
    - The obsession with Z80 and other obsolete/toy platforms, while ignoring modern viable uses (e.g. larger blocks for boot loaders/OS kernels/etc., modern CPU cores such as application/MCU ARMs - comparably sized platforms for present-day uses).
    - Very strange decompression termination conditions, likely resulting from the previous oddity. It looks like this thing was never meant to be decompressed in a "safe" way, as in: "I have a 2K buffer for the result, never exceed it, even if the input is random garbage." Looking at the specs, it seems the author never considered decompressing in a safe manner into a limited-size buffer that must never be exceeded, even if the input is damaged or outright malicious garbage. Or I've failed to get the idea.
    - Speaking of that, there are tons of standalone asm implementations, but no trivial standalone C decompressor; digging down to the decompression routine expressed in C takes all the pains of unwinding LZ4-like "enterprise-grade spaghetti for all occasions". Of course perfection is elusive, but as LZ4 has shown, an algo can enjoy very reasonable speed without resorting to assembly/SIMD/etc.
    TL;DR: of the simple LZs available in source form, only lzoma gets better ratios on small data. On larger data LZSA isn't particularly exciting. I haven't benchmarked proprietary algos or algos with unknown/warez-like licensing since I have no use cases for them. Overall it looks nice for small data sets, so thanks to the people who brought it to attention - I had quite some fun with it.
    150 replies | 36871 view(s)
  • xcrh's Avatar
    15th July 2019, 19:38
    If you take a look at Phoronix in more detail and then dig into compiler developers' mailing lists, you'll eventually notice that code optimization is a very tricky business. When a new compiler is released it isn't unusual to get larger or smaller code for a particular algorithm; the same goes for speed. To make it more complicated, unless you specify -march/-mcpu/-mtune, the compiler targets something "generic", and it is debatable what compromises should be made to perform reasonably everywhere; there is nothing wrong with eventually changing the notion of "reasonable". Long story short: when it comes to compilers, on a new release some platforms/(sub-)archs win and some lose, some algorithms win and some lose. It is rare to see a total regression where the compiler performs worse on all algorithms and CPU flavors - that usually blocks the release until the regression is fixed - yet it is still possible. If a code-generation regression hurts and seems to affect more than one algorithm, especially across different -march/-mtune/-mcpu settings, it can make sense to file a bug.
    The funny part? If you only measure one particular CPU core flavor and one algorithm, it is not to be taken for granted that the latest compiler is always the best choice. But that doesn't say much about the overall code-generation quality of the compiler; it takes a lot of measurements on various algorithms to get a reasonable idea, and some Phoronix benchmarks can give a coarse impression of how that might look. I bet it's the same with clang, with the exception that early versions had a weak optimizer but fast compile times, and now it has a comparable optimizer ... and comparable compile times, because there's no such thing as a free lunch :)
    Side note: I've had some fun measuring ARM Cortex code generation against various GCC versions. I've been mostly interested in the smallest code size and RAM use, while performance is "good to have" but not a pressing matter. Across GCC 6.x to 8.3 it proved rather "chaotic" but fun to measure. In this particular CPU bloodline, code generation basically evolved toward slightly larger but considerably faster code (somehow affecting even -Os). Ironically that wasn't what I wanted under my constraints, but the tradeoff looked very sane. In the end I dodged the bullet by changing the target CPU type to the crappy Cortex-M0 cores. The M0 has a far more crippled instruction set (compared to the M3 I actually had), so the generated code is considerably slower (underusing the core's capabilities), yet somehow it is also noticeably smaller, and the target CPU can still run it - at which point, mission accomplished. So for me the smallest combo was GCC 7 in M0 mode; earlier and later compilers lose to this combo (by rather small margins, but it really depends on how hard you're pressed by system resources).
    p.s. If someone is obsessed with code size (e.g. for small systems), GCC has a fairly neat thing called LTO. It brings no speed penalty but can reduce code size by eliminating unreachable/unused code. Ironically it does its job so well that in a standalone environment it eliminated my WHOLE FIRMWARE (!!!), failing to see that its entry point is used. I failed to persuade LTO in this setup that my code is "used", though GCC also supports another trick: putting all functions and variables into their own sections and then pruning the unused ones. That trick worked, chopping code size considerably, though LTO is more radical and efficient at this. On small systems there is an inherent catch - interrupt handlers, for example, are inherently "unused": the rest of the code never calls them.
    p.p.s. If one needs speed, GCC also has a ton of "less reliable" things to try: -O5...9 could work, or not; the same goes for -ffast-math, -funsafe-loop-optimizations and so on. One can get better performance at the cost of incomplete compliance with the standards, so some code will break - but most algorithms can live with that and end up faster. If that's all that matters, it could be the way to go.
    2 replies | 374 view(s)
  • rhysling's Avatar
    15th July 2019, 18:39
    Hello, I am not really skilled with compression formats, but I am trying to improve. I am currently looking at some files from a game. I feel like it is using some kind of LZ variant, but I haven't seen any of the usual telltale signs like flags or literal counts, which led me to believe that it is a custom variant. I might be wrong, and it might be something already in use, though. I have attached 2 file samples. I included compressed text data, since it is easier to guess the decompressed result. Here is an example screenshot. The first 4 bytes are probably the decompressed size, then the compressed data starts. The decompressed output should go something like: "EXPSKIN X:\DATAS\_MASTER\G????\CHARACTERS\DEMONIC\PLAYER\MESHPWP..." You can see in the data CHARAC(00 10) and PLAY(00 20). I suspect 0x10 and 0x20 are offsets into the output/history buffer for copying "TER"/"ER" from "MASTER". I don't know how the copy count is specified though, and I don't see any flags for literals either. Any help and ideas are greatly appreciated. Thank you.
    3 replies | 219 view(s)
  • LucaBiondi's Avatar
    15th July 2019, 17:58
    LucaBiondi replied to a thread paq8px in Data Compression
    My big testcase, paq8px v180 vs. v181: globally we lose about 2 KB; JPEG loses 9 KB, ISO gains 7.5 KB. New records for the MP4, TXT, EXE and ISO file formats :) bye bye, Luca
    1637 replies | 470021 view(s)
  • RichSelian's Avatar
    15th July 2019, 06:42
    My compressor Orz is now competitive with zstd/brotli. Especially for text compression (like enwik8), Orz compresses 10x faster than zstd/brotli while getting the same compression ratio :cool:
    12 replies | 831 view(s)
  • Darek's Avatar
    14th July 2019, 20:04
    Darek replied to a thread paq8px in Data Compression
    Scores of the 4 corpora for paq8px v181. I've hidden some versions to make the table easier to read. All corpora got gains of 0.02%-0.05%. As with the previous version, the mozilla file from the Silesia corpus got about 10 KB of gain!
    1637 replies | 470021 view(s)
  • Jyrki Alakuijala's Avatar
    14th July 2019, 15:26
    If you compile brotli with release/optimized settings, then you get better performance; usually you would see a difference of about 3x from that. If I run them at -10 and -5 on Silesia, I get:
    |Zstd 1.4.1 -10  | 12.0 | 0.390 | 59,568,473 |
    |Brotli 1.0.7 -5 | 10.6 | 0.739 | 59,611,601 |
    Zstd's low-quality matcher seems to be tuned better for longer data like Silesia's average file size of ~20 MB - my guesswork is longer hashes for longer data (something we should do for brotli's encoder, too). Brotli's inherent compression density/speed advantage (which also slows down the decoding) is clearer when using brotli's slow matcher (quality 10 or 11) or on shorter data. Brotli -11 gives 49,383,136 bytes vs. zstd -22 --ultra at 52,748,892 bytes, ~7% more bytes for zstd (12 minutes vs 2.5 minutes of compression time). I couldn't figure out how to choose the window size for zstd (like brotli's -w and --large_window parameters). What is the correct way to do that?
    12 replies | 831 view(s)
  • anormal's Avatar
    14th July 2019, 11:37
    Yes, I found that; I got the files from WOS, but found a newer version (2016) in the z88dk git. There is also ZX7b by "antoniovillena", https://github.com/antoniovillena/zx7b and Saukav ("saukav is a generalization of the zx7b algorithm"). It has a nice comparison table:
                   Shrinkler    deexov4     apcLib    BBuster    zx7mega     saukav     zx7bf2
    --------------------------------------------------------------------------
    lena1k          13757267     303436     176472     106746      95255      76547      81040
    lena16k        238317371    4407913    2964139    1908398    1727095    1646032    1462568
    lena32k        484967405    8443253    5846174    3651800    3300486    3231882    2803116
    alice1k         10060954     274111     132816      98914      89385      70869      73459
    alice16k       131592504    2973592    2132835    1812259    1614225    1338287    1328886
    alice32k       249719379    5378511    4152512    3614393    3230255    2550243    2654236
    128rom1k        13773150     249124     132763      82637      74110      60222      62000
    128rom16k      197319929    3571407    2295235    1550682    1407478    1392317    1180569
    128rom32k      394594060    7355277    4606385    3107867    2825773    1926027    2381847
    --------------------------------------------------------------------------
    routine size         245        201        197        168        244       ~200        191
    Anyway, thanks to the TurboBench author, it's a very nice and unique tool.
    150 replies | 36871 view(s)
  • Obama's Avatar
    14th July 2019, 07:35
    Thanks for your advice. The solution is in my mind; whether it can be done or not, I don't mind either way. I don't want to be famous or to be a pro or a creator or whatever. I made it for fun, and I never said it's a new thing. Ty
    60 replies | 2774 view(s)
  • Gotty's Avatar
    13th July 2019, 20:14
    Gotty replied to a thread paq8px in Data Compression
    From v169 -> v170 I have a strong suspicion: in v169b I accidentally broke segmentation, which Darek detected - mostly tarred files and mixed-content files lost out. I promised to come back with a patch soon. The patch is in my unreleased v169c version (it was at the time of the server failure). So the patch for the segmentation problem, plus some segmentation improvements, is coming back as v182. About the v170 -> v171 degradation: I don't know. I just know that v170 was an exceptionally good release (from Márcio), and v171 was a bit weaker (from a compression point of view).
    1637 replies | 470021 view(s)
  • Gotty's Avatar
    13th July 2019, 20:01
    Gotty replied to a thread paq8px in Data Compression
    Thank you, Mauro! Darek already sent me the file privately, but in the thread you provided, I found some extra files from the testset, so it was useful. Also got the 3 JPEG samples from Luca, and it turned out that the problem is not with introducing the probabilistic increment in the jpegmodel (which helps in case of Luca's files as well as in my testset). It is quite possible, that simply improving the normalmodel and (in case of L.PAK) the matchmodel is not in favor of multimedia files. It looks like multimedia content will need a specialized normalmodel/matchmodel.
    1637 replies | 470021 view(s)
  • hexagone's Avatar
    13th July 2019, 19:51
    Define "practical". Zstd and brotli are the most popular new open source compressors and are backed by big companies. However, it is not clear they are the best solutions for the Original Poster (x3 times slower than 7zip at compressing ? decompressing ? what kind of data?). I find Brotli's compression very slow at high compression ratios. Silesia i7-7700K @4.20GHz, 32GB RAM, Ubuntu 18.04 | Compressor | Encoding (sec) | Decoding (sec) | Size | |-----------------------------|-----------------|-----------------|------------------| |Original | | | 211,938,580 | |Gzip 1.6 | 6.0 | 1.0 | 68,227,965 | |Gzip 1.6 -9 | 14.3 | 1.0 | 67,631,990 | |Zstd 1.3.3 -13 | 11.9 | 0.3 | 58,789,182 | |Brotli 1.0.5 -9 | 94.3 | 1.4 | 56,289,305 | |Lzma 5.2.2 -3 | 24.3 | 2.4 | 55,743,540 | |Bzip2 1.0.6 -9 | 14.1 | 4.8 | 54,506,769 | |Zstd 1.3.3 -19 | 45.2 | 0.4 | 53,977,895 | |Lzma 5.2.2 -9 | 65.0 | 2.4 | 48,780,457 | For me MCM 0.83 and BSC are the codecs to beat (although MCM may be too slow for the OP) and zstd has the best decompression speed vs ratio.
    12 replies | 831 view(s)
  • Jyrki Alakuijala's Avatar
    13th July 2019, 16:52
    Isn't zpaq around 50x slower to decode than lzma? The original poster asked for no more than around 3x.
    12 replies | 831 view(s)
  • dnd's Avatar
    13th July 2019, 10:42
    Make your own benchmarks with your own data using the TurboBench Compression Benchmark, which includes lz77, bwt and context-mixing (zpaq) algorithms.
    12 replies | 831 view(s)
  • Mauro Vezzosi's Avatar
    13th July 2019, 10:13
    Mauro Vezzosi replied to a thread paq8px in Data Compression
    https://encode.ru/threads/2823-simv2-9m?p=54450&viewfull=1#post54450
    1637 replies | 470021 view(s)
  • Jyrki Alakuijala's Avatar
    13th July 2019, 01:38
    The two best practical open source compressors are brotli and zstd. Brotli for tasks that benefit from density or compression speed, and zstd for faster decompression speed.
    12 replies | 831 view(s)
  • maadjordan's Avatar
    12th July 2019, 21:47
    maadjordan replied to a thread 7-Zip in Data Compression
    INSTALLATION: The MFilter distribution package is a Zip file that contains two folders (named "32" and "64") holding the 32-bit and 64-bit versions of MFilter, respectively. To install MFilter, first create a folder named "Codecs" in the 7-Zip installation folder. Then copy the files from the "32" or "64" folder, depending on the 7-Zip edition that you are using (32-bit or 64-bit), into that "Codecs" folder - you will then have mfilter*.dll together with the "brunsli", "lepton" and "wavpack" sub-folders. After that, launch 7-Zip and add "f=MFilter" to the parameters line to activate it.
    511 replies | 282879 view(s)
  • moisesmcardona's Avatar
    12th July 2019, 18:59
    moisesmcardona replied to a thread 7-Zip in Data Compression
    Do I just need to copy/paste the mfilter plugin into C:\Program Files\7-zip and use f=mfilter in the parameters? I always get "The parameter is incorrect". Am I doing something wrong?
    511 replies | 282879 view(s)
  • comp1's Avatar
    12th July 2019, 06:40
    To add to Shelwien's answer: if speed is a concern, the fastest you are going to get at the compression levels of nanozip would be MCM 0.83. http://encode.ru/threads/2127-MCM-LZP?p=43220&viewfull=1#post43220 http://www.mattmahoney.net/dc/text.html#1449 Beyond that, you are stuck with significant speed drops for compressors like paq, NNCP, etc.
    12 replies | 831 view(s)
  • Gotty's Avatar
    11th July 2019, 22:12
    Gotty replied to a thread paq8px in Data Compression
    That is strange indeed. How can I get a copy of L.PAK?
    1637 replies | 470021 view(s)
  • Darek's Avatar
    11th July 2019, 19:35
    Darek replied to a thread paq8px in Data Compression
    Scores of paq8px v181 for my testset. In total there is some loss, however most files got gains. Quite fine for the biggest file K.WAD, and for Q.WK3 and H.EXE, but (which is a little strange) L.PAK got a 3 KB loss.
    1637 replies | 470021 view(s)
  • Shelwien's Avatar
    11th July 2019, 17:39
    NNs are not especially good for compression, and there are very few NN-based compressors (basically NNCP and lstm-compress, and a couple of others that include lstm-compress as a submodel). It's usually plain contextual statistics... the trick is actually not in the mathematics, but rather in the programming of it - there's a trade-off of precision/memory/speed, and it's pretty hard to balance it out.
    12 replies | 831 view(s)
  • AndrzejB's Avatar
    11th July 2019, 16:13
    drt|lpaq9m is quite fast. Do most of the best-compressing archivers use neural networks?
    12 replies | 831 view(s)
  • Shelwien's Avatar
    11th July 2019, 15:33
    Slower than 7-zip doing what? LZMA encoding is pretty slow (1-2 MB/s), and 7-zip includes ppmd too, so based on encoding speed most BWT/CM/PPM compressors would fit. I guess you can search http://www.mattmahoney.net/dc/text.html for your criteria. In any case, many of the best codecs (nanozip, rz and the rest of Christian's codecs, ppmonstr) are not open-source. But that doesn't mean there's nothing to learn from them.
    12 replies | 831 view(s)
  • Shelwien's Avatar
    11th July 2019, 15:20
    Shelwien replied to a thread LZ98 in Data Compression
    Huh, turns out, it was the standard Okumura LZSS, supported by quickbms. http://nishi.dreamhosters.com/u/unLZ98_v2.zip But I was right that I'd not be able to reverse-engineer the encoder from data :) (Replaced the encoder with quickbms version, now matches original compressed data after recompression).
    23 replies | 1000 view(s)
  • AndrzejB's Avatar
    11th July 2019, 15:13
    A ranking is needed which, in contrast to http://prize.hutter1.net/#contestants, would include only free open-source programs that are fast (up to 3x slower than 7zip, for example). What programs would be at the top of such a list?
    12 replies | 831 view(s)
  • Stephan Busch's Avatar
    11th July 2019, 14:56
    Stephan Busch replied to a thread 7-Zip in Data Compression
    Thank you very much, Aniskin. I am also getting good results compared to plain 7z and to the previous version of MFilter:
    7z+Mfilter 7-Zip 19.00 7z+Mfilter 03.07.2019 yx=9 lc=4 qs 11.07.2019 lc=4 qs f=MFilter:c128m lc=4 qs f=MFilter:c128m
    TEST_App             86.706.113      79.111.878      81.648.189
    TEST_Audio          543.409.743     439.380.767     350.624.377
    TEST_Camera         485.422.558     498.102.897     486.661.689
    TEST_Gutenberg      282.748.668     318.157.415     283.751.556
    TEST_Installer      576.875.494     577.054.231     577.162.757
    TEST_Mobile         459.865.345     487.243.206     460.539.369
    TEST_PGM/PPM        363.425.240     363.431.692     312.552.144
    TEST_Sources         35.947.168      36.108.484      35.867.802
    TEST_XML            197.279.352     197.277.833     197.279.352
    TTL               3.031.679.681   2.995.868.403   2.787.338.091
    You would get even more gain if your delta filter used the detection from CSArc (linked in my previous post).
    511 replies | 282879 view(s)
  • Gotty's Avatar
    11th July 2019, 14:48
    Gotty replied to a thread paq8px in Data Compression
    - Matchmodel changes:
      - fix: expectedByte could contain garbage
      - fix: delta mode turned on unnecessarily for the next byte (at bpos=0) when the last bit (bpos=7) of the previous byte did not match
      - enhancement: a new recovery mode: when a 1-byte mismatch occurs but the match continues, the match is recovered (if no new match is found in the meantime)
      - enhancement: a StationaryMap and a SmallStationaryContextMap are converted to a ContextMap (less memory, better prediction, except maybe for some images)
      - enhancement: number of mixer contexts: 8 instead of 256 (less memory, better mixing)
    - Removed experimental command line options: Fastmode (-f) (in favor of a new blocktype in a forthcoming version) and Force PNM detection in text (-p)
    - Reverted probabilistic increment to normal increment in JpegModel
    - Other cosmetic changes
    1637 replies | 470021 view(s)
  • maadjordan's Avatar
    11th July 2019, 13:06
    maadjordan replied to a thread 7-Zip in Data Compression
    I am getting good results with lepton, but brunsli is much faster.
    Scanned PDF file:
    original     88,605,518
    rar-max      82,774,439
    rar5-max     82,774,439
    7z ultra     82,125,082
    7z-brunsli   64,013,180
    7z-lepton    61,074,439
    stuffit 14   82,656,497 (StuffIt could not recognize the file type as PDF, nor even the JPEG streams)
    Nice work.. hope mp3 is on the way ;)
    511 replies | 282879 view(s)
  • Aniskin's Avatar
    11th July 2019, 04:22
    Aniskin replied to a thread 7-Zip in Data Compression
    New version of MFilter.
    - Brunsli is now the default compression method for JPEGs; lepton is still available with the a1 option.
    - WAV support - the wavpack lib is used.
    - BCJ filter for PE files.
    - .ppm and .pgm support.
    511 replies | 282879 view(s)
  • LucaBiondi's Avatar
    11th July 2019, 01:34
    LucaBiondi replied to a thread paq8px in Data Compression
    Just to note: also from v169 to v170 and from v170 to v171 we lose some KB...
    PROG_NAME PROG_VERSION COMMAND_LINE LEVEL INPUT_FILENAME ORIGINAL_SIZE_BYTES COMPRESSED_SIZE_BYTES RUNTIME_MS
    paq8px 166 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38661991 1113189
    paq8px 167 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38661995 1126918
    paq8px 168 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38661401 1119212
    paq8px 169 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38651612 1122916 (this is the best result)
    paq8px 170 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38653113 1145139
    paq8px 171 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655247 1153830
    paq8px 172 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655265 1153905
    paq8px 173 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655240 1163446
    paq8px 174 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655213 1177792
    paq8px 175 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655274 1183936
    paq8px 176 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655274 1168394
    paq8px 177 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655274 1158436
    paq8px 178 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655309 1200522
    paq8px 179 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38655305 1227193
    paq8px 180 @subjpglist.txt -9 -v -log paq8px_subjpeg.txt 9 subjpglist.txt 52691326 38658732 1188585
    Luca
    1637 replies | 470021 view(s)
  • Shelwien's Avatar
    11th July 2019, 01:17
    The link I posted has newer updates in ZX7 history (2016).
    150 replies | 36871 view(s)
  • LucaBiondi's Avatar
    11th July 2019, 01:16
    LucaBiondi replied to a thread paq8px in Data Compression
    Hi Gotty! Thank you. This is an example:
    paq8px_v179 2/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170821_19_14_04_Rich.jpg (6593924 bytes), Block segmentation: 0 | jpeg | 6593924 bytes, File size to encode: 4, File input size: 6593924, File compressed size: 4844853
    paq8px_v180 2/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170821_19_14_04_Rich.jpg (6593924 bytes), Block segmentation: 0 | jpeg | 6593924 bytes, File size to encode: 4, File input size: 6593924, File compressed size: 4845256
    --------------
    paq8px_v179 4/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170822_11_30_10_Rich.jpg (4723897 bytes), Block segmentation: 0 | jpeg | 4723897 bytes, File size to encode: 4, File input size: 4723897, File compressed size: 3482817
    paq8px_v180 4/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170822_11_30_10_Rich.jpg (4723897 bytes), Block segmentation: 0 | jpeg | 4723897 bytes, File size to encode: 4, File input size: 4723897, File compressed size: 3483114
    -------------
    paq8px_v179 10/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170824_12_48_24_Rich.jpg (6017677 bytes), Block segmentation: 0 | jpeg | 6017677 bytes, File size to encode: 4, File input size: 6017677, File compressed size: 4389287
    paq8px_v180 10/10 - Filename: ..\Varie\Jpeg_sub_testset\WP_20170824_12_48_24_Rich.jpg (6017677 bytes), Block segmentation: 0 | jpeg | 6017677 bytes, File size to encode: 4, File input size: 6017677, File compressed size: 4389813
    Where can I send these 3 jpeg files? Can you provide an email address? Thanks, Luca
    1637 replies | 470021 view(s)
  • Shelwien's Avatar
    11th July 2019, 01:15
    Shelwien replied to a thread LZ98 in Data Compression
    > If it fails, would having a memory dump of the unpacked blob help?
    Yes, we'd see if something decoded incorrectly.
    > I can possibly just change the boot program to not bother with compression or use something else (simpler).
    I had the same idea, but it won't fit without compression. Well, lzma can compress it 3x better, so we can use it when you'd want to install linux there or something :)
    23 replies | 1000 view(s)
  • Mauro Vezzosi's Avatar
    10th July 2019, 23:44
    > I've used .....050,2.0,0 as the option, as you wrote, but lstm-compress forced it to .... 050,2.0,1 - then it uses the proper value, I think.
    It's ok, I suppose that 0.WAV uses all 256 byte values and lstm-compress v3a forces "1" to save the 32 bytes of the "vocabulary" in the header.
    56 replies | 6535 view(s)
  • introspec's Avatar
    10th July 2019, 23:30
    There is no official repo for ZX7. The author has released the compressor via http://www.worldofspectrum.org/infoseekid.cgi?id=0027996
    150 replies | 36871 view(s)
  • Darek's Avatar
    10th July 2019, 22:48
    Darek replied to a thread lstm-compress in Data Compression
    Thanks, but there's something weird I had meant to ask you about before you wrote. I used .....050,2.0,0 as the option, as you wrote, but lstm-compress forced it to .... 050,2.0,1 - then it uses the proper value, I think. As in the attached screenshot.
    56 replies | 6535 view(s)
  • Mauro Vezzosi's Avatar
    10th July 2019, 22:28
    > How can I get your numbers with the lstm-compress v3a version?
    In my previous post I wrote 0 instead of 1 in the last option; they should be -c352,3,10,0.2,1,0.0250,0.9999,0.0033,0.000050,0.0250,0.9999,0.00000100,0.050,2.0,1 or -c352,3,10,0.2,1,0.0250,0.9999,0.0034,0.000050,0.0250,0.9999,0.00000100,0.050,2.0,1 (remove the spaces added by the forum software). They are the default options with cells=352, adam_alpha_lr=0.0033 or 0.0034, vocab_full=1.
    > I've started to test it and it looks veeery slow - 1% of 0.WAV in 5-7 min.
    lstm-compress is slower than NNCP. 0.WAV takes 7 min. and 30 sec. to show "3%" on my notebook.
    > Of course I have 6 sessions running simultaneously (5 sessions of NNCP and lstm) but it looks very time consuming.
    The more programs you run simultaneously, the more time they take (though not proportionally).
    56 replies | 6535 view(s)
  • Darek's Avatar
    10th July 2019, 21:01
    Darek replied to a thread lstm-compress in Data Compression
    @Mauro - another question - how is the speed of compression with your settings? I've started to test it and it looks veeery slow - 1% of 0.WAV in 5-7 min. Of course I have 6 sessions running simultaneously (5 sessions of NNCP and lstm), but it looks very time consuming.
    56 replies | 6535 view(s)
  • telengard's Avatar
    10th July 2019, 20:04
    telengard replied to a thread LZ98 in Data Compression
    That makes total sense to me. Good points. I'm wondering what I could do to add some clarity once I can test the newly compressed binary. If it fails, would having a memory dump of the unpacked blob help? One thing I hadn't mentioned, and this may be an option. I can possibly just change the boot program to not bother with compression or use something else (simpler). I really didn't want to modify the boot program (and I'm not really sure I can yet). But this may be an option if for some reason a compressor that works can't be created due to the lack of info. thanks! I'll report back once I have more to share. Thanks again for all of your help, very appreciated!
    23 replies | 1000 view(s)
  • Mauro Vezzosi's Avatar
    10th July 2019, 18:02
    > As I understand these scores for (5) are the results of, de facto, the lstm-compress v3a version.
    More or less, yes: it wasn't v3a, I changed the parameters in the source like we now do through the command line.
    > I'm trying to set your options in v3a and there is a slightly different set of options than you describe (maybe it's mostly a matter of names).
    In (5) I used the NNCP-style names to easily compare the NNCP and lstm-compress options. In v3a I used lstm-compress-style names; it's just a question of slightly different names.
    > How can I get your numbers with the lstm-compress v3a version?
    Approximately with -c352,3,10,0.2,1,0.0250,0.9999,0.0033,0.000050,0.0250,0.9999,0.00000100,0.050,2.0,0 or -c352,3,10,0.2,1,0.0250,0.9999,0.0034,0.000050,0.0250,0.9999,0.00000100,0.050,2.0,0 (remove the spaces added by the forum software). They are the default options with cells=352 and adam_alpha_lr=0.0033 or 0.0034.
    (5) option                   v3a option
    gradient_clipping=-/+2.0     grad_clip=2.0
    n_layer=3                    layers=3
    hidden_size=352              cells=352
    batch_size=10                horizon=10
    time_steps=10                used in NNCP; in lstm-compress it is always = batch_size
    n_symb=256                   vocab_full=1 (vocabulary size fixed to 256)
    ln=0                         used in NNCP; in lstm-compress it is always = 0
    fc=0                         used in NNCP; in lstm-compress it is always = 0
    sgd_opt=adam                 used in NNCP; in lstm-compress v3a it is always = adam
    lr=5.000e-002                learn_rate=0.050 (=5.000e-002)
    adam_lr=3.350e-003           adam_alpha_lr=0.0033 or 0.0034 (3.350e-003 is 0.00335, but adam_alpha_lr has a resolution of 4 decimals, not 5, so we cannot set exactly 0.00335)
    adam_beta1=0.025000          adam_beta1=0.0250
    adam_beta2=0.999900          adam_beta2=0.9999
    adam_eps=1.000e-006          adam_eps=0.00000100 (=1.000e-006)
    mem=62.412K                  memory used, viewed in Task Manager
    These options are not in (5) because they are not in NNCP; they are unchanged between (5) and v3a: init_range=0.2, seed=1, adam_alpha_t=0.000050, adam_beta1_t=adam_beta1, adam_beta2_t=adam_beta2
    56 replies | 6535 view(s)
  • Darek's Avatar
    10th July 2019, 16:29
    Darek replied to a thread lstm-compress in Data Compression
    Here is my first attempt to test the lstm-compress v3a version on my testset. In the table there are default scores (without parameter changes) for:
    - compression without preprocessing,
    - compression with dictionary preprocessing (-c english.dic), and
    - compression with pre-preprocessing (-s english.dic, then -c),
    plus the minimal scores of these three. All compared to my latest NNCP scores for the RC1 version (still in progress; I haven't finished the learning rate tests yet - 2-3 weeks to go... it's slow).
    But I have a question: @Mauro, in the NNCP post comparing lstm-compress with NNCP you posted a set of scores numbered (5) - here is the link: https://encode.ru/threads/3094-NNCP-Lossless-Data-Compression-with-Neural-Networks?p=60294&viewfull=1#post60294
    As I understand it, these scores for (5) are, de facto, the results of the lstm-compress v3a version. I'm trying to set your options in v3a and there is a slightly different set of options than you describe (maybe it's mostly a matter of names). How can I get your numbers with the lstm-compress v3a version?
    56 replies | 6535 view(s)
  • Gotty's Avatar
    10th July 2019, 08:07
    Gotty replied to a thread paq8px in Data Compression
    Thank you, Luca! There was a small change in the jpegmodel. Could you upload your jpeg testset somewhere so I can download it (or just the few files that degraded the most)? My testset didn't show a degradation, but it is not a very large testset.
    1637 replies | 470021 view(s)
  • hexagone's Avatar
    10th July 2019, 02:20
    Some encoding tests vs. ZCM v0.93
    Win 7, i7-2600 @ 3.4 GHz, 16 GB
    R:\>AcuTimer.exe zcmx64.exe a -m7 calgary.zcm calgary\*
    Archive is R:\calgary.zcm
    Compressed 3251493 bytes to 810362 bytes
    Elapsed Time: 00 00:00:03.902 (3.902 Seconds)
    r:\bin\kanzi.exe -c -i r:\calgary -f -b 50m -l 7
    Total encoding time: 3698 ms
    Total output size: 743760 bytes (Peak Memory is 1.1 GB)
    R:\>AcuTimer.exe zcmx64.exe a -m7 enwik8.zcm enwik8
    Archive is R:\enwik8.zcm
    Compressed 100000000 bytes to 19669596 bytes
    Elapsed Time: 00 00:00:38.090 (38.090 Seconds) (Peak Memory is 1 GB)
    R:\>r:\bin\kanzi.exe -c -i r:\enwik8 -f -b 100m -l 7
    Encoding r:\enwik8: 100000000 => 19597394 bytes in 36516 ms (Peak Memory is 1.5+ GB)
    R:\>AcuTimer.exe zcmx64.exe a -m7 silesia.zcm silesia\*
    Archive is R:\silesia.zcm
    Compressed 211938580 bytes to 41501382 bytes
    Elapsed Time: 00 00:01:09.900 (69.900 Seconds) (Peak Memory is 850MB)
    R:\>r:\bin\kanzi.exe -c -i r:\silesia -f -b 100m -l 7
    Total encoding time: 93953 ms
    Total output size: 41892099 bytes (Peak Memory is 1.6+ GB)
    R:\>r:\bin\kanzi.exe -c -i r:\silesia -f -b 100m -l 7 -j 2
    Total encoding time: 56561 ms
    Total output size: 41892099 bytes
    18 replies | 5596 view(s)
  • LucaBiondi's Avatar
    10th July 2019, 01:49
    LucaBiondi replied to a thread paq8px in Data Compression
    My big test case results - v179 vs. v180. Globally we lose 3239 bytes: we have a gain of 5935 bytes on the ISO file, but we lose 10966 bytes on the JPEG file (why?). We achieve new records for the MP4, BAK and EXE files. Thanks Gotty!!!! Luca
    1637 replies | 470021 view(s)
  • Shelwien's Avatar
    10th July 2019, 01:03
    Maybe this? (seems to be newest): https://github.com/z88dk/z88dk/tree/master/libsrc/_DEVELOPMENT/compress/zx7
    150 replies | 36871 view(s)
  • Shelwien's Avatar
    10th July 2019, 00:50
    Shelwien replied to a thread LZ98 in Data Compression
    > I ran dhex on the original compressed and on a version that was decompressed and then re-compressed. They seem to diverge quite quickly.
    LZ compression works by replacing repeated strings with references ({12-bit position; 4-bit length} in the LZ98 case). The problem is, there are multiple equally correct representations of the same data - for example, we can choose not to use string references ("matches") at all, or we can use any of several previous instances of the same string as the reference (only the match position would differ). Also, LZ matchfinding is usually implemented with some type of hashtable, so precisely reproducing the same compression can be hard. (A minimal decoder sketch of this reference format follows below.)
    > I take the original code and decompress, make some changes, re-compress and then the original boot code loads it and decompresses it.
    That's the only thing we can really do - test, and change something if it doesn't work. Well, unless the encoder function is also in there somewhere. For example, in the LZ98 header there's a suspicious binary field "00 01 00 00" - could anything special happen after decoding 64k of data?
    > I'll start with something easy like changing a string.
    Good luck! :)
    23 replies | 1000 view(s)
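    Since a later post in this thread identifies the format as the standard Okumura LZSS, here is a minimal Python decoder sketch of that reference scheme (flag bytes selecting literals vs. {12-bit position, 4-bit length} copies from a 4096-byte ring buffer). The constants and window preload follow Okumura's classic LZSS.C; the LZ98 container itself (decompressed-size field, the "00 01 00 00" field, any variant details) is not handled here.

      def lzss_decode(src, out_size):
          """Okumura-style LZSS: literals and {12-bit pos, 4-bit len} ring-buffer copies."""
          N, THRESHOLD = 4096, 2
          ring = bytearray(b" " * N)          # Okumura preloads the window with spaces
          r = N - 18                          # initial write position (N - F)
          out, pos, flags = bytearray(), 0, 0
          while len(out) < out_size and pos < len(src):
              flags >>= 1
              if not flags & 0x100:           # fetch a new flag byte (8 items per byte)
                  flags = src[pos] | 0xFF00; pos += 1
              if flags & 1:                   # flag bit 1 = literal byte
                  c = src[pos]; pos += 1
                  out.append(c); ring[r] = c; r = (r + 1) % N
              else:                           # flag bit 0 = reference: 12-bit pos, 4-bit len
                  lo, hi = src[pos], src[pos + 1]; pos += 2
                  match_pos = lo | ((hi & 0xF0) << 4)
                  match_len = (hi & 0x0F) + THRESHOLD + 1
                  for k in range(match_len):
                      c = ring[(match_pos + k) % N]
                      out.append(c); ring[r] = c; r = (r + 1) % N
          return bytes(out)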
  • Shelwien's Avatar
    10th July 2019, 00:24
    Shelwien replied to a thread LSTM and cmix in Data Compression
    > For now I think about two models
    You can see the models here: https://github.com/hxim/paq8px/blob/master/paq8px.cpp#L10602
    > text like Wiki dump.
    It's actually not really text - there's too much XML, wiki markup, HTML and bibliography.
    > is any faster method to determine best model than trying all 100 and slow down 100 times?
    It's possible to reach similar results with optimal parsing and model switching (slower encoding, much faster decoding), or by writing very specific data transformations for known syntax. But the complexity is high enough even without that - mixing has low redundancy and is much simpler to implement (a tiny mixing sketch follows below). Of course there are speed-optimization tricks even for mixing - for example, NNCP splits the data into multiple independent bit streams (for multithreaded compression) and paq8px detects data types and enables only the relevant models. But rather than improving speed, these optimizations are commonly used to improve compression while keeping the same speed - since prediction quality is still limited by hardware.
    4 replies | 219 view(s)
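    For readers unfamiliar with "mixing": below is a toy Python sketch of the logistic mixing step used in PAQ-style context-mixing coders, which is the kind of mixing referred to above. Predictions from several models are combined in the logit domain and the weights are nudged by the coding error after each bit. The class layout, learning rate and probability clamping are illustrative only; real coders use fixed-point arithmetic and context-selected weight sets.

      import math

      def stretch(p):   # logit
          return math.log(p / (1.0 - p))

      def squash(x):    # inverse logit
          return 1.0 / (1.0 + math.exp(-x))

      class Mixer:
          """Combine n per-model bit probabilities; learn the weights online."""
          def __init__(self, n, lr=0.002):
              self.w = [0.0] * n
              self.lr = lr
              self.st = [0.0] * n

          def mix(self, probs):
              # clamp, stretch, weight and squash the model predictions
              self.st = [stretch(min(max(p, 1e-6), 1 - 1e-6)) for p in probs]
              return squash(sum(w * s for w, s in zip(self.w, self.st)))

          def update(self, p, bit):            # bit is 0 or 1, p is mix()'s output
              err = bit - p
              self.w = [w + self.lr * err * s for w, s in zip(self.w, self.st)]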
  • Matt Mahoney's Avatar
    9th July 2019, 23:41
    The race continues. http://mattmahoney.net/dc/text.html :)
    77 replies | 19428 view(s)
  • AndrzejB's Avatar
    9th July 2019, 22:42
    AndrzejB replied to a thread LSTM and cmix in Data Compression
    For now I am thinking about two models: random, and text like a Wiki dump. If there are about 100 models, is there any faster statistical method (for example Bayesian) to determine the best model than trying all 100 and slowing down 100 times?
    4 replies | 219 view(s)
  • WinnieW's Avatar
    9th July 2019, 21:35
    I asked myself why Telegram picked the old deflate method instead of a modern algorithm, like Brotli.
    8 replies | 477 view(s)
  • telengard's Avatar
    9th July 2019, 17:58
    telengard replied to a thread LZ98 in Data Compression
    Thanks for the tip on Ghidra. It does a good job; I just need to manually mark most things, but once I do that, barring a few instructions, it disassembles (and decompiles!) quite well. I compiled both of your programs on my Linux box and they work well, thank you! I ran dhex on the original compressed file and on a version that was decompressed and then re-compressed; they seem to diverge quite quickly. I don't know the theory going on here, but can the following work? I take the original code and decompress it, make some changes, re-compress, and then the original boot code loads it and decompresses it. I will hopefully be able to try this for real within the next couple of weeks. I have a 2nd machine coming and I just need to solder on a debug header, map the pins, and get the software going. I want to have that working before I attempt any new changes. I'll start with something easy like changing a string. :) I just don't want to brick an $800 machine and not be able to recover.
    23 replies | 1000 view(s)
  • CompressMaster's Avatar
    9th July 2019, 17:28
    1.100K 2.200K 3.900K 4.5000K
    157 replies | 72325 view(s)
  • schnaader's Avatar
    9th July 2019, 16:41
    Yes, paq8p was only used to provide a kind of lower limit; it's much too slow, of course. Nice to see that brotli performs well here. The .tgs format is indeed gzipped Lottie; here's a GitLab repository that has some reverse-engineering results and Python scripts. The lottie2tgs script is basically a gzip invocation (see the sketch below). The repository also contains a Synfig Studio (open-source 2D animation software) plugin, so Adobe After Effects is not a requirement anymore (though I haven't tried whether it works as it should).
    8 replies | 477 view(s)
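    As a toy illustration of the "basically a gzip invocation" remark above, the conversion can be sketched in a few lines of Python; note that real Telegram stickers also impose Lottie feature and size limits that are not checked here, and the function names are made up for the example.

      import gzip, shutil

      def lottie_to_tgs(json_path, tgs_path):
          # .tgs is (as described above) just a gzipped Lottie JSON
          with open(json_path, "rb") as src, gzip.open(tgs_path, "wb") as dst:
              shutil.copyfileobj(src, dst)

      def tgs_to_lottie(tgs_path, json_path):
          with gzip.open(tgs_path, "rb") as src, open(json_path, "wb") as dst:
              shutil.copyfileobj(src, dst)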
  • anormal's Avatar
    9th July 2019, 13:43
    Does anyone know if ZX7 has an official repo somewhere?
    150 replies | 36871 view(s)
  • anormal's Avatar
    9th July 2019, 13:17
    Hi trixter! If anyone needs a newer version of this compilation, just ask me - I am the author; whenever I find new DOS stuff, I sort it and store it here. Regards
    3 replies | 463 view(s)
  • dnd's Avatar
    9th July 2019, 12:43
    "Bidirectional Text Compression in External Memory" Software: TU DOrtmund lossless COMPression framework
    71 replies | 9263 view(s)