Activity Stream

  • nikkho's Avatar
    Today, 12:42
    nikkho replied to a thread FileOptimizer in Data Compression
    Thanks. For jpeg-recompress, I am not able to find regular Win32 builds, so there is nothing I can do. As for mutool and guetzli, do you know of any command line tool for automating the patching? I do not want to manually hex-edit them, since the plugins are updated frequently. Are you aware of any other cwebp/dwebp binaries I can use?
    361 replies | 120952 view(s)
  • nikkho's Avatar
    Today, 12:37
    nikkho replied to a thread FileOptimizer in Data Compression
    Thanks. When no downsampling is selected, I am passing to Ghostscript: -dDownsampleColorImages=false -dDownsampleGrayImages=false -dDownsampleMonoImages=false. This should work, but if not, I will disable Ghostscript in that scenario.
    361 replies | 120952 view(s)
  • Lucas's Avatar
    Today, 11:31
    So since it's mainly a memory throughput limitation couldn't we still be able to achieve even higher speeds by transferring the workload over to an area with even faster memory, as in the GPU? I imagine since GDDR5 bandwidth is anywhere from 4 to 10 times greater than the DDR3 memory in most machines that it's still possible to squeeze even more speed out of this algorithm.
    11 replies | 510 view(s)
  • Jaff's Avatar
    Yesterday, 22:42
    Jaff replied to a thread FileOptimizer in Data Compression
    These files need to be patched to work under Windows XP (32 bit):
    mutool.exe (Required OS/Subsystem version 6.00/6.00 -> 5.00/5.00) @ hex offset #150 / #158
    guetzli.exe (Required OS/Subsystem version 6.00/6.00 -> 5.00/5.00) @ hex offset #150 / #158
    Just replace #06 with #05.
    cwebp.exe and dwebp.exe have kernel32.dll dependencies that are not present in the WinXP version. They run under Windows 7 (32 bit).
    jpeg-recompress.exe is a win64 binary file.
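    Since nikkho asked above for a way to automate this: a minimal sketch of a command-line patcher in C++. The 0x150/0x158 offsets are specific to the builds Jaff lists, and the tool only flips a 0x06 byte to 0x05; a more robust version would locate the MajorOperatingSystemVersion/MajorSubsystemVersion fields by parsing the PE optional header instead. The name (patchver) is made up.
        // patchver.cpp - set the required OS/Subsystem version bytes from 6 to 5
        // at the given hex offsets, as described in the post above.
        #include <cstdio>
        #include <cstdlib>

        int main(int argc, char** argv)
        {
            if (argc < 3) { std::printf("usage: patchver file.exe hexoffset...\n"); return 1; }
            std::FILE* f = std::fopen(argv[1], "r+b");
            if (!f) { std::perror("fopen"); return 1; }
            for (int i = 2; i < argc; i++) {
                long off = std::strtol(argv[i], nullptr, 16);    // offsets are given in hex
                unsigned char b = 0;
                std::fseek(f, off, SEEK_SET);
                if (std::fread(&b, 1, 1, f) == 1 && b == 0x06) {
                    b = 0x05;
                    std::fseek(f, off, SEEK_SET);                // reposition before the write
                    std::fwrite(&b, 1, 1, f);
                    std::printf("%s: 0x%lX patched 06 -> 05\n", argv[1], off);
                } else {
                    std::printf("%s: 0x%lX left unchanged\n", argv[1], off);
                }
            }
            std::fclose(f);
            return 0;
        }
    Something like "patchver mutool.exe 150 158" could then be run automatically after each plugin update.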
    361 replies | 120952 view(s)
  • JamesB's Avatar
    Yesterday, 20:34
    Ah yes, I see what you mean. There are 8 disparate writes, but always marching up in a linear fashion, so basically it's just updating the same 8 cache lines most of the time and this has little gain. Looks like it is indeed just memory I/O bound. I tried to do some hinting with _mm_prefetch (NTA etc.), but got the usual outcome of it not helping. I've occasionally had luck with prefetch, but it seems very rare. It also only reduces latency and can't solve bandwidth issues, I think.
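    For reference, a minimal sketch of the kind of hinting being described (assuming the Jampack-style p[8] cursor array; the names are illustrative). As noted above, in a pointer-chasing loop the next address is only known once the current load returns, which is why such hints rarely help here.
        #include <xmmintrin.h>   // _mm_prefetch, _MM_HINT_NTA
        #include <stdint.h>

        // Hint the next MAP entries for all 8 streams; NTA asks the CPU not to
        // pollute the cache, since each entry is read only once.
        static void prefetch_cursors(const uint32_t* MAP, const uint32_t* p)
        {
            for (int k = 0; k < 8; k++)
                _mm_prefetch((const char*)&MAP[p[k] - 1], _MM_HINT_NTA);
        }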
    11 replies | 510 view(s)
  • JamesB's Avatar
    Yesterday, 20:29
    For single lines only. All lines get pasted together, even in a code block, breaking formatting. I guess this is an update gone awry and will fix itself with a new update soon. :-) Such is life! Edit: no matter! I see it already fixed itself and was purely a display issue. Hurrah to the admins. Thank you.
    11 replies | 510 view(s)
  • webmaster's Avatar
    Yesterday, 18:43
    fixed thx
    4 replies | 213 view(s)
  • mpais's Avatar
    Yesterday, 17:31
    > Do you have a match model for images, something like "motion compensation" in videos?
    I planned on trying it with the 8bpp color-palette model, on the idea that it may help with dithering patterns. Something like a 2D match model, or, as you describe it, applying motion-compensation pattern matching techniques but using them on previously seen pixel data instead of previous frames. I also have code, currently on hold, for that model to look for symmetries, on the idea that in non-photographic color-palette images (such as icons, cartoons, etc.) there is often a lot of symmetry that a simple predictive model based on pixel neighborhoods may miss, but a mirrored match along the symmetry axis can provide important data.
    > The plan is to make a specialized png recompressor based on reflate.
    I had a pretty good sketch of how I'd do the PNG model, combining ideas from the JPEG and GIF models. But since some types of data would require pre-processing (EMMA is purely a streaming compressor, with the lowest latency possible), sooner or later I'd have to do it in EMMA, and since there is already a lot of research on that, I decided it wasn't worth the effort. If successful, that would allow me to use my existing models, which provide much better compression. The LZW encoding is simpler and yet it proved quite hard to predict the codewords, and I made the erroneous assumption that GIF encoders respected the LZW algorithm, which led to problems with the model rejecting some GIFs when it thought they were corrupted. So I dread the amount of special corner cases that I'd have to account for with PNG...
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    Yesterday, 17:01
    > See post #286 Ok, so I missed it :) >I can post the converter and SSE coder if you want (with C++ source). http://nishi.dreamhosters.com/u/SSE_v0.rar Mainly sh_SSE1.inc and MOD/* (these are generated though) > This model handles a lot of raw formats, TIFFs, DICOM images, etc. That's pretty cool, quite a long list. Do you have a match model for images, something like "motion compensation" in videos? > And obviously there are the JPEG and GIF models, Did you see that btw - http://nishi.dreamhosters.com/u/001.avi Its a10.jpg from SFC, converted FFT coef per frame video. > That is why I never made a PNG model, it would be really complex > and give mediocre results when compared to something like your Reflate. Reflate is actually pretty bad at handling pngs atm. That is, it generates too much diffs because of 4k winsize, etc. The plan is to make a specialized png recompressor based on reflate. Ideally, it would output a bmp image + diff data, but that still requires writing a recompressor for png's delta modes.
    294 replies | 51842 view(s)
  • Bulat Ziganshin's Avatar
    Yesterday, 16:40
    As an encore - now try to answer me with quoting, in bold, and write a two-paragraph answer :D
    4 replies | 213 view(s)
  • Bulat Ziganshin's Avatar
    Yesterday, 16:34
    > Do you get a benefit from splitting the next assignment step in 2 too? (Ie read from BWT into a temporary local array and then copy that into T in a second step.)
    Each of the 8 output segments is written sequentially, so it's already fine.
    PS: Yeah, I can't answer with quoting, and in Advanced answer mode there are no formatting buttons at all. Also, it's impossible to insert a newline. Fortunately, all smilies are alive, so the forum kept its most important piece of functionality :_biglol2:
    11 replies | 510 view(s)
  • mpais's Avatar
    Yesterday, 16:03
    > Well, actually yours was wrong, because it wasn't what you described in https://encode.ru/threads/?p=51533&pp=1
    See post #286 -> 7,P27,..P10,P20] then? 33x expansion (16 bits for predictions, only 12 used)
    > I can post the converter and SSE coder if you want (with C++ source).
    Sure, thanks.
    > 1. Did you try mod_ppmd with non-text files?
    Yes, but on images EMMA shuts down all other models except for the match model. On most non-textual files the gain is smaller but still interesting.
    > 2. Are you planning to use color conversions for 24bpp model?
    EMMA already has an optional color transformation; I've tried several actually, and might make it user selectable.
    > So I think that you'd gain more by making eg. different models for photos and digital pictures (3D renders etc) than by trying to make universal models per color depth.
    EMMA already does that. On any 8bpp image that it recognizes, it parses the palette to see if it represents a grayscale image. So for 8bpp I have a model for grayscale and another for color-palette images. Even then, my grayscale model was designed to handle both natural and artificial images (see the results listed for the imagecompression.info testset). The same goes for the 24bpp model, and by extension the 32bpp one, which can handle the 4th channel as pixel data or transparency. I also have models for 4bpp and 1bpp, which are quite useful for executables, since they handle many resources. I then have a generic image model, which handles 1 to 16bpp images, in 1 to 4 channels, bitpacked or not, with control bytes or not, and with a variable pixel layout, to accommodate different CFA layouts from digital cameras. This model handles a lot of raw formats, TIFFs, DICOM images, etc. I then have specific models for some popular raw formats, such as ARW from Sony and RW2 from Panasonic. And obviously there are the JPEG and GIF models, though the latter gives sub-par results since it doesn't decompress the LZW encoded pixel data; it works by predicting the LZW indexes, so that it doesn't fail for some images like the transform in paq8 does, but the achievable ratios suffer a lot. That is why I never made a PNG model: it would be really complex and give mediocre results when compared to something like your Reflate.
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    Yesterday, 15:37
    It could be related to https move. Anyway, bbcode still works. test test
    11 replies | 510 view(s)
  • Shelwien's Avatar
    Yesterday, 15:18
    > On a quick check, your struct is wrong. Well, actually yours was wrong, because it wasn't what you described in https://encode.ru/threads/?p=51533&pp=1 :) But thanks to the new info it worked. > The probabilities are interleaved . My SSE improves it from 176,966 to 176,738 atm. I can post the converter and SSE coder if you want (with C++ source). > I guess I'll also have to improve the text model, my models weren't written > with maximum compression in mind. Note that pxd18 result that I mentioned is the one with dynamic dictionary. When external drt is used (with its dictionary), it becomes 16387097 or so. > On an unrelated note, I've run some quick tests by plugging parts of the > new grayscale model into the 24bpp image model and I'm getting up to 1% > gains, so it seems worth it to write a new 24bpp model. Now I just need to > find the time to do it.. 1. Did you try mod_ppmd with non-text files? 2. Are you planning to use color conversions for 24bpp model? It could be neat to implement a lossless version of HSV for that. 3. There're actually multiple image types, independently from color model etc. Even photos converted from raws are different from photos converted from jpeg. Also, grayscale model is basically the same model that can be used for intensity channel for 24bpp images (with color conversion). So I think that you'd gain more by making eg. different models for photos and digital pictures (3D renders etc) than by trying to make universal models per color depth.
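    As a concrete example of the kind of lossless (exactly invertible) color conversion mentioned here - not HSV, just the well-known YCoCg-R lifting transform, and not what EMMA or mod_ppmd actually use - a minimal sketch:
        // Reversible YCoCg-R: integer lifting, so the inverse recovers R,G,B exactly.
        #include <cstdint>

        struct YCoCg { int32_t Y, Co, Cg; };

        static YCoCg rgb_to_ycocg_r(int32_t R, int32_t G, int32_t B)
        {
            int32_t Co = R - B;
            int32_t t  = B + (Co >> 1);   // arithmetic shift, matched exactly by the inverse
            int32_t Cg = G - t;
            int32_t Y  = t + (Cg >> 1);
            return { Y, Co, Cg };
        }

        static void ycocg_r_to_rgb(YCoCg c, int32_t& R, int32_t& G, int32_t& B)
        {
            int32_t t = c.Y - (c.Cg >> 1);
            G = c.Cg + t;
            B = t - (c.Co >> 1);
            R = B + c.Co;
        }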
    294 replies | 51842 view(s)
  • JamesB's Avatar
    Yesterday, 14:28
    Meh! How do I post code now? Has this forum changed? I couldn't find any way of doing raw quoting. Odd.
    11 replies | 510 view(s)
  • JamesB's Avatar
    Yesterday, 14:27
    I also wondered if it's possible to avoid so many scatter style operations by buffering up data before distributing it. I've had success with this before, but it didn't seem to help here. Eg byte X; int i; for (i = 0; i != step & ~7; i+=8) { // decode 8 symbols at once for (int j = 0; j < 1; j++) { p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; X = BWT - (p >= idx)]; } for (int j = 0; j < 8; j++) memcpy(&T], &X, 8); //*(uint64_t *)&T] = *(uint64_t *)&X; } for (;i != step; i++) { // decode 8 symbols at once p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; p = MAP - 1]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; T] = BWT - (p >= idx)]; }
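    Since the forum stripped the bracketed indices from the snippet above, here is a self-contained sketch of the same buffering idea, assuming the Jampack-style layout (8 cursors p[0..7] into MAP, output split into 8 segments of step bytes each, idx the primary index); the N % 8 tail would be handled separately, as in the original.
        #include <stdint.h>

        // Decode 8 independent chains, buffering the 8 symbols of each round in X[]
        // before scattering them to the 8 output segments of T.
        void unbwt8_buffered(const uint32_t* MAP, const uint8_t* BWT, uint8_t* T,
                             uint32_t* p, uint32_t idx, uint32_t step)
        {
            for (uint32_t i = 0; i < step; i++) {
                uint8_t X[8];
                for (int k = 0; k < 8; k++) {
                    p[k] = MAP[p[k] - 1];                  // follow the k-th cursor
                    X[k] = BWT[p[k] - (p[k] >= idx)];      // fetch its next symbol
                }
                for (int k = 0; k < 8; k++)                // buffered scatter
                    T[k * step + i] = X[k];
            }
        }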
    11 replies | 510 view(s)
  • Shelwien's Avatar
    Yesterday, 13:54
    Shelwien replied to a thread 7-Zip in Data Compression
    There's nothing special, its just an icl.exe wrapper. I just have a few with paths to different lib versions etc. Note that icl.cfg has include paths via /I.
    @echo off
    SET LIB=%LIB%;C:\IntelH0721\lib\intel64;C:\MSVS10\VC\lib\amd64;C:\PROGRA~2\MICROS~1\Windows\v7.0A\Lib\x64\;
    set path=C:\VC2015\bin\amd64\;C:\IntelH0721\bin\intel64;C:\IntelH0721\binutils64\bin\;
    C:\IntelH0721\bin-intel64\icl.exe %*
    349 replies | 227695 view(s)
  • JamesB's Avatar
    Yesterday, 12:48
    I'm guessing the main speed benefit here is that you have interleaved code such that you don't need the result of p = MAP - 1] until 7 other similar instructions have executed, removing waits on memory fetches. Nice work. Do you get a benefit from splitting the next assignment step in 2 too? (Ie read from BWT into a temporary local array and then copy that into T in a second step.) Even prefetches may not help if ultimately it is throughput limited, but you can try things like perf or iaca to see how much waiting is going on; likely little now. The only way I can think of to improve a memory throughput limit is to get more memory accesses hitting cache. With BWT that's a real challenge and I don't see how it works short of some preprocessing step.
    11 replies | 510 view(s)
  • necros's Avatar
    Yesterday, 12:00
    necros replied to a thread 7-Zip in Data Compression
    pls post your icl, icl2a bats 8-)
    349 replies | 227695 view(s)
  • mpais's Avatar
    Yesterday, 11:09
    On a quick check, your struct is wrong. The probabilities are interleaved . On enwik8.drt I'm at 16.675.959, so >1% gain from mod_ppmd. I guess I'll also have to improve the text model, my models weren't written with maximum compression in mind. On an unrelated note, I've run some quick tests by plugging parts of the new grayscale model into the 24bpp image model and I'm getting up to 1% gains, so it seems worth it to write a new 24bpp model. Now I just need to find the time to do it..
    294 replies | 51842 view(s)
  • nikkho's Avatar
    Yesterday, 11:06
    nikkho replied to a thread FileOptimizer in Data Compression
    It requires C++ Builder 10 or later (not CodeBlocks).
    361 replies | 120952 view(s)
  • nikkho's Avatar
    Yesterday, 11:02
    It is neither a PKPAK 3.61 nor a PAK 2.51 archive.
    2 replies | 95 view(s)
  • hexagone's Avatar
    Yesterday, 09:41
    Ported the code to C++ with targets for VS 2008 (win32), VS 2015 (win64) and g++/Linux. Here: https://github.com/flanglet/kanzi
    3 replies | 2360 view(s)
  • hexagone's Avatar
    Yesterday, 08:33
    Not one moment too late ... found the VS project I used back then.
    36 replies | 19851 view(s)
  • Shelwien's Avatar
    Yesterday, 08:29
    Shelwien replied to a thread 7-Zip in Data Compression
    http://nishi.dreamhosters.com/u/7zdll_1604_src.rar
    349 replies | 227695 view(s)
  • necros's Avatar
    Yesterday, 08:19
    necros replied to a thread 7-Zip in Data Compression
    can you give exact command lines used to build pls?
    349 replies | 227695 view(s)
  • Shelwien's Avatar
    Yesterday, 06:18
    > https://goo.gl/pIuYpP Uh, can you tell what I'm doing wrong? http://pastebin.com/3XCTXEf7 I'm getting clen=2914871.815 bytes from this script (run it as "coder book1.log book1.pd"). Symbols are correct, but probabilities are somehow wrong? Can you maybe give me final probabilities that go to rangecoder? As to order,mpc,nummasked - I guess its an idea, as I can provide these easily enough, same for the whole byte though. However some of the others don't really make sense for ppmd. P.S. Current SSE improvement for pxd18 - 16,516,347 -> 16,502,512 (book1 opt) -> 16,492,547 (enwik8 opt)
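    As an aside, a probability log like this can be sanity-checked by summing the ideal code length directly. A rough sketch, assuming (this is only an assumption about the format) one "bit p1" pair per line, with p1 the probability of a '1' in 16-bit fixed point:
        #include <cmath>
        #include <cstdio>

        // Ideal code length of a logged bit stream: sum of -log2(P(actual bit)).
        int main(int argc, char** argv)
        {
            if (argc < 2) { std::printf("usage: clen log.txt\n"); return 1; }
            std::FILE* f = std::fopen(argv[1], "r");
            if (!f) { std::perror("fopen"); return 1; }
            int bit; unsigned p1; double bits = 0;
            while (std::fscanf(f, "%d %u", &bit, &p1) == 2) {
                double p = (p1 + 0.5) / 65536.0;           // keep p away from 0 and 1
                bits += -std::log2(bit ? p : 1.0 - p);
            }
            std::fclose(f);
            std::printf("clen = %.3f bytes\n", bits / 8.0);
            return 0;
        }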
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    Yesterday, 05:14
    I'd suggest looking at http://forum.xentax.com/viewtopic.php?f=13&t=4450 or around. Its really hard to understand an unknown binary format from a few compressed files, so unless there's an existing unpacker somewhere, it would involve reverse-engineering the decoder etc. There're kinda different forums for that.
    2 replies | 95 view(s)
  • theruler's Avatar
    Yesterday, 05:06
    Hi mates, I am working on a translation project that aims to translate the graphics as well, so I need to decode/encode an old raw image file from a game called "Kings of the Beach" by Electronic Arts (1988). Any help would be much appreciated. Thanks in advance. Stefano
    2 replies | 95 view(s)
  • Shelwien's Avatar
    Yesterday, 04:30
    Shelwien replied to a thread 7-Zip in Data Compression
    I tried both. Compiles with both, but there're problems with codec/format enumeration when gcc is used, because 7-zip does it via global struct initializers, and it seems that gcc optimizer eliminates some of these. My method is very simple - I made a list with .c/.cpp files in it and feed it to a console compiler via @list. I guess its possible to generate a clean makefile from such a list. There's a quirk though - you are supposed to build it starting from CPP\7zip\Bundles\Format7zF - paths in many #include directives go up, like this: #include "../../7z.h" Also there're some defines: /DNDEBUG /DWIN32 /D_WIN32 /D_WINDOWS /D_MBCS /D_USRDLL /DMY7Z_EXPORTS /DNO_REGISTRY /D_7ZIP_LARGE_PAGES /DLZMA_LOG_BSR
    349 replies | 227695 view(s)
  • necros's Avatar
    Yesterday, 01:04
    necros replied to a thread 7-Zip in Data Compression
    Has anyone tried to compile 7z with the Intel compiler or MinGW (clang is also interesting)? A guide would be great.
    349 replies | 227695 view(s)
  • necros's Avatar
    Yesterday, 01:01
    necros replied to a thread FileOptimizer in Data Compression
    We discussed it some posts ago; no true solution was found. BTW my question was closed, but the issue remains 8-) https://github.com/coherentgraphics/cpdf-binaries/issues/12
    361 replies | 120952 view(s)
  • webmaster's Avatar
    Yesterday, 00:06
    Done https://encode.ru/
    4 replies | 213 view(s)
  • Lucas's Avatar
    22nd January 2017, 23:46
    Thank you Bulat and Christoph! On my desktop (i5-4690K@4.4GHz) its decoding speed is about 115 MB/s on enwik8 with a 16MB block using all threads, so it's not far off from your observation; but that's without the entropy stage, since it's not the fastest ANS coder in the world. I'll keep working on it.
    11 replies | 510 view(s)
  • Christoph Diegelmann's Avatar
    22nd January 2017, 23:33
    First off: Good job ! I've already done some research in this direction (for my compressor bce but currently I focus on improving libdivsufsort and bce itself) and think I've seen this before somewhere. I've got a folder full of papers on the inverse bwt but sadly I can't find it currently. The papers "Cache Friendly Burrows-Wheeler Inversion" and "Slashing the Time for BWT Inversion" might be a good start. I think they (or another paper) were able to achieve 2 times speed up by decoding 2 chars at a time by calculating MAP - 1] - 1] ahead of time in a more linear way. This might be possible for your version, too. Keep us updated !
    11 replies | 510 view(s)
  • Bulat Ziganshin's Avatar
    22nd January 2017, 22:32
    My greatest congratulations! It's not every day we see a 4x speedup in one of the key compression algorithms!!!
    Now why it's "only" 4x. The speed of ordinary unbwt is limited not by cpu cores, but by memory itself. On each step you have to request an almost random memory cell, so it can't be cached, and each byte decoded means 1 memory access. And you need these data to arrive in order to know the address for the next access. So you are limited by the memory latency, which is usually 50-70 ns - hence the 15-20 MB/s decoding speed limit. Your genius code traverses 8 indexes simultaneously; this means that in the same 50-70 ns you can decode 8 bytes, and the overall speed may be 120-160 MB/s. Well, only in the case when memory has large enough throughput. But it's limited too. My own measurements on a Sandy Bridge 2600K were that with multiple simultaneous memory accesses you can read about 50-100 MB/s. I don't remember the exact numbers, but probably the speedup was in the range of 4-6x using only a single core, and 6-8x using multiple cores. This means that you can get a bit better performance, maybe 1.5x extra speedup, by splitting it between multiple threads.
    The limit has nothing to do with compiler optimizations. It's easy to see that you still perform only ~1 arithmetic operation per 10 cpu cycles (while up to 4 per cycle may be executed). So, no - SIMD is absolutely useless here, since you utilize even the scalar ALUs by less than 10%. Finally - modern CPUs execute commands speculatively, limited much more by dependencies than by the original command order. In fact, there is a window of the last 100-200 commands not yet executed, and these commands are executed once their data arrive. And this window is large enough to keep several last loops of your cycle.
    The only way I know to speed up unbwt beyond this point is to use lzp/dictionary preprocessors. In particular, we can encode words as 16-bit entities and perform a 16-bit BWT, so each memory access will read 2 useful bytes and give us the entire word number or LZP length.
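    A quick sanity check of those figures, assuming ~60 ns average latency (the middle of the quoted 50-70 ns range):
        1 cursor:  1 byte / 60 ns  = ~16.7 MB/s  (the usual 15-20 MB/s unbwt range)
        8 cursors: 8 bytes / 60 ns = ~133 MB/s   (the predicted 120-160 MB/s ceiling)
    which is roughly where the 115 MB/s measured above (without the entropy stage) lands.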
    11 replies | 510 view(s)
  • Lucas's Avatar
    22nd January 2017, 21:25
    Hi guys, so I have a decoding loop for BWT and was wondering if there's a way to make it vectorized among N parallel units. Currently it decodes 8 symbols at a time. https://github.com/loxxous/Jampack/blob/master/bwt.cpp#L72 I was wondering if there's a way to break the loop's dependencies such that it can just decode N streams independently and simultaneously until they've each processed 'step' symbols, instead of waiting on each iteration of the loop. In my experiments, adding more streams improves speed to a certain degree: two decoding streams (indexes) usually give a 2x speedup compared to using a single BWT index, but with 8 parallel streams it appears to cap out at a 3-4x speed increase. I'm doubtful any compiler will notice they are 8 non-overlapping streams, so I'm pretty sure it's not getting optimized as well as it could be. Any advice is appreciated.
    11 replies | 510 view(s)
  • thorfdbg's Avatar
    22nd January 2017, 19:10
    Sure, go to: http://hdrvdp.sourceforge.net/wiki/ Note, however, that VDP 2.2 is matlab, so you need a matlab installation for linux.
    31 replies | 4369 view(s)
  • olavrb's Avatar
    22nd January 2017, 17:08
    olavrb replied to a thread FileOptimizer in Data Compression
    Lossless PDF compression with FileOptimizer ain't lossless. Using v9.50: - Options tab | Optimization level: 9 - PDF tab | Profile: None, no downsampling Before (left) and after comparison is attached. Edit: If "None, no downsampling" is not intended to be lossless, I'd really wish for adding a truly lossless option to FileOptimizer :)
    361 replies | 120952 view(s)
  • necros's Avatar
    22nd January 2017, 13:41
    necros replied to a thread FileOptimizer in Data Compression
    Nikkho, how do you compile FileOptimizer? The latest CodeBlocks doesn't recognize .cbproj. What version do you use?
    361 replies | 120952 view(s)
  • mpais's Avatar
    22nd January 2017, 13:18
    > So, can you post emma probability log for book1?
    https://goo.gl/pIuYpP
    > Much less effect with paq8pxd18 though - 187415->187216.
    > Any ideas about possible contexts? I tried adding SSE success runs etc, but it didn't help - just using order1 atm.
    I'm currently at 176.985 for book1, I don't think I'll be able to improve it further since I don't have access to mod_ppmd's internal state. I don't know how your SSE works for mod_ppmd (bytewise/bitwise?), for EMMA I'd try using:
    - current order, quantized, probably something like Min(?,Order) + log2(Max(1,Order-?))
    - most probable symbol in current context
    - number of symbols in current context (masked, so excluding symbols from escaped higher orders), quantized
    - difference in (masked) cumulative frequencies between the last context and the current one, quantized
    - (if bitwise only) already encoded bits from present symbol, with a leading 1 bit to disambiguate when encoding 0's
    - flags: have we escaped to a lower order? are we using inherited stats? does this context have non-ascii symbols (after masking)?
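    For readers unfamiliar with SSE/APM, a minimal sketch of how such contexts are typically used: a paq-style interpolated table that refines a bitwise probability. The names, bin count and learning rate are assumptions, not EMMA's or mod_ppmd's actual code.
        #include <cmath>
        #include <vector>

        class APM {
            std::vector<float> t;   // 33 interpolation bins per context, each storing P(bit=1)
            int    last = 0;        // left bin touched by the last refine() call
            float  frac = 0;        // interpolation weight of the right bin
        public:
            static float stretch(float p) { return std::log(p / (1.0f - p)); }
            static float squash (float s) { return 1.0f / (1.0f + std::exp(-s)); }

            explicit APM(int nctx) : t(nctx * 33) {
                for (size_t i = 0; i < t.size(); i++)          // identity mapping at start
                    t[i] = squash((int(i % 33) - 16) * 0.25f);
            }
            // Refine p1 = P(bit=1) under context cx (0 <= cx < nctx).
            float refine(float p1, int cx) {
                float s = stretch(p1) / 0.25f + 16.0f;          // position in bin space
                if (!(s > 0.0f)) s = 0.0f;                      // also catches -inf
                if (s > 31.999f) s = 31.999f;
                int j = int(s); frac = s - j; last = cx * 33 + j;
                return t[last] * (1.0f - frac) + t[last + 1] * frac;
            }
            // After coding, pull the two touched bins toward the observed bit.
            void update(int bit, float rate = 0.02f) {
                t[last]     += (bit - t[last])     * (1.0f - frac) * rate;
                t[last + 1] += (bit - t[last + 1]) * frac          * rate;
            }
        };

        // Example context packing along the lines suggested above:
        //   cx = (quantized_order * 2 + escaped_to_lower_order) * 2 + using_inherited_stats;
    Per bit, one would call refine(p1, cx) on the model's probability, code with the refined value, then call update(bit).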
    294 replies | 51842 view(s)
  • SvenBent's Avatar
    22nd January 2017, 10:10
    Yeah, it's just a funny coincidence that your interest in actually making an account never occurred before you ran into a pirated software example... That coincidence seems to happen a lot. It's not like you are the only example of someone coming to this forum whose first post asks for help regarding compression of pirated software. If you had actually taken an interest in compression and read around the forum, the solution is pretty much there. At least there are a lot of pointers to the basic idea of how it became that size. So I'm just wondering why your interest in compression didn't stretch to actually researching on the forum. But I guess you have some innocent answer for that coming up. Nevertheless, I can only advise you to read around the forum if you are truly interested. There is a lot of information that, pieced together, will give you your answer.
    11 replies | 404 view(s)
  • Shelwien's Avatar
    22nd January 2017, 04:34
    191250 atm. So, can you post emma probability log for book1? Much less effect with paq8pxd18 though - 187415->187216. Any ideas about possible contexts? I tried adding SSE success runs etc, but it didn't help - just using order1 atm.
    294 replies | 51842 view(s)
  • boxerab's Avatar
    22nd January 2017, 03:25
    Thanks, good point. If PSNR and SSIM are not great indicators of fidelity to the original image, how does one evaluate two lossy codecs for image quality? I suppose, as you say, one needs the good old human visual system to decide.
    42 replies | 13680 view(s)
  • Samantha's Avatar
    22nd January 2017, 03:19
    Ha..ha..ha.. this is strong, I hope it is a sarcastic joke.. :D. Note that compressing a block of files is not based on simple addition (1 + 1 + 1 = 3); each file has a different compression ratio, based on its structure. Your expected calculation was (28 MB x 13) = 364 MB... but if I insert the 3 files (0909c-m00_main.tex) into three different subfolders, I get one data folder of (3 x 92 MB) = 276 MB uncompressed; based on your calculation I should get a compressed archive of (28 MB x 3) = 84 MB, yet on the contrary I get this...
    Compressed Archive Completed At 22/01/2017 00:53:07
    nz 277.2 MB -> 59.3 MB (21.39%), 3 files, 00:02:59:37
    a -r -v -cc -m1g -br256m -bw256m -p2 -t2 -nm -nofilenameext -sp
    :rolleyes:
    11 replies | 404 view(s)
  • ony's Avatar
    22nd January 2017, 02:33
    Dear SvenBent, I don't know why you cannot understand me. Sometimes one finds a great compression example and wonders how it can be achieved. That's all. Even if the example game I posted here were free software, I would neither play it nor care about it, because my purpose is simply to understand compression. I do understand that we should respect people's work, but again, my sole purpose is to understand how compression (only compression) works, not to learn how to make pirated software. If I learn hacking, that doesn't mean I will only hack PCs and steal from people, because I could work for a security company or whatever. Please understand my purpose. Thanks
    11 replies | 404 view(s)
  • ony's Avatar
    22nd January 2017, 02:28
    The original method was Wave injector
    11 replies | 404 view(s)
  • ony's Avatar
    22nd January 2017, 02:19
    No, 28 MB is not enough, because the game contains 13 .tex files, which means in this case it becomes 364 MB, whereas in the compressed version they are only 160 MB.
    11 replies | 404 view(s)
  • Samantha's Avatar
    21st January 2017, 21:35
    You get a very good compression ratio with that file (0909c-m00_main.tex), so I do not understand your problem...
    Compressed Archive Completed At 21/01/2017 19:32:35
    nz 92.4 MB -> 28.6 MB (30.95%), 1 file, 00:01:02:447
    a -r -v -cc -m1g -br256m -bw256m -p2 -t2 -nm -nofilenameext -sp
    11 replies | 404 view(s)
  • SvenBent's Avatar
    21st January 2017, 21:26
    If you know it's pirated, you are condoning piracy, which is basically pissing on people like those on this forum who create software. You are blatantly saying you don't want to pay for our work and effort, and now you want us to help you do that? The fact that you didn't pick up on this the first time it was explained to you, and now have the audacity to tell us not to bring up the moral issue, just further proves the point that you deserve absolutely no help from the kind of people you are disrespecting. I have no respect for piracy or the people condoning it.
    11 replies | 404 view(s)
  • ony's Avatar
    21st January 2017, 19:34
    I have got the original game Hitman Blood Money (Demo) version, which contains the same .tex files. https://ufile.io/0909c Here is the .tex file, and it is legal and original. If you want to get the demo version, it is available in many places, Steam for example. And again, I really don't care about piracy, I only want to understand how the compression works. Thanks in advance
    11 replies | 404 view(s)
  • snowcat's Avatar
    21st January 2017, 17:12
    Usually we would ask for sample files, but since it is a pirated game, you should not upload it here. So, how can we help you?
    11 replies | 404 view(s)
  • ony's Avatar
    21st January 2017, 14:58
    I have not created it. I am not making any pirated software. I know it is a pirated game, but I only care about the compression method, not the game. As I said, I have Hitman Blood Money compressed (I did not make it): 269 MB, and after extracting it becomes 2.89 GB. I tried to compress the .tex files but no method worked for me. Does anybody know how to compress such files? Thanks. And please, no lessons about morality.
    11 replies | 404 view(s)
  • Shelwien's Avatar
    21st January 2017, 14:38
    For a change, tried to apply v4's SSE to paq8p log instead... 192108 -> 191437 for now (book1). Wonder if it would work with emma too.
    294 replies | 51842 view(s)
  • mpais's Avatar
    21st January 2017, 14:08
    > Still, you can try testing it with EMMA - maybe with your mixing results would be better? :)
    book1: 177.153 v3, unaltered | 177.129 v4, unaltered | 177.027 v3, SSE done by EMMA | 177.122 v4, SSE done by EMMA
    book2: 114.965 v3, unaltered | 114.937 v4, unaltered | 114.919 v3, SSE done by EMMA | 114.932 v4, SSE done by EMMA
    enwik6: 200.197 v3, unaltered | 200.124 v4, unaltered | 200.088 v3, SSE done by EMMA | 200.107 v4, SSE done by EMMA
    The unaltered results are slightly better, but using SSE in EMMA gives even better results with v3.
    > I'd be happy if it helps you improve the model, but imho counts and differences are too small to make any conclusions there.
    Well, it has already helped improve the prediction with SSE, and I'm currently trying to tweak some mixer contexts to see if I can squeeze some more gains.
    enwik8.drt: 16.684.505 v3, unaltered | 16.679.568 v3, SSE done by EMMA
    enwik9.drt: 135.277.490 v3, unaltered | 135.254.675 v3, SSE done by EMMA
    > My scripts currently use bit/probability pairs anyway, but I can convert it if its easier to write bytes for you. But please don't quantize the probabilities.
    It's just a matter of speed, it already takes so long to compress :rolleyes: 7,P27,..P10,P20] then? 33x expansion (16 bits for predictions, only 12 used)
    > I could try optimizing v4 SSE parameters based on overall results.
    So you'd precompute the unaltered mod_ppmd predictions for a file, your optimizer would run an SSE test scenario on it, output a new log file with just the predictions, and then EMMA would use this file when encoding: instead of calling mod_ppmd to get a prediction, it would simply read the prediction and use it as usual (mix it with the rest, apply SSE, code it)? In that case this should ideally be a 2-pass version of EMMA: the first pass would write a log file with the individual predictions for each model, so the second pass would just read these predictions and those from mod_ppmd, and just do the mixing, SSE and coding stages. But in EMMA, some of the mixing contexts use information from the internal state of the models, so you can't skip the modelling. And then there's all the parsing, which is done online; nothing is stored in the compressed files, there is no block segmentation. So it would take quite a big rewrite to make it possible. And even then, I'm not sure that would be ideal. It wouldn't take into account that I can just do the SSE in EMMA based on the unaltered mod_ppmd prediction and maybe get better results.
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    21st January 2017, 09:30
    > Just some food for thought. I'd be happy if it helps you improve the model, but imho counts and differences are too small to make any conclusions there. > Sure, if you prefer a more visual analysis. How about , > with P1 and P2 quantized to 8 bits each? So for every input byte, you get 17 output bytes. My scripts currently use bit/probability pairs anyway, but I can convert it if its easier to write bytes for you. But please don't quantize the probabilities. ... So v4 is likely a failure (because its slower and uses more memory, but results are the same), but if you can make a console coder which would encode files from probability logs, I could try optimizing v4 SSE parameters based on overall results. The idea is to skip computing parts not affected by parameter values, like counter lookups, log their values once, and then only iterate the parametric part.
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    21st January 2017, 09:11
    Made a new version, with added SSE - http://nishi.dreamhosters.com/u/mod_ppmd_v4_dll.rar
    Standalone compression improved: 209952->204730 for book1, 245370->237279 for enwik6. But with paq8p:
    192,227 BOOK1.paq8p
    191,293 BOOK1.paq8p_v3
    191,309 BOOK1.paq8p_v4
    211,370 enwik6.paq8p
    210,168 enwik6.paq8p_v3
    210,166 enwik6.paq8p_v4
    Still, you can try testing it with EMMA - maybe with your mixing the results would be better? :) (just replace the dll)
    294 replies | 51842 view(s)
  • SvenBent's Avatar
    21st January 2017, 05:19
    Use the same method you used to compress it the first time. I assumed you made the original compression, given the fact that it is copyrighted software and you would not be stupid enough to go on a forum consisting mainly of software developers and ask for help with piracy...
    11 replies | 404 view(s)
  • SvenBent's Avatar
    21st January 2017, 04:09
    Being of a digitally paranoid personality, I would love for that to happen too. But then again, Encode.ru isn't that mainstream and I'm using a unique password for it, so there is really nothing of worth to get from here.. I might just be weird, but I do like the idea of HTTPS everywhere.
    4 replies | 213 view(s)
  • mpais's Avatar
    21st January 2017, 01:22
    > Nice, but you kinda did too much, and I don't see anything in these stats.
    Actually, I see a lot of interesting stats there. Look at book1:
    Context: None 35,85%/40,58%] 37,31%/40,59%] 43,15%/47,17%] 44,47%/48,29%] 45,69%/47,49%] 34,83%/31,84%] 41,44%/44,20%] 26,93%/21,61%]
    Ok, so in general, mod_ppmd gives a somewhat consistent boost, regardless of context, to the quality of the predictions for bits 7 to 3, and 1, all of them by about 3 to 5%. But, and this surprised me, the predictions get worse for bits 2 and 0. This means that the mixer is assigning a high weight to the mod_ppmd model in certain circumstances where its prediction (unexpectedly to any mixer context) is bad when compared to the other models. So the mixer is getting some benefit from the model, maybe because of that confirmation bias when generally the predictions are good and in agreement, but it's unable to derive the contexts in which it is failing. Since the compression ratio is improving anyway, it is safe to assume that the coding gains afforded by the improved predictions far outweigh the coding losses, but if we could find the causes for these losses, we could use that information to improve the mixer's contexts to be aware of these discrepancies, and then the mixer would by itself learn to trust mod_ppmd's predictions less when needed.
    Now look at this context:
    Context: After uppercase 31,25%/42,23%] 32,16%/41,51%] 50,22%/34,07%] 47,20%/38,14%] 30,24%/22,42%]
    So, immediately after an uppercase letter is seen, mod_ppmd is quite good at predicting the first 2 bits, but more so, it's bad at predicting the last 3 bits. And in both cases, the results deviate significantly from the contextless average seen above. And let's see if spaces make a difference too:
    Context: After space 0x20
    If you look at the spreads between the results, they closely match those of the contextless average, and for the middle bits it's basically a tie. So a single space in and of itself isn't important. Let's check line feeds:
    Context: After line feed 0x10
    Here we see that mod_ppmd gives a nice small improvement to every bit except the last. So it seems that bit 0 really gets worse predictions with mod_ppmd in most contexts. And if you look at the contexts for digits, you'll see that in almost every bit position, mod_ppmd causes significantly worse predictions.
    And now, just for fun, here is the effect of the adaptive learning rate when using just the main model at its lowest setting coupled with the mod_ppmd model:
    Adaptive Learning Rate Off Context: After uppercase 57,32%]
    Adaptive Learning Rate On Context: After uppercase 70,92%]
    With no text model, we see that mod_ppmd is much better at modelling after capital letters than the main model, especially for the last bit(!). And with the adaptive learning rate active, it squeezes even more gains from it. This makes some sense intuitively, since the last bit should be the easiest to predict, seeing as how we already have all previous bits for context, and so the prediction error should be small, in which case a reduced learning rate may help converge on a better local minimum. Just some food for thought.
    > Can't you just write the log in the form like "1 22222 33333" ( ie { bit, p1, p2 } ).
    Sure, if you prefer a more visual analysis. How about 7..P10]7..P20], with P1 and P2 quantized to 8 bits each? So for every input byte, you get 17 output bytes.
    Best regards
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    21st January 2017, 00:10
    Nice, but you kinda did too much, and I don't see anything in these stats. Can't you just write the log in the form like "1 22222 33333" ( ie { bit, p1, p2 } ). Then maybe I'd be able to generate something like this - http://nishi.dreamhosters.com/u/lzma_defl.png
    294 replies | 51842 view(s)
  • taliho's Avatar
    20th January 2017, 23:52
    Does anyone know of any publications on this work?
    1 replies | 383 view(s)
  • ony's Avatar
    20th January 2017, 22:14
    I have Hitman Blood Money compressed, which is only 269 MB; after extracting it is 2.89 GB. I tried to compress it with arc and precomp but it did not work. The original method was UHARC with Wave Injector. Can you please explain to me why I am not getting any good results? Thanks
    11 replies | 404 view(s)
  • mpais's Avatar
    20th January 2017, 22:07
    I've made a special x64 build with logging capabilities. When logging is active, 2 predictions are calculated for each bit, one with the options selected, and another one with those same options except for the mod_ppmd model. These are then compared according to each context, and for each of those, the log registers how many times this context was found (the value in brackets), and then for each of the 8 bits encoded in that context (from msb to lsb), the percentages regarding how often did each model provide a better prediction. Note that these won't necessarily sum to 100%, since I'm excluding any situation where both predictions are equal. The first percentage listed is for the prediction excluding the mod_ppmd model, so a higher first percentage means mod_ppmd hurt the prediction more often then it helped, and vice-versa for the second percentage. The log file is created in the same directory as EMMA, named with the same name as the executable but with a ".log" extension (usually "emma.log"), so if you wish to run more than 1 instance, you should make several copies of both the executable and the preset file, and change their names. Now for some results (sorry for the long post). book1 Context: None Context: After space 0x20 Context: After comma+space Context: After uppercase Context: After line feed 0x10 Context: After digit+space Context: After first digit Context: After first 2 digits enwik6 Context: None Context: After space 0x20 Context: After comma+space Context: After uppercase Context: After line feed 0x10 Context: After digit+space Context: After first digit Context: After first 2 digits dickens Context: None Context: After space 0x20 Context: After comma+space Context: After uppercase Context: After line feed 0x10 Context: After digit+space Context: After first digit Context: After first 2 digits If anyone wants to run some tests, I've attached this interim version.
    294 replies | 51842 view(s)
  • Shelwien's Avatar
    20th January 2017, 13:14
    > If we know that mod_ppmd is better only for e.g. bit7 for english text, > how can we be sure it's always better when the text model is switched on? My main idea about that was actually to use different mixer weights for ppmd depending on that context. > Do you suggest to check all models and not just mod_ppmd? Ideally, yes. Well, for my own coders I usually have a different option - I can simply collect all the relevant contexts, define masks for them, and then run my parameter optimizer. But in this case it would be likely faster to just check it visually first. > What do you suggest to do when a predictor is dynamically added to the mixer? I don't really see much problem in that... you can renormalize the remaining weights to the same total, for example. Also I don't like n-ary mixers anyway - I only use trees of binary mixers for my own coders, because there (1) meaning of mixer weight is very clear - its a probability of one input being better than another; its even possible to explicitly compute it (this probability), instead of using a specific mixer update; (2) its possible to use different update parameters for different binary mixers; (3) its possible to use arbitrary contexts for different mixers. And in such a mixer-tree model there's absolutely no problem with dynamically adding or removing some inputs - we'd simply skip mixing (and its update). P.S. I did add SSE to mod_ppmd predictions, book1 compression improved from 209933 to 204622. Will see how it affects paq8p or something.
    294 replies | 51842 view(s)
  • Mauro Vezzosi's Avatar
    20th January 2017, 12:46
    > Can you check context? I mean, something like logging probability estimations generated by mod_ppmd and your main model, and then checking if maybe mod_ppmd only generates better results in some contexts, like eg. only for bit7 or bit6, or only after \x20 (space) symbols. So I'm suggesting to try and see where specifically it is better, because normally the mixer is unable to notice any patterns, when a specific model is only better in specific contexts.
    Shouldn't this be done more efficiently by the mixer instead of in a fixed way? Ok, the mixer must have the right context, but I guess that mod_ppmd can also be useful on non-text/unusual-text/mixed data (DNA, EXE, CSV, UTF-*...). If we know that mod_ppmd is better only for e.g. bit7 for English text, how can we be sure it's always better when the text model is switched on? Do you suggest checking all models and not just mod_ppmd?
    > we're trying to check whether ppmd just accidentally improves the results at random points, or if there's a visible contextual dependency among the points where predictions are improved. If its the latter, we can try further improving the results by adding ppmd to the mix only in specific contexts.
    What do you suggest doing when a predictor is dynamically added to the mixer? I found that sometimes it is better to add an "else" prediction when a model doesn't have a prediction (e.g. when a match model doesn't match) and sometimes it is better to feed the mixer a flag to ignore the prediction (the mixer must check this flag every time it uses a predictor, slowing down the program :-( ). TIA
    294 replies | 51842 view(s)
  • necros's Avatar
    20th January 2017, 11:39
    subj
    4 replies | 213 view(s)
  • RamiroCruzo's Avatar
    20th January 2017, 08:03
    None of the antiviruses out there today allow just any software; companies pay to get their exes whitelisted. So it's not the "warez" factor that gets an exe flagged as a "virus". XD We should be discussing possible ways of recompression rather than just ordering Uncle Schnaader around, as last time I checked, lz4 has 21 levels and allows about 200+ recompression possibilities, compared to zlib's 81.
    22 replies | 5033 view(s)
  • Shelwien's Avatar
    20th January 2017, 04:37
    > Ok, so p0_old would be the prediction I would normally feed the arithmetic coder, post-mixing and SSE, > and p0_ppmd would be the prediction from mod_ppmd alone, untouched? No, p0_ppmd in this case would be the new model, with ppmd, also after mixing and SSE. > So sometimes the prediction may be improved even though the prediction from the added model wasn't the best. True, but it doesn't matter, we're trying to check whether ppmd just accidentally improves the results at random points, or if there's a visible contextual dependency among the points where predictions are improved. If its the latter, we can try further improving the results by adding ppmd to the mix only in specific contexts.
    294 replies | 51842 view(s)
  • mpais's Avatar
    20th January 2017, 02:11
    pbit_ppmd = bit ? (1-p0_ppmd) : p0_ppmd; pbit_old = bit ? (1-p0_old) : p0_old; flag = (pbit_ppmd > pbit_old); Ok, so p0_old would be the prediction I would normally feed the arithmetic coder, post-mixing and SSE, and p0_ppmd would be the prediction from mod_ppmd alone, untouched? I'll run some tests and report the results then. >If some prediction is improved by mixing it with ppmd's, it means that ppmd's prediction is better sometimes. From my experience with EMMA, it isn't as linear as that. If both the main model and mod_ppmd predict a 1 with respectively 80% and 78% probabilities, the final prediction will most likely be 90%, ie, there is like a "confirmation bias", the final prediction is sometimes better than the best prediction from any model. This occurs because the sum of the mixer's weights isn't bounded. So in this simplistic example, pfinal may be = Clip( ( 1.21*0.8 + 1.14*0.78 )/2 ) = 0,93. So sometimes the prediction may be improved even though the prediction from the added model wasn't the best.
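    A toy numeric illustration of that "confirmation bias" effect, using the weights and probabilities from the example above (toy code, not EMMA's mixer):
        #include <algorithm>
        #include <cstdio>

        // A linear mixer whose weights are not constrained to sum to 1 can output a
        // probability more extreme than any of its inputs.
        static double clip(double p) { return std::min(0.9999, std::max(0.0001, p)); }

        int main()
        {
            double w[2] = { 1.21, 1.14 };      // learned weights, not normalized
            double p[2] = { 0.80, 0.78 };      // both models predict a '1'
            double pf = clip((w[0]*p[0] + w[1]*p[1]) / 2);
            std::printf("final prediction = %.2f\n", pf);   // ~0.93, above both inputs
            return 0;
        }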
    294 replies | 51842 view(s)