I think I've found a bug in dict.exe.
Please try the attached file (after unzip).
inikep, what exactly is the problem?
"dict.exe copyright" throws an exception on WinXP Core 2 Duo
After choosing "debug" in Visual Studio I have:
Unhandled exception at 0x00401388 in dict.exe: 0xC0000005: Access violation reading location 0x00288014.
i can't reproduce the problem, but will check it with valgrind. try a freshly-compiled executable:
It does nothing. Finished without error, but also without any output. "Usage" works fine.
it's ok - the outfile is not produced if the built dictionary is empty:
Code:
>dict.exe -v copyright outfile
Bytes read: 11512
Collected words: 1340
Promoted words: 204
Good words: 0
Compiled with -g on MinGW. Under gdb I have:
[New thread 3936.0xdd8]
Program received signal SIGSEGV, Segmentation fault.
0x0040134b in count_desc_order (a=0x0, b=0x2c0010) at dict.cpp:350
350 int count_desc_order (const Word *a, const Word *b) { return b->count - a->count; }
(gdb) bt
#0 0x0040134b in count_desc_order (a=0x0, b=0x2c0010) at dict.cpp:350
#1 0x77c26fc1 in qsort () from /cygdrive/c/WINDOWS/system32/msvcrt.dll
#2 0x00401eb8 in phase3 (MinWeakChars=0, nodes=0x22fee8) at dict.cpp:682
#3 0x00402e7d in DictEncode (
buf=0x3e4580 "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\"\n\"http://www.w3.org/TR/html4/loose.dtd\">\n<html>\n\n<!-- Mirrored from travelindependent.info/copyright.htm by HTTrack Website Copier/3.x [XR"...,
bufsize=11512, outbuf=0x22ff50, outsize=0x22ff4c, MinWeakChars=0,
MinLargeCnt=2048, MinMediumCnt=100, MinSmallCnt=50, MinRatio=4)
at dict.cpp:1156
#4 0x00403c75 in main (argc=2, argv=0x3e2458) at dict.cpp:1414
where did you get the sources? svn or haskell.org?
haskell.org
thanks, i will check
is there a newer version of delta than 1.0, or can i go ahead and report a bug?
anyway
it seems that delta sometimes outputs very small files.
but here is the weird catch
it only happens on network attached drives
i tried several times, but it is always a 48kb file on the network drive and a 1.8GB file on the local drive
The local version of this file was copied from the network drive.
I'll go ahead and chop the file down in size if this is not already a fixed bug
rep 1.2a seems to work perfectly
-- edit --
also it only happens at a certain blocksize (-b768)
Last edited by SvenBent; 14th March 2009 at 23:22.
i've started to develop a new algorithm: long-distance REP that needs memory equal to just ~10% of the filesize for compression. there are 32-bit and 64-bit executables:
http://freearc.org/download/testing/srep.exe
http://freearc.org/download/testing/srep64.exe
with the 32-bit version, one can process files up to 24gb long. decompression isn't yet implemented
Code:
>t srep.exe 5gb nul
512 mb used for hash (filesize/12, rounded up to power of 2)
5586729972 -> 4575184372: 81.89%
Elapsed time = 432.297 seconds
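The "512 mb used for hash" line can be re-derived from the rule printed next to it (filesize/12, rounded up to a power of 2). The following is only a minimal sketch of that arithmetic: the function names are hypothetical and not taken from srep.cpp, and it assumes the rule is applied to a byte count.

```cpp
#include <cstdint>

// Hypothetical helper (not from srep.cpp): smallest power of 2 that is >= x.
std::uint64_t round_up_pow2(std::uint64_t x) {
    std::uint64_t p = 1;
    while (p < x) p <<= 1;
    return p;
}

// Assumed rule, taken from the program output: filesize/12, rounded up to a power of 2.
std::uint64_t hash_bytes_for(std::uint64_t filesize) {
    return round_up_pow2(filesize / 12);
}

// Example: hash_bytes_for(5586729972ULL) >> 20 gives 512,
// which matches the "512 mb used for hash" line for the 5gb test file.
```

Rounding up to a power of 2 means the hash can cost up to roughly twice filesize/12, so the real share of the filesize varies from file to file.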
now decompression works too. i've constructed the following batch for its testing:
timer srep.exe %1 1
timer srep.exe -d 1 2
arc a a -mcrc %1 2
arc v a
the archive listing given by the last command should show the same CRC values for both files
Last edited by Bulat Ziganshin; 23rd November 2009 at 02:13.
Looks like it's nonlinear. (X=seconds, Y=bytes)
Last edited by Shelwien; 23rd November 2009 at 05:55.
srep fills a fixed-size hash with secondary probing. as it fills up, access becomes slower. in my tests, speed was 40 mb/s for a 100 mb file, and 10-15 mb/s for a 5gb file
the problem also is that it uses linear hash probing (next_slot = (slot+1)%hash_size)
sources: http://svn.freearc.org/freearc/trunk...n/REP/srep.cpp
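To illustrate why linear probing slows down as the table fills, here is a minimal sketch of a fixed-size table using the next_slot = (slot+1)%hash_size rule. It is not the srep.cpp code: the struct name, the use of key 0 as the "empty" marker, and key % hash_size as the hash function are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative fixed-size table with linear probing (hypothetical, not srep.cpp).
// Assumptions: hash_size > 0, and key 0 is reserved to mark an empty slot.
struct ProbeTable {
    std::vector<std::uint32_t> slots;

    explicit ProbeTable(std::size_t hash_size) : slots(hash_size, 0) {}

    // Stores a non-zero key and returns how many slots were examined,
    // or 0 if every slot is taken by other keys.
    std::size_t insert(std::uint32_t key) {
        const std::size_t hash_size = slots.size();
        std::size_t slot = key % hash_size;
        for (std::size_t probes = 1; probes <= hash_size; ++probes) {
            if (slots[slot] == 0 || slots[slot] == key) {
                slots[slot] = key;
                return probes;
            }
            slot = (slot + 1) % hash_size;  // next_slot = (slot+1)%hash_size
        }
        return 0;
    }
};
```

With a 4-slot table, inserting 4, 8 and 12 (all mapping to slot 0) takes 1, 2 and 3 probes: colliding keys pile up in consecutive slots, so each later insert or lookup has to walk a longer run.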
Last edited by Bulat Ziganshin; 23rd November 2009 at 12:44.
Guess this would be relevant too:
It's a utility from my backup toolkit (somewhat of a failure, really).
Note that its result for enwik9 is 988,869,959 (or 989,409,055
if we include the match structure), while rep result is 995,842,134.
(Well, it's set to minmatch 128 here, so no wonder).
Usage:
fma-rep.exe enwik9 enwik9.str enwik9.rep nul
(last argument is a hashtable which isn't necessary here)
Last edited by Shelwien; 23rd November 2009 at 13:36.
srep 0.6:
- fixed 64-bit version, now it properly handles files >2gb
- fixed decompression with non-default -l
- -s prints stats after each block
srep 0.7:
- reduced memory usage down to 6-8% of filesize. For example, 24gb file needs 256+256+960 mb memory chunks
- now hash keeps address of the last chunk with the same contents
- hashing improved a little
- fixed WinXP crashing bug
note: yes, 32-bit and 64-bit versions are 100% compatible with each other
Last edited by Bulat Ziganshin; 24th November 2009 at 21:32.
srep 0.8:
- better compression due to improved hashing and compressed format
- faster compression on files <1gb
- MD5 integrity checking on decompressed data
- first 8 bytes of compressed file contain SREP signature, helping programs like unix magic
- exit code == 0 on success
http://freearc.org/download/testing/srep.exe
http://freearc.org/download/testing/srep64.exe
http://svn.freearc.org/freearc/trunk...n/REP/srep.cpp
arc.ini section:
[External compressor:srep]
packcmd = srep $$arcdatafile$$.tmp $$arcpackedfile$$.tmp
unpackcmd = srep -d $$arcpackedfile$$.tmp $$arcdatafile$$.tmp
it's possible but not in near future. also note that it needs temporary file on decompression stage so i don't think that it will be highly popular
improved arc.ini section:
Code:
[External compressor:srep]
;options = l%d (minimal match length, default=512)
packcmd = srep {options} $$arcdatafile$$.tmp $$arcpackedfile$$.tmp
unpackcmd = srep -d $$arcpackedfile$$.tmp $$arcdatafile$$.tmp
Strange that nobody has expressed amazement about the brand new SuperREP yet. But don't worry, I have a test which will probably make you say: "My Goodness!"
First of all, about the test material. On 10th of November 2009 the new Call of Duty game, Modern Warfare 2, was released. As before, it contains a lot of maps in .ff files which are packed with zlib compression. I took the first 26 files (from af_caves to gulag, excluding mp_*) and PreComp-ed them. The resulting .pcf files were TAR-ed, so the final test TAR file is 4 158 320 128 bytes. Then I passed this file through REP v1.2 alpha and SREP v0.8. Options are:
Code:
rep -b512m -l512 -a99 -h25
srep -l512
After that the processed files were packed with 7z v9.07 -mx=9. The columns of the resulting table are:
memory consumption\processing time\processed size\compression time\total time\compressed size
Code:
                mem     proc.time  proc.size      comp.time  tot.time  comp.size
                ------  ---------  -------------  ---------  --------  -------------
rep 1.2 alpha   642 MB  275        2 799 705 707  2411       2686      1 351 454 277
srep 0.8        301 MB  312        2 295 693 088  1944       2256      1 036 945 493
7z 9.07 -mx=9   676 MB  ---        -------------  3508       3508      1 938 923 490
fma-rep         434 MB  220        2 286 737 478  1951       2171      1 040 502 574
When I got the final results I was pinned to my chair for a couple of minutes and could only say: "Blya!" Seriously, with less than half the memory consumption and 16% less processing+compression time, SREP provided 23% better compression!
And for dessert... On decompression SREP took only 18 MB of memory !
Definitely, it's impressive work ! Thanks, Bulat !!!
EDIT: Updated result table with plain 7z -mx=9 results. Corrected 18 KB to 18 MB
EDIT2: Results updated with FMA-REP. Removed bolding.
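The percentages quoted for srep 0.8 against rep 1.2 alpha can be re-derived from the table rows. A minimal sketch of that arithmetic follows; the helper name is made up for the example, and the numbers are copied from the table as posted.

```cpp
// Hypothetical helper: by how many percent `value` is lower than `baseline`.
constexpr double percent_lower(double value, double baseline) {
    return 100.0 * (1.0 - value / baseline);
}

// srep 0.8 versus rep 1.2 alpha, figures copied from the results table.
constexpr double time_saving = percent_lower(2256.0, 2686.0);              // total time
constexpr double size_saving = percent_lower(1036945493.0, 1351454277.0);  // compressed size
constexpr double mem_ratio   = 642.0 / 301.0;                              // memory, rep / srep

// time_saving comes out at about 16, size_saving at about 23,
// and mem_ratio is a little above 2.
```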
Last edited by Skymmer; 26th November 2009 at 16:10.
My Goodness!
Have you tried using 7-Zip without a (S)REP stage? I know (S)REP will speed up compression a lot as it is much faster than 7-Zip, but I'm curious if compression ratio will be even better without it or if 7-Zip fails to detect similar regions in such big files even in Ultra mode.
How about releasing a Linux version? Or the source code?
But anyway, very good work!
schnaader, 7-zip ultra mode has only a 64mb dictionary. but it would be great to add pure 7-zip results here for even more (s)rep advertising. also, can you try some archiver with data reordering - probably winrk will be the best one (try it with and without reordering)
thometal, http://svn.freearc.org/freearc/trunk...n/REP/srep.cpp - really you need to download the entire tree starting from http://svn.freearc.org/freearc/trunk
Last edited by Bulat Ziganshin; 26th November 2009 at 02:51.
Quote: You mean fast\full analysis?
yes
Quote: If so, then on what data? I mean final TAR file or without it?
without. or both and use best result