Thanks Ilia!
OK, a new version has been released. This version introduces a new greedy encoder which is *FAST*. I hope you enjoy it. I also hope that BALZ becomes a fast LZ77 coder, so it will be interesting to compare it with TORNADO and others.
http://encode.ru/balz/index.htm
Thanks Ilia!
Hello everyone,
see my comment here:
URL
Best regards!
INTEL Core 2 duo E6600
SFC 13.436.454 B COMP.=16,797 sec. DEC.= 2,781 sec.
MOC test 163.570.844 COMP.=111,434 sec. DEC.=25,429
Nania Francesco Antonio
When measuring compression ratio, don't forget about the ex mode, which still exists (balz ex in out)!
However, in your test you may use the most efficient one!
MOC Efficiency 411,79 [Normal Mode] (Very good for LZ77!)
how about sending it to http://www.metacompressor.com/submit.aspx ?
(In reply to Bulat Ziganshin) Throws an error message!
what do you mean? i successfully tested 4x4 there
(In reply to Bulat Ziganshin) It throws:
You have not uploaded a zip file but application/zip!
I tried many times with differently packed ZIP archives - nothing works!
are you using IE? this form doesn't work with Opera, at least
at least, 1.02 was tested there by someone
(In reply to Bulat Ziganshin) Firefox!
>timer balz.exe e dll100.dll balz
User Time = 394.837 = 00:06:34.837 = 78%
>timer balz.exe d balz nul
User Time = 20.609 = 00:00:20.609 = 82%
36,720,996 bytes
>tor -5 dll100.dll -otor
User Time = 25.797 = 00:00:25.797 = 78%
>timer tor -d tor -o
User Time = 5.708 = 00:00:05.708 = 82%
32,634,030 bytes
although my computer is by no means modern - 64k+64k cache size
(In reply to Bulat Ziganshin) With IE it throws:
Your code is wrong!
What is CODE??
(In reply to Bulat Ziganshin) Try others...
ex mode tested
for "e" mode the resulting total was 16 859 360, comp speed 836 kBps, and decompression speed a little bit lower than for ex mode
Quick test...
Test file: ENWIK8
Test Machine: AMD Sempron 2400+, Windows XP SP2
mode: ex
Compression Time: 10814.922s
Compressed Size: 30,604,477 bytes
Decompression Time: 17.408s
mode: e
Compression Time: 1381.077s
Compressed Size: 32,406,406 bytes
Decompression Time: 18.962s
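To put the two ENWIK8 runs above side by side, here is a small Python sketch of the arithmetic. It assumes the standard ENWIK8 size of 100,000,000 bytes; the `summarize` helper and the dictionary layout are illustrative names only, not part of BALZ.

```python
ENWIK8_SIZE = 100_000_000  # assumption: standard enwik8 size (first 10^8 bytes of the dump)

# (compressed bytes, compression seconds, decompression seconds), copied from the post above
results = {
    "ex": (30_604_477, 10814.922, 17.408),
    "e": (32_406_406, 1381.077, 18.962),
}


def summarize(compressed, comp_s, dec_s, original=ENWIK8_SIZE):
    """Derive ratio and throughput (in kB/s, 1 kB = 1000 bytes) for one run."""
    return {
        "ratio": compressed / original,
        "comp_kb_s": original / comp_s / 1000,
        "dec_kb_s": original / dec_s / 1000,
    }


summary = {mode: summarize(*vals) for mode, vals in results.items()}

# How much faster mode e compresses, and how much larger its output is
speedup = results["ex"][1] / results["e"][1]
size_penalty = results["e"][0] / results["ex"][0] - 1

for mode, row in summary.items():
    print(mode, {k: round(v, 3) for k, v in row.items()})
print("e vs ex: speedup %.2fx, size penalty %.2f%%" % (speedup, size_penalty * 100))
```

By these figures, mode e compresses roughly 7.8 times faster than ex while producing output about 5.9% larger.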
By the way, in future versions I may increase the dictionary size to, say, 1..4 MB. Also, ex may use Lazy Parsing instead of SS, which may be completely removed. In addition, I will reduce memory usage to ~70 MB, even if we deal with 4 MB and larger dictionaries.
I think the future is in such *FAST* modes, since SS' compression speed is unacceptable. What do you think?
(In reply to encode) I agree!
Furthermore, in most cases the difference in compression is really small, even compared to the greedy (unoptimized) parsing that BALZ currently uses. With Lazy Matching we close the gap even further, while being just slightly slower.
What I've already done with BALZ v1.04:
+ Removed SS parsing
+ Reduced memory usage to ~80 MB
+ Changed dictionary size to 1 MB
+ Mode "e" uses greedy parsing
+ Mode "ex" uses lazy matching with 2-byte lookahead
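The difference between the greedy and lazy modes in the list above can be illustrated with a toy LZ77 parser. This is only a sketch under stated assumptions, not BALZ's actual code: the brute-force `longest_match`, the `parse` and `decode` functions, the token format, and the minimum match length of 3 are all hypothetical choices made for illustration.

```python
MIN_MATCH = 3  # assumption: shortest match worth coding


def longest_match(data, pos, window=1 << 20):
    """Brute-force search for the longest earlier match; returns (length, offset)."""
    best_len, best_off = 0, 0
    for cand in range(max(0, pos - window), pos):
        length = 0
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length >= best_len and length > 0:  # ties go to the nearer candidate
            best_len, best_off = length, pos - cand
    return best_len, best_off


def parse(data, lookahead=0):
    """lookahead=0 is greedy parsing; lookahead=k emits a literal instead of the
    current match if any of the next k positions starts a longer match."""
    tokens, pos = [], 0
    while pos < len(data):
        length, offset = longest_match(data, pos)
        if length >= MIN_MATCH:
            for step in range(1, lookahead + 1):
                if pos + step >= len(data):
                    break
                if longest_match(data, pos + step)[0] > length:
                    length = 0  # drop this match, a better one starts soon
                    break
        if length >= MIN_MATCH:
            tokens.append(("match", length, offset))
            pos += length
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def decode(tokens):
    """Rebuild the original bytes from literal and match tokens."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, length, offset = tok
            for _ in range(length):
                out.append(out[-offset])  # byte-by-byte copy handles overlapping matches
    return bytes(out)


sample = b"abc_bcdefg_abcdefg"
greedy = parse(sample, lookahead=0)  # takes the 3-byte "abc" match, then "defg"
lazy = parse(sample, lookahead=1)    # emits "a" as a literal, then a 6-byte match
```

On this sample the greedy parse accepts the short match at "abc", while the lazy parse skips one byte to reach the longer "bcdefg" match. Whether that actually saves bits depends on the entropy coder, which this sketch does not model.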
That said, with a larger dictionary and a new, simpler parsing scheme, the new BALZ often outperforms the old one (the current version), while being incomparably faster and having a smaller memory footprint. Very cool!
ilia_muraviev # yahoo . com
Hope that bots will not extract my box...
Continuing to improve BALZ:
+ Changed dictionary size to 2 MB. Looks like 2 MB is some kind of standard value for modern LZ77 coders (CABARC, QUANTUM, etc.)
+ Tested parsing with 2-byte lookahead more thoroughly. In some cases it may slightly hurt compression compared to simple lazy matching with 1-byte lookahead, but overall it helps, especially on text files.
+ Currently stuck in the middle of working out a formula for len/offset limits, i.e. which offset should be the maximum for each match length. With a 2 MB dictionary I've found that I should restrict offsets for 3-, 4- and 5-byte matches:
3 - ~256
4 - ~4k
5 - ~512k
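As a rough illustration of the length/offset limits above, a match filter could look like the Python sketch below. The exact thresholds are assumptions: the post only gives approximate values, so they are rounded to powers of two here, and the function name `match_is_acceptable` and the minimum match length of 3 are hypothetical.

```python
DICT_SIZE = 2 * 1024 * 1024  # 2 MB dictionary, as in the post

# Assumption: the "~" values from the post, rounded to powers of two
MAX_OFFSET_FOR_LEN = {
    3: 256,
    4: 4 * 1024,
    5: 512 * 1024,
}


def match_is_acceptable(length, offset):
    """Reject short matches that are too far away to be worth coding."""
    if length < 3 or not (1 <= offset <= DICT_SIZE):
        return False
    # Lengths of 6 and above are only limited by the dictionary size
    return offset <= MAX_OFFSET_FOR_LEN.get(length, DICT_SIZE)
```

The idea is that a short match with a distant offset can cost more bits than the literals it replaces, so the allowed offset grows with the match length.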
Continue digging...
P.S.
This new beast, due to a larger dictionary, has higher compression, in some cases notably higher, even with such a simple parsing scheme. At the same time it's faster...
(In reply to encode) yes, if you live in the 90s
rar/ace already had 4mb dicts
(In reply to encode) you have a lot of room for improvement
(In reply to Bulat Ziganshin) Actually, I can set ANY dictionary size. Larger dictionary = slower compression. Well, I'll test BALZ with a 4 MB dictionary. Maybe I should keep 4 MB, at least for the "pht.psd" file.
OK, I tested BALZ with various window sizes (1..4 MB). Maybe 4 MB is too heavy: a hash-chain-based match finder is not so efficient on such large dictionaries. The main question is what BALZ should be: a fast and efficient LZ77 coder, or an LZMA competitor? Even with 4 MB, BALZ may not be able to compete with LZMA; for that we would need Optimal Parsing and a binary-tree-based match finder. Therefore, I think I should keep it somewhere in the middle between Deflate and LZMA: a fast LZ77. In conclusion, a 1 MB dictionary is enough, although I will run additional tests with a 2 MB one.
In the new BALZ I also improved parsing: the new "ex" mode uses an advanced Lazy Matching with 2-byte lookahead. When deciding whether to drop a match, it also looks at the offset of the current match (is it a closer/good match?), whether the current offset is in a rep state, and so on.
Another parameter I tested is the hash chain length: 4k, 8k, 16k. 16k is too large a value and, with large dictionaries such as 4 MB, may heavily affect compression speed. 8k is a very deep search; 4k is OK with a 1 MB dictionary but not really enough for larger ones. Anyway, you may post your own thoughts about what kind of BALZ you really want to see, i.e. fast or not, favoring compression or speed, etc.
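The trade-off between hash chain length and search depth discussed above can be sketched with a minimal hash-chain match finder. This is an illustrative Python sketch, not BALZ's implementation: the class name `HashChainMatcher`, its parameters, and the use of a dict keyed by the first three bytes (instead of a real hash table) are all assumptions made for brevity.

```python
class HashChainMatcher:
    """Toy hash-chain match finder with a cap on how many candidates are visited."""

    def __init__(self, data, window=1 << 20, max_chain=4096, min_match=3):
        self.data = data
        self.window = window        # dictionary size in bytes
        self.max_chain = max_chain  # maximum candidates examined per search
        self.min_match = min_match
        self.head = {}                # key -> most recent position with that key
        self.prev = [-1] * len(data)  # position -> previous position with the same key

    def _key(self, pos):
        # Assumption: the raw 3-byte prefix stands in for a hash value
        return self.data[pos:pos + self.min_match]

    def insert(self, pos):
        """Register position pos so later searches can find it."""
        if pos + self.min_match > len(self.data):
            return
        key = self._key(pos)
        self.prev[pos] = self.head.get(key, -1)
        self.head[key] = pos

    def find(self, pos):
        """Return (length, offset) of the best match at pos; call before insert(pos)."""
        if pos + self.min_match > len(self.data):
            return 0, 0
        best_len, best_off = 0, 0
        cand = self.head.get(self._key(pos), -1)
        steps = 0
        while cand >= 0 and pos - cand <= self.window and steps < self.max_chain:
            length = 0
            while (pos + length < len(self.data)
                   and self.data[cand + length] == self.data[pos + length]):
                length += 1
            if length > best_len:  # nearer candidates are visited first and win ties
                best_len, best_off = length, pos - cand
            cand = self.prev[cand]
            steps += 1
        return best_len, best_off


def best_match_at(data, pos, max_chain):
    """Index everything before pos, then search at pos with the given chain limit."""
    matcher = HashChainMatcher(data, max_chain=max_chain)
    for p in range(pos):
        matcher.insert(p)
    return matcher.find(pos)


sample = b"abcdXabcYabcdZ"
deep = best_match_at(sample, 9, max_chain=4096)  # walks the whole chain
shallow = best_match_at(sample, 9, max_chain=1)  # stops after the nearest candidate
```

With the deep search the finder reaches the older, longer "abcd" match; with a chain limit of 1 it settles for the nearer 3-byte match. Each extra chain step costs time, and a larger window makes chains longer, which is why long chains on 4 MB dictionaries slow compression down.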
(In reply to encode) I'd like to see stronger or much faster compression. In case you go for stronger compression, an increase in dictionary size would be great. But frankly speaking, I'd love to see an improved version of TC combining your know-how from quad and lzpm. I first found this forum because I was looking for some info on TC.
Yep, TC is one of the craziest things I've ever made. Another cool compressor is a closed-source version of QUAD, which is an order-2 fast CM+LZP layer. The performance of this compressor is crazy! However, starting with the QUAD idea, I've been looking more carefully at asymmetric things like ROLZ and LZ77. Indeed, the new BALZ in some cases has identical or better compression than my old PIMPLE, while being incomparably faster. In addition, my new LZ77 easily outperforms LZPM on binary files. But the coolest part of BALZ is its simplicity. I think it's one of the simplest compressors ever made: BALZ v1.04 has ~7 KB of source code (encoder/decoder/interface, all stuff). At the same time, things like TC have high complexity (large sources, lots of classes, etc.), and it's hard to work on such large projects. Anyway, things like fast CM and LZP are well known to me, and to release a new compressor I could just copy and paste my own code. It's just that currently I'm more interested in an area that is relatively new to me: pure LZ77. OK, I will look at MFC's results and decide what stuff and tricks to add to BALZ; maybe I'll go back to CM+LZP again.