
Thread: introducing FastLZ

  1. #1
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello folks,

    I have been following this forum lately. I am quite new to compression and I am glad to be able to learn from all the experts here

    Here is my toy open-source compressor: FastLZ. The project page is at http://code.google.com/p/fastlz/. Soon I will make a first release, along with Windows executables. In the meantime, please don't hesitate to try it with your favorite compiler(s). Kindly let me know if it does not work with a certain compiler.

    Nothing fancy in FastLZ. It is just an improvement over Herman Vogt's LZV and Marc Lehmann's LZF, i.e. a byte-aligned LZ77 implementation. A notable feature is that the decompression is crash-proof against corrupted and/or malicious data. I also carefully crafted the implementation so that performance is optimal on modern systems (I spent long hours with valgrind/cachegrind).

    I have included an illustrative file packager using FastLZ called 6pack. Its speed and compression ratio are competitive with QuickLZ 1.2 (and 1.3beta level 1), tor -1, and thor e1. There is about a 10% speed penalty due to the use of an Adler-32 checksum, but it is still fast enough.

    At the moment, FastLZ is used in the development version of KOffice for on-the-fly compression/decompression of application data.

    Feel free to comment, use it, file bug reports, etc. Patches are also welcome.

  2. #2
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,954
    Thanks
    359
    Thanked 332 Times in 131 Posts
    Cool!

  3. #3
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Thanks ariya!

  4. #4
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Nice, it beats LZF and LZO, but it's still not faster than QuickLZ.

    I've set #define BLOCK_SIZE (20000000) and inserted a for loop:

    for (y = 0; y < 100; y++)
        chunk_size = fastlz_compress_level(2, buffer, bytes_read, result);

    which bypasses the checksum and other overheads. I also gave the process REALTIME_PRIORITY_CLASS. Compiled with /Ox on VC 2005 and then timed it on the same Celeron as on www.quicklz.com:

    winword.exe: FastLZ = 58.6 MB/s, LZF = 46.5 MB/s, QuickLZ = 94.2 MB/s
    corpus.tar: FastLZ = 79.1 MB/s, LZF = 57.4 MB/s, QuickLZ = 113 MB/s
    pic.bmp: FastLZ = 51.7 MB/s, LZF = 42.0 MB/s, QuickLZ = 78.5 MB/s

    I also tried benchmarking on a ramdrive (Cenatek's) using the normal file compressor program (still compiled with VC /Ox). That more than *halved* compression speed! At these speeds even ramdrive and file API overhead is significant.

    By the way, you could perhaps move buffer[BLOCK_SIZE] out of local function scope, because you get a stack overflow with too large a BLOCK_SIZE. In theory this can also give a (very) tiny speedup on x86, because addressing no longer needs to go through the ebp register, which can then be used for other purposes.

  5. #5
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi Lasse Reinhold,

    Thanks for testing!

    As for the speed, FastLZ is not (yet) optimized for large or very large blocks, for the obvious reason that I don't need that in KOffice. The same goes for disk I/O access.

    Also, could you perhaps test using my gcc-compiled 6pack.exe, available from http://www.fastlz.org/6pack_20070614.exe (to view the source, change .exe to .c)? Basically I copied your benchmark routine and adjusted the number of passes to 8. It is still slower than QuickLZ, of course (for the reason mentioned above).

    Note: this is by no means an official release, so please don't use it for public benchmark websites (for this forum it's definitely okay). The link will also stop working after a few days.

    BTW, I am no expert on Win32 programming, but shouldn't we use QueryPerformanceCounter instead of GetTickCount for timing measurements?

  6. #6
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Hi Ariya,

    D:>6pack_20070614.exe -mem WINWORD.EXE
    Compressed 10577312 bytes into 7679072 bytes (72.6%) at 56.7 Mbyte/s.
    Decompressed at 143.8 Mbyte/s.

    D:>6pack_20070614.exe -mem pic.bmp
    Compressed 18108198 bytes into 16153342 bytes (89.2%) at 49.4 Mbyte/s.
    Decompressed at 156.9 Mbyte/s.

    D:>6pack_20070614.exe -mem corpus.zip
    Compressed 3255624 bytes into 1637110 bytes (50.3%) at 76.0 Mbyte/s.
    Decompressed at 146.5 Mbyte/s.

    D:>6pack_20070614.exe -mem video.dat
    Compressed 11130658 bytes into 11029022 bytes (99.1%) at 45.8 Mbyte/s.
    Decompressed at 293.1 Mbyte/s.

    D:>6pack_20070614.exe -mem proteins.txt
    Compressed 7254685 bytes into 2790963 bytes (38.5%) at 104.0 Mbyte/s.
    Decompressed at 193.5 Mbyte/s.

    D:>6pack_20070614.exe -mem test4.xml
    Compressed 18283820 bytes into 3394705 bytes (18.6%) at 138.7 Mbyte/s.
    Decompressed at 291.1 Mbyte/s.

    (1 MB = 1000000 byte)

    It's a little slower because it's compiled with gcc instead of VC.

    QueryPerformanceCounter and QueryPerformanceFrequency are more accurate but not portable, so I didn't use them. The QuickLZ demo project isn't portable anyway today because of the Win32 file API calls, but I haven't bothered rewriting the timing part.

  7. #7
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by Lasse Reinhold
    At these speeds even ramdrive and file API overhead is significant
    It's exactly the time required to move the input file from memory to memory (from the ramdrive to the input buffer), and then the same for the output file. On my box, QuickLZ runs at only 1/3rd the speed of memcpy, so your results (halving the speed) are predictable.

  8. #8
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by ariya
    Its speed and compression ratio are competitive with QuickLZ 1.2 (and 1.3beta level 1), tor -1, and thor e1.
    Don't forget about Slug. Although it uses an LZH scheme, it's still pretty fast.

  9. #9
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Lasse Reinhold:

    Thanks again for testing. Here I posted another version: http://www.fastlz.org/6pack_20070614.exe, now compiled with gcc 4.2 (not as good as VC yet, but better than before). I also changed the compression to FastLZ Level 1, which is intended for the fastest compression.

    Also, I guess it's sensible to compare it against QuickLZ 1.2 or QuickLZ 1.3 Mode 1 only, because QuickLZ 1.3 Mode 0 is very fast at compression but not at decompression.

    Here is what I got on a Pentium 4-M 1.8 GHz with 512 MB RAM running Windows XP (compressed size, compression speed, decompression speed).

    pic.bmp
    quick1 = 88.2%, 54.8 Mb/s, 147.7 Mb/s
    6pack_20070614b = 89.2%, 50.2 Mb/s, 161.1 Mb/s

    protein.txt
    quick1 = 39.8%, 103.1 Mb/s, 189.4 Mb/s
    6pack_20070614b = 38.2%, 104.8 Mb/s, 184.6 Mb/s

    corpus.tar
    quick1 = 52.0%, 77.1 Mb/s, 161.0 Mb/s
    6pack_20070614b = 50.8%, 74.4 Mb/s, 146.6 Mb/s

    enwik9 (process time is shown, using Igor Pavlov's timer utility)
    quick1 = 50.9%, comp: 20.8 s, decomp: 11.3 s
    6pack_20070614b = 49.3%, comp: 19.s, decomp: 12.9 s

  10. #10
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Hmm, I get exactly the same values on the Celeron with the new 6pack_20070614.exe as with the old version.

    Benchmarks on my Athlon64:

    E:>6pack_20070614.exe -mem WINWORD.EXE
    Compressed 10577312 bytes into 7679072 bytes (72.6%) at 81.1 Mbyte/s.
    Decompressed at 201.3 Mbyte/s.

    E:>6pack_20070614.exe -mem test4.xml
    Compressed 18283820 bytes into 3394705 bytes (18.6%) at 168.8 Mbyte/s.
    Decompressed at 402.2 Mbyte/s.

    E:>6pack_20070614.exe -mem pic.bmp
    Compressed 18108198 bytes into 16153342 bytes (89.2%) at 72.4 Mbyte/s.
    Decompressed at 219.9 Mbyte/s.

    E:>6pack_20070614.exe -mem proteins.txt
    Compressed 7254685 bytes into 2790963 bytes (38.5%) at 131.0 Mbyte/s.
    Decompressed at 241.8 Mbyte/s.

    E:>6pack_20070614.exe -mem corpus.txt
    Compressed 3255624 bytes into 1637110 bytes (50.3%) at 107.4 Mbyte/s.
    Decompressed at 208.4 Mbyte/s.

    E:>6pack_20070614.exe -mem e:enchvideo.dat
    Compressed 11130658 bytes into 11029022 bytes (99.1%) at 68.7 Mbyte/s.
    Decompressed at 372.9 Mbyte/s.

    (1 MB = 1000000 byte)

    Seems like 6pack is more optimized for the P4.

  11. #11
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Sorry, that was the wrong link.

    It should be http://www.fastlz.org/6pack_20070614b.exe instead (note the 'b' at the end).

  12. #12
    Programmer
    Join Date
    May 2008
    Location
    denmark
    Posts
    94
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Hi ariya,

    Exciting, it's now comparable to quick1. Can I put up benchmarks on quicklz.com once you have a final version?

  13. #13
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Good work ariya!

  14. #14
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Lasse Reinhold:
    Thanks again for testing. As for benchmarks, of course it's not a problem

    LovePimple: Thanks

  15. #15
    Expert Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I tested FastLZ from the source code.
    http://cs.fit.edu/~mmahoney/compression/text.html

  16. #16
    Member
    Join Date
    Jun 2007
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Matt Mahoney
    I tested FastLZ from the source code.
    http://cs.fit.edu/~mmahoney/compression/text.html
    Thanks, Matt!

    I hope I can make a first release within a week.

  17. #17
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by ariya
    I hope I can make a first release within a week.
    I look forward to it!

  18. #18
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    still no official download?

    Best regards!

  19. #19
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    When can we expect the first official release?

  20. #20
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    still no progress on http://www.fastlz.org/download.htm ?!

    Best regards!

