Page 1 of 8
Results 1 to 30 of 213

Thread: Brotli

  1. #1
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    200
    Thanks
    41
    Thanked 36 Times in 12 Posts

    Brotli

    Google has now announced a new compression format, brotli, and it is getting attention on various programmer websites. It's in Chrome and is currently being added to Firefox.

    I have heard about it all over the internet, but it's only been mentioned in passing on this forum, so I figured it deserved a thread of its own.

    http://google-opensource.blogspot.se...mpression.html has links to source code and papers.

    Brotli is roughly as fast as zlib’s Deflate implementation. At the same time, it compresses slightly more densely than LZMA and bzip2 on the Canterbury corpus. The higher data density is achieved by a 2nd order context modeling, re-use of entropy codes, larger memory window of past data and joint distribution codes.

  2. The Following 6 Users Say Thank You to willvarfar For This Useful Post:

    Bulat Ziganshin (22nd September 2015),Cyan (23rd September 2015),encode (10th February 2016),Jarek (23rd September 2015),lorents17 (22nd September 2015),Matt Mahoney (23rd September 2015)

  3. #2
    Member
    Join Date
    Sep 2015
    Location
    germany
    Posts
    12
    Thanks
    1
    Thanked 0 Times in 0 Posts
    The names of Google compressors sound so funny: Zopfli, Brotli, ... the next ones will be called Guetzli, Brunsli, Kägi-Fretli, Schoggli or Läckerli

  4. #3
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    What are joint distribution codes? Some quick searches turned up nothing useful.

  5. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    My tests on LTCB. brotli made the Pareto frontier for decompression speed even on my slow computer, but you need to compress for a long time to achieve that. http://mattmahoney.net/dc/text.html#2414

  6. #5
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    https://bugzilla.mozilla.org/show_bug.cgi?id=366559 so it seems that firefox 44 will support brotli compression in https

    i've compiled Brotli using their makefiles - 32-bit gcc 4.9.2 -O2 -static --large-address-aware. it may have an edge on fast extraction (although lzmh should be as fast as deflate too), but on binary data its max. compression and speed/compression ratio are nowhere close to the 7-year-old tornado, not to mention lzma, lzham and so on

    Usage: bro.exe [--force] [--quality n] [--decompress] [--input filename] [--output filename] [--repeat iters] [--verbose]

    Example: bro.exe -v -i INFILE -f -o OUTFILE -q 9 (default "-q 11" is extremely slow)
    Attached Files
    Last edited by Bulat Ziganshin; 24th September 2015 at 01:38.

  7. The Following 5 Users Say Thank You to Bulat Ziganshin For This Useful Post:

    Bilawal (16th December 2016),comp1 (23rd September 2015),lorents17 (24th September 2015),Matt Mahoney (23rd September 2015),Zhabaloid (20th June 2016)

  8. #6
    Member
    Join Date
    May 2013
    Location
    ARGENTINA
    Posts
    54
    Thanks
    62
    Thanked 13 Times in 10 Posts
    Hi bulat, I get an error that says "file missing libgcc_s_sjlj-1.dll"
    Last edited by GOZARCK; 23rd September 2015 at 23:20.

  9. #7
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by GOZARCK View Post
    Hi bulat, I get an error that says "file missing libgcc_s_sjlj-1.dll"
    thanks, fixed

  10. The Following User Says Thank You to Bulat Ziganshin For This Useful Post:

    GOZARCK (23rd September 2015)

  11. #8
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Testing bro.exe in Windows on the Silesia corpus (bro.exe -i file -o file.bro). Decompression says "corrupt input" on all files. Looking at a dump of the .bro files, I see that CR is always followed by LF.
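A symptom like this is what one would expect from output written in Windows text mode, where every LF byte written is expanded to CR LF, so no bare LF survives in the file. A minimal sketch of a check for that, assuming the .bro file has already been read into memory; the helper name `count_bare_lf` is hypothetical and not part of brotli:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of brotli: counts LF bytes (0x0A) that are
   NOT immediately preceded by CR (0x0D). */
size_t count_bare_lf(const unsigned char *buf, size_t len)
{
    size_t bare = 0;
    size_t i;
    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0x0A && (i == 0 || buf[i - 1] != 0x0D)) {
            bare++;
        }
    }
    return bare;
}
```

In a compressed stream bare LF bytes would normally be common, so a count of zero on a large file is a strong hint that the stream was written in text mode.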

  12. #9
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Testing bro.exe in Windows on the Silesia corpus (bro.exe -i file -o file.bro). Decompression says "corrupt input" on all files. Looking at a dump of the .bro files, I see that CR is always followed by LF.
    thanks, i modified sources to open files as binary in the Windows way:

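    /* Headers this excerpt relies on, assuming a MinGW-style gcc on Windows. */
    #include <fcntl.h>     /* O_BINARY */
    #include <io.h>        /* setmode */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>    /* STDIN_FILENO, STDOUT_FILENO */

    static FILE* OpenInputFile(const char* input_path)
    {
        if (input_path == 0) {
            /* No path given: read stdin, switched to binary so CR/LF is not translated. */
            setmode(STDIN_FILENO, O_BINARY);
            return stdin;
        }
        /* "rb" opens the file in binary mode. */
        FILE* f = fopen(input_path, "rb");
        if (f == 0) {
            perror("fopen");
            exit(1);
        }
        return f;
    }

    static FILE *OpenOutputFile(const char *output_path, const int force)
    {
        if (output_path == 0) {
            /* No path given: write stdout, switched to binary for the same reason. */
            setmode(STDOUT_FILENO, O_BINARY);
            return stdout;
        }
        if (!force) {
            /* Refuse to overwrite an existing file unless --force was given. */
            struct stat statbuf;
            if (stat(output_path, &statbuf) == 0) {
                fprintf(stderr, "output file exists\n");
                exit(1);
            }
        }
        /* "wb" opens the file in binary mode. */
        FILE* f = fopen(output_path, "wb");
        if (f == 0) {
            perror("fopen");
            exit(1);
        }
        return f;
    }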

    and uploaded new executable. now it should work both with -i/-o and stdin/stdout

    to everyone: if you made any benchmarks with earlier versions of my executable, please redo them
    Last edited by Bulat Ziganshin; 24th September 2015 at 01:41.

  13. The Following User Says Thank You to Bulat Ziganshin For This Useful Post:

    Matt Mahoney (24th September 2015)

  14. #10
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Input:
    812,392,384 bytes, HTML top 10k Alexa crawled (8,998 with HTML response)

    Output (compressed size, compression time, decompression time, program and options):
    219,591,148 bytes, 1.799 sec., 1.355 sec., tor-small -1
    218,018,012 bytes, 1.951 sec., 1.464 sec., tor-small -1 -b800mb
    210,647,258 bytes, 1.736 sec., 1.328 sec., qpress64 -L1T1
    210,194,996 bytes, 0.333 sec., 0.371 sec., lz4 -1
    194,233,793 bytes, 2.818 sec., 1.348 sec., qpress64 -L2T1
    187,766,706 bytes, 7.966 sec., 1.059 sec., qpress64 -L3T1
    173,904,470 bytes, 2.995 sec., 2.721 sec., tor-small -2
    173,418,150 bytes, 3.132 sec., 2.843 sec., tor-small -2 -b800mb
    169,476,113 bytes, 2.072 sec., 0.352 sec., lz4 -9
    165,571,040 bytes, 2.931 sec., 2.820 sec., NanoZip - f
    158,213,503 bytes, 1.855 sec., 0.980 sec., zstd
    154,673,082 bytes, 3.213 sec., 2.445 sec., tor-small -3
    154,166,902 bytes, 3.364 sec., 2.555 sec., tor-small -3 -b800mb
    152,477,067 bytes, 2.128 sec., 0.973 sec., lzturbo -30 -p1
    152,477,067 bytes, 2.132 sec., 0.971 sec., lzturbo -30 -p1 -b4
    150,773,269 bytes, 2.151 sec., 1.047 sec., lzturbo -30 -p1 -b16
    150,219,553 bytes, 2.332 sec., 1.204 sec., lzturbo -30 -p1 -b800
    149,670,044 bytes, 7.825 sec., 2.567 sec., WinRAR - 1
    149,642,742 bytes, 2.069 sec., 0.646 sec., zhuff_beta -c0 -t1
    145,770,266 bytes, 4.586 sec., 2.285 sec., tor-small -4
    141,951,602 bytes, 4.484 sec., 2.360 sec., tor-small -4 -b800mb
    141,215,050 bytes, 2.751 sec., 0.938 sec., lzturbo -31 -p1 -b4
    140,657,806 bytes, 5.037 sec., 2.544 sec., FreeArc - 1
    140,211,060 bytes, 6.970 sec., 1.775 sec., bro -q 1
    138,103,483 bytes, 118.023 sec., 2.394 sec., cabarc -m LZX:15
    138,051,401 bytes, 2.761 sec., 1.001 sec., lzturbo -31 -p1 -b16
    137,564,310 bytes, 18.762 sec., 4.299 sec., NanoZip - dp
    137,211,547 bytes, 7.808 sec., 1.712 sec., bro -q 2
    137,000,208 bytes, 3.763 sec., 3.852 sec., NanoZip - F
    136,523,335 bytes, 2.830 sec., 1.094 sec., lzturbo -31 -p1
    136,445,854 bytes, 50.932 sec., 2.344 sec., lzhamtest_x64 -m0 -d24 -t0 -b
    136,337,495 bytes, 14.823 sec., 4.259 sec., NanoZip - d
    135,723,691 bytes, 8.318 sec., 1.677 sec., bro -q 3
    135,656,476 bytes, 2.972 sec., 1.153 sec., lzturbo -31 -p1 -b800
    135,315,388 bytes, 51.436 sec., 2.371 sec., lzhamtest_x64 -m0 -t0 -b
    135,287,357 bytes, 51.937 sec., 2.418 sec., lzhamtest_x64 -m0 -d29 -t0 -b
    135,071,576 bytes, 21.650 sec., 4.251 sec., NanoZip - dP
    132,819,515 bytes, 20.102 sec., 6.278 sec., 7-Zip - 1
    131,871,664 bytes, 14.052 sec., 0.899 sec., lzturbo -32 -p1 -b4
    131,401,865 bytes, 9.917 sec., 1.677 sec., bro -q 4
    129,184,341 bytes, 8.305 sec., 2.692 sec., tor-small -5
    127,355,215 bytes, 20.866 sec., 5.825 sec., 7-Zip - 2
    127,045,472 bytes, 9.549 sec., 0.957 sec., lzturbo -32 -p1 -b16
    126,139,033 bytes, 8.025 sec., 2.751 sec., tor-small -5 -b800mb
    125,732,647 bytes, 10.642 sec., 2.618 sec., tor-small -6
    125,454,769 bytes, 140.513 sec., 2.281 sec., cabarc -m LZX:18
    123,169,077 bytes, 8.472 sec., 1.090 sec., lzturbo -32 -p1
    123,093,411 bytes, 22.468 sec., 5.508 sec., 7-Zip - 3
    122,564,329 bytes, 10.074 sec., 2.680 sec., tor-small -6 -b800mb
    122,480,456 bytes, 19.411 sec., 1.645 sec., bro -q 5
    121,068,548 bytes, 14.536 sec., 3.289 sec., FreeArc - 2
    120,653,755 bytes, 16.107 sec., 2.552 sec., tor-small -7
    119,969,489 bytes, 27.663 sec., 1.602 sec., bro -q 6
    119,740,393 bytes, 27.370 sec., 5.259 sec., 7-Zip - 4
    118,343,545 bytes, 24.112 sec., 2.123 sec., WinRAR - 2
    118,139,032 bytes, 35.361 sec., 4.371 sec., NanoZip - Dp
    117,500,327 bytes, 15.517 sec., 2.594 sec., tor-small -7 -b800mb
    117,388,039 bytes, 23.546 sec., 2.524 sec., tor-small -8
    116,526,595 bytes, 37.847 sec., 4.383 sec., NanoZip - DP
    116,454,906 bytes, 35.232 sec., 4.351 sec., NanoZip - D
    116,269,246 bytes, 25.888 sec., 6.589 sec., FreeArc - 3
    116,217,001 bytes, 40.748 sec., 1.630 sec., bro -q 7
    115,993,125 bytes, 192.929 sec., 2.199 sec., cabarc -m LZX:21
    115,985,847 bytes, 386.192 sec., 0.850 sec., lzturbo -39 -p1 -b4
    115,729,606 bytes, 35.504 sec., 2.095 sec., WinRAR - 3
    115,163,486 bytes, 55.523 sec., 1.614 sec., bro -q 8
    115,022,074 bytes, 49.863 sec., 2.084 sec., WinRAR - 5
    114,602,026 bytes, 8.403 sec., 1.218 sec., lzturbo -32 -p1 -b800
    114,345,025 bytes, 78.418 sec., 1.594 sec., bro -q 9
    114,281,170 bytes, 22.925 sec., 2.575 sec., tor-small -8 -b800mb
    113,354,128 bytes, 29.519 sec., 2.474 sec., tor-small -9
    112,376,531 bytes, 177.077 sec., 1.923 sec., lzhamtest_x64 -m1 -d24 -t0 -b
    111,848,802 bytes, 29.046 sec., 2.515 sec., tor-small -9 -b800mb
    110,532,234 bytes, 40.580 sec., 2.496 sec., tor-small -10
    110,177,215 bytes, 54.632 sec., 6.398 sec., FreeArc - 4
    109,908,468 bytes, 40.292 sec., 2.501 sec., tor-small -10 -b800mb
    109,522,695 bytes, 208.748 sec., 1.898 sec., lzhamtest_x64 -m2 -d24 -t0 -b
    109,425,530 bytes, 436.824 sec., 0.893 sec., lzturbo -39 -p1 -b16
    108,520,934 bytes, 58.227 sec., 2.521 sec., tor-small -11
    108,520,934 bytes, 58.329 sec., 2.518 sec., tor-small -11 -b800mb
    107,850,398 bytes, 266.166 sec., 2.562 sec., tor-small -12
    107,842,909 bytes, 267.559 sec., 2.550 sec., tor-small -12 -b800mb
    106,128,420 bytes, 271.607 sec., 1.850 sec., lzhamtest_x64 -m3 -d24 -t0 -b
    105,933,030 bytes, 571.280 sec., 5.168 sec., lzturbo -49 -p1 -b4
    105,692,791 bytes, 193.962 sec., 1.919 sec., lzhamtest_x64 -m1 -t0 -b
    104,539,771 bytes, 316.307 sec., 1.833 sec., lzhamtest_x64 -m4 -d24 -t0 -b
    104,094,380 bytes, 2313.780 sec., 1.693 sec., bro -q 10
    104,053,219 bytes, 195.503 sec., 1.977 sec., lzhamtest_x64 -m1 -d29 -t0 -b
    102,895,078 bytes, 148.997 sec., 4.850 sec., 7-Zip - 5
    101,941,653 bytes, 237.364 sec., 1.889 sec., lzhamtest_x64 -m2 -t0 -b
    100,898,120 bytes, 534.627 sec., 1.097 sec., lzturbo -39 -p1
    100,159,922 bytes, 239.813 sec., 1.933 sec., lzhamtest_x64 -m2 -d29 -t0 -b
    99,699,129 bytes, 625.001 sec., 4.902 sec., lzturbo -49 -p1 -b16
    96,239,572 bytes, 347.011 sec., 1.893 sec., lzhamtest_x64 -m3 -t0 -b
    95,197,295 bytes, 236.139 sec., 4.587 sec., 7-Zip - 9
    94,133,011 bytes, 356.440 sec., 1.933 sec., lzhamtest_x64 -m3 -d29 -t0 -b
    93,601,386 bytes, 431.884 sec., 1.899 sec., lzhamtest_x64 -m4 -t0 -b
    92,303,359 bytes, 727.475 sec., 4.729 sec., lzturbo -49 -p1
    91,310,894 bytes, 449.102 sec., 1.923 sec., lzhamtest_x64 -m4 -d29 -t0 -b
    90,239,627 bytes, 680.976 sec., 1.170 sec., lzturbo -39 -p1 -b800
    87,715,022 bytes, 314.169 sec., 5.428 sec., FreeArc - 9
    82,891,405 bytes, 882.513 sec., 4.597 sec., lzturbo -49 -p1 -b800
    77,286,010 bytes, 6497.059 sec., 7.715 sec., glza

    Used:
    7z 15.07 beta - Sep 17, 2015 (one thread)
    rar 5.40 beta 4 - Sep 21, 2015 (one thread)
    arc 0.67 - Mar 15, 2014 (one thread)
    nz 0.09 - Nov 4, 2011 (one thread)
    zhuff_beta 0.99 - Aug 11, 2014
    cabarc 6.2.9200.16521 - Feb 23, 2013
    lz4 1.4 - Sep 17, 2013
    qpress64 1.1 - Sep 23, 2010
    zstd 0.0.1 - Jan 25, 2015
    tor-small 0.4a - Jun 2, 2008
    lzturbo 1.2 - Aug 11, 2014
    lzhamtest_x64 1.x dev - Sept 25, 2015 (own VS2015 compile)
    glza 0.3a - Jul 15, 2015
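One way to read a table like this is to keep only the entries that no other entry beats on both compressed size and decompression time, the Pareto frontier mentioned earlier in the thread. A minimal sketch over a hand-picked subset of the rows above; the struct, function and array names are hypothetical, and the result says nothing about rows that were left out:

```c
#include <assert.h>
#include <stddef.h>

struct result {
    const char *name;
    unsigned long long size;   /* compressed bytes */
    double dtime;              /* decompression time in seconds */
};

/* Returns 1 if some other entry is no worse on both axes and strictly
   better on at least one of them, otherwise 0. */
int is_dominated(const struct result *r, size_t n, size_t i)
{
    size_t j;
    assert(r != NULL && i < n);
    for (j = 0; j < n; j++) {
        if (j == i) {
            continue;
        }
        if (r[j].size <= r[i].size && r[j].dtime <= r[i].dtime &&
            (r[j].size < r[i].size || r[j].dtime < r[i].dtime)) {
            return 1;
        }
    }
    return 0;
}

/* Four rows copied from the table above. */
const struct result SAMPLE[] = {
    {"lz4 -9",                169476113ULL, 0.352},
    {"bro -q 9",              114345025ULL, 1.594},
    {"lzturbo -39 -p1 -b800",  90239627ULL, 1.170},
    {"7-Zip - 9",              95197295ULL, 4.587},
};
```

Within this subset only, `lz4 -9` and `lzturbo -39 -p1 -b800` are not dominated, while the other two rows are both larger and slower to decompress than the lzturbo row. Compression time is ignored here, which is a deliberate simplification.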
    Last edited by Sportman; 26th September 2015 at 12:15. Reason: Added 7z, rar, arc, zhuff_beta, cabarc, lz4, qpress64, tor-small, lzhamtest, glza, lzturbo -b4 and -b16, -39

  15. The Following 3 Users Say Thank You to Sportman For This Useful Post:

    dnd (30th September 2015),Jarek (24th September 2015),Kennon Conrad (26th September 2015)

  16. #11
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Windows and Linux versions of brotli compress to different sizes on most files in the Silesia corpus (using default -q 11), but decompress correctly either way. I posted both results to http://mattmahoney.net/dc/silesia.html

  17. #12
    Member
    Join Date
    Feb 2010
    Location
    Nordic
    Posts
    200
    Thanks
    41
    Thanked 36 Times in 12 Posts
    The use-case for Brotli is to replace deflate for streaming pre-compressed text over HTTP.

    Lots of web servers, particularly the "application servers", store (often in memory) the deflated versions of text/* MIME types to serve from cache (if not asked for a range request).

    The way an application server (e.g. a webserver with server-side scripts to generate pages) works is typically shockingly suboptimal: they generate the response, then hash it (which they need to do for an ETag anyway), then compare that hash to a dictionary of previously-sent compressed responses...
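The hash-then-lookup step described above can be sketched roughly as follows. This is an illustration only, not how any particular server does it: the FNV-1a hash, the fixed-size table and all names are assumptions, and a real server would use a stronger hash and guard against collisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 64

struct cache_entry {
    uint64_t hash;            /* hash of the uncompressed response body */
    const void *compressed;   /* previously compressed bytes, owned by the caller */
    size_t compressed_len;
    int used;
};

static struct cache_entry cache[CACHE_SLOTS];

/* 64-bit FNV-1a over the response body. */
uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 14695981039346656037ULL;
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Returns the cached entry for this body hash, or NULL on a miss. */
const struct cache_entry *cache_lookup(uint64_t hash)
{
    const struct cache_entry *e = &cache[hash % CACHE_SLOTS];
    return (e->used && e->hash == hash) ? e : NULL;
}

/* Stores (or overwrites) the compressed form for this body hash. */
void cache_store(uint64_t hash, const void *compressed, size_t len)
{
    struct cache_entry *e = &cache[hash % CACHE_SLOTS];
    assert(compressed != NULL);
    e->hash = hash;
    e->compressed = compressed;
    e->compressed_len = len;
    e->used = 1;
}
```

On a hit the server can send the stored bytes as they are; on a miss it compresses the body once and calls `cache_store`. The full response still has to be generated and hashed on every request, which is the suboptimal part being pointed out.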

    Speaking of the use-case, several years ago I added partial pre-compression of response templates to the Tornado webserver. Most of these application servers have a text template response system e.g. <html><head><title>{% page_title}</title><body><p>Hi {% username}, ...

    In tornado they actually precompile these text-based templates to code. Neat hacker trick.

    And what I did was pre-compress those template fragments between the computed fields as deflate using relative offsets for the matches. When I emitted the template I emitted the computed fields as deflate literals and fixed the offsets in the matches as I went. At least, that's how I recall doing it.

    It was effective. I don't think the same approach would work with a more complicated compressor like Brotli though.

    Sorry for the walk down memory lane inside "application servers"

  18. The Following User Says Thank You to willvarfar For This Useful Post:

    SolidComp (6th September 2016)

  19. #13
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    new version:
    - added -w/-m options
    - improved -v statistics
    - extended -h help
    - published to github
    - note that the default dictionary (window) is only 4 MB and the maximum is 16 MB, so on larger files it has no chance against lzma or even tornado

    C:\>bro -h

    Usage: bro [--quality n] [--window n] [--mode n] [--decompress] [--force] [--input filename] [--output filename] [--repeat iters] [--verbose]

    --quality: controls the compression-speed vs compression-density tradeoff. The higher the quality, the slower the compression. Range is 0 to 11. Defaults to 11.

    --window: base 2 logarithm of the sliding window size. Range is 16 to 24. Defaults to 22.

    --mode: the compression mode can be 0 for generic input, 1 for UTF-8 encoded text, or 2 for WOFF 2.0 font data. Defaults to 0.

    Usage example: bro -q 9 -w 24 -v -f -i INFILE -o OUTFILE
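Since --window is a base-2 logarithm, the window size in bytes follows directly from it. A small sketch of the arithmetic, using the range from the help text above; the function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Converts a --window value (16..24 per the help text) to a size in bytes. */
uint32_t window_bytes(int lgwin)
{
    assert(lgwin >= 16 && lgwin <= 24);
    return (uint32_t)1 << lgwin;
}
```

`window_bytes(22)` is 4,194,304 bytes (4 MiB, the default) and `window_bytes(24)` is 16,777,216 bytes (16 MiB, the maximum).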


    C:\>bro -q 9 -w 24 -v -f -i Z:\100m -o m:\brotli

    100000000 -> 30900269: 30.900% 101.225 sec 0.942 MiB/s
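The figures in that status line can be reproduced from the two sizes and the elapsed time. A sketch of the arithmetic, assuming the percentage is compressed size over original size and that MiB/s refers to input bytes with 1 MiB = 2^20 bytes; the function names are hypothetical:

```c
#include <assert.h>

/* Compressed size as a percentage of the original size. */
double ratio_percent(double original_bytes, double compressed_bytes)
{
    assert(original_bytes > 0.0);
    return 100.0 * compressed_bytes / original_bytes;
}

/* Input throughput in MiB/s (1 MiB = 1048576 bytes). */
double mib_per_sec(double original_bytes, double seconds)
{
    assert(seconds > 0.0);
    return original_bytes / 1048576.0 / seconds;
}
```

For the run above, `ratio_percent(100000000, 30900269)` is about 30.900 and `mib_per_sec(100000000, 101.225)` is about 0.942, matching the printed line.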
    Attached Files
    Last edited by Bulat Ziganshin; 25th September 2015 at 13:57.

  20. The Following 6 Users Say Thank You to Bulat Ziganshin For This Useful Post:

    avitar (24th September 2015),comp1 (24th September 2015),Cyan (25th September 2015),lorents17 (24th September 2015),Matt Mahoney (24th September 2015),Stephan Busch (24th September 2015)

  21. #14
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    new version:
    - added -w/-m options

    The attached bro.exe looks the same as the previous version.

  22. #15
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    Hi Sportman, could you also add "-39"?

  23. #16
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Quote Originally Posted by Sportman View Post
    The attached bro.exe looks the same as the previous version.
    sorry, fixed

  24. #17
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Quote Originally Posted by Jarek View Post
    could you also add "-39"?
    Added default, more later.

  25. The Following User Says Thank You to Sportman For This Useful Post:

    Jarek (24th September 2015)

  26. #18
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts
    @bulat: two quick tests: the program works ...

    c:\TEST1>bro -q 9 -w 24 -v -f -i c:1.iso -o c:v2-brotli
    302786560 -> 283952103: 93.780% 131.672 sec 2.193 MiB/s

    but
    - is there a limit ?
    - maybe a 4 GByte limit for the input (exp_spx_db.dmp has 4,547,686,400 bytes)?

    c:\TEST2>bro -q 9 -w 24 -v -f -i D:\exp_spx-150504-0605\opt\oracle\orabackup\export\exp_spx_db.dmp -o c:v1-brotli
    compression failed
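Whether a 4 GByte limit is really the cause is not established here, but the arithmetic behind the suspicion is easy to check: 4,547,686,400 is larger than 2^32 - 1 = 4,294,967,295, so that file size cannot be held in a 32-bit unsigned integer, which is the width of size_t in a 32-bit build. A sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a byte count fits in a 32-bit unsigned integer, else 0. */
int fits_in_u32(uint64_t nbytes)
{
    return nbytes <= UINT32_MAX;
}

/* Self-check using the two input sizes reported in the post above. */
void check_reported_sizes(void)
{
    assert(fits_in_u32(302786560ULL));     /* 1.iso: compressed fine */
    assert(!fits_in_u32(4547686400ULL));   /* exp_spx_db.dmp: "compression failed" */
}
```

This only shows that the second file crosses the 32-bit boundary while the first does not; it does not show where, or whether, bro depends on a 32-bit size.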

    best regards

  27. #19
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    Thanks Sportman, let's select the most interesting for comparison.

    dynamically generated:
    135,723,691 bytes, 8.318 sec., 1.677 sec., bro -q 3
    114,602,026 bytes, 8.403 sec., 1.218 sec., lzturbo -32 -p1 -b800

    Prepacked:
    104,094,380 bytes, 2313.780 sec., 1.693 sec., bro -q 10
    90,239,627 bytes, 680.976 sec., 1.170 sec., lzturbo -39 -p1 -b800

    So lzturbo could give ~15% bandwidth improvement while still being substantially faster/less costly ... maybe instead of enforcing brotli, they should buy lzturbo ...
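The ~15% figure can be checked against the four rows quoted above. A sketch of the arithmetic; the function name is hypothetical:

```c
#include <assert.h>

/* Percentage by which `smaller` undercuts `larger`. */
double saving_percent(double larger, double smaller)
{
    assert(larger > 0.0);
    return 100.0 * (larger - smaller) / larger;
}
```

For the dynamically generated case, `saving_percent(135723691, 114602026)` is about 15.6; for the prepacked case, `saving_percent(104094380, 90239627)` is about 13.3.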

  28. #20
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Jarek, don't forget that lzturbo is just stolen tornado code. his best own achievement is a huffman coder. by looking at lzturbo sources, anyone can immediately understand that it's just a scam

  29. #21
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    Bulat, I wouldn't call something offering (in verifiable way) 15% improvement of world-wide bandwidth "a scam".

    Sure, he might have stolen some of your code, but that was many versions ago; the difference in performance from tornado suggests that it might be completely different now.
    This guy is a genius of low level optimization, also in other tasks: https://github.com/powturbo

    We are talking about huge world-wide improvements of extremely frequent tasks - that is worth much more than e.g. the $1M Google could easily pay him to make it open-source, maybe shared with other contributors like you.

  30. #22
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Quote Originally Posted by Jarek View Post
    maybe instead of enforcing brotli, they should buy lzturbo ...
    I thought the same, or hire Hamid Buzidi. If I remember my lzturbo research from the early days correctly, Hamid worked for a German search engine at that time, so it must be optimized for HTML compression.

  31. #23
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    Jarek, brotli dictionary by default is 4 MB and it's intended for compression of pretty small files. so you need to use the same dictionary and non-solid compression, for a start, and then compare with original programs, f.e. lzmh+xwrt with builtin dictionary or lzga. hamid can sell anything by telling us that he wrote it, but the only real thing he ever wrote was a tiny integer compressor. even the huffman compressor he wrote follows the description of jumpless bit coding i wrote here a year ago. modern lz compression like lzmh is absolutely outside his programming skills

  32. #24
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    brotli failed on 10gb.tar. Input: 10,065,018,880. Compressed with -q 1: 4,733,283,813 (5580 s). Decompressed: 14,687,141,888 (212 s). I'm testing the Sept. 21 commit compiled in Linux (10GB benchmark machine 4).

    Edit: rebooted, installed latest (24 Sep 2015) commit of brotli on Linux again, recompiled, tested with -q 5. Output is still bad (wrong size). Compressed 5,357,707,755 (557s, faster than -q 1?), decompressed 13,689,897,536 (262s).

    Edit: after further experimentation, the error occurs at byte 25,165,825 of 10gb.tar with -q 1. If you truncate to this length or longer then the output differs. At this length only the last byte of output differs. If you truncate to a smaller file then the output is identical.
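Locating the first differing byte between the original and the round-tripped data, as in the experiment above, can be sketched like this. The helper is hypothetical and works on in-memory buffers, so for a 10 GB file it would have to be applied chunk by chunk. As a side note on the arithmetic, 25,165,825 is exactly 24 * 2^20 + 1, one byte past 24 MiB; the post does not say whether that is significant.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first byte where a and b differ. If one buffer is
   a prefix of the other, returns the shorter length; if both are identical,
   returns (size_t)-1. */
size_t first_mismatch(const unsigned char *a, size_t a_len,
                      const unsigned char *b, size_t b_len)
{
    size_t n = a_len < b_len ? a_len : b_len;
    size_t i;
    assert((a != NULL || a_len == 0) && (b != NULL || b_len == 0));
    for (i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    return a_len == b_len ? (size_t)-1 : n;
}
```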
    Last edited by Matt Mahoney; 24th September 2015 at 21:33.

  33. #25
    Member
    Join Date
    Feb 2015
    Location
    United Kingdom
    Posts
    154
    Thanks
    20
    Thanked 66 Times in 37 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    outside his programming skills
    shots fired

    I assume Google developed brotli so they wouldn't encounter any issues incurred by using someone else's work for their enterprise. So using lzham or glza, and especially lzturbo, was off the table. I would've liked brotli to perform a bit better for what it is but it's not like they can change the standard at this point.

  34. #26
    Member Fu Siyuan's Avatar
    Join Date
    Apr 2009
    Location
    Mountain View, CA, US
    Posts
    176
    Thanks
    10
    Thanked 17 Times in 2 Posts
    Hi, just curious what is lzmh?

    Quote Originally Posted by Bulat Ziganshin View Post
    Jarek, brotli dictionary by default is 4 MB and it's intended for compression of pretty small files. so you need to use the same dictionary and non-solid compression, for a start, and then compare with original programs, f.e. lzmh+xwrt with builtin dictionary or lzga. hamid can sell anything by telling us that he wrote it, but the only real thing he ever wrote was a tiny integer compressor. even the huffman compressor he wrote follows the description of jumpless bit coding i wrote here a year ago. modern lz compression like lzmh is absolutely outside his programming skills

  35. #27
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    323
    Thanks
    174
    Thanked 51 Times in 37 Posts
    Quote Originally Posted by Fu Siyuan View Post
    Hi, just curious what is lzmh?
    http://encode.ru/threads/1117-LZHAM?...ll=1#post22513

  36. #28
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    410
    Thanks
    37
    Thanked 60 Times in 37 Posts

    @bulat: brotli seems to have problems if working with big files ...

    can you please comment on this?

    maybe you can compile a windows 64 bit version?

    best regards

  37. #29
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    you can ask Matt to test it with Linux version. i've attached bro64.exe to the post above

  38. #30
    Member
    Join Date
    Sep 2010
    Location
    US
    Posts
    126
    Thanks
    4
    Thanked 69 Times in 29 Posts
    I did a little test of Brotli.

    For general purpose data, it seems to be total garbage. Compression ratio is terrible, encode speed is terrible, decode speed is mediocre. Even a good LZ-Huff beats it (like LZX).

    For text, it's okay.

    But that's not really fair testing a text-specific compressor vs. a generic compressor. The competition should be something like LZ-Huff with a text preprocessor or special text mode.
    Last edited by cbloom; 3rd August 2016 at 20:33.

  39. The Following 3 Users Say Thank You to cbloom For This Useful Post:

    Bulat Ziganshin (25th September 2015),Cyan (25th September 2015),Matt Mahoney (26th September 2015)
