
Thread: Calthax's Compression Negielo Benchmark

  1. #1
    Member
    Join Date
    Aug 2014
    Location
    Overland Park, KS
    Posts
    17
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Calthax's Compression Negielo Benchmark

    Welcome to my negielo benchmark. Negielo is calculated as the compressed size as a percentage of the original file size, times the compressed size in bytes, and the file used is one million digits of the lemniscate constant from y-cruncher. The less negielo you have, the better you are.
    Please send me compressors to negielo benchmark. I use a Core 2 6400 with 4 GB RAM on 32-bit Windows.

    The one million digits of the lemniscate constant weigh 1,000,002 bytes. FreeArc compressed it at Ultra to 437,538 bytes, so the negielo of FreeArc using Ultra compression is 191,439.
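    Working backwards from that example (437,538 / 1,000,002 is about 43.75 %, and that fraction of 437,538 is about 191,439), the score appears to be the compression ratio times the compressed size. A minimal Python sketch of that reading, with placeholder file names:

    import os

    def negielo(original_path, compressed_path):
        # Inferred from the FreeArc example: (compressed / original) * compressed.
        original = os.path.getsize(original_path)      # e.g. 1,000,002 bytes
        compressed = os.path.getsize(compressed_path)  # e.g. 437,538 bytes
        return compressed / original * compressed

    # Placeholder file names; lower negielo is better.
    print(round(negielo("lemniscate1m.txt", "lemniscate1m.arc")))  # ~191439 for FreeArc Ultra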
    Last edited by calthax; 21st August 2014 at 05:54.

  2. #2
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    685
    Thanks
    153
    Thanked 177 Times in 105 Posts
    Quote Originally Posted by calthax View Post
    Welcome to my negielo benchmark. Negielo is calculated as the number of seconds times the file size in bytes, and the file used is enwik8. The less negielo you have, the better you are.
    Please send me compressors to negielo benchmark. I use a Core i7 3240 with 4 GB RAM on Linux.
    It will be difficult to beat memcpy.

  3. #3
    Member
    Join Date
    Aug 2014
    Location
    Overland Park, KS
    Posts
    17
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Ok, I changed it to the percentage of file size times file size in bytes.

  4. #4
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    856
    Thanks
    45
    Thanked 104 Times in 82 Posts
    So it's solely based on file size?

    I'm wondering why you advocate for more benchmarks, yet just repeat the same benchmark (enwik8) instead of new datasets that would actually reveal more info on compressor behavior?
    Last edited by SvenBent; 19th August 2014 at 01:54.

  5. #5
    Member
    Join Date
    Aug 2014
    Location
    Overland Park, KS
    Posts
    17
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Changed to one million digits of the y-cruncher lemniscate constant. How many benchmarks are out there?
    Last edited by calthax; 18th August 2014 at 03:08.

  6. #6
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    856
    Thanks
    45
    Thanked 104 Times in 82 Posts
    Is the y-cruncher output just some part of the value of pi?

    Then ZPAQ will probably win this if we only care about size and not speed, since it can basically just be a program to calculate it, and those instructions take far less storage than actually storing the data. However, the "decompression" is a lot slower.
    ZPAQ already "compresses" 1 million digits of pi to 114 bytes.

    I might be totally wrong though.
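
    To illustrate the point that a short program can stand in for an arbitrarily long digit file (this is an illustration only, not how ZPAQ's built-in model works), here is the classic Gibbons unbounded spigot for pi in Python:

    def pi_digits():
        # Gibbons' unbounded spigot: yields decimal digits of pi one at a time.
        q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
        while True:
            if 4*q + r - t < n*t:
                yield n
                q, r, t, k, n, l = 10*q, 10*(r - n*t), t, k, (10*(3*q + r)) // t - 10*n, l
            else:
                q, r, t, k, n, l = q*k, (2*q + r)*l, t*l, k + 1, (q*(7*k + 2) + r*l) // (t*l), l + 2

    gen = pi_digits()
    print("".join(str(next(gen)) for _ in range(20)))  # 31415926535897932384

    The source stays a few hundred bytes no matter how many digits you want, at the cost of slow generation, which is the same trade-off as the tiny ZPAQ archive with slow "decompression".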
    Last edited by SvenBent; 18th August 2014 at 12:59.

  7. #7
    Member
    Join Date
    Aug 2014
    Location
    Overland Park, KS
    Posts
    17
    Thanks
    0
    Thanked 0 Times in 0 Posts

  8. #8
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    856
    Thanks
    45
    Thanked 104 Times in 82 Posts
    If that's the case, I'm pretty sure ZPAQ can just "compress it down" to the math formulas to begin with.
    Depending on the purpose of your benchmark, that would make the test itself invalid for its purpose.

    If your purpose is to improve compression by measuring improvements, then choosing artificial data instead of real-world data can lead to benchmark improvements that come at the cost of real-world improvements.


    Look at it this way: let's say you want to benchmark GPUs and you do it based solely on an artificial engine. Then improvements come around that make the GPU faster in that test but not in real games. Did your benchmark help that purpose at all, or did it make things worse by pushing developers to optimize for your benchmark instead of the real world?
    This was very true in the past with the old Ziff Davis benchmark. A new benchmark came out, and a bit later new graphics drivers appeared with improved "performance". However, the new driver had no improvement in any game or anywhere else; it was simply optimized solely for this one benchmark.
    This is also the reason why many hardware review sites moved away from artificial benchmarks and focused more on real-game benchmarks.

    So again: does your benchmark reflect what you want to measure? Does its existence serve its purpose? Those are the two big questions you have to ask yourself before starting.
    Last edited by SvenBent; 19th August 2014 at 01:59.

  9. #9
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    How about using a list of prime numbers like 10gb/benchmarks/primes from the 10 GB benchmark? It is a text file containing all 5,761,455 prime numbers less than 100 million, like this:

    2
    3
    5
    7
    ...
    99999941
    99999959
    99999971
    99999989

    Each line ends with a CR LF. The file size is 56,860,455 bytes.
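
    For reference, a file in the same format (one prime per line, CR LF line endings) can be regenerated with a straightforward sieve of Eratosthenes. This is just a sketch: the output name primes.txt is a placeholder, and the sieve needs roughly 100 MB of RAM:

    LIMIT = 100_000_000  # all primes below 100 million

    def primes_below(n):
        # Simple sieve of Eratosthenes over a byte array (~n bytes of RAM).
        sieve = bytearray([1]) * n
        sieve[0:2] = b"\x00\x00"
        for i in range(2, int(n ** 0.5) + 1):
            if sieve[i]:
                sieve[i*i::i] = bytearray(len(range(i*i, n, i)))
        return (i for i in range(n) if sieve[i])

    with open("primes.txt", "wb") as f:  # placeholder output name
        for p in primes_below(LIMIT):
            f.write(str(p).encode() + b"\r\n")  # one prime per line, CR LF ending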

    For your convenience I have attached the file compressed to a 97 byte zpaq archive (inside a 353 byte zip archive since I can't attach zpaq files in this forum). It takes about 10 seconds to decompress on a 2 GHz T3200. To extract:

    unzip primes.zpaq.zip
    zpaq x primes.zpaq
    Attached Files

  10. #10
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    856
    Thanks
    45
    Thanked 104 Times in 82 Posts
    For text compression benchmarks I use my log files from playing FF XI online. I like this because it has a lot of log-file character (repeated lines) and a lot of real-world text (chatting) mixed with lots of noise (spelling errors, different languages, weird acronyms). My log files are from 4 different accounts at the same time, so there are passages that are nearly identical but in files far from each other, and lines that are nearly identical close to each other.
    Also there is the aspect of randomness, since it contains data from multiple sources (the system and tons of people chatting).

    Example of the log file structure:
    [05:12:39]yYour 12 plates of sole sushi sold.
    [05:12:40]yYour 12 plates of sole sushi sold.
    [05:12:40]yYour 12 plates of sole sushi sold.
    [05:12:40]yYour 12 plates of sole sushi sold.
    [05:12:40]yYour 12 plates of sole sushi sold.


    Example of noise (a Japanese player shouting, Shift-JIS text rendered as mojibake):
    [05:12:28]Sangalia : ƒOƒ‰ƒ”ƒHƒCƒh‘厖Žæ‚©‚ç‚ ‚«‚Ü‚¹‚ñ‚©H
    [05:12:29]Sangalia : ƒgƒŠƒK[’ñ‹ŸŽÒA‘Žæ‚è‚Å‚·B
    Last edited by SvenBent; 20th August 2014 at 05:17.

