
Thread: GearEnc - test application gear encoding

  1. #1
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts

    GearEnc - test application gear encoding

    Kwc uses what I call “Gear Encoding” to encode a list of dictionary IDs.
    Gear encoding is only useful for value sequences that slowly go up and can be mixed with lower values from the range seen so far.

    For example values:
    2,1,4,3,1,3,3,6,1,9,5,11,4,7,15,9,5,13,11,4,17,18,6,21,24,8,17,27,4,12,28,19,5,31
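
To make the "slowly up, mixed with lower values" property concrete, here is a small Python sketch that only profiles the example values above. It is not the Gear Encoding algorithm itself (the post does not describe the bit layout); the function name running_max_profile and the assumption that values are non-negative integers are mine.

```python
def running_max_profile(values):
    """Return (value, max_before, is_new_high) for each value, in order."""
    profile = []
    max_so_far = 0  # assumption: values are non-negative integers
    for v in values:
        profile.append((v, max_so_far, v > max_so_far))
        max_so_far = max(max_so_far, v)
    return profile


example = [2, 1, 4, 3, 1, 3, 3, 6, 1, 9, 5, 11, 4, 7, 15, 9, 5, 13, 11, 4,
           17, 18, 6, 21, 24, 8, 17, 27, 4, 12, 28, 19, 5, 31]

profile = running_max_profile(example)
# Values that raise the running maximum; everything else falls inside
# the range that was already seen at that point.
new_highs = [v for v, _, is_high in profile if is_high]
print(len(example), "values,", len(new_highs), "new highs:", new_highs)
```

For this list, 13 of the 34 values raise the maximum and the other 21 are lower values from the range seen so far.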

    I wrote a test application, GearEnc, which accepts four types of input for encoding:

    1. Text (x,y,z) - for example the values above, on one row, comma separated
    2. Int8 - binary file with 8 bit integers
    3. Int16 - binary file with 16 bit integers
    4. Int32 - binary file with 32 bit integers
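
For anyone who wants to prepare their own test input, the text and binary types can be converted into each other along these lines. This is a sketch under assumptions: the post does not state byte order or signedness, so little-endian unsigned integers are my guess (a Windows/.NET application would typically write little-endian), and the names FORMATS, text_to_binary and binary_to_text are made up for this example.

```python
import struct

# struct codes for the three binary input types.
# Assumption: unsigned, little-endian; the post does not say.
FORMATS = {"int8": "B", "int16": "H", "int32": "I"}


def text_to_binary(text, kind="int16"):
    """Pack comma separated values (x,y,z) into fixed-width integers."""
    values = [int(tok) for tok in text.split(",") if tok.strip()]
    return struct.pack("<%d%s" % (len(values), FORMATS[kind]), *values)


def binary_to_text(data, kind="int16"):
    """Unpack fixed-width integers back into comma separated text."""
    code = FORMATS[kind]
    count = len(data) // struct.calcsize("<" + code)
    values = struct.unpack("<%d%s" % (count, code), data[:count * struct.calcsize("<" + code)])
    return ",".join(str(v) for v in values)


packed = text_to_binary("2,1,4,3", kind="int16")
print(len(packed), "bytes ->", binary_to_text(packed, kind="int16"))
```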

    The output is a gear-encoded binary file of the values found in the input.

    Decoding is the opposite: it needs a gear-encoded binary file as input, and one of the four types (Text, Int8, Int16, Int32) can be selected as output.

    The text input is handy for testing manually typed values from a text editor, but it is slower than binary input for a very large number of values.

    GearEnc download link:
    http://www.metacompressor.com/download/gearenc.zip

    Two files that can be used as input:

    Example text file with 2211 values:
    http://www.metacompressor.com/download/values.zip

    The same 2211 values but as Int16 binary file:
    http://www.metacompressor.com/download/values2.zip

    It's also written in Visual Basic 2008, and Microsoft .NET Framework 3.5 is needed to run this application. This time there is only a Windows 32/64-bit GUI.

    I'm curious for what data this encoding can be useful and how it compares against other encodings.

    It must be possible to write a version that is also effective on value ranges that go down after a while, then up again after a while, and so on.

  2. #2
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    I added command line support to GearEnc and updated the download file.

    Command line parameters:

    Encoding:
    gearenc e x input output

    Decoding:
    gearenc d x input output

    Where x is the file type of the input when encoding, or of the output when decoding, and can be 1, 2, 3 or 4:
    1 = text (x,y,z)
    2 = Int8
    3 = Int16
    4 = Int32

    So for example:
    gearenc e 1 values.txt values.ge
    gearenc d 1 values.ge values.txt

    Results for example input files values.txt and values2.bin:
    http://www.metacompressor.com/upload...estfile=values
    http://www.metacompressor.com/upload...stfile=values2
    Last edited by Sportman; 23rd January 2010 at 07:07.

  3. #3
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    I saw that the GUI's default file extension settings for encoding and decoding were switched after adding command line support. I have fixed and improved this, and updated the download file.

  4. #4
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    A new feature has been added to GearEnc to get a better output file size reduction for sorted input values. It's now possible to enable Delta; in that case the differences between consecutive input values are gear encoded instead of the input values themselves.

    Delta can only be enabled when the input values are sorted from low to high.
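
The Delta step on its own can be sketched like this in Python. It only shows the difference transform and its inverse, not the gear encoding applied afterwards; keeping the first value unchanged is an assumption of mine, and the function names are hypothetical.

```python
from itertools import accumulate


def delta_encode(values):
    """Replace each value by its difference from the previous one."""
    if any(b < a for a, b in zip(values, values[1:])):
        raise ValueError("Delta needs input sorted from low to high")
    # Assumption: the first value is stored unchanged.
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values as running sums of the differences."""
    return list(accumulate(deltas))


primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
deltas = delta_encode(primes)
print(deltas)  # [2, 1, 2, 2, 4, 2, 4, 2, 4, 6]
assert delta_decode(deltas) == primes
```

The sorted requirement is what keeps every difference non-negative, and for data like primes the differences stay small while the values themselves keep growing.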

    In older GearEnc versions the value zero was only allowed as the first value. To support multiple zeros and Delta I changed the code. Because of that, this version is not backward compatible with older versions: it still works, but all output values are the original input values minus one.

    To enable Delta from the command line, add d after the input or output type, so 1d, 2d, 3d or 4d.

    Examples with Delta enabled for text input:

    gearenc e 1d primes.txt primes.ge
    gearenc d 1d primes.ge primes.txt

    or for binary Int32 input:

    gearenc e 4d primes2.bin primes2.ge
    gearenc d 4d primes2.ge primes2.bin
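
For scripted tests, the command lines above can be assembled from Python roughly as follows. Only the parameters shown in this thread are used (e/d, type 1 to 4, optional d suffix); the helper names gearenc_args and run_gearenc, and the assumption that gearenc is on the PATH, are mine.

```python
import subprocess

# Type codes as listed in the thread.
TYPE_CODES = {"text": "1", "int8": "2", "int16": "3", "int32": "4"}


def gearenc_args(mode, kind, src, dst, delta=False, exe="gearenc"):
    """Build the argument list: gearenc e|d <type>[d] input output."""
    if mode not in ("e", "d"):
        raise ValueError("mode must be 'e' (encode) or 'd' (decode)")
    type_arg = TYPE_CODES[kind] + ("d" if delta else "")
    return [exe, mode, type_arg, src, dst]


def run_gearenc(mode, kind, src, dst, delta=False, exe="gearenc"):
    """Run GearEnc; assumes the executable can be found as `exe`."""
    subprocess.run(gearenc_args(mode, kind, src, dst, delta, exe), check=True)


print(" ".join(gearenc_args("e", "int32", "primes2.bin", "primes2.ge", delta=True)))
```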

    Updated download file:
    http://www.metacompressor.com/download/gearenc.zip

    Test input files with all prime values up to 100,000, sorted:

    As Text:
    http://www.metacompressor.com/download/primes.zip

    As binary Int32:
    http://www.metacompressor.com/download/primes2.zip

    Results:
    http://www.metacompressor.com/upload...estfile=primes
    http://www.metacompressor.com/upload...stfile=primes2
    Last edited by Sportman; 25th January 2010 at 06:49.

  5. #5
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Why don't you try encoding wavs - they're arrays of int16 too, in a way
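
To try this, the samples of a 16-bit PCM WAV can be dumped into a raw Int16 file with Python's standard wave module, roughly as below. Note the assumptions: WAV samples are signed and interleaved per channel, and the thread does not say whether GearEnc accepts negative values, so this only prepares the data. The function name wav_to_int16_raw is hypothetical.

```python
import wave


def wav_to_int16_raw(wav_path, raw_path):
    """Copy the raw 16-bit samples of a PCM WAV file into a headerless file.

    Returns the number of samples written (all channels together).
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as out:
        out.write(frames)  # signed little-endian int16, channels interleaved
    return len(frames) // 2
```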

  6. #6
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    After Matt Mahoney's prime numbers test http://encode.ru/threads/1723-Compressing-prime-numbers I was curious what my old GearEnc could do. I generated the same prime number text file and adjusted GearEnc so that it can handle a two-character separator (CR/LF). This version of GearEnc is not publicly available.
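
A test file of this kind can be generated with a plain sieve, for example as in the Python sketch below. This is not the generator used for the test above; the function names are made up, and ending every row (including the last) with CR/LF is an assumption. For a limit of 100,000,000 the sieve needs about 100 MB of memory plus the list of primes.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in ascending order."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for n in range(2, math.isqrt(limit) + 1):
        if sieve[n]:
            # Clear every multiple of n, starting at n*n.
            sieve[n * n::n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n, flag in enumerate(sieve) if flag]


def write_primes(path, limit):
    """Write one prime per row, each row closed with CR/LF (assumption)."""
    with open(path, "wb") as out:
        for p in primes_up_to(limit):
            out.write(b"%d\r\n" % p)


print(primes_up_to(30))
```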

    Input ramdisk:
    56,860,455 bytes - text file with the prime numbers up to 100,000,000, one per row

    Output ramdisk:
    26,141,932 bytes, 0.141 sec. - 0.037 sec., LZ4 - 0
    25,211,293 bytes, 0.137 sec. - 0.043 sec., lzturbo - 10
    24,201,678 bytes, 0.243 sec. - 0.050 sec., lzturbo - 11
    24,154,918 bytes, 0.514 sec. - 0.050 sec., lzturbo - 12
    24,141,574 bytes, 2.724 sec. - 0.038 sec., LZ4 - 1
    23,560,734 bytes, 3.586 sec. - 0.032 sec., LZ4 - 2
    23,030,272 bytes, 18.663 sec. - 0.054 sec., lzturbo - 19
    22,558,992 bytes, 0.129 sec. - 0.059 sec., lzturbo - 20
    21,463,555 bytes, 0.219 sec. - 0.057 sec., lzturbo - 21
    21,380,336 bytes, 1.062 sec. - 0.260 sec., eXdupe - 1
    21,219,907 bytes, 0.603 sec. - 0.061 sec., lzturbo - 22
    20,890,625 bytes, 0.116 sec. - 0.117 sec., Qpress - 1
    20,595,651 bytes, 0.239 sec. - 0.137 sec., Qpress - 2
    20,091,353 bytes, 0.751 sec. - 0.090 sec., Qpress - 3
    19,197,016 bytes, 2.304 sec. - 1.311 sec., Bsc - 0
    18,837,068 bytes, 1.447 sec. - 1.985 sec., Bsc - 6
    18,742,959 bytes, 1.824 sec. - 0.859 sec., WinZpaq - 1
    18,620,323 bytes, 11.831 sec. - 12.749 sec., WinZpaq - 3
    17,494,723 bytes, 1.428 sec. - 0.040 sec., eXdupe - 2
    17,404,957 bytes, 20.000 sec. - 0.065 sec., lzturbo - 29
    17,195,192 bytes, 1.274 sec. - 1.582 sec., Bsc - 5
    16,852,206 bytes, 4.053 sec. - 0.013 sec., eXdupe - 3
    15,738,813 bytes, 0.421 sec. - 0.410 sec., FreeArc - 1
    14,853,886 bytes, 1.047 sec. - 1.272 sec., Bsc - 4
    14,719,382 bytes, 0.185 sec. - 0.154 sec., lzturbo - 30
    14,108,727 bytes, 3.249 sec. - 0.258 sec., WinRAR - 2
    13,886,879 bytes, 0.286 sec. - 0.176 sec., lzturbo - 31
    13,674,553 bytes, 0.618 sec. - 0.272 sec., WinRAR - 1
    13,130,635 bytes, 1.143 sec. - 0.505 sec., FreeArc - 2
    12,948,063 bytes, 0.677 sec. - 0.181 sec., lzturbo - 32
    12,462,484 bytes, 0.900 sec. - 1.018 sec., Bsc - 3
    11,908,956 bytes, 22.679 sec. - 0.257 sec., WinRAR - 5
    11,908,261 bytes, 16.790 sec. - 0.257 sec., WinRAR - 4
    11,905,376 bytes, 8.066 sec. - 0.257 sec., WinRAR - 3
    11,479,736 bytes, 4.777 sec. - 4.533 sec., WinZpaq - 2
    10,713,804 bytes, 3.744 sec. - 0.617 sec., 7-Zip - 4
    10,711,906 bytes, 3.736 sec. - 0.616 sec., 7-Zip - 3
    10,566,694 bytes, 3.178 sec. - 0.865 sec., FreeArc - 3
    10,441,108 bytes, 3.456 sec. - 0.611 sec., 7-Zip - 2
    10,140,092 bytes, 3.086 sec. - 0.604 sec., 7-Zip - 1
    7,947,811 bytes, 23.806 sec. - 0.167 sec., lzturbo - 39
    5,747,984 bytes, 3.942 sec. - 1.630 sec., GearEnc - 5d
    5,613,687 bytes, 30.251 sec. - 0.497 sec., lzturbo - 49
    5,600,704 bytes, 31.410 sec. - 0.733 sec., FreeArc - 5
    5,598,275 bytes, 12.632 sec. - 0.739 sec., FreeArc - 4
    5,521,436 bytes, 18.608 sec. - 0.519 sec., 7-Zip - 5
    5,500,608 bytes, 10.131 sec. - 10.461 sec., ZCM - 7
    5,452,517 bytes, 35.524 sec. - 37.678 sec., WinZpaq - 4
    5,390,592 bytes, 9.616 sec. - 9.876 sec., ZCM - 6
    5,358,112 bytes, 9.576 sec. - 9.833 sec., ZCM - 5
    5,356,405 bytes, 9.486 sec. - 9.749 sec., ZCM - 4
    5,282,428 bytes, 9.381 sec. - 9.622 sec., ZCM - 3
    5,193,795 bytes, 9.245 sec. - 9.497 sec., ZCM - 2
    5,161,014 bytes, 9.201 sec. - 9.416 sec., ZCM - 1
    5,101,762 bytes, 9.423 sec. - 9.672 sec., ZCM - 0

    1 thread tested.
    Last edited by Sportman; 16th May 2013 at 21:13.

  7. #7
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Input ramdisk:
    5,747,984 bytes, GearEnc file with the gear-encoded output of the text file with the prime numbers up to 100,000,000, one per row

    Output ramdisk:
    5,743,357 bytes, 2.395 sec. - 0.01 sec., lzturbo - 39
    4,913,084 bytes, 0.044 sec. - 0.014 sec., lzturbo - 20
    4,803,086 bytes, 0.74 sec. - 0.161 sec., eXdupe - 1
    4,794,661 bytes, 0.031 sec. - 0.027 sec., Qpress - 1
    4,789,221 bytes, 0.049 sec. - 0.012 sec., LZ4 - 0
    4,779,901 bytes, 0.001 sec. - 0.011 sec., lzturbo - 10
    4,707,467 bytes, 0.071 sec. - 0.013 sec., lzturbo - 21
    4,665,192 bytes, 0.065 sec. - 0.011 sec., lzturbo - 11
    4,632,454 bytes, 0.179 sec. - 0.011 sec., lzturbo - 12
    4,599,518 bytes, 0.101 sec. - 0.012 sec., LZ4 - 1
    4,549,183 bytes, 0.121 sec. - 0.02 sec., Qpress - 3
    4,536,349 bytes, 0.124 sec. - 0.012 sec., LZ4 - 2
    4,532,533 bytes, 0.196 sec. - 0.014 sec., lzturbo - 22
    4,523,193 bytes, 0.052 sec. - 0.031 sec., Qpress - 2
    4,523,062 bytes, 2.03 sec. - 0.011 sec., lzturbo - 19
    4,276,879 bytes, 2.176 sec. - 0.016 sec., lzturbo - 29
    4,274,133 bytes, 0.35 sec. - 0.131 sec., WinZpaq - 1
    3,692,024 bytes, 0.759 sec. - 0.073 sec., eXdupe - 2
    3,557,356 bytes, 0.147 sec. - 0.066 sec., WinRAR - 1
    3,531,285 bytes, 0.258 sec. - 0.037 sec., lzturbo - 32
    3,510,338 bytes, 0.456 sec. - 0.155 sec., 7-Zip - 1
    3,506,803 bytes, 0.335 sec. - 0.158 sec., FreeArc - 2
    3,495,145 bytes, 0.058 sec. - 0.037 sec., lzturbo - 30
    3,493,302 bytes, 0.591 sec. - 0.146 sec., 7-Zip - 2
    3,492,292 bytes, 0.128 sec. - 0.141 sec., FreeArc - 1
    3,489,776 bytes, 1.018 sec. - 0.139 sec., 7-Zip - 4
    3,487,994 bytes, 0.782 sec. - 0.141 sec., 7-Zip - 3
    3,480,164 bytes, 1.087 sec. - 0.038 sec., eXdupe - 3
    3,475,201 bytes, 0.475 sec. - 0.061 sec., WinRAR - 2
    3,468,011 bytes, 0.083 sec. - 0.037 sec., lzturbo - 31
    3,458,963 bytes, 0.753 sec. - 0.061 sec., WinRAR - 3
    3,458,615 bytes, 0.781 sec. - 0.06 sec., WinRAR - 4
    3,458,607 bytes, 0.785 sec. - 0.06 sec., WinRAR - 5
    3,458,192 bytes, 0.636 sec. - 0.259 sec., FreeArc - 3
    3,336,364 bytes, 1.392 sec. - 0.244 sec., FreeArc - 4
    3,308,639 bytes, 2.242 sec. - 0.233 sec., FreeArc - 5
    3,308,400 bytes, 2.083 sec. - 0.131 sec., 7-Zip - 5
    3,282,971 bytes, 1.219 sec. - 0.964 sec., WinZpaq - 2
    3,182,377 bytes, 3.047 sec. - 0.159 sec., lzturbo - 49
    3,157,149 bytes, 1.402 sec. - 1.448 sec., WinZpaq - 3
    3,151,082 bytes, 0.121 sec. - 0.259 sec., Bsc - 3
    3,150,408 bytes, 0.371 sec. - 0.353 sec., Bsc - 6
    3,150,388 bytes, 0.126 sec. - 0.301 sec., Bsc - 4
    3,150,352 bytes, 0.308 sec. - 0.168 sec., Bsc - 0
    3,150,316 bytes, 0.184 sec. - 0.368 sec., Bsc - 5
    3,111,559 bytes, 4.702 sec. - 4.898 sec., WinZpaq - 4
    3,089,594 bytes, 3.036 sec. - 2.952 sec., ZCM - 7
    3,038,809 bytes, 306.546 sec. - x.xxx sec., paq8pxd - 8
    3,006,548 bytes, 14.476 sec. - 14.38 sec., WinZpaq - 9
    3,006,546 bytes, 13.567 sec. - 13.887 sec., WinZpaq - 8
    3,006,530 bytes, 13.243 sec. - 13.455 sec., WinZpaq - 7
    3,006,528 bytes, 13.058 sec. - 13.311 sec., WinZpaq - 5
    3,006,527 bytes, 13.058 sec. - 13.385 sec., WinZpaq - 6

    1 thread tested.

  8. #8
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    772
    Thanks
    63
    Thanked 270 Times in 190 Posts
    Updated GearEnc to version 0.0.0.5. Added support and auto-detection for any one or two byte non-numeric separator in text mode, added support for a closing separator in text mode, and added a 2 or 3 byte header to the output file, so that parameter d alone is enough to decode. The disadvantage is that this version of GearEnc can no longer be used to convert between the text, 8, 16 and 32 bit formats, because the output is fixed to the original input format. In tests I found out that in all GearEnc versions the maximum supported input value is 2^26 when the delta option is disabled.
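
The separator auto-detection in text mode can be imagined along these lines. This Python sketch is not GearEnc's code (which is written in Visual Basic and not shown in this thread); it assumes non-negative values, one consistent separator per file, and the names detect_separator and parse_values are hypothetical.

```python
import re


def detect_separator(text):
    """Return the first run of one or two non-digit characters between digits."""
    match = re.search(r"\d(\D{1,2})\d", text)
    return match.group(1) if match else None


def parse_values(text):
    """Split text on the detected separator, allowing a closing separator."""
    sep = detect_separator(text)
    if sep is None:
        # Zero or one value: just pick up whatever digits there are.
        return [int(tok) for tok in re.findall(r"\d+", text)]
    if text.endswith(sep):
        text = text[:-len(sep)]  # drop the closing separator
    return [int(tok) for tok in text.split(sep)]


print(parse_values("2,1,4,3"))
print(parse_values("2\r\n3\r\n5\r\n"))
```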

    GearEnc download link:
    http://www.metacompressor.com/download/gearenc.zip

    Example:
    gearenc e 1d primes.txt primes.ge
    gearenc d primes.ge primes.txt

    Matt Mahoney's prime numbers test file, compressed with GearEnc 1d and then zpaq -5 to 3,006,530 bytes:
    http://www.metacompressor.com/download/primes.zpaq
    Last edited by Sportman; 19th May 2013 at 04:49.
