Thread: How to compress this data? (delta encoding)

  1. #1

    Hi!

    I need to quickly compress and decompress an array of 32-bit integers sorted in strictly increasing order (all items are unique, so each delta, i.e. the difference between the current and the previous item, is always greater than zero).

    I tried libvbyte (compression ratio approx. 3.9), libpfor (1.7-2) and RLE (expands the data by a factor of 1.6).
    But this data can be compressed much better, because most deltas are very small, with many zeros.
    I've just implemented a simple encoder using Rice codes: k = 2 compresses the data by a factor of 9, and k = 1 (unary code!) gives 11, which seems wrong.
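For readers following along, the delta plus Rice coding step described above can be sketched roughly as follows. This is a minimal illustration and not the poster's implementation: the function names are hypothetical, the gaps are stored as delta minus one (an assumption that would explain the zeros in the histograms), and the first value goes through the same code as the gaps, whereas a real encoder would probably store it raw. In the usual convention k = 0 is the plain unary case.

```python
def to_gaps(sorted_vals):
    """Strictly increasing values -> non-negative gaps (delta - 1); first gap is the first value."""
    gaps, prev = [], -1
    for v in sorted_vals:
        gaps.append(v - prev - 1)
        prev = v
    return gaps


def from_gaps(gaps):
    """Inverse of to_gaps."""
    out, prev = [], -1
    for g in gaps:
        prev += g + 1
        out.append(prev)
    return out


def rice_encode(values, k):
    """Return a list of bits: unary quotient (q ones, then a zero) followed by a k-bit remainder."""
    bits = []
    mask = (1 << k) - 1
    for v in values:
        q, r = v >> k, v & mask
        bits.extend([1] * q)
        bits.append(0)
        for i in range(k - 1, -1, -1):
            bits.append((r >> i) & 1)
    return bits


def rice_decode(bits, count, k):
    """Decode `count` values from a bit list produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos]:
            q += 1
            pos += 1
        pos += 1  # skip the terminating zero bit
        r = 0
        for _ in range(k):
            r = (r << 1) | bits[pos]
            pos += 1
        values.append((q << k) | r)
    return values


def pack_bits(bits):
    """Pack a bit list into bytes, most significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, b in enumerate(bits):
        if b:
            out[i >> 3] |= 0x80 >> (i & 7)
    return bytes(out)


def unpack_bits(data, n):
    """Recover the first n bits from packed bytes."""
    return [(data[i >> 3] >> (7 - (i & 7))) & 1 for i in range(n)]
```

A value v costs (v >> k) + 1 + k bits, so a small k pays off when most gaps are 0 or 1, at the price of very long codes for the rare large gaps. A bit-list representation like this one is slow; a fast coder would write into machine words directly.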

    Could anyone please recommend a better approach? Packing and unpacking should be reasonably fast.

    Here's a sample distribution of deltas and run lengths:
    5242 deltas
    deltas:
    3161 (60.30 %) values in [0 .. 1)
    743 (14.17 %) values in [1 .. 2)
    551 (10.51 %) values in [2 .. 4)
    151 (2.88 %) values in [4 .. 8)
    212 (4.04 %) values in [8 .. 16)
    168 (3.20 %) values in [16 .. 32)
    75 (1.43 %) values in [32 .. 64)
    54 (1.03 %) values in [64 .. 128)
    46 (0.88 %) values in [128 .. 256)
    29 (0.55 %) values in [256 .. 512)
    15 (0.29 %) values in [512 .. 1024)
    12 (0.23 %) values in [1024 .. 2048)
    12 (0.23 %) values in [2048 .. 4096)
    4 (0.08 %) values in [4096 .. 8192)
    5 (0.10 %) values in [8192 .. 16384)
    3 (0.06 %) values in [16384 .. 32768)
    Runs:
    1 (0.03 %) values in [0 .. 1)
    2265 (74.29 %) values in [1 .. 2)
    465 (15.25 %) values in [2 .. 4)
    267 (8.76 %) values in [4 .. 8)
    49 (1.61 %) values in [8 .. 16)
    2 (0.07 %) values in [16 .. 32)

    Average Run Length: 1.718924
    Maximum Run Length: 17


    3392 deltas
    deltas:
    1863 (54.92 %) values in [0 .. 1)
    606 (17.87 %) values in [1 .. 2)
    227 (6.69 %) values in [2 .. 4)
    115 (3.39 %) values in [4 .. 8)
    121 (3.57 %) values in [8 .. 16)
    101 (2.98 %) values in [16 .. 32)
    88 (2.59 %) values in [32 .. 64)
    63 (1.86 %) values in [64 .. 128)
    59 (1.74 %) values in [128 .. 256)
    58 (1.71 %) values in [256 .. 512)
    31 (0.91 %) values in [512 .. 1024)
    29 (0.85 %) values in [1024 .. 2048)
    14 (0.41 %) values in [2048 .. 4096)
    8 (0.24 %) values in [4096 .. 8192)
    7 (0.21 %) values in [8192 .. 16384)
    1 (0.03 %) values in [16384 .. 32768)
    Runs:
    1 (0.05 %) values in [0 .. 1)
    1537 (70.96 %) values in [1 .. 2)
    494 (22.81 %) values in [2 .. 4)
    134 (6.19 %) values in [4 .. 8)

    Average Run Length: 1.563249
    Maximum Run Length: 5
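One way to judge how much room is left is to estimate a bit cost per delta directly from a histogram like the ones above. The sketch below is only a rough estimate under stated assumptions: the bin edges are taken to be powers of two, values inside a bin are treated as equally likely (so a bin of width w costs log2(w) extra bits), and the bin index is charged its zero-order entropy. The counts are copied from the first sample; they sum to 5241, one less than the stated 5242 deltas. The function name is hypothetical.

```python
import math

# (low, high, count) for the first sample's delta histogram; bins are [low, high).
BINS = [
    (0, 1, 3161), (1, 2, 743), (2, 4, 551), (4, 8, 151),
    (8, 16, 212), (16, 32, 168), (32, 64, 75), (64, 128, 54),
    (128, 256, 46), (256, 512, 29), (512, 1024, 15), (1024, 2048, 12),
    (2048, 4096, 12), (4096, 8192, 4), (8192, 16384, 5), (16384, 32768, 3),
]


def estimate_bits_per_delta(bins):
    """Entropy of the bin index plus log2(bin width) for the offset inside the bin."""
    total = sum(count for _, _, count in bins)
    bits = 0.0
    for low, high, count in bins:
        if count == 0:
            continue
        p = count / total
        bits += count * (-math.log2(p) + math.log2(high - low))
    return bits / total


if __name__ == "__main__":
    est = estimate_bits_per_delta(BINS)
    print(f"about {est:.2f} bits per delta, ratio vs 32-bit about {32 / est:.1f}")
```

Comparing this figure with the ratio a given coder actually achieves gives a hint of whether further tuning is worthwhile. It ignores correlations between neighbouring deltas, so a context-modelling compressor could still do better.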


    In case you're wondering, I'm compressing volumetric data.
    Sorry for the stupid question; I'm new to this forum and to data compression in general.

  2. #2
    Shelwien (Administrator)
    Try this (with 7-Zip):
    7z a -mf=off -m0=delta:4 -m1=lzma:lc0:lp2:pb2:mc9999:fb273
    Further parameter tuning may help (e.g. increasing lc).
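A similar filter chain can be tried from code without the 7z tool. The sketch below uses Python's standard `lzma` module with a byte-wise delta filter of distance 4 followed by LZMA2 in an .xz container. It is an analogue of the command above and not an exact equivalent: the container and LZMA variant differ, `nice_len` is assumed to play the role of `fb`, and the `mc` setting is left at its default. The function names are hypothetical.

```python
import lzma
import struct

# Delta over bytes 4 apart (one 32-bit word), then LZMA2 with lc=0, lp=2, pb=2.
FILTERS = [
    {"id": lzma.FILTER_DELTA, "dist": 4},
    {"id": lzma.FILTER_LZMA2, "preset": 9, "lc": 0, "lp": 2, "pb": 2, "nice_len": 273},
]


def xz_pack(values):
    """Serialize unsigned 32-bit little-endian integers and compress them."""
    raw = struct.pack(f"<{len(values)}I", *values)
    return lzma.compress(raw, format=lzma.FORMAT_XZ, filters=FILTERS)


def xz_unpack(blob):
    """Decompress and deserialize; the filter chain is read from the .xz stream."""
    raw = lzma.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 4}I", raw))
```

With lp=2 and pb=2 the literal and position contexts line up with the 4-byte word boundary, which is presumably why those values were suggested for 32-bit data.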

  3. #3
    You can try "delta+variable simple" or the integrated differential TurboPFor "p4dd1d32" from the TurboPFor package.
    If you are more interested in compression ratio than in speed, you can try the "byte transpose4" or the "nibble transposen4" in conjunction with another LZ77 compressor.
    See the usage examples in the "icbench.c" benchmark app.
    Is it possible to upload a sample dataset?
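To illustrate the idea of a byte transpose followed by an LZ77-style compressor, here is a minimal sketch that does not use TurboPFor at all. It computes word-level deltas, regroups the bytes into four planes (all lowest bytes first, then the next byte of each delta, and so on), and hands the result to `zlib` as a stand-in LZ77 compressor. The function names are hypothetical and the input is assumed to be a non-empty, increasing list of 32-bit values.

```python
import struct
import zlib


def delta_transpose_compress(values):
    """Delta-encode, split into 4 byte planes, then deflate. Assumes a non-empty input."""
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    raw = struct.pack(f"<{len(deltas)}I", *deltas)
    # Plane j holds byte j of every 32-bit delta, so the mostly-zero high bytes sit together.
    planes = b"".join(raw[j::4] for j in range(4))
    return zlib.compress(planes, 9)


def delta_transpose_decompress(blob, count):
    """Inverse of delta_transpose_compress; `count` is the number of integers."""
    planes = zlib.decompress(blob)
    raw = bytearray(4 * count)
    for j in range(4):
        raw[j::4] = planes[j * count:(j + 1) * count]
    out, acc = [], 0
    for d in struct.unpack(f"<{count}I", bytes(raw)):
        acc += d
        out.append(acc)
    return out
```

Because most deltas fit in one byte, three of the four planes are almost entirely zero and compress to very little, while the remaining plane carries the small values.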

  4. #4
    Have you found a solution yet? I am facing the same problem, so please let me know.

  5. #5
    Originally posted by arjupraja143:
    "Have you found a solution yet? I am facing the same problem, so please let me know."
    You can try my GearEnc:
    http://encode.ru/threads/545-GearEnc...-gear-encoding

  6. #6
    Thanks, sportman, for the reference. I will look into it tonight.
