
Thread: Sensor Data Compression

  1. #1
    Member
    Join Date
    Dec 2010
    Location
    Greece
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Sensor Data Compression

    Dear all,

    Let's assume we have a sensor field with dimensions M*M. Before applying any data compression technique, I first want to know the compression limit, i.e. the minimum entropy, of the entire sensor field. How can I compute the minimum entropy or compression limit for the sensor field?

    Thanks in advance

  2. #2
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    1. Compress it with a strong CM like paq8 - http://paq8.hys.cz/
    2. Depending on your data type, it may help to preprocess the values so that they look like some popular data type,
    e.g. a picture in your case (see the sketch after this list).
    3. Also, here's a coder for floating-point data - http://www.csl.cornell.edu/~burtscher/research/FPC/ -
    paq8 doesn't handle floats very well.
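
    For illustration, a minimal Python sketch of point 2, assuming the field is M*M unsigned 16-bit little-endian samples (M, the file names, and the sample layout are assumptions, not anything stated in the thread):

    Code:
    import struct

    M = 256  # sensor field dimension - placeholder value
    with open("sensor.raw", "rb") as f:  # hypothetical input file
        raw = f.read()
    samples = struct.unpack("<%dH" % (M * M), raw[:2 * M * M])

    # Split each 16-bit sample into high/low byte planes; each plane is
    # an M*M 8-bit grayscale image whose 2D structure paq8 can model.
    hi = bytes(s >> 8 for s in samples)
    lo = bytes(s & 0xFF for s in samples)
    for name, plane in (("sensor_hi.pgm", hi), ("sensor_lo.pgm", lo)):
        with open(name, "wb") as f:
            f.write(b"P5\n%d %d\n255\n" % (M, M))  # binary PGM header
            f.write(plane)

    Keeping both planes makes the transform lossless; compressing the two .pgm files with paq8 then gives an upper-bound estimate of the data's entropy under an image model.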

  3. #3
    Programmer osmanturan
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    As Shelwien implied, there is no direct measurement, so you have to test with one of the strongest compressors (i.e. paq8, as Shelwien suggested). I assume your sensors are not a CCD array. If so, you have to apply some preprocessing to make the data more compressible, because most compressors work on bytes (8 bits) while sensor outputs are usually 10 to 12 bits wide - not to mention the noise, which is always present.
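
    For illustration, a minimal Python sketch of such an unpacking step, assuming two 12-bit samples packed big-endian into every three bytes (the packing layout and file name are assumptions):

    Code:
    def unpack_12bit(packed: bytes) -> list:
        # Assumed layout: bytes b0 b1 b2 hold two 12-bit samples:
        # s0 = b0<<4 | b1>>4,  s1 = (b1 & 0x0F)<<8 | b2
        samples = []
        for i in range(0, len(packed) - 2, 3):
            b0, b1, b2 = packed[i], packed[i + 1], packed[i + 2]
            samples.append((b0 << 4) | (b1 >> 4))
            samples.append(((b1 & 0x0F) << 8) | b2)
        return samples

    # Byte-oriented compressors then see whole-byte values, e.g. as planes:
    samples = unpack_12bit(open("sensor.raw", "rb").read())
    hi = bytes(s >> 8 for s in samples)    # top 4 bits of each sample
    lo = bytes(s & 0xFF for s in samples)  # low 8 bits, where noise tends to sit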

    If you want a quick response to your question, it's better to share some sample data along with information about the data structure. The characteristics of the data source are also very important.
    BIT Archiver homepage: www.osmanturan.com

  4. #4
    Member
    Join Date
    Dec 2010
    Location
    Greece
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for your quick response.

    Actually, I want the theoretical compression limit. Let's pose the problem for an image: does any mathematical method exist to calculate the theoretical compression limit? Please suggest any readings that would help me formulate the problem.

    Thanks

  5. #5
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Sure, there are some theoretical results, but without a formal model of your data we can only apply something simple,
    like memoryless models for specific probability distribution types (and even that may be wrong if you work with floats).
    Also, it's very likely that any ad hoc estimation would be off by an order of magnitude.
    And I don't see why paq8 can't be considered a "theoretical estimator" - sure, it's based on an iterative formula, but so what?
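
    For illustration, a minimal Python sketch of such a memoryless (order-0) estimate over bytes; it ignores all context, so, as said above, it can easily overestimate the real limit by a lot:

    Code:
    import math
    from collections import Counter

    def order0_entropy(data: bytes) -> float:
        """Memoryless entropy estimate, in bits per byte."""
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

    data = open("sensor.raw", "rb").read()  # hypothetical sample file
    print("order-0 estimate: %.0f bytes" % (order0_entropy(data) * len(data) / 8))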

    Btw, just for fun, you can ask your question there - http://stats.stackexchange.com/

  6. #6
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    What's the maximum compression limit? A BARF-like solution ( http://cs.fit.edu/~mmahoney/compression/barf.html )? Or Kolmogorov complexity?

  7. #7
    Member
    Join Date
    Dec 2010
    Location
    Greece
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Entropy calculations for fully specified data have been used to get a theoretical bound on how much that data can be compressed. My specified field is the data from the sensor field, and I want to calculate the minimum entropy (assuming lossless compression) for the entire sensor field, not for individual nodes.

    Quote Originally Posted by Piotr Tarsa
    What's the maximum compression limit? A BARF-like solution ( http://cs.fit.edu/~mmahoney/compression/barf.html )? Or Kolmogorov complexity?

  8. #8
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    It's really easy to calculate that entropy.
    It's Sum[ -log2( bit[i]*p[i] + (1-bit[i])*(1-p[i]) ), i=0..N-1 ],
    where bit[i] are the bits of your data, N is the number of bits, and
    p[i] is the probability of bit[i]==1.
    But the p[i] values are defined by _the_model_, and we don't know anything about your data.
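
    For illustration, a minimal Python sketch of this sum, with p[i] supplied by a toy adaptive order-0 bit model (the model choice is an assumption; any model that outputs P(bit[i]==1) plugs in the same way):

    Code:
    import math

    def ideal_code_length(bits) -> float:
        """Sum[ -log2(bit[i]*p[i] + (1-bit[i])*(1-p[i])) ] with p[i]
        taken from a toy adaptive order-0 model (Laplace estimator)."""
        n0 = n1 = 1
        total = 0.0
        for b in bits:
            p1 = n1 / (n0 + n1)  # model's P(bit[i] == 1)
            total += -math.log2(p1 if b else 1.0 - p1)
            if b:
                n1 += 1
            else:
                n0 += 1
        return total  # in bits

    data = open("sensor.raw", "rb").read()  # hypothetical sample file
    bits = [(byte >> k) & 1 for byte in data for k in range(7, -1, -1)]
    print(ideal_code_length(bits) / 8, "bytes")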

  9. #9
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    If the input data to be compressed is very large (e.g. several terabytes), then the size of the decompressor attached to the compressed data is negligible. Under that assumption, computing the minimum compressed size is about equal to computing Kolmogorov complexity, which is practically uncomputable. The compressed data is our programming language and the decompressor is its interpreter.

    If you have the probability model, then you can apply the equation Shelwien provided and get the result. Or maybe not - Shelwien didn't include contexts in his equation.

    The problem of finding the compression limit should be about as difficult as writing a compressor that achieves it. Am I right?

  10. #10
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > Shelwien didn't include contexts in his equation.

    I actually did - note the "p[i] is the probability of bit[i]==1",
    i.e. each bit has its own probability estimate, and these can be computed using contexts or whatever else.

    > The problem of finding the compression limit should be about as difficult as writing a compressor that achieves it. Am I right?

    That depends on the approach - for example, it's easier to compute (n0+n1)!/n0!/n1! than to actually encode such a bit permutation.
    Also, some approximations can be applied for redundancy measurement, but not for actual (decodable) compression.
    But with a bitwise statistical approach, like the one I described, making an actual coder would certainly be easier, as
    there are reasonably good open-source examples of that.
    Also, it's a good idea to write a compressor first anyway, because otherwise (without decoding tests) it's really
    easy to make a subtle mistake (e.g. use some "future" information as context somewhere), which would make the estimation
    completely wrong.
    Whereas when a working compressor already exists, it should be easy enough to write an approximate (and simplified) formula
    that produces similar results.
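
    For illustration, a minimal Python sketch of that count: log2( (n0+n1)!/n0!/n1! ) is the ideal code length for a bit string whose only known structure is its zero/one counts, and lgamma avoids forming the huge factorials:

    Code:
    import math

    def permutation_code_length(n0: int, n1: int) -> float:
        """log2((n0+n1)! / (n0! * n1!)), in bits, via log-gamma."""
        lg = math.lgamma
        return (lg(n0 + n1 + 1) - lg(n0 + 1) - lg(n1 + 1)) / math.log(2)

    # Ideal size in bytes for a megabit whose only structure is 10% ones;
    # slightly below n*H(0.1)/8 thanks to the enumerative saving.
    print(permutation_code_length(900_000, 100_000) / 8)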

  11. #11
    Expert Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    The theoretical limit of compression is Kolmogorov complexity, which is not computable. If you know the probability distribution, then the theoretical limit is the entropy given by Shannon for which there are easy solutions like arithmetic coding. However, the probability distribution is not computable in general. To get the best practical compression, you need to know what the data means, because prediction is the same as understanding. See chapter 1 of http://mattmahoney.net/dc/dce.html

  12. #12
    Member
    Join Date
    Dec 2010
    Location
    LA
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Data compression

    There are not many options for this kind of task. Data like this should be compressed with a CM compressor, and it all depends on what kind of data you have. Many sites have tutorials on this, which you can find via Google.



