Results 1 to 5 of 5

Thread: Compression technique

  1. #1
    Member
    Join Date
    Apr 2015
    Location
    Norway
    Posts
    16
    Thanks
    5
    Thanked 2 Times in 2 Posts

    Compression technique

    Hi,

    I was wondering if any coders have tried this compression idea before, and know if there's a name for it. Its a lossless technique, and im wondering what the compression ratio could be for something like this.

    Let's say one has the compressed data. The compressed data has hamming weight number = 5, so (it's just an RLE-stream of 1's and) the data looks like this:
    11111

    To decompress we have an index to where to split the data and move the stream to the right, where the data is splitted and no ones, there is zero. Since during the iteration we will always have to move one step, the step number for one-step will be a zero. We choose that the step number is 1 next:

    110111

    The new start number starts where the bitstream was divided. Next step-number we choose are 0:

    1100111.

    (we could do this until we got the original, but lets just stop here).
    This is our decompressed data: 1100111 = 103.

    From the above, it looks like some kind of delta's. If anyone can confirm if this has anything to do with delta-coding please elaborate. Ive made delta encoder/decoder before, but can't tell if it is yet. In the case above 103 was compressed to 5. But we need to check wether or not the compression can be done backwards. And we can see that 5 are not enough, because we need two-step sequences. In this case we could use two two-bit numbers. So, we have the bitstream: 1100111. We use a technique to read bitpairs. first is 11 => do nothing. second is 10 => ... the bits are stored in a way that we can decode. The three 1's are moved from a position to the next posision -1. same after the next iteration. and the result is 11111 = 5. meanwhile we stored the position of where the splitting occurs.

  2. #2
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    In general it takes more than one bit to encode where to insert the zero bits, so there is no compression.

  3. #3
    Member
    Join Date
    Apr 2015
    Location
    Norway
    Posts
    16
    Thanks
    5
    Thanked 2 Times in 2 Posts
    Matt, you were right about the above.

    I worked more on the idea and found another possible solution?
    By reading bits one by one and storing the deltas between the lengths of consecutive bitstates.




    Maybe it's just RLE, or some form of it?
    After a just a few tests I found out that some data was compressed a tiny bit while some dont. so that it depends on how the bits are arranged.
    I want to test it with an transform-function.
    In the example here, 17-bits were compressed down to 16-bits, and no information is lost. Or am I missing something?
    Maybe it was an bad example since the bits are not pow2 number or anything. And dont know the length of the original bitstream.
    Each "cut&drag" are stored with two-bit numbers. Perhaps the size can be dynamic in size if bits of same state are larger.

    Im still working on some code to test things.

    [Edit: i sat 00 to 1 etc.. because zero never happens when bit change states it will always drag atleast once]
    Last edited by Rudi; 21st April 2015 at 21:39.

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    It won't work in general. Suppose your input is random bits. Then the probability that a run of identical bits will end on the next bit is always 1/2. The distribution of run lengths is:
    1 = 1/2.
    2 = 1/4.
    3 = 1/8.
    4 = 1/16
    ...
    n = 1/2^n

    and the best you can do is encode n using -log2(p) = -log2(1/2^n) = n bits. So there is no savings at all.

    More generally, random data is not compressible by any algorithm. http://mattmahoney.net/dc/dce.html#Section_11

  5. #5
    Member
    Join Date
    Apr 2015
    Location
    Norway
    Posts
    16
    Thanks
    5
    Thanked 2 Times in 2 Posts
    Sorry for the long delay. Ive been busy with other things..

    That's really interesting. I have to read your book. Probability maths is not my strongest side. I have programming background, assembler, c/c++ etc. you can see my 128 byte demonstrations. i found a clip of one i made here: http://www.youtube.com/watch?v=2Xnrf92qvw8 it works only in an emulator so i uploaded it as video instead. assembler source can be found on my homepage https://sites.google.com/site/rudibstranden/home/ when i make these i see the instructions (opcode and operands) mostly as bricks in a big puzzle game. making the pieces fit together. thats somehow im also trying to do when looking at compression from time to time. but compression is a different game, its a little more to read and theres math stuff i dont understand, so i try make my own ways to figure out stuff. im happy to get feedback and advices. i dont think compression is easily to visualize. i want to find out if there is some simple technique thats really logical and the best compressor which also is recursive either that the compressor decoder and/or decoded data also is part of the recursive process.

    when 2 is 1/4 probability, what i am interested in is the pattern it creates and i guess that would be part of what others use of terminology is the transform-function(?), wheter patterns also are part of a big pseudo random automaton (in the universe or not) i find it hard to accept that random data are not compressible, when one gets random data by compressing data. i read and saw some say that the laws of physics make it easy to go backwards and reverse (cit. Ed Fredkin).

    i bookmarked your book. i will read it when i get time to do it.
    Last edited by Rudi; 8th May 2015 at 04:46.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •