20th April 2015, 13:40
I was wondering if any coders have tried this compression idea before, and know if there's a name for it. It's a lossless technique, and I'm wondering what the compression ratio could be for something like this.
Let's say one has the compressed data. The compressed data has Hamming weight 5, so (it's just an RLE stream of 1's and) the data looks like this:
To decompress, we have an index telling us where to split the data and shift the rest of the stream to the right; where the data is split and there are no ones, a zero is inserted. Since each iteration always has to move at least one step, a one-step move is encoded as step number zero. We choose step number 1 next:
The new start position is where the bitstream was divided. The next step number we choose is 0:
(we could do this until we got the original back, but let's just stop here).
This is our decompressed data: 1100111 = 103.
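My reading of the steps above, as a small sketch (my own Python, not the poster's code; `insert_zero` and the split positions are my assumptions about what the stored index means): start from the all-ones stream given by the Hamming weight, then each stored split inserts a zero and shifts the tail right.

```python
def insert_zero(bits: str, pos: int) -> str:
    """Split the stream at pos and shift the tail right, filling the gap with a zero."""
    return bits[:pos] + "0" + bits[pos:]

s = "1" * 5              # Hamming weight 5 -> "11111"
s = insert_zero(s, 2)    # -> "110111"
s = insert_zero(s, 2)    # -> "1100111"
print(s, int(s, 2))      # 1100111 103
```

Two splits at position 2 are enough here to reach 1100111 = 103.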
From the above, it looks like some kind of deltas. If anyone can confirm whether this has anything to do with delta coding, please elaborate. I've made a delta encoder/decoder before, but can't tell if this is one yet. In the case above, 103 was compressed to 5. But we need to check whether the compression can be done backwards, and we can see that 5 alone is not enough, because we need the two-step sequences as well. In this case we could use two two-bit numbers. So, we have the bitstream 1100111, and we use a technique that reads bit pairs: the first is 11 => do nothing; the second is 10 => ... The bits are stored in a way that we can decode: the three 1's are moved from one position to the next position minus 1, the same again after the next iteration, and the result is 11111 = 5. Meanwhile we store the positions where the splitting occurs.
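For the backwards (compression) direction, a hedged sketch of how I understand it (my own Python and naming, not from the thread): repeatedly delete a zero and record its index, leaving the all-ones stream whose length is the Hamming weight.

```python
def strip_zeros(bits: str):
    """Remove zeros one at a time, recording where each split occurred."""
    positions = []
    s = bits
    while "0" in s:
        i = s.index("0")
        positions.append(i)
        s = s[:i] + s[i + 1:]  # drag the tail one step to the left
    return len(s), positions   # Hamming weight, split indices

w, pos = strip_zeros("1100111")
print(w, pos)   # 5 [2, 2]
```

Note that the split indices still have to be stored, which is where the real cost hides.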
20th April 2015, 23:40
In general it takes more than one bit to encode where to insert the zero bits, so there is no compression.
21st April 2015, 21:35
Matt, you were right about the above.
I worked more on the idea and may have found another possible solution.
It reads the bits one by one and stores the deltas between the lengths of consecutive runs of the same bit state.
Maybe it's just RLE, or some form of it?
After just a few tests I found that some data was compressed a tiny bit while some wasn't, so it depends on how the bits are arranged.
I want to test it with a transform function.
In the example here, 17 bits were compressed down to 16 bits, and no information is lost. Or am I missing something?
Maybe it was a bad example, since the bit counts are not powers of two or anything, and we don't know the length of the original bitstream.
Each "cut & drag" is stored as a two-bit number. Perhaps the field size could be dynamic when runs of the same bit state get longer.
I'm still working on some code to test things.
[Edit: I set 00 to mean 1 and so on, because zero never happens; when a bit changes state it will always drag at least once]
Last edited by Rudi; 21st April 2015 at 21:39.
22nd April 2015, 20:47
It won't work in general. Suppose your input is random bits. Then the probability that a run of identical bits will end on the next bit is always 1/2. The distribution of run lengths is:
1 = 1/2
2 = 1/4
3 = 1/8
4 = 1/16
n = 1/2^n
and the best you can do is encode n using -log2(p) = -log2(1/2^n) = n bits. So there is no savings at all.
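This can be checked numerically (a quick sketch of my own, illustrating the argument above, not code from the thread): on random bits, the ideal code for a run of length n costs -log2(2^-n) = n bits, so the total coded size equals the input size exactly.

```python
import random

random.seed(0)
bits = [random.getrandbits(1) for _ in range(100_000)]

# Collect the run lengths of identical bits.
runs, n = [], 1
for prev, b in zip(bits, bits[1:]):
    if b == prev:
        n += 1
    else:
        runs.append(n)
        n = 1
runs.append(n)

# Ideal cost of a run of length n is n bits, so total = sum of run lengths,
# which by construction equals the number of input bits: zero savings.
print(sum(runs) == len(bits))   # True
```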
More generally, random data is not compressible by any algorithm. http://mattmahoney.net/dc/dce.html#Section_11
Sorry for the long delay. I've been busy with other things.
That's really interesting. I have to read your book. Probability math is not my strongest side. I have a programming background: assembler, C/C++ etc. You can see my 128-byte demonstrations; I found a clip of one I made here: http://www.youtube.com/watch?v=2Xnrf92qvw8 (it works only in an emulator, so I uploaded it as a video instead). The assembler source can be found on my homepage: https://sites.google.com/site/rudibstranden/home/
When I make these I see the instructions (opcode and operands) mostly as bricks in a big puzzle game, making the pieces fit together. That's somehow what I'm also trying to do when looking at compression from time to time. But compression is a different game: there's a little more to read, and there's math stuff I don't understand, so I try to make my own ways to figure things out. I'm happy to get feedback and advice. I don't think compression is easy to visualize. I want to find out if there is some simple technique that's really logical and the best compressor, and which is also recursive, in that the compressor, decoder and/or decoded data are part of the recursive process.
When a run of 2 has probability 1/4, what I am interested in is the pattern it creates, and I guess that would be part of what others call the transform function(?), and whether patterns are also part of a big pseudo-random automaton (in the universe or not). I find it hard to accept that random data is not compressible, when one gets random data by compressing data. I read and saw some say that the laws of physics make it easy to go backwards and reverse (cit. Ed Fredkin).
I bookmarked your book and will read it when I get the time.
Last edited by Rudi; 8th May 2015 at 04:46.