1. ## lossless data compression

Hello.
I'm new here, so my apologies if I get something wrong.
I have a modification of lossless data compression. It has already been published by the patent office of my country, so I can show how it works. I don't know whether it is worth something, and I'm looking for advice about what to do, or for someone to cooperate with if it is worth anything. I can't write a fully working program, because it takes too many calculations to decide which table or which table combination to use, but I have checked the method on many files of different types, compressed and uncompressed, and it works.

EXAMPLE:
Imagine we have the following sequence of bit signals
100110010001011001100101100110010001001011010011000100101100 (initial data)
COMPRESSION:
1) Stage
Represent it as symbols by using the following table:
A _ 000
B _ 001
C _ 010 (Previously made up initial table)
D _ 011
E _ 100
F _ 101
G _ 110
H _ 111
We get the following text:
E G C B D B E F E G C B B D C D A E F E (the set of symbols of the first stage)
Perform text coding using the Huffman Coding method.
We shall obtain the following table:
A _ 1010
B _ 00
C _ 111 (the data restoring of the set of symbols of the first stage)
D _ 110
E _ 01
F _ 100
G _ 1011
We write the text obtained from the original bit sequence back as a sequence of bits, using the table obtained from the Huffman coding.
We get the following set of bit signals:
011011111001100001100011011111000011011111010100110001 (first stage result)
2) Stage
Treat the compressed set of bit signals from the first stage as new initial data.
Represent it (as in the first stage) in the form of text symbols using the previously made initial table.
We get the following set of symbols:
D D H B E B E D D H A D D H C E G B (the set of symbols of the second stage)
We perform the text coding using the method of Huffman Coding again.
We shall obtain the following table:
A _ 0111
B _ 00
C _ 010
D _ 10 (the data restoring of the set of symbols of the second stage)
E _ 111
G _ 0110
H _ 110
We write the set of symbols of the second stage as a bit sequence using the restoring table of the second stage obtained from the Huffman coding.
We get the following set of bit signals:
10101100011100111101011001111010110010111011000 (second stage result).

Only 2 stages are shown here; this method can have many stages, and for better results one can use many different pre-made initial tables, different tables on different stages, different combinations of these tables, and different lossless data compression algorithms.
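For concreteness, a single stage as described above can be sketched in a few lines of Python (this is an illustration, not the author's patented implementation; `stage` and `huffman_code` are names invented here):

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(freqs):
    """Build a Huffman prefix code ({symbol: bit string}) from frequencies."""
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def stage(bits, k=3):
    """One stage: split the bit string into k-bit symbols, Huffman-code them.

    Returns the coded bit string and the code table (the "restoring data",
    which a real decompressor would also have to store).
    """
    symbols = [bits[i:i + k] for i in range(0, len(bits), k)]
    codes = huffman_code(Counter(symbols))
    return "".join(codes[s] for s in symbols), codes
```

On the example's 60-bit input this produces a 54-bit payload, matching the first-stage result above (any optimal Huffman code gives the same total length, though the individual codewords may differ from the post's table).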

2. Well... Yet another patented recursive algorithm... Why don't you make a "real" executable so we can discuss further? I think a "max-stage" option would be sufficient.

3. The method has no max stage. At every step the program must calculate which table to use and check whether it will give enough compression or not.

4. This just won't work. Every time you're doing another huffman stage, your bitstring gets shorter, but you'll have to store additional information (the huffman tree). For random data, this will destroy your savings. Also, after doing it right, it's just a plain huffman coder on 3 bit symbols, neither something new nor something that will reach incredible compression ratios.

To make it clearer: Your input data is 60 bits. You split it to 20*3 bits and "compress" it to 54 bits in the first stage. But now look at those 54 bits and tell me how to restore the 20*3 bits with it. You'll need the huffman codes for this, and a representation of the huffman codes will be larger than 6 bits, so you'll end up with more than 60 bits.

Typical error for someone trying to develop a compression algorithm. You just looked at compression and didn't make sure it can be decompressed correctly, thus you overlooked additional information you'll have to encode. That's why the first thing replied to something like this is "do you have working code/working decompressor"?
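The overhead argument can be checked with simple arithmetic. As one plausible, near-minimal way to ship the code (an assumption, not something specified in the thread), suppose the decoder receives a canonical-Huffman length list: one code length per possible 3-bit symbol, 3 bits each:

```python
# Stage-1 numbers from the worked example: 60 input bits -> 54 coded bits.
input_bits = 60
payload = 54

# The decoder also needs the Huffman code itself. A compact, common way
# to ship it is a canonical-Huffman length list: 8 possible 3-bit symbols,
# each code length (1..7) stored in 3 bits.
table = 8 * 3

total = payload + table
print(total)              # 78 bits: the "compressed" stage exceeds the input
assert total > input_bits
```

So even with a cheap table encoding, the first stage already costs 78 bits to reproduce 60, which is the point made above.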

5. Of course this method will never work on 60 bits; this is just an example of how it works. I could not write 60000 or more zeros and ones there. In every step of this compression, the initial data of the step must be longer than the compressed data plus the restoring tables; otherwise it is useless.

Only 1 table with 3-bit symbols is shown there, but the compression program must include many different tables with 4-bit symbols, 5-bit and more. The more tables there are, the more chances that the compression will be useful. The program will be more effective if there is an option to use different table combinations, like this:
EXAMPLE:
A _ 000
B _ 001
C _ 010
D _ 011
E _ 100
F _ 101
G _ 110
H _ 1110
GH _ 1111

If the data contains many "GH" combinations together and few isolated "H"s, this can give an opportunity to get a better compression ratio.
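The extended table above is a prefix code over the raw input bits: the 3-bit entries never start with `111`, so `1110`/`1111` parse unambiguously. A small illustrative tokenizer (the function name is invented here):

```python
# Extended table from the post: seven 3-bit entries plus
# "1110" -> H and "1111" -> GH (the "GH" digram gets its own codeword).
TABLE = {
    "000": "A", "001": "B", "010": "C", "011": "D",
    "100": "E", "101": "F", "110": "G",
    "1110": "H", "1111": "GH",
}

def tokenize(bits):
    """Parse a bit string with the prefix table above."""
    out, i = [], 0
    while i < len(bits):
        if bits[i:i + 3] in TABLE:       # 3-bit entries never start with "111"
            out.append(TABLE[bits[i:i + 3]])
            i += 3
        else:                            # must be "1110" or "1111"
            out.append(TABLE[bits[i:i + 4]])
            i += 4
    return out
```

For example, `tokenize("1111110000")` returns `['GH', 'G', 'A']`.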

Also, every next step must compress the restoring data of the previous step along with the data itself, so there will not be too many restoring tables taking up more and more space with every step.

I don't have a program which will compress a file, but I have checked many files, compressed with many different programs as well as uncompressed, for how much they can be compressed in 1 step, and in most cases I get 2-3% compression including the restoring data. I always used 4-bit, 8-bit or 16-bit tables; I could not try other options.

After compression this method gives independent data and the opportunity to work on it like uncompressed data.

6. the best place to discuss new compression ideas is comp.compression. here people cannot do anything better than squeezing a few more percent out of 30-year-old lz and ppm

7. ## Working code is King!

Originally Posted by Bulat Ziganshin
the best place to discuss new compression ideas is comp.compression. here people cannot do anything better than squeezing a few more percent out of 30-year-old lz and ppm
It may seem that way to you; it even seems that way at times to me. But the code they are working on

***WORKS***

And in comp.compression the same question will be asked, have you tried writing code?
Do you have a working compressor and a working decompressor? One without the other is worthless.
Are you going to start demanding that others write the code for you? No one will; they have their own coding to do.

8. I can't write a program which uses this method to its maximum, because it needs too many calculations.

If I knew that compressing 1, 2 or 3 particular files would show whether the method is worth something or not, I could make a program for those particular files, or for some data, and see how much it can be compressed with my method; but I don't know what to compress.

9. Your method really can provide incremental compression with multiple passes,
if the input file is big and redundant enough.
But there's no sense in implementing it, because other compression methods
(even the same Huffman coding, but applied to bytes) would do better even
without multiple passes.
Well, I guess it can still be a good programming exercise, but it's not really
a practical compression algorithm.

Actual compression algorithms work by removing known dependencies in the input data.
For example, it's hard to encounter a file which contains a sequence of 3-bit codes,
but most files are generated as byte sequences.

Knowing the file format is another way to remove redundancy.
For example, a .bmp header contains fields like the size of the bitmap, the size of
the header and the size of the whole file, which can be computed from the image
width/height/colorspace that are stored separately. So a bmp compressor can replace
these fields with a single bit 1 if they match the values predicted from the other
fields, or with bit 0 + the full values (to keep it lossless when the bmp structure
is broken for some reason).
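This predicted-field trick can be sketched like this (the 32-bit field width and the file-size example are illustrative assumptions, not the actual BMP layout):

```python
def encode_field(actual, predicted):
    """Emit '1' if the field equals its prediction, else '0' + the raw value."""
    if actual == predicted:
        return "1"
    return "0" + format(actual, "032b")   # fall back to the full 32-bit value

def decode_field(bits, predicted):
    """Inverse of encode_field; returns (value, remaining bits)."""
    if bits[0] == "1":
        return predicted, bits[1:]
    return int(bits[1:33], 2), bits[33:]

# E.g. a file-size field predicted from width/height/bit depth: when the
# header is consistent, the whole 4-byte field shrinks to a single bit,
# and a malformed header still round-trips losslessly via the '0' branch.
```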

It's just an example, but real compressors work by removing more complex dependencies
of that kind. There are no "universal" compression methods which can compress all files
without knowing anything about the specific dependencies in the input files.

10. I know that many other algorithms will compress different types of files better than this, but this method works on binary data and gives the opportunity to compress a file already compressed with any other method.

I can't prove that this method is universal and can compress every file, but it has very many options: manipulating the previously created tables and table combinations, and using different lossless algorithms on different compression stages.

This method can give, if not "very good" results, at least "some" results in compression, but I don't know how much it is worth, or whether it is worth working on at all.

11. > this method works on binary data and gives the opportunity to compress
> a file already compressed with any other method.

All compression algorithms work with binary data, because all the data
on modern computers is binary.

Also, you can't compress a file which is already compressed with a good algorithm.
That is, of course, presuming that you don't forget to include the description of the
Huffman tree in the size of the compressed file.

Sure, you can compress a file which is already compressed by _your_ algorithm,
but it doesn't mean anything, because with multiple passes the result would be
worse than with usual single-pass algorithms.

> I can't prove that this method is universal and can compress every
> file, but it has very many options: manipulating the previously
> created tables and table combinations, and using different
> lossless algorithms on different compression stages.

Yes, but there's no innovation in that.

> This method can give, if not "very good" results, at least "some"
> results in compression, but I don't know how much it is worth, or
> whether it is worth working on at all.

No, there's nothing new in your method; it's a plain application of
Huffman coding. Sure, it can be useful in some circumstances, but
most engineers would be able to apply it without your help then.

If you're interested in universal compression, I'd suggest looking into
"Kolmogorov compression", based on http://en.wikipedia.org/wiki/Kolmogorov_complexity

12. As far as I know, there is no algorithm which represents binary data as symbols and compresses it using text compression algorithms.

This is not a new algorithm; this is just a modification which gives many opportunities in representing data and in working on already compressed data as if it were initial data.

13. As far as I know, there is no algorithm which represents binary data as symbols and compresses it using text compression algorithms.
Because that's pointless. You can convert binary to ASCII using base64 encoding and then compress, but what good is that?

This is not a new algorithm; this is just a modification which gives many opportunities in representing data and in working on already compressed data as if it were initial data.
People have tried MANY variations. Your algorithm is basically a base8 encoding - i.e. base8 encoding doesn't exist as a common format, but base16, for example, does. You can convert binary to base16 and then apply some Huffman coder, for example: http://huffman.sourceforge.net/ The resulting compression will almost always be worse than without your preprocessing step.

14. Originally Posted by SLS
As far as I know, there is no algorithm which represents binary data as symbols and compresses it using text compression algorithms.
In computers, binary data is already represented as symbols (bytes). And a good example of text compression would be PPM, which works on bytes. I think you're looking at the wrong side of the problem. I highly advise you to write a small compression program to make this clear. Even a simple RLE compressor could tell you something more effectively than we can. Otherwise, it's pointless to discuss further, because almost everyone here already knows that this kind of thing simply won't work.

15. Indeed the discussion is pointless. SLS, you won't convince this bunch of old farts by talking. Go on and write the code to show them their place.

16. As I said before, I can't make a normally working program for arbitrary files, but if I knew that compressing some particular files would mean something, I could make a program; I just don't know what to compress, or on what data to check this method.

17. It has already been published by the patent office of my country, so I can show how it works.
Did you acquire a patent before discussing this here? Or does someone else have the patent?

(Did you acquire a patent before any kind of test on any kind of test data to see if the fundamental idea succeeds?)

(In what country is this patent registered? What is the patent number? These are important considerations for those of us examining your claims.)

18. he probably expects that we will implement his patented method. wise man!

19. ## So what?

Originally Posted by SLS
As I said before, I can't make a normally working program for arbitrary files, but if I knew that compressing some particular files would mean something, I could make a program; I just don't know what to compress, or on what data to check this method.
If you are not willing to do the work, why should others cover for your lack of will?

Choose data that is of interest for you to compress; if you don't have anything, what good is this program to you anyway? If it is just something to pass the time, then choose some files at random and test your code on them.

And don't talk about how slow your computer is: back in the 1980s I was programming on a 7 MHz Amiga, and I had test runs of my compression code going for an entire week to test how well they would work. Why can't you just leave your computer on for a while if the code runs slowly on your machine?

20. http://mattmahoney.net/dc/dce.html#Section_11

Also, the USPTO has granted patents on lots of random and recursive compression algorithms. Just because it's patented doesn't mean it works.

21. Really, it's amazing this kind of thing finds this forum. Which of us is registering as a joke, putting a message through Babelfish a few times and finally into English so as to disguise it, and posting to enjoy trolling us?

Or does comp.compression do nothing but tell those who turn up there with these ideas to come here and ask us to implement them?

22. No one here pushed anyone to implement something, or to make something that you don't want to. I am just looking for advice about the method, or about what tests would mean something for this method. And the problem is not computer speed: the program needs many calculations which I can't work out, and without proper calculations this method will not work on even 0.001% of files.

Thanks to everyone who tried or will try to help; everyone else doesn't even have to read this. Good luck.
