
Thread: lossless data compression

  1. #1: Member SLS (Georgia, joined Mar 2011)

    lossless data compression

    Hello.
    I'm new here, so I'm sorry if I get something wrong.
    I have a modification of a lossless data compression method. It has already been published by the patent organization of my country, so I can show how it works. I don't know whether it is worth anything, and now I'm looking for advice about what to do, or for someone to cooperate with if it is worth something. I can't write a fully working program, because it takes too many calculations to decide which table or which table combination to use, but I have checked the method on many files of different types, both compressed and uncompressed, and it works.

    EXAMPLE:
    Imagine we have the following sequence of bit signals
    100110010001011001100101100110010001001011010011000100101100 (initial data)
    COMPRESSION:
    Stage 1
    Represent it as symbols using the following table:
    A _ 000
    B _ 001
    C _ 010 (the initial table, made up in advance)
    D _ 011
    E _ 100
    F _ 101
    G _ 110
    H _ 111
    We get the following text:
    E G C B D B E F E G C B B D C D A E F E (the set of symbols of the first stage)
    Perform text coding using the Huffman Coding method.
    We shall obtain the following table:
    A _ 1010
    B _ 00
    C _ 111 (the restoring table for the first-stage symbols)
    D _ 110
    E _ 01
    F _ 100
    G _ 1011
    We write the symbol text back out as a bit sequence, using the table obtained from Huffman coding.
    We get the following set of bit signals:
    011011111001100001100011011111000011011111010100110001 (first stage result)
    Stage 2
    Treat the compressed set of bit signals as new initial data.
    Represent it (as in the first stage) as text symbols using the same initial table.
    We get the following set of symbols:
    D D H B E B E D D H A D D H C E G B (the set of symbols of the second stage)
    We perform text coding using Huffman coding again.
    We shall obtain the following table:
    A _ 0111
    B _ 00
    C _ 010
    D _ 10 (the restoring table for the second-stage symbols)
    E _ 111
    G _ 0110
    H _ 110
    Write the second-stage symbols as a bit sequence using the second-stage restoring table obtained from Huffman coding.
    We get the following set of bit signals:
    10101100011100111101011001111010110010111011000 (second stage result).

    Only two stages are shown here; the method can have many stages, and for better results one can use many different pre-made initial tables, different tables at different stages, different combinations of these tables, and different lossless compression algorithms. (A sketch of one stage in code follows.)
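    For anyone who wants to experiment, here is a minimal sketch of one stage of the method in Python. The 3-bit initial table and the symbol names follow the post; the Huffman coder itself is a plain textbook implementation, not the author's code, and the table selection the author describes is not attempted.

    ```python
    import heapq
    import itertools
    from collections import Counter

    # Initial table from the post: 3-bit groups map to symbols A..H.
    INITIAL = {format(i, "03b"): chr(ord("A") + i) for i in range(8)}

    def huffman_code(freqs):
        """Plain textbook Huffman: returns {symbol: bit string}."""
        tie = itertools.count()  # tie-breaker so equal counts never compare symbols
        heap = [(count, next(tie), sym) for sym, count in freqs.items()]
        heapq.heapify(heap)
        if len(heap) == 1:  # degenerate input with a single distinct symbol
            return {heap[0][2]: "0"}
        code = {sym: "" for sym in freqs}
        while len(heap) > 1:
            c1, _, syms1 = heapq.heappop(heap)
            c2, _, syms2 = heapq.heappop(heap)
            for s in syms1:  # every symbol under the left node gets a 0 prefix
                code[s] = "0" + code[s]
            for s in syms2:  # every symbol under the right node gets a 1 prefix
                code[s] = "1" + code[s]
            heapq.heappush(heap, (c1 + c2, next(tie), syms1 + syms2))
        return code

    def stage(bits):
        """One stage of the posted method: 3-bit groups -> symbols -> Huffman bits."""
        symbols = [INITIAL[bits[i:i + 3]] for i in range(0, len(bits) // 3 * 3, 3)]
        code = huffman_code(Counter(symbols))
        return "".join(code[s] for s in symbols)

    data = "100110010001011001100101100110010001001011010011000100101100"
    print(len(data), "->", len(stage(data)), "bits (Huffman table not counted)")
    ```

    On the 60-bit example above this prints 60 -> 54, matching the post. Note that the Huffman table needed for decoding is not counted, which is exactly the objection raised below.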

  2. #2: Programmer osmanturan (Mersin, Turkiye)
    Well... yet another patented recursive algorithm... Why don't you make a "real" executable so we can discuss further? I think a "max-stage" option would be sufficient.

  3. #3: Member SLS (Georgia)
    The method has no max stage. At every step the program must calculate which table to use and check whether it will give enough compression or not.

  4. #4: Programmer schnaader (Hessen, Germany)
    This just won't work. Every time you do another Huffman stage, your bit string gets shorter, but you have to store additional information (the Huffman tree). For random data, this destroys your savings. Also, done right, it's just a plain Huffman coder on 3-bit symbols: neither something new nor something that will reach incredible compression ratios.

    To make it clearer: your input data is 60 bits. You split it into 20 groups of 3 bits and "compress" it to 54 bits in the first stage. But now look at those 54 bits and tell me how to restore the 20 groups of 3 bits from them. You'll need the Huffman codes for this, and a representation of the Huffman codes will be larger than 6 bits, so you'll end up with more than 60 bits.

    This is the typical error of someone trying to develop a compression algorithm: you only looked at compression and didn't make sure the data can be decompressed correctly, so you overlooked additional information you'd have to encode. That's why the first reply to something like this is always: do you have working code, including a working decompressor? (A quick numeric check follows.)
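    The overhead argument is easy to check numerically. A sketch, assuming a generously small table cost of 3 bits of code length per symbol (real table descriptions cost more):

    ```python
    import math
    import random
    from collections import Counter

    # Random bits stand in for already-compressed (incompressible) data.
    random.seed(0)
    bits = "".join(random.choice("01") for _ in range(60000))

    # Count the 3-bit symbols, as in the proposed first stage.
    groups = Counter(bits[i:i + 3] for i in range(0, len(bits), 3))
    n = sum(groups.values())

    # Any Huffman code's total output is at least the empirical entropy.
    entropy = -sum(c * math.log2(c / n) for c in groups.values())

    table_bits = 8 * 3  # assumed bare-minimum cost of describing the code

    print(f"{len(bits)} input bits")
    print(f"Huffman output >= {entropy:.1f} bits, plus ~{table_bits} table bits")
    ```

    On random input the empirical entropy is within a few bits of the input size, so even this understated table cost already pushes the total past the original length.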

  5. #5: Member SLS (Georgia)
    Of course this method will never work on 60 bits; that was just an example of how it works, since I could not write 60,000 or more zeros and ones there. At every step of the compression, the step's initial data must be longer than the compressed data plus the restoring tables; otherwise it is useless.

    Only one table with 3-bit symbols is shown, but a compression program must include many different tables, with 4-bit symbols, 5-bit symbols and more; the more tables there are, the more chances that compression will be useful. The program will be more effective if there is an option to use different table combinations, like this:
    EXAMPLE:
    A _ 000
    B _ 001
    C _ 010
    D _ 011
    E _ 100
    F _ 101
    G _ 110
    H _ 1110
    GH _ 1111

    If the data contains many "GH" pairs together and few lone "H" symbols, this can give an opportunity for a better compression ratio (a sketch of this appears at the end of this post).

    Also, each next step compresses the restoring data of the previous step along with the output, so the restoring tables will not pile up and take more and more space with every step.

    I don't have a program that compresses files, but I have checked many files, both compressed with many different programs and uncompressed, for how much they can be compressed in one step, and most of the time I got 2-3% compression including the restoring data. I always used 4-bit, 8-bit or 16-bit tables; I could not try other options.

    After compression, this method yields self-contained data that can be worked on as if it were uncompressed.
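    As an illustration of the digram idea, here is a sketch of encoding with such a combined table. The table is the one from the post; the greedy left-to-right parse is an assumption about how it would be applied:

    ```python
    # Table from the post: 3-bit codes for A..G, longer codes for H and the
    # digram "GH". It is still a valid prefix code (110 vs 1110 vs 1111).
    TABLE = {"A": "000", "B": "001", "C": "010", "D": "011",
             "E": "100", "F": "101", "G": "110", "H": "1110", "GH": "1111"}

    def encode(symbols: str) -> str:
        """Greedy parse that prefers the digram 'GH' whenever it occurs."""
        out, i = [], 0
        while i < len(symbols):
            if symbols[i:i + 2] == "GH":
                out.append(TABLE["GH"])
                i += 2
            else:
                out.append(TABLE[symbols[i]])
                i += 1
        return "".join(out)

    # Three "GH" pairs cost 3*4 = 12 bits here, versus 3*(3+4) = 21 bits
    # encoding G and H separately with this table.
    print(encode("GHGHGHAB"), len(encode("GHGHGHAB")), "bits")
    ```

    With the plain 3-bit table the same 8 symbols would take 24 bits; here they take 18, provided "GH" really is frequent and a lone "H" rare.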

  6. #6: Programmer Bulat Ziganshin (Uzbekistan)
    the best place to discuss new compression ideas is comp.compression. here people cannot do anything better than squeezing a few more percent out of 30-year-old LZ and PPM

  7. #7: Member Earl Colby Pottinger (Canada)

    Working code is King!

    Quote Originally Posted by Bulat Ziganshin:
    the best place to discuss new compression ideas is comp.compression. here people cannot do anything better than squeezing a few more percent out of 30-year-old LZ and PPM
    It may seem that way to you; it even seems that way to me at times. But the code they are working on

    ***WORKS***

    And on comp.compression the same question will be asked: have you tried writing code?
    Do you have a working compressor and a working decompressor? One without the other is worthless.
    Are you going to start demanding that others write the code for you? No one will; they have their own coding to do.

    Writing the code yourself will quickly point out your idea's weak points and help you develop your method.

  8. #8: Member SLS (Georgia)
    I can't write a program that uses this method to its full potential, because it needs too many calculations.

    If I knew that compressing one, two or three specific files would show whether the method is worth something, I could make a program for those specific files, or for some specific data, and see how much it can be compressed with my method. But I don't know what to compress.

  9. #9: Administrator Shelwien (Kharkov, Ukraine)
    Your method really can provide incremental compression with multiple passes, if the input file is big and redundant enough. But there's no sense in implementing it, because other compression methods (even the same Huffman coding, but applied to bytes) would do better even without multiple passes. I guess it can still be a good programming exercise, but it's not really a practical compression algorithm.

    Actual compression algorithms work by removing known dependencies in the input data. For example, it's hard to encounter a file that contains a sequence of 3-bit codes, but most files are generated as byte sequences.

    If the specific file format is known, it gives us even more information about possible redundancy. For example, a .bmp header contains fields like the size of the bitmap, the size of the header, and the size of the whole file, all of which can be computed from the image width/height/colorspace that are stored separately. So a bmp compressor can replace these fields with a single 1 bit if they match the values predicted from other fields, or a 0 bit plus the full values (to keep it lossless when the bmp structure is broken for some reason).

    It's just an example, but real compressors work by removing more complex dependencies of that kind. There are no "universal" compression methods that can compress all files without knowing anything about the specific dependencies in the input files. (A sketch of the flag-bit idea follows.)
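    A minimal sketch of that flag-bit idea (the predicted-size formula is illustrative; a real BMP has row padding and more header fields):

    ```python
    def encode_field(actual: int, predicted: int, width: int = 32) -> str:
        """'1' if the field matches its prediction, else '0' plus the raw bits."""
        if actual == predicted:
            return "1"
        return "0" + format(actual, f"0{width}b")

    def decode_field(bits: str, predicted: int, width: int = 32):
        """Inverse of encode_field; returns (value, number of bits consumed)."""
        if bits[0] == "1":
            return predicted, 1
        return int(bits[1:1 + width], 2), 1 + width

    # File-size field predicted from image dimensions (illustrative numbers).
    predicted = 54 + 800 * 600 * 3               # header + 24-bit pixels
    print(encode_field(predicted, predicted))    # '1': 1 bit instead of 32
    print(len(encode_field(123456, predicted)))  # 33 bits when prediction fails
    ```

    When the prediction holds, a 32-bit field shrinks to a single bit, and the scheme stays lossless because mismatching values are stored in full.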

  10. #10: Member SLS (Georgia)
    I know that many other algorithms will compress different types of files better than this, but this method works on binary data and gives an opportunity to compress a file already compressed with any other method.

    I can't prove that this method is universal and can compress every file, but it has very many options: manipulating the pre-made tables and table combinations, and using different lossless algorithms at different compression stages.

    This method can give, if not "very good" results, at least "some" results in compression, but I don't know how much it is worth, or whether it is worth working on at all.

  11. #11: Administrator Shelwien (Kharkov, Ukraine)
    > this method works on binary data and gives an opportunity to
    > compress a file already compressed with any other method.

    All compression algorithms work with binary data, because all data on modern computers is binary.

    Also, you can't compress a file which is already compressed with a good algorithm. That is, presuming you don't forget to include the description of the Huffman tree in the size of the compressed file.

    Sure, you can compress a file which is already compressed by _your_ algorithm, but that doesn't mean anything, because with multiple passes the result would be worse than with usual single-pass algorithms.

    > I can't prove that this method is universal and can compress every
    > file, but it has very many options: manipulating the pre-made tables
    > and table combinations, and using different lossless algorithms at
    > different compression stages.

    Yes, but there's no innovation in that.

    > This method can give, if not "very good" results, at least "some"
    > results in compression, but I don't know how much it is worth, or
    > whether it is worth working on at all.

    No, there's nothing new in your method; it's a plain application of Huffman coding. Sure, it can be useful in some circumstances, but most engineers would be able to apply it without your help.

    If you're interested in universal compression, I'd suggest looking at "Kolmogorov compression", based on http://en.wikipedia.org/wiki/Kolmogorov_complexity

  12. #12: Member SLS (Georgia)
    As far as I know, there is no algorithm that represents binary data as symbols and compresses it using text compression algorithms.

    This is not a new algorithm; it is just a modification that gives many options for representing the data and lets you work on already compressed data as if it were initial data.

  13. #13: Member (Kraków, Poland)
    Quote Originally Posted by SLS:
    As far as I know, there is no algorithm that represents binary data as symbols and compresses it using text compression algorithms.
    Because that's pointless. You can convert binary to ASCII using base64 encoding and then compress it, but what good is that?

    Quote Originally Posted by SLS:
    This is not a new algorithm; it is just a modification that gives many options for representing the data and lets you work on already compressed data as if it were initial data.
    People have tried MANY variations. Your algorithm is basically a base8 encoding; base8 is not a standard encoding, but base16, for example, is. You can convert binary to base16 and then apply some Huffman coder, for example: http://huffman.sourceforge.net/ The resulting compression will almost always be worse than without your preprocessing step (see the quick experiment below).
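    That claim is easy to test with the standard library. A sketch (the sample data is made up; any file's bytes would do):

    ```python
    import base64
    import zlib

    # Sample data with some realistic redundancy.
    data = b"the quick brown fox jumps over the lazy dog. " * 200

    direct = zlib.compress(data, 9)                   # compress the raw bytes
    hexed = zlib.compress(base64.b16encode(data), 9)  # base16 first, then compress

    print(len(data), "raw bytes")
    print(len(direct), "compressed directly")
    print(len(hexed), "compressed after base16 expansion")
    ```

    The base16 step doubles the input and dilutes its structure, so compressing directly should come out ahead.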

  14. #14: Programmer osmanturan (Mersin, Turkiye)
    Quote Originally Posted by SLS:
    As far as I know, there is no algorithm that represents binary data as symbols and compresses it using text compression algorithms.
    In computers, binary data is already represented as symbols (bytes), and a good example of text compression is PPM, which works on bytes. I think you're looking at the wrong side of the problem. I highly advise you to write a small compression program to make this clear; even a simple RLE compressor could tell you something more effectively than we can. Otherwise it's pointless to discuss further, because almost everyone here already knows that this kind of thing simply won't work.
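    The RLE suggestion is a few minutes' work. A minimal sketch, including the decompressor the thread keeps asking for:

    ```python
    def rle_encode(data: bytes) -> bytes:
        """Byte-wise run-length encoding: (count, value) pairs, runs capped at 255."""
        out = bytearray()
        i = 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i] and j - i < 255:
                j += 1
            out += bytes([j - i, data[i]])
            i = j
        return bytes(out)

    def rle_decode(data: bytes) -> bytes:
        """Inverse of rle_encode."""
        out = bytearray()
        for k in range(0, len(data), 2):
            out += bytes([data[k + 1]]) * data[k]
        return bytes(out)

    sample = b"aaaaabbbccccccccd"
    packed = rle_encode(sample)
    assert rle_decode(packed) == sample  # round trip: both directions work
    print(len(sample), "->", len(packed), "bytes")
    ```

    Even this toy shows both halves of the job: a coder is only meaningful together with a decoder that reproduces the input exactly.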

  15. #15: Member m^2 (Ślůnsk, PL)
    Indeed, the discussion is pointless. SLS, you won't convince this bunch of old farts by talking; go on and write the code to show them their place.

  16. #16: Member SLS (Georgia)
    As I said before, I can't make a normal working program for arbitrary files. But if I knew that compressing certain files would mean something, I could make a program; I just don't know what to compress, or on what data to check this method.

  17. #17: Member (Nordic)
    Quote Originally Posted by SLS:
    It has already been published by the patent organization of my country, so I can show how it works.
    Did you acquire a patent before discussing this here? Or does someone else have the patent?

    (Did you acquire a patent before any kind of test, on any kind of test data, to see whether the fundamental idea succeeds?)

    (In what country is this patent registered? What is the patent number? These are important considerations for those of us examining your claims.)

  18. #18: Programmer Bulat Ziganshin (Uzbekistan)
    he probably expects that we will implement his patented method. wise man!

  19. #19: Member Earl Colby Pottinger (Canada)

    So what?

    Quote Originally Posted by SLS:
    As I said before, I can't make a normal working program for arbitrary files. But if I knew that compressing certain files would mean something, I could make a program; I just don't know what to compress, or on what data to check this method.
    If you are not willing to do the work, why should others cover for your lack of will?

    Choose data that it would interest you to compress; if you don't have anything, what good is this program to you anyway? If it is just something to pass the time, then choose some files at random and test your code with them.

    And don't talk about how slow your computer is: back in the 1980s I was programming on a 7 MHz Amiga, and I had test runs of my compression code going for an entire week to see how well it would work. Why can't you just leave your computer on for a while if the code runs slowly on your machine?

    Stop asking others to do your work for you. Asking for pointers to data for research is fine; asking others to make your decisions for you is not.

  20. #20: Expert Matt Mahoney (Melbourne, Florida, USA)
    http://mattmahoney.net/dc/dce.html#Section_11

    Also, the USPTO has granted patents on lots of random and recursive compression algorithms. Just because it's patented doesn't mean it works.
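    The counting argument behind that link fits in a few lines (a sketch):

    ```python
    # There are 2**n distinct n-bit inputs but only 2**n - 1 bit strings
    # shorter than n bits, so no lossless coder can shrink every input.
    n = 12
    inputs = 2 ** n
    shorter_outputs = sum(2 ** k for k in range(n))  # lengths 0 .. n-1
    print(inputs, "inputs vs", shorter_outputs, "possible shorter outputs")
    assert shorter_outputs < inputs
    ```

    This is why recursive or universal compression claims fail regardless of the tables used: some input must map to an output at least as long as itself.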

  21. #21: Member (Nordic)
    Really, it's amazing that this kind of thing finds this forum. Which of us is registering as a joke, putting a message through Babelfish a few times and finally into English to disguise it, and posting to enjoy trolling us?

    Or does comp.compression just tell everyone who turns up there with these ideas to come here and ask us to implement them?


  22. #22: Member SLS (Georgia)
    No one here was pushed to implement anything or to do anything you don't want to do; I am just looking for advice about the method, and about which tests would say something about it. And there is no problem with computer speed: the method simply needs many calculations that must be built into the program, because without proper calculations, which I can't do, the method will not work on even 0.001% of files.

    Thanks to everyone who tried or will try to help; everyone else doesn't even have to read this, and good luck.
