1. ## Integer compression

Hello users,

I am currently looking for an integer compression algorithm.
My sample number is attached.

Is it possible to represent this huge number with a much smaller amount of data? Maybe with integer factorization or similar math operations? Or with an algorithm?

Thanks a lot for your willingness to help.

Best regards,
CompressMaster

2. The question is (as always): where did you get this number from? How did you calculate/generate it?

3. This number has been generated randomly by myself.

4. Originally Posted by CompressMaster
This number has been generated randomly by myself.
seed(X)

for (int i=0; i<N; i++) rndInt();

save X and N, it's max compression
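
(In other words: if the digits really did come from a seeded PRNG, storing the seed and count is enough to regenerate them exactly. A minimal runnable sketch, assuming a std::mt19937 generator; the seed and count values are placeholders:)

```cpp
// Sketch: if the digits really came from a seeded PRNG, saving only the
// seed X and the count N is "maximum compression": the decompressor just
// regenerates the identical sequence. Seed and count are placeholders.
#include <cstdio>
#include <random>

int main() {
    const unsigned X = 12345;  // placeholder seed
    const int N = 725;         // placeholder count of digits

    std::mt19937 rng(X);                             // seed(X)
    std::uniform_int_distribution<int> digit(0, 9);
    for (int i = 0; i < N; i++)                      // rndInt(), N times
        std::printf("%d", digit(rng));
    std::printf("\n");
}
```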

5. SpyFX,
thanks for your reply, but I'm afraid I don't understand you properly...

"X" stands for an integer?
Could you give me detailed description of an algorithm you´ve described?
Thanks.

6. CompressMaster, how exactly did you generate that number?

7. By typing numbers RANDOMLY, of course.

8. Aha, I see.
Typing "random" digits usually does not produce a random result. (Your favorite digits are 6 and 4 (151 and 134 times; 21% and 18%). 0 occurs only 17 times (2%). This is not random.)
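
(This check is easy to reproduce; a minimal sketch, with the digit string left as a placeholder for the attached number:)

```cpp
// Count how often each decimal digit occurs and print the percentages.
#include <cstdio>

int main() {
    const char* digits = "...";  // placeholder: paste the attached number here
    int count[10] = {0};
    int n = 0;
    for (const char* p = digits; *p; ++p)
        if (*p >= '0' && *p <= '9') { ++count[*p - '0']; ++n; }
    for (int d = 0; d < 10; ++d)
        std::printf("%d: %d (%.0f%%)\n", d, count[d],
                    n ? 100.0 * count[d] / n : 0.0);
}
```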

And what is your purpose? Why would you like to compress it?

9. Originally Posted by CompressMaster
By typing numbers RANDOMLY, of course.
The probability of creating a distribution of decimal digits like this:

0: 17
1: 92
2: 25
3: 42
4: 134
5: 58
6: 151
7: 92
8: 55
9: 59

with a random number generator is vanishingly small: for these exact counts, on the order of one in 10^64.
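
(A quick way to check that order of magnitude is a multinomial computation over the counts above; a sketch:)

```cpp
// log10 of the probability that a uniform RNG yields exactly these digit
// counts: log10( n! / (c0! * ... * c9!) ) - n, using lgamma log-factorials.
#include <cmath>
#include <cstdio>

int main() {
    const int count[10] = {17, 92, 25, 42, 134, 58, 151, 92, 55, 59};
    int n = 0;
    for (int d = 0; d < 10; ++d) n += count[d];        // 725 digits total

    const double LN10 = std::log(10.0);
    double log10p = std::lgamma(n + 1.0) / LN10 - n;   // log10(n!) - n
    for (int d = 0; d < 10; ++d)
        log10p -= std::lgamma(count[d] + 1.0) / LN10;  // minus log10(cd!)
    std::printf("log10(P) = %.1f\n", log10p);  // about -64: vanishingly unlikely
}
```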

Originally Posted by CompressMaster
Is it possible to represent this huge number with a much smaller amount of data? Maybe with integer factorization or similar math operations? Or with an algorithm?
The data can be compressed to a smaller size by almost any compressor. Without additional details on the nature of the data or the desired performance goals, it is impossible to recommend a particular compressor.

10. He has described the source; we just need to build a model of it.

12. Originally Posted by Kennon Conrad
The probability of creating a distribution of decimal digits like this:

0: 17
1: 92
2: 25
3: 42
4: 134
5: 58
6: 151
7: 92
8: 55
9: 59

with a random number generator is vanishingly small: for these exact counts, on the order of one in 10^64.

This is nonetheless possible, and an eventual 'certainty'. Keep at it... or even better, try to improve on this 'certainty'. :)

13. Originally Posted by LawCounsels
This is nonetheless possible, and an eventual 'certainty'. Keep at it... or even better, try to improve on this 'certainty'. :)
Yes, there is a chance. This is why people play the lottery.

14. Originally Posted by LawCounsels
This is nonetheless possible, and an eventual 'certainty'. Keep at it... or even better, try to improve on this 'certainty'. :)
Suppose we want to make use of a NN as a 'black box' tool.

What is the current rationale, in brief, for NNs in the latest state-of-the-art lossless data compression?

15. Originally Posted by LawCounsels
Suppose we want to make use of a NN as a 'black box' tool.

What is the current rationale, in brief, for NNs in the latest state-of-the-art lossless data compression?
FURTHER: can a generic neural network structure (not purpose-built into e.g. an RNN structure, nor endowed in advance with anything at all, e.g. arithmetic coder capability) be made to learn on its own to effectively compress data losslessly?

16. AND without training on target sets (e.g. a known target Huffman code etc.), ONLY training by checking that the output is smaller than the input and reconstructs correctly.

Can a generic feed-forward neural network be made to learn on its own to invent or devise Huffman codes / arithmetic codes, or even ANS, or something newer?

17. No.

18. Originally Posted by Gotty
No.
Relaxing the requirements: suppose we are now allowed to train the feed-forward NN with target optimal Huffman-encoded compressed versions of the training input sets. Can it now learn to correctly Huffman encode new test input sets (without even requiring that it discover the Huffman code)?

19. I have problems understanding your sentences.

20. I.e., if we now train the feed-forward NN on training input binary data AND compare against the known Huffman-encoded compressed output binary data to back-propagate and adjust weights/biases, can the NN then correctly Huffman encode entirely new test input binary data, producing correct compressed output binaries henceforth?

Probably NOT an immediate straight 'no' here.

21. It would then be able to correctly Huffman encode new test input binaries, while at the same time having no knowledge whatsoever of the underlying Huffman code set.

22. Your sentences are still very cryptic. I barely understand what you mean.
So you want to teach a neural network how to Huffman encode a file?

23. Originally Posted by Gotty
So you want to teach a neural network how to Huffman encode a file?
Yes. E.g., train it on input data sets of 1000 bits (consisting of the symbols A(00), B(01), C(10), D(11)) and compare its compressed outputs to pre-known Huffman-encoded target binaries (to back-propagate and adjust weights/biases).

The NN should then be able to correctly Huffman encode and compress entirely new test input binary data

(while NOT knowing anything about the underlying Huffman codes, e.g. that A is now 0, B is now 10, C is now 110, D is now 111, etc.).

24. Aham.
Still no.
Sorry.

A (feed-forward) neural network is suitable for classification and function approximation, but not for such a complex task as finding an unknown algorithm.
If I gave you 1000 "binary bits" in their original and Huffman-encoded formats, could you figure out the symbols?

25. Originally Posted by LawCounsels
Yes. E.g., train it on input data sets of 1000 bits (consisting of the symbols A(00), B(01), C(10), D(11)) and compare its compressed outputs to pre-known Huffman-encoded target binaries (to back-propagate and adjust weights/biases).

The NN should then be able to correctly Huffman encode and compress entirely new test input binary data

(while NOT knowing anything about the underlying Huffman codes, e.g. that A is now 0, B is now 10, C is now 110, D is now 111, etc.).

We train the feed-forward NN with many sets of 1,000 binaries (wherein A is 00, B is 01, C is 10, D is 11; we know this BUT the NN does not), and compare its output with target sets of corresponding Huffman-encoded binaries (wherein A is now 0, B is now 10, C is now 110, D is now 111; again, the NN does not know this) to back-propagate and adjust weights/biases.

Thereafter the NN can produce correct Huffman-encoded output binaries from entirely new, previously unseen 1,000-bit inputs [even though it still has no notion of the optimal Huffman code A=0, B=10, C=110, D=111]. It simply maps the 1,000 input bits directly to Huffman-encoded output binaries, without even knowing the algorithm [hence the NN here is not required to figure out the Huffman codes, not required to figure out the algorithm].
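
(Setting the learning question aside, the target mapping itself is just a fixed transcoding. A minimal sketch of that mapping in plain C++, no NN involved, using the code assignments above:)

```cpp
// Sketch only: the fixed transcoding the NN is asked to reproduce.
// Input symbols: A=00, B=01, C=10, D=11 (two input bits per symbol).
// Output codes:  A=0,  B=10, C=110, D=111 (the Huffman code above).
#include <cstdio>
#include <string>

std::string huffman_encode(const std::string& bits) {
    static const char* code[4] = {"0", "10", "110", "111"};  // A, B, C, D
    std::string out;
    for (std::size_t i = 0; i + 1 < bits.size(); i += 2) {
        int sym = (bits[i] - '0') * 2 + (bits[i + 1] - '0');  // 00..11 -> 0..3
        out += code[sym];
    }
    return out;
}

int main() {
    // "00 01 10 11" (i.e. ABCD) encodes to "0 10 110 111".
    std::printf("%s\n", huffman_encode("00011011").c_str());  // prints 010110111
}
```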

26. Here the NN just needs the correct set of weights/biases.

27. No, it does not work like that.
If I gave you many, many, many sets of 1000 "binary bits" in their original and Huffman-encoded formats, could you figure out the symbols?
Try it.

28. First, simplify this so that ALL sets of 1,000 bits consist of A(00), B(01), C(10), D(11), all with exactly the same probabilities (50%, 25%, 12.5%, 12.5%).

It is provable that there exist sets of weights/biases that map ALL such 1,000-bit inputs to the correct Huffman-encoded compressed output binaries.
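
(For scale: with those probabilities, 1,000 input bits are 500 symbols, and the ideal Huffman output averages 500 × (0.5·1 + 0.25·2 + 0.125·3 + 0.125·3) = 500 × 1.75 = 875 bits.)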

29. No. It won't work.

How many inputs and how many outputs do you have in such a Neural Network?
What exactly will you feed to the input(s) of that Neural Network? And how do you interpret the output(s)?

30. 1,000 input nodes; 1 node corresponds to 1 bit.

The number of output nodes is some fixed value C( ) corresponding to the symbol occurrences.

Here it's trivial: one can just initialize the correct sets of weights/biases such that each successive pair of input nodes is weighted to produce the successive output 0 / 10 / 110 / 111.

This is not meant to require 2^(large #) of nodes.

In your concrete case ("A is 0, B is 10, C is 110, D is 111"), how many outputs do you have? How do you interpret them?
p.s. You don't need to write "binary bit". You probably know that "bit" is a portmanteau of "binary digit".
