> how to update the probability that the next value will be a 0 or a 1 ?

Suppose you've encountered n0 zeroes and n1 ones,

then the entropy of the processed string is

l[p0_,n0_,n1_]:=-n0*Log[2,p0]-n1*Log[2,1-p0]

and by solving l'(p0)==0 we can find the p0 value

which minimizes the entropy.

http://www.wolframalpha.com/input/?i...%3D%3D0%2Cp%5D

which is p=n0/(n0+n1).
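As a quick numerical sanity check (a Python sketch I'm adding here, not part of the original post), minimizing the code length -n0*log2(p)-n1*log2(1-p) over a grid of p values does land on n0/(n0+n1):

```python
# Hypothetical check: minimize the code length -n0*log2(p) - n1*log2(1-p)
# over a grid of p values and compare against n0/(n0+n1).
from math import log2

def code_length(p, n0, n1):
    # bits needed to code n0 zeroes and n1 ones with zero-probability p
    return -n0 * log2(p) - n1 * log2(1 - p)

n0, n1 = 7, 3
grid = [k / 1000 for k in range(1, 1000)]
best_p = min(grid, key=lambda p: code_length(p, n0, n1))
print(best_p, n0 / (n0 + n1))  # both are 0.7
```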

But it's a posterior estimation.

(ie it's the best approximation of the static generator's parameter, given the known bits,

but it's not necessarily optimal for prediction of the next bit.)

And for a prior estimation there's a useful method based on likelihoods:

http://www.wolframalpha.com/input/?i...%2Bn0%5D%29%5D

Code:
(* the probability of a string with n0 zeroes and n1 ones: *)
c[n0_,n1_]:=1/Binomial[n0+n1,n0];
(* the probability of the next string being the one with an extra zero: *)
p=c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])
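The same likelihood ratio can be sketched in Python (my own restatement, not from the post); for this combinatorial c the ratio works out to (n0+1)/(n0+n1+2):

```python
# Likelihood-based prediction with c(n0,n1) = 1/C(n0+n1, n0),
# computed exactly with rational arithmetic.
from math import comb
from fractions import Fraction

def c(n0, n1):
    return Fraction(1, comb(n0 + n1, n0))

def p_next_zero(n0, n1):
    a, b = c(n0 + 1, n1), c(n0, n1 + 1)
    return a / (a + b)

print(p_next_zero(7, 3))           # 2/3
print(Fraction(7 + 1, 7 + 3 + 2))  # also 2/3
```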

The same method can actually also be used with string probabilities based on entropy,

with the substituted posterior p=n0/(n0+n1), ie

c[n0_,n1_]:=1/2^l[n0/(n0+n1),n0,n1] =(n0/(n0+n1))^n0*(n1/(n0+n1))^n1

but the entropy formula is an approximation (it assumes p=n0/(n0+n1), while the actual information

is that there are n0 zeroes and n1 ones; these are not the same at all),

so we end up with a pretty weird estimation:

p=1/(1+(n0^n0*(n1+1)^(n1+1)/(n0+1)^(n0+1)/n1^n1))
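As a cross-check (again a Python sketch of my own, not from the post), the entropy-based c does reproduce this closed form:

```python
# Entropy-based string "probability" c(n0,n1) = (n0/n)^n0 * (n1/n)^n1,
# compared against the closed-form estimation above.
def c(n0, n1):
    n = n0 + n1
    return (n0 / n) ** n0 * (n1 / n) ** n1

def p_mp(n0, n1):
    a, b = c(n0 + 1, n1), c(n0, n1 + 1)
    return a / (a + b)

def p_closed(n0, n1):
    return 1 / (1 + n0**n0 * (n1 + 1)**(n1 + 1) / (n0 + 1)**(n0 + 1) / n1**n1)

n0, n1 = 5, 2
print(p_mp(n0, n1), p_closed(n0, n1))  # the two values agree
```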

Although Shkarin actually advertised it (as the "MP-estimator") -

http://encode.ru/threads/997-Counter...ll=1#post20146

and it's supposedly better than plain (n0+d)/(n0+n1+2*d) in practical use.

(see http://nishi.dreamhosters.com/u/mp2a.rar)

Actually this approach can be improved even further by taking into

account the probability distribution of p values.

For example,

Code:
c[n0_,n1_]:=Integrate[
  PDF[BetaDistribution[1,1],p]*(1-p)^n1*p^n0, {p,0,1},
  Assumptions->{Element[p,Reals],Element[n0,Integers],Element[n1,Integers],n0>=0,n1>=0}]
p=FullSimplify[c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])]

gives p=(n0+1)/(n0+n1+2)

and

Code:
c[n0_,n1_]:=Integrate[
  PDF[BetaDistribution[1/2,1/2],p]*(1-p)^n1*p^n0, {p,0,1},
  Assumptions->{Element[p,Reals],Element[n0,Integers],Element[n1,Integers],n0>=0,n1>=0}]
p=FullSimplify[c[n0+1,n1]/(c[n0+1,n1]+c[n0,n1+1])]

gives p=(n0+.5)/(n0+n1+1) (the Krichevski-Trofimov estimator)
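For a numerical cross-check (a Python sketch I'm adding, not from the post): these Beta-prior integrals have the closed form B(n0+a, n1+b)/B(a, b), which can be evaluated with lgamma; a=b=1 reproduces (n0+1)/(n0+n1+2) and a=b=1/2 the KT estimator.

```python
# Beta(a,b)-prior string probability via the Beta function:
# Integrate[PDF[BetaDistribution[a,b],p]*p^n0*(1-p)^n1, {p,0,1}]
#   = B(n0+a, n1+b) / B(a, b)
from math import lgamma, exp

def log_beta(x, y):
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def c(n0, n1, a, b):
    return exp(log_beta(n0 + a, n1 + b) - log_beta(a, b))

def p_next_zero(n0, n1, a, b):
    c0, c1 = c(n0 + 1, n1, a, b), c(n0, n1 + 1, a, b)
    return c0 / (c0 + c1)

n0, n1 = 7, 3
print(p_next_zero(n0, n1, 1, 1))      # (n0+1)/(n0+n1+2) = 2/3
print(p_next_zero(n0, n1, 0.5, 0.5))  # (n0+0.5)/(n0+n1+1) ≈ 0.6818
```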

But it's often more interesting to use secondary statistics and numerical integration there.

It's actually completely practical if we apply these formulas to the initialization of a

state-to-probability map or some such.

Btw, the plot of BetaDistribution[1,1] is this: http://www.wolframalpha.com/input/?i...2C0%2C1%7D+%5D

(ie all p values have the same probability)

and BetaDistribution[1/2,1/2] is this: http://www.wolframalpha.com/input/?i...2C0%2C1%7D+%5D

(ie probabilities near 0 and 1 are more likely to appear)

Of course, when we know that the data is not random, the latter is a more reasonable assumption.

But we'd only know the actual distribution after collecting the statistics from various contexts in a real CM.

--------------

1. This was originally a comment on Cyan's blog, but then I tried to fix some typos...

2. "Mathematica" syntax is used.