RULE OF "30"

The rule of 30 is a criterion for deciding how many errors must be observed in order to estimate the error rate with a given uncertainty (at some confidence level, say 90%, 95% or 99%). Basically it states that, if R is the experimental error rate, i.e., R = E/N where E is the number of errors and N is the number of trials, when E=30 errors we can say

There is also a rule of "3". It says that when no errors are observed (E=0) the error rate is below 3/N at the 95% confidence level.
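The rule of "3" can be checked numerically. The sketch below assumes the usual reading of the rule: the bound is the value of p for which observing zero errors in N trials has probability 1 - confidence, that is (1-p)^N = 0.05 at the 95% level. The function name and the sample values of N are illustrative only.

```python
def zero_error_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """Upper bound on p when 0 errors are seen in n_trials.

    Assumption: the bound solves (1 - p)**n_trials = 1 - confidence.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


# Compare the exact bound with the 3/N approximation for a few sample sizes.
for n in (100, 1000, 10000):
    print(f"N={n:6d}  exact={zero_error_upper_bound(n):.6f}  3/N={3.0 / n:.6f}")
```

The two columns agree closely because -ln(0.05) is about 2.996, which the rule rounds to 3.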

Proof

We assume that the error probability p is not a random variable. The probability of E errors in N trials is given by the binomial distribution

P(E|N) = (N choose E) p^E (1-p)^(N-E)

By maximizing the likelihood function L=P(E|N) with respect to p we obtain the maximum likelihood estimate of p, which is the experimental error rate R=E/N.
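As a quick numerical check of this step, the following sketch evaluates the binomial likelihood on a grid of p values and picks the largest. The grid search, the function names and the example counts (N=1000, E=30) are assumptions made for illustration; the analytical result is R=E/N.

```python
import math


def binomial_likelihood(p: float, n_trials: int, n_errors: int) -> float:
    """P(E|N) for a fixed error probability p."""
    return (math.comb(n_trials, n_errors)
            * p ** n_errors
            * (1.0 - p) ** (n_trials - n_errors))


def grid_mle(n_trials: int, n_errors: int, steps: int = 10000) -> float:
    """Return the grid point in [0, 1] with the largest likelihood."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda p: binomial_likelihood(p, n_trials, n_errors))


n_trials, n_errors = 1000, 30  # illustrative counts, not from the text
print("grid maximum:", grid_mle(n_trials, n_errors))
print("E / N       :", n_errors / n_trials)
```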

For large N (say N>10) the binomial distribution is almost a gaussian with mean Np and standard deviation sqrt( Np(1-p) ). The variance of the observed errors is s_E^2 = Np(1-p), so the variance of the maximum likelihood estimator R = E/N is

s_R^2 = p (1-p) / N ≈ E (1-E/N) / N^2

where p has been replaced by its estimate R in the last step.
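The variance of R can also be checked by simulation. This is a minimal Monte Carlo sketch, not part of the proof: it repeats the N-trial experiment many times with a known p and compares the sample variance of R with p(1-p)/N. The values of p, N, the number of repetitions and the seed are arbitrary choices.

```python
import random


def simulate_rate_variance(p: float, n_trials: int,
                           n_experiments: int, seed: int = 0) -> float:
    """Sample variance of R = E/N over repeated simulated experiments."""
    rng = random.Random(seed)
    rates = []
    for _experiment in range(n_experiments):
        # Count the errors in one experiment of n_trials independent trials.
        errors = sum(1 for _trial in range(n_trials) if rng.random() < p)
        rates.append(errors / n_trials)
    mean_rate = sum(rates) / n_experiments
    return sum((r - mean_rate) ** 2 for r in rates) / (n_experiments - 1)


p, n_trials = 0.1, 200  # arbitrary illustrative values
empirical = simulate_rate_variance(p, n_trials, n_experiments=2000)
predicted = p * (1.0 - p) / n_trials
print(f"empirical variance of R: {empirical:.3e}")
print(f"p (1-p) / N            : {predicted:.3e}")
```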

Therefore R is within +/- k s_R of p with confidence erf(k/sqrt(2)). When k=1 we have 68% confidence, when k=2 we have 95% confidence. Equivalently, p is within +/- k s_R of R with the same confidence. The relative uncertainty of the error rate, i.e. the full width of this interval divided by R, is

2 k s_R / R = 2 k sqrt( (1-E/N)/E )

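which is almost 2 k / sqrt(E) for large N (E much smaller than N). For E = 30 and 90% confidence (k = 1.645) this is about 0.60, i.e. p is within roughly +/- 30% of R.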
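The numbers in this last step can be reproduced with a short script. It is only a sketch under the gaussian approximation used in the proof: the two-sided confidence for +/- k standard deviations is taken as erf(k/sqrt(2)), and k is obtained from the inverse normal CDF in Python's statistics module. The function names and the value of N are hypothetical.

```python
import math
from statistics import NormalDist


def coverage(k: float) -> float:
    """Two-sided gaussian confidence for an interval of +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def k_for_confidence(confidence: float) -> float:
    """Inverse of coverage(): the k that gives the requested confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def relative_uncertainty(n_errors: int, n_trials: int, confidence: float) -> float:
    """Full relative width 2 k sqrt((1 - E/N) / E) of the interval around R."""
    k = k_for_confidence(confidence)
    return 2.0 * k * math.sqrt((1.0 - n_errors / n_trials) / n_errors)


print(f"k=1 -> {coverage(1.0):.4f}, k=2 -> {coverage(2.0):.4f}")

n_trials = 1_000_000  # hypothetical number of trials, large compared with E
for confidence in (0.90, 0.95, 0.99):
    width = relative_uncertainty(30, n_trials, confidence)
    print(f"E=30, confidence {confidence:.2f}: "
          f"full width {width:.3f}, half width {width / 2.0:.3f}")
```

With E=30 the half width at 90% confidence comes out close to 0.30, and it grows for the 95% and 99% levels.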



Marco Corvi - Page hosted by geocities.com.