Image Processing and Recognition

RULE OF "30"

The rule of 30 is a criterion for deciding how many errors must be observed in order to estimate the error rate with a given uncertainty (at some confidence level, say 90%, 95% or 99%). Let R be the experimental error rate, i.e., R = E/N where E is the number of errors and N is the number of trials. The rule states that, once E = 30 errors have been observed, we can say

with 99% confidence, the true error rate is between 0.50 R and 1.50 R;
with 95% confidence, the true error rate is between 0.60 R and 1.40 R;
with 90% confidence, the true error rate is between 0.66 R and 1.33 R.

There is also a rule of "3". It says that when no errors are observed in N trials (E=0), the error rate is below 3/N with 95% confidence.
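
The rule of "3" can be checked numerically. The sketch below assumes the usual reading of the rule: the bound is the value of p for which observing zero errors in N trials has probability 1 - 0.95, i.e. (1-p)^N = 0.05. The function name and the sample values of N are illustrative only.

```python
def zero_error_upper_bound(n_trials, confidence=0.95):
    """Solve (1 - p)**n_trials = 1 - confidence for p.

    This is the largest error probability still compatible, at the given
    confidence level, with having seen no errors in n_trials trials.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)

# Compare the exact bound with the rule-of-3 approximation 3/N.
for n in (100, 1000, 10000):
    print(n, zero_error_upper_bound(n), 3.0 / n)
```

The agreement comes from -ln(0.05) being very close to 3, so p is approximately 3/N when N is large.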

Proof

We assume that the error probability p is not a random variable. The probability of E errors in N trials is given by the binomial distribution

P(E|N) = (N choose E) p^E (1-p)^(N-E)

By maximizing the likelihood function L=P(E|N) with respect to p we find that the best estimate of p is the experimental error rate R=E/N.
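
The maximization step follows from setting d(log L)/dp = E/p - (N-E)/(1-p) to zero, which gives p = E/N. As a rough numerical check, the sketch below scans a grid of p values and picks the one with the highest binomial log-likelihood. The grid search and the sample values E=30, N=1000 are assumptions made for illustration, not part of the original argument.

```python
import math

def log_likelihood(p, n_errors, n_trials):
    """Log of the binomial probability of n_errors errors in n_trials trials."""
    return (math.log(math.comb(n_trials, n_errors))
            + n_errors * math.log(p)
            + (n_trials - n_errors) * math.log(1.0 - p))

def grid_mle(n_errors, n_trials, steps=10000):
    """Return the grid point in (0, 1) that maximizes the log-likelihood."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda p: log_likelihood(p, n_errors, n_trials))

# The maximizer should coincide with E/N = 0.03 up to the grid resolution.
print(grid_mle(30, 1000))
```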

For large N (say N>10) the binomial distribution is almost a gaussian with mean Np and standard deviation ... . The variance of the observed errors is s_E²=Np(1-p), so the variance of the maximum likelihood estimator R=E/N is

s_R² = p (1-p) / N ≈ E (1-E/N) N^-2     (replacing p by its estimate E/N)

Therefore R is within +/- k s_R of p with confidence erf(k/sqrt(2)). When k=1 we have 68% confidence, when k=2 we have 95% confidence. Equivalently, p is within +/- k s_R of R with the same confidence. The full width of this interval, relative to the error rate, is
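
The confidence levels quoted for k=1 and k=2 can be reproduced with the standard error function. With the usual definition (the one used by Python's math.erf), the probability that a gaussian variable lies within k standard deviations of its mean is erf(k/sqrt(2)). The function name below is illustrative.

```python
import math

def confidence_for_k(k):
    """Probability that a gaussian variable is within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

# k = 1 and k = 2 are the values used in the text; the others are the
# usual two-sided quantiles for 90%, 95% and 99% confidence.
for k in (1.0, 1.645, 1.96, 2.0, 2.576):
    print(k, round(confidence_for_k(k), 4))
```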

2 k s_R R^-1 = 2 k sqrt( (1-E/N)/E )

which is almost 2 k / sqrt(E) for large N (that is, when E/N is small). For E=30 the half-width k / sqrt(E) is about 0.18 k.
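
The formula 2 k sqrt( (1-E/N)/E ) can be evaluated directly for E=30. The sketch below computes the relative half-width k s_R / R under the gaussian approximation, taking k as the two-sided normal quantile for the requested confidence level. The value N=10000 is an arbitrary assumption chosen so that E/N is small, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def relative_half_width(n_errors, n_trials, confidence):
    """Return k * s_R / R, with p replaced by its estimate E/N."""
    # Two-sided quantile: the central `confidence` mass lies within +/- k sigma.
    k = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return k * math.sqrt((1.0 - n_errors / n_trials) / n_errors)

# Lower and upper bounds of the interval, as multiples of R.
for conf in (0.90, 0.95, 0.99):
    w = relative_half_width(30, 10000, conf)
    print(conf, round(1.0 - w, 2), round(1.0 + w, 2))
```

Under these assumptions the half-widths come out near 0.30, 0.36 and 0.47 of R, which is of the same order as the rounded figures quoted in the rule.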

Marco Corvi - Page hosted by geocities.com.