The problem of "counting ones"

by kyungmi lim (http://geocities.yahoo.com/kyungmi_lim/)
Feb. 2005

A word about the notations.
It is hard to write math in plain text. One can write math
in TeX, but that makes an awkward reading.  So i use notations
somewhat in between:
   powers:  N^2,  N^(B-1)
   summation: Sum_{ ... } with the range in the braces
   subscripts: N_b, N_(B-1)
   tuples: (a,b,...,z)


Marco came up with several conjectures on numbers in base B.
One of them says:
  for a given positive integer number N lets define
      D(N) = Sum_{n <= N} #(1,B,n) - N
  where #(1,B,n) is the numbers of digit '1' in the representation
  of 'n' in base B.
  there exists a finite number of positive integers N that satisfy
      D(N) = 0
  and the largest number is 

  (1)    N_B = (1,1, ..., 1,0) in base B

  ie, N_B = Sum_{k=1 .. B-1} B^k

My homework is to prove this conjecture. I divide the proof in two steps
  -   (2)    D(N_B) = 0
  -   (3)    for any N > N_B, D(N) > 0

------------------------------------------------------
Proof of (2).

For convenience write
  #(n) = #(1,B,n)
since the base, B, and the digit, '1', are fixed.

Lets consider an example, B=5, N_B=11110, and count the '1's
in base 5.

     0
     1
     2
     3
     4 ............................................. D(4) = 1
    10   20   30   40
    11   21   31   41
    12   22   32   42
    13   23   33   43
    14   24   34   44 .............................. D(24) = 2 * 5
   100  110  ...  140  200  ... 300 ... 400 ... 440
   101    .                                 ... 441
   102    .                                 ... 442
   103                                      ... 443
   104                                      ... 444  D(124) = 3 * 5^2
  1000 .... 1400 2000 .... 3000 .... 4000 ....
  1001
   ... 
  1044 ....................................... 4444  D(624) = 4 * 5^3
 10000 11000 ... 20000 ... 30000 ... 40000
   ...   ... ... 20001 ... 30001 ...   ...
 10100 11100 ...   ... ...   ... ... 44000
     .   ... ...   ... ...   ... ...   ...
     . 11110 ...   ... ...   ... ... 44400
     .   ... ...   ... ...   ... ...   ...
     .   ... ...   ... ...   ... ... 44440
 10444   ... ...   ... ...   ... ... 44444

From this example we see that
  - there are B^d numbers from 0 to B^d - 1
  - there are B^(d-1) digit '1's in each column, and there are d columns
Therefore

     (4)  S(d) = Sum_{n=0 .. B^d-1} #(n) 
               = d * B^(d-1)

In particular this gives

     (5)  S(B-1) = Sum_{n=0 .. B^(B-1)-1} #(n) 
                 = (B-1) * B^(B-2)

How many '1' are there from B^(B-1) to N_B, inclusive ?
Lets go back to the example and see the structure of the numbers.
Define
                    ___B-k___ ___k___
     (6)    s(k) = (1,1,...,1,0,...,0)  

from s(B-1) to s(B-2)-1 there are B^(B-2) numbers, and B^(B-2)+S(B-2) '1's,
from s(B-2) to s(B-3)-1 there are B^(B-3) numbers, and 2*B^(B-3)+S(B-3) '1's,
...
from s(2) to s(1)-1 there are B numbers, and (B-2)*B+S(1) '1's, and
s(1) is just 1 number and has (B-1) '1's.

From these and (5) we have
         Sum_{n=0 .. N_B} #(n) = (B-1) * B^(B-2) 
                             + B^(B-2) + (B-2)*B^(B-3)
                             + 2 * B^(B-3) + (B-3)*B^(B-4)
                             + ...
                             + (B-2) * B + 1
    (8)                      + (B-1)
             = B^(B-1) + B^(B-2) + B^(B-3) + ... + B^2 + B
             = Sum_{n=1 .. B-1} B^n
             = N_B

This proves eq. (2).

-------------------------------------------------------
Proof of (3)

We will prove it in three steps:
  - for the numbers from N_B+1 to 2*B^(B-1)-1
  - for the numbers from 2*B^(B-1) to N_f=B^B - 1
  - for the numbers greater or equal to B^B

[1] - - - - - - - - - - - - - - - - - - - - - - - - - - -
Looking again at the example we notice that from N_B + 1 to 2*B^(B-1)-1,
each number has at least one '1' (and N_B + 1 has B '1's). Therefore
D>0 for these numbers. Using the leading '1's to compensate for the increment
of natural numbers, D(2*B^(B-1)-1) is just the sum of '1's in the remaining
columns (from the second to the rightmost) in these numbers.

        D(2*B^(B-1)-1) =
          = S(B-1) - [ Sum_{n=B^(B-1) .. N_B} #(n) - (N_B - (B^(B-1) - 1)) ]
  (9)     = (B-1)*B^(B-2) - [ B^(B-2) + Sum_{k=1 .. B-2} B^k
                              - Sum_{k=1 .. B-1} B^k + B^(B-1) - 1 ]
          = (B-2) * B^(B-2) + 1

For B=2 this is 1 and the next number is B^B.
For B>=3, the next step is to show that all integers from 2*B^(B-1) to 
N_f = B^B - 1 have D(n) > 0.  Then we think about the numbers that are 
greater or equal than N_f.

[2] - - - - - - - - - - - - - - - - - - - - - - - - - - -
Let m(k) = k * B^(B-1).  Every block from m(k) to m(k+1)-1 has exactly
the same sequence of numbers if we disregard the digit 'k' in the first
column.  There are B^(B-1) numbers in each block.  Moreover the 'k's in
the first column does not contribute to the sum of '1's.  Therefore if we
write the numbers in the block k as n(k,x) = m(k)+x = k*B^(B-1) + x, 
with 0 <= x < B^(B-1),

  Sum_{n=0 .. n(k,x)} #(n) = Sum_{n=0 .. k*B^(B-1)} #(n) + Sum_{n=0 .. x} #(n)

For B-1>= k > h >=2,

  D(n(h,x)) = Sum_{n=0 .. n(h,x)} #(n) - n(h,x)
            = Sum_{n=0 .. k*B^(B-1)} #(n) - (k-h) Sum_{n=0 .. B^(B-1)} #(n)
              + Sum_{n=0 .. x} #(n) - (k * B^(B-1) - (k-h) * B^(B-1) + x )
  (10)      = D(n(k,x)) - (k-h) * S(B-1) + (k-h) * B^(B-1)
            = D(n(k,x)) + (k-h) * B^(B-2)

Therefore if D>0 for every number in the last block, k=B-1, then D>0 for
every number in the other blocks as well, ie, from 2*B^(B-1) to N_f=B^B - 1.

Compute D for N_f and m(B-1) - 1,

       D(N_f) = Sum_{n=0 .. N_f} #(n) - N_f
              = S(B) - (B^B - 1)
   (11)       = B * B^(B-1) - B^B + 1
              = 1

       D(m(B-1) - 1) = Sum_{n=0 .. (B-1)*B^(B-1)} #(n) - (B-1)*B^(B-1) + 1
                     = S(B) - S(B-1) - (B-1)*B^(B-1) + 1
   (12)              = B*B^(B-1) - (B-1)*B^(B-2) - (B-1)*B^(B-1) + 1
                     = B^(B-2) + 1

D(m(B-1)) is less than the number of integers in the block from m(B-1) to
N_f.  We need to check D inside the block.  The strategy is to break the block
into segments and use the positive amount of D at the entrance of each
segment to compensate for the increment of natural numbers inside the segment,
without computing D for each number.  In fact if we can segment the block in
a way that the entrance value of D for each segment is greater than the amount
of numbers it contains we are sure that D is positive for every number in 
the segment. Therefore D>0 in the whole block, which is what we need to prove
for the second step.

Let's break the block such that the closer a segment is to N_f the shorter 
it is.  Define
                 ____B-k____ ____k____ 
   (13)  r(k) = (B-1,...,B-1,0,0,...,0)      1<=k<=B-1

with segments:
         r(1) --> N_f    for k=1. It contains B numbers.
         r(k) --> r(k-1) for 1<k<=B-1. It contains B^k-B^(k-1) numbers.

In general, for k>=1, 
 
   (13')    D(r(k)-1) = Sum_{n=0 .. B^B} #(n) - S(k) - (B^B - B^k -1)
                      = B^k - k * B^(k-1) + 1

which is less that B^k - B^(k-1).  For the segment k=1, D(r(1)-1) = B, and
there are B-1 numbers from r(1) to N_F-1. Therefore D>0 in this segment.

For k>=2 further segmentation is needed.  Define

                    ____B-k____   ___k-1___
   (14)   r(k,a) = (B-1,...,B-1,a,0,0,...,0)   0<= a < B-1,   2 <= k <= B-1

with segments r(k,a) --> r(k,a+1)-1 which contain B^(k-1) numbers.  Clearly
r(k,0) = r(k), and r(k,B-1)=r(k).

   (15)   D(r(k,0)-1) = Sum_{n=0 .. B^B} #(n) - S(k) -(B^B - B^k -1)
                      = (B-k)*B^(k-1) + 1

Since D(r(k,0)-1) > B^(k-1), D>0 for each number in the segment a=0.  The
following B^(k-1) numbers (segment a=1) have at least one '1'. Therefore 
D>0 for them as well.  In general, for segments a>=2,

          D(r(k,a)-1) = (B-k)*B^(k-1) + 1 + a*S(k-1) + B^(k-1) - a*B^(k-1)
                      = (B-k+1-a)*B^(k-1) + 1 + a*(k-1)*B^(k-2)

The derivative of this with respect to a is negative,
   
    dD(r(k,a)-1)/da = (-B+k-1)*B^(k-2) < 0 

therefore D is monotonically decreasing function of 'a'.  It attains the
minimum at a=B-2, 

          D(r(k,B-2)-1) = (-k+3)*B^(k-1) + 1 + (B-2)*(k-1)*B^(k-2)
    (16)                = B^(k-1) + (B-2*k+2)*B^(k-2) + 1

This is not always greater than B^(k-1).  Let's introduce smaller segments.
Define
                     ___B-k____     ___k-2__
    (17) r(k,a,b) = (B-1,..,B-1,a,b,0,0,..,0)   2<=k<=B-1, 2<=a<=B-2, 0<=b<=B-1

with segment r(k,a,b) --> r(k,a,b+1) containing B^(k-2) numbers.  From 
eq. (16)  D(r(k,a,0)-1) > B^(k-2), therefore D>0 for the segment b=0.  In the
next segment (b=1) each number has at least one '1'.  Therefore D>0 for 
segment b=1.  For b >= 2

         D(r(k,a,b)-1) = (B-k+1-a)*B^(k-1) + 1 + a*(k-1)*B^(k-2) + b*S(k-2)
                         + B^(k-2) - b*B^(k-2)
                       = (B-k+1)*B^(k-1) + B^(k-2) + 1 - a*(B-k+1)*B^(k-2)
                         - b*(B-k+2)*B^(k-3)

and its derivative with respect to 'b' is negative:

    dD(r(k,a,b)-1)/db = - (B-k+2)*B^(k-3)

Therefore it is monotonically decreasing function of 'b' and its minimum
(at a=B-2, b=B-1) is

      D(r(k,a,b)-1)_min = (B-k+1)*B^(k-2) + (B-k+2)*B^(k-3) + 1 
                        > B^(k-2)

Hence D>0 for each segment of 'b' for 2<=k<=B-1, 2<=a<=B-2, 2<=b<=B-1.
We conclude that D(n)>0 for each number from r(B-1,0,0)=(B-1)*B^(B-1) to N_f.

This complete the proof that D>0 for each number from 2*B^(B-1) to N_f.

[3] - - - - - - - - - - - - - - - - - - - - - - - - - - -
The last step shows that D(n)>0 for all n>N_f=B^B. For k>=0, 

     D(B^(B+k)-1) = S(B+k) - (B^(B+k)-1)
                  = 1 + k*B^(B+k-1)
                  > 0

Since from B^(B+k) to 2*B^(B+k) every number has '1' in the first column,
D>0 for these numbers (there are B^(B+k) of them).  We proceed as we did
in step [1].

     D(2*B^(B+k)-1) = 1 + k*B^(B+k-1) + S(B+k)
                    = 1 + 2*k*B^(B+k-1) + B^(B+k)
                    > B^(B+k)

Therefore we can proceed with the next B^(B+k) numbers.  In general for
2 <= a <= B-1,

     D(a*B^(B+k)-1) = 1 + (a-1)*k*B^(B+k-1) + S(B+k) 
                    = 1 + a*k*B^(B+k-1) + B^(B+k)
                    > B^(B+k)

Therefore D is always positive for the numbers from B^(B+k) to 
(B-1)*B^(B+k) + B^(B+k) - 1.  since this is true for any k>=0, D>0 for
all n>N_f, and this completes the proof of (3).