Euler’s Formula

Ways of summing series go back to the ancient Greek Pythagoreans who predated Euclid by hundreds of years. We find odd references to these early ‘concrete number’ mathematicians in Aristotle and in Speusippus’s later book on numbers which was allegedly based on the work of the Pythagorean, Philolaus. Quite a lot can be derived from messing about with stones or marbles which is how everything got started. Consider the following never-ending triangle. Since we start with unity and each layer has one more marble, or dot, than the previous layer, it is a concrete representation of what we today call the ‘natural numbers’.

0
00
000
0000
00000

Cutting off the pile at successive layers gives us the ‘triangular numbers’ (by convention unity is considered the first ‘triangular’, or for that matter ‘square’ or ‘polygonal’, number).

0

0
00

0
00
000

Listing the successive triangles as Δ1 Δ2 Δ3 we can build up a table which, in modern terms, is

Δ1 = 1
Δ2 = 1 + 2 = 3
Δ3 = 1 + 2 + 3 = 6
Δ4 = 1 + 2 + 3 + 4 = 10 ………
Δn = 1 + 2 + 3 + 4 + ……. + n

i.e. the sum of the first n ‘natural numbers’ as we call them today constitutes the nth triangular number.

This may not at first seem particularly illuminating. But wait. If we invert a triangular number, place it above the same triangular number and close up, we obtain a rectangular number where the longer side is greater than the shorter side by a single unit. This is what the Greeks called an ‘oblong’ number (heteromekes).

If the original number has a bottom layer of n units, the resulting rectangle will have a bottom layer of (n+1) and the rectangle will be (n+1) × n. So

Δn + Δn = (n+1) × n or     2 Δn =   n(n+1) as we would write it.

But we know that the nth triangular number is the sum of the ‘natural numbers’ from 1 to n. Thus,

2 (1 + 2 + 3…….+ n) = n(n+1) or

(1 + 2 + 3…….+ n) = n(n+1)/2

This is probably the first known formula for summing a series, though Euclid would have expressed it in either a verbal or geometric form. So the sum of the successive natural numbers from 1 to 4, say, is just half of 4 × 5, or 10. And indeed 1 + 2 + 3 + 4 = 10.

It is by no means a trivial piece of mathematics. If you want to know the sum of the natural numbers from 1 to 50 inclusive, for example, you don’t need a calculator, or even pen and paper. The result, according to the above formula, is half of 50 × 51, or 2550/2 = 1275. According to the story, Gauss’s teacher, when Gauss was only ten years old, gave the class a problem in addition involving a hundred successive numbers. Gauss flung his slate onto the table with the correct answer written on it when the teacher had hardly finished speaking, saying in his uncouth German dialect ‘Ligget se’ (‘There it is’). He had discovered for himself the above formula in a slightly more advanced form where the first number was not unity.
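For anyone who would rather check the formula than take it on trust, here is a minimal Python sketch. The function names (`triangular`, `triangular_by_addition`, `arithmetic_sum`) are my own inventions for illustration, and `arithmetic_sum` is only my guess at the 'slightly more advanced form' where the first number is not unity.

```python
def triangular(n: int) -> int:
    """n-th triangular number from the closed form n(n+1)/2."""
    return n * (n + 1) // 2


def triangular_by_addition(n: int) -> int:
    """The same number obtained the slow way, by adding 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


def arithmetic_sum(first: int, count: int, step: int = 1) -> int:
    """Sum of `count` equally spaced whole numbers starting at `first`.

    Pair the first term with the last, as with the two triangles:
    twice the sum is count * (first + last).
    """
    last = first + (count - 1) * step
    return count * (first + last) // 2


for n in (4, 50, 100):
    # The closed form and the brute-force addition should always agree.
    assert triangular(n) == triangular_by_addition(n)
    print(n, triangular(n))   # 4 10, then 50 1275, then 100 5050
```

The integer division is safe here because one of n and n+1 is always even, which is exactly what the 'oblong' picture shows.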

By the time of Newton and his immediate successors formulae for sums of low powers of positive integers were already well-known, e.g.

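$1 + 2 + 3 + 4 + \dots + n = \tfrac{1}{2} n^2 + \tfrac{1}{2} n$

$1^2 + 2^2 + 3^2 + \dots + n^2 = \tfrac{1}{3} n^3 + \tfrac{1}{2} n^2 + \tfrac{1}{6} n$

$1^3 + 2^3 + 3^3 + \dots + n^3 = \tfrac{1}{4} n^4 + \tfrac{1}{2} n^3 + \tfrac{1}{4} n^2$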

and Bernoulli had shown how to extend these results to higher powers of any order.
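These three formulae are easy to test by brute force. The following is only a sketch: the helper names `sum_of_powers` and `closed_form` are hypothetical, and exact fractions are used so that no rounding error can creep in.

```python
from fractions import Fraction


def sum_of_powers(n: int, k: int) -> int:
    """1^k + 2^k + ... + n^k, added up term by term."""
    return sum(i ** k for i in range(1, n + 1))


def closed_form(n: int, k: int) -> Fraction:
    """The closed formulae quoted above, for k = 1, 2 or 3 only."""
    m = Fraction(n)
    if k == 1:
        return m ** 2 / 2 + m / 2
    if k == 2:
        return m ** 3 / 3 + m ** 2 / 2 + m / 6
    if k == 3:
        return m ** 4 / 4 + m ** 3 / 2 + m ** 2 / 4
    raise ValueError("only k = 1, 2, 3 are covered in this sketch")


for k in (1, 2, 3):
    for n in range(1, 51):
        # Each formula should reproduce the term-by-term total exactly.
        assert closed_form(n, k) == sum_of_powers(n, k)

print(closed_form(10, 2), closed_form(10, 3))   # 385 3025
```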

But what about summing the reciprocals of the natural numbers? It was known that the Harmonic Series, $1 + 1/2 + 1/3 + \dots + 1/n + \dots$, did not produce a recognizable finite sum even though the terms became progressively smaller: in modern terms it was a divergent series.

Did the same apply to

$1 + 1/2^2 + 1/3^2 + 1/4^2 + \dots + 1/n^2 + \dots$ ?

Mathematicians at the time suspected that the above did have a limiting value but for a long time Europe’s finest mathematicians were unable to point to a recognizable ‘final sum’ —  we would say an ‘upper limit’ to which the partial sums converge — and it came as a surprise when Euler produced the following famous formula  like a rabbit out of a hat.

$1 + (1/2)^2 + (1/3)^2 + \dots + (1/n)^2 + \dots = \pi^2/6$

Once again, as in Leibniz’s elegant formula for summing

$1 - 1/3 + 1/5 - 1/7 + \dots \pm 1/(2n-1) \dots = \pi/4$

the irrational number π makes an appearance in an unexpected context.
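The different behaviour of these three series can be seen by simply adding up more and more terms. What follows is a rough numerical sketch (the function names are mine, and floating-point arithmetic is only approximate): the harmonic partial sums keep creeping upwards, whereas the other two settle down near π²/6 and π/4.

```python
import math


def harmonic(n: int) -> float:
    """Partial sum 1 + 1/2 + ... + 1/n of the Harmonic Series."""
    return sum(1.0 / k for k in range(1, n + 1))


def reciprocal_squares(n: int) -> float:
    """Partial sum 1 + 1/2^2 + ... + 1/n^2."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))


def leibniz(n: int) -> float:
    """First n terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n))


print("targets:", math.pi ** 2 / 6, math.pi / 4)
for n in (10, 1_000, 100_000):
    # Harmonic sums grow without settling; the other two columns stabilise.
    print(n, harmonic(n), reciprocal_squares(n), leibniz(n))
```

A table like this proves nothing, of course, but it does show why mathematicians of the time suspected that the second series had a definite limiting value.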

Euler was a mathematician of surpassing brilliance, probably one of the half dozen greatest of all time, but, by modern standards, his proofs lacked rigour. He was an explorer of mathematics rather than a lawmaker. Today, the above formula is generally proved by using techniques of complex analysis which had not been developed during Euler’s lifetime. I find this highly objectionable since the proof of a theorem that falls within a certain domain of mathematics should not, as I see it, use techniques from a more advanced domain. Those few persons who dared to object to Wiles’s extremely complicated modern ‘proof’ of Fermat’s Last Theorem employing mathematical techniques invented more than three hundred years after Fermat’s death were entirely in the right.

So what was Euler’s proof? It relied principally on results drawn from the theory of equations which, for the benefit of those who are innocent of the subject, I will briefly sketch out.

A whole range of problems that come up very often in practice can be stated in the form of so-called polynomials which are equated to zero. The canonical (preferred standard) form is

$A_n x^n + A_{n-1} x^{n-1} + \dots + A_1 x + A_0 = 0$

The $A_n$, $A_{n-1}$ and so forth are coefficients that do not involve the variable x while the last coefficient $A_0$ is a constant since it does not accompany any power of x (unless we imagine it accompanying $x^0 = 1$). A polynomial is deemed to have the degree of its highest term, the term in $x^n$. For example, $x^2 - 3x + 2$ has degree 2 and the polynomial $8x^5 - 3x^2 + 1$ has degree 5.

(Note that a polynomial of degree n does not necessarily have terms in all the possible powers: in the last example there is no term in $x^3$, for example. If we wish, we can consider that there are terms in all powers of x up to and including the highest but that some of them have zero coefficients and so they can be discounted. Thus, being somewhat pedantic, the last polynomial could be written as

$8x^5 + 0x^4 + 0x^3 - 3x^2 + 0x^1 + 1$ .)

A polynomial will take different values according to the value we give to x where x is, at least in so-called ‘elementary’ textbooks, taken as being a real number. Any polynomial expression can be plotted on a graph and in the vast majority of cases you are likely to come across, the graph of the function will cross the x-axis, which means that the functional value will be zero at this point. The value of x producing this result can itself occasionally be zero, as for example in the case of the first degree polynomial 4x = 0. But such a situation is unusual. More often than not, the value of x producing a zero functional value will not be zero, as in the case of, say, $2x^2 - 18$ when the graph of the function will cross the x-axis at the point x = 3 (and again at x = −3). I mention this because it is confusing for the layman to find mathematical textbooks talking about the ‘zeroes’ of a polynomial, or other, algebraic expression: what they mean is “a value of x which makes the function come to zero”.

Such a value is also sometimes called a ‘root’ of the relevant equation. It is a basic theorem that a polynomial of degree n cannot have more than n distinct roots. This is closely related to a result, first proved more or less rigorously by Gauss, which is so important that it is known as the Fundamental Theorem of Algebra: every polynomial of degree n has exactly n roots once complex roots and repeated roots are counted. In most cases you are likely to come across there will be exactly n different roots though you may encounter cases where there are so-called ‘equal roots’.

Again, there are a large number of simple looking equations which have no real roots at all, and this was one of the reasons for extending the real (or ordinary) number system to embrace so-called Complex Numbers. For example, if you search for a (real) value of x which will make $x^2 + 4x + 5$ go to zero, you will search in vain : the graph of this function does not cross the x-axis at all. However, we need not concern ourselves with this issue for the moment : at the time Euler was working complex numbers, though not far over the horizon, had not been properly investigated and defined.

Sometimes, instead of using the equation

$A_n x^n + A_{n-1} x^{n-1} + \dots + A_1 x + A_0 = 0$

we make a very minor modification and eliminate the first coefficient, $A_n$, or rather change it to 1. (This is quite legitimate because we could in any case divide right through by this coefficient and thus produce $A_{n-1}/A_n$, $A_{n-2}/A_n$ and so forth as the respective lower degree coefficients.) We have then

$x^n + A_{n-1} x^{n-1} + \dots + A_1 x + A_0 = 0$

If we do this we find that the second highest coefficient, $A_{n-1}$, actually indicates the sum of the roots, and that the final coefficient, the constant term, gives us the product of the roots (this is something that never ceases to amaze me). More specifically, the second coefficient, $A_{n-1}$, gives us the negative of the sum of the roots, i.e. (−1) × (sum of the roots), and the constant term, $A_0$, gives us either the product of the roots, or the negative of this product, depending on the degree of the polynomial. If the polynomial is of even degree, then $A_0$ = (product of the roots) and if it is of odd degree, then $A_0$ = (−1) × (product of the roots).

As an example take an equation of degree 2 selected at random

$x^2 - 10x + 24 = 0$

According to what I have just stated, the sum of the roots is 10 (because the coefficient is negative, $A_1 = -10$), and the product of the roots is 24. Calling the two roots $w_1$ and $w_2$ we can work out what the roots are from this information.

$w_1 + w_2 = 10$

$w_1 w_2 = 24$

You can most likely guess what the roots are by inspection. Otherwise, using the well-known algebraic relation

$(a + b)^2 - (a - b)^2 = 4ab$

I write

$(w_1 + w_2)^2 - (w_1 - w_2)^2 = 4 w_1 w_2$

$10^2 - (w_1 - w_2)^2 = 4(24) = 96$

$(w_1 - w_2)^2 = 100 - 96 = 4$

so $w_1 - w_2 = 2$ (taking $w_1$ to be the larger root)

which combined with $w_1 + w_2 = 10$ means

$w_1 = 6$

$w_2 = 4$

Fitting in these values to the expression, lo and behold everything cancels out and we are left with zero as desired since

$6^2 - 10 \cdot 6 + 24 = 36 - 60 + 24 = 0$

$4^2 - 10 \cdot 4 + 24 = 16 - 40 + 24 = 0$

You can try out examples of your own (though beware of sometimes getting equations with no solutions in real numbers).
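The little calculation above can be turned into a recipe. The sketch below is my own illustration, not a standard library routine: the function name `roots_from_sum_and_product` is hypothetical, and it assumes a second degree equation whose leading coefficient has already been made 1.

```python
import math


def roots_from_sum_and_product(total: float, product: float) -> tuple[float, float]:
    """Recover two real roots w1 >= w2 from w1 + w2 and w1 * w2.

    Uses (w1 + w2)^2 - (w1 - w2)^2 = 4 * w1 * w2, exactly as in the text.
    """
    difference_squared = total * total - 4 * product
    if difference_squared < 0:
        # This is the case of an equation with no solutions in real numbers.
        raise ValueError("no real roots")
    difference = math.sqrt(difference_squared)
    return (total + difference) / 2, (total - difference) / 2


# x^2 - 10x + 24 = 0 : sum of roots 10, product of roots 24
print(roots_from_sum_and_product(10, 24))   # (6.0, 4.0)
```

For $x^2 + 4x + 5$ the sum of the roots would be −4 and the product 5, and the function raises an error, which is the warning about equations with no real solutions in another guise.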

All the other coefficients, if there are any, of a polynomial can be written as combinations of the roots. For example, the third highest coefficient, $A_{n-2}$, turns out to be the combined sum of the products of the roots taken in pairs. In our second degree equation, we only have two roots, so this coefficient is in effect the product. But, in a cubic equation thrown into canonical form

$x^3 + A_2 x^2 + A_1 x + A_0 = 0$

$A_1$ will be $(w_1 w_2 + w_1 w_3 + w_2 w_3)$. The next coefficient $A_{n-3}$, if there is one, will be the negative of the sum of the products of the roots taken together in threes, and so on. (If this is new to you and you are interested by all this, consult some textbook dealing with the theory of equations.)

We can in fact throw the entire equation into a slightly different form

$A(x - w_1)(x - w_2)(x - w_3) \dots (x - w_n) = 0$

(This time I have included a coefficient A which will accompany $x^n$ when the whole thing is multiplied out, but as explained earlier this makes no essential difference.) The above will become zero for any value of x which coincides with a root $w_1$, $w_2$, … since such a value will make a bracketed term zero, and multiplying by this term will make everything zero. This is often a very convenient way of setting out an equation.
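To see the link between the bracketed form and the coefficients, one can multiply the brackets out mechanically. Here is a small sketch taking A = 1; the function name `expand_from_roots` is my own, and a polynomial is represented simply as a list of coefficients with the highest power first.

```python
def expand_from_roots(roots: list[int]) -> list[int]:
    """Coefficients of (x - w1)(x - w2)...(x - wn), highest power first."""
    coefficients = [1]
    for w in roots:
        # Multiplying by (x - w): one copy shifted up a power (times x),
        # one copy scaled by -w, then the two are added term by term.
        times_x = coefficients + [0]
        times_minus_w = [0] + [-w * c for c in coefficients]
        coefficients = [a + b for a, b in zip(times_x, times_minus_w)]
    return coefficients


print(expand_from_roots([6, 4]))      # [1, -10, 24]
print(expand_from_roots([1, 2, 3]))   # [1, -6, 11, -6]
```

In the cubic example the roots 1, 2, 3 have sum 6, pairwise products adding to 2 + 3 + 6 = 11 and product 6, and these appear with alternating signs as −6, 11, −6.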

Euler had the ingenious and daring idea of extending the principle to expressions which were not polynomials, such as sin x. As you can readily see from a graph sin x crosses the x-axis when x takes the values 0, π, 2π, 3π, 4π…. (This is radian measure with 2π radians equal to 360° so π = 180° &c. The graph of the function sin x starts at 0 and does in fact cross the x axis at regular intervals of 180°.)

Also, if we look to the left, we find that the same pattern continues with the negative values −π, −2π, −3π, ….. The reason for this is that, by convention, we consider an angle taken anti-clockwise to be ‘positive’ and taken in a clockwise direction to be negative. Thus, supposing we can write down sin x in the same sort of way as we did for a polynomial function with the different roots appearing in each bracket, we have

$\sin x = \dots (x - (-3\pi))(x - (-2\pi))(x - (-\pi)) \, A \, (x - 0)(x - \pi)(x - 2\pi)(x - 3\pi) \dots$

with the brackets stretching in both directions since there are an unlimited quantity of values which make sin x go to zero. Since −(−π) = π the terms on the left become positive and we have

$\sin x = \dots (x + 3\pi)(x + 2\pi)(x + \pi) \, A \, (x)(x - \pi)(x - 2\pi)(x - 3\pi) \dots$

Now, we can rearrange this infinite product in a way which brings the +π  and –π terms together, i.e.

$\sin x = A(x)(x + \pi)(x - \pi)(x + 2\pi)(x - 2\pi)(x + 3\pi)(x - 3\pi) \dots$

No terms have been neglected in this rearrangement and the order in which we multiply quantities does not affect the final result, so this is quite acceptable. The reason we have rearranged things in this way is so we can apply the well-known rule $(a + b)(a - b) = a^2 - b^2$

$\sin x = A(x)(x^2 - \pi^2)(x^2 - (2\pi)^2)(x^2 - (3\pi)^2)(x^2 - (4\pi)^2) \dots$

We now divide each side by x giving

$(\sin x)/x = A(x^2 - \pi^2)(x^2 - (2\pi)^2)(x^2 - (3\pi)^2)(x^2 - (4\pi)^2) \dots$

The coefficient at the head, A, is an (as yet) unspecified constant which in such an expression usually gives some indication of the ‘slant’ of a function. For example, in a first-degree linear function such as y = 6x + 3 the coefficient 6 tells you that this line will have gradient 6 : 1. In higher degree equations the situation is more complicated but the coefficient associated with the highest term will usually give a reasonable indication of the ‘slant’ of the function, i.e. the ratio of the increase of the functional value to the increase of the x value.

Now, it is a well-known fact that when x (in radians) is very small it is not so different from the value of sin x or, put in mathematical terms, that the limiting value as x goes to 0  of  the expression sin x/x   =  1.  Graphically, what this means is that the curving sin x line is not so far from the diagonal through the origin with gradient 1 : 1 — the slant of the function y = x which is always at 45°. It is also quite easy to prove that this is indeed the limit from other considerations.
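A quick numerical look at this limit, offered as an illustration rather than a proof; the function name `sine_ratio` is mine.

```python
import math


def sine_ratio(x: float) -> float:
    """The ratio sin(x) / x for a non-zero x measured in radians."""
    return math.sin(x) / x


for x in (1.0, 0.1, 0.01, 0.001):
    # The ratio should creep up towards 1 as x shrinks.
    print(x, sine_ratio(x))
```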

It would seem, then, that we require A to take an extremely small value, the reciprocal of an extremely large, theoretically ‘infinite’, quantity. For if we divide both sides by x we have

$(\sin x)/x = A(x^2 - \pi^2)(x^2 - (2\pi)^2)(x^2 - (3\pi)^2)(x^2 - (4\pi)^2) \dots$

and as x gets smaller and smaller each bracketed term gets nearer to just having a multiple of π in it, or we approach

$A(0 - \pi^2)(0 - (2\pi)^2)(0 - (3\pi)^2)(0 - (4\pi)^2) \dots = \pm A (\pi^2)(4\pi^2)(9\pi^2) \dots$

that is, A times an enormous product, negative or positive. But the limiting value of sin x/x does not blow up but on the contrary goes to 1. So we require an equally large denominator in the constant term to make the two ultimately cancel out, i.e. we need to set

$A = \dfrac{1}{(\pi^2)(4\pi^2)(9\pi^2) \dots}$

In theory the denominator here is infinite to match the theoretically infinite product we will get as x goes to 0 but we can simply consider that we  are  progressively extending the range of our  function at will, stopping at some arbitrarily large but still finite point. We need not bother about the sign either as it will oscillate, depending on how far we go, but we may as well take it as positive. Thus we reach the tentative conclusion that we can represent sin x as a function

$\sin x = A(x)(x^2 - \pi^2)(x^2 - (2\pi)^2)(x^2 - (3\pi)^2)(x^2 - (4\pi)^2) \dots$

where $A = \dfrac{1}{(\pi^2)(4\pi^2)(9\pi^2) \dots}$

It is more convenient to have the bracketed terms the other way round, i.e. replace $(x^2 - \pi^2)$ by $(\pi^2 - x^2)$ and this does not matter as the sign is changing anyway. Thus we obtain

$\sin x = A(x)(\pi^2 - x^2)((2\pi)^2 - x^2)((3\pi)^2 - x^2) \dots$

Here we make one final clean up to get the expression into the form we want. We take out all the π terms from the beginning of each bracket, replacing for example

$(\pi^2 - x^2)$ by $\pi^2(1 - x^2/\pi^2)$, $((2\pi)^2 - x^2)$ by $(2\pi)^2(1 - x^2/(2\pi)^2)$ and so on (if you multiply out you will see that this comes to the same thing). We then match all these π terms with those in the denominator of our constant and the result is that the new constant goes to unity since top and bottom cancel out.

We now have as our expression for sin x

$\sin x = 1(x)(1 - x^2/\pi^2)(1 - x^2/4\pi^2)(1 - x^2/9\pi^2) \dots$
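Whether such a product really does behave like sin x can at least be tested numerically. The sketch below cuts the product off after a finite number of brackets, the k-th bracket being 1 − x²/(kπ)²; the name `sine_product` and the choice of cut-off are my own assumptions, and agreement to a few decimal places is evidence, not proof.

```python
import math


def sine_product(x: float, brackets: int) -> float:
    """x (1 - x^2/pi^2)(1 - x^2/(2 pi)^2)... cut off after `brackets` factors."""
    result = x
    for k in range(1, brackets + 1):
        result *= 1 - x * x / (k * k * math.pi ** 2)
    return result


for x in (0.5, 1.0, 2.0):
    # Compare the truncated product with the library value of sin x.
    print(x, sine_product(x, 10_000), math.sin(x))
```

The agreement improves only slowly as more brackets are included, so a fairly large cut-off is needed to match even four decimal places.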

What is the point of all this jiggery-pokery?  You might well wonder but the surprising pay off is round the corner.

We start to multiply out the above giving

$\sin x = (x - x^3/\pi^2)(1 - x^2/4\pi^2)(1 - x^2/9\pi^2) \dots$

$= (x - x^3/\pi^2 - x^3/4\pi^2 + x^5/4\pi^4)(1 - x^2/9\pi^2) \dots$

We are interested in the third degree terms such as $(-x^3/\pi^2)$ which we will gather together, i.e.

$(-x^3)(1/\pi^2 + 1/(2\pi)^2 + 1/(3\pi)^2 + \dots)$

Now, it is well-known that sin x can also be written as the Taylor series

$\sin x = x - x^3/3! + x^5/5! - x^7/7! + x^9/9! - \dots$

where the ! indicates the factorial function. Factorial n, written n!, is the product of all the integers from 1 up to and including n. So 3! = 1·2·3 = 6; 5! = 1·2·3·4·5 = 120. The factorial function grows amazingly fast: even for such a small number as 9, factorial 9 takes us into the hundreds of thousands! (There is incidentally never any confusion between the linguistic and mathematical use of ‘!’ since the context makes it clear which is intended.)

The coefficient of $(-x^3)$ in each series must match up, or so Euler argued. (This means we have an infinite series ‘equal’ to a single finite term but no matter.) Thus

$(1/\pi^2 + 1/(2\pi)^2 + 1/(3\pi)^2 + \dots) = 1/3!$ or

$(1/\pi^2)(1 + 1/2^2 + 1/3^2 + 1/4^2 + \dots) = 1/6$

We can multiply both sides by $\pi^2$ to (finally) get Euler’s beautiful result for the sum of the reciprocals of the squares

$(1 + 1/2^2 + 1/3^2 + 1/4^2 + \dots) = \pi^2/6$

All this may strike you as rather fluky as indeed it is. The great Euler managed to get away with things that the university student of today would be slapped down for. The slippery part of the proof is where Euler casually assumes that the infinite product $x(1 - x^2/\pi^2)(1 - x^2/4\pi^2)(1 - x^2/9\pi^2) \dots$ is not going to ‘blow up’ giving an infinite result as it well might. There are in fact rather few infinite products (as opposed to sums) that do converge to a limit but this happens to be one of them (the proof is extremely tricky and will not even be sketched out here). Euler, though he did not have the training in analytic rigour that became part and parcel of nineteenth century advanced mathematics, had enormous experiential knowledge of numbers and functions and had developed a profound (but not infallible) instinct for what was safe and what was not, in much the same way as great engineers like Isambard Kingdom Brunel and his lesser known father could tell by a casual glance at a design whether a proposed bridge was likely to fail or not. Needless to say, summing the reciprocals of the squares using a modern computer does lead to the stated result, indeed it is one possible way of determining the value of π (though too slow to be actually employed).
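Just how slow this method is can be seen from a short experiment. This is a sketch under my own naming (`pi_from_squares` is hypothetical): it turns Euler's result round, estimating π as the square root of six times a partial sum of the reciprocals of the squares.

```python
import math


def pi_from_squares(terms: int) -> float:
    """Estimate pi as sqrt(6 * (1 + 1/2^2 + ... + 1/terms^2))."""
    partial_sum = sum(1.0 / (k * k) for k in range(1, terms + 1))
    return math.sqrt(6 * partial_sum)


for terms in (10, 1_000, 100_000):
    estimate = pi_from_squares(terms)
    # The estimate always falls a little short, since terms have been left out.
    print(terms, estimate, math.pi - estimate)
```

Each hundredfold increase in the number of terms buys only about two more correct decimal places, which is why nobody actually computes π this way.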
