Iteration methods for finding roots of polynomials, numerical solution of polynomials, synthetic division, Horner's method for evaluating polynomials, Horner's method for finding roots, Newton's method for upper bounds of roots.
Iteration methods for approximate roots of polynomials
In this process of finding roots of polynomials, we hazard a reasonable guess for a real root and employ an iteration formula to obtain the next approximation, or correction of error. A little preparation is required before adopting the process, such as the rational root theorem, the intermediate value theorem and synthetic division, dealt with hereafter in rudiments.
Rational root theorem
Suppose the polynomial $a_n x^n + a_{n-1} x^{n-1} + \dots + a_0$ has all integer coefficients. If not, we can multiply throughout by the LCM of the denominators of all coefficients. Suppose it has a rational root $\frac{p}{q}$, where $p$ and $q$ have no common factors, i.e., they are coprime.
So $a_n\left(\frac{p}{q}\right)^n + a_{n-1}\left(\frac{p}{q}\right)^{n-1} + \dots + a_0 = 0$.
Or, $a_n p^n + a_{n-1} p^{n-1} q + \dots + a_0 q^n = 0$.
Or, $p\,(a_n p^{n-1} + a_{n-1} p^{n-2} q + \dots + a_1 q^{n-1}) = -a_0 q^n$.
Since $p$ divides the left side, it must divide the right side $a_0 q^n$ also. But $p$ does not divide $q^n$, as it does not divide $q$ by assumption. Therefore $p$ divides $a_0$.
Similarly, $a_n p^n + a_{n-1} p^{n-1} q + \dots + a_0 q^n = 0$ gives
$a_{n-1} p^{n-1} q + \dots + a_0 q^n = -a_n p^n$.
Or, $q\,(a_{n-1} p^{n-1} + \dots + a_0 q^{n-1}) = -a_n p^n$.
Now $q$ must divide $a_n$, since it does not divide $p$ and therefore does not divide $p^n$.
So the rational root theorem says: the denominator of a rational root must divide the leading coefficient, and the numerator of the root must divide the constant term.
This fact can be used to determine the rational roots of a polynomial, if any, by analyzing the factors of the leading coefficient and the constant term. The integer root theorem is the special case of the rational root theorem when the leading coefficient is 1: if a monic polynomial has any integer root, it must appear as a factor of the constant term. As an application, if a polynomial has an integer or rational root $r$, it can be found by factoring the leading coefficient and the constant term. The factor $x - r$ can then be taken out of the polynomial, reducing its degree by one, just by dividing the polynomial by $x - r$.
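As an application sketch, the theorem yields a finite list of candidate roots $\frac{p}{q}$ that can be tested one by one. The following Python helper (the naming and structure are ours, not from the text) does exactly that, evaluating each candidate by Horner's rule:

```python
from math import gcd

def rational_roots(coeffs):
    """Find all rational roots of a polynomial with integer coefficients.

    coeffs lists the coefficients [a_n, ..., a_1, a_0] in descending order.
    By the rational root theorem every rational root p/q (in lowest terms)
    has p dividing a_0 and q dividing a_n.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    if a_0 == 0:                        # x = 0 is a root; factor it out first
        return [0.0] + rational_roots(coeffs[:-1])

    def divisors(m):
        m = abs(m)
        return [d for d in range(1, m + 1) if m % d == 0]

    roots = []
    for p in divisors(a_0):
        for q in divisors(a_n):
            if gcd(p, q) != 1:
                continue
            for cand in (p / q, -p / q):
                val = 0.0               # evaluate by Horner's rule
                for c in coeffs:
                    val = val * cand + c
                if val == 0 and cand not in roots:
                    roots.append(cand)
    return roots

# 2x^3 - 3x^2 - 3x + 2 has rational roots 2, -1 and 1/2
print(sorted(rational_roots([2, -3, -3, 2])))  # [-1.0, 0.5, 2.0]
```

The search is brute force, but the candidate list is always finite, which is the whole point of the theorem.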
Descartes' rule of signs
There can be no more positive roots of the polynomial $f(x)$ than the number of changes of sign in the polynomial written in descending order of powers. Similarly, the maximum number of negative roots of a polynomial $f(x)$ is the number of changes of sign in $f(-x)$. Since imaginary roots of a real polynomial occur in conjugate pairs, their number must be even; it is obtained after deducting the numbers of positive and negative roots from the degree.
Formal proof may also be given later.
Explanation:
Suppose a polynomial with one minus-sign coefficient, such as
$a_n x^n + a_{n-1} x^{n-1} - a_{n-2} x^{n-2} + \dots + a_0$,
is multiplied by a factor $x - a$ where $a$ is positive. In sketch, writing only the signs of the coefficients:

+ + - + + ……… +
- + - - ……… -
+ ± ∓ ± ± ……… ±

It may be observed that there is at least one more sign change, from the 2nd place to the 3rd place. Thus for every additional positive root there is at least one more sign change in the polynomial. In other words, there can be no more positive roots of the polynomial than the changes of sign in the polynomial written in descending order of powers.
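The rule can be sketched as a small sign-change counter in Python (function names are ours):

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient list written in descending powers.

    By Descartes' rule of signs this bounds the number of positive roots.
    """
    signs = [c for c in coeffs if c != 0]   # zero coefficients are skipped
    return sum(1 for a, b in zip(signs, signs[1:]) if a * b < 0)

def max_positive_roots(coeffs):
    return sign_changes(coeffs)

def max_negative_roots(coeffs):
    # f(-x): negate the coefficients of the odd powers
    n = len(coeffs) - 1
    flipped = [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]
    return sign_changes(flipped)

# x^3 - 10x^2 - 11x - 100: one sign change, so at most one positive root;
# f(-x) = -x^3 - 10x^2 + 11x - 100 has two, so at most two negative roots
print(max_positive_roots([1, -10, -11, -100]))  # 1
print(max_negative_roots([1, -10, -11, -100]))  # 2
```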
Synthetic Division
This is nothing but the process of long division written in brief. An example below:
Divide $x^6 - 2x^4 + 3x^3 - x^2 - 2x + 6$ by $x - 2$.
Just write the coefficients of the polynomial, $1, 0, -2, 3, -1, -2, 6$, including zeros for missing powers, if any. Write the 2 from $x - 2$ to its below left and make a dashed line.
The first coefficient 1 is carried down as the first member of the quotient. This carried-down 1 is multiplied by the 2 in the divisor, raised up, and added to the 0 carried down. Next, this 2 is again multiplied by 2, raised up, written below the $-2$, added to it and carried below, and so on along the row.
So the quotient is $x^5 + 2x^4 + 2x^3 + 7x^2 + 13x + 24$ and the remainder is the last term, 54. This evaluates the polynomial at $x = 2$, as per the remainder theorem (division algorithm):
$f(x) = d(x)(x - a) + f(a)$.
When the polynomial $f(x)$ is divided by $x - a$, the quotient is $d(x)$ and the remainder is $R = f(a)$. This is the value $f(a)$ of the polynomial at $x = a$. The polynomial is divisible by $x - a$ if the remainder $R = f(a)$ is 0. In fact, with a little practice, this can be done all in one step. So synthetic division is used primarily for evaluation of a polynomial at a point. If $R = f(a) = 0$, then $f(x) = d(x)(x - a)$ and $f(x)$ is divisible by $x - a$ (factor theorem).
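The tableau arithmetic can be written as a short routine; a minimal Python sketch (our naming, not from the text):

```python
def synthetic_division(coeffs, a):
    """Divide a polynomial by (x - a) using synthetic division.

    coeffs lists coefficients in descending order, including zeros for
    missing powers.  Returns (quotient_coeffs, remainder); by the
    remainder theorem the remainder equals f(a).
    """
    out = [coeffs[0]]                # the leading coefficient is brought down
    for c in coeffs[1:]:
        out.append(out[-1] * a + c)  # multiply up, add the next coefficient
    return out[:-1], out[-1]

# the worked example: x^6 - 2x^4 + 3x^3 - x^2 - 2x + 6 divided by x - 2
q, r = synthetic_division([1, 0, -2, 3, -1, -2, 6], 2)
print(q, r)  # [1, 2, 2, 7, 13, 24] 54
```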
Synthetic division can be exhibited as a formula. Note this key example below.
Let us divide $ax^2 + bx + c$ by $x - r$ through synthetic division, as under.
In the first step, $a$ is just brought down. In the 2nd step, $a$ is multiplied by $r$ and placed under $b$. Its total $ar + b$ is brought down and multiplied again by $r$, and the result $ar^2 + br$ is placed under $c$; the total $ar^2 + br + c$ is brought down. This is the final remainder. This can be generalized to get the remainder theorem $f(x) = d(x)(x - a) + R$, where $f(x)$ is a polynomial of degree $n$ and $d(x)$ is a polynomial of degree $n - 1$. Further, $R = f(a)$. In $f(x) = d(x)(x - a) + f(a)$, $d(x)$ is unique.
If $f(a) = 0$ the same result is called the factor theorem: $x - a$ is called a factor of $f(x)$ and $a$ is a root. Again dividing $f(x)$ by $x - b$, if $b$ is another root, we get another polynomial as quotient, of degree one less again. If the polynomial $f(x)$ has $n$ roots, the process stops after $n$ steps, and further division results in fractions. Thus a polynomial of degree $n$ has at most $n$ roots or $n$ factors and no more, albeit some of them may be repeated. Since the division process gives a unique quotient at each step, the polynomial has $n$ roots and no more and can be factorized in a unique way. This is the unique factorization theorem, in parlance with the unique factorization theorem in arithmetic. In other words, if the equation $f(x) = 0$ has more than $n$ distinct roots, it must be an identity, not an equation, and
shall be satisfied by every value of $x$. As an exercise you may prove that the quadratic expression referred to here is an identity, satisfied by every value of $x$: just find the values of the expressions for $x = a$, $x = b$, $x = c$. Thus a polynomial of degree $n$ has $n$ roots and no more; else every number satisfies it.
In another way, called Horner's method, put $x^5 + 2x^4 + 2x^3 + 7x^2 + 13x + 24$ in nested form in successive steps as follows:
$x^5 + 2x^4 + 2x^3 + 7x^2 + 13x + 24$
$= x^4(x + 2) + 2x^3 + 7x^2 + 13x + 24$
$= x^3(x(x + 2) + 2) + 7x^2 + 13x + 24$
$= x^2(x(x(x + 2) + 2) + 7) + 13x + 24$
$= x(x(x(x(x + 2) + 2) + 7) + 13) + 24$
In each successive step we take the highest available power to put the terms inside brackets, such as $x^5 + 2x^4 = x^4(x + 2)$, $x^4(x + 2) + 2x^3 = x^3(x(x + 2) + 2)$, etc. In evaluating the polynomial, one addition of a coefficient and one multiplication by $x$ per step greatly reduces the calculation, such as raising to powers.
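The nested form translates directly into a fold over the coefficient list; a one-line Python sketch of Horner evaluation:

```python
from functools import reduce

def horner(coeffs, x):
    """Evaluate a polynomial at x using Horner's nested form.

    One multiplication and one addition per coefficient; no powers are
    raised explicitly.  coeffs are in descending order of powers.
    """
    return reduce(lambda acc, c: acc * x + c, coeffs)

# x^5 + 2x^4 + 2x^3 + 7x^2 + 13x + 24 at x = 2
print(horner([1, 2, 2, 7, 13, 24], 2))  # 158
```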
Horner's method is equivalent to synthetic division
With a bit of analysis, Horner's method is seen to be equivalent to synthetic division, a process attributed to Ruffini. Things become clear if we write the long division and the synthetic division side by side and compare the two.
In synthetic division, the coefficients of the divisor $g(x)$ are written in the left-side column with their signs changed, and the coefficients of $f(x)$ are written in a horizontal row. The 1 at the top left of the column is not a coefficient, but just to show that the leading coefficient of $g(x)$ is nothing other than 1. Had it been 3, both the dividend and the divisor could have been divided by 3, i.e., factored by the same factor before starting the division, and then the process of synthetic division would have followed as usual.
As another example, to divide $4x^4 - 6x^3 + 3x - 5$ by $2x - 1$, please see below: the third row is the sum of the first and second rows, all divided by 2.
If we have to divide $f(x) = x^3 + 4x^2 - 5x + 5$ by $x - 3$, we present here long division, synthetic division and Horner's algorithm for comparison and understanding.
Horner's nested brackets
If this polynomial is to be evaluated at $x = 3$, we would divide $f(x)$ by $x - 3$ and the remainder would be $R = f(3)$:
$f(x) = (x - 3)\,g(x) + f(3)$.
The long division and synthetic division are as under.
If you continue the division process even after the remainder has been obtained, you get digits after the decimal place. This is also an example of the Paravartya sutra in Vedic Mathematics.
To explain how the method of synthetic division is equivalent to Horner's method, we show below that both are essentially equivalent to the division algorithm in the long division process.
Let the polynomial be
$f(x) = \sum_{i=0}^{n} a_i x^i$.
To evaluate $f(x_0)$, define a new sequence of constants $b_n, b_{n-1}, \dots, b_0$ by
$b_n = a_n$ and $b_k = a_k + b_{k+1} x_0$ for $k = n-1, \dots, 0$.
In view of the above, $b_0 = f(x_0)$, the value of the function at $x = x_0$, as may be seen by substituting the formulas for $b_1, b_2, \dots$ successively.
This is nothing but the usual division algorithm, so both Horner's method and synthetic division are equivalent to the division algorithm at large.
But $f(x_0)$ is the remainder when $f(x)$ is divided by $x - x_0$. By the division algorithm, $f(x) = g(x)(x - x_0) + f(x_0)$. So the value of the polynomial $f(x)$ at $x_0$ is the remainder $f(x_0)$ when the polynomial is divided by $x - x_0$.
By substituting the values of the constants $b_0, b_1, \dots$ in the polynomial, we get
$f(x) = (b_1 + b_2 x + b_3 x^2 + \dots + b_n x^{n-1})(x - x_0) + b_0$.
So $f(x_0) = b_0$. We can start at $b_n = a_n$ and use the recursive formula until we reach $b_0$. This is quickly achieved by doing synthetic division as described above.
To find the value of a polynomial by Horner's method or by synthetic division
Given a polynomial $f(x)$, we evaluate it at a point $x = a$, as per the remainder theorem, computed by Horner's method or synthetic division. If the remainder is 0, then $x = a$ is a root of the polynomial, or $x - a$ is a factor of $f(x)$, as per the factor theorem. Now the reverse problem is to find the roots or factors of the polynomial by the same method, using Horner's method or synthetic division.
The procedure is: hazard a guess for some $x = x_0$, a reasonably approximate root. It is particularly helpful for finding the largest root. Then divide the polynomial $f(x)$ by $x - x_0$; the remainder $f(x_0)$ is obtained by the above method. Our aim is now to make this remainder approximately 0 by recursively repeating the procedure. Straight and simple. The basic idea is: what happens when we reduce all the roots of the polynomial $f(x)$ uniformly by a number, say $r$?
(If $f(x) = Ax^3 + Bx^2 + Cx + D$, then $f(x + r) = A(x + r)^3 + B(x + r)^2 + C(x + r) + D$ keeps the form of the polynomial unchanged. If we divide it by $(x + r)$, by synthetic division or otherwise, then the remainder is $D$.)
To compute a root $x = r$ of the polynomial equation $f(x) = 0$, we choose a sequence $x_0, x_1, x_2, \dots$ such that $f(x_n) \to 0$. A computer will do everything for us, but it must be given some algorithm, a routine procedure of what to do, how to do it, when to start and when to stop. If $x_0$ leads to $x_1$, and $x_1$ leads to $x_2$, by means of some algorithm, and so on, two situations may arise: either we are nearing a real root or we are getting away from it.
What do the steps of difference $x_1 - x_0$, $x_2 - x_1$, etc. depend upon? They obviously depend upon the rate of change of the function, i.e., how the function changes from point to point, i.e., the differential coefficient of the function at the points: $f'(x_0)$, $f'(x_1)$, etc. Since
$f'(x_0) \approx \dfrac{f(x_1) - f(x_0)}{x_1 - x_0}$,
we have
$x_1 - x_0 \approx \dfrac{f(x_1) - f(x_0)}{f'(x_0)}$.
This is as if the function were linear in a small range near $x_0$. (The tangent at that point has two points on the curve arbitrarily close to each other.)
Supposing $f(x_1) = 0$, i.e., $x_1$ is exactly a root, the above expression becomes
$x_1 - x_0 \approx \dfrac{-f(x_0)}{f'(x_0)}$, or $x_1 \approx x_0 - \dfrac{f(x_0)}{f'(x_0)}$.
If $f(x_1) \neq 0$, we can take $x_2 \approx x_1 - \dfrac{f(x_1)}{f'(x_1)}$ as a new value that may make $f(x_2) = 0$, i.e., $x_2$ a root of the equation. If not, we go to $x_3 \approx x_2 - \dfrac{f(x_2)}{f'(x_2)}$, so that $x_3$ may be a root.
So the formula
$x_{n+1} \approx x_n - \dfrac{f(x_n)}{f'(x_n)}$
takes us to further and further approximations of a real root, on the condition that $f'(x_n) \neq 0$. We have to explore the conditions under which it goes near a root or goes away from it. This is Newton's method, or the Newton-Raphson method; Raphson further sharpened the tool.
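The iteration formula can be sketched in a few lines of Python (the stopping tolerance and iteration cap are our choices, not from the text):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        d = fprime(x)
        if d == 0:                      # derivative vanishes: the method fails
            raise ZeroDivisionError("f'(x) = 0 at x = %g" % x)
        x = x - fx / d
    return x

# root of x^3 - x - 2 near 1.5 (the same example used later for bisection)
root = newton(lambda x: x**3 - x - 2, lambda x: 3 * x**2 - 1, 1.5)
print(round(root, 5))  # 1.52138
```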
Newton's criterion for finding an upper limit of the roots
By Taylor's theorem,
$f(x + h) = f(h) + x f'(h) + \dfrac{x^2}{2!} f''(h) + \dots + \dfrac{x^n}{n!} f^{(n)}(h)$.
If $f(h)$ and all the derivatives $f'(h), f''(h), \dots, f^{(n)}(h)$ are positive, then $f(x + h) > 0$ for $x > 0$. This shows $h$ is higher than the highest root, i.e., $h$ is an upper bound for the roots.
We may find by trial some least integer $h_1$ such that $f^{(n-1)}(h_1) > 0$. If this $h_1$ does not also make $f^{(n-2)}(x) > 0$, then find by trial a greater integer $h_2 > h_1$ so that $f^{(n-2)}(h_2) > 0$, and so on down the derivatives. In this way we can find a least integer $h$ so that $f(h)$ and all the derivatives $f'(h), f''(h), \dots, f^{(n)}(h)$ are positive; then $f(x) > 0$ for $x > h$.
For example, if $f(x) = x^3 - 10x^2 - 11x - 100$, then
$f'(x) = 3x^2 - 20x - 11$, $f''(x) = 2(3x - 10)$, $f'''(x) = 6$,
and $f'''(x)$ is always positive. The least integer value of $x$ for which $f''(x) > 0$ is 4, that for $f'(x) > 0$ is 8, and that for $f(x) > 0$ is 12. Thus 12 exceeds all the roots, and the integer part of the highest root must be 11.
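The worked numbers can be checked with a simple trial search, as the text describes; `least_positive_from` is an illustrative helper of our own naming:

```python
def least_positive_from(f, start=0, limit=1000):
    """Smallest integer h >= start with f(h) > 0 (a simple trial search)."""
    for h in range(start, limit):
        if f(h) > 0:
            return h
    raise ValueError("no sign change found up to the limit")

f  = lambda x: x**3 - 10 * x**2 - 11 * x - 100
f1 = lambda x: 3 * x**2 - 20 * x - 11
f2 = lambda x: 6 * x - 20

print(least_positive_from(f2))  # 4
print(least_positive_from(f1))  # 8
print(least_positive_from(f))   # 12
```

Indeed $f(11) = -100 < 0 < f(12) = 56$, so the largest root lies between 11 and 12.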
Geometrically
Obviously $f'(x_0) = \tan\Psi \approx \dfrac{f(x_0)}{x_0 - x_1}$, as in the figure. It remains to see under what condition this approximation really converges to a root, assuming $f'(x)$ to be continuous, in addition to $f'(x) \neq 0$, in an interval around $x_0$. (These conditions are obviously satisfied in the case of polynomial functions.) Convergence is guaranteed if the following condition is satisfied.
Supposing $\alpha$ is a root of the polynomial, i.e., $f(\alpha) = 0$, Taylor's theorem with remainder gives
$f(\alpha) = 0 = f(x_n) + f'(x_n)(\alpha - x_n) + \dfrac{1}{2} f''(\xi_n)(\alpha - x_n)^2$
for some $\xi_n$ between $x_n$ and $\alpha$; the quantity $\alpha - x_n$ may be taken as the error in the $n$-th step. Rearranging the terms,
$\dfrac{f(x_n)}{f'(x_n)} + \alpha - x_n = \dfrac{-f''(\xi_n)}{2 f'(x_n)} (\alpha - x_n)^2$.
Putting $x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)}$, the last equation becomes
$\alpha - x_{n+1} = \dfrac{-f''(\xi_n)}{2 f'(x_n)} (\alpha - x_n)^2$.
If $\varepsilon_n = \alpha - x_n$ is the error in the $n$-th step, then $\varepsilon_{n+1} = \alpha - x_{n+1}$ is the error in the $(n+1)$-st step. So
$\varepsilon_{n+1} = \dfrac{-f''(\xi_n)}{2 f'(x_n)}\, \varepsilon_n^2$.
If $\left|\dfrac{f''(\xi_n)}{2 f'(x_n)}\right| = M < 1$, and the initial guess is close enough to the root, the errors in successive iterations definitely converge to 0. This is a sufficient condition for $\alpha$ to be a root, $f(\alpha) = 0$; it is the condition of quadratic convergence. At least good accuracy is achieved after a few steps of iteration.
What has this to do with synthetic division or Horner's method? Synthetic division quickly computes $f(x_n)$ at each step.
If the polynomial $f(x)$ has a double root at $x = \alpha$, the iterates close in on the root only slowly, diminishing the 'speed' of convergence from quadratic to linear. Naturally, enlarging the step would help restore the quadratic convergence. This is done by setting
$x_{n+1} = x_n - 2\,\dfrac{f(x_n)}{f'(x_n)}$,
and in case of multiplicity $m$ of the root we may set
$x_{n+1} = x_n - m\,\dfrac{f(x_n)}{f'(x_n)}$,
a concept akin to successive over-relaxation.
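A small experiment illustrates the effect of the enlarged step when the multiplicity is known; the polynomial $(x-1)^2(x+2)$, with a double root at $x = 1$, is our choice of example:

```python
def newton_step(f, fp, x, m=1):
    """One Newton step, scaled by the root multiplicity m."""
    fx = f(x)
    if fx == 0:
        return x                       # already at a root
    return x - m * fx / fp(x)

f  = lambda x: x**3 - 3 * x + 2        # (x - 1)^2 (x + 2): double root at x = 1
fp = lambda x: 3 * x**2 - 3

x_plain = x_scaled = 2.0
for _ in range(4):
    x_plain  = newton_step(f, fp, x_plain)         # plain step: linear convergence
    x_scaled = newton_step(f, fp, x_scaled, m=2)   # doubled step: quadratic again
print(round(x_plain, 2), round(x_scaled, 6))  # 1.08 1.0
```

After four steps the plain iteration is still about $0.08$ away from the root, its error roughly halving each step, while the doubled step has already converged to machine-level accuracy.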
Multiple roots
Suppose $f(x)$ has a multiple root; take $f(x) = (x - \alpha)^m g(x)$. On differentiation,
$f'(x) = m(x - \alpha)^{m-1} g(x) + (x - \alpha)^m g'(x)$.
So $f(x)$ and $f'(x)$ have a common factor $(x - \alpha)^{m-1}$. It therefore helps to find and analyze the roots of $f'(x)$ before analyzing those of $f(x)$.
Intermediate value theorem
Now, about the 'guess' of $x = x_0$. We may just find some $a, b$ with $f(a) < 0$ and $f(b) > 0$. Since $f(x)$ is a continuous real-valued function, it takes every value between $f(a)$ and $f(b)$ on the interval, and we may choose any point $x_0$ with $a < x_0 < b$. This is the intermediate value theorem: the function must pass through every value between $f(a)$ and $f(b)$, and obviously 0 lies between them if $f(a) < 0 < f(b)$.
In particular, if the function is monotonically increasing, $f'(x) > 0$ in the interval $a < x < b$, we can start at the beginning, $x_0 = a$; or if the function is monotonically decreasing, $f'(x) < 0$, we can start at the end point, $x_0 = b$, and convergence to a root is guaranteed in this case. A formal proof may be given later.
Example 1
To find the cube root of a number, take $f(x) = x^3 - a$. Immediately we have
$x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)} = x_n - \dfrac{x_n^3 - a}{3 x_n^2} = \dfrac{2 x_n}{3} + \dfrac{a}{3 x_n^2}$.
A suitable initialization shall serve within a few steps. Take caution not to fall into the ditch of taking $x_n = 0$, which blows up the iteration since $f'(0) = 0$.
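A Python sketch of this cube-root iteration, for positive $a$ (the initialization rule and step count are our choices; any nonzero start works):

```python
def cube_root(a, x0=None, steps=30):
    """Newton iteration for f(x) = x^3 - a:
    x_{n+1} = x_n - (x_n^3 - a)/(3 x_n^2) = 2 x_n / 3 + a / (3 x_n^2)."""
    x = x0 if x0 is not None else max(a, 1.0)   # avoid the forbidden x = 0
    for _ in range(steps):
        x = (2 * x + a / (x * x)) / 3
    return x

print(round(cube_root(27.0), 10))  # 3.0
```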
Example 2
In fact, this is a non-example; non-examples are as important as examples for understanding anything. Apply Newton's method to the natural logarithm: take the function $f(x) = \ln x$ and see how we run into trouble. The function is defined only for $x > 0$ and it has a root at $x = 1$. We have
$x_{n+1} = x_n - \dfrac{f(x_n)}{f'(x_n)} = x_n(1 - \ln x_n)$.
If we initialize at $x_0 = e$, the next iterate is 0 and the sequence does not proceed further. If we take $x_0 > e$, the next iterate is negative and the process halts, since $\ln x$ is undefined there.
As the Newton-Raphson method is slow to converge in some cases, sometimes oscillating, and sometimes fails on a condition, as seen above, more criteria and algorithms other than Newton's method were devised as time elapsed.
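A sketch reproducing the failure (the domain guard and step count are our additions):

```python
import math

def newton_ln_root(x0, steps=5):
    """Newton iteration for f(x) = ln x: x_{n+1} = x_n (1 - ln x_n).

    The iterate goes negative whenever x_n > e, so the method fails for
    such starting points even though f has a root at x = 1.
    """
    x = x0
    for _ in range(steps):
        if x <= 0:
            return None          # left the domain: the iteration has failed
        x = x * (1 - math.log(x))
    return x

print(round(newton_ln_root(1.5), 6))   # 1.0 -- converges to the root
print(newton_ln_root(3.0))             # None -- the first iterate is negative
```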
Other helpful theorems
1) Suppose $\alpha$ is a root of $f(x)$, so that $f(\alpha) = 0$. By Taylor's expansion,
$f(\alpha + h) = f(\alpha) + h f'(\alpha) + \dfrac{h^2}{2!} f''(\alpha) + \dots$,
$f'(\alpha + h) = f'(\alpha) + h f''(\alpha) + \dfrac{h^2}{2!} f'''(\alpha) + \dots$
If the first non-vanishing term $\dfrac{h^r}{r!} f^{(r)}(\alpha)$ in both series above is not 0, its sign is the sign of both $f(\alpha + h)$ and $f'(\alpha + h)$ for very small values of $h$, if $h > 0$. Please note that they are of opposite signs otherwise.
2) If $\alpha$ is a multiple root, then as $x$ increases through $x = \alpha$, the terms $f(x), f'(x), f''(x), \dots$ are all of the same sign immediately after $x = \alpha$, as shown above, but are alternately $+$ and $-$ immediately before $x = \alpha$.
3) The theorem for Descartes' rule of signs, as already shown.
Secant method
Instead of considering the tangent at the point $x_0$, let us consider the secant joining the points $(x_0, f(x_0))$ and $(x_1, f(x_1))$ on the curve. The point $x_2$ where the secant cuts the X-axis is given by the equation of the secant. Taking the slopes of the segments between $x_2, x_1$ and between $x_1, x_0$ to be equal, and assuming $f(x_2) \approx 0$, i.e., $x_2$ is very near the root, we get
$\dfrac{f(x_2) - f(x_1)}{x_2 - x_1} = \dfrac{f(x_0) - f(x_1)}{x_0 - x_1}$,
Or, $x_2 = x_1 - f(x_1)\,\dfrac{x_1 - x_0}{f(x_1) - f(x_0)}$,
and the recurrence relation becomes
$x_n = x_{n-1} - f(x_{n-1})\,\dfrac{x_{n-1} - x_{n-2}}{f(x_{n-1}) - f(x_{n-2})}$.
A closely related bracketing variant is called regula falsi, or the method of false position. Again, if the points $x_2, x_1, x_0$ are not very close to the root, the problems of slow convergence and oscillation reappear, but convergence is obtained when the iterates bracket a real root. The order of convergence is not quadratic, but more than linear.
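A minimal Python sketch of the secant recurrence (the tolerance and iteration cap are our choices):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant iteration:
    x_n = x_{n-1} - f(x_{n-1}) (x_{n-1} - x_{n-2}) / (f(x_{n-1}) - f(x_{n-2}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol:
            return x1
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
    return x1

root = secant(lambda x: x**3 - x - 2, 1.0, 2.0)
print(round(root, 5))  # 1.52138
```

No derivative is needed, which is the method's practical advantage over Newton's.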
Bisection method
If instead we simply take the midpoint
$x_2 = \dfrac{x_0 + x_1}{2}$,
we get the bisection method. The method is useful when $f(x_0)$ and $f(x_1)$ are of opposite signs, so that by the intermediate value theorem there is a root between them, and $f(x_2)$ is a step toward it. At each step the half-interval on which the sign change persists is retained, and the iteration equation becomes
$x_{n+2} = \dfrac{x_n + x_{n+1}}{2}$.
Example
Take $f(x) = x^3 - x - 2$. If $x_0 = 1$ and $x_1 = 2$, then $f(1) = -2$ and $f(2) = +4$ are of opposite signs. Since it is a continuous function, by the intermediate value theorem there is a root in the interval $[1, 2]$.
Now $x_2 = \dfrac{x_0 + x_1}{2} = \dfrac{1 + 2}{2} = 1.5$,
so $f(1.5) = (1.5)^3 - 1.5 - 2 = -0.125$, while $f(2) = +4$.
Hence we would further search in $[1.5, 2]$, with $x_3 = \dfrac{1.5 + 2}{2} = 1.75$, and find $f(1.75)$.
Continuing this iteration for 15 steps we get $x_{16} = 1.5214$, where $f(x_{16}) \approx 0.0001$, sufficiently near 0, and it could practically be taken as a root.
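The example can be reproduced with a minimal bisection sketch (the fixed step count is our choice):

```python
def bisect(f, a, b, steps=30):
    """Bisection: f(a) and f(b) must have opposite signs; keep halving
    the interval on which the sign change persists."""
    fa = f(a)
    assert fa * f(b) < 0, "need a sign change on [a, b]"
    for _ in range(steps):
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:       # the root lies in [a, m]
            b = m
        else:                  # the root lies in [m, b]
            a, fa = m, fm
    return (a + b) / 2

root = bisect(lambda x: x**3 - x - 2, 1.0, 2.0)
print(round(root, 4))  # 1.5214
```

Each step halves the interval, so the error after $n$ steps is at most $(b - a)/2^n$: slow but utterly reliable.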
Picard's fixed point iteration process
The roots of $f(x) = 0$ are the points where the graph of $y = f(x)$ meets the X-axis. We may pretend to 'solve' the equation by extracting an $x$ from $f(x) = 0$, rewriting it as $x = g(x)$; the roots are then the points where the graphs of $y = g(x)$ and $y = x$ meet.
Simply, if we set $F(x) = g(x) - x$ on an interval $[a, b]$ with $F(a) \geq 0$ and $F(b) \leq 0$, then, as the function is continuous, it takes every value in between, so there is a point $c \in [a, b]$ with $F(c) = 0$, which is a fixed point in the interval. This is by virtue of the intermediate value theorem described elsewhere in this book.
A sufficient condition for the iteration to converge is $|g'(x)| < 1$ near the fixed point; in other words, the slope of $y = g(x)$ must be smaller in magnitude than the slope of $y = x$. Obviously it is tacitly assumed that $g$ has a continuous derivative in the interval. The following examples shall make this point clearer. (It may be noted that $y = x$ is the line on which any function and its inverse function, if it exists, meet, since this line does not change on interchanging $y$ and $x$.)
For example, let $f(x) = x^2 - 2x - 8 = 0$; we can write it as $x = g(x) = \dfrac{x^2 - 8}{2}$ from the same function. The roots of $f(x) = 0$ are the same points where the graphs of $y = g(x)$ and $y = x$ meet. Simple. Having known the function $x = g(x)$, we can choose a point $x_0$ near a root, as may be known from the intermediate value theorem, and put the iteration equation as
$x_1 = g(x_0),\; x_2 = g(x_1),\; x_3 = g(x_2)$, and so on: $x_{n+1} = g(x_n)$.
In this particular example, taking $x_0 = 5$, we get
$x_1 = g(5) = \dfrac{5^2 - 8}{2} = 8.5$,
$x_2 = g(8.5) = \dfrac{8.5^2 - 8}{2} = 32.125$, and so on.
This form $x = g(x) = \dfrac{x^2 - 8}{2}$ does not work: the iterates $x_n$ diverge.
Next we take $x = h(x) = \dfrac{2x + 8}{x}$. The iteration
$x_{n+1} = \dfrac{2 x_n + 8}{x_n}$,
starting from $x_0 = 4.5$, oscillates about the root and closes in only slowly.
Next let us try $x = g(x) = \sqrt{2x + 8}$ and the iteration equation $x_{n+1} = \sqrt{2 x_n + 8}$. Starting at $x_0 = 5$, the successive iterations are:
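A short sketch generates these iterates (the step count is our choice); note how quickly they settle on the root $x = 4$:

```python
import math

def fixed_point(g, x0, steps=20):
    """Picard iteration x_{n+1} = g(x_n); it converges when |g'(x)| < 1
    near the fixed point."""
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

# x^2 - 2x - 8 = 0 rewritten as x = sqrt(2x + 8); fixed point at x = 4
iterates = fixed_point(lambda x: math.sqrt(2 * x + 8), 5.0)
print([round(x, 4) for x in iterates[:4]])
print(round(iterates[-1], 6))  # 4.0
```

Here $g'(x) = 1/\sqrt{2x + 8}$, so $|g'(4)| = \frac{1}{4} < 1$ and the error shrinks by about a factor of 4 each step.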