Lecture 16 Math221 

Least Squares Fit of Polynomials to Data 

 

As you know from lab classes in Physics or Chemistry you are ask to plot data.  Many times you know that the data should fit a specific type of equation say  one variable varies directly as another.   

This type of relationship is then a straight line with one of the variables the independent variable and the second the dependent variable.  For example s=at + b.  Here you need to figure out the values of a and b from the data.  The one problem that you have is that the data reading will not be exact so the points that you receive from the data will not lie exactly on the straight line.   We would like to develop a technique to find the best fit to the data in a systematic way.   The following type of situation will be considered.   

 

We will be given four  points (Typesetting:-mrow(Typesetting:-mi() , (Typesetting:-mrow(Typesetting:-mi(), ( Typesetting:-mrow(Typesetting:-mi(), (Typesetting:-mrow(Typesetting:-mi() which should be fit by a straight line of the form y = ax + b.  The points will be taken quite a ways off from the desired straight line so we can view the desired results.   

The errors will satisfy the following set of equations. 

Typesetting:-mrow(Typesetting:-msub(Typesetting:-mi(Typesetting:-mrow(Typesetting:-mi(Typesetting:-mrow(Typesetting:-mi(Typesetting:-mrow(Typesetting:-mo(
We would like to find the values of a and b so that  Typesetting:-mrow(Typesetting:-mi( is a minimum.  We can use calculus to do this but it require a knowledge of partial derivative which is not require in this class.  We will take a linear algebra point of view and look at the equation above as a system of equation which will not have a solution but we will look at the meaning from the point of view of projections.
 

 

Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi(= Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( .  The experiment will have error so the right hand side of the equation will not in general be in the span of the columns of the coefficient matrix.  If we project the right-hand side vector onto the span of the columns vectors of the coefficient matrix we will be able to find the projection vector.  This projection vector will be the least square solution because it is the one with the shortest distance in the Euclidean sense which is exactly what we are looking for in the analysis above.  i.e  

 

Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi(= Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi(
Since the columns of the coefficient matrix are linearly independent then we can solve the above system for a unique vector Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( which will give the least squares fit to the data.
 

Example 1  Linear Fit to a Set of Data. 

Consider the following points.  We would like to find the best linear least squares fit to the data 

[0,1],[1,1.4] and [2,1.9] 

> with(LinearAlgebra):
 

 

> A:=<<1,1,1>|<0,1,2>>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> y:=<1,1.4,1.9>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> x:=<0,1,2>;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.1.1)
 

>
 

> AT:=A^%T;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATA:=AT.A;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATb:=AT.y;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATAI:=MatrixInverse(ATA);
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> s:=ATAI.ATb;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> b:=s[1];
 

.983333333333333836 (1.1.2)
 

> a:=s[2];
 

.449999999999999732 (1.1.3)
 

> with(plots):
 

> plot([[x[1],y[1]],[x[2],y[2]],[x[3],y[3]]], style=POINT);
 

Plot_2d
 

> data_list1:=[[x[1],y[1]],[x[2],y[2]],[x[3],y[3]]];
 

[[0, 1], [1, 1.4], [2, 1.9]] (1.1.4)
 

> PointPlot:=plot(data_list1,style=POINT):
 

> formula:=a*xx+b;
 

`+`(`*`(.449999999999999732, `*`(xx)), .983333333333333836) (1.1.5)
 

> CurvePlot:=plot(formula, xx=0..2.5):
 

> display([PointPlot,CurvePlot],title=`Data points & least squares curve fit`);
 

Plot_2d
 

>
 

So the best least square linear fit to the equation is given by  

y=.45x + .9833 

 

 

Example 2   

We will look at extending this to a polynomial of degree two or a quadratic equation. 

The equation will be of the form  Typesetting:-mrow(Typesetting:-mi( .  The date that we will try to fit the data to is the following: [-1,1],[0,0],[1,1] and [2,5] 

The matrix system will look like the following 

Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( = Typesetting:-mrow(Typesetting:-mfenced(Typesetting:-mrow(Typesetting:-mi( so   we will repeat the above example but with a larger coefficient matrix. 

 

 

> x:=<-1,0,1,2>;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.2.1)
 

> x2:=<x[1]^2,x[2]^2,x[3]^2,x[4]^2>;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.2.2)
 

> ones:=<1,1,1,1>;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.2.3)
 

> A:=<ones|x|x2>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

 

> y:=<1,0,1,5>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> AT:=A^%T;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATA:=AT.A;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATbb:=AT.y;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

The inverse is not that easy to find in this case so we will use gausselimination or gaussjord to solve the problem 

 

 

> s:=BackwardSubstitute(GaussianElimination(<ATA|ATbb>));
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> c:=s[1];
 

-`/`(3, 20) (1.2.4)
 

> b:=s[2];
 

`/`(1, 20) (1.2.5)
 

> a:=s[3];
 

`/`(5, 4) (1.2.6)
 

> data_list2:=[[x[1],y[1]],[x[2],y[2]],[x[3],y[3]],[x[4],y[4]]]:
 

> formula:=a*xx^2 + b*xx + c;
 

`+`(`*`(`/`(5, 4), `*`(`^`(xx, 2))), `*`(`/`(1, 20), `*`(xx)), `-`(`/`(3, 20))) (1.2.7)
 

> CurvePlot:=plot(formula,xx=-1.1..2.1):
 

> PointPlot:=plot(data_list2,style=POINT):
 

> display([PointPlot,CurvePlot],title=`Data points & least squares cure fit`);
 

Plot_2d
 

The ` used in the above display command is the character under ~ not the single quote.   

 

 

Typesetting:-mrow(Typesetting:-mi(. 

 

We will look at the error for this problem.  It will apply to all problems .  The error will be the two norm of the difference between the points y values and the calculated y values at the data points.    The error below is called the least squares error.    

 

 

> ones:=<1,1,1,1>;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.2.8)
 

> ys:=a*x2+b*x+c*ones;
 

Typesetting:-mrow(Typesetting:-mverbatim( (1.2.9)
 

> error1:=Norm(y-ys,2);
 

`+`(`*`(`/`(1, 10), `*`(`^`(5, `/`(1, 2))))) (1.2.10)
 

Least Square fit to the Equation of a Plan 

We will look at a system of points that are three or higher space and try to fit a plan through the data points or what is called a hyperplane in higher than three dimensions.  

Typesetting:-mrow(Typesetting:-mi(
[1,1,2],[1,2,1],[2,1,3], and [1,0,2] the coefficient matrix  will be given by
 

> restart:
 

I am using restart because I wand to use a[0], a[1] and a[2] but I have alread used a which Maple will assume is a[0]. 

> with(LinearAlgebra):
 

 

> AA:=<<1,1,1,1>|<1,1,2,1>|<1,2,1,0>>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> sol:=<a[0],a[1],a[2]>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> bbb:=<2,1,3,2>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> AAT:=AA^%T;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> AATAA:=AAT.AA;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> AATbbb:=AAT.bbb;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> sol1:=BackwardSubstitute(GaussianElimination(<AATAA|AATbbb>));
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

The plane is given by Typesetting:-mrow(Typesetting:-mi(   

Example 3 

In this example we will look at a system of equation of the form Ax = b 

where A and b are given but the vector b does not live in the column space of A so there is no true solution but we can follow the above process to find what will be referred to as a Least Squares Solution.   

 

 

> A:=<<0,-1,-2>|<1,2,1>>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> b1:=<3,2,2>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> X:=<x,y>;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

We want to find the least squares solution to AX = b1 

 

> ATT1:=A^%T;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATT1A:=ATT1.A;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> ATT1b1:=ATT1.b1;
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

> sol:=BackwardSubstitute(GaussianElimination(<ATT1A|ATT1b1>));
 

Typesetting:-mrow(Typesetting:-mverbatim(
 

i.e. x= 0 and y = 3/2.    

 

Exercises 2.6 Part 1 

Exercise 2.6.1 (b), (e)  Exercise 2.6.4 (a).  Graph the data and the straight line one one graph.  Find the least squares error.   

 

Exercise 2.6.6 (b)  Again graph the data and the quadratic polynomial that is the best least squares fit to the data. Also, find the least squares error    Problems are due on October 29, 2008