Polynomial Regression
The theory, math and how to calculate polynomial regression.
An Algorithm for Polynomial Regression
We wish to find a polynomial function that gives the best fit to a sample of data. We will consider polynomials of degree n, where n is in the range 1 to 5. In other words, we will develop techniques that fit linear, quadratic, cubic, quartic and quintic regressions. In fact, the technique will work for a polynomial of any order. The fitted function has the form

$$\hat{y} = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$$

where $a_0, a_1, \dots, a_n$ are the coefficients we need to determine. The function will only approximate the values that we have in our sample data, and consequently there will be a residual error $\epsilon$:

$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \epsilon$$
Let us consider a set of $m$ sample observations $(x_i, y_i)$ where $i = 1, 2, \dots, m$. For each observation we will have an equation:

$$y_i = a_0 + a_1 x_i + a_2 x_i^2 + \dots + a_n x_i^n + \epsilon_i$$
As a matrix equation polynomial regression becomes:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} = \begin{bmatrix} 1 & x_1 & x_1^2 & \dots & x_1^n \\ 1 & x_2 & x_2^2 & \dots & x_2^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m & x_m^2 & \dots & x_m^n \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_m \end{bmatrix}$$

Or more conveniently as:

$$\mathbf{y} = X\mathbf{a} + \boldsymbol{\epsilon}$$

The residual error $\boldsymbol{\epsilon}$ should be a small, randomly distributed term that we seek to minimize. Because of this we will solve the equation by setting $\boldsymbol{\epsilon} = \mathbf{0}$ and solving for $\mathbf{a}$.
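As a concrete illustration (not part of the original article), the design matrix $X$ can be built directly from the observed $x$ values. The sketch below assumes NumPy; the function name `design_matrix` and the sample numbers are purely illustrative.

```python
import numpy as np

def design_matrix(x, degree):
    """Build the m x (degree + 1) matrix whose columns are 1, x, x^2, ..., x^degree."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x ** k for k in range(degree + 1)])

# Illustrative values only: a cubic design matrix for four observations.
X = design_matrix([1.0, 2.0, 3.0, 4.0], degree=3)
print(X.shape)  # (4, 4)
```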
Squaring the Matrix
The matrix $X$ has $m$ rows and $n+1$ columns and so is probably not square, because we usually have many more observations than the degree of the polynomial. This means that we can't calculate an inverse of $X$, because only square matrices can be inverted.

The trick we use here is to multiply the equation by the transpose of $X$, which we will denote by $X^T$. When we transpose a matrix we simply swap its rows and columns.
Firstly, let us multiply our equation on both sides by $X^T$ to give us

$$X^T \mathbf{y} = X^T X \mathbf{a}$$

Let us think about the dimensions of the last equation. The matrix $X$ has dimensions $m \times (n+1)$, so the matrix $X^T$ must have dimensions $(n+1) \times m$, and the product $X^T X$ will have dimensions $(n+1) \times (n+1)$ and so is forced to be a square matrix. The size of this matrix depends on the polynomial we wish to fit. For example, if we are fitting a quadratic regression it will be a $3 \times 3$, or three-by-three, matrix.

$\mathbf{y}$ and $\mathbf{a}$ are vectors with $m$ and $n+1$ members respectively, so they have dimensions $m \times 1$ and $(n+1) \times 1$. We can now see the dimensionality of the full equation:

$$\underbrace{X^T}_{(n+1) \times m} \; \underbrace{\mathbf{y}}_{m \times 1} \;=\; \underbrace{X^T X}_{(n+1) \times (n+1)} \; \underbrace{\mathbf{a}}_{(n+1) \times 1}$$

So both sides of the equation end up as vectors with $n+1$ numbers. This means we have a series of $n+1$ linear equations from which we can derive $\mathbf{a}$. This obviously makes sense: to fit a polynomial regression of degree $n$ we have $n$ coefficients of $x$ plus a constant value, which contributes the $+1$.
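A quick sanity check of these dimensions (an illustrative sketch, not from the original article, assuming NumPy and made-up values) confirms that $X^T X$ is always square:

```python
import numpy as np

m, n = 50, 2                       # 50 observations, quadratic fit (degree 2)
x = np.linspace(0.0, 1.0, m)       # illustrative x values
X = np.column_stack([x ** k for k in range(n + 1)])

print(X.shape)                     # (50, 3)  -> m x (n+1)
print((X.T @ X).shape)             # (3, 3)   -> (n+1) x (n+1), square
print((X.T @ np.ones(m)).shape)    # (3,)     -> n+1 entries, one per linear equation
```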
Solving the Equation
$X^T X$ is square so we can invert it. Consequently we multiply our equation by the inverse of $X^T X$, which we show as $(X^T X)^{-1}$:

$$(X^T X)^{-1} X^T \mathbf{y} = (X^T X)^{-1} X^T X \, \mathbf{a}$$

but $(X^T X)^{-1} X^T X$ is simply the identity matrix, so the right side is just $\mathbf{a}$:

$$\mathbf{a} = (X^T X)^{-1} X^T \mathbf{y}$$
We can look closer at the two main terms of this last equation:

$$X^T X = \begin{bmatrix} m & \sum x_i & \sum x_i^2 & \dots & \sum x_i^n \\ \sum x_i & \sum x_i^2 & \sum x_i^3 & \dots & \sum x_i^{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ \sum x_i^n & \sum x_i^{n+1} & \sum x_i^{n+2} & \dots & \sum x_i^{2n} \end{bmatrix}$$

Similarly

$$X^T \mathbf{y} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^n y_i \end{bmatrix}$$

Given our input data $x_i$ and $y_i$ we can easily calculate and fill these matrices and complete the equation.
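To make this concrete, here is a small sketch (not from the original article) that fills $X^T X$ and $X^T \mathbf{y}$ from the sums above and then solves for the coefficients. The function name `fit_polynomial` and the use of NumPy are assumptions for illustration; `np.linalg.solve` is used rather than forming the inverse explicitly, which is the usual practical choice.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Fit y = a0 + a1*x + ... + an*x^degree by the normal equations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # X^T X: entry (j, k) is sum_i x_i^(j+k)
    power_sums = [np.sum(x ** p) for p in range(2 * degree + 1)]
    xtx = np.array([[power_sums[j + k] for k in range(degree + 1)]
                    for j in range(degree + 1)])

    # X^T y: entry j is sum_i x_i^j * y_i
    xty = np.array([np.sum((x ** j) * y) for j in range(degree + 1)])

    # Solve (X^T X) a = X^T y for the coefficient vector a
    return np.linalg.solve(xtx, xty)
```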
Worked Example
The above theory is quite hard to follow, so a short worked example helps to illustrate how the numbers all work together.

Firstly, we need some observations. We have 5 observations and we can fit a linear regression:

So this suggests that the fitted function would be a good linear regression for the data. We can see whether this has worked by plotting our observations on a chart as blue dots and the fitted function as a red line. This is what we see when we do this.
*(Chart: the observations plotted as blue dots with the fitted linear regression drawn as a red line.)*
We can clearly see that the fit looks quite good. However, if we repeat the analysis but try to fit a quadratic regression instead, we get this.
The charts now look like this:
*(Chart: the observations plotted as blue dots with the fitted quadratic regression drawn as a red line.)*
Clearly the fitted curve is now a perfect fit, as the residuals are all zero and the $R^2$ is one.
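The article's original observations are not reproduced here, but the same workflow can be sketched with some illustrative data of our own (the numbers and function names below are placeholders chosen for this example, not the article's). The quadratic fit recovers the generating coefficients exactly, while the linear fit is good but imperfect:

```python
import numpy as np

# Illustrative observations (placeholders, not the article's data): y is exactly quadratic in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 5.0, 10.0, 17.0, 26.0])   # y = 1 + x^2

def fit_and_r2(x, y, degree):
    """Fit by the normal equations and report the R^2 of the fit."""
    X = np.column_stack([x ** k for k in range(degree + 1)])
    a = np.linalg.solve(X.T @ X, X.T @ y)     # a = (X^T X)^{-1} X^T y
    y_hat = X @ a
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
    return a, r2

print(fit_and_r2(x, y, degree=1))  # a reasonable straight-line fit, R^2 < 1
print(fit_and_r2(x, y, degree=2))  # coefficients ~ [1, 0, 1], R^2 ≈ 1
```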
Final Comment
The technique that we outlined here is simple and it works. However, as we worked through the examples we started to see some of the problems with this technique. The matrices are filled with powers of the observations, so the numbers grow very quickly. If you look at the final multiplication, we have an inverse matrix with small numbers multiplied by a vector with large numbers, and the result has reasonably sized numbers. This is where the technique has a problem: when a computer runs through the arithmetic there can be issues with overflow and the propagation of rounding errors.
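One way to see this numerical fragility (an illustrative sketch, not from the article) is to look at the condition number of $X^T X$, which grows rapidly with the polynomial degree; a large condition number means small rounding errors can be greatly amplified when the system is solved or inverted.

```python
import numpy as np

x = np.linspace(1.0, 10.0, 50)   # illustrative x values

for degree in range(1, 6):
    X = np.column_stack([x ** k for k in range(degree + 1)])
    cond = np.linalg.cond(X.T @ X)
    print(f"degree {degree}: condition number of X^T X ~ {cond:.3e}")

# The condition number grows by orders of magnitude with each extra degree,
# which is why numerical libraries usually solve least squares with QR or SVD
# rather than forming X^T X explicitly.
```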