How To Represent Polynomials
Polynomials are mathematical functions with a lot of nice properties. They are easy to evaluate, differentiate, integrate, interpolate, multiply together, and more. Therefore it shouldn't surprise anyone that there are algorithms for approximating more complex functions with polynomials.
While those algorithms are not complicated, the first problem to solve is the following: how can we represent polynomials in C++?
One possible solution could be to create a set of functions that would represent the basis of the polynomial space:
double x0(double x){
    return 1;
}
double x1(double x){
    return x;
}
double x2(double x){
    return x*x;
}
double x3(double x){
    return x*x*x;
}
// ....
And then define our function as
double fun(double x){
    return 5*x3(x) + 2*x1(x) - 4*x0(x);
}
This approach has multiple drawbacks:

The basis of the polynomial space is not finite

It is cumbersome to write those functions

Reading those functions is cumbersome too

We have lost all the nice properties of polynomial functions

We cannot evaluate those functions efficiently; writing algorithms like Horner's rule becomes almost impossible.
I know that MATLAB and Octave used to represent polynomials as an array of coefficients, or at least they provided an interface to manage those coefficients as if the polynomial were an array.
Therefore I decided that it was not a bad idea to use a std::vector<double>
to represent a polynomial.
Doing so makes implementing most operations on polynomials a walk in the park.

Adding polynomials together is equivalent to adding the coefficients at the right position.

Calculating the derivative or the integral is equivalent to removing or adding a coefficient, and multiplying or dividing every coefficient by a specific number depending on its position.

Multiplying two polynomials is a little more complex, but I do not think there is an easier way to do it.

Evaluating a polynomial, thanks to C++ and the possibility to overload operator(), is like evaluating a normal function that takes a double and returns a double.
Last but not least, we can evaluate the polynomial internally with Horner's method, freeing the caller from the burden of writing y = eval_with_hoerner(my_pol, x) instead of y = my_pol(x).
This gives us the possibility to make the code a lot more readable since the interface is very clean, and, internally, we can use the optimal strategy for managing our polynomial.
This is a snippet of the polynomial
class.
The complete class can be found on github.
The eval_with_hoerners_rule
and multiply_poly
algorithms are defined as separate functions since they were a little more complex to implement.
By doing so we can test them more easily, or use them with different implementations.
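The two helpers are not reproduced in this post. Here is a minimal sketch of how they might look, assuming random-access iterators and a zero-initialized output range (the actual implementations on github may differ):

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Horner's rule: evaluates c0 + c1*x + ... + cn*x^n with n multiplications,
// folding from the highest-degree coefficient down.
template<class InputIt>
double eval_with_hoerners_rule(InputIt first, InputIt last, double x){
    double result = 0;
    for(auto it = std::make_reverse_iterator(last); it != std::make_reverse_iterator(first); ++it){
        result = result * x + *it;
    }
    return result;
}

// Polynomial multiplication as a discrete convolution of the coefficient
// sequences; d_first must point to a zero-initialized range of size n + m - 1.
template<class InputIt1, class InputIt2, class OutputIt>
void multiply_poly(InputIt1 first1, InputIt1 last1, InputIt2 first2, InputIt2 last2, OutputIt d_first){
    std::size_t i = 0;
    for(auto it1 = first1; it1 != last1; ++it1, ++i){
        std::size_t j = 0;
        for(auto it2 = first2; it2 != last2; ++it2, ++j){
            d_first[i + j] += (*it1) * (*it2);
        }
    }
}
```

With this convolution, multiplying {1, 1} by {1, 1} (both representing 1 + x) into a zeroed vector of size 3 gives {1, 2, 1}, i.e. ${(1+x)}^{2}$.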
class polynomial{
    std::vector<double> coeff = {0};
public:
    polynomial() = default;
    explicit polynomial(double c0) : coeff{c0}{
    }
    explicit polynomial(const std::vector<double>& coeff_) : coeff(coeff_){
    }
    friend polynomial operator+(polynomial lhs, const polynomial& rhs){
        lhs.coeff.resize(std::max(lhs.coeff.size(), rhs.coeff.size()));
        const auto minsize = std::min(lhs.coeff.size(), rhs.coeff.size());
        std::transform(std::begin(lhs.coeff), std::begin(lhs.coeff)+minsize, std::begin(rhs.coeff), std::begin(lhs.coeff), [](double c1, double c2){return c1+c2;});
        return lhs;
    }
    friend polynomial operator+(polynomial lhs, double coeff0){
        lhs.coeff[0] += coeff0;
        return lhs;
    }
    friend polynomial operator+(double coeff0, polynomial rhs){
        rhs.coeff[0] += coeff0;
        return rhs;
    }
    friend polynomial operator*(const polynomial& lhs, const polynomial& rhs){
        polynomial toret;
        toret.coeff.resize(lhs.coeff.size()+rhs.coeff.size()-1);
        multiply_poly(std::begin(lhs.coeff), std::end(lhs.coeff), std::begin(rhs.coeff), std::end(rhs.coeff), std::begin(toret.coeff));
        return toret;
    }
    friend polynomial operator*(polynomial lhs, double coeff0){
        std::transform(std::begin(lhs.coeff), std::end(lhs.coeff), std::begin(lhs.coeff), [coeff0](double v){return v*coeff0;});
        return lhs;
    }
    friend polynomial operator*(double coeff0, polynomial lhs){
        std::transform(std::begin(lhs.coeff), std::end(lhs.coeff), std::begin(lhs.coeff), [coeff0](double v){return v*coeff0;});
        return lhs;
    }
    friend polynomial derive(polynomial p){
        if(p.coeff.size() == 1){
            return polynomial(0);
        }
        for(std::size_t i = 1; i != p.coeff.size(); ++i){
            p.coeff[i] = p.coeff[i]*fd(i);
        }
        p.coeff.erase(p.coeff.begin());
        return p;
    }
    friend polynomial integrate(polynomial p){
        for(std::size_t i = 0; i != p.coeff.size(); ++i){
            p.coeff[i] = p.coeff[i]/fd(i+1);
        }
        p.coeff.insert(p.coeff.begin(), 0);
        return p;
    }
    double operator()(double x) const {
        return eval_with_hoerners_rule(std::begin(coeff), std::end(coeff), x);
    }
};
While the polynomial
class works correctly, it is suboptimal.
The polynomial ${x}^{n}$ is represented with a std::vector
of size n+1, where all values, except for one, are exactly equal to 0.
We could use internally some sort of map, where every coefficient is mapped to a specific position.
This is an optimal solution in the case where most coefficients are zero. If none of them is zero, then it will take approximately twice as much space as with the current implementation.
If we know that the maximum degree is always constant, we could even avoid allocating memory dynamically.
For now, I am interested in an implementation that is simple, correct, and as stable as possible.
Performance is not a top priority, correctness and simple code come first.
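The map idea can be sketched roughly like this (a hypothetical class, not part of the repository); note that evaluating via std::pow gives up Horner's rule, which is part of the trade-off:

```cpp
#include <cmath>
#include <cstddef>
#include <map>

// Hypothetical sparse polynomial: maps degree -> coefficient, zeros omitted.
// x^100 + 1 would store only two entries instead of 101 doubles.
class sparse_polynomial{
    std::map<std::size_t, double> coeff;
public:
    void set(std::size_t degree, double c){
        coeff[degree] = c;
    }
    double operator()(double x) const {
        double result = 0;
        for(const auto& dc : coeff){
            // dc.first is the degree, dc.second the coefficient
            result += dc.second * std::pow(x, static_cast<double>(dc.first));
        }
        return result;
    }
};
```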
Polynomial Interpolation
Polynomials have a lot of nice properties; unfortunately, in real life, most curves that describe something are not polynomials. But we can use polynomials to approximate those complicated curves, and with polynomial interpolation this actually becomes pretty easy. Since we do not have any tools for solving the linear system defined by the Vandermonde matrix, we will write the resulting polynomial directly using Lagrange polynomials. With this implementation of the polynomial class, writing the algorithm is straightforward.
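For reference, the Lagrange form of the interpolating polynomial through the points $(x_0, y_0), \dots, (x_n, y_n)$ is

```latex
p(x) = \sum_{j=0}^{n} y_j \, \ell_j(x),
\qquad
\ell_j(x) = \prod_{\substack{i = 0 \\ i \neq j}}^{n} \frac{x - x_i}{x_j - x_i}
```

Each basis polynomial $\ell_j$ equals 1 at $x_j$ and 0 at every other node, so the sum passes through all the given points.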
This first function builds the Lagrange basis polynomial for node j:
template<class InputIt>
polynomial lagr_base_poly(InputIt first, InputIt last, const std::size_t j)
{
    const double x_j = first[j];
    polynomial prevpoly(1);
    std::size_t k = 0;
    for(auto it = first; it != last; ++it, ++k){
        if(k == j){
            continue;
        }
        const double x_i = *it;
        polynomial curr_poly({-x_i/(x_j-x_i), 1/(x_j-x_i)});
        prevpoly = prevpoly * curr_poly;
    }
    return prevpoly;
}
Given the basis polynomials, we simply need to sum them, each multiplied by the corresponding y value:
template<class InputIt>
polynomial lagr_poly(InputIt first, InputIt last, InputIt first2, InputIt last2)
{
    polynomial toret;
    std::size_t i{};
    for(auto it = first; it != last; ++it, ++first2, ++i){
        assert(first2 != last2);
        (void)last2;
        double y_i = *first2;
        toret = toret + y_i * lagr_base_poly(first, last, i);
    }
    return toret;
}
I am unsure about the stability of this solution. Solving the linear system would probably give us better control over the error.
Here is an example of how those functions can be used:
int main(){
    const std::vector<double> data_x = {1, 2, 3};
    const std::vector<double> data_y = {1, 8, 27};
    auto pol = lagr_poly(std::begin(data_x), std::end(data_x), std::begin(data_y), std::end(data_y));
    assert(pol.coeffs()[0] == 6);
    assert(pol.coeffs()[1] == -11);
    assert(pol.coeffs()[2] == 6);
}
Chebyshev Nodes
Apparently, we have already solved the interpolation problem. Unfortunately, we do not have good control over the approximation error. The so-called Runge's phenomenon shows that even though the interpolating polynomial is exact at the interpolation points, it can oscillate between them, and we have no control over the error.
On the contrary, if we can evaluate the function to approximate at every point we want, then we can evaluate it at the Chebyshev nodes. Doing this gives us the ability to control the oscillations and therefore the maximum error.
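For an open interval $(a, b)$ and $n$ nodes, the Chebyshev nodes are given by the closed formula

```latex
x_i = \frac{a + b}{2} - \frac{b - a}{2}\,
\cos\left(\frac{(2i + 1)\pi}{2n}\right),
\qquad i = 0, \dots, n - 1
```

The minus sign only affects the ordering, producing the nodes from left to right.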
We could define a function that fills a vector with the Chebyshev nodes. Since there is a direct formula for creating those, we can also define some sort of container-like class:
class chebyshev_nodes{
    open_interval interv;
    std::size_t n;
public:
    chebyshev_nodes(const open_interval& interval, std::size_t num_nodes) : interv(interval), n(num_nodes){
    }
    double operator[](std::size_t i) const {
        assert(i<n);
        auto toret = midpoint(interv) - (length(interv)/2)*std::cos(pi*fd(2*i+1)/fd(2*n));
        return toret;
    }
    std::size_t size() const noexcept {
        return n;
    }
};
And generate the nodes like this:
int main(){
    chebyshev_nodes nodes({-1,1}, 10);
    for(std::size_t i = 0; i != 10; ++i){
        std::cout << nodes[i] << "\n";
    }
}
It has the advantage that we do not need to allocate any memory, but if we need to use those values over and over again, it may make sense to copy them into a container like a std::array or std::vector.
Spline Interpolation
Even if we managed to approximate a complex function with a 125-degree polynomial, evaluating such a polynomial probably isn't very efficient. If we can't choose the data we want to interpolate, we cannot even use the Chebyshev nodes. A better approximation is probably given by splitting the domain of the function into subintervals and approximating it on every subinterval with a polynomial of lower degree. Normally polynomials of degree 1 or 3 are more than sufficient, therefore I decided to code a class similar to polynomial to represent a spline of degree 1.
struct poly_interv{
    polynomial poly;
    double b; // no need to store the whole interval, just the right limit. The left limit is given by the previous interval or -inf
};
class interp_spline_deg1{
    std::vector<poly_interv> p_is;
public:
    interp_spline_deg1() = default;
    explicit interp_spline_deg1(const std::vector<poly_interv>& p_is_) : p_is(p_is_){
    }
    double operator()(double x) const {
        auto it = std::find_if(std::begin(p_is), std::end(p_is), [x](const poly_interv& v){return x <= v.b;});
        if(it == std::end(p_is)){
            assert(std::isnan(x)); // otherwise we should always find an interval
            return x;
        }
        return it->poly(x);
    }
};
And of course a function for creating a spline interpolation of degree 1 from the given points of a function:
template<class InputIt>
interp_spline_deg1 create_interp_spline_deg1(InputIt first, InputIt last, InputIt first2, InputIt last2){
    std::vector<poly_interv> toreturn;
    auto it = first;
    auto jt = first + 1;
    auto it_y = first2;
    auto jt_y = first2 + 1;
    for(; jt != last; ++it, ++jt, ++it_y, ++jt_y){
        assert(jt_y != last2);
        (void)last2;
        polynomial p = lagr_poly(it, jt+1, it_y, jt_y+1);
        assert(p.coeffs().size() <= 2 && "should create only linear splines");
        poly_interv p_i;
        p_i.poly = p;
        p_i.b = *jt;
        assert(toreturn.empty() || (approx_equal(p(*it), toreturn.back().poly(*it), 0.0001) && "border condition not true"));
        toreturn.push_back(p_i);
    }
    assert(!toreturn.empty());
    // extend intervals of spline
    toreturn.back().b = std::numeric_limits<double>::infinity();
    return interp_spline_deg1(toreturn);
}
The implementation is the most straightforward one that came to my mind. There are a lot of optimizations that could be applied, but since readability and correctness should come first, I'm pretty happy with it. Suggestions on github are of course welcome.
An example of how to use and test the given implementation:
int main(){
    const auto err = 0.0001;
    const std::vector<double> data_x = {-2, 0, 1, 2, 3};
    const std::vector<double> data_y = {-8, 0, 1, 8, 27};
    auto spline = create_interp_spline_deg1(std::begin(data_x), std::end(data_x), std::begin(data_y), std::end(data_y));
    assert(approx_equal(spline(-2), -8, err));
    assert(approx_equal(spline(0), 0, err));
    assert(approx_equal(spline(1), 1, err));
    assert(approx_equal(spline(2), 8, err));
    assert(approx_equal(spline(3), 27, err));
}
Conclusion
Writing the polynomial class and the interpolation functions was an interesting exercise.
It also raised several questions, in particular about how memory should be handled.
The polynomial class could take a template parameter to let the user provide a different allocator, similar to the std::vector
class.
Another "problem" is how "invalid floating point values" like inf or nan should be treated.
Most functions, if fed with nan, return some nan value, in order to propagate it so that someone at a higher level can see that something is broken.
By doing so we do not lose the information that we got an invalid value.
We have to pay attention when testing floating point values directly.
For example, we should always write a<=b and never !(a>b), since those operations are not the same.
Therefore it makes sense to write
if(a<=b){
    // ...
} else if(b<=a) {
    // ...
} else {
    // ...
}
and not
if(a<=b){
    // ...
} else {
    // ...
}
More information about comparisons in C++ can be found here
We could otherwise throw an exception, but I have no idea what the best practices for error handling in mathematical programs are. The floating point exception environment relies both on global variables and on invalid values for notifying errors. I fear that exceptions, a third mechanism, would not be very welcome in most mathematical computation programs. On the one hand they provide a lot more information than a numeric value; on the other hand it is always difficult to say what an exceptional situation is, in particular in a mathematical context.