One thing that many of my students (at all levels!) seem to have trouble with is those pesky minus signs. For some reason people get very phobic around them and often assume that when a minus sign comes up they must have done something wrong. The fact is that minus signs are a perfectly normal thing to arise – but that still seems to be no consolation to many.

One question that a lot of people find bewildering is the fact that when two negative numbers are multiplied together a positive answer results – yes that’s right, for example $-3\times-4=12$ (NOT $-12$, as I’m sure I’ve said many hundreds of times during my time as a maths tutor). How is this so? It does seem a little counter-intuitive. Well…here’s a proof of it. The proof that follows will show, once and for all, that the product of two negative numbers is positive (negative real numbers, if you must know – but then does it even make sense to talk about negative complex numbers or quaternions or anything like that?).

Here we go…

Throughout the proof I will assume the axioms of the real numbers found in Mary Hart’s Guide to Analysis (Second Edition), i.e. that the real numbers are an abelian group under addition, the non-zero real numbers are an abelian group under multiplication, multiplication distributes over addition, and the real numbers satisfy the order axioms and the completeness axiom.

First I want to prove that $0t=0$ for any $t \in \mathbb{R}$

$0t = (0+0)t$ as $0$ is the additive identity and therefore $0+0=0$

$0t = 0t + 0t$ by distributivity

$0t+(-(0t))=(0t+0t)+(-(0t))$ adding $-(0t)$, the additive inverse of $0t$, to both sides

$0=0t+(0t+(-(0t)))$ since $0t$ and $-(0t)$ are additive inverses (on the left) and by associativity of addition (on the right)

$0=0t+0$ since $0t$ and $-(0t)$ are additive inverses

$0=0t$ since $0$ is the additive identity, as required (*)

Next I want to prove that $-(-s)=s$ for all $s \in \mathbb{R}$

$-(-s)+(-s)=0$ since they are additive inverses

$(-(-s)+(-s))+s=0+s=s$ adding $s$ to both sides and using that $0$ is the additive identity

$-(-s)+((-s)+s)=s$ by associativity of addition

$-(-s)+0=s$ since $s$ and $-s$ are additive inverses

$-(-s)=s$ since $0$ is the additive identity (**)

Thirdly I need to prove that $s(-t)=-(st)=(-s)t$ for any $s,t \in \mathbb{R}$

$st+(-s)t = (s+(-s))t$ by distributivity

$st+(-s)t= 0t$ since $-s$ is the unique inverse of $s$

$st+(-s)t= 0$ by (*)

$-(st)=(-s)t$ by uniqueness of additive inverses

similarly $-(st)=s(-t)$

and so $s(-t)=-(st)=(-s)t$  as required  (***)

Now I need to show that $(-s)(-t) = st$ for any $s,t \in \mathbb{R}$

$(-s)(-t)+(-((-s)(-t)))=0$ since they are additive inverses

$(-s)(-t)+(-(-s))(-t) = 0$ by (***)

$(-s)(-t)+s(-t)=0$ by (**)

$(-s)(-t) = -(s(-t))$ by uniqueness of inverses

$(-s)(-t) = s(-(-t))$ by (***)

$(-s)(-t) = st$ by (**)

It may appear that I am done here, but I have not yet shown that (despite appearances) $-s$ and $-t$ are indeed negative. If we now assume that $s>0$ and $t>0$ then by the order axioms we have that $st>0$. But is it true that $-s$ and $-t$ are negative? Let’s find out…

Since $s>0$, the order axioms give $s+(-s)>0+(-s)$, that is $s+(-s)>-s$. But $s+(-s)=0$ since they are inverses, so $0>-s$; in other words $-s<0$ and thus $-s$ is negative. Similarly it can be shown that $-t$ is also negative, hence $(-s)(-t)$ is indeed the product of two negative real numbers. Since $(-s)(-t)=st$ and $st>0$, we have $(-s)(-t)>0$, so the product of the two negatives is positive. The proof is complete.
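None of the above needs a computer, of course, but if you want a quick sanity check of the identities (*), (**), (***) and the final result – an illustration for a few sample values, not a proof – a few lines of Python will do. This is just a sketch; the variable names are my own.

```python
import random

# Spot-check the identities proved above for a few random real numbers.
# This only illustrates them for sample values -- it is not a proof.
for _ in range(5):
    s = random.uniform(-10, 10)
    t = random.uniform(-10, 10)
    assert 0 * t == 0                         # (*)   0t = 0
    assert -(-s) == s                         # (**)  -(-s) = s
    assert s * (-t) == -(s * t) == (-s) * t   # (***) s(-t) = -(st) = (-s)t
    assert (-s) * (-t) == s * t               # the main result

print(-3 * -4)  # prints 12, not -12
```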

This proves that the fact that two negatives multiplied together give a positive is not just some rule that someone made up for a laugh to make maths difficult for GCSE and A-Level students – it is entirely consistent with the axioms of the real number system. Edmund Landau gives a much more thorough treatment of this in his book Foundations of Analysis (which I mentioned in my previous post on $2 \times 2 = 4$), building gradually up to the real number system from the natural numbers rather than assuming the axioms that I have stated above. However, that way of doing things is much longer to carry out, taking Landau almost 90 pages; conciseness is desirable but must give way to thoroughness from time to time.

Here is an example of a very simple real function that is differentiable only a finite number of times.

Let $f:\mathbb{R} \to \mathbb{R}$ be a function defined by $f(x)=x|x|$. The aim is to show that this function is differentiable but that it is not twice differentiable. Notice that $f(x)$ can also be written as
$$ f(x) = \left\{
\begin{array}{l l}
x^2 & \quad \text{if $x\geq0$}\\
-x^2 & \quad \text{if $x<0$} \end{array} \right.$$ The graph of this function looks as follows

The graph of y=x|x|

It is clear that $f(x)$ is differentiable for $x\!>\!0$ and $x\!<\!0$ with derivative $f^{\prime}(x)=2x$ and $f^{\prime}(x)=-2x$ respectively. We only need to check that $f(x)$ is differentiable at $x=0$.

By definition a function $g:I \to \mathbb{R}$ where $I \subset \mathbb{R}$ is differentiable at a point $x \in I$ if the limit
$$\lim_{h\to 0}\frac{g(x+h)-g(x)}{h}$$
exists. $g$ is called differentiable if it is differentiable at every point $x \in I$.

Let’s check that this limit exists for $f$ when $x=0$ (noting that $f(0)=0$)
$$\lim_{h\to 0}\frac{f(0+h)-f(0)}{h}=\lim_{h\to 0}\frac{(0+h)|0+h|}{h}$$
$$=\lim_{h\to 0}\frac{h|h|}{h}=\lim_{h\to 0}|h|=0$$

So $f(x)$ is indeed differentiable at $x=0$ and we can write the derivative of $f(x)$ as $f^{\prime}(x)=2|x|$. The graph of the derivative of $f$ looks as follows

The graph of $f^{\prime}(x)=2|x|$, the derivative of $x|x|$

We see that $f^{\prime}(x)$ is differentiable when $x \neq 0$ but when we try to find the derivative of $f^{\prime}$ at $x=0$ we have
$$\lim_{h\to 0^{+}}\frac{f^{\prime}(0+h)-f^{\prime}(0)}{h}=\lim_{h\to 0^{+}}\frac{2|h|}{h}=2$$
$$\lim_{h\to 0^{-}}\frac{f^{\prime}(0+h)-f^{\prime}(0)}{h}=\lim_{h\to 0^{-}}\frac{2|h|}{h}=-2$$
The left and right limits are not the same and therefore $f^{\prime}$ is not differentiable at $x=0$. The conclusion is that $f$ is differentiable but not twice differentiable.
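If you like to see such things numerically as well as via limits, here is a small illustrative Python sketch (an illustration of the difference quotients above, not a proof): the quotients for $f$ at $0$ tend to $0$ from both sides, while those for $f^{\prime}$ settle at $2$ from the right and $-2$ from the left.

```python
# One-sided difference quotients at x = 0 for f(x) = x|x| and f'(x) = 2|x|.
# Illustrates the limits computed above; not a proof.
f  = lambda x: x * abs(x)
fp = lambda x: 2 * abs(x)

for h in (0.1, 0.001, 1e-6):
    right_f  = (f(h)  - f(0))  / h      # -> 0
    left_f   = (f(-h) - f(0))  / (-h)   # -> 0
    right_fp = (fp(h)  - fp(0)) / h     # -> 2
    left_fp  = (fp(-h) - fp(0)) / (-h)  # -> -2
    print(h, right_f, left_f, right_fp, left_fp)
```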

It is not much more difficult to show (induction is possibly the easiest way) that the function given by $h(x)=x^{n}|x|$ is $n$-times differentiable but not $(n+1)$-times differentiable.
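For what it’s worth, here is a sketch of the key step in that induction, valid away from $x=0$ by the product rule and the fact that $|x|$ has derivative $x/|x|$ there (the formula can then be checked directly at $x=0$ for $n\geq1$):
$$\frac{d}{dx}\left(x^{n}|x|\right)=nx^{n-1}|x|+x^{n}\cdot\frac{x}{|x|}=(n+1)x^{n-1}|x|.$$
Each derivative lowers the power of $x$ by one, so after $n$ derivatives we are left with a multiple of $|x|$, which is not differentiable at $0$.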

Interpolation is widely used in mathematics and in the sciences – mathematicians tend to concentrate on the theoretical side of things, and the methods are then used by scientists with the confidence that what they are using works.

What is interpolation? Well, an example would be measuring the temperature outside on a particular day: you record the temperature at certain times – but what about the times in between? We can’t record everything, yet we might need to know what the temperature was at one of these intermediate times – this is where interpolation comes in. Interpolation is a way of “joining the dots” within (not outside) a dataset so that estimates can be made about its behaviour, and it can be done in many different ways, each method with its own pros and cons.
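As a toy illustration (with made-up numbers), linear interpolation – the simplest way of joining the dots – might look like this in Python:

```python
import numpy as np

# Hypothetical temperature readings (degrees C) taken at 9:00, 12:00 and 15:00.
times = [9.0, 12.0, 15.0]
temps = [14.0, 19.0, 17.0]

# Linear interpolation estimates the temperature at an intermediate time,
# here 13:30 -- np.interp simply joins the dots with straight lines.
print(np.interp(13.5, times, temps))  # a value between 19.0 and 17.0
```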

In a previous post I talked about the limitations of Euler’s method – well here I’m going to talk about the limitations of polynomial interpolation and specifically the Lagrange representation and one way that the problem can be partially resolved. Don’t misunderstand me – the Lagrange representation is very useful but it, as with almost any numerical technique, has its own problems.

Let’s look at the function $f(x)=\dfrac{1}{1+25x^{2}}$ on the interval $[-1,1]$. This is what it looks like

A graph of $f(x)$ drawn using SAGE Math

Now given $n+1$ distinct points $x_{0}, x_{1}, \ldots , x_{n}$ in $[-1,1]$ and their corresponding $y$-values $y_{0}, y_{1}, \ldots , y_{n}$, the Lagrange representation is defined as $$P_{n}(x)=\sum_{j=0}^{n}y_{j}\ell_{j}(x)$$ where $$\ell_{j}(x)=\prod_{i=0,\, i\neq j}^{n}\dfrac{x-x_{i}}{x_{j}-x_{i}}$$
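As an aside, here is a minimal Python sketch of how one might evaluate $P_{n}$ directly from this formula (the name lagrange_interp is just my own choice; in practice a library routine or the barycentric form would be preferred for numerical stability):

```python
import numpy as np

def lagrange_interp(x_nodes, y_nodes, x):
    """Evaluate the Lagrange interpolating polynomial P_n at the points x,
    by directly summing y_j * ell_j(x) as in the formula above."""
    x_nodes = np.asarray(x_nodes, dtype=float)
    y_nodes = np.asarray(y_nodes, dtype=float)
    x = np.asarray(x, dtype=float)
    P = np.zeros_like(x)
    for j in range(len(x_nodes)):
        # ell_j(x) = product over i != j of (x - x_i) / (x_j - x_i)
        ell = np.ones_like(x)
        for i in range(len(x_nodes)):
            if i != j:
                ell *= (x - x_nodes[i]) / (x_nodes[j] - x_nodes[i])
        P += y_nodes[j] * ell
    return P

f = lambda x: 1.0 / (1.0 + 25.0 * x**2)
nodes = np.linspace(-1, 1, 5)              # 5 equally spaced points
xs = np.linspace(-1, 1, 401)
P5 = lagrange_interp(nodes, f(nodes), xs)  # values of the interpolating polynomial
```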

This representation can be proved to be unique and to satisfy $P_{n}(x_{i})=y_{i}$, but I don’t want to get sidetracked with this. So going back to the function $f(x)$ – first I’m going to choose $5$ equally spaced points along the interval $[-1,1]$. The resulting interpolating polynomial looks as follows (I am not going to try to write out the interpolating polynomials explicitly because they are tedious to find and they don’t really tell us anything anyway)

Lagrange representation of f(x) with 5 equally spaced points

The interpolating curve is shown in red and $f(x)$ is shown in blue. This is not bad – after all, we only have $5$ points to work with – so we might expect that as the number of points increases we get a more accurate picture, right? Well…no. As the number of points increases we have to increase the degree of the interpolating polynomial, and if we choose $10$ equally spaced points on the interval $[-1,1]$ we get this picture

Lagrange representation of f(x) with 10 equally spaced points

Things are starting to go a bit awry – maybe if we increase the number of points even further then things will start to settle down. Let’s look at what happens when we have $16$ equally spaced points.

Lagrange representation of f(x) with 16 equally spaced points

Maybe that wasn’t such a good idea. As the number of interpolation points (and consequently the degree of the interpolating polynomial) increases, the interpolating polynomial oscillates wildly, taking extremely large positive and negative values near the edges of the interval $[-1,1]$. This behaviour gets worse as $n$ increases – the error of the interpolating polynomial grows without bound near the edges of the interval. This is known as Runge’s phenomenon and it makes the interpolating polynomial practically useless.
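If you prefer numbers to pictures, a quick experiment along the following lines (reusing $f$ and the lagrange_interp sketch from above – the node counts are just illustrative) shows the maximum error over $[-1,1]$ growing rapidly as more equally spaced points are used:

```python
# Maximum interpolation error on [-1, 1] with equally spaced nodes,
# assuming f and lagrange_interp are defined as in the sketch above.
xs = np.linspace(-1, 1, 2001)
for n_points in (5, 10, 16, 20, 30):
    nodes = np.linspace(-1, 1, n_points)
    err = np.max(np.abs(f(xs) - lagrange_interp(nodes, f(nodes), xs)))
    print(n_points, "equally spaced points, max error:", err)
```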

One way around this is to choose a different set of interpolation points. One of the problems is the equal spacing of the points – to resolve this in part we can choose Chebyshev nodes, the points $x_{k}=\cos\left(\dfrac{(2k+1)\pi}{2(n+1)}\right)$ for $k=0,\ldots,n$, which cluster towards the ends of the interval. Using these Chebyshev nodes we get a very different picture – the following diagram shows what things look like when $10$ points are chosen

Lagrange representation of f(x) using 10 Chebyshev nodes

Now compare that to what we had before and we see something much better behaved. Has Runge’s phenomenon been eliminated? Mostly, yes – in truth it can never be completely eliminated (no fixed choice of nodes works for every continuous function), but the Chebyshev nodes massively reduce its effects. Runge’s phenomenon does not always occur, but it is something that can go wrong from time to time, so, as with all numerical methods, you have to take care when applying the method to solve problems; there may be another, more suitable method that needs to be applied.
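Finally, here is a rough sketch of how the Chebyshev nodes can be generated and compared with equally spaced ones, again reusing $f$ and lagrange_interp from the earlier sketch (node counts are illustrative only):

```python
import numpy as np

# Chebyshev nodes on [-1, 1]: the roots of the Chebyshev polynomial of
# degree n+1, i.e. x_k = cos((2k + 1) * pi / (2(n + 1))) for k = 0, ..., n.
def chebyshev_nodes(n_points):
    k = np.arange(n_points)
    return np.cos((2 * k + 1) * np.pi / (2 * n_points))

xs = np.linspace(-1, 1, 2001)
for n_points in (10, 16, 30):
    eq = np.linspace(-1, 1, n_points)
    ch = chebyshev_nodes(n_points)
    err_eq = np.max(np.abs(f(xs) - lagrange_interp(eq, f(eq), xs)))
    err_ch = np.max(np.abs(f(xs) - lagrange_interp(ch, f(ch), xs)))
    print(n_points, "points -- equally spaced error:", err_eq,
          "Chebyshev error:", err_ch)
```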