One thing that many of my students (at all levels!) seem to have trouble with is those pesky minus signs. For some reason people get very phobic around them and very often assume that when a minus sign comes up they must have done something wrong. The fact is that minus signs are a very normal thing to arise – but that still seems to be no consolation to many.

One question that a lot of people find bewildering is the fact that when two negative numbers are multiplied together a positive answer results – yes, that’s right: for example $-3\times-4=12$ (NOT $-12$, as I’m sure I’ve said many hundreds of times during my time as a maths tutor). How is this so? It does seem a little counter-intuitive. Well… here’s a proof of it. The proof that follows will show, once and for all, that the product of two negative numbers is positive (two negative real numbers, if you must know – but then does it make sense anyway to talk about negative complex numbers or quaternions or anything like that?).

Here we go…

Throughout the proof I will assume the axioms of real numbers found in Mary Hart – Guide 2 Analysis Second Edition (i.e. that the real numbers are an abelian group under addition and an abelian group under multiplication, multiplication distributes over addition, the real numbers satisfy the order axioms and the completeness axiom).

First I want to prove that $0t=0$ for any $t \in \mathbb{R}$

$0t = (0+0)t$ as $0$ is the additive identity and therefore $0+0=0$

$0t = 0t + 0t$ by distributivity


$0t+(-0t)=(0t+0t)+(-0t)$ adding $-0t$ to both sides

$0=0t+(0t+(-0t))$ by associativity of addition and since $0t$ and $-0t$ are additive inverses

$0=0t+0$ since $0t$ and $-0t$ are additive inverses

$0=0t$ as $0$ is the additive identity, as required (*)

Next I want to prove that $-(-s)=s$ for all $s \in \mathbb{R}$

$-(-s)+(-s)=0$ since they are additive inverses

$(-(-s)+(-s))+s=0+s=s$ adding $s$ to both sides and using that $0$ is the additive identity

$-(-s)+((-s)+s)=s$ by associativity of addition

$-(-s)+0=s$ since $s$ and $-s$ are additive inverses

$-(-s)=s$ since $0$ is the additive identity (**)

Thirdly I need to prove that $s(-t)=-(st)=(-s)t$ for any $s,t \in \mathbb{R}$

$st+(-s)t = (s+(-s))t$ by distributivity

$st+(-s)t= 0t$ since $s$ and $-s$ are additive inverses

$st+(-s)t= 0$ by (*)


$(-s)t=-(st)$ by uniqueness of additive inverses

similarly $s(-t)=-(st)$

and so $s(-t)=-(st)=(-s)t$  as required  (***)

Now I need to show that $(-s)(-t) = st$ for any $s,t \in \mathbb{R}$

$(-s)(-t)+(-((-s)(-t)))=0$ since they are additive inverses

$(-s)(-t)+(-(-s))(-t) = 0$ by (***)

$(-s)(-t)+s(-t)=0$ by (**)

$(-s)(-t) = -(s(-t))$ by uniqueness of inverses

$(-s)(-t) = s(-(-t))$ by (***)

$(-s)(-t) = st$ by (**)

It may appear that I am done here, but in fact I have not yet checked that (despite appearances) $-s$ and $-t$ are indeed negative. If we now assume that $s>0$ and $t>0$ then by the order axioms we have that $st>0$. But is it true that $-s$ and $-t$ are negative? Let’s find out…

Since $s>0$, the order axioms give $s+(-s)>0+(-s)$, i.e. $s+(-s)>-s$. But $s+(-s)=0$ since they are inverses, so $0>-s$. In other words $-s<0$ and thus $-s$ is negative. Similarly it can be shown that $-t$ is also negative, hence $(-s)(-t)$ is indeed the product of two negative real numbers. Since $(-s)(-t)=st$ and $st>0$, it follows that $(-s)(-t)>0$ and the product $(-s)(-t)$ is positive. The proof is complete.
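For the sceptical, here is a quick sanity check in Python. This is a spot check on a few sample values only – the axiomatic proof above is what does the real work:

```python
# Spot check of the identity (-s)(-t) = st on sample values.
# This does not prove anything; the proof above does.
samples = [0.0, 1.0, 2.5, 3.0, 4.0, 7.0]

for s in samples:
    for t in samples:
        assert (-s) * (-t) == s * t

# In particular, the example from the introduction:
assert (-3) * (-4) == 12
```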

This proves that the fact that two negatives multiplied together give a positive is not just some rule that someone made up for a laugh to make maths difficult for GCSE and A-Level students – it is entirely consistent with the axioms of the real number system. Edmund Landau gives a much more thorough treatment of this in his book Foundations of Analysis, which I mentioned in my previous post on $2 \times 2 = 4$; there he builds gradually up to the real number system from the natural numbers and does not assume the axioms that I have stated above. However, that way of doing things is much longer to carry out, taking Landau almost 90 pages; conciseness is desirable but must give way to thoroughness from time to time.

What follows is a proof of the well known fact that $2 \times 2 = 4$.

Define $2 = 1+1$ and $4=((1+1)+1)+1$

Then $2.2= (1+1).(1+1)$ by definition of $2$ (where $.$ is used instead of $\times$)

$=1'.1'$ by Theorem 4, since $1+1=1'$

$=1'.1+1'$ by Theorem 28, with $x=1'$ and $y'=1'$

$=1'+1'$ by Theorem 28, since $x.1=x$

$=1'+(1+1)$ by Theorem 4, since $1+1=1'$

$=(1'+1)+1$ by Theorem 5, associativity of $+$

$=((1+1)+1)+1$ by Theorem 4, since $1+1=1'$

$=4$ by definition of $4$
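Just for fun, the recursive laws used above ($x+y'=(x+y)'$ and $x.y'=x.y+x$, with $x.1=x$) can be sketched in Python. The representation of naturals as tuples of successor marks, and the names `ONE`, `succ`, `add`, `mul`, are my own toy model, not Landau’s notation:

```python
# A toy Peano-style construction: a natural number n is represented
# by a tuple of n-1 successor marks applied to 1.
ONE = ()  # represent 1 by the empty tuple

def succ(x):
    """Successor x' of x."""
    return x + ("'",)

def add(x, y):
    """Addition defined recursively: x + 1 = x',  x + y' = (x + y)'."""
    if y == ONE:
        return succ(x)
    return succ(add(x, y[:-1]))  # y[:-1] plays the role of y's predecessor

def mul(x, y):
    """Multiplication defined recursively: x.1 = x,  x.y' = x.y + x."""
    if y == ONE:
        return x
    return add(mul(x, y[:-1]), x)

TWO = add(ONE, ONE)             # 2 = 1 + 1
FOUR = add(add(TWO, ONE), ONE)  # 4 = ((1+1)+1)+1

assert mul(TWO, TWO) == FOUR    # 2 . 2 = 4
```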

The theorem and definition numbers refer to the numbering in the book Foundations of Analysis by Edmund Landau, which can be downloaded here.

Why do we need to bother proving something as trivial as this? The answer is: because we can. Pure mathematics is not really concerned with whether or not something is useful (although it could be argued that the above multiplication could come in quite useful from time to time); it takes the approach that if something can be proved then it should be proved. Mathematicians are very strict people and will refuse to accept something as true (or false) unless it has been proved to be the case. A proof allows the mathematician to be 100% certain about something, and thus to use facts and theorems, once they have been proved, in proving other facts and theorems and in developing a theory. This process is very strict and there are no exceptions – if something hasn’t been proved in a mathematical sense (which is much stronger than the notion of proof seen in the other sciences) then it cannot be trusted.

The above proof is really just a special (and fun) case of the more general result (to be found in the book Foundations of Analysis) that multiplication works as we expect it to for all numbers (Theorem 28 and Chapter 4). This is something that we take for granted because we have been told from a very young age that multiplication works – yet no-one has ever checked multiplication explicitly on every pair of numbers, so how do we really know that it works? Addition is also considered in the book and proved to work as we expect (Theorem 4 and Chapter 2). Verifying that addition and multiplication behave the way we want them to is important – after all, if we just blindly accepted that they worked without proof, and things started to go wrong somewhere down the line, then we would only have ourselves to blame for being so naive! Imagine if we decided not to bother proving some other mathematical principles, such as the ones that stop skyscrapers from falling down, that stop bridges collapsing, or even the ones that stop people hacking into your bank account and helping themselves.

Foundations of Analysis is a unique book. This is not the kind of book where you expect to find lots of worked examples and then lots of practice questions for you to have a go at, followed by pages of answers and solutions. Nor will you find encouraging and motivating words giving you hints and tips and pitfalls to avoid. The text falls into four categories – Axiom, Definition, Theorem, Proof – and builds up through the various number systems, from the natural numbers $\{1, 2, 3, \ldots\}$ to the complex numbers. The style is extreme and brutally cold, with all superfluous writing stripped out – exactly as it is intended to be. This is an example of how efficient mathematical and scientific texts can be, and it shows how (necessarily) pedantic mathematicians have to be. I think the author’s attitude would be that if you can’t be bothered to read it then don’t – but I can say that having read the book myself it was certainly worth it!

This is an example of a very simple real function that is only differentiable a finite number of times.

Let $f:\mathbb{R} \to \mathbb{R}$ be a function defined by $f(x)=x|x|$. The aim is to show that this function is differentiable but that it is not twice differentiable. Notice that $f(x)$ can also be written as
$$ f(x) = \left\{
\begin{array}{l l}
x^2 & \quad \text{if $x\geq0$}\\
-x^2 & \quad \text{if $x<0$}
\end{array} \right.$$

The graph of this function looks as follows

The graph of y=x|x|

It is clear that $f(x)$ is differentiable for $x\!>\!0$ and $x\!<\!0$ with derivative $f^{\prime}(x)=2x$ and $f^{\prime}(x)=-2x$ respectively. We only need to check that $f(x)$ is differentiable at $x=0$.

By definition a function $g:I \to \mathbb{R}$ where $I \subset \mathbb{R}$ is differentiable at a point $x \in I$ if the limit
$$\lim_{h\to 0}\frac{g(x+h)-g(x)}{h}$$
exists. $g$ is called differentiable if it is differentiable at every point $x \in I$.

Let’s check that this limit exists for $f$ when $x=0$
$$\lim_{h\to 0}\frac{f(0+h)-f(0)}{h}=\lim_{h\to 0}\frac{(0+h)|0+h|}{h}$$
$$=\lim_{h\to 0}\frac{h|h|}{h}=\lim_{h\to 0}|h|=0$$

So $f(x)$ is indeed differentiable at $x=0$ and we can write the derivative of $f(x)$ as $f^{\prime}(x)=2|x|$. The graph of the derivative of $f$ looks as follows

The graph of 2|x| and the derivative of x|x|

We see that $f^{\prime}(x)$ is differentiable when $x \neq 0$ but when we try to find the derivative of $f^{\prime}$ at $x=0$ we have
$$\lim_{h\to 0^{+}}\frac{f^{\prime}(0+h)-f^{\prime}(0)}{h}=\lim_{h\to 0^{+}}\frac{2|h|}{h}=2$$
$$\lim_{h\to 0^{-}}\frac{f^{\prime}(0+h)-f^{\prime}(0)}{h}=\lim_{h\to 0^{-}}\frac{2|h|}{h}=-2$$
The left and right limits are not the same and therefore $f^{\prime}$ is not differentiable at $x=0$. The conclusion is that $f$ is differentiable but not twice differentiable.
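None of this replaces the limit arguments above, but a quick numerical sketch in Python (my own illustration, not a proof) shows both behaviours at once – the difference quotient of $f$ at $0$ shrinking to $0$, and the one-sided quotients of $f^{\prime}$ disagreeing:

```python
# Numerical illustration of the limits above for f(x) = x|x|
# and its derivative f'(x) = 2|x|. Not a proof, just a sanity check.

def f(x):
    return x * abs(x)

def f_prime(x):
    return 2 * abs(x)

h = 1e-8

# The difference quotient of f at 0 tends to 0, so f'(0) = 0:
assert abs(f(0 + h) / h) < 1e-6

# The one-sided difference quotients of f' at 0 disagree (2 vs -2),
# so f' is not differentiable at 0:
right = (f_prime(0 + h) - f_prime(0)) / h
left = (f_prime(0 - h) - f_prime(0)) / (-h)
assert abs(right - 2) < 1e-6
assert abs(left - (-2)) < 1e-6
```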

It is not much more difficult to show (induction is possibly the easiest way) that the function given by $h(x)=x^{n}|x|$ is $n$-times differentiable but not $(n+1)$-times differentiable.

Interpolation is widely used in mathematics and in the sciences – mathematics tends to concentrate on the theoretical side of things and the methods are then used by scientists with the confidence that what they are using works.

What is interpolation? Suppose you are measuring the temperature outside on a particular day: you record the temperature at certain times, but what about the times in between? We can’t record everything, yet we might need to know what the temperature was at one of these intermediate times – this is where interpolation comes in. Interpolation is a way of “joining the dots” within (not outside) a dataset so that estimates can be made about its behaviour. This can be done in many different ways using different methods, each with their pros and cons.
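As a toy example of “joining the dots”, here is the simplest method of all – piecewise linear interpolation – sketched in Python. The times and temperatures are invented for illustration, and `linear_interp` is my own hypothetical helper:

```python
# "Joining the dots" with straight lines: piecewise linear interpolation
# of some made-up hourly temperature readings.
times = [9, 12, 15, 18]           # hours of the day
temps = [14.0, 19.0, 21.0, 17.0]  # degrees Celsius

def linear_interp(t, xs, ys):
    """Estimate a value at t by joining neighbouring data points with a line."""
    if not xs[0] <= t <= xs[-1]:
        raise ValueError("t is outside the dataset; that would be extrapolation")
    for i in range(len(xs) - 1):
        if xs[i] <= t <= xs[i + 1]:
            frac = (t - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + frac * (ys[i + 1] - ys[i])

# Estimate the temperature at 10:30, halfway between 9:00 and 12:00:
assert linear_interp(10.5, times, temps) == 16.5
```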

In a previous post I talked about the limitations of Euler’s method – well here I’m going to talk about the limitations of polynomial interpolation and specifically the Lagrange representation and one way that the problem can be partially resolved. Don’t misunderstand me – the Lagrange representation is very useful but it, as with almost any numerical technique, has its own problems.

Let’s look at the function $f(x)=\dfrac{1}{1+25x^{2}}$ on the interval $[-1,1]$. This is what it looks like

A graph of f(x) drawn using SAGE Math


Now given $n+1$ distinct points $x_{0}, x_{1}, \ldots , x_{n}$ in $[-1,1]$ and their corresponding $y$-values $y_{0}, y_{1}, \ldots , y_{n}$, the Lagrange representation is defined as $$P_{n}(x)=\sum_{j=0}^{n}y_{j}\ell_{j}(x)$$ where $$\ell_{j}(x)=\prod_{i=0,\,i\neq j}^{n}\dfrac{x-x_{i}}{x_{j}-x_{i}}$$

This representation can be proved to be unique and to satisfy $P_{n}(x_{i})=y_{i}$, but I don’t want to get sidetracked with this. So going back to the function $f(x)$ – first I’m going to choose $5$ equally-spaced points along the interval $[-1,1]$. The resulting interpolating polynomial looks as follows (I am not going to try to write out the interpolating polynomials explicitly because they are very awkward to find and they don’t really tell us anything anyway)
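For the curious, the representation above can be sketched directly in Python (the function name `lagrange` is my own choice; this is a straightforward rather than an efficient implementation):

```python
# Direct evaluation of the Lagrange representation P_n(x) = sum_j y_j * l_j(x).
from math import prod

def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial at x for distinct nodes xs."""
    n = len(xs)
    total = 0.0
    for j in range(n):
        # l_j(x) = product over i != j of (x - x_i) / (x_j - x_i)
        lj = prod((x - xs[i]) / (xs[j] - xs[i]) for i in range(n) if i != j)
        total += ys[j] * lj
    return total

f = lambda x: 1 / (1 + 25 * x**2)
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]  # 5 equally spaced points on [-1, 1]
ys = [f(x) for x in xs]

# The interpolant reproduces the data: P_n(x_i) = y_i at every node.
for xi, yi in zip(xs, ys):
    assert abs(lagrange(xs, ys, xi) - yi) < 1e-12
```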

Lagrange representation of f(x) with 5 equally spaced points

The interpolating curve is shown in red and $f(x)$ is shown in blue. This is not bad; after all, we only have $5$ points to work with, so we might expect that as the number of points increases we get a more accurate picture – right? Well… no. As the number of points increases we have to increase the degree of the interpolating polynomial, and if we choose $10$ equally spaced points on the interval $[-1,1]$ we get this picture

Lagrange representation of f(x) with 10 equally spaced points

Things are starting to go a bit awry – maybe if we increase the number of points even further then things will start to settle down. Let’s look what happens when we have $16$ equally spaced points.

Lagrange representation of f(x) with 16 equally spaced points

Maybe that wasn’t such a good idea. As the number of interpolation points (and consequently the degree of the interpolating polynomial) increases, the interpolating polynomial oscillates wildly between extremely large positive and extremely large negative values near the edges of the interval $[-1,1]$. This behaviour gets so bad as $n$ increases that the interpolating polynomial grows without bound near the edges of the interval – this is known as Runge’s phenomenon and makes the interpolating polynomial practically useless.

One way around this is to choose a different set of interpolation points. Part of the problem is the equal spacing of the points – to resolve this, at least in part, we can choose the Chebyshev nodes $x_{k}=\cos\left(\frac{2k+1}{2n+2}\,\pi\right)$ for $k=0,1,\ldots,n$, which cluster towards the ends of the interval. Using these Chebyshev nodes we get a very different picture – the following diagram shows what things look like when $10$ points are chosen

Lagrange representation of f(x) using 10 Chebyshev nodes

Now compare that to what we had before and we see something that is much better behaved. Has Runge’s phenomenon been eliminated? Mostly, yes – in truth it can never be completely eliminated, but the Chebyshev nodes massively reduce its effects. Runge’s phenomenon does not always occur, but it is something that can go wrong from time to time, so as with all numerical methods you have to take care when applying the method to solve problems; there may be another, more suitable method that needs to be applied.
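To see the improvement quantitatively, here is a small Python sketch (the names `lagrange` and `max_error` are mine) comparing the worst-case error of the two node choices for $10$ points:

```python
# Comparing equally spaced points with Chebyshev nodes for 10 points.
from math import cos, pi, prod

def lagrange(xs, ys, x):
    n = len(xs)
    return sum(ys[j] * prod((x - xs[i]) / (xs[j] - xs[i])
                            for i in range(n) if i != j)
               for j in range(n))

f = lambda x: 1 / (1 + 25 * x**2)
grid = [-1 + 2 * k / 1000 for k in range(1001)]  # fine grid on [-1, 1]

def max_error(xs):
    ys = [f(x) for x in xs]
    return max(abs(lagrange(xs, ys, x) - f(x)) for x in grid)

n = 10
equal = [-1 + 2 * k / (n - 1) for k in range(n)]
cheb = [cos((2 * k + 1) * pi / (2 * n)) for k in range(n)]  # Chebyshev nodes

# Chebyshev nodes give a far smaller worst-case error than equal spacing.
assert max_error(cheb) < max_error(equal)
```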