DC Mathematica 2018

In Memory of Dr Purchase, Mathematics Teacher at Dulwich College 2002–2017

It was with great sadness that the College heard last November that Dr Purchase had died. Dr Purchase joined the College in 2002, having taught briefly at Hills Road Sixth Form College and at Eton College beforehand. He oversaw the writing of this magazine, DC Mathematica, each year, helping to organise the articles submitted by boys and staff and arranging the printing in time for Founder’s Day. His love of Mathematics was evident to those with whom he came into contact: the subject of his PhD thesis was galactic clusters, solving the n-body problem using computational numerical methods. In his memory, the prize for the best article in DC Mathematica will from now on be named the Purchase Prize. Dr Purchase will be much missed by his colleagues at the College, both in the Mathematics Department and more widely. He was also widely respected by his students, one of whom, Theo Podger (13MF3), writes:

‘While many knew Dr Purchase as a Mathematics teacher, anyone in his form would tell you that he was far more than that. In addition to being my form tutor, he was a mentor, especially to me, but I am certain to many others as well, and in my time under his tutelage he taught me far more than just Pure Mathematics. Dr Purchase was always willing to listen to any concerns I had, be they academic or not, and it seemed to me that he would always do his absolute best to provide me with a solution. While he was light years ahead of me in terms of academic knowledge, we always spoke as equals – I never felt talked down to. I know that he will be sorely missed by his students and all those who had the privilege to know him.’

Contents:

 ‘The Art of Indifference’ – Boris Ter-Avanesov (12FM)

‘Hawking’s Tea’ – Mr Ottewill

 ‘Chinese Remainder Theorem’ – Hin Chi Lee (12JLG)

 ‘The Importance and Role of India in Maths’ – Saajid Khan (7W)

 ‘The Birthday Paradox’ – James Storey (8R)

 ‘Music is Maths’ – by Samuel Smith (12AC)

 ‘Prime Numbers’ – Jeremy Samoogan (7R)

 ‘The Mechanics of Snowboarding’ – Toby Evans (10P)

 ‘Parallax: Applications’ – Bryan Tan (11T), Jensen Tong (11F)

 ‘A Physical Approach to The Basel Problem’ – Jinkai Li (12JAR), Bryan Lu (12JLG)

 ‘Conway’s Soldiers’ – Lunzhi Shi (11H)

 ‘Frequency Extracting, Sound Editing and The Fourier Transform’ – Simon Mengzhe Xu (JLG)

Editorial Team:

Boris Ter-Avanesov, Simon Mengzhe Xu (formatting), Ezana Adugna (design)

The Art of Indifference by Boris Ter-Avanesov (12FM)

This article will discuss the issue of multiple contradictory solutions arising in both pure and applied probability problems as a consequence of even the subtlest of variations in the interpretation of the phrase “at random”. Instrumental to this discussion are the principle of indifference, the principle of maximum ignorance and the principle of transformation groups. I explore some proposed solutions, with particular focus on the work of E. T. Jaynes, using Bertrand’s Paradox and other similar problems to bring out some of the central issues in probability theory.

In the physical world, we can categorise systems as deterministic, chaotic or random. With deterministic systems, provided that we can measure the initial conditions with sufficient accuracy, we can predict the evolution of the system at any point in the future. Chaotic systems, though believed to follow deterministic physical laws, are particularly sensitive to the values of their initial conditions: whilst we might be able to predict the evolution of the system quite reliably in the short term, over longer time periods the effects of errors in the measurement of the initial conditions accumulate and give rise to unpredictable behaviour. We use these types of systems as sources of chance because they can be effectively random, such as the flip of a coin, the roll of a die or future weather patterns. However, in truly random systems no amount of accuracy in the measurement of initial conditions equips us any better to forecast the state of the system even at the very next instant in time.

We apply the concept of chance to situations where we lack enough information to say for certain what the outcome will be, irrespective of which of the three categories the system falls into. When we have absolutely no information then, as far as we are aware, we are dealing with the third type of system, and our best guess is to attribute equal likelihood to all possible outcomes. For this reason we use the uniform distribution, or more generally the principle of indifference, as the starting point in such analyses. The principle of indifference, sometimes called the principle of insufficient reason, is a very old rule of thumb that has been in use since the earliest writers on probability, such as Jacob Bernoulli and Pierre Laplace. It states that if you have n options which are indistinguishable except by name then you assign equal probabilities of 1/n to each of them. The principle has had success in both abstract and applied mathematics, for example with James Clerk Maxwell's predictions of the behaviour of gases. Unfortunately there are cases where it seems to lead to incorrect results.

Even in deterministic physical systems we are quite prepared to see a small amount of variation in the results of experiments. We do not expect to see variation in the results of purely mathematical investigations. Inconsistency in the answers to mathematical questions strikes us as contradictory, and hence paradoxical, because we imagine these rigorous formal systems to preclude the types of subtle variation that lead to the spread of physical results. However, there are numerous examples in probability in which perfectly valid alternative methods for the same problem give rise to contradictory results.

Bertrand’s Paradox

‘Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?’ Bertrand presents three arguments to answer this problem: The ‘random endpoints method’, the ‘random radius method’, and the ‘random midpoint method’, all three of which seem valid yet yield different numerical values.

The random endpoints method entails choosing two random points on the circumference and joining them to form a chord of the circle. Now, the triangle can be rotated so that one of these points coincides with one of its vertices. As visible in figure 1, a chord that goes through the angle at that vertex is longer than a side length whilst a chord that doesn’t is shorter. Since said angle is necessarily 60°, the probability of the chord exceeding the sides in length is 60/180, or more simply 1/3. Notice that this is equivalent to picking an angle in the range 0°–180° that the chord will form with a tangent.

The random radius method requires a radius of the circle to be drawn and the triangle rotated such that the radius is perpendicular to one of the sides of the triangle. Next, a random chord parallel to that side is drawn through the radius. From figure 2 we can see that such a chord is longer than the side if the point where it meets the radius is closer to the centre than the point where the radius intersects the side. The side of the triangle bisects the radius exactly, so this corresponds to 1/2 of such chords.

Finally, the random midpoint method involves the following procedure: a randomly chosen point within the circle is selected and the chord that has said point as its midpoint is drawn through the circle. A concentric circle with 1/2 the radius should be drawn as shown in figure 3. If the chosen point lies within the smaller circle, then the chord drawn through it is longer than a side of the triangle. The area of the smaller circle is 1/4 that of the larger circle, which means that the probability of the chord drawn being longer than a side length of the triangle is 1/4.
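To see the disagreement concretely, the three selection rules can be simulated. The short Python sketch below is an illustration rather than part of Bertrand’s argument; it assumes a unit circle, in which the inscribed equilateral triangle has side √3.

import random
import math

# Monte Carlo sketch of Bertrand's three chord-selection rules in a unit circle.
N = 100_000
SIDE = math.sqrt(3)  # side length of the inscribed equilateral triangle

def chord_endpoints():
    # Method 1: two independent uniform points on the circumference.
    a, b = random.uniform(0, 2 * math.pi), random.uniform(0, 2 * math.pi)
    return 2 * math.sin(abs(a - b) / 2)

def chord_radius():
    # Method 2: uniform distance from the centre along a fixed radius.
    d = random.uniform(0, 1)
    return 2 * math.sqrt(1 - d * d)

def chord_midpoint():
    # Method 3: midpoint uniform over the disc (rejection sampling).
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return 2 * math.sqrt(1 - (x * x + y * y))

for name, sampler in [("endpoints", chord_endpoints),
                      ("radius", chord_radius),
                      ("midpoint", chord_midpoint)]:
    p = sum(sampler() > SIDE for _ in range(N)) / N
    print(f"{name:9s}: P(chord > side) ≈ {p:.3f}")
# Typical output: ≈ 0.333, ≈ 0.500, ≈ 0.250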

Furthermore, a similar problem can be found in ‘Fifty Challenging Problems In Probability’ (Frederick Mosteller, 1965), in which the question asks ‘if a chord is randomly drawn in a circle, what is the probability that its length will exceed the radius?’. This particular problem also has three solutions.

The first method: ‘Assume that the distance of the chord from the centre of the circle is evenly (uniformly) distributed from 0 to r. Since a regular hexagon of side r can be inscribed in a circle, to get the probability, merely find the distance d from the center and divide by the radius. Note that this is the altitude of an equilateral triangle of side r. Therefore from plane geometry we get d = √(r² − r²/4) = (√3/2)r, and consequently the desired probability is approximately 0.866.’

The second method: ‘Assume that the midpoint of the chord is evenly distributed over the interior of the circle. Consulting the figure again, we see that the chord is longer than the radius when the midpoint of the chord is within d of the centre. Thus all points in the circle of radius d, concentric with the original circle, can serve as midpoints of the chord. Their fraction, relative to the area of the original circle, is πd²/πr² = d²/r² = 3/4 = 0.75. This probability is the square of the result we got in method 1.’

The third method: ‘Assume that the chord is determined by two points chosen so that their positions are independently evenly distributed over the circumference of the original circle. Suppose that the first point falls at A in this figure. Then, for the chord to be shorter than the radius, the second point must fall on the arc ABC, whose length is 1/3 of the circumference. Consequently, the probability that the chord is longer than the radius is 1 − 1/3 = 2/3.’
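These three figures can be checked directly. The lines below are my own check for a unit-radius circle, not anything taken from Mosteller’s book.

import math

# Exact probabilities for "chord longer than the radius" in a unit circle.
# Method 1: distance from the centre ~ U(0, 1); chord > 1 iff distance < sqrt(3)/2.
p1 = math.sqrt(3) / 2
# Method 2: midpoint uniform over the disc; chord > 1 iff the midpoint lies
# inside the concentric circle of radius sqrt(3)/2.
p2 = (math.sqrt(3) / 2) ** 2
# Method 3: endpoints uniform on the circumference; chord > 1 iff the arc
# between them exceeds 60 degrees.
p3 = 1 - 1 / 3

print(round(p1, 3), round(p2, 3), round(p3, 3))   # 0.866 0.75 0.667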

The Issue

There are many problems that can give multiple contradictory answers, particularly in geometric probability and the application of probability to physical situations. In some famous problems often only one method is cited when in fact similar ambiguities could arise if various assumptions are made differently. In the problem of Buffon’s Needle, as well as question A6 on the 1992 paper of the notoriously difficult Putnam examination, we must be given extra information as to how the random variable is selected in order for there to be a unique solution. For instance, in the verbatim statement of the Putnam problem – ‘Four points are chosen independently and at random on the surface of a sphere (using the uniform distribution). What is the probability that the center of the sphere lies inside the resulting tetrahedron?’ – we are told where to apply the uniform distribution and are not given the opportunity to explore other methods for choosing random points which may affect the answer.

As previously stated, there is a broad consensus to use the uniform distribution (U) in cases where a preference for any particular outcome does not exist – hence indicating randomness. The problem arises when there are different options for which random variable this is to be applied to. In Bertrand’s Paradox, method 1 assigns a uniform distribution to the angle between the chord and the tangent, so we can say θ ~ U(0°, 180°). Method 2 assigns the uniform distribution to the chord midpoints along a radius, i.e. L ~ U(0, r), and method 3 assigns the uniform distribution to midpoints chosen from an area. The fact that these three random variables and their associated sample spaces – angle, length and area – are of different types is the key to understanding the ambiguity. If one of these random variables is chosen to be uniformly distributed it is often the case that the others necessarily can’t be.

This is made clearer in the case of another problem. We are asked to draw a random cube with side length L somewhere between 3 and 5 cm (so that the surface area is between 54 cm² and 150 cm² and the volume is between 27 cm³ and 125 cm³). Most people would assume that there is a uniform distribution over the side length itself, L ~ U(3, 5). For a uniformly distributed random variable X ~ U(a, b) the average or expected value is E(X) = (a + b)/2, which in this case would give a side length of 4 cm. Based on this the average surface area would be 96 cm² and the average volume would be 64 cm³. However, if we instead assume a uniform distribution over the surface area, such that A ~ U(54, 150), the expected value will be 102 cm², but this corresponds to an average side length of 4.123 cm. The problem worsens if instead we assume a uniform distribution over the volume, V ~ U(27, 125), which yields an expected volume of 76 cm³, giving an average side length of 4.236 cm. Even though length, surface area and volume are functions of each other, and so as random variables are wholly dependent, this does not mean that they have the same distribution. In fact, if one of them is uniformly distributed, the other two necessarily cannot be. This should highlight how variations in solutions can arise due to subtle differences in the application of the uniform distribution, and therefore in the interpretation of randomness. But how can we know which random variable the uniform distribution should be applied to?
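Before moving on, a few lines of Python make the arithmetic of the cube example explicit. This is just a sketch of the calculation described above.

# The cube example: length, area and volume cannot all be uniform at once,
# so the "average cube" depends on which variable we treat as uniform.
a, b = 3.0, 5.0                      # side length range in cm

mean_L = (a + b) / 2                 # L ~ U(3, 5):   E(L) = 4 cm
mean_A = (6 * a**2 + 6 * b**2) / 2   # A ~ U(54, 150): E(A) = 102 cm^2
mean_V = (a**3 + b**3) / 2           # V ~ U(27, 125): E(V) = 76 cm^3

print(mean_L)                        # 4.0 cm
print(mean_A, (mean_A / 6) ** 0.5)   # 102.0 cm^2 -> average side 4.123 cm
print(mean_V, mean_V ** (1 / 3))     # 76.0 cm^3  -> average side 4.236 cm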

How Do We Choose?

One proposed solution to this issue is to take the average of all the different numerical solutions obtained by the different methods. This is called a meta-average or universal average, which in the case of Bertrand’s paradox would give (1/2 + 1/3 + 1/4)/3 = 13/36 ≈ 0.361. This seems simple enough but actually leads to even more questions. Is it problematic that the meta-average is different to any of the individual solutions? What if there are infinitely many solutions, so that we can’t find an average? Can we assume that our alternative methods are to contribute equally to the meta-average, or should we use a weighted average? Even in this case, where we have three unique values, should we be so hasty to give them equal weighting? Casually assigning uniform distributions is the oversight that led us into this quandary, and applying it here, at the level of alternative methods rather than alternative outcomes of random variables, may be particularly hard to justify. And we are no better off if specific solutions arise more frequently than others; should their weighting in the meta-average be increased? It would seem unnatural to count a solution equally if it is more prevalent; however, increasing its weighting suggests that we think it is ‘more correct’, which is surely a problematic concept in mathematics in and of itself. In any case, the fatal flaw of the meta-average is that it is always susceptible to revision unless we can somehow prove that our methods form an exhaustive set. Can we ever be certain that there is no risk of a novel method being found that yields yet another different solution, thus changing the meta-average?

Another solution was proposed by Edwin T. Jaynes in his 1973 paper “The Well-Posed Problem”. He says the issue arises because “we have not been reading out all that is implied by the statement of the problem; the things left unspecified must be taken into account just as carefully as the ones that are specified”. He demonstrates how this idea can help us choose between the three possible answers to Bertrand’s paradox, as well as place restrictions on which methods can be applied to similar problems. The key is that the three methods discussed by Bertrand give rise to different probability distributions of chord lengths (or distributions of positions of midpoints, since a chord is uniquely defined by its midpoint). It seems obvious that we need to know “which probability distribution describes our state of knowledge when the only information available is that given in the statement of the problem”, but Jaynes’ genius insight is to realise that “if we start with the assumption that Bertrand’s problem has a definite solution in spite of the many things left unspecified, then the statement of the problem automatically implies certain invariance properties, which in no way depend on our intuitive judgments.”

Basically, because the problem does not state the orientation, size or position of the circle, the correct method must be general enough to give the same answer if these parameters are altered – we need to be able to arrive at the same probability regardless of rotations, enlargements or changes of position. We say the solution must be rotation invariant, scale invariant and translation invariant, as these are left unspecified in the statement of the problem. Jaynes formulates these ideas mathematically and demonstrates that the probability distribution of chords must be of a certain form in order to meet these three criteria, which is enough to eliminate two of the three possible solutions. His method does not necessarily show us which distribution is correct, but allows us to eliminate distributions that are definitely wrong (because they violate the indifference criteria).

We can also demonstrate this visually by graphing the distributions of chords and their midpoints when generated “randomly” according to each of the three methods. After adjusting for the fact that the centre of the circle is the only midpoint that does not uniquely define a chord (an infinite number of diameters share this midpoint), we see that method 2 is the only one that is both scale invariant and translation invariant: the distribution looks the same if we change the size or location of the circle. Method 3 is only scale invariant and method 1 is neither. This confirms the result from Jaynes’ calculations – that 2 out of the 3 methods lead to chord distributions that are not invariant in the desired ways and therefore must be rejected. Fortunately, in this case this leaves only one candidate.

The idea that equivalent problems must yield the same result is what Jaynes calls the principle of maximum ignorance. To Jaynes, problems are equivalent if they only differ in respect of factors not mentioned in the statement of the problem, and we cannot allow the variation of factors not specified in the problem to alter our result – our result needs to be general enough to cover all the different situations the problem could be describing. Jaynes’ method for ensuring this via invariance is called the principle of transformation groups (which could be a good topic for a future article). However, his critics, namely Darrell Rowbottom, Nicholas Shackel and Diederik Aerts, note that whilst this method tries to ensure that the principle of indifference (i.e. the uniform distribution) is used correctly and applied to the right random variable, the principle of maximum ignorance is itself using the principle of indifference – not on the level of random variables, but on the level of equivalent problems. Jaynes is effectively using a principle to help ensure the correct use of that same principle. The worry is that any problems associated with applying the principle of indifference on the level of random variables might still persist when used on the higher level of equivalent problems.

Jaynes acknowledges this, saying that he agrees ‘with most other writers on probability theory that it is dangerous to apply this principle at the level of indifference between events, because our intuition is a very unreliable guide in such matters, as Bertrand’s paradox illustrates. However, the principle of indifference may, in [his] view, be applied legitimately at the more abstract level of indifference between problems; because that is a matter that is definitely determined by the statement of a problem, independently of our intuition.’ It does therefore seem a less risky application of the principle of indifference, but it is by no means foolproof – how can you be sure that you have included all the relevant invariances? The statement of a problem will always leave a great many things unstated, and whilst some of them have no bearing on the solution, for example the time of day or the strength of the dollar, there is always a risk that we have not accounted for all of the relevant factors. This bears similarity to the problems faced by scientists in trying to perform a “fair test”, when it is hard to guarantee that they are controlling (or keeping constant) all but the independent and dependent variables.

A second reassurance offered by Jaynes is that ‘on the one hand, one cannot deny the force of arguments which, by pointing to such things as Bertrand’s paradox, demonstrate the ambiguities and dangers in the principle of indifference. But on the other hand, it is equally undeniable that use of this principle has, over and over again, led to correct, nontrivial, and useful predictions. Thus it appears that while we cannot wholly accept the principle of indifference, we cannot wholly reject it either; to do so would be to cast out some of the most important and successful applications of probability theory.’ As previously mentioned, Maxwell was responsible for one of the first great triumphs of kinetic theory, in which he was able to predict various macroscopic properties of gases such as viscosity, thermal conductivity, diffusion rates etc. from information that seemed inadequate to determine these states uniquely. He ‘was able to predict all these quantities correctly by a “pure thought” probability analysis which amounted to recognizing the “equally possible” cases.’ Because Maxwell’s theory leads to testable predictions, the question of whether he applied the uniform distribution correctly does not belong to the realm of philosophy but to the realm of verifiable fact. Whilst this is offered as a defence by Jaynes, it also demands that these successes be explained by Jaynes’ new principles: new theories must be able to explain the past successes of old theories. To this end Jaynes suggests

“that the cases in which the principle of indifference has been applied successfully in the past are just the ones in which the solution can be “reverbalized” so that the actual calculations used are seen as an application of indifference between problems, rather than events.”

This reverbalising is a most important “higher level problem”, that of how to form questions that are “well-posed”. If we had criteria for this then we should be able to distinguish which types of problems are susceptible to Jaynes’ principles and methods. So Bertrand’s paradox has led us to a much deeper problem: how to “well-pose” problems and how to determine which types of problems are well-posable. This is what I believe Jaynes means by the “Well-Posed Problem”. If this is solved then, via Jaynes’ principles, we may have a method that, whilst not guaranteed to produce the precise probability distributions, at least provides a way of finding distributions that are not definitely wrong, which is a step away from this century-old paradox.

Bibliography:

Bertrand, J., Calcul des Probabilités.
Jaynes, E. T., ‘The Well-Posed Problem’.
Stewart, I., Does God Play Dice?
Rowbottom, D. P., ‘Bertrand’s Paradox Revisited: Why Bertrand’s “Solutions” Are All Inapplicable’.
Mosteller, F., Fifty Challenging Problems in Probability.
Aerts, D., ‘Solving the Hard Problem of Bertrand’s Paradox’.
https://en.wikipedia.org/wiki/Bertrand_paradox_(probability)#Further_reading
https://en.wikipedia.org/wiki/Principle_of_indifference

Hawking’s Tea by Mr Ottewill

Stephen Hawking, who died earlier this year, is rightly judged to have been one of the greatest theoretical physicists of the twentieth century, if not of all time. He came up with pioneering ideas in cosmology, not least on fundamental questions such as the start of the universe, as well as making great strides in combining general relativity with quantum mechanics.

A biography of Hawking called ‘Stephen Hawking – A Life in Science’, by Michael White and John Gribbin, relates the following anecdote from a fellow student at his high school in St Albans:

One particular example of Stephen’s highly developed insight left a lasting impression on John McClenahan. During a sixth-form physics lesson, the teacher posed the question, “If you have a cup of tea, and you want it with milk and it’s far too hot, does it get to a drinkable temperature quicker if you put the milk in as you pour the tea, or should you allow the tea to cool down before adding the milk?” While his contemporaries were struggling with a muddle of concepts to argue the point, Stephen went straight to the heart of the matter and almost instantly announced the correct answer: “Ah! Milk in first, of course,” and then went on to give a thorough explanation of his reasoning: because a hot liquid cools more quickly than a cool one, it pays to put the milk in first, so that the liquid cools more slowly.

This article considers some of the ‘muddle of concepts’ that his contemporaries might have been working through, i.e. what might be supposed to be the ‘standard’ way to answer the question. In doing this we uncover a potential counter-argument to Hawking’s insight.

The main physics concepts needed to tackle the problem are:

(1) The formula for the temperature of a mixture formed by combining two substances with different temperatures is:

θ_mix = (m₁c₁θ₁ + m₂c₂θ₂) / (m₁c₁ + m₂c₂)

where m₁, m₂ = the initial masses of the substances
c₁, c₂ = the specific heats of the substances*
θ₁, θ₂ = the initial temperatures of the substances

* Note: the specific heat of a substance is the amount of heat per unit mass required to raise the temperature of the substance by one degree Celsius, or in other words how difficult it is to ‘heat up’ the substance.

(2) We also need to use Newton’s law of cooling which states that:

dθ/dt = −λ(θ − θₛ)

where θ = the temperature of the substance
t = the time since the start of the cooling process
λ = a constant of proportionality
θₛ = the temperature of the surrounding air

It can be shown using A-level techniques that the solution to this differential equation is:

θ = θₛ + (θ₀ − θₛ)e^(−λt)

where θ₀ = the temperature of the substance at the start time.

Equipped with these formulae we can tackle the problem set by Hawking’s teacher as follows.

We first need to estimate some rough masses and temperatures for the tea, milk and surroundings, as well as a value for λ.

We might estimate a typical cup of water to be 250 ml and the amount of milk to be 10 ml, which, given that the density of both water and milk are close to 1 g/ml, means that the mass of the initial tea before being mixed with milk might be estimated at 250 g and that of the milk as 10 g.

We might estimate the starting temperature for the tea without milk as a little below 100 ºC, perhaps 97 ºC, that of the milk as being 3 ºC, and the surrounding air in a typical room being 20 ºC.

We can quote from standard sources that the specific heat of water is roughly 4190 J / kg °C and that of milk is roughly 3930 J / kg °C.

Estimating a value of λ from first principles turns out to be quite hard. An interesting section of a publication by the Open University titled ‘Introduction to Mathematical Modelling’ shows how hard it can be, using a cup of tea as an example. It first estimates λ by considering heat loss through just the circular surface of the tea in the cup, then by including loss through the side of the cup too, in both cases coming up with wildly inaccurate figures which would indicate that the cup would take hours to cool (something the publication admits openly). A better way to estimate λ is by working backwards from the time taken by a typical cup of tea-like substance to cool, e.g. supposing that it takes roughly 10 minutes for a fluid close to 100 ºC to reach a drinkable temperature of around 60 ºC. Substituting these values into the second formula in (2) indicates that a sensible value of λ might be 0.001, with time measured in seconds.

We now have enough information to answer the question posed by Hawking’s physics teacher. If the tea starts at 97 ºC and we add the milk immediately then by the formula in (1) this brings the temperature down to 93.59 ºC. If we let this cool for another ten minutes then the formula in (2) shows that the temperature reduces to 60.39 ºC. On the other hand, if we first let the tea cool for ten minutes without milk this brings its temperature down to 62.26 ºC, and if we now add the milk at 3 ºC then this brings the tea to 60.11 ºC, which is less than with the milk added at the start, as predicted by Hawking. This all seems to vindicate Hawking’s answer, along with his reasoning, i.e. the rate of cooling is greater when the temperature is higher, and hence it ends up slightly cooler if the milk is added later.

As a final note, however, I believe that there is a slight problem with the above, namely that it assumes that when the milk is added at the later time, the milk is still at 3 ºC. Assuming that the milk has been standing around during the ten minutes that the tea has been cooling, then using a value for λ of 0.001 we find that the milk will have warmed to 10.67 ºC. If we let the tea cool and then add the milk at this temperature then we find that the resulting temperature is 60.39 ºC – exactly the same temperature found when the milk was added at the start. Of course, I am not saying that Hawking wouldn’t have realised all of this, even as an A-level student, just that there is actually quite an important condition missing from the original anecdote, namely that the milk is added at the same temperature whether added early or later on. While this might seem a sensible assumption, it potentially makes the conclusion even more obvious and is possibly another hint at how Hawking came to see the answer so quickly. If the milk is kept in the fridge while the tea is cooling and then added at the end then this will of course cause a lower final temperature, in just the same way as if any part of the final tea mixture were kept in the fridge during the ten minutes.
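The figures quoted above can be reproduced with a short Python sketch, using the same estimates (λ = 0.001 per second, ten minutes = 600 seconds, and the masses, specific heats and temperatures given earlier).

import math

m_tea, c_tea, T_tea = 0.250, 4190, 97.0      # kg, J/(kg C), C
m_milk, c_milk, T_milk = 0.010, 3930, 3.0
T_room, lam, t = 20.0, 0.001, 600            # room temp, lambda per second, 10 minutes

def mix(m1, c1, T1, m2, c2, T2):
    # Temperature of a mixture of two substances, formula (1).
    return (m1 * c1 * T1 + m2 * c2 * T2) / (m1 * c1 + m2 * c2)

def cool(T0, t):
    # Newton's law of cooling, formula (2), after t seconds.
    return T_room + (T0 - T_room) * math.exp(-lam * t)

# Milk in first, then cool for ten minutes:
milk_first = cool(mix(m_tea, c_tea, T_tea, m_milk, c_milk, T_milk), t)
# Cool the tea for ten minutes, then add milk straight from the fridge (3 C):
milk_later_cold = mix(m_tea, c_tea, cool(T_tea, t), m_milk, c_milk, T_milk)
# Cool the tea for ten minutes while the milk warms up on the side:
milk_later_warm = mix(m_tea, c_tea, cool(T_tea, t), m_milk, c_milk, cool(T_milk, t))

print(round(milk_first, 2), round(milk_later_cold, 2), round(milk_later_warm, 2))
# ≈ 60.39, 60.12, 60.39 (small differences from the text are just rounding)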

Chinese Remainder Theorem by Hin Chi Lee (12JLG)

The Chinese Remainder Theorem allows the solving of a system of simultaneous linear congruences involving modulo operations. It was first discovered by the Chinese mathematician Sunzi. The theorem is commonly used when dealing with large integers in computers, and sometimes serves calculations for astronomical purposes.

We denote “x divided by n leaves a remainder of a” as x ≡ a (mod n). This is known as the modulo operation. E.g. 9 divided by 4 leaves a remainder of 1, hence this identity can be denoted by 9 ≡ 1 (mod 4).

The theorem states that a set of simultaneous congruences

x ≡ a₁ (mod n₁)
x ≡ a₂ (mod n₂)
⋮
x ≡ aₖ (mod nₖ)

where the nᵢ are pairwise co-prime¹, for i = 1, 2, …, k, can be solved by the following:

x = a₁b₁M₁ + a₂b₂M₂ + ⋯ + aₖbₖMₖ

where M = n₁n₂…nₖ, Mᵢ = M/nᵢ, and each bᵢ satisfies bᵢMᵢ ≡ 1 (mod nᵢ).²

Proof of existence of solution

For any system of linear congruences

x ≡ a₁ (mod n₁)
x ≡ a₂ (mod n₂)
⋮
x ≡ aₖ (mod nₖ)

we define M = n₁n₂…nₖ and Mᵢ = M/nᵢ. Since the nᵢ are pairwise co-prime, we know that the greatest common divisor of Mᵢ and nᵢ is 1. Bezout’s identity states that “for any nonzero integers z₁ and z₂, of which the greatest common divisor is d, there exist integers u₁ and u₂ such that u₁z₁ + u₂z₂ = d”. In this case:

bᵢMᵢ + vᵢnᵢ = 1

where bᵢ and vᵢ are integers. A solution for x is then given by:

x ≡ a₁b₁M₁ + a₂b₂M₂ + ⋯ + aₖbₖMₖ (mod M)

Since Mⱼ is a multiple of nᵢ for all values of j ≠ i, we obtain for every i:

x ≡ aᵢbᵢMᵢ ≡ aᵢ(1 − vᵢnᵢ) ≡ aᵢ (mod nᵢ)³

¹ Pairwise co-prime: every pair of numbers within the set has 1 as its greatest common divisor.
² http://mathworld.wolfram.com/ChineseRemainderTheorem.html
³ https://pdfs.semanticscholar.org/19c5/08c974a0da5f7212f1b71a4b41f40ae33688.pdf

Why does the theorem work?

If we look at the following system:

x ≡ a₁ (mod n₁)
x ≡ a₂ (mod n₂)
⋮
x ≡ aₖ (mod nₖ)

we know that the solution is given by:

x ≡ a₁b₁M₁ + a₂b₂M₂ + ⋯ + aₖbₖMₖ (mod M)

Note that M = n₁n₂…nₖ and Mᵢ = M/nᵢ. Hence, reducing modulo n₁:

x ≡ a₁b₁M₁ (mod n₁)

since Mⱼ ≡ 0 (mod n₁) for every j ≠ 1. Therefore, in order to recover the original congruence x ≡ a₁ (mod n₁), the following must be true:

b₁M₁ ≡ 1 (mod n₁)

The rest of the congruences can be proved similarly.⁴ We have introduced the bᵢ term for the different congruences, and such values can be computed by Euclid’s Algorithm.
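The whole construction, including finding the bᵢ by Euclid’s Algorithm, fits in a few lines of Python. This is only a sketch; the example at the end is the classic puzzle attributed to Sunzi (a number leaving remainder 2 when counted in threes, 3 in fives and 2 in sevens).

def extended_gcd(a, b):
    # Euclid's algorithm, extended: returns (g, u, v) with u*a + v*b = g = gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

def chinese_remainder(remainders, moduli):
    # Solve x ≡ a_i (mod n_i) for pairwise co-prime moduli, as in the construction above.
    M = 1
    for n in moduli:
        M *= n
    x = 0
    for a, n in zip(remainders, moduli):
        Mi = M // n
        _, b, _ = extended_gcd(Mi, n)   # b satisfies b * Mi ≡ 1 (mod n)
        x += a * b * Mi
    return x % M

# Sunzi's puzzle: remainder 2 when counted in threes, 3 in fives, 2 in sevens.
print(chinese_remainder([2, 3, 2], [3, 5, 7]))   # 23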

Test yourself: Can you solve the following problem using the Chinese Remainder Theorem?

Comets 2P/Encke, 4P/Faye, and 8P/Tuttle have orbital periods of 3 years, 8 years, and 13 years, respectively. The last perihelions⁵ of each of these comets were in 2017, 2014, and 2008, respectively. Assuming that time is measured in whole numbers of years and that each orbital period is constant, what is the next year in which all three of these comets will achieve perihelion in the same year?⁶

⁴ http://www.math.tamu.edu/~jon.pitts/courses/2005c/470/supplements/chinese.pdf
⁵ Perihelion: the point at which the celestial body is closest to the Sun in its orbit.
⁶ Problem extracted from https://brilliant.org/wiki/chinese-remainder-theorem/

The Importance and Role of India in Maths by Saajid Khan (7W)

Some of the greatest mathematical breakthroughs, and some of the greatest mathematicians, have come from India: for example, work on π, the first ever usage of zero, trigonometry, linear and quadratic equations, and sine tables, among many, many more.

Chants from the early Vedic period (before 1000 BC) named numbers from tens to hundreds to trillions, and they also described operations such as addition, subtraction, multiplication, fractions, squares, cubes and roots. A 4th Century AD Sanskrit text describes the Buddha counting numbers up to 10⁵³. As there are about 10⁸⁰ atoms in the whole universe, this is as close to infinity as anyone in the world had come. The same text also gives an estimate of the size of an atom which comes incredibly close to the actual size of a carbon atom (about 70 trillionths of a metre).

As early as the 8th Century BC (long before Pythagoras) a text known as the ‘Sulba Sutras’ listed some simple Pythagorean triples. The Sulba Sutras also contain solutions of linear and quadratic equations (extremely complicated!) and give a remarkably accurate figure for the square root of 2, found by adding 1 + 1/3 + 1/(3 × 4) − 1/(3 × 4 × 34), which gives a value of 1.4142156, correct to 5 decimal places.

[Image: The evolution of Hindu-Arabic numerals.]

In the early 3rd or 2nd Century BC, Jain mathematicians (believers of Jainism) recognised five different types of infinity: infinite in one direction, in two directions, in area, infinite everywhere and continuously infinite. Ancient Buddhist literature also demonstrates an awareness of unknown and infinite numbers, with numbers said to be of three types: countable, uncountable and infinite.

Like the Chinese, the Indians discovered early on the benefits of a decimal place-value number system, and were using it before about the 3rd Century AD. They perfected the system, especially the written symbols for the numerals, and were responsible for the first recorded use of a circle character for the number zero, usually credited to a 9th Century AD engraving in a temple in Gwalior (Central India). The use of zero as a number which could be used in calculations and mathematical investigations would revolutionise mathematics.

[Image: The earliest use of a circle character for the number zero was in India.]

Brahmagupta (a famous Indian mathematician) recognised the basic mathematical rules for dealing with zero: 1 + 0 = 1, 1 − 0 = 1 and 1 × 0 = 0. He also established rules for dealing with negative numbers, and pointed out that quadratic equations (extremely complicated!) could in theory have two possible solutions, one of which could be negative. He even attempted to write down these concepts, using the initials of the names of colours to represent unknown numbers in his equations. That is what we now call algebra.

Indian mathematicians also made advances in the theory of trigonometry, a method of linking geometry and numbers first developed by the Greeks. They used it to survey the land around them and navigate the seas. For instance, Indian astronomers used trigonometry to calculate the relative distances between the Earth and the Moon and the Earth and the Sun. They realised that, when the Moon is half full and directly opposite the Sun, then the Sun, Moon and Earth form a right-angled triangle, and they were able to accurately measure the angle as 1/7°. Their sine tables (complex!) gave a ratio for the sides of such a triangle as 400:1, indicating that the Sun is 400 times further away from the Earth than the Moon.

[Image: Indian astronomers used trigonometry tables to estimate the relative distance of the Earth to the Sun and Moon.]

Next the Indian astronomers wanted to be able to calculate the sine function of any given angle. A text called the “Surya Siddhanta”, dating from around 400 AD, contains the start of modern trigonometry, including the first real use of sines, cosines, inverse sines, tangents and secants (also difficult!). As early as the 6th Century AD, the great Indian mathematician and astronomer Aryabhata produced definitions of sine, cosine, versine and inverse sine, and specified complete sine and versine tables to an accuracy of 4 decimal places. Aryabhata also demonstrated solutions to simultaneous quadratic equations, and produced an approximation for the value of π equivalent to 3.1416, correct to four decimal places. He then used this to estimate the circumference of the Earth as 24,835 miles, only 70 miles off its real value. But, perhaps even more surprisingly, he seems to have been aware that π is an irrational number, and that any calculation can only ever be an approximation.

Bhaskara II (who lived in the 12th Century) explained the operation of division by zero. He noticed that dividing one into two pieces makes a half, so 1 ÷ 1/2 = 2, and similarly 1 ÷ 1/3 = 3. Therefore, dividing 1 by smaller and smaller fractions makes a larger and larger number of pieces. Hence, dividing one into zero pieces would make infinity, so 1 ÷ 0 = ∞. Bhaskara II also made important contributions to many different areas of mathematics, from solutions of quadratic, cubic and quartic equations (including negative and irrational solutions) to solutions of Diophantine equations of the second order, to preliminary concepts of infinitesimal calculus and mathematical analysis, to spherical trigonometry and other aspects of trigonometry. Some of his findings predate similar discoveries in Europe by several centuries, and he made important contributions in terms of the systematisation of current knowledge and improved methods for known solutions.

The Kerala School of Astronomy and Mathematics was founded in the late 14th Century by Madhava of Sangamagrama, sometimes called ‘the greatest mathematician-astronomer of medieval India’. He developed infinite series approximations for a range of trigonometric functions, including series for π, sine and cosine. Some of his contributions to geometry and algebra, and his early forms of differentiation and integration for simple functions, may have been transmitted to Europe via Jesuit missionaries, and it is possible that the later European development of calculus was influenced by his work to some extent.

The Birthday Paradox by James Storey (8R)

At a party what are the chances that a pair of people share the same birthday? This is known as the Birthday Problem. It is a probability problem with a surprising solution!

We need to define the problem more clearly in order to get an answer. Let’s say that there was a party and you were the first person to arrive. Clearly, there would be a 100% chance that your birthday matches your birthday! So far, easy!

Now another person walks into the room, what is the probability of them sharing your birthday?

Well, let’s consider the chances of them not sharing your birthday. There are 365 days in a year (not including leap years), and so the other person has a 364/365 chance of not sharing your birthday. This is a 99.7% chance of them not sharing your birthday, which leaves a 0.3% chance that they will share your birthday.

Now, assuming that another person walks in, there is a 363/365 chance of them not sharing a birthday with the two of you in the room, provided that the two people already in the room do not share the same birthday. So we now have the probability of nobody sharing a birthday as

(365/365) × (364/365) × (363/365) × 100 = 99.179%

So the probability that they share a birthday with at least one person in the room is 100 – 99.179 = 0.821%.

Applying this as more people enter we get an interesting effect ... it only requires 23 people to enter the room before there is a 50% chance of two people sharing the same birthday.
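The same calculation, for a general number of people n, can be written as a short Python sketch:

def shared_birthday_probability(n):
    # Probability that at least two of n people share a birthday,
    # assuming 365 equally likely birthdays and no leap years.
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (365 - k) / 365
    return 1 - p_distinct

for n in (2, 3, 23, 50):
    print(n, round(100 * shared_birthday_probability(n), 2), "%")
# ≈ 0.27 %, 0.82 %, 50.73 % and 97.04 % respectively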

Music is Maths by Samuel Smith (12AC)

[Image: Don Caballero 2 (1995).]

Music and mathematics. A combination of two very different subjects, each appealing to a different kind of person. One is all kinds of fun, useful and interesting, pleasing to people of all ages; and the other is mathematics. Yet these two seemingly polar opposites slowly came together in a variety of different genres, from electronica through to rock.

What I have displayed as the image on this article is perhaps the best record to make use of the phrase ‘math rock’ (how long did it take for them to think of that one?). The sub-genre is usually characterised by unconventional time signatures, abrupt pauses in the middle of songs and usually a lot of noise. Don Caballero 2 shows off all of these in quite a flashy, often blinding manner. For example, the tracks which are sensibly named ‘No-One Gives A Hoot About FAUX-ASS Nonsense’, ‘please tokio, please THIS IS TOKIO’ and ‘Cold Knees (In April)’ (where the ‘noise’ begins at about two minutes in) show these different chaotic characteristics clashing together in 3/4, 16/8 and even 7/8.

Moving away from rock, and onto more electronic means, many would often recommend the record Geogaddi by Scottish duo ‘Boards of Canada’ if you wanted a particularly maths-themed work. This record is often dubbed ‘the most haunted album you will ever hear’ by most internet users, but for good reason: the album contains VHS static samples, eerie silent tracks and references to maths! No wonder people find this terrifying.

This album includes more technical mathematics than math rock does. The album starts with ‘Music Is Math’. Another track, ‘A is to B as B is to C’, is an allusion to the golden ratio: two quantities are in the golden ratio when the ratio of the larger to the smaller equals the ratio of their sum to the larger. The song ‘The Devil Is In The Details’ contains bass frequencies which are arranged in the order of the Fibonacci Sequence. The album contains 23 tracks, which is a reference to the ‘23 Enigma’, a belief that every event to ever occur is based around the number 23.

[Image: The Golden Ratio.]

Most interestingly, the song ‘The Smallest Weird Number’ introduces the ‘weird number’. What is the smallest weird number, you ask? According to mathematics, it is 70, which is unsurprisingly the name of their recording company. Weird numbers are more of an elaborate trick, or a way to pass the time, and to my knowledge do not include much of a technical side. Let us take 70. Its proper factors are 1, 2, 5, 7, 10, 14 and 35, which when added up make 74. However, what makes a number ‘weird’ is the fact that you cannot add any selection of these numbers together to make 70 itself. For example:

5 + 7 + 10 + 14 + 35 = 71
1 + 2 + 7 + 10 + 14 + 35 = 69
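If you want to check this for yourself, the short Python sketch below lists the proper divisors of 70 and confirms that no selection of them adds up to exactly 70, and that 70 is the only number below 100 with this property.

from itertools import combinations

def proper_divisors(n):
    return [d for d in range(1, n) if n % d == 0]

def is_weird(n):
    # Weird: the proper divisors sum to more than n (abundant),
    # yet no subset of them sums to exactly n (not semiperfect).
    divs = proper_divisors(n)
    if sum(divs) <= n:
        return False
    return not any(sum(c) == n
                   for r in range(1, len(divs) + 1)
                   for c in combinations(divs, r))

print(proper_divisors(70), sum(proper_divisors(70)))   # [1, 2, 5, 7, 10, 14, 35] 74
print([n for n in range(2, 100) if is_weird(n)])       # [70]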

The duo has another song called ‘Sixtyten’, which is a perfectly normal alternative for the word ‘seventy’ but strangely not accepted on GCSE Maths papers for some reason.

All in all, this can hopefully show you that mathematics can go beyond the usual differentiation exercise and can be extended to more creative heights. Who knows, maybe one day you will be allowed to listen to music in maths lessons thanks to this article.

Prime Numbers by Jeremy Sanmoogan (7R)

Prime numbers are whole numbers greater than 1 that can only be divided exactly by themselves or one. You probably knew that, but here are some interesting facts that you may not have known:

1. It is conjectured (Goldbach’s conjecture) that every even number greater than 2 can be made by adding together two prime numbers.

2. 37 is the 12th prime number while 73 is the 21st prime number, with 37 to 73 and 12 to 21 both being reverses. (This is called an emirp, which is prime spelt backwards.)

3. All prime numbers larger than 3 are of the form 6n ± 1, e.g. 47 = 6 × 8 − 1 or 79 = 6 × 13 + 1.

4. For any number bigger than 1, there will always be a prime number between it and its double.

5. 73,939,133 is a prime number which stays prime when the end digit is repeatedly removed: 7,393,913; 739,391; 73,939; 7,393; 739; 73 and 7 are all prime. (You can check this with the short program below.)
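Here is a short Python sketch, using nothing more than trial division, that checks fact 5:

def is_prime(n):
    # Trial division up to the square root of n.
    if n < 2:
        return False
    f = 2
    while f * f <= n:
        if n % f == 0:
            return False
        f += 1
    return True

n = 73939133
while n:
    print(n, is_prime(n))   # every truncation prints True
    n //= 10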

Hopefully you learnt something reading this article on interesting facts about prime numbers.


The Mechanics of Snowboarding by Toby Evans (10P)

Whilst watching the winter Olympics in Pyeongchang, I was intrigued by both the sports, and the mechanics behind them, particularly the half-pipe snowboarding. The snowboarders accelerate down a slightly angled slope, climb the pipe and perform jumps and tricks. Shaun White won an amazing gold whilst Yuto Totsuka crashed out since he failed to complete his trick before he landed. One more second in the air could have made all the difference, so I decided to investigate the circumstances which would optimise performance.

A successful halfpipe jump optimises air time and maximises height; my research therefore investigates the mechanics which facilitate this, illustrated below.

[Diagram: cross-section of the halfpipe. The snowboarder enters at A, reaches the base at B with velocity V_b, climbs the wall through a height gain Δh, and takes off at C with velocity V_o at a trajectory angle α.]

Shaun enters the half pipe at A and gathers speed whilst descending at an angle β. At point B, he climbs the pipe at an angle θ, increasing vertical height by Δh. The angle of take-off at point C is α. To get the gold, Shaun needs to perform the maximum number of tricks whilst airborne and attain a maximum height. So, I want to find out the values of the variables which will maximise both time in the air, t, and the maximum height reached, Hmax. Assuming zero air resistance and minimal friction, the angle of the climb, θ, does not affect the loss of velocity between V_b and V_o, but it does affect the trajectory angle, α, which determines the maximum height reached by the snowboarder; therefore θ is significant. From the diagram above, α + β + θ = 90°, hence α = 90° − β − θ. I will investigate the impact of α and V_o on t and Hmax.

[Diagram: the take-off at point C, showing the take-off velocity V_o, its vertical component V_v and the trajectory angle α.]

The diagram above illustrates the trajectory from the top of the half pipe, at point C.

Accordingly, I set up the problem as follows:

Superpipes are standard for all major competitions and have walls built of snow which are 6.7 metres high, therefore Δh = 6.7 m; the pitch angle is typically 18°, therefore β = 18°; and the gravitational field strength is g = 9.8 m/s². We are also assuming that Shaun’s mass is 75 kg. The law of conservation of energy states that energy can neither be created nor destroyed; rather, it can only be transformed from one form to another. Therefore, as Shaun gains height, he gains gravitational potential energy but loses kinetic energy, hence:

(1/2)mV_b² = (1/2)mV_o² + mgΔh

Therefore, in terms of V_b, we can find V_o as:

V_o = √(V_b² − 2gΔh)

The optimal take-off velocity, V_o, is driven by the factors that the judges consider in their scores: height reached, technical difficulty, performance and overall control. The objective is not, as in slalom, to achieve a maximum initial speed, but rather one which optimises the total score. In Pyeongchang, Shaun White, attempting to beat Hirano, came out with energy and speed, but fell midway through to gain only 55 points. His more controlled final run, however, earned 97.75 points and the gold medal. V_b is therefore assumed to be in the range of 15-20 m/s. From the chart below, we see that initial speed is almost directly proportional to time airborne, and also to maximum height achieved. The equations which I input into Excel and their derivation are lengthy, and therefore they are not included in my main essay but added in the appendix. I then looked at the values of Hmax and t while changing V_b from 15 to 20 m/s, as shown in the graphs below.

[Graph: maximum height reached versus velocity at the base of the pipe (15-20 m/s).]

[Graph: airborne time versus velocity at the base of the pipe (15-20 m/s).]

As velocity increases, so do both Hmax and t, indicating that they are directly proportional; therefore I set V_b to 20 m/s, the speed that enables tricks without impeding control. This leaves three unknown variables: α, Hmax and t. I then investigated the impact of α on Hmax and t, the two variables which would affect the judges’ score. (The graphs are only plotted up to α = 72° since, beyond that point, θ would be negative.)


[Graph: maximum height reached versus alpha (0-80°).]

As alpha increases, both airborne time and maximum height reached increase; however, airborne time does peak before starting to fall again, as shown in the graph below:

[Graph: airborne time versus alpha (0-80°).]
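The model behind these graphs can also be written out as a short Python sketch rather than an Excel sheet. It follows the same assumptions as above (no air resistance or friction, Δh = 6.7 m, g = 9.8 m/s²), plus one assumption of my own: that the airborne time is measured until the rider returns to the height of the lip.

import math

g, dh = 9.8, 6.7   # gravitational field strength (m/s^2) and height gained in the climb (m)

def takeoff_speed(v_b):
    # Conservation of energy: 0.5*v_b^2 = 0.5*v_o^2 + g*dh, so v_o = sqrt(v_b^2 - 2*g*dh).
    return math.sqrt(v_b**2 - 2 * g * dh)

def flight(v_b, alpha_deg):
    # Airborne time and maximum height above the lip for trajectory angle alpha,
    # assuming (my assumption) the rider lands back at the height of the lip.
    v_o = takeoff_speed(v_b)
    v_v = v_o * math.sin(math.radians(alpha_deg))   # vertical component at take-off
    t = 2 * v_v / g
    h_max = v_v**2 / (2 * g)
    return t, h_max

for alpha in (30, 50, 72):
    t, h = flight(20, alpha)                        # V_b = 20 m/s, as assumed above
    print(f"alpha = {alpha:2d} deg:  t = {t:.2f} s,  Hmax = {h:.2f} m")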

