16214-SGW-Sixth Form Journal 2023-HI Res

STEM

The following graph illustrates the probability of each digit leading a number. The leading digit in a set of data is far more likely to be one than any other digit, and the likelihood decreases logarithmically as the digit increases. These values are calculated using the formula below, where d is any digit from one to nine. A logarithmic function is written as ‘log’.

$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$

To understand why this works, it is vital to understand that if $10^x = 100$, then $\log_{10}(100) = x$. With this in mind, one can observe the following logs: $\log_{10}(2) = 0.301$ and $\log_{10}(3) = 0.477$, which can then be turned into percentages: 30.1% and 47.7%. This is relevant because 30.1% is the percentage given in the graph above for the digit 1, and 47.7% is the sum of 30.1% and 17.6% (the percentage given for the digit 2). The pattern repeats: because $\log_{10}(1 + 1/d) = \log_{10}(d + 1) - \log_{10}(d)$, the percentages for the digits 1 to n always sum to $\log_{10}(n + 1)$.

However, there are certain times when this law cannot apply. A set of data where the numbers are assigned (not naturally occurring) will not follow Benford’s Law. This stipulation is significant because, when combined with certain Microsoft Excel functions, a simple spreadsheet can turn into a fraud detection mechanism: enter data into a table, use a COUNTIF function to count how often each leading digit occurs, and compare the result with the distribution laid out by Benford’s Law. If the distribution does not reflect Benford’s Law, there is evidence to suggest that assigned or fraudulent data has been entered.

Indeed, Benford’s Law was used to reveal fraudulent macroeconomic data submitted by Greece in connection with its 2001 eurozone application. The eurozone comprises 19 nations (including Greece) which use the euro as their currency (before 2001, Greece used the Greek drachma). According to Ben Goldacre, Greece had submitted economic data which deviated very far from the distribution set out by Benford’s Law. Greece inventing this data was problematic: it claimed to have a lower government deficit, and therefore greater capacity to finance debt repayments, than was actually the case. When the Financial Crisis hit in 2008, Greece was much closer to defaulting on its debt than had previously been anticipated. When Greece did eventually default, the European Central Bank had to ‘bail out’ the Greek economy. This caused a sudden devaluation of the euro, which affected other eurozone countries’ ability to finance debt repayments, notably Ireland, Italy, Spain and Portugal. Given how profound the eurozone crisis was, and how easily it could have been prevented, it would seem sensible to suggest regular use of Benford’s Law to determine the legitimacy of significant data.
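As a rough illustration of the formula and the spreadsheet check described in this article, the following Python sketch computes the Benford percentages and compares them with the leading digits of a data set. It is only a minimal sketch: the function names are hypothetical, the `Counter` stands in for Excel's COUNTIF, and the sample data (the first 100 powers of two, which are known to follow the law closely) is an assumption chosen for illustration, not data from the article.

```python
import math
from collections import Counter


def benford_probability(d: int) -> float:
    """Probability that digit d (1 to 9) leads a number: log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)


def leading_digit(value) -> int:
    """First non-zero digit of a non-zero number, ignoring its sign."""
    # Strip leading zeros and the decimal point, e.g. '0.042' -> '42'.
    return int(str(abs(value)).lstrip("0.")[0])


def leading_digit_shares(values) -> dict:
    """Observed share of each leading digit (the COUNTIF step)."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


def max_deviation(values) -> float:
    """Largest gap between the observed and the expected share of any digit."""
    observed = leading_digit_shares(values)
    return max(abs(observed[d] - benford_probability(d)) for d in range(1, 10))


if __name__ == "__main__":
    # Illustrative data only: the first 100 powers of two.
    sample = [2 ** n for n in range(100)]
    observed = leading_digit_shares(sample)
    for d in range(1, 10):
        print(f"{d}: expected {benford_probability(d):.1%}, observed {observed[d]:.1%}")
    print(f"largest deviation: {max_deviation(sample):.3f}")
```

A large deviation does not prove fraud on its own; as the article notes, it is only evidence that the data may have been assigned rather than naturally occurring, and how large a gap should count as suspicious is a judgement this sketch leaves open.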
