# THE FIRST-DIGIT PHENOMENON

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 THE FIRST-DIGIT PHENOMENON T. P. Hill A century-old observation concerning an unexpected pattern in the first digits of many tables of numerical data has recently been discovered to also hold for the stock market, census statistics, and accounting figures. New mathematical insights establish this empirical law as part of modern probability theory, and recent applications include testing of mathematical models, design of computers, and detection of fraud in tax returns. In 1881, the astronomer/mathematician Simon Newcomb published a 2-page article in the American Journal of Mathematics describing his observation that books of logarithms in the library were dirtier in the beginning and progressively cleaner throughout. From this he inferred that researchers using the logarithm tables (fellow astronomers and mathematicians, as well as biologists, sociologists and other scholars) were looking up numbers starting with 1 more often than numbers starting with 2, and numbers with first digit 2 more often than 3, and so on. After a short heuristic argument, Newcomb concluded that the probability that a number has first significant digit (i.e., first non-zero digit) d is ( Prob(first significant digit = d) =log ), d =1,2,...,9. d In particular, his conjecture was that the first digit is 1 about 30% of the time, and is 9 only about 4.6% of the time (Figure 1). That the digits are not equally likely comes as somewhat of a surprise, but to claim an exact law describing their distribution is indeed striking. Newcomb s article went unnoticed, and 57 years later General Electric physicist Frank Benford, apparently unaware of Newcomb s paper, made exactly the same observation about logarithm books and also concluded the same logarithm law. But then Benford tested his conjecture with an effort to collect data from as many as fields as possible 1

2 and to include a wide variety of types... the range of subjects studied and tabulated was as wide as time and energy permitted. Evidence indicates that Benford spent several years gathering data, and the table he published in 1938 in the Proceedings of the American Philosophical Society was based on 20,229 observations from such diverse data sets as areas of rivers, American League baseball statistics, atomic weights of elements, and numbers appearing in Reader s Digest articles. His resulting table of first significant digits (the number to the left of the decimal point in scientific notation; Figure 2) fit the logarithm law exceedingly well. Unlike Newcomb s article, Benford s paper drew a great deal of attention (partly due to the good fortune of having it published adjacent to a soon-to-be-famous physics paper), and Newcomb s contribution having become completely forgotten, the logarithm probability law came to be known as Benford s Law. (There is also a general significant-digit law which includes not only the first digits, but also the second (which may be 0) and all higher significant digits, and even the joint distribution of the digits. The general law says, for example, that the probability that the first three significant digits are 3, 1, and 4, in that order, is ( log ) = and similarly for other significant-digit patterns. From this general law it follows that the second significant digits, although decreasing in relative frequency through the digits as do the first-digits, are much more uniformly distributed than the first digits, and the third than the second, etc. The general law also implies that the significant digits are not independent as might be expected, but that instead knowledge of one digit affects the likelihood of another. For example, an easy calculation shows that the (unconditional) probability that the second digit is 2 is = 0.109, but the (conditional) probability that the second digit is 2 given that the first digit is 1, is = ) Of course, many tables of numerical data do not follow this logarithmic distribution lists of telephone numbers in a given region usually begin with the same few digits, and even neutral data such as square root tables are not a good fit. However, a surprisingly diverse collection of empirical data does obey the logarithm law for significant digits, and 2

3 since Benford s popularization of the law, an abundance of empirical evidence has appeared: tables of physical constants, numbers appearing in newspaper front pages, accounting data and scientific calculations. The assumption of logarithmically-distributed significant-digits (i.e., floating point numbers) in scientific calculations is widely used and well-established, and Stanford computer scientist Donald Knuth s classic text The Art of Computer Programmer devotes a section to the log law. More recently, analyst Eduardo Ley of Resources for the Future in Washington, DC has found that stock market figures (Dow-Jones Industrial Average index, and Standard and Poor s index) fit Benford s law closely, and accounting professor Mark Nigrini, whose PhD thesis was based on applications of Benford s law, discovered that in the 1990 U.S. Census, the populations of the three thousand counties in the U.S. are also a very good fit to Benford s law (Figure 1). The skeptical reader is encouraged to perform a simple experiment such as listing all the numbers occurring on the front pages of several local newspapers, or randomly selecting data from a Farmer s Alamack, as Knuth suggests. In the 60 years since Benford s article appeared there have been numerous attempts by mathematicians, physicists, statisticians and amateurs to prove Benford s law, but there have been two main stumbling blocks. The first is very simple some data sets satisfy the law, and some do not, and there never was a clear definition of a general statistical experiment which would predict which tables would, and which would not. Instead, researchers endeavored to prove that the log law is a built-in characteristic of our number system; that is, to prove that the set of all numbers satisfies the log law, and then to suggest that this somehow explains the frequent empirical evidence. Attempts at proofs were based on various mathematical averaging and integration techniques, as well as probabilistic drawing balls from urns schemes. One popular hypothesis in this context has been that of assuming scale invariance, which corresponds to the intuitively attractive idea that if there is indeed any universal law for significant digits, then it certainly should be independent of the units used (e.g., metric or English). It was observed empirically that tables which fit the log law closely also fit it closely if converted (by scale multiplication) to other units, or converted to reciprocal 3

4 units. For example, if stock prices closely follow Benford s law (as Ley found they do), then conversion from dollars per stock to pesos (or yen) per stock should not alter the first-digit frequencies much, even though the first digits of individual stock prices will often change radically (Figure 3a). Similarly, converting from dollars per stock to stocks per dollar in Benford tables will also retain nearly the same digital frequencies, whereas in stock tables not originally close to Benford s law (such as uniformly distributed prices, Figure 3b), changing currencies or converting to reciprocals will often dramatically alter the digital frequencies. Although there was some limited success in showing that Benford s law is the only set of digital frequencies which remains fixed under scale changes, the second stumbling block in making mathematical sense of the law is that none of the proofs were rigorous as far as the current theory of probability is concerned. Although both Newcomb and Benford phrased the question as a probabilistic one ( what is the probability that the first significant digit of a number is d? ), modern probability theory requires the intuitive countable additivity axiom that if a positive integer (not digit) is picked at random and p(1) is the probability that the number 1 is picked, p(23) the probability 23 is picked, etc., then p(1) + p(2) + p(3) + =1, whereas all proofs prior to 1995 failed to satisfy this basic axiom. One possible drawback to a hypothesis of scale-invariance in tables of universal constants is the special role played by the constant 1. For example, consider the two physical laws f = ma and e = mc 2. Both laws involve universal constants, but the force equation constant 1 is not recorded in tables, whereas the speed of light constant C is. If a complete list of universal physical constants also included the 1 s, it is quite possible that this special constant 1 will occur with strictly positive probability p. But if the table is scale-invariant, then multiplying by a conversion factor of 2 would mean that the constant 2 would also have this same positive probability p, and similarly for all other integers. However this would violate the countable additivity axiom, since then p(1) + p(2) + =p+p+p+ = infinity, not 1. Instead, suppose it is assumed that any reasonable universal significant-digit law 4

5 should be independent of the base, that is, should be equally valid when expressed in base 10, base 100, binary base 2, or any other base. (In fact, all of the previous arguments supporting Benford s law carry over mutatis mutandis to other bases, as many of the authors state.) In investigating this new base-invariance hypothesis, it was discovered that looking at equivalent significant-digit sets of numbers, rather than individual numbers themselves, eliminates the previous problems of countable additivity, and allows a formal rigorous proof that the log law is the only probability distribution which is scale-invariant, and the only one which is base-invariant (excluding the constant 1). (The formal base-invariance theorem in fact states that the only probability distributions on significant digits which are base-invariant are those in which the special constant 1 occurs with possibly positive probability, and the rest of the time the distribution is the log law. The generality of this result implies that any other property which is found to imply Benford s law is necessarily base-invariant, and hence a corollary of this theorem.) These two new results were clean mathematically, but they hardly helped explain the appearance of Benford s law empirically. What do 1990 census statistics have in common with 1880 users of logarithm tables, numerical data from front pages of newspapers of the 1930 s collected by Benford, or computer calculations observed by Knuth in the 1960 s? What do these tables have in common, and why should they be logarithmic, or equivalently, scale or base invariant? Many tables are not of this form, including even Benford s individual tables (as he noted), but as University of Rochester mathematician Ralph Raimi pointed out, what came closest of all, however, was the union of all his tables. Combine the molecular weight tables with baseball statistics and the areas of rivers, and then there is a good fit to Benford s law. Instead of thinking of some universal table of all possible constants (Raimi s stock of tabular data in the world s libraries or Knuth s some imagined set of real numbers ), what seems more natural is to think of data as coming from many different distributions, as in Benford s study, in collecting numerical data from newspapers, or in listing stock prices. Using this idea, modern mathematical probability theory, and the recent scale and 5

6 base-invariance proofs, it is not difficult to derive the following new statistical form of the significant-digit law. If distributions are selected at random (in any unbiased way) and random samples are taken from each of these distributions, then the significant-digit frequencies of the combined sample will converge to Benford s distribution (Figure 4), even though the individual distributions selected may not closely follow the law. For example, suppose you are collecting data from a newspaper, and the first article concerns lottery numbers (which are generally uniformly distributed), the second article concerns a particular population with a standard bell curve distribution, and the third is an update of latest calculations of atomic weights. None of these distributions has significant digit frequencies close to Benford s law, but their average does (Figure 5), and sampling randomly from each will yield digital frequencies close to Benford s law. It is of course crucial that the sampling be neutral or unbiased as to scale or base. For example, sampling volumes of soft drink products from various manufacturers in Europe will surely not be scale invariant, since those product volumes are closely related to liters, and conversion to other units such as gallons will yield a totally different range of first-digit frequencies. On the other hand, if samples of various species of mammals are selected at random and their volumes determined, it seems much more likely that the resulting data will be unrelated to liters or gallons or other choices of units, and hence that the significant digits will closely follow Benford s law. One of the points of the new random samples from random distributions theorem is that there are many natural sampling procedures which lead to the same log distribution, whereas previous arguments were based on the assumption that tables following the log law were all representative of the same mystical underlying set of all constants. Thus the random-sample theorem helps explain how the logarithm-table digital frequencies observed a century ago by Newcomb, and modern tax, census, and stock data all lead to the same log distribution. The new theorem also helps predict the appearance of the significant-digit phenomenon in many different empirical contexts (including your morning newspaper), and thus helps justify some of the recent applications of Benford s law. One of the applications of the significant-digit law has been to testing of mathematical models (Figure 6). Suppose that a new model is proposed to predict future stock indices, 6

7 census data, or computer usage. If current data follows Benford s law closely or if a hypothesis of unbiased random samples from random distributions seems reasonable, then the predicted data should also follow Benford s law closely (or else perhaps the model should be replaced by one which does). Such a Benford-in, Benford-out test is at best only a double-check on reasonableness, since the law says nothing about the raw data itself; e.g., Benford s law does not distinguish between the numbers 20 and 200,000 both have first significant digit 2, and all other digits 0. Another application of Benford s law that has been recently studied is to the design of computers. If computer users of tomorrow are likely to be performing calculations taken from many (unbiased random) distributions, as Knuth and other computer scientists claim is the case today, then their floating-point calculations will be based on data which closely follows Benford s law. In particular, the numbers they will be computing will not be uniformly distributed over the floating point numbers, but will rather follow the log distribution. If this is indeed the case, then it is possible to build computers whose designs capitalize on knowing the distribution of numbers they will be manipulating. If 9 s are much less frequent than 1 s (or the analog for whatever base the computer is using recall the principle of base-invariance of Benford s law), then it should be possible to construct computers which use that information to minimize storage space, or to maximize rate of output printed, for example. The underlying idea is simple think instead of a cash register drawer. If the frequency of transactions involving the various denominations of bills is known, then the drawer may be specially designed to take advantage of that fact by using bins of different sizes, or which are located in a particular arrangement (such as typewriter and computer keyboards). In fact, German mathematician Peter Schatte has determined that based on an assumption of Benford input, the computer design that minimizes expected storage space (among all computers with binary-power base) is base 8, and other researchers are currently exploring use of logarithmic computers to speed calculations. A current development in the field of accounting is the application of Benford s law to detect fraud or fabrication of data in financial documents. Nigrini has amassed extensive empirical evidence of the occurrence of Benford s law in many areas of accounting and 7

8 demographic data, and has come to the conclusion that in a wide variety of accounting situations, the significant digit frequencies of true data conform very closely to Benford s law (Figure 7). When people fabricate data, on the other hand, either for fraudulent purposes or just to fill in the blanks, the concocted data rarely conforms to Benford s law. That people cannot act truly randomly, even in situations where it is to their advantage to do so, is a well-established fact in psychology. One of my own favorite examples of this from my field of probability is this. The first day of class in an introductory semester of probability theory, I asked the students to do the following homework assignment that evening. If their mother s maiden name begins with A through L, they are to flip a coin 200 times and record the results. The rest of the class is to fake a sequence of 200 heads and tails. The next day I collect the results, and separate the fakers data from the others with 95% accuracy by the following rule. A sequence of 200 truly random tosses of a fair coin contains a run of 6 heads or 6 tails with very high probability (the exact calculation is quite involved), yet the average person trying to fake a random sequence very rarely writes such long runs. Nigrini s PhD thesis in accounting was based on an analogous idea using Benford s law. Assuming that true accounting data follows Benford s law fairly closely (as his research indicated it does), then substantial deviations from that law suggest possible fraud or fabrication of data. Nigrini has designed several goodness-of-fit tests to measure conformity with Benford s law, and The Wall Street Journal (July 10, 1995) reported that the District Attorney s office in Brooklyn, New York was able to detect fraud in seven New York companies using Nigrini s tests (Figure 7). From the evidence to date, it appears that both fraudulent and random-guess data tend to have far too few numbers beginning with 1, and far too many beginning with 6. Based on these preliminary successes, Nigrini has been asked to consult with the internal revenue services of several countries, and is currently helping install Benford goodness-of-fit tests in major accounting fraud-detection computer packages. At the time of Professor Raimi s article in Scientific American on Benford s law over a quarter century ago, the significant-digit phenomenon was thought to be merely a math- 8

9 ematical curiosity without real-life application, and without a satisfactory mathematical explanation. He wrote, Thus all the explanations of [Benford s law] so far given seem to lack something of finality, and concluded the answer remains obscure. Although perhaps the final chapter on the significant-digit phenomenon has not been written, today the answer is much less obscure, is firmly couched in the modern mathematical theory of probability, and is seeing important applications to society. 9

10 Bibliography F. Benford (1938) The law of anomalous numbers, Proceedings of the American Philosophical Society 78, T. Hill (1995) Base-invariance implies Benford s law, Proceedings of the American Mathematical Society 123, T. Hill (1996) A statistical derivation of the significant-digit law, Statistical Science 10, E. Ley (1996) On the peculiar distribution of the U.S. stock indices digits, American Statistician, toappear. M. Nigrini (1996) A taxpayer compliance application of Benford s law, Journal of the American Taxation Association 18, R. Raimi (1969) The peculiar distribution of first digits, Scientific American, December,

### The Difficulty of Faking Data Theodore P. Hill

A century-old observation concerning the distribution of significant digits is now being used to help detect fraud. The Difficulty of Faking Data Theodore P. Hill Most people have preconceived notions

### Benford s Law: Tables of Logarithms, Tax Cheats, and The Leading Digit Phenomenon

Benford s Law: Tables of Logarithms, Tax Cheats, and The Leading Digit Phenomenon Michelle Manes (mmanes@math.hawaii.edu) 9 September, 2008 History (1881) Simon Newcomb publishes Note on the frequency

### NUMB3RS Activity: We re Number 1! Episode: The Running Man

Teacher Page 1 NUMB3RS Activity: We re Number 1! Topic: Benford s Law Grade Level: 9-12 Objective: Explore Benford s Law Time: 20-30 minutes Introduction In The Running Man, Charlie discusses how Benford

### Benford s Law and Fraud Detection, or: Why the IRS Should Care About Number Theory!

Benford s Law and Fraud Detection, or: Why the IRS Should Care About Number Theory! Steven J Miller Williams College Steven.J.Miller@williams.edu http://www.williams.edu/go/math/sjmiller/ Bronfman Science

### RANDOM-NUMBER GUESSING AND THE FIRST DIGIT PHENOMENON1

Psychological Reports, 1988, 62,967-971. Psychological Reports 1788 RANDOM-NUMBER GUESSING AND THE FIRST DIGIT PHENOMENON1 THEODORE P. HILL' Georgia Irrrlirute of Technology Summmy.-To what extent do individuals

### 7. Normal Distributions

7. Normal Distributions A. Introduction B. History C. Areas of Normal Distributions D. Standard Normal E. Exercises Most of the statistical analyses presented in this book are based on the bell-shaped

### Detecting Fraud in Health Insurance Data: Learning to Model Incomplete Benford s Law Distributions

Detecting Fraud in Health Insurance Data: Learning to Model Incomplete Benford s Law Distributions Fletcher Lu 1 and J. Efrim Boritz 2 1 School of Computer Science, University of Waterloo & Canadian Institute

### Get M.A.D. with the Numbers! Moving Benford s Law from Art to Science

Get M.A.D. with the Numbers! Moving Benford s Law from Art to Science BY DAVID G. BANKS, CFE, CIA September/October 2000 Until recently, using Benford s Law was as much of an art as a science. Fraud examiners

### The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

### Digital analysis: Theory and applications in auditing

Digital analysis: Theory and applications in auditing Tamás Lolbert, auditor, State Audit Office of Hungary E-mail: lolbertt@asz.hu Newcomb in 1881 and Benford in 1938 independently introduced a phenomenon,

### AMS 5 CHANCE VARIABILITY

AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and

### Revised Version of Chapter 23. We learned long ago how to solve linear congruences. ax c (mod m)

Chapter 23 Squares Modulo p Revised Version of Chapter 23 We learned long ago how to solve linear congruences ax c (mod m) (see Chapter 8). It s now time to take the plunge and move on to quadratic equations.

### Introduction to Statistics for Computer Science Projects

Introduction Introduction to Statistics for Computer Science Projects Peter Coxhead Whole modules are devoted to statistics and related topics in many degree programmes, so in this short session all I

### Cheating behavior and the Benford s Law Professor Dominique Geyer, (E-mail : dgeyer@audencia.com) Nantes Graduate School of Management, France

Cheating behavior and the Benford s Law Professor Dominique Geyer, (Email : dgeyer@audencia.com) Nantes Graduate School of Management, France Abstract Previous studies (Carslaw, ; Thomas, ; Niskanen &

### MTH124: Honors Algebra I

MTH124: Honors Algebra I This course prepares students for more advanced courses while they develop algebraic fluency, learn the skills needed to solve equations, and perform manipulations with numbers,

### A Few Basics of Probability

A Few Basics of Probability Philosophy 57 Spring, 2004 1 Introduction This handout distinguishes between inductive and deductive logic, and then introduces probability, a concept essential to the study

### WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? introduction Many students seem to have trouble with the notion of a mathematical proof. People that come to a course like Math 216, who certainly

### 1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let

Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as

### Hypothesis Testing with z Tests

CHAPTER SEVEN Hypothesis Testing with z Tests NOTE TO INSTRUCTOR This chapter is critical to an understanding of hypothesis testing, which students will use frequently in the coming chapters. Some of the

### Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

### Section 1.3 P 1 = 1 2. = 1 4 2 8. P n = 1 P 3 = Continuing in this fashion, it should seem reasonable that, for any n = 1, 2, 3,..., = 1 2 4.

Difference Equations to Differential Equations Section. The Sum of a Sequence This section considers the problem of adding together the terms of a sequence. Of course, this is a problem only if more than

### Using Retrocausal Practice Effects to Predict On-Line Roulette Spins. Michael S. Franklin & Jonathan Schooler UCSB, Department of Psychology.

Using Retrocausal Practice Effects to Predict On-Line Roulette Spins Michael S. Franklin & Jonathan Schooler UCSB, Department of Psychology Summary Modern physics suggest that time may be symmetric, thus

### The program also provides supplemental modules on topics in geometry and probability and statistics.

Algebra 1 Course Overview Students develop algebraic fluency by learning the skills needed to solve equations and perform important manipulations with numbers, variables, equations, and inequalities. Students

### 13 Infinite Sets. 13.1 Injections, Surjections, and Bijections. mcs-ftl 2010/9/8 0:40 page 379 #385

mcs-ftl 2010/9/8 0:40 page 379 #385 13 Infinite Sets So you might be wondering how much is there to say about an infinite set other than, well, it has an infinite number of elements. Of course, an infinite

### P (A) = lim P (A) = N(A)/N,

1.1 Probability, Relative Frequency and Classical Definition. Probability is the study of random or non-deterministic experiments. Suppose an experiment can be repeated any number of times, so that we

### arxiv:1112.0829v1 [math.pr] 5 Dec 2011

How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly

### Using Benford s Law. World Headquarters the gregor building 716 West Ave Austin, TX 78701-2727 USA

Using Benford s Law To Detect Fraud World Headquarters the gregor building 716 West Ave Austin, TX 78701-2727 USA VII. THE DIGIT TESTS The five digit tests developed by Mark Nigrini mentioned earlier are

### CHAPTER 2. Inequalities

CHAPTER 2 Inequalities In this section we add the axioms describe the behavior of inequalities (the order axioms) to the list of axioms begun in Chapter 1. A thorough mastery of this section is essential

### A Guide to Benford s Law. A CaseWare IDEA Research Department Document

A Guide to Benford s Law A CaseWare IDEA Research Department Document January 30, 2007 CaseWare IDEA Inc. is a privately-held software development and marketing company, with offices in Toronto and Ottawa,

### We can express this in decimal notation (in contrast to the underline notation we have been using) as follows: 9081 + 900b + 90c = 9001 + 100c + 10b

In this session, we ll learn how to solve problems related to place value. This is one of the fundamental concepts in arithmetic, something every elementary and middle school mathematics teacher should

### It is not immediately obvious that this should even give an integer. Since 1 < 1 5

Math 163 - Introductory Seminar Lehigh University Spring 8 Notes on Fibonacci numbers, binomial coefficients and mathematical induction These are mostly notes from a previous class and thus include some

### Estimating the Frequency Distribution of the. Numbers Bet on the California Lottery

Estimating the Frequency Distribution of the Numbers Bet on the California Lottery Mark Finkelstein November 15, 1993 Department of Mathematics, University of California, Irvine, CA 92717. Running head:

### APPLICATIONS OF THE ORDER FUNCTION

APPLICATIONS OF THE ORDER FUNCTION LECTURE NOTES: MATH 432, CSUSM, SPRING 2009. PROF. WAYNE AITKEN In this lecture we will explore several applications of order functions including formulas for GCDs and

### Chi Square Tests. Chapter 10. 10.1 Introduction

Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square

### A New Interpretation of Information Rate

A New Interpretation of Information Rate reproduced with permission of AT&T By J. L. Kelly, jr. (Manuscript received March 2, 956) If the input symbols to a communication channel represent the outcomes

### , for x = 0, 1, 2, 3,... (4.1) (1 + 1/n) n = 2.71828... b x /x! = e b, x=0

Chapter 4 The Poisson Distribution 4.1 The Fish Distribution? The Poisson distribution is named after Simeon-Denis Poisson (1781 1840). In addition, poisson is French for fish. In this chapter we will

### The Effective Use of Benford s Law to Assist in Detecting Fraud in Accounting Data

Journal of Forensic Accounting 1524-5586/Vol.V(2004), pp. 17-34 2004 R.T. Edwards, Inc. Printed in U.S.A. The Effective Use of Benford s Law to Assist in Detecting Fraud in Accounting Data Cindy Durtschi

### We never talked directly about the next two questions, but THINK about them they are related to everything we ve talked about during the past week:

ECO 220 Intermediate Microeconomics Professor Mike Rizzo Third COLLECTED Problem Set SOLUTIONS This is an assignment that WILL be collected and graded. Please feel free to talk about the assignment with

### People have thought about, and defined, probability in different ways. important to note the consequences of the definition:

PROBABILITY AND LIKELIHOOD, A BRIEF INTRODUCTION IN SUPPORT OF A COURSE ON MOLECULAR EVOLUTION (BIOL 3046) Probability The subject of PROBABILITY is a branch of mathematics dedicated to building models

### Selecting Research Participants

C H A P T E R 6 Selecting Research Participants OBJECTIVES After studying this chapter, students should be able to Define the term sampling frame Describe the difference between random sampling and random

### Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80)

Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80) CS 30, Winter 2016 by Prasad Jayanti 1. (10 points) Here is the famous Monty Hall Puzzle. Suppose you are on a game show, and you

### ARE STOCK PRICES PREDICTABLE? by Peter Tryfos York University

ARE STOCK PRICES PREDICTABLE? by Peter Tryfos York University For some years now, the question of whether the history of a stock's price is relevant, useful or pro table in forecasting the future price

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem

IEOR 6711: Stochastic Models I Fall 2012, Professor Whitt, Tuesday, September 11 Normal Approximations and the Central Limit Theorem Time on my hands: Coin tosses. Problem Formulation: Suppose that I have

### Predictive Coding Defensibility

Predictive Coding Defensibility Who should read this paper The Veritas ediscovery Platform facilitates a quality control workflow that incorporates statistically sound sampling practices developed in conjunction

### Invoice Number Vendor Number Amount 129304 A543891 \$1,035.71 129304 A543891 \$1,035.71

Fraud Detection: Using Data Analysis Techniques A new approach being used for fraud prevention and detection involves the examination of patterns in the actual data. The rationale is that unexpected patterns

### 6 Scalar, Stochastic, Discrete Dynamic Systems

47 6 Scalar, Stochastic, Discrete Dynamic Systems Consider modeling a population of sand-hill cranes in year n by the first-order, deterministic recurrence equation y(n + 1) = Ry(n) where R = 1 + r = 1

### 36 Odds, Expected Value, and Conditional Probability

36 Odds, Expected Value, and Conditional Probability What s the difference between probabilities and odds? To answer this question, let s consider a game that involves rolling a die. If one gets the face

### Probability and Expected Value

Probability and Expected Value This handout provides an introduction to probability and expected value. Some of you may already be familiar with some of these topics. Probability and expected value are

The result of the bayesian analysis is the probability distribution of every possible hypothesis H, given one real data set D. This prestatistical approach to our problem was the standard approach of Laplace

### The Laws of Cryptography Cryptographers Favorite Algorithms

2 The Laws of Cryptography Cryptographers Favorite Algorithms 2.1 The Extended Euclidean Algorithm. The previous section introduced the field known as the integers mod p, denoted or. Most of the field

### Economics 1011a: Intermediate Microeconomics

Lecture 12: More Uncertainty Economics 1011a: Intermediate Microeconomics Lecture 12: More on Uncertainty Thursday, October 23, 2008 Last class we introduced choice under uncertainty. Today we will explore

### Lies My Calculator and Computer Told Me

Lies My Calculator and Computer Told Me 2 LIES MY CALCULATOR AND COMPUTER TOLD ME Lies My Calculator and Computer Told Me See Section.4 for a discussion of graphing calculators and computers with graphing

### Factoring & Primality

Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### For the purposes of this course, the natural numbers are the positive integers. We denote by N, the set of natural numbers.

Lecture 1: Induction and the Natural numbers Math 1a is a somewhat unusual course. It is a proof-based treatment of Calculus, for all of you who have already demonstrated a strong grounding in Calculus

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### Sampling Distributions and the Central Limit Theorem

135 Part 2 / Basic Tools of Research: Sampling, Measurement, Distributions, and Descriptive Statistics Chapter 10 Sampling Distributions and the Central Limit Theorem In the previous chapter we explained

### MATHEMATICS OF FINANCE AND INVESTMENT

MATHEMATICS OF FINANCE AND INVESTMENT G. I. FALIN Department of Probability Theory Faculty of Mechanics & Mathematics Moscow State Lomonosov University Moscow 119992 g.falin@mail.ru 2 G.I.Falin. Mathematics

### Computational Finance Options

1 Options 1 1 Options Computational Finance Options An option gives the holder of the option the right, but not the obligation to do something. Conversely, if you sell an option, you may be obliged to

### How many numbers there are?

How many numbers there are? RADEK HONZIK Radek Honzik: Charles University, Department of Logic, Celetná 20, Praha 1, 116 42, Czech Republic radek.honzik@ff.cuni.cz Contents 1 What are numbers 2 1.1 Natural

### Chapter 8. Exponential and Logarithmic Functions

Chapter 8 Exponential and Logarithmic Functions This unit defines and investigates exponential and logarithmic functions. We motivate exponential functions by their similarity to monomials as well as their

### 15.1 Least Squares as a Maximum Likelihood Estimator

15.1 Least Squares as a Maximum Likelihood Estimator 651 should provide (i) parameters, (ii) error estimates on the parameters, and (iii) a statistical measure of goodness-of-fit. When the third item suggests

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Inflation. Chapter 8. 8.1 Money Supply and Demand

Chapter 8 Inflation This chapter examines the causes and consequences of inflation. Sections 8.1 and 8.2 relate inflation to money supply and demand. Although the presentation differs somewhat from that

### Prime Numbers. Chapter Primes and Composites

Chapter 2 Prime Numbers The term factoring or factorization refers to the process of expressing an integer as the product of two or more integers in a nontrivial way, e.g., 42 = 6 7. Prime numbers are

### Continued Fractions. Darren C. Collins

Continued Fractions Darren C Collins Abstract In this paper, we discuss continued fractions First, we discuss the definition and notation Second, we discuss the development of the subject throughout history

### Integer roots of quadratic and cubic polynomials with integer coefficients

Integer roots of quadratic and cubic polynomials with integer coefficients Konstantine Zelator Mathematics, Computer Science and Statistics 212 Ben Franklin Hall Bloomsburg University 400 East Second Street

### 6.4 Normal Distribution

Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

### A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions Marcel B. Finan Arkansas Tech University c All Rights Reserved First Draft February 8, 2006 1 Contents 25

### THE LAW OF THE THIRD

THE LAW OF THE THIRD A FUNDAMENTAL RESEARCH DISCOVERING THE MATHEMATICAL EQUATION RETULATING RANDOMNESS WITH REPLACEMENT By Don A. R. Colonne B.Sc., MBA, M.A.(Econ) List of Tables 1. Data obtained from

### Students in their first advanced mathematics classes are often surprised

CHAPTER 8 Proofs Involving Sets Students in their first advanced mathematics classes are often surprised by the extensive role that sets play and by the fact that most of the proofs they encounter are

### Is there life on other worlds?

activity 5 Is there life on other worlds? purpose Is there life on other worlds? People have pondered this question since ancient times. But now, for the first time in human history, advances in the biological

### Cosmological Arguments for the Existence of God S. Clarke

Cosmological Arguments for the Existence of God S. Clarke [Modified Fall 2009] 1. Large class of arguments. Sometimes they get very complex, as in Clarke s argument, but the basic idea is simple. Lets

### 3. Data Analysis, Statistics, and Probability

3. Data Analysis, Statistics, and Probability Data and probability sense provides students with tools to understand information and uncertainty. Students ask questions and gather and use data to answer

### Pascal is here expressing a kind of skepticism about the ability of human reason to deliver an answer to this question.

Pascal s wager So far we have discussed a number of arguments for or against the existence of God. In the reading for today, Pascal asks not Does God exist? but Should we believe in God? What is distinctive

### Mathematics for Computer Science

Mathematics for Computer Science Lecture 2: Functions and equinumerous sets Areces, Blackburn and Figueira TALARIS team INRIA Nancy Grand Est Contact: patrick.blackburn@loria.fr Course website: http://www.loria.fr/~blackbur/courses/math

### The Reinvestment Assumption Dallas Brozik, Marshall University

The Reinvestment Assumption Dallas Brozik, Marshall University Every field of study has its little odd bits, and one of the odd bits in finance is the reinvestment assumption. It is an artifact of the

### 4. Which of the following is a correct metric unit for volume? A. Smidgens B. Drops C. Microns D. Liters Answer: D

Chapter 1: Physics, the Fundamental Science 1. People sometimes have difficulty distinguishing between scientific explanations of common events and other kinds of explanation (superstition, prejudice,

### 9. Sampling Distributions

9. Sampling Distributions Prerequisites none A. Introduction B. Sampling Distribution of the Mean C. Sampling Distribution of Difference Between Means D. Sampling Distribution of Pearson's r E. Sampling

### It is time to prove some theorems. There are various strategies for doing

CHAPTER 4 Direct Proof It is time to prove some theorems. There are various strategies for doing this; we now examine the most straightforward approach, a technique called direct proof. As we begin, it

### If n is odd, then 3n + 7 is even.

Proof: Proof: We suppose... that 3n + 7 is even. that 3n + 7 is even. Since n is odd, there exists an integer k so that n = 2k + 1. that 3n + 7 is even. Since n is odd, there exists an integer k so that

### Euclid s Algorithm for the Greatest Common Divisor Desh Ranjan Department of Computer Science New Mexico State University

Euclid s Algorithm for the Greatest Common Divisor Desh Ranjan Department of Computer Science New Mexico State University 1 Numbers, Division and Euclid People have been using numbers, and operations on

### 1 Interest rates, and risk-free investments

Interest rates, and risk-free investments Copyright c 2005 by Karl Sigman. Interest and compounded interest Suppose that you place x 0 (\$) in an account that offers a fixed (never to change over time)

### If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not

4.1 REVIEW AND PREVIEW RARE EVENT RULE FOR INFERENTIAL STATISTICS If, under a given assumption, the of a particular observed is extremely, we conclude that the is probably not. 4.2 BASIC CONCEPTS OF PROBABILITY

A Bit About the St. Petersburg Paradox Jesse Albert Garcia March 0, 03 Abstract The St. Petersburg Paradox is based on a simple coin flip game with an infinite expected winnings. The paradox arises by

### Mathematical Induction

Mathematical Induction Victor Adamchik Fall of 2005 Lecture 1 (out of three) Plan 1. The Principle of Mathematical Induction 2. Induction Examples The Principle of Mathematical Induction Suppose we have

### Gambling Systems and Multiplication-Invariant Measures

Gambling Systems and Multiplication-Invariant Measures by Jeffrey S. Rosenthal* and Peter O. Schwartz** (May 28, 997.. Introduction. This short paper describes a surprising connection between two previously

### The Multiplication Game

The Multiplication Game Kent E. Morrison California Polytechnic State University San Luis Obispo, CA 93407 kmorriso@calpoly.edu The game You walk into a casino, and just inside the main entrance you see

### Introduction to time series analysis

Introduction to time series analysis Margherita Gerolimetto November 3, 2010 1 What is a time series? A time series is a collection of observations ordered following a parameter that for us is time. Examples

### Statistics and Probability

Statistics and Probability TABLE OF CONTENTS 1 Posing Questions and Gathering Data. 2 2 Representing Data. 7 3 Interpreting and Evaluating Data 13 4 Exploring Probability..17 5 Games of Chance 20 6 Ideas

### Summarizing Data: Measures of Variation

Summarizing Data: Measures of Variation One aspect of most sets of data is that the values are not all alike; indeed, the extent to which they are unalike, or vary among themselves, is of basic importance

### The Two Envelopes Problem

1 The Two Envelopes Problem Rich Turner and Tom Quilter The Two Envelopes Problem, like its better known cousin, the Monty Hall problem, is seemingly paradoxical if you are not careful with your analysis.

### Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological

### How to build a probability-free casino

How to build a probability-free casino Adam Chalcraft CCR La Jolla dachalc@ccrwest.org Chris Freiling Cal State San Bernardino cfreilin@csusb.edu Randall Dougherty CCR La Jolla rdough@ccrwest.org Jason

### 1 Uncertainty and Preferences

In this chapter, we present the theory of consumer preferences on risky outcomes. The theory is then applied to study the demand for insurance. Consider the following story. John wants to mail a package

### The Assumption(s) of Normality

The Assumption(s) of Normality Copyright 2000, 2011, J. Toby Mordkoff This is very complicated, so I ll provide two versions. At a minimum, you should know the short one. It would be great if you knew