FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES



Similar documents
1 Gambler s Ruin Problem

6.042/18.062J Mathematics for Computer Science December 12, 2006 Tom Leighton and Ronitt Rubinfeld. Random Walks

Effect Sizes Based on Means

As we have seen, there is a close connection between Legendre symbols of the form

PRIME NUMBERS AND THE RIEMANN HYPOTHESIS

TRANSCENDENTAL NUMBERS

Price Elasticity of Demand MATH 104 and MATH 184 Mark Mac Lean (with assistance from Patrick Chan) 2011W

Math 319 Problem Set #3 Solution 21 February 2002

Stochastic Derivation of an Integral Equation for Probability Generating Functions

Lecture 21 and 22: The Prime Number Theorem

8 Primes and Modular Arithmetic

The Magnus-Derek Game

Number Theory Naoki Sato

Introduction to NP-Completeness Written and copyright c by Jie Wang 1

SQUARE GRID POINTS COVERAGED BY CONNECTED SOURCES WITH COVERAGE RADIUS OF ONE ON A TWO-DIMENSIONAL GRID

The Online Freeze-tag Problem

More Properties of Limits: Order of Operations

A Modified Measure of Covert Network Performance

Lectures on the Dirichlet Class Number Formula for Imaginary Quadratic Fields. Tom Weston

Pythagorean Triples and Rational Points on the Unit Circle

Applications of Fermat s Little Theorem and Congruences

Complex Conjugation and Polynomial Factorization

SOME PROPERTIES OF EXTENSIONS OF SMALL DEGREE OVER Q. 1. Quadratic Extensions

Multiperiod Portfolio Optimization with General Transaction Costs

POISSON PROCESSES. Chapter Introduction Arrival processes

Measuring relative phase between two waveforms using an oscilloscope

Every Positive Integer is the Sum of Four Squares! (and other exciting problems)

Assignment 9; Due Friday, March 17

ENFORCING SAFETY PROPERTIES IN WEB APPLICATIONS USING PETRI NETS

Computational Finance The Martingale Measure and Pricing of Derivatives

Stat 134 Fall 2011: Gambler s ruin

United Arab Emirates University College of Sciences Department of Mathematical Sciences HOMEWORK 1 SOLUTION. Section 10.1 Vectors in the Plane

Managing specific risk in property portfolios

An important observation in supply chain management, known as the bullwhip effect,

A MOST PROBABLE POINT-BASED METHOD FOR RELIABILITY ANALYSIS, SENSITIVITY ANALYSIS AND DESIGN OPTIMIZATION

SUM OF TWO SQUARES JAHNAVI BHASKAR

Index Numbers OPTIONAL - II Mathematics for Commerce, Economics and Business INDEX NUMBERS

Homework until Test #2

THE NUMBER OF REPRESENTATIONS OF n OF THE FORM n = x 2 2 y, x > 0, y 0

c 2009 Je rey A. Miron 3. Examples: Linear Demand Curves and Monopoly

k, then n = p2α 1 1 pα k

Continued Fractions. Darren C. Collins

Continued Fractions and the Euclidean Algorithm

A Note on Integer Factorization Using Lattices

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

arxiv:math/ v2 [math.nt] 2 Mar 2006

CHAPTER 5. Number Theory. 1. Integers and Division. Discussion

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, Notes on Algebra

I. GROUPS: BASIC DEFINITIONS AND EXAMPLES

Point Location. Preprocess a planar, polygonal subdivision for point location queries. p = (18, 11)

Number Theory Naoki Sato

The Taxman Game. Robert K. Moniot September 5, 2003

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

3. Mathematical Induction

Our Primitive Roots. Chris Lyons

MATH 289 PROBLEM SET 4: NUMBER THEORY

5544 = = = Now we have to find a divisor of 693. We can try 3, and 693 = 3 231,and we keep dividing by 3 to get: 1

On Multicast Capacity and Delay in Cognitive Radio Mobile Ad-hoc Networks

X How to Schedule a Cascade in an Arbitrary Graph

Appendix: Translation of Euler s E744. On divisors of numbers contained in the form mxx + nyy

A Simple Model of Pricing, Markups and Market. Power Under Demand Fluctuations

Lecture 13 - Basic Number Theory.

Stability Improvements of Robot Control by Periodic Variation of the Gain Parameters

Number Theory. Proof. Suppose otherwise. Then there would be a finite number n of primes, which we may

Chapter 11 Number Theory

Coin ToGa: A Coin-Tossing Game

3 Some Integer Functions

ON FIBONACCI NUMBERS WITH FEW PRIME DIVISORS

Risk in Revenue Management and Dynamic Pricing

The fast Fourier transform method for the valuation of European style options in-the-money (ITM), at-the-money (ATM) and out-of-the-money (OTM)

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

DETERMINANTS IN THE KRONECKER PRODUCT OF MATRICES: THE INCIDENCE MATRIX OF A COMPLETE GRAPH

Math Workshop October 2010 Fractions and Repeating Decimals

Lecture 3: Finding integer solutions to systems of linear equations

HOMEWORK 5 SOLUTIONS. n!f n (1) lim. ln x n! + xn x. 1 = G n 1 (x). (2) k + 1 n. (n 1)!

Piracy and Network Externality An Analysis for the Monopolized Software Industry

On Software Piracy when Piracy is Costly

Chapter 4. Probability and Probability Distributions

The Lognormal Distribution Engr 323 Geppert page 1of 6 The Lognormal Distribution

Comparing Dissimilarity Measures for Symbolic Data Analysis

Monitoring Frequency of Change By Li Qin

Algebraic and Transcendental Numbers

Prime numbers and prime polynomials. Paul Pollack Dartmouth College

Sample Induction Proofs

Risk and Return. Sample chapter. e r t u i o p a s d f CHAPTER CONTENTS LEARNING OBJECTIVES. Chapter 7

F inding the optimal, or value-maximizing, capital

FACTORING SPARSE POLYNOMIALS

MATH Fundamental Mathematics IV

On the predictive content of the PPI on CPI inflation: the case of Mexico

C-Bus Voltage Calculation

1 Error in Euler s Method

Probabilistic models for mechanical properties of prestressing strands

The risk of using the Q heterogeneity estimator for software engineering experiments

Pressure Drop in Air Piping Systems Series of Technical White Papers from Ohio Medical Corporation

PRIME FACTORS OF CONSECUTIVE INTEGERS

Goldbach's Strong Conjecture This states that all even integers are formed by the addition of at least one pair of prime numbers.

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 2

On the largest prime factor of x 2 1

Arrangements of Stars on the American Flag

The sum of digits of polynomial values in arithmetic progressions

Transcription:

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT Abstract. We consider statistical roerties of the sequence of ordered airs obtained by taking the sequence of rime numbers and reducing modulo m. Using an inclusion/exclusion argument and a cut-off of an infinite roduct suggested by Pólya, we obtain a heuristic formula for the robability that a air of consecutive rime numbers of size aroximately x will be congruent to a, a+d) modulo m. We demonstrate some symmetries of our formula. We test our formula and some of its consequences against data for x in various ranges. 1. Introduction In a beautiful aer from 1959, Pólya [5] resented a heuristic formula to aroximate the number of rimes < x such that q = + d is also rime, where d is a ositive even integer. His formula is a secial case of the Bateman Horn conjecture [2], and had already been roosed by Hardy Littlewood [4]. However, his method of derivation is much more elementary. Pólya uses the standard robabilistic model of the rimes. His innovation is deciding where to cut off a certain roduct, which he determines by invoking Mertens s Theorem. In our treatment, this cut-off ste, involving Euler s constant, occurs in 3.1). Pólya does not care whether there are rimes between and q. In contrast, we ask about consecutive rime airs, and we are interested in gas not only equal to d but congruent to d modulo a given ositive integer m. This interest stems from our earlier work [1] in which we studied the statistics of the sequence of rime numbers modulo m. For this reason, we actually look at the finer count of consecutive rime airs congruent to a given a, a + d) mod m). We can thus state our roblem as follows: ) Given ositive integers a, d, and m, and a ositive number x, define Na, d, m, x) to be the number of consecutive rime airs < q such that < x, a and q a + d mod m). What are the asymtotics of Na, d, m, x) as x tends to infinity? Dirichlet s Theorem tells us that if we look only at single rimes, the number of rimes < x with a mod m) is aroximately πx)/ϕm), indeendent of a rovided that a is relatively rime to m). Here ϕ is the Euler ϕ-function. One might suose that the residues modulo m of consecutive rime airs would be indeendent in a seudo-robabilistic sense, and therefore that Na, d, m, x) would be aroximately indeendent of a and d rovided that a and a + d are relatively rime to m). However, this aears not to be the case in the numerical evidence comiled in [1] and in this aer. Date: 2010-4-26. 2000 Mathematics Subject Classification. 11N05, 11K45, 11N69. Key words and hrases. Prime airs, Bateman Horn, Hardy Littlewood, Pólya. 1

2 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT For examle, suose m = 4 and consider consecutive airs of rimes congruent to 1, 1) and 1, 3) resectively. Our data show that there is a definite tendency for the 1, 1)-airs to occur considerably less frequently than the 1, 3)-airs. We do not know if this tendency ersists as x increases. The work of Hardy and Littlewood [4] suggests an exlanation of the imbalance. If d is a ositive even integer, then Hardy and Littlewood roose that see [4,. 42, Conjecture B]: the robability that x and x + d are rime 1 2 2C 2 log x). 2 Here In articular C 2 = >2 ) 1 1. 1) 2 d,>2 the robability that x and x + 2 are rime 2C 2 log x). 2 We observe that the deendence on d occurs only through the factor 1 d,>2, and 2 therefore this factor should tell us the relative frequency of rimes such that + d is also rime when comared to the frequency of twin rimes. Here is a table of values for small d: d,>2 d 2 4 6 8 10 12 14 16 1 4 6 1 1 2 1 2 1 2 3 5 Thus, for examle, rimes such that + 6 is also rime should occur twice as often as twin rimes. If is a rime congruent to 1 mod 4, and the next rime q is congruent to 3 mod 4, then q = + d for d {2, 6, 10, 14,... }; while if the next rime q is congruent to 1 mod 4, then q = + d for d {4, 8, 12, 16,... }. The table above suggests that, q) 1, 3) mod 4) would be more likely. However the suggestion raises further questions, since even if + d is rime, it does not mean that it is the next rime after. For examle, it is erfectly ossible for each of, + 4, and + 6 to be rime. Thus there are interactions between the various events and + d are rime that need to be taken into account if we want to redict whether, q) 1, 3) mod 4) is more likely than, q) 1, 1) mod 4). To the best of our knowledge, Problem ) is wide oen, and cannot be treated using L-functions, unlike the case of Dirichlet s Theorem. In this aer, we attemt a heuristic solution to Problem ) by alying an inclusion/exclusion argument, together with Pólya s heuristic argument, in 3. This leads to a formula for a function we call P a, d, m, x), which heuristically should be the robability that x is a rime congruent to a mod m) and the next largest rime is congruent to a + d mod m). Thus, we should have the aroximation Na, d, m, x) P a, d, m, x)x, in the sense that 1.1) lim x P a, d, m, x)x Na, d, m, x) = 1. However, 1.1) is not the aroriate conjecture to verify. For the same reasons that the aroximation πx) x/ log x is less accurate than πx) x dy/ log y, we instead comare 2

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 3 Na, d, m, X) Na, d, m, x) with X y=x+1 P a, d, m, y). We do not attemt to rovide a measure of how good this aroximation should be, since there is no actual underlying robability distribution we are samling. Neither did Pólya in [5]. In fact, our exression for P is an infinite series, and we do not know whether this series converges. Thus, we choose an integer J 0 and truncate P at the J th term to obtain a finite exression P J a, d, m, x). Our heuristic rooses that P J a, d, m, x) should be the robability that x is a rime congruent to a mod m) and that the next rime is x+d+jm for some j = 0,..., J. Thus, if J is sufficiently large given x), our heuristic suggests that P J a, d, m, x) should be a density function for Na, d, m, x) in the sense given above. Our heuristic does not tell us exactly how to choose J. We suggest below that d + Jm 4 log x is a reasonable choice. We first investigate roerties of the heuristic formula for P J for a fixed J, and test them against some actual data. Below we resent data for only a few values of m, which suffice to convey the general icture. The function P J a, d, m, x) ossesses two exact symmetries. First, Proosition 4.2 states that P J a, d, m, x) = P J a d, d, m, x). We call this antidiagonal symmetry because of its aearance when we arrange the values of P J a, d, m, x) in a matrix indexed by a and a+d. Second, Proosition 4.1 says that when m is a ower of 2, P J a, d, 2 k, x) is indeendent of a. We test these two symmetry redictions against actual data in 5 and 6 resectively. The idea is that X y=x+1 P Ja, d, m, y) should closely aroximate Na, d, m, X) Na, d, m, x). Therefore the matrix of Na, d, m, X), indexed by a, a + d), should show aroximately these two symmetries. In addition to these symmetries, the observations at the start of 4 imly that if we combine the terms in 1/log x) 2 from P J a, d, m, x), we obtain a sum which is indeendent of a. Heuristically, we might exect this term to be dominant as x and therefore indeendence of a should aear for large x. We test this rediction numerically in 8. It would be interesting to have analytical roofs of any of these symmetries for the ratio Na, d, m, x)/πx) in the limit as x. As stated above, we do not know whether Na, d, m, x)/πx) might be indeendent of both a and d in this limit. We comare the actual values of X y=x+1 P Ja, d, m, y) and Na, d, m, X) Na, d, m, x) for x = 10 3, X = 10 6, and various small values of a, d, and m in 7. We begin our sum at 10 3 because our heuristic relies on x being large relative to a, d, and m. The alternating sums 3.7) become too large for feasible comutation when X gets much beyond 10 6. For any x, P J 1, 2, 2, x) should aroximate log x) 1. We test this for J = 28 and various x in 4. We also discuss what haens when J varies. We rove an internal consistency, called vertical comatibility, in Proosition 4.3 and Corollary 4.4. This asserts that if m n and J is given, there exists J such that P J a, d, m, x) equals a certain sum of P J a, d, n, x) s. The function P J aears to stabilize surrisingly quickly as a function of J. Exerimentally, for small values of x, P J still aears to remain aroximately constant, even when J is much larger than 4 log x. See the remarks at the end of 4. This stabilization as J varies is somewhat surrising; our heuristic is based on a robabilistic icture, whereas, for examle,

4 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT we know that the next rime after is certainly less than 2. It would be interesting to have a theoretical gras of how P J varies with J. We could try to eliminate the deendence on J by looking at P a, d, m, x) := lim J P J a, d, m, x). However, we see no way of roving that the limit exists. The sum defined by the limit is certainly not absolutely convergent see 4 below). Our heuristic makes less and less sense if x is fixed and J increases. Acknowledgments. We thank Jenny Baglivo and K. Soundarajan for their hel, and the referee for helful comments. The first and second authors thank the National Science Foundation for artial funding of this research under grant DMS-0455240. 2. Notation To simlify the job of the reader, we record all of our considerable notation at the outset. Fix an integer m > 1, and a ositive integer a. Let S be a finite set of nonnegative numbers containing 0. Let be a rime. Let x > 2 be an integer. We define: 2.1) 2.2) 2.3) n = ns) = cards) n = n S) = cards mod ) 1 n ) 1 if n = 1 n otherwise ) 2.4) A n = 1 n ) n 2.5) 2.6) 2.7) 2.8) 2.9) 1 1 αs) = 1 n ) S) 1 ns) ) 1 if a + s, m) = 1 for all s S βa, m, S) = m n S) m 0 otherwise S k = {0, 1,..., k} Qa, d, m, x, j) = 1) ns) A ns) αs)βa, m, S) log x) ns) {0,d+jm} S S d+jm P J a, d, m, x) = J Qa, d, m, x, j) j=0

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 5 The definition of n in 2.2) means the number of distinct residue classes modulo in the set S. Once is larger than the largest element in S, n S) = ns). The roduct in 2.4) is infinite but converges, because for > n, both the numerator and denominator have leading terms 1 n. The roduct in 2.5) contains only finitely many terms that do not equal 1, because once > n and > maxs), the numerator and denominator are equal. Below, we will tyically think of a as a ositive integer. However, a aears only as the first argument in βa, m, S). In turn, the definition of βa, m, S) only deends on a in testing the condition a + s, m) = 1. Thus, βa, m, S), Qa, d, m, x, j), and P J a, d, m, x) only deend on the value of a mod m). 3. The Heuristic We aly Pólya s heuristic [5] to the roblem of determining the following robabilities: 1. Given a ositive integer d, the robability that x and x + d are successive rimes. 2. Given integers a, d 1 and m 2, the robability that x and x + d are successive rimes and x a mod m). 3. Given integers a, d 1 and m 2, the robability x is rime, x a mod m) and the next rime is congruent to a + d mod m). 3.1. The robability that x and x + d are successive rimes. Let S be a nonemty finite set of nonnegative integers. Let be a rime number. Then for integers x Prob x + k for k S) n, where n = n S) is defined in 2.2). Note that 1 n. Furthermore, if n =, then each residue class mod is reresented by some k S, hence x + k for some k S, so that Prob x + k for k S) is indeed 0. Also n = n := card S as soon as is sufficiently large larger than the largest element in S will suffice). In [5], Pólya roosed a way to ass from the robability that x + k for k S to the robability that x + k is rime for k S. He argues that the events x + k for k S for various rimes should be considered heuristically indeendent as long as is small comared with x, so that we may conclude that Prob x + k for k S for all < y) n <y as long as x is sufficiently larger than y. Now if x > y > x 1/2 then x + k for k S for all < y x + k is rime for k S at least if x is large enough that the size of the elements of S can be ignored. Pólya roosed that the correct value of y should be x µ, where µ = e γ, and γ = 0.5772... is Euler s constant. His justification for this trick of the magic µ is simly and solely that it works when S = {0}: indeed by Mertens s Theorem, 1 1 log x, <x µ

6 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT while the robability that x is rime is also 1 we roose log x, by the Prime Number Theorem. Accordingly 3.1) Probx + k is rime for each k S) n. <x µ We rewrite 3.1) as follows: n n <x µ n 3.2) n n n n n<<x µ n n n n n<<x µ n<< n n 1 n )) ) 1 n ) n 1 1 n<<x µ <x µ ) 1 n ) n 1 1 <x µ n<<x µ 1 1 ) n 1 1 ) n. The second roduct may be viewed as a finite roduct, because n = n for sufficiently large. The first two roducts are αs), as defined in 2.5). The third roduct aroaches A n, as defined in 2.4). The fourth roduct satisfies <x µ 1 1 ) n by Mertens s Theorem. Putting all this together gives: 1 log x) n. 3.3) Probx + k is rime for k S) αs) log x). n As a check, let d be a ositive integer, and take S = {0, d}. We should retrieve the Hardy Littlewood formula [4]. In fact, ) ) A 2 = 1 2 2 = 4 1 2 2 = 4 ) 1 1, 1 ) 1 >2 1 ) 1 1) 2 >2 and n S) = 1 if d, n S) = 2 otherwise. Hence, 0 if d is odd α{0, d}) = 1 1 if d is even, 2 2 d,>2 which agrees with the formula from [4] quoted in the Introduction note that our A 2 is 4 times Hardy and Littlewood s C 2 ). Let d be a ositive integer. We want to aly the formula 3.3) to find the robability that x and x + d are successive rimes. Let S d = {0, 1,..., d}. Then by inclusion/exclusion: A n

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 7 3.4) Probx is rime and x + d is the next rime) = 1) ns) A ns) αs) log x). ns) {0,d} S S d 3.2. The robability that x and x + d are successive rimes and x a mod m). We start over with a congruence condition. Let m 2 and a 1 be integers. Then Probx a mod m)) 1 m. Suose that is a rime divisor of m: then the congruence x a mod m determines whether x + k for any k S. Indeed if a + k is not relatively rime to m for some k S then Probx a mod m) and x + S are rimes) = 0. Here x + S denotes {x + s : s S}. On the other hand, if m, the conditions x + k for k S should be indeendent of congruences modulo m and indeendent of each other), heuristically seaking. So if a + k is relatively rime to m for every k S then Probx a mod m) and x + S are rimes) 1 m <x µ m n. So 1 n if a + k, m) = 1 k S m Probx a mod m) and x + S are rimes) <x µ m 0 otherwise. Now, in the first case, when the elements of a + S are all relatively rime to m, n < for all m, so we have 1 n = 1 n 1 A n αs) m m n m n <x µ log x). n Thus <x µ m m 3.5) Probx a mod m) and x + S are rimes) βa, m, S)αS) log x), n where βa, m, S) is defined in 2.6). Let d be a ositive integer. Then by inclusion/exclusion: m A n 3.6) Probx is rime, x a mod m), and x + d is the next rime) 1) ns) 2 Probx a mod m) and x + S are rimes) {0,d} S S d 1) ns) A ns) αs)βa, m, S) log x). ns) {0,d} S S d = Qa, d, m, x, 0).

8 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT 3.3. The robability that x is rime, x a mod m), and the next rime is a + d mod m). Finally, suose that 1 d m, and that a, m) = 1 and a + d, m) = 1. Then 3.7) Probx a mod m), x rime, and next rime is a + d mod m) J P J a, d, m, x) = 1) ns) A ns) αs)βa, m, S) log x). ns) j=0 {0,d+jm} S S d+jm What value of J should be used in the comutation of P J a, d, m, x)? One ossibility is to let j range from 0 u to. However, there are various finite uer bounds which can be used. We need to use a large enough value of J to insure that the set x + S d+jm contains the next rime after x assuming that x is rime). Bertrand s Postulate guarantees that choosing J so that d + Jm x is sufficient. We can safely use a much smaller value of J, however. The average ga between rimes of size x is log x, and the standard deviation is also log x. We can be almost certain that x + S d+jm will contain the next rime if d + Jm 4 log x. It is worth recalling Pólya s remark [5,. 384, footnote]: When we consider a fixed number of rimes, the robabilities introduced can be regarded as indeendent, but they cannot be so regarded when the number of rimes increases in an arbitrary manner. Unfortunately, as a ractical matter, we are only able to choose J so that d+jm is smaller than 55; otherwise, the number of subsets of S d+jm is too large to be comutationally feasible. Thus when comuting our heuristic we need to limit ourselves to 4 log x 55, or x less than roughly 10 6. We will discuss this further below. 4. Some Observations We see that: If S contains any odd integers, then αs) = 0, because n 2 S) = 2. If a, m) 1, then βa, m, S) = 0, because 0 S. Therefore, we will henceforth assume that a, m) = 1. If {0, d + jm} S and a + d, m) 1, then βa, m, S) = 0. If a a mod m), then βa, m, S) = βa, m, S). If βa, m, S) 0, then βa, m, S) is indeendent of a. Proosition 4.1. If m is a ower of 2, then αs)βa, m, S) is indeendent of a. Hence, if m = 2 k and a is odd, then P J a, d, m, x) is indeendent of a. Proof. Because a, m) = 1, we know that a is odd. If S contains any odd elements, then αs) = 0. If S contains only even numbers, then a + S will contain only odd numbers, and hence will be relatively rime to m, regardless of the value of a. It is temting to rearrange the terms in 2.9) in order of increasing ns) and let J. However, the sum of the terms with ns) = 2 is divergent. This is true in general, but it is simlest to see when m = 2 k. In that case, we must have a odd and d even in order for αs)βa, m, S) 0. The terms with ns) = 2 all have the form S = {0, d + jm}. In this situation, we can easily comute that ) ) 1 2 βa, m, S) = = 1 2 k 1 2. k 1

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 9 On the other hand, αs) will always be larger than 1, so all of the infinitely many) terms 2 A with ns) = 2 will be larger than 2, and hence will diverge. 2 k log x) 2 Proosition 4.2 Antidiagonal symmetry ). P J a, d, m, x) = P J a d, d, m, x). Proof. We will show that for each set {0, d + jm} S S d+jm in the sum in 2.9) for P J a, d, m, x), there is a set {0, d + jm} S S d+jm with ns) = ns ), αs) = αs ), and βa, m, S) = β a d, m, S ). In articular, set S = d + jm) S. We automatically have {0, d + jm} S S d+jm. Obviously, ns) = ns ), and for any rime, n S) = n S ). This shows that αs) = αs ). Finally, suose that for every element s S, a + s, m) = 1, so βa, m, S) 0. Then a s, m) = 1, and so jm a s, m) = 1. Therefore, d + jm s) + a d), m) = 1, showing that β a d, m, S ) 0. We know that n S) = n S ), and therefore if βa, m, S) 0, we see that βa, m, S) = β a d, m, S ). Conversely, if βa, m, S) = 0, the same argument shows that β a d, m, S ) = 0. We may conclude that P J a, d, m, x) = P J a d, d, m, x). Proosition 4.3 Vertical comatibility ). Fix J. Suose that n is a ositive integer and m n. Then P J a, d, m, x) = P J a, d, n, x), where J + 1)m = J + 1)n. a a mod m) d d mod m) 1 a,d n Proof. We will first show that each set S d+jm used in the sum on the left-hand side of the equation aears in the sum on the right-hand side of the equation for a unique d. To see that, choose d d + jm mod n) with 1 d n. Write d + jm) d = j n, and then d + jm = d + j n. On the other hand, if we start with d d mod m) and j 0, it is clear that we can find a unique j so that d + jm = d + j n. We know that j J and d n + d m. Therefore, the uer bound for J is J +1)n m. m This leaves us needing to show that βa, m, S) = βa, n, S). a a mod m) 1 a n Notice that if βa, m, S) = 0, then automatically βa, n, S) = 0 for all ossible values of a a mod m). So we may assume that βa, m, S) 0. It suffices to consider the case where n = qm, with q rime. Notice that the set {a, a + m, a + 2m,..., a + q 1)m} contains the comlete set of integers a with a a mod m) and 1 a qm. In articular, there are q such integers a. There are two cases to finish the roof: 1. q m. If a + S contains only elements relatively rime to m, and a a mod m), then a + S contains only elements relatively rime to m. Hence, a + S contains only elements relatively rime to qm. Therefore, βa, m, S) = 0 βa, qm, S) = 0. Note also that if is rime, then qm m.

10 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT If βa, qm, S) 0, then we know that βa, qm, S) is indeendent of a. There are q ossible values of a, and we have βa, qm, S) = 1 qm n a S) qm = 1 qm n S) m = q 1 qm n S) m = 1 ) = βa, m, S). m n S) m 2. q m. In this case, the elements of the set {a, a + m,..., a + q 1)m} are distributed over all q of the residue classes modulo q. Suose that n q S) = c. Then there are q c ossible residue classes r mod q) so that r + S will be relatively rime to q, and there are c residue classes so that r + S will have a factor in common with q namely r s mod q) for some s S). Let r be one of the residue classes with r + S relatively rime to q. Recall as usual that if βa, qm, S) 0, then it is indeendent of a. We have βa, qm, S) = q n q S))βr, qm, S) a a mod m) 1 a qm = q n q S)) 1 qm = q n q S)) 1 qm = 1 m m qm n S) n S) ) q q n q S) = βa, m, S). m n S) Corollary 4.4. Choose an even integer m > 2, fix J and let J = J +1)m 2. Then 2 4.1) P J a, d, m, x) = P J 1, 2, 2, x). 1 a m 1 d m Proof. Proosition 4.3 reduces the left-hand side of 4.1) to P 1, 2, 2, x) + P 1, 1, 2, x) + P 2, 1, 2, x) + P 2, 2, 2, x). The last three terms in this sum are all 0. Note that if for some reason it is desirable to begin with odd m, we may relace m with 2m without changing the sum on the left-hand side of equation 4.1), again because of Proosition 4.3.

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 11 Theoretically, P J 1, 2, 2, x) should give the robability that x is rime, which according to our heuristic is log x) 1. We can only sum u to J = 28 because the comutational time involved becomes inordinate, and we tabulate our results here: x P 28 1, 2, 2, x) log x) 1 Ratio 10 0.425586 0.434294 0.9799 10 2 0.217147 0.217147 1.0000 10 3 0.144765 0.144765 1.0000 10 4 0.108567 0.108574 0.9999 10 5 0.0867455 0.0868589 0.9987 10 6 0.0719549 0.0723824 0.9941 We conjecture that the relatively oor agreement with rediction when x = 10 is caused by relacing a finite roduct over the range < x µ with the infinite roduct defining A n in the enultimate term of formula 3.2). The roduct defining A n converges relatively raidly, but there is still an areciable error introduced by including so many more terms when x = 10. The fact that the ratio is slowly tending away from 1 might show that for x 10 4, J should be larger than 28 for best results. More interestingly, the comutations for x = 10 2 and x = 10 3 converged relatively raidly and did not change significantly when more terms were included in the sum. For x = 10 2, the series attained the value 0.217... for J = 13. Extending the sum over larger values of J did not change the first three significant digits. The sum took the value 0.21715... at J = 20 and that did not change u to our maximum value of J = 28. For x = 10 3, the summation takes values varying between 0.144765... and 0.144767... when J runs from 16 to 28. It is hard to understand this aarent stabilization of P J for varying J; our heuristic makes less and less sense as J increases for fixed x. 5. Numerical checks of antidiagonal symmetry Take for examle m = 5. We count residues mod 5 of consecutive rime airs < q with < 982451653. 982451653 is the 50, 000, 000 th rime number). We then record our results in a matrix with ϕm) rows and columns. The results for m = 5 are 2289170 3778890 3732547 2698886 3189954 2190360 3386288 3734018 2995506 3535854 2191584 3777311 4024863 2995514 3189838 2289414 where a ij records the count of i mod 5), q j mod 5)). The airs 3, 5) and 5, 7) are thus omitted from the counts in this table.) To check antidiagonal symmetry, we take the ratio of each entry to its reflection across the antidiagonal. We obtain rounded to 6 laces): 0.999893 1.000418 0.999606 1.000000 1.000036 0.999441 1.000000 1.000394 0.999997 1.000000 1.000559 0.999582 1.000000 1.000003 0.999964 1.000107 The entries are all within.0006 of 1, while the ratio of the largest to the smallest entry in this array is about 1.84.

12 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT If m = 12, we have the following table of counts of rime air residues mod 12, q mod 12), where < q are consecutive rimes with < 982451653 we omit the airs 2, 3) and 3, 5) that are not relatively rime to 12. 2265842 3746000 3296656 3190380 2944446 2266005 3994598 3295820 3294707 3190968 2268555 3745878 3993884 3297895 2940299 2268064 For examle the entry 3994598 records the number of airs of consecutive rimes congruent to 5, 7) mod 12). Again, our heuristic redicts that this matrix should be symmetric across the antidiagonal. Taking the ratio of each entry to its reflection across the antidiagonal, we obtain rounded to 6 laces): 0.999020 1.000033 1.000254 1.000000 1.001410 0.998876 1.000000 0.999746 0.999033 1.000000 1.001125 0.999967 1.000000 1.000968 0.998592 1.000981 The entries are all within.002 of 1, while the ratio of the largest to the smallest entry in this array is about 1.76. 6. Numerical Checks of Power of 2 Prediction As noted above, when m = 2 k, our heuristic redicts that Na, d, 2 k, x) should be indeendent of a. Taking m = 16, we count residues mod 16 of consecutive rime airs < q with < 982451653: 409015 997313 843077 1082276 749027 721892 804113 642323 643202 408017 995715 843672 1083407 748537 722996 805076 803581 643323 408783 996936 842431 1082780 749737 723085 721382 805281 642633 408177 997195 843473 1081932 749402 750223 721241 804604 642735 407612 997681 843417 1082432 1082176 750074 723110 803572 643435 407018 996935 843628 842474 1082071 751349 721474 804368 643327 408482 996565 996983 843301 1081386 750633 722470 805240 642498 407695 Our conjecture redicts that the numbers on the broken diagonals arallel to the main diagonal should be aroximately equal. For examle, the numbers in bold in the table above form a broken diagonal. We can test our conjecture by noting that the counts in the matrix range from 407018 to 1083407, with a ratio aroximately 2.66, while the counts in the broken diagonal in bold above range from 842431 to 843672, with a ratio of 1.00147.... The next broken diagonal beginning with 1082276) varies from 1081386 to 1083407, with a ratio of 1.00186.... The remaining broken diagonals exhibit similar behavior. 7. Numerical checks of the heuristic In 5 and 6, we tested the heuristic indirectly, by observing symmetries in the heuristic and seeing if the same symmetries aeared in the counts of residues of rime airs. In this

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 13 section we comare our heuristic formula directly with such counts. This requires comuting the heuristic numerically, which, as noted above, becomes raidly unwieldy as J increases, so that we must restrict J so that d + Jm < 55 in our calculations, and corresondingly restrict x to be at most 10 6. We will use m = 4, 5, 11, and 12 as tyical examles. modulo m between 10 3 and 10 6 : m = 5 3207 6536 6284 3550 5053 2868 5430 6224 4383 5800 2810 6630 6934 4371 5100 3150 We count rime airs reduced m = 12 2942 6315 5253 5018 4358 2959 7034 5216 5257 5010 2918 6438 6970 5283 4419 2940 We comute 106 P J a, d, m, x) for all ossible values of a and d for each value of m above. x=10 3 Because of limited comuter time, we choose J maximal with d + Jm < 55. The results are: m = 5 3136.4 6562.3 6237.6 3570.7 5081.4 2738.6 5464.4 6237.6 4360.4 5838.2 2738.6 6562.3 6946.5 4360.4 5081.4 3136.4 m = 12 2960.0 6418.9 5258.5 4877.2 4279.9 2960.0 7013.5 5258.5 5258.5 4877.2 2960.0 6418.9 7013.5 5258.5 4279.9 2960.0 To gauge the success of the heuristic in redicting the counts, we take the ratios of the corresonding entries in these tables, i.e., we calculate, for each congruence class i, j) mod m), the ratio actual count)/redicted count), with the following results, rounded to two laces: m = 5 1.02 1.00 1.01 0.99 0.99 1.05 0.99 1.00 1.01 0.99 1.03 1.01 1.00 1.00 1.00 1.00 m = 12 0.99 0.98 1.00 1.03 1.02 1.00 1.00 0.99 1.00 1.03 0.99 1.00 0.99 1.00 1.03 0.99 By comarison, the ratio of the largest to the smallest count is 6934/2810 2.47 for m = 5), and 7034/2918 2.41 for m = 12). Here are the results for m = 4, again for the range 10 3 to 10 6 : 10 6 m = 4 P J a, d, 4, x) x=10 3 ratios 16574 22521 16618.8 22407.7 1.00 1.01 22520 16715 22407.7 16618.8 1.01 1.01

14 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT Finally, m = 11: m = 11 263 1071 1242 522 1019 426 1625 617 771 285 819 200 836 1181 485 1106 329 1603 501 765 346 814 217 987 1227 675 1063 324 1555 588 834 328 817 272 834 1221 499 1060 318 1640 682 1015 362 1028 271 1095 1223 663 1049 448 1618 534 775 256 794 275 866 1219 485 1036 382 1826 594 784 253 1004 262 1004 1230 518 1055 366 1599 596 779 387 803 198 867 1208 604 1072 316 1795 577 1004 355 836 192 1062 1238 599 1038 403 1597 665 832 333 845 273 10 6 P J a, d, 11, x) x=10 3 263.7 1048.5 1242.9 497.9 1029.7 434.8 1642.4 606.3 778.9 257.8 802.7 191.4 851.5 1219.9 486.6 1077.4 313.4 1595.3 488.1 778.9 335.3 823.1 193.7 983.5 1205.5 655.8 1069.6 331.0 1595.3 606.3 801.8 332.3 800.4 237.2 865.3 1250.3 493.8 1069.6 313.4 1642.4 656.4 1000.8 359.0 1018.3 230.0 1117.3 1250.3 655.8 1077.4 434.8 1627.2 560.9 777.0 252.1 780.6 230.0 865.3 1205.5 486.6 1029.7 389.7 1806.0 592.5 805.6 252.1 1018.3 237.2 983.5 1219.9 497.9 1072.6 322.0 1596.1 592.5 777.0 359.0 800.4 193.7 851.5 1242.9 615.4 1103.7 322.0 1806.0 560.9 1000.8 332.3 823.1 191.4 1048.5 1245.0 615.4 1072.6 389.7 1627.2 656.4 801.8 335.3 802.7 263.7 ratios 1.00 1.02 1.00 1.05 0.99 0.98 0.99 1.02 0.99 1.11 1.02 1.04 0.98 0.97 1.00 1.03 1.05 1.00 1.03 0.98 1.03 0.99 1.12 1.00 1.02 1.03 0.99 0.98 0.97 0.97 1.04 0.99 1.02 1.15 0.96 0.98 1.01 0.99 1.01 1.00 1.04 1.01 1.01 1.01 1.18 0.98 0.98 1.01 0.97 1.03 0.99 0.95 1.00 1.02 1.02 1.20 1.00 1.01 1.00 1.01 0.98 1.01 1.00 0.97 1.00 0.99 1.10 1.02 1.01 1.04 0.98 1.14 1.00 1.01 1.00 1.08 1.00 1.02 1.02 0.97 0.98 0.97 0.98 0.99 1.03 1.00 1.07 1.02 1.00 1.01 0.99 0.97 0.97 1.03 0.98 1.01 1.04 0.99 1.05 1.04 The counts for m = 11 vary from 192 to 1826, with ratio 1826/192 9.51. Our redicted values are for the most art are quite close to the observed counts, though the redictions for a few entries notably, and curiously, 3, 3), 4, 4), 5, 5), 6, 6), 7, 7)) are a bit low, only 85% 90% of their observed counts. This is artly exlained by the size of the entries themselves the main diagonal entries are the smallest, in either the table of counts or the table of redictions, so a discreancy there between rediction and observation shows u as a larger

FREQUENCIES OF SUCCESSIVE PAIRS OF PRIME RESIDUES 15 ercentage error than the same discreancy elsewhere. For examle, the observed count for 5, 5) is 41 above the redicted count, or 18% above; the observed count for 10, 9) is 42 above redicted, or 5% above; the observed count for 2, 4) is 39 below redicted, or 3% below. 8. Another rediction If we let x and fix J, we can view the dominant terms in P J a, d, m, x) as the ones with ns) = 2. In reality, the estimation is more comlex, because the value of J should increase with x.) If ns) = 2, then S = {0, d + jm}. As long as a, m) = a + d, m) = 1, these dominant terms in P J a, d, m, x) will be indeendent of a. To see if in fact this rediction aears to be true, we comute rime air counts modulo 5 and 12 for the first 10 7 robable) rimes larger than 10 100. The results are m = 5 300655 318723 320000 310947 313970 300412 315948 319266 312837 317325 300976 318355 322830 313721 314012 300023 m = 12 301604 318324 315239 314667 313320 300633 319515 316479 315294 315144 300873 318316 320319 315883 314055 300335 This data is consistent with the rediction: for examle for m = d = 5 res. m = d = 12), art of the rediction is that the entries on the main diagonal should tend to equality; an ad hoc way to judge this is to take the ratio of the largest to the smallest terms on the diagonal for both our first set of residues, involving rimes less than 10 6, and for this set. For m = 5 resectively m = 12), the ratio for the small-rime data set is 1638 1.18 1390 res. 1501 300976 1.05), while for the large-rime data set, the ratio is 1.003 res. 1430 300023 301604 300335 1.004). 9. Further Oen Questions One obvious question is whether the different residue classes of rime airs occur asymtotically equally often: in other words, given any a, d, a, d with a, a + d, a, a + d relatively rime to m, does Na, d, m, x) Na, d, m, x) 1 as x? This corresonds to asking whether all of the entries in the matrices in the revious section are tending towards equality as x tends to infinity. One way to gauge the variation in our tables is to take the ratio of the largest to the smallest count. In 7, the ratio for m = 5 is aroximately 2.47 res. 2.39 for m = 12), while the large rime data set in 8 the ratio for m = 5 is aroximately 1.08 res. 1.07). Obviously, the terms aear to be getting closer together, but of course we can t tell if they are tending towards a limiting ratio of 1. If the different residue classes of rime airs do occur asymtotically equally often, we are in a rime race situation, as studied in [6] and [3]. For examle, take m = 4 and count rime airs congruent to 1, 1) mod 4) versus rime airs congruent to 1, 3) mod 4). Even if the counts are asymtotically equal, i.e. if lim x N1, 4, 4, x)/n1, 2, 4, x) = 1, we observe for finite chunks of rimes a decided tendency for the total of 1, 3) s to exceed

16 AVNER ASH, LAURA BELTIS, ROBERT GROSS, AND WARREN SINNOTT the total of 1, 1) s. We can kee score, as we search through the sequence of odd rimes, 3, 5, 7,..., and see when the 1, 1) s are ahead of the 1, 3) s and when the oosite holds. It seems quite ossible that N1, 4, 4, x) > N1, 2, 4, x) for almost all values of x. Another question concerns the aroriate value of J to use in our heuristic. The comutation mentioned at the end of 4 shows that in at least one case, P J a, d, m, x) is nearly indeendent of J for a range of values of J. A recise understanding of the deendence of P J a, d, m, x) on J would go a long way towards establishing the solidity of our heuristic. References [1] A. Ash, B. Bate, and R. Gross, Frequencies of consecutive tules of Frobenius classes, Exeriment. Math., to aear. [2] Paul T. Bateman and Roger A. Horn, A heuristic asymtotic formula concerning the distribution of rime numbers, Math. Com. 16 1962), 363 367. MR0148632 26 #6139) [3] Andrew Granville and Greg Martin, Prime number races, Amer. Math. Monthly 113 2006), no. 1, 1 33. MR2202918 2006h:11112) [4] G.H. Hardy and J.E. Littlewood, Some roblems of Partitio Numerorum III: On the exression of a number as a sum of rimes, Acta Math. 44 1923), 1 70. [5] G. Pólya, Heuristic reasoning in the theory of numbers, Amer. Math. Monthly 66 1959), 375 384. MR0104639 21 #3392) [6] Michael Rubinstein and Peter Sarnak, Chebyshev s bias, Exeriment. Math. 3 1994), no. 3, 173 197. MR1329368 96d:11099) Deartment of Mathematics, Boston College, Chestnut Hill, MA 02467-3806 E-mail address: ashav@bc.edu Deartment of Mathematics, Boston College, Chestnut Hill, MA 02467-3806 E-mail address: lbeltis@gmail.com Deartment of Mathematics, Boston College, Chestnut Hill, MA 02467-3806 E-mail address: gross@bc.edu Deartment of Mathematics, The Ohio State University, Columbus, OH 43210-1174 E-mail address: sinnott@math.ohio-state.edu