On the Union of Arithmetic Progressions

Shoni Gilboa (Mathematics Dept., The Open University of Israel, Raanana 43107, Israel. tipshoni@gmail.com)
Rom Pinchasi (Mathematics Dept., Technion - Israel Institute of Technology, Haifa 32000, Israel. room@math.technion.ac.il. Supported by ISF grant (grant No. 1357/12).)

August 2014

Abstract

We show that for any integer $n$ and real $\varepsilon > 0$, the union of $n$ arithmetic progressions with pairwise distinct differences, each of length $n$, contains at least $c(\varepsilon)n^{2-\varepsilon}$ elements, where $c(\varepsilon)$ is a positive constant depending only on $\varepsilon$. This estimate is sharp in the sense that the assertion becomes invalid for $\varepsilon = 0$. We also obtain estimates for the asymmetric case where the number of progressions is distinct from their lengths.

1 Introduction

A finite arithmetic progression is one of the most fundamental objects in number theory. Formally, it is a sequence of numbers $a_1, a_2, \dots, a_n$ such that for every $1 < i < n$ we have $a_{i+1} - a_i = a_i - a_{i-1}$. The difference $a_i - a_{i-1}$ is called the difference of the arithmetic progression.

For integers $n > 1$ and $\ell > 1$ let $u_\ell(n)$ be the minimum possible cardinality of a union of $n$ arithmetic progressions, each of length $\ell$, with pairwise distinct differences. Clearly, $u_\ell(n) \le n\ell$, but this inequality is not tight in general, not even up to a multiplicative constant. It is not hard to see, for instance, that
\[ u_2(n) = \left\lceil \tfrac{1}{2} + \sqrt{2n + \tfrac{1}{4}} \right\rceil . \]
This is because any two numbers form an arithmetic progression of length 2, and therefore any set of $m$ numbers such that no two of their differences are the same (for example, $1, 2, 2^2, \dots, 2^{m-1}$) is a union of $\binom{m}{2}$ arithmetic progressions with pairwise distinct differences.

The following example, given in [9], shows that $u_3(n) = O\left(n^{1-\frac{\log 4}{\log 27}}\right)$: Let $k$ be the minimal integer such that $\frac{1}{9k}3^{3k} \ge n$. For any disjoint $A, B \subset \{1, 2, \dots, 3k\}$ of cardinality $k$, consider the three term arithmetic progression
\[ \left\{ 2\sum_{i \in A} 3^i,\ \sum_{i \in A} 3^i + \sum_{i \in B} 3^i,\ 2\sum_{i \in B} 3^i \right\}. \]
We get $\frac{(3k)!}{(k!)^3} \ge \frac{1}{9k}3^{3k} \ge n$ three term arithmetic progressions with pairwise distinct differences, whose union is of size
\[ 2\binom{3k}{k} = O\left(n^{1-\frac{\log 4}{\log 27}}\right). \]
We note that a lower bound $u_3(n) \ge n^{6/11}$ directly follows from a result of Katz and Tao [8].
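The observation about progressions of length 2 can be checked on a small case. The following is a minimal sketch, not part of the original argument; the helper names `has_distinct_differences`, `two_term_progressions` and `smallest_m` are hypothetical and introduced only for illustration. It verifies that the powers of two have pairwise distinct differences, so that their 2-element subsets form $\binom{m}{2}$ progressions with distinct differences, and it finds the smallest $m$ with $\binom{m}{2} \ge n$ by direct search.

```python
from itertools import combinations
from math import comb


def has_distinct_differences(values):
    """Return True if all differences b - a over pairs a < b are distinct."""
    diffs = [b - a for a, b in combinations(sorted(values), 2)]
    return len(diffs) == len(set(diffs))


def two_term_progressions(values):
    """List every 2-element subset as a (first term, difference) pair."""
    return [(a, b - a) for a, b in combinations(sorted(values), 2)]


def smallest_m(n):
    """Smallest m with C(m, 2) >= n, found by direct search."""
    m = 2
    while comb(m, 2) < n:
        m += 1
    return m


m = 6
powers = [2 ** i for i in range(m)]        # 1, 2, 4, ..., 2^(m-1)
progressions = two_term_progressions(powers)
print(has_distinct_differences(powers))    # whether all differences differ
print(len(progressions), comb(m, 2))       # number of 2-term progressions
print(smallest_m(len(progressions)))       # smallest m covering that many
```

The union of all these progressions is the set `powers` itself, which has only $m$ elements although it contains $\binom{m}{2}$ progressions.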

The trivial upper bound $u_\ell(n) \le n\ell$ is not tight for large values of $\ell$ as well. In particular, consider the symmetric case where $\ell = n$. It turns out that $u_n(n) = o(n^2)$. To see this take (perhaps the most natural choice of) arithmetic progressions: $A_j = \{i \cdot j \mid i \in [n]\}$ for $j = 1, 2, \dots, n$. Here, and in the sequel, the notation $[n]$ stands for the set $\{1, 2, \dots, n\}$. Clearly, the union $\bigcup_{j=1}^{n} A_j$ is precisely the set $\{i \cdot j \mid i, j \in [n]\}$. It was shown already in 1955 by Erdős [5] that
\[ |\{i \cdot j \mid i, j \in [n]\}| = o\left(n^2/(\log n)^{\alpha}\right) \]
for some $\alpha > 0$. The exact asymptotics
\[ |\{i \cdot j \mid i, j \in [n]\}| \asymp \frac{n^2}{(\log n)^{1-\frac{1+\log\log 2}{\log 2}}\,(\log\log n)^{3/2}} \]
was given in 2008 by Ford [6]. Consequently, we obtain the desired improved upper bound for $u_n(n)$.

In this paper we show that $u_\ell(n)$ cannot be much smaller than $n\ell$, provided $\ell$ is not much smaller than $n$, as captured in the following theorem, giving a lower bound for $u_\ell(n)$ for smaller values of $\ell$ as well.

Theorem 1.1. For every $\varepsilon > 0$ there is a positive constant $c_1(\varepsilon)$, depending only on $\varepsilon$, such that for any positive integers $n$ and $\ell$
\[ u_\ell(n) \ge \begin{cases} c_1(\varepsilon)\, n^{\frac{1}{2}-\varepsilon}\ell & \text{for } \ell \le n^{\frac{1}{2}-\varepsilon}, \\ c_1(\varepsilon)\, \ell^2 & \text{for } n^{\frac{1}{2}-\varepsilon} \le \ell \le n^{1-\varepsilon}, \\ c_1(\varepsilon)\, n^{1-\varepsilon}\ell & \text{for } n^{1-\varepsilon} \le \ell. \end{cases} \]

The proof of Theorem 1.1 relies upon upper bounds for the following two functions, that are of independent interest,
\[ g_d(n) = \max_{\substack{B \subset (0,\infty) \\ |B| \le n}} \left| \left\{ (b_1, b_2) \in B^2 \;\middle|\; b_1 < b_2,\ \exists\, p, q \in [d] : \frac{b_1}{b_2} = \frac{p}{q} \right\} \right| , \]
\[ f_d(m, n) = \max_{\substack{A, B \subset (0,\infty) \\ |A| \le m,\ |B| \le n}} \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| . \]

The paper is organized as follows. In Section 2 we provide an upper bound for the function $g_d$ above. Using this bound, we provide an upper bound for the function $f_d$ in Section 3. Theorem 1.1 is proved in Section 4. In Section 5 we give a number-theoretic application of our upper bound for the function $g_d$.

Addendum: After this paper was completed and before it was published it was brought to our attention that the main tools in our proof as well as the main lemma in our paper, which is

Proposition 2.2 in Section 2, appear in a different context in [1]. Lemma 3.9 in [1] gives only a slightly weaker bound than the one in Proposition 2.2 in this paper. Lemma 3.10 in [1] is a consequence of Lemma 3.9 in [1] in the same way that Theorem 5.1 in this paper is a consequence of Proposition 2.2. In fact, Lemma 3.10 in [1] is stated in a stronger form than Theorem 5.1 that allows us to get an explicit subquadratic lower bound for $u_\ell(n)$ in Theorem 1.1 in terms of $n$ and $\ell$ without introducing $\varepsilon > 0$. We comment about this in Section 5.

2 Rational quotients with bounded numerator and denominator

For a positive integer $d$ define
\[ R_d = \left\{ \frac{k}{l} \;\middle|\; k, l \in [d] \right\}. \]

Definition 2.1. For a positive integer $d$ and a finite set $B$ of positive real numbers define
\[ G_d(B) = \left\{ (b_1, b_2) \in B^2 \;\middle|\; b_1 < b_2,\ \frac{b_1}{b_2} \in R_d \right\}. \]
For positive integers $n$ and $d$ define
\[ g_d(n) = \max_{\substack{B \subset (0,\infty) \\ |B| \le n}} |G_d(B)| . \]

Clearly,
\[ g_d(n) \le (n-1)|R_d| < n d^2 . \tag{1} \]
This bound is useful when $d$ is small. For large values of $d$ we have the following improved upper bound whose proof is the main goal of this section:

Proposition 2.2. For any positive integer $k$, there is a positive constant $c_2(k)$, depending only on $k$, such that for any positive integer $n$ and any integer $d > c_2(k)$
\[ g_d(n) < (200k+1)\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \tag{2} \]

The proof of Proposition 2.2 will be as follows: Given a set $B$ of $n$ positive real numbers, $G_d(B)$ can be viewed as the set of edges of a graph whose vertices are the numbers in $B$. If the number of edges in this graph is large, then the Bondy-Simonovits Theorem [3] implies that this graph must contain many cycles of length $2k$. However, the number of such cycles is easy to bound in terms of the number of solutions to $r_1 r_2 \cdots r_{2k} = 1$, where each of the $r_i$'s is a quotient of two natural numbers that are smaller than or equal to $d$. This number of solutions can be estimated using standard estimates for the number of divisors function.
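The graph viewpoint can be made concrete on a small set of integers. The sketch below is only an illustration under stated assumptions: the function names `rational_set` and `edge_set` are hypothetical, and $B$ is taken to be a set of distinct positive integers so that exact rational arithmetic can be used. It builds $R_d$, lists the edges of $G_d(B)$, and compares their number with the trivial bound $(n-1)|R_d| < nd^2$.

```python
from fractions import Fraction
from itertools import combinations


def rational_set(d):
    """R_d: all quotients k/l with k, l in {1, ..., d}."""
    return {Fraction(k, l) for k in range(1, d + 1) for l in range(1, d + 1)}


def edge_set(B, d):
    """G_d(B): pairs (b1, b2) in B with b1 < b2 and b1/b2 in R_d.

    B is assumed to consist of distinct positive integers.
    """
    R = rational_set(d)
    return {(b1, b2) for b1, b2 in combinations(sorted(B), 2)
            if Fraction(b1, b2) in R}


B = range(1, 7)                      # the set {1, ..., 6}
n = len(B)
for d in (2, 3):
    edges = edge_set(B, d)
    trivial = (n - 1) * len(rational_set(d))
    print(d, sorted(edges), len(edges), trivial, n * d * d)
```

For $d = 2$ the only admissible quotient below 1 is $1/2$, so the edges are exactly the pairs $(b, 2b)$ inside $B$.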

Hence, given a finite set of positive real numbers $B$ we define
\[ C_{d,2k}(B) = \left\{ (b_1, b_2, \dots, b_{2k}) \in B^{2k} \;\middle|\; \frac{b_1}{b_2}, \frac{b_2}{b_3}, \dots, \frac{b_{2k-1}}{b_{2k}}, \frac{b_{2k}}{b_1} \in R_d,\ \forall\, 1 \le i < j \le 2k : b_i \ne b_j \right\}. \]
Notice that $C_{d,2k}(B)$ corresponds to the set of cycles of length $2k$ in the graph corresponding to $G_d(B)$.

We start with bounding the cardinality of $C_{d,2k}(B)$ from below in terms of $|G_d(B)|$. We first get a basic lower bound for $|C_{d,2k}(B)|$ using the Bondy-Simonovits Theorem, which states that a graph with $n$ vertices and no simple cycles of length $2k$ has no more than $100k\, n^{1+\frac{1}{k}}$ edges. Later, we enhance this basic lower bound for $|C_{d,2k}(B)|$ in the case where $|G_d(B)|$ is large.

Lemma 2.3. For any positive integers $k$ and $d$, and for any finite set $B$ of $n$ positive real numbers,
\[ \frac{1}{4k}|C_{d,2k}(B)| \ge |G_d(B)| - 100k\,|B|^{1+\frac{1}{k}} . \]

Proof. Form a graph on the vertex set $B$, by connecting two distinct vertices $b_1, b_2 \in B$ if and only if $\frac{b_1}{b_2} \in R_d$. This graph obviously has $|B|$ vertices, $|G_d(B)|$ edges, and precisely $\frac{1}{4k}|C_{d,2k}(B)|$ simple cycles of length $2k$. Now remove an edge from every simple cycle of length $2k$ in this graph. We get a graph with $|B|$ vertices and at least $|G_d(B)| - \frac{1}{4k}|C_{d,2k}(B)|$ edges. The resulting graph has no simple cycle of length $2k$. The result now follows directly from the Bondy-Simonovits Theorem stated above.

Next, we use a standard probabilistic argument to enhance the lower bound of Lemma 2.3.

Lemma 2.4. For any positive integers $k$ and $d$, and for any finite set $B$ of positive real numbers such that $|B| > d^{1+\frac{1}{2(k-1)}}$, we have the following inequality:
\[ |G_d(B)| < \frac{1}{4k}|C_{d,2k}(B)|\, d^{-(2k-1)} + 200k\,|B|^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \]

Proof. Let $p := d^{-1-\frac{1}{2(k-1)}}$, and let $B_p$ be a random subset of $B$ obtained by choosing each element independently with probability $p$. By Lemma 2.3,
\[ \frac{1}{4k}|C_{d,2k}(B_p)| \ge |G_d(B_p)| - 100k\,|B_p|^{1+\frac{1}{k}} . \]
Taking expectations, we get
\[ \frac{1}{4k}\,\mathbb{E}|C_{d,2k}(B_p)| \ge \mathbb{E}|G_d(B_p)| - 100k\,\mathbb{E}\left(|B_p|^{1+\frac{1}{k}}\right). \tag{3} \]
Notice that from the linearity of expectation we have:
\[ \mathbb{E}|C_{d,2k}(B_p)| = |C_{d,2k}(B)|\, p^{2k} \tag{4} \]

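and
\[ \mathbb{E}|G_d(B_p)| = |G_d(B)|\, p^{2} . \tag{5} \]
As for $\mathbb{E}\left(|B_p|^{1+\frac{1}{k}}\right)$, note that $\mathbb{E}|B_p| = |B|p$ and $\mathbb{V}|B_p| = |B|p(1-p)$. Therefore, since $1 < |B|p$,
\[ \mathbb{E}\left(|B_p|^2\right) = \mathbb{V}|B_p| + \left(\mathbb{E}|B_p|\right)^2 = |B|p(1-p) + (|B|p)^2 < 2(|B|p)^2 . \]
Now, by Jensen's inequality,
\[ \mathbb{E}\left(|B_p|^{1+\frac{1}{k}}\right) \le \left(\mathbb{E}\left(|B_p|^2\right)\right)^{\frac{1}{2}+\frac{1}{2k}} < 2\,|B|^{1+\frac{1}{k}} p^{1+\frac{1}{k}} . \tag{6} \]
Plugging (4), (5) and (6) in (3) we get
\[ \frac{1}{4k}|C_{d,2k}(B)|\, p^{2k} > |G_d(B)|\, p^{2} - 200k\,|B|^{1+\frac{1}{k}} p^{1+\frac{1}{k}} , \]
hence
\[ |G_d(B)| < \frac{1}{4k}|C_{d,2k}(B)|\, p^{2k-2} + 200k\,|B|^{1+\frac{1}{k}} p^{-1+\frac{1}{k}} = \frac{1}{4k}|C_{d,2k}(B)|\, d^{-(2k-1)} + 200k\,|B|^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \]

We now approach the task of bounding $|C_{d,2k}(B)|$ from above. We start with the following well known number-theoretic bound on the number of divisors $\tau(m)$ of an integer $m$.

Lemma 2.5. For any $\delta > 0$ there is a positive constant $c_3(\delta)$ depending only on $\delta$, such that for any positive integer $m$, $\tau(m) < c_3(\delta)m^{\delta}$.

Proof. Let $m = \prod_{i=1}^{k} p_i^{r_i}$ be the prime factorization of $m$. Notice that $m$ has $\tau(m) = \prod_{i=1}^{k}(1 + r_i)$ divisors. For any $1 \le i \le k$,
\[ (p_i^{r_i})^{\delta} = e^{\delta r_i \ln p_i} > 1 + \delta r_i \ln p_i . \]
Therefore,
\[ \frac{\tau(m)}{m^{\delta}} = \prod_{i=1}^{k} \frac{1 + r_i}{(p_i^{r_i})^{\delta}} < \prod_{i=1}^{k} \frac{1 + r_i}{1 + \delta r_i \ln p_i} \le \prod_{i=1}^{k} \frac{1}{\min\{1, \delta \ln p_i\}} = \prod_{\substack{1 \le i \le k \\ \ln p_i < 1/\delta}} \frac{1}{\delta \ln p_i} \le \prod_{\substack{p \text{ prime} \\ p \le e^{1/\delta}}} \frac{1}{\delta \ln p} . \]
Hence, $\tau(m) < c_3(\delta)m^{\delta}$, where
\[ c_3(\delta) := \prod_{\substack{p \text{ prime} \\ p \le e^{1/\delta}}} \frac{1}{\delta \ln p} . \]

Lemma 2.6. For any positive integer $k$ there is a positive constant $c_4(k)$, depending only on $k$, such that for any positive integer $d$ and any finite set $B$ of positive real numbers we have
\[ |C_{d,2k}(B)| < c_4(k)\,|B|\, d^{2k+\frac{1}{4k}} . \]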

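Proof. We notice that
\[ \begin{aligned} |C_{d,2k}(B)| &= \left| \left\{ (b_1, \dots, b_{2k}) \in B^{2k} \;\middle|\; \frac{b_1}{b_2}, \frac{b_2}{b_3}, \dots, \frac{b_{2k-1}}{b_{2k}}, \frac{b_{2k}}{b_1} \in R_d,\ \forall\, 1 \le i < j \le 2k : b_i \ne b_j \right\} \right| \\ &\le \left| \left\{ (b_1, \dots, b_{2k}) \in B^{2k} \;\middle|\; \forall\, 1 \le i \le 2k-1 : \frac{b_i}{b_{i+1}} \in R_d,\ \frac{b_{2k}}{b_1} \in R_d \right\} \right| \\ &\le |B| \cdot \left| \left\{ (r_1, \dots, r_{2k}) \;\middle|\; \forall\, 1 \le i \le 2k : r_i \in R_d,\ r_1 r_2 \cdots r_{2k} = 1 \right\} \right| \\ &\le |B| \cdot \left| \left\{ \big( (p_1, \dots, p_{2k}), (q_1, \dots, q_{2k}) \big) \in \left([d]^{2k}\right)^2 \;\middle|\; p_1 p_2 \cdots p_{2k} = q_1 q_2 \cdots q_{2k} \right\} \right| \\ &= |B| \sum_{m=1}^{d^{2k}} \left| \left\{ (p_1, \dots, p_{2k}) \in [d]^{2k} \;\middle|\; p_1 p_2 \cdots p_{2k} = m \right\} \right|^2 \le |B| \sum_{m=1}^{d^{2k}} \tau(m)^{4k} . \end{aligned} \]
By Lemma 2.5, $\tau(m) < c_3\!\left(\frac{1}{32k^3}\right) m^{\frac{1}{32k^3}}$ for any $m$, and we get
\[ |C_{d,2k}(B)| < |B| \sum_{m=1}^{d^{2k}} \left( c_3\!\left(\tfrac{1}{32k^3}\right) m^{\frac{1}{32k^3}} \right)^{4k} \le \left( c_3\!\left(\tfrac{1}{32k^3}\right) \right)^{4k} |B|\, d^{2k+\frac{1}{4k}} . \]
This completes the proof with $c_4(k) := \left( c_3\!\left(\frac{1}{32k^3}\right) \right)^{4k}$.

We are now prepared for proving Proposition 2.2.

Proof of Proposition 2.2. If $n \le d^{1+\frac{1}{2(k-1)}}$, then (2) holds because
\[ g_d(n) \le \binom{n}{2} < n^2 = n^{1+\frac{1}{k}}\, n^{1-\frac{1}{k}} \le n^{1+\frac{1}{k}} \left( d^{1+\frac{1}{2(k-1)}} \right)^{1-\frac{1}{k}} = n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \]
We therefore assume $n > d^{1+\frac{1}{2(k-1)}}$. Let $B$ be a set of $n$ positive real numbers. By Lemma 2.4,
\[ |G_d(B)| < 200k\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} + \frac{1}{4k}|C_{d,2k}(B)|\, d^{-(2k-1)} . \tag{7} \]
By Lemma 2.6,
\[ |C_{d,2k}(B)| < c_4(k)\, n\, d^{2k+\frac{1}{4k}} . \]
Hence, for $d \ge c_2(k) := (c_4(k)/4k)^{4k}$,
\[ |C_{d,2k}(B)| < 4k\, n\, d^{2k+\frac{1}{2k}} . \tag{8} \]
Plugging (8) in (7) and using our assumption that $n > d^{1+\frac{1}{2(k-1)}} \ge d$, we get that for $d \ge c_2(k)$,
\[ |G_d(B)| < 200k\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} + n\, d^{1+\frac{1}{2k}} < (200k+1)\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \]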
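The counting step in the proof of Lemma 2.6 can be checked by brute force for small parameters. The sketch below is an illustration only; the names `product_counts`, `tau` and `equal_product_pairs` are hypothetical, and `t` plays the role of $2k$. It counts the tuples in $[d]^t$ with a given product $m$, confirms that this count is at most $\tau(m)^t$ (each entry of such a tuple is a divisor of $m$), and compares the number of pairs of tuples with equal products against $\sum_{m \le d^t} \tau(m)^{2t}$.

```python
from collections import Counter
from itertools import product
from math import prod


def product_counts(d, t):
    """Map each value m to the number of tuples in [d]^t with product m."""
    return Counter(prod(tup) for tup in product(range(1, d + 1), repeat=t))


def tau(m):
    """Number of positive divisors of m (naive trial division)."""
    return sum(1 for i in range(1, m + 1) if m % i == 0)


def equal_product_pairs(d, t):
    """Number of pairs of tuples in [d]^t having the same product."""
    return sum(c * c for c in product_counts(d, t).values())


d, t = 3, 2
counts = product_counts(d, t)
print(all(c <= tau(m) ** t for m, c in counts.items()))
print(equal_product_pairs(d, t),
      sum(tau(m) ** (2 * t) for m in range(1, d ** t + 1)))
```

The divisor bound is far from tight for such small values; it is the growth rate in $d$ that the proof relies on.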

3 Bounded integer quotients

Definition 3.1. For positive integers $m$, $n$, and $d$ define
\[ f_d(m, n) = \max_{\substack{A, B \subset (0,\infty) \\ |A| \le m,\ |B| \le n}} \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| . \]

Remark 3.2. Notice that $f_d(m, n) = f_d(n, m)$. This is because given sets $A$ and $B$ of positive real numbers such that $|A| = m$, $|B| = n$, and $f_d(m, n) = \left|\left\{(a, b) \in A \times B \mid \frac{a}{b} \in [d]\right\}\right|$, the sets $A' = \left\{\frac{1}{b} \mid b \in B\right\}$ and $B' = \left\{\frac{1}{a} \mid a \in A\right\}$ show that $f_d(n, m) \ge f_d(m, n)$. Consequently, $f_d(m, n) = f_d(n, m)$. Therefore, we may assume, if needed, with no loss of generality that $m \le n$, or that $m \ge n$.

The main goal of this section is to prove the following proposition.

Proposition 3.3. For any $\varepsilon > 0$ there is a positive constant $c_5(\varepsilon)$, depending only on $\varepsilon$, such that for any positive integers $m$, $n$, and $d$
\[ f_d(m, n) < c_5(\varepsilon) \min\{n^{\varepsilon}, m^{\varepsilon}\} \sqrt{m n d} . \]

Proof. With no loss of generality (see Remark 3.2) assume that $n \le m$. We may also assume that $m < nd$, because if $m \ge nd$, then $f_d(m, n) \le nd = \sqrt{(nd)\,nd} \le \sqrt{m n d}$ (see Proposition 3.5 for further discussion). Let $A$ and $B$ be finite sets of positive real numbers such that $|A| \le m$, $|B| \le n$ and $f_d(m, n) = \left|\left\{(a, b) \in A \times B \mid \frac{a}{b} \in [d]\right\}\right|$.

The proposition will follow by comparing lower and upper bounds for the cardinality of the set
\[ W = \left\{ (a, b_1, b_2) \in A \times B^2 \;\middle|\; \frac{a}{b_1}, \frac{a}{b_2} \in [d],\ b_1 < b_2 \right\}. \]
We first establish an upper bound for $|W|$. For convenience define
\[ S_d = \left\{ (p, q) \mid p, q \in [d],\ p < q,\ \gcd(p, q) = 1 \right\}. \]

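We have:
\[ \begin{aligned} |W| &= \left| \left\{ (b_1, b_2, k_1, k_2) \in B^2 \times [d]^2 \;\middle|\; k_1 b_1 = k_2 b_2 \in A,\ b_1 < b_2 \right\} \right| \\ &\le \left| \left\{ (b_1, b_2, k_1, k_2) \in B^2 \times [d]^2 \;\middle|\; k_1 b_1 = k_2 b_2,\ b_1 < b_2 \right\} \right| \\ &= \sum_{(p,q) \in S_d} \left| \left\{ (b_1, b_2, k_1, k_2) \in B^2 \times [d]^2 \;\middle|\; \frac{b_1}{b_2} = \frac{k_2}{k_1} = \frac{p}{q} \right\} \right| \\ &= \sum_{(p,q) \in S_d} \left| \left\{ (b_1, b_2) \in B^2 \;\middle|\; \frac{b_1}{b_2} = \frac{p}{q} \right\} \right| \cdot \left| \left\{ (k_1, k_2) \in [d]^2 \;\middle|\; \frac{k_2}{k_1} = \frac{p}{q} \right\} \right| \\ &= \sum_{(p,q) \in S_d} \left| \left\{ (b_1, b_2) \in B^2 \;\middle|\; \frac{b_1}{b_2} = \frac{p}{q} \right\} \right| \left\lfloor \frac{d}{q} \right\rfloor \le \sum_{q=2}^{d} \left( |G_q(B)| - |G_{q-1}(B)| \right) \frac{d}{q} \\ &= |G_d(B)| + \sum_{q=2}^{d-1} |G_q(B)| \left( \frac{d}{q} - \frac{d}{q+1} \right) < |G_d(B)| + d \sum_{q=2}^{d-1} \frac{1}{q^2} |G_q(B)| . \end{aligned} \]
Let $k := \lceil 1/\varepsilon \rceil$. By Proposition 2.2, there is a positive constant $c_2(k)$, depending only on $k$, such that for any $c_2(k) < q \le d$,
\[ |G_q(B)| < (200k+1)\, n^{1+\frac{1}{k}} q^{1-\frac{1}{2k}} . \]
For any $q$, we have by (1) that $|G_q(B)| < n q^2$. Therefore,
\[ \begin{aligned} |W| &< |G_d(B)| + d \sum_{q=2}^{c_2(k)} \frac{1}{q^2} |G_q(B)| + d \sum_{q=c_2(k)+1}^{d-1} \frac{1}{q^2} |G_q(B)| \\ &< (200k+1)\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} + (c_2(k) - 1)\, n d + (200k+1)\, n^{1+\frac{1}{k}} d \sum_{q=c_2(k)+1}^{d-1} \frac{1}{q^{1+\frac{1}{2k}}} . \end{aligned} \]
Hence,
\[ |W| < c(\varepsilon)\, n^{1+\varepsilon} d , \tag{9} \]
where
\[ c(\varepsilon) := (200k+1) + (c_2(k) - 1) + (200k+1) \sum_{q=c_2(k)+1}^{\infty} \frac{1}{q^{1+\frac{1}{2k}}} . \]

To get a lower bound for $|W|$, we define $r(a) = \left|\left\{ b \in B \mid \frac{a}{b} \in [d] \right\}\right|$ for any $a \in A$. Then, by the convexity of the function $\binom{x}{2} = \frac{1}{2}x(x-1)$ (or what is sometimes referred to as Jensen's inequality):
\[ |W| = \sum_{a \in A} \binom{r(a)}{2} \ge m \binom{\frac{1}{m}\sum_{a \in A} r(a)}{2} . \tag{10} \]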

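Combining the upper and lower bounds for $|W|$, namely, (9) and (10), we get
\[ m \binom{\frac{1}{m}\sum_{a \in A} r(a)}{2} < c(\varepsilon)\, n^{1+\varepsilon} d . \]
Now, we deduce
\[ f_d(m, n) = \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| = \sum_{a \in A} r(a) < \frac{m}{2} + \sqrt{\frac{m^2}{4} + 2c(\varepsilon)\, m\, n^{1+\varepsilon} d} . \]
This implies the desired result, as $n \le m < nd$.

3.1 Tightness of Proposition 3.3

In this section we deviate from the flow of the argument to address the question whether Proposition 3.3 is tight. The results of this section will not be used elsewhere.

We will show that the upper bound in Proposition 3.3 for $f_d(n, m)$ is essentially tight (see Proposition 3.4 below), provided each of the parameters $m$, $n$, and $d$ is (sufficiently) smaller than the product of the other two. When one of $m$, $n$, and $d$ is considerably larger than the product of the other two, the upper bound in Proposition 3.3 is no longer tight, as follows from Proposition 3.5 below, in which the exact values of $f_d(m, n)$ in those cases are determined.

Proposition 3.4. If $m \le \frac{1}{4}nd$, $n \le \frac{1}{4}md$, and $d \le \frac{1}{4}mn$, then $f_d(m, n) \ge \frac{1}{8}\sqrt{m n d}$.

Proof. Set $k = \left\lfloor \sqrt{md/n} \right\rfloor$, $l = \left\lfloor \sqrt{nd/m} \right\rfloor$, and $t = \left\lfloor \sqrt{mn/d} \right\rfloor$. Consider the sets
\[ A = \left\{ (k+l)^r\, i \right\}_{r \in [t],\, i \in [k]} \quad \text{and} \quad B = \left\{ (k+l)^r / j \right\}_{r \in [t],\, j \in [l]} . \]
Then
\[ |A| = t \cdot k \le \sqrt{\frac{mn}{d}} \sqrt{\frac{md}{n}} = m \quad \text{and} \quad |B| = t \cdot l \le \sqrt{\frac{mn}{d}} \sqrt{\frac{nd}{m}} = n . \]
Notice that
\[ \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| \ge t \cdot k \cdot l \ge \frac{1}{2}\sqrt{\frac{mn}{d}} \cdot \frac{1}{2}\sqrt{\frac{md}{n}} \cdot \frac{1}{2}\sqrt{\frac{nd}{m}} = \frac{1}{8}\sqrt{m n d} . \]

Proposition 3.5.
1. If $d \ge mn$ then $f_d(m, n) = mn$.
2. If $n \ge md$ then $f_d(m, n) = md$.
3. If $m \ge nd$ then $f_d(m, n) = nd$.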
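The construction used in the proof of Proposition 3.4 above can be tested numerically for small parameters. The following is a minimal sketch under stated assumptions: the function names `construction` and `count_integer_quotients` are hypothetical, and the floors of the square roots are computed with integer arithmetic via `isqrt` of the floored quotient, which gives the same value. It builds the sets $A$ and $B$, checks their sizes, and counts the pairs whose quotient lies in $[d]$.

```python
from fractions import Fraction
from math import isqrt


def construction(m, n, d):
    """Build the sets A and B from the proof of Proposition 3.4."""
    k = isqrt(m * d // n)     # floor(sqrt(m d / n))
    l = isqrt(n * d // m)     # floor(sqrt(n d / m))
    t = isqrt(m * n // d)     # floor(sqrt(m n / d))
    base = k + l
    A = {Fraction(base ** r * i)
         for r in range(1, t + 1) for i in range(1, k + 1)}
    B = {Fraction(base ** r, j)
         for r in range(1, t + 1) for j in range(1, l + 1)}
    return A, B


def count_integer_quotients(A, B, d):
    """Count pairs (a, b) in A x B with a / b an integer in {1, ..., d}."""
    return sum(1 for a in A for b in B
               if (a / b).denominator == 1 and 1 <= a / b <= d)


m, n, d = 8, 32, 16           # satisfies m <= nd/4, n <= md/4, d <= mn/4
A, B = construction(m, n, d)
pairs = count_integer_quotients(A, B, d)
print(len(A), len(B), pairs)
print(64 * pairs ** 2 >= m * n * d)   # pairs >= sqrt(m n d) / 8
```

Pairs with equal exponent $r$ already contribute $t \cdot k \cdot l$ quotients; the count may be larger because pairs with different exponents can occasionally give a quotient in $[d]$ as well.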

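Proof.
1. For any $A, B \subset (0,\infty)$ with $|A| \le m$, $|B| \le n$ we obviously have
\[ \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| \le |A \times B| = |A| \cdot |B| \le m n . \]
To see that this upper bound can actually be attained, consider, for instance, the sets $A = [m]$ and $B = \{1/j\}_{j \in [n]}$, for which every quotient $\frac{a}{b}$ is an integer not exceeding $mn \le d$.
2. For any $A, B \subset (0,\infty)$ with $|A| \le m$, $|B| \le n$ we have
\[ \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| = \left| \left\{ (a, k) \in A \times [d] \;\middle|\; \frac{a}{k} \in B \right\} \right| \le m d . \]
This upper bound can indeed be attained, for example by taking $A = \{(d+1)^i\}_{i \in [m]}$ and $B = \{(d+1)^i / k\}_{i \in [m],\, k \in [d]}$.
3. For any $A, B \subset (0,\infty)$ with $|A| \le m$, $|B| \le n$ we have
\[ \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| = \left| \left\{ (b, k) \in B \times [d] \;\middle|\; k b \in A \right\} \right| \le n d . \]
Equality is attained, for example, by taking $A = \{(d+1)^j k\}_{j \in [n],\, k \in [d]}$ and $B = \{(d+1)^j\}_{j \in [n]}$.

4 Union of arithmetic progressions

In this section we prove Theorem 1.1. Recall that for integers $n > 1$ and $\ell > 1$, $u_\ell(n)$ is the minimum possible cardinality of a union of $n$ arithmetic progressions, each of length $\ell$, with pairwise distinct differences.

As a consequence of Proposition 3.3 we prove the first estimate of Theorem 1.1.

Proposition 4.1. For any $\varepsilon > 0$ there is a positive constant $c_6(\varepsilon)$, depending only on $\varepsilon$, such that for any positive integers $n$ and $\ell$
\[ u_\ell(n) > c_6(\varepsilon)\, n^{\frac{1}{2}-\varepsilon} \ell . \]

Proof. Take $n$ arithmetic progressions, each of length $\ell$, with pairwise distinct differences, and let $U$ be their union. If each $x \in U$ belongs to less than $n^{1/2}$ of the progressions, then $n\ell < |U|\, n^{1/2}$ and consequently $|U| > n^{1/2}\ell > n^{\frac{1}{2}-\varepsilon}\ell$. Therefore, assume there is $x \in U$ which belongs to at least $n^{1/2}$ progressions. In any such progression at least $d := \left\lceil \frac{\ell - 1}{2} \right\rceil$ of the terms are on the same side of $x$ (that is, are either all smaller, or all larger than $x$). Therefore, in at least $n^{1/2}/2$ progressions there are at least $d$ terms on the same side of $x$, and without loss of generality we assume they are larger than $x$ in these progressions. We now concentrate only on these progressions. Let $B$ be the set of differences of these arithmetic progressions, and let $A = \{i \cdot b \mid i \in [d],\ b \in B\}$. Proposition 3.3 implies
\[ d\,|B| \le \left| \left\{ (a, b) \in A \times B \;\middle|\; \frac{a}{b} \in [d] \right\} \right| \le f_d(|A|, |B|) < c_5(\varepsilon)\, |B|^{\varepsilon} \sqrt{|A|\,|B|\, d} , \]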

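hence
\[ |U| \ge \left| \{ x + a \mid a \in A \} \right| = |A| > \frac{1}{c_5(\varepsilon)^2} |B|^{1-2\varepsilon} d \ge \frac{1}{c_5(\varepsilon)^2} \left( \frac{n^{1/2}}{2} \right)^{1-2\varepsilon} \frac{\ell}{4} = \frac{1}{c_5(\varepsilon)^2\, 2^{3-2\varepsilon}}\, n^{\frac{1}{2}-\varepsilon} \ell . \]
This completes the proof with $c_6(\varepsilon) := \frac{1}{c_5(\varepsilon)^2\, 2^{3-2\varepsilon}}$.

The lower bounds in Theorem 1.1 in the regime $n^{\frac{1}{2}-\varepsilon} \le \ell$ are established in Proposition 4.3 below. The proof of Proposition 4.3 uses Proposition 2.2, ideas similar to those appearing in the proof of Proposition 3.3, and the following lemma (recall the definition of $R_d$ from Section 2).

Lemma 4.2. Suppose that the increasing arithmetic progressions $(a_1 + (j-1)b_1)_{j=1}^{\ell}$ and $(a_2 + (j-1)b_2)_{j=1}^{\ell}$ have at least $r \ge 2$ common elements. Then $\frac{b_1}{b_2} \in R_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}$.

Proof. Since $(a_1 + (j-1)b_1)_{j=1}^{\ell}$ and $(a_2 + (j-1)b_2)_{j=1}^{\ell}$ have at least two common elements, $\frac{b_1}{b_2}$ is necessarily rational, and with no loss of generality we may assume $b_1, b_2$ are both integers. The intersection of the two progressions under consideration is then itself an arithmetic progression with the difference $\mathrm{lcm}(b_1, b_2)$ and the diameter (which is the difference between the largest and the smallest of its elements) at least $(r-1)\,\mathrm{lcm}(b_1, b_2)$. Consequently, $(r-1)\,\mathrm{lcm}(b_1, b_2) \le (\ell-1) b_i$ and therefore $b_i / \gcd(b_1, b_2) \le \frac{\ell-1}{r-1}$ ($i \in \{1, 2\}$), which is equivalent to the assertion of the lemma.

To complete the proof of Theorem 1.1, we establish

Proposition 4.3. For any $\varepsilon > 0$ there is a positive constant $c_7(\varepsilon)$, depending only on $\varepsilon$, such that for any positive integers $n$ and $\ell$
\[ u_\ell(n) > c_7(\varepsilon) \min\left\{ n^{1-\varepsilon}\ell,\ \ell^2 \right\}. \]

Proof. Let $P_1, P_2, \dots, P_n$ be $n$ arithmetic progressions, each of length $\ell$, with pairwise distinct differences. We will use the following well known estimate of Dawson and Sankoff [4] on the cardinality of the union of sets via the cardinalities of their pairwise intersections,
\[ \left| \bigcup_{i=1}^{n} P_i \right| \ge \frac{\left( \sum_{i=1}^{n} |P_i| \right)^2}{\sum_{1 \le i, j \le n} |P_i \cap P_j|} . \tag{11} \]
Hence, we examine
\[ I := \sum_{1 \le i_1 < i_2 \le n} |P_{i_1} \cap P_{i_2}| . \]
Let $b_1, \dots, b_n$ be the differences of the progressions $P_1, \dots, P_n$, respectively, and let $B = \{b_1, \dots, b_n\}$. Clearly
\[ I = \sum_{r=1}^{\ell} \left| \left\{ (i_1, i_2) \in [n]^2 \;\middle|\; b_{i_1} < b_{i_2},\ |P_{i_1} \cap P_{i_2}| \ge r \right\} \right| . \]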

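Trivially,
\[ \left| \left\{ (i_1, i_2) \in [n]^2 \;\middle|\; b_{i_1} < b_{i_2},\ |P_{i_1} \cap P_{i_2}| \ge 1 \right\} \right| \le \binom{n}{2} . \]
For $r \ge 2$ we use Lemma 4.2 to obtain
\[ \left| \left\{ (i_1, i_2) \in [n]^2 \;\middle|\; b_{i_1} < b_{i_2},\ |P_{i_1} \cap P_{i_2}| \ge r \right\} \right| \le \left| G_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}(B) \right| \le g_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}(n) . \]
Hence,
\[ I \le \binom{n}{2} + \sum_{r=2}^{\ell} g_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}(n) . \]
Let $k := \lceil 1/\varepsilon \rceil$. By Proposition 2.2, there is a constant $c_2(k)$ such that for any $2 \le r \le \frac{\ell-1}{c_2(k)+1} + 1$ we have
\[ g_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}(n) < (200k+1)\, n^{1+\frac{1}{k}} \left( \frac{\ell-1}{r-1} \right)^{1-\frac{1}{2k}} . \]
For $\frac{\ell-1}{c_2(k)+1} + 1 < r \le \ell$, we use the simpler estimate (1) to get
\[ g_{\left\lfloor \frac{\ell-1}{r-1} \right\rfloor}(n) < n \left( \frac{\ell-1}{r-1} \right)^2 < (c_2(k)+1)^2\, n . \]
Therefore,
\[ I \le \binom{n}{2} + \sum_{r=2}^{\left\lfloor \frac{\ell-1}{c_2(k)+1} \right\rfloor + 1} (200k+1)\, n^{1+\frac{1}{k}} \frac{(\ell-1)^{1-\frac{1}{2k}}}{(r-1)^{1-\frac{1}{2k}}} + \sum_{r=\left\lfloor \frac{\ell-1}{c_2(k)+1} \right\rfloor + 2}^{\ell} (c_2(k)+1)^2\, n . \]
Hence
\[ I < c(\varepsilon)\, n \ell \max\{ n/\ell,\ n^{1/k} \} \le c(\varepsilon)\, n \ell \max\{ n/\ell,\ n^{\varepsilon} \} , \tag{12} \]
for some positive constant $c(\varepsilon)$ depending only on $\varepsilon$. The proposition follows by plugging (12) in (11).

5 An application: Graham's conjecture on average

In this section we draw a number-theoretic application of our upper bounds for the function $g_d$ in Section 2. This application, apart from providing an alternative presentation for the proof of Proposition 4.3, is directly related to a famous conjecture of Graham [7].

Theorem 5.1. For every $\varepsilon > 0$ there exists $c(\varepsilon) > 0$ with the following property. Let $a_1 < \dots < a_n$ be $n$ natural numbers. Then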

\[ \sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j} < c(\varepsilon)\, n^{1+\varepsilon} . \tag{13} \]

Proof. Denote $A = \{a_1, \dots, a_n\}$. Notice that every summand on the left-hand side of (13) is of the form $\frac{1}{d}$ for some positive integer $d$. The simple but crucial observation is that if $1 \le i < j \le n$ are such that $\frac{\gcd(a_i, a_j)}{a_j} = \frac{1}{d}$, then $\frac{a_i}{a_j} \in R_d$. Therefore, $\frac{\gcd(a_i, a_j)}{a_j} = \frac{1}{d}$, for $1 \le i < j \le n$, if and only if $(a_i, a_j) \in G_d(A) \setminus G_{d-1}(A)$. (Recall the definition of $R_d$ and $G_d(A)$ in Section 2.)

Fix a positive integer $k$, to be determined later. By Proposition 2.2, there exists $c_2(k) > 0$ such that for every $d > c_2(k)$
\[ |G_d(A)| \le g_d(n) < (200k+1)\, n^{1+\frac{1}{k}} d^{1-\frac{1}{2k}} . \tag{14} \]
For every $d$, $|G_d(A)| \le g_d(n) < n d^2$, by (1). We get that
\[ \begin{aligned} \sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j} &= \sum_{d \ge 2} \frac{1}{d} \left| \left\{ (i, j) \in [n]^2 \;\middle|\; i < j,\ \frac{\gcd(a_i, a_j)}{a_j} = \frac{1}{d} \right\} \right| = \sum_{d \ge 2} \frac{1}{d} \left( |G_d(A)| - |G_{d-1}(A)| \right) \\ &= \sum_{d \ge 2} |G_d(A)| \left( \frac{1}{d} - \frac{1}{d+1} \right) = \sum_{2 \le d \le c_2(k)} \frac{|G_d(A)|}{d(d+1)} + \sum_{d > c_2(k)} \frac{|G_d(A)|}{d(d+1)} \\ &< c_2(k)\, n + (200k+1)\, n^{1+\frac{1}{k}} \sum_{d > c_2(k)} \frac{1}{d^{\frac{1}{2k}}(d+1)} . \end{aligned} \]
Take $k$ to be a positive integer such that $\frac{1}{k} < \varepsilon$ and let $c(\varepsilon) = c_2(k) + (200k+1) \sum_{d > c_2(k)} \frac{1}{d^{1/(2k)}(d+1)}$ to get the desired result.

Remark. It is not hard to verify that the bound in Theorem 5.1 cannot be improved to be linear in $n$. This can be seen for example by taking $a_1, \dots, a_n$ to be $1, \dots, n$, respectively. Then a direct computation, using some classical number theory estimates, shows that in this case the left-hand side of (13) is of the order of magnitude of $n \log n$, up to some absolute multiplicative constant.

Theorem 5.1 allows us to write in a slightly different way the proof of Proposition 4.3. Indeed, suppose we wish to bound from below the union of $n$ arithmetic progressions, $P_1, \dots, P_n$, each of length $\ell$, with pairwise distinct differences $a_1, \dots, a_n$, respectively. With no loss of generality we may assume that $a_1 < \dots < a_n$ and that they are all positive integers. We will again use (11). Hence, we examine the cardinalities of the pairwise intersections of the progressions.

Consider two progressions of length $\ell$: $\{p + (j-1)q\}_{j=1}^{\ell}$ and $\{p' + (j-1)q'\}_{j=1}^{\ell}$, where $q, q'$ are positive integers. Their intersection is in itself an arithmetic progression and it is not hard to see that the difference of this progression (assuming it has at least two elements) is equal to the smallest

number divisible by both $q$ and $q'$. It follows that the size of the intersection of the two progressions is less than or equal to
\[ 1 + \frac{\min(\ell q, \ell q')}{\mathrm{lcm}(q, q')} = 1 + \ell\, \frac{\gcd(q, q')}{\max(q, q')} . \]
It follows from (11) and the above discussion that the union $\left| \bigcup_{i=1}^{n} P_i \right|$ is bounded from below by
\[ \frac{(n\ell)^2}{n\ell + n^2 + 2\ell \sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j}} . \]
In view of Theorem 5.1, this expression is greater than $\frac{1}{2} \min\left( \frac{1}{3c(\varepsilon)}\, n^{1-\varepsilon} \ell,\ \ell^2 \right)$.

It is interesting to note the relation of Theorem 5.1 to a well known conjecture of Graham [7]. Graham conjectured that given any $n$ positive integers $a_1 < \dots < a_n$, there are two of them that satisfy $\frac{a_j}{\gcd(a_i, a_j)} \ge n$. This conjecture has a long history with many contributions. It was finally completely (that is, for all values of $n$) solved in [2], where one can also find more details on the history and references related to this conjecture.

From (13) it follows that there is a pair of indices $1 \le i < j \le n$ such that
\[ \frac{\gcd(a_i, a_j)}{a_j} < \frac{c(\varepsilon)\, n^{1+\varepsilon}}{\binom{n}{2}} . \]
This implies
\[ \frac{a_j}{\gcd(a_i, a_j)} > \frac{\binom{n}{2}}{c(\varepsilon)\, n^{1+\varepsilon}} = \frac{n-1}{2c(\varepsilon)}\, n^{-\varepsilon} . \]
This lower bound is indeed much weaker than the desired one in the conjecture of Graham, but on the other hand this argument shows that on average $\frac{a_j}{\gcd(a_i, a_j)}$ is quite large.

Addendum: As was mentioned in the introduction of this paper, Theorem 5.1 appears already in [1] (as Lemma 3.10 there). In fact, Lemma 3.10 in [1] is stated in a stronger form:

Lemma 5.2 (Lemma 3.10 in [1]). There exists an absolute constant $c > 0$ such that for any positive integers $a_1 < \dots < a_n$, we have
\[ \sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j} < n\, e^{c\sqrt{\log n \log\log n}} . \tag{15} \]

As we have seen above, $u_\ell(n)$ is bounded from below by
\[ \frac{(n\ell)^2}{n\ell + n^2 + 2\ell \sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j}} . \]
Plugging here the upper bound from Lemma 3.10 from [1] for $\sum_{1 \le i < j \le n} \frac{\gcd(a_i, a_j)}{a_j}$, we get the following lower bound for $u_\ell(n)$:
\[ u_\ell(n) \ge \frac{(n\ell)^2}{n\ell + n^2 + 2\ell n\, e^{c\sqrt{\log n \log\log n}}} , \]
where $c > 0$ is an absolute positive constant independent of $\ell$ and $n$.

Acknowledgments

We thank Vsevolod F. Lev for interesting discussions about the problem and for pointing out the relation of Theorem 5.1 to the conjecture of Graham. We thank Noga Alon for bringing to our attention the close relation between [1] and this paper.

References

[1] N. Alon, I. Z. Ruzsa, Non-averaging subsets and non-vanishing transversals, J. Combin. Theory Ser. A 86 (1999), no. 1, 1-13.
[2] R. Balasubramanian, K. Soundararajan, On a conjecture of R. L. Graham, Acta Arith. 75 (1996), no. 1, 1-38.
[3] J. A. Bondy and M. Simonovits, Cycles of even length in graphs, J. Combinatorial Theory Ser. B 16 (1974), 97-105.
[4] D. A. Dawson and D. Sankoff, An inequality for probabilities, Proc. Amer. Math. Soc. 18 (1967), 504-507.
[5] P. Erdős, Some remarks on number theory, Riveon Lematematika 9 (1955), 45-48.
[6] K. Ford, The distribution of integers with a divisor in a given interval, Ann. of Math. (2) 168 (2008), no. 2, 367-433.
[7] R. L. Graham, Advanced Problem 5749*, Amer. Math. Monthly 77 (1970), 775.
[8] N. H. Katz and T. Tao, Bounds on arithmetic projections, and applications to the Kakeya conjecture, Math. Res. Lett. 6 (1999), no. 5-6, 625-630.
[9] I. Z. Ruzsa, Sumsets, in European Congress of Mathematics, 381-389, Eur. Math. Soc., Zürich.