# A Practical Application of Differential Privacy to Personalized Online Advertising


this experiment. Hence, the probability of the event WIN is

$$
\begin{aligned}
\Pr[\mathrm{WIN}] &= \Pr[Y \in V \mid X = x_i]\cdot\Pr[X = x_i] + \Pr[Y \notin V \mid X = x_{1-i}]\cdot\Pr[X = x_{1-i}] \\
&= \tfrac{1}{2}\left(\Pr[Y \in V \mid X = x_i] + \Pr[Y \notin V \mid X = x_{1-i}]\right) \\
&= \tfrac{1}{2}\left(\Pr[Y \in V \mid X = x_i] + 1 - \Pr[Y \in V \mid X = x_{1-i}]\right) \\
&= \tfrac{1}{2} + \tfrac{1}{2}\left(\Pr[Y \in V \mid X = x_i] - \Pr[Y \in V \mid X = x_{1-i}]\right) \\
&\le \tfrac{1}{2} + \tfrac{1}{2}\left(\Pr[Y \in V \mid X = x_i] - e^{-\varepsilon}\,\Pr[Y \in V \mid X = x_i]\right) \\
&= \tfrac{1}{2} + \Pr[Y \in V \mid X = x_i]\cdot\frac{e^{\varepsilon}-1}{2e^{\varepsilon}} \;\le\; \tfrac{1}{2} + \frac{e^{\varepsilon}-1}{2e^{\varepsilon}},
\end{aligned}
$$

where the fifth step (the inequality) follows from Definition 2.1. For small values of ε it holds that e^ε ≈ 1 + ε, hence the probability of the adversary guessing correctly is Pr[WIN] ≲ 1/2 + ε/2.

## Differential Privacy is Necessary in Practice

Differential privacy may seem to be an overly stringent notion of privacy since, as explained above, it allows the adversary to select all entries in the database other than the entry under attack, and in addition to select two possible values for this last entry. One may dismiss the necessity of such a strict notion by arguing that it is hardly ever the case that the adversary has auxiliary information about most entries in a huge database such as that of Facebook, let alone the ability to select all the entries but the one it is attacking. However, we show that the seemingly overly strong definition of differential privacy is exactly what is needed here, and nothing weaker. We use the attack of Korolova [12] to illustrate this point.

**The attacks of [12].** We first outline the two main steps in the general scheme of the attacks described in [12]. In Appendix B, we give a more detailed description of one specific attack. Let U be a Facebook user profile for which we are interested in inferring information classified as restricted to a small subset of Facebook users, e.g., marked with "Only me" or "Friends only" visibility mode (that is, a certain feature F that can be specified as a criterion in the targeting process, e.g., U's sexual preferences).

1. Collect auxiliary information on the user U, on its Facebook profile, and on what distinguishes U's profile from the rest of the Facebook user network. Form a set of criteria (excluding any criterion on the feature of interest F) that uniquely identifies the user U.

2. Run a campaign with the above criteria for every possible value f of the feature F. If there is exactly one campaign (invoked with some value f for F) for which impressions (or clicks) are reported, and in this campaign they are unique (i.e., attributed to a single user), then conclude that f is the value appearing in U's profile for the feature F. Otherwise, update the auxiliary information and run the attack again.
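The two steps above can be sketched as a small loop. This is a hedged illustration only: `run_campaign`, the criteria encoding, and its return convention are hypothetical placeholders, not a real advertising API.

```python
def infer_feature(identifying_criteria, feature_values, run_campaign):
    """Step 2 of the attack: run one campaign per candidate value f of F.

    run_campaign(criteria) is assumed to return the reported number of
    unique impressions for a campaign targeted with the given criteria.
    """
    hits = []
    for f in feature_values:
        # Target users matching the identifying criteria AND feature F = f.
        reported = run_campaign(identifying_criteria + [("F", f)])
        if reported == 1:            # impressions attributed to a single user
            hits.append(f)
    if len(hits) == 1:
        return hits[0]               # conclude: f appears in U's profile
    return None                      # ambiguous: refine criteria and retry
```

With exact counts, a single campaign reporting exactly one unique impression pins down U's value for F; perturbing the reported counts is precisely what invalidates the `reported == 1` test.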

single entry of the input. We next show that for queries that have low sensitivity (in a very strong sense), it is enough to mask the value of q(x) by a carefully selected random variable.

**Query sensitivity [7].** Given a query q : D^n → ℝ, the local sensitivity is a function of both q and a given database x, defined by

$$ LS_q(x) = \max_{x' : d_H(x, x') = 1} |q(x) - q(x')|. $$

The global sensitivity is a function of q, taken to be the maximum local sensitivity over all databases x, i.e., GS_q = max_{x ∈ D^n} LS_q(x). In the case of the Facebook impressions statistic, the global sensitivity is the maximum number of times that a single person could conceivably view the advertisement within the given time period. In contrast, the global sensitivity of the unique impressions statistic is exactly 1 (because each person either saw the advertisement or did not, and this is what is counted).

**Differential privacy for sum queries from the Laplace distribution.** The Laplace distribution, Lap(λ), is the continuous probability distribution with probability density function

$$ h(y) = \frac{1}{2\lambda}\,\exp(-|y|/\lambda). $$

For Y ~ Lap(λ) we have that E[Y] = 0, Var[Y] = 2λ², and Pr[|Y| > kλ] = e^{−k}. The following holds for all y, y':

$$ \frac{h(y)}{h(y')} = \frac{e^{-|y|/\lambda}}{e^{-|y'|/\lambda}} = e^{(|y'| - |y|)/\lambda} \le e^{|y - y'|/\lambda}, \qquad (2) $$

where the inequality follows from the triangle inequality. The framework of output perturbation via global sensitivity was suggested in [7]. In this framework we consider queries of the form q : D^n → ℝ. The outcome is obtained by adding to q(x) noise sampled from the Laplace distribution, calibrated to GS_q. Formally, ˆf_q is defined as

$$ \hat{f}_q(x) = q(x) + Y, \quad \text{where } Y \sim \mathrm{Lap}(GS_q/\varepsilon). \qquad (3) $$

This results in an ε-differentially private mechanism. To verify this, for a database y, denote by h_y(·) the probability density function of the distribution on the output of ˆf_q(y). For every v ∈ ℝ representing an outcome of ˆf_q(x), it holds that h_y(v) = h(v − q(y)).
Thus, for every pair of neighboring databases x, x' and for every possible outcome v ∈ ℝ, we have that

$$
\frac{h_x(v)}{h_{x'}(v)} = \frac{h(v - q(x))}{h(v - q(x'))} \le e^{\frac{\varepsilon}{GS_q}\left|(v - q(x)) - (v - q(x'))\right|} \;\;\text{(by Equation (2))}\;
= e^{\frac{\varepsilon}{GS_q}\left|q(x') - q(x)\right|} \le e^{\varepsilon} \;\;\text{(since } GS_q \ge |q(x') - q(x)|\text{)}.
$$
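A minimal sketch of the output-perturbation mechanism of Equation (3), using inverse-CDF sampling for the Laplace distribution. The helper names are ours, and only the Python standard library is assumed:

```python
import math
import random

def laplace_noise(scale, rng=random):
    # Inverse-CDF sample from Lap(scale): density exp(-|y|/scale) / (2*scale).
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturbed_query(q, x, global_sensitivity, epsilon):
    # Equation (3): release q(x) + Y with Y ~ Lap(GS_q / epsilon).
    return q(x) + laplace_noise(global_sensitivity / epsilon)

# A unique-impressions style count query has global sensitivity 1:
# each entry contributes either 1 or 0 to the count.
q_unique = lambda db: sum(1 for impressions in db if impressions > 0)
```

The sampler satisfies the tail property quoted above, Pr[|Y| > kλ] = e^{−k}, which is what drives the error analysis later on.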

4. Unique clicks: q_4(x) = |{j : C_j > 0}|.

We note that the unique impressions and clicks statistics are actually count queries, which are special cases of sum queries where all values are Boolean.

## 3.2 The Proposed Privacy-Preserving Mechanism

In this section we present our concrete proposal for releasing the impression and click statistics in a privacy-preserving manner. The general mechanism we suggest is obtained by composing four differentially private mechanisms (each denoted S_i and used with a privacy parameter ε_i, where i ∈ {1, ..., 4}); mechanism S_i relates to the query q_i defined above. Each of the four mechanisms works by calibrating noise to the sensitivity of one of the four query types, using the method described in Example 2.2 for sum queries. As will be explained below, we select the privacy parameter ε_i for each sub-mechanism S_i according to the sensitivity of the i-th sum-query q_i and to the level of accuracy that we consider relevant to this query. By Lemma A.5, a mechanism releasing all four (perturbed) counts is ε-differentially private for ε = ε_1 + ε_2 + ε_3 + ε_4.

**The sensitivity of the Facebook statistics.** In order to use the mechanism of Example 2.2 (for each of the four sum-queries), we need to know the sensitivity of each sum-query. That is, we need to bound the change in each sum that may result from a change in the data of a single entry (i.e., the personal information of a single Facebook user). Regarding unique impressions and clicks, each database entry can change the result by only 1. Thus, the global sensitivity of the functions q_3(·) and q_4(·) is 1 (i.e., we have γ_3 = γ_4 = 1, where γ_i denotes the global sensitivity of the i-th sum-query). The case of non-unique impressions and non-unique clicks is less clear, since we do not have an a priori bound on these counts. However, we argue that it is quite easy to give a bound that is very unlikely to be crossed. This is especially true when giving the statistics per day, as Facebook does.
For this case, we use γ_1 = 20 (the global sensitivity of q_1(·)) as the bound on the number of impressions attributed to a single person in a given day, and we use γ_2 = 3 (the global sensitivity of q_2(·)) as the bound on the number of clicks attributed to a single person in a given day. We note that such a bound can be artificially imposed (by Facebook, in its statistics) by omitting counts of any additional impressions or clicks by the same user (alternatively, all counts may be taken into account, and the privacy guarantees rely on the low probability that some user surpasses the bounds). Let ε_1, ..., ε_4 be privacy parameters that can be modified to trade off privacy and utility, as desired (this will be discussed in more detail below). The mechanisms are then defined as follows:

**Mechanism S_1, outputting the perturbed number of impressions Z_1:** To obtain Z_1, we sample Laplacian noise Y_1 ~ Lap(γ_1/ε_1) and add it to the actual number of impressions. That is, the first sub-mechanism S_1 outputs Z_1 ← q_1(x) + Y_1, where Y_1 ~ Lap(20/ε_1).

**Mechanism S_2, outputting the perturbed number of clicks Z_2:** To obtain Z_2, we sample Laplacian noise Y_2 ~ Lap(γ_2/ε_2) and add it to the actual number of clicks. That is, the second sub-mechanism S_2 outputs Z_2 ← q_2(x) + Y_2, where Y_2 ~ Lap(3/ε_2).

**Mechanism S_3, outputting the perturbed number of unique impressions Z_3:** The mechanism outputs Z_3 ← q_3(x) + Y_3, where Y_3 ~ Lap(1/ε_3).

**Mechanism S_4, outputting the perturbed number of unique clicks Z_4:** The mechanism outputs Z_4 ← q_4(x) + Y_4, where Y_4 ~ Lap(1/ε_4).

In all of the above, it does not seem natural to ever release negative values. To prevent this from happening, whenever the outcome is negative (i.e., Z_i < 0), we simply round Z_i up to 0. It is easy to verify that this has no effect on the privacy of the mechanism (the rounding operation is applied to the output of the mechanism and does not depend on the original database or query; indeed, this operation can only reduce the amount of information released about the database).

## 3.3 Practical Choice of Privacy Parameters for Mechanism S

In this section we suggest how to instantiate the privacy parameters ε_1, ε_2, ε_3, ε_4 of the mechanism S described in the previous section. We stress that this is just one suggestion. The principles described below for making this choice can be used to tailor the actual privacy and utility requirements, as desired by Facebook or any other advertising agency.

**The choice of privacy parameters.** Before presenting the concrete privacy parameters for our mechanism, we discuss how to choose them. The tension between privacy and utility is inherent; hence, our goal is to find privacy parameters that optimize this trade-off. Specifically, a smaller value of ε yields a stronger privacy guarantee but also reduces the accuracy of the published statistics (since more Laplacian noise is added). Conversely, a larger ε yields more accurate results but decreases the level of privacy. We choose the privacy parameters by first fixing the privacy parameter for the overall mechanism S and then adjusting the privacy parameters of the sub-mechanisms to comply with this parameter. That is, our first requirement is that the mechanism S will indeed preserve privacy in a meaningful sense.
We hence fix ε = 0.2 to be the privacy parameter for the overall mechanism, and we seek privacy parameters ε_1, ε_2, ε_3, ε_4 for the sub-mechanisms such that ε = ε_1 + ε_2 + ε_3 + ε_4. To choose ε_i, we take into account the sensitivity of the corresponding query and the magnitude of error that we deem reasonable for this query. Specifically, in choosing ε_1 (i.e., the privacy parameter for releasing the number of impressions), we consider an error of a few thousand impressions to be reasonable, even for small campaigns (mainly since impressions are counted in thousands). Since the global sensitivity of this sum-query is γ_1 = 20, the added noise Y_1 is sampled from the Laplace distribution with scale parameter 20/ε_1; we take ε_1 = 0.03 to obtain the desired guarantees. In choosing ε_2 (i.e., the privacy parameter for releasing the number of clicks), we consider an error of a bit over a hundred clicks to be reasonable. Since the global sensitivity of this sum-query is γ_2 = 3, the added noise Y_2 is sampled with scale parameter 3/ε_2; we take ε_2 = 0.11 to obtain the desired guarantees. In choosing ε_3 and ε_4 we follow similar considerations: in the case of unique impressions we consider an error magnitude of 500 to be highly acceptable, and in the case of unique clicks we consider an error magnitude of 70 to be acceptable. We hence take ε_3 = 0.01 and ε_4 = 0.05 to be the privacy parameters of these mechanisms. By Lemma A.5, the privacy parameter we obtain for the overall mechanism is indeed ε = ε_1 + ε_2 + ε_3 + ε_4 = 0.2. We remark that it is possible to allow a user (advertiser) to request any setting of the ε_i parameters as long as the sum of all four does not exceed 0.2. (This makes sense if the accuracy of one of the statistics is much more important to a given advertiser than the others.)
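Putting the pieces together, the overall mechanism S with the parameters chosen above can be sketched as follows. This is a simulation sketch under our stated parameters, not Facebook's implementation; the Laplace sampler is a stdlib-only stand-in:

```python
import math
import random

# Sensitivities gamma_i and privacy parameters eps_i as chosen above;
# the eps_i sum to the overall privacy parameter eps = 0.2.
GAMMAS   = (20, 3, 1, 1)      # impressions, clicks, unique impr., unique clicks
EPSILONS = (0.03, 0.11, 0.01, 0.05)

def laplace_noise(scale, rng=random):
    # Inverse-CDF sample from Lap(scale).
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def mechanism_S(counts, rng=random):
    """counts = (impressions, clicks, unique_impressions, unique_clicks).
    Releases the four perturbed statistics Z_1..Z_4; negative outcomes are
    rounded up to 0 (post-processing, so privacy is unaffected)."""
    return tuple(max(0.0, c + laplace_noise(g / e, rng))
                 for c, g, e in zip(counts, GAMMAS, EPSILONS))
```

By the composition lemma (Lemma A.5), releasing all four counts is ε-differentially private for ε = 0.03 + 0.11 + 0.01 + 0.05 = 0.2.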

**Probability of error.** Adding random noise to the released statistics reduces their correctness. Clearly, too much noise may reduce the accuracy of the evaluation of the effectiveness of the campaign. Furthermore, it may cause the marketer to pay more than is reasonable for such a campaign. We argue that the error will only be meaningful for very small campaigns (i.e., campaigns with a very small expected number of impressions and clicks). This is true since the noise is added irrespective of the count results and depends only on the privacy parameter ε; hence, the probability of a large error relative to a large count is exponentially small. We now analyze how large a campaign should be in order for the error to be insignificant, according to the parameters we selected.

**Error in impression count (Z_1):** We analyze the probability that the reported number of impressions is at distance at least k from the actual impression count. Since this distance is exactly the amount of noise we added, we need to bound the probability that we added noise of magnitude at least k, i.e., |Y_1| ≥ k. By the properties of the Laplace distribution, this probability is e^{−kε_1/20} = e^{−0.0015k}. Hence, for an error of size k = 2000 this probability is about 0.05, and for an error of size k = 5000 it is about 0.0006. Recall that impressions are counted in thousands, and hence this should not be considered a large error.

**Error in click count (Z_2):** We analyze the probability that the reported number of clicks is at distance at least k from the actual click count, i.e., the probability that |Y_2| > k. By the properties of the Laplace distribution, this probability is e^{−kε_2/3} ≈ e^{−0.037k}. Hence, for an error of size k = 50 this probability is about 0.16, for an error of size k = 100 it is about 0.025, and for an error of size k = 200 it decreases to about 0.0007. We remark that this statistic seems to incorporate the most noise of all four counts.
**Error in unique impression count (Z_3):** We analyze the probability of an error of size k in reporting the number of unique impressions, i.e., the probability that |Y_3| > k. By the properties of the Laplace distribution, this probability is e^{−kε_3} = e^{−0.01k}. Hence, for an error of size k = 250 this probability is about 0.082, and for an error of size k = 500 it is about 0.007.

**Error in unique click count (Z_4):** We analyze the probability of an error of size k in reporting the number of unique clicks, i.e., the probability that |Y_4| > k. By the properties of the Laplace distribution, this probability is e^{−kε_4} = e^{−0.05k}. Hence, for an error of size k = 50 this probability is about 0.082, and for an error of size k = 100 it is about 0.007.

**Remark 3.1** Evidently, using the suggested mechanism may cause small advertising campaigns to be somewhat less cost-effective than they currently are. However, we believe that most genuine advertisers aim for large enough campaigns and will suffer only a small percentage of additional cost. Furthermore, a marketer that invokes many (large enough) campaigns will eventually suffer almost no additional cost since, on average, a zero amount of noise is added to the reported statistics (for small campaigns this may not be true, due to the one-sided rounding of negative outputs).

**The meaning of ε-differential privacy with ε = 0.2.** We now try to shed some light on the privacy guarantee yielded by our mechanism. Consider an attacker that executes an attack similar to the one suggested in [12], aimed at revealing some private information of a specific Facebook user. For the sake of concreteness, consider an employer trying to categorize its employees by their sexual orientation. That is, for each employee the employer would like to know whether
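The tail probabilities quoted above follow directly from Pr[|Y_i| > k] = e^{−kε_i/γ_i}; a quick numerical check, with the ε_i and γ_i as chosen in Section 3.3:

```python
import math

def error_probability(k, eps, gamma):
    # Pr[|Y| > k] for Y ~ Lap(gamma / eps).
    return math.exp(-k * eps / gamma)

# (statistic, k, eps_i, gamma_i) -> probability quoted in the text
checks = [
    ("impressions",        2000, 0.03, 20),   # ~0.05
    ("clicks",               50, 0.11,  3),   # ~0.16
    ("unique impressions",  250, 0.01,  1),   # ~0.082
    ("unique clicks",        50, 0.05,  1),   # ~0.082
]
for name, k, eps, gamma in checks:
    print(name, round(error_probability(k, eps, gamma), 3))
```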

result). Thus the marketer holding the campaign may prefer to receive fewer (but more accurate) counts. Furthermore, it may well be the case that many of the counts that are of interest in large campaigns are quite irrelevant in smaller ones. Consider, for example, the social impressions and social clicks discussed above. These counts seem to be meaningless for campaigns with only a few hundred impressions and a few tens of clicks.

## References

[1] Boaz Barak, Kamalika Chaudhuri, Cynthia Dwork, Satyen Kale, Frank McSherry, and Kunal Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In Proc. of the Twenty-Sixth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. ACM, New York, NY, USA.

[2] A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In Proc. of the Twenty-Fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.

[3] Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to non-interactive database privacy. In Proc. of the 40th ACM Symposium on the Theory of Computing. ACM, New York, NY, USA.

[4] I. Dinur and K. Nissim. Revealing information while preserving privacy. In Proc. of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems.

[5] C. Dwork. Differential privacy. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, Proc. of the 33rd International Colloquium on Automata, Languages and Programming, volume 4052 of Lecture Notes in Computer Science. Springer-Verlag.

[6] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology - EUROCRYPT 2006. Springer.

[7] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In S. Halevi and T. Rabin, editors, Proc. of the Third Theory of Cryptography Conference - TCC 2006, volume 3876 of Lecture Notes in Computer Science. Springer-Verlag.

[8] C. Dwork and K. Nissim. Privacy-preserving datamining on vertically partitioned databases. In M. Franklin, editor, Advances in Cryptology - CRYPTO 2004, volume 3152 of Lecture Notes in Computer Science. Springer-Verlag.

[9] A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data mining. In Proc. of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.

[10] A. Ghosh, T. Roughgarden, and M. Sundararajan. Universally utility-maximizing privacy mechanisms. In Proc. of the 41st ACM Symposium on the Theory of Computing,

to see that this implies the requirement of Definition 2.1. Hence, this mechanism is ε-differentially private. We remark that this mechanism yields almost no usefulness; however, as shown above, generalizations of these ideas prove highly useful in constructing differentially private analyses of larger databases.

**Example A.2** Consider a mechanism very similar to that of Example A.1 which, given x ∈ {0, 1}, outputs x + Y′, where Y′ is obtained by limiting the random variable Y, sampled as before (i.e., according to Lap(1/ε)), to the interval [−k/ε, k/ε] for some large k (that is, if Y > k/ε we set Y′ = k/ε; similarly, if Y < −k/ε we set Y′ = −k/ε; and otherwise we set Y′ = Y). Note that the resulting mechanism is no longer ε-differentially private, since Pr[S(0) > k/ε] = 0 while Pr[S(1) > k/ε] > 0; hence the ratio between these two probabilities is unbounded. However, note that the probability that S(1) > k/ε is exponentially small in k; hence, the overall probability that an adversary is able to distinguish between the two cases stays practically the same as in Example A.1. It is therefore only natural to still call this mechanism private.

**Definition A.3 ((ε, δ)-differential privacy [6])** A mechanism S is said to be (ε, δ)-differentially private if for all neighboring pairs of databases x, x′ ∈ D^n, and for all subsets of possible answers V:

$$ \Pr[S(x) \in V] \le \Pr[S(x') \in V]\cdot e^{\varepsilon} + \delta. \qquad (4) $$

The probability is taken over the coin tosses of the mechanism.

## A.2 Privacy of Sets

While differential privacy is intended to capture the notion of individual privacy, and accordingly is defined with respect to a change in a single entry, it would be somewhat disappointing to find that it allows a change in, say, two or three entries to cause a massive change in the output distribution. Fortunately, as we next show, this is not the case; rather, privacy of sets deteriorates only linearly in the size of the set (for small sets and for small enough ε).
Obviously, for an analysis to be meaningful, the distribution on the outputs must change with a change of many of the entries in the database; thus privacy of sets must deteriorate, at least for large enough sets.

**Lemma A.4** Let S be an ε-differentially private mechanism and let x, x′ be two databases such that d_H(x, x′) = c. Then

$$ \frac{\Pr[S(x) \in V]}{\Pr[S(x') \in V]} \le e^{\varepsilon c}. $$

**Proof:** We prove the lemma by induction on c. For c = 1, the claim is simply the ε-differential privacy of S. Assume correctness for c, and let x, x′ be two databases such that d_H(x, x′) = c + 1. There exists a database x′′ such that d_H(x, x′′) = c and d_H(x′′, x′) = 1. By Equation (1) and by the induction hypothesis, it follows that

$$ \frac{\Pr[S(x) \in V]}{\Pr[S(x') \in V]} = \frac{\Pr[S(x) \in V]}{\Pr[S(x'') \in V]} \cdot \frac{\Pr[S(x'') \in V]}{\Pr[S(x') \in V]} \le e^{\varepsilon c} \cdot e^{\varepsilon} = e^{\varepsilon(c+1)}. $$
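For the Laplace mechanism on a query with GS_q = 1, the bound of Lemma A.4 can be checked directly on the output densities. This is a small numerical sanity check; the grid of points and the choice of q(x) = 0, q(x′) = c are arbitrary illustrations:

```python
import math

def lap_density(v, mean, scale):
    # Density of mean + Lap(scale) at the point v.
    return math.exp(-abs(v - mean) / scale) / (2.0 * scale)

eps, c = 0.2, 3                # databases at Hamming distance c
scale = 1.0 / eps              # Lap(GS_q / eps) with GS_q = 1
# If q(x) = 0 and q(x') = c (the largest possible gap when GS_q = 1),
# the density ratio never exceeds e^(eps * c), matching Lemma A.4.
for v in (-5.0, 0.0, 1.5, 3.0, 7.0):
    ratio = lap_density(v, 0.0, scale) / lap_density(v, float(c), scale)
    assert ratio <= math.exp(eps * c) + 1e-12
```

The bound is tight: for points far below both means the ratio equals e^{εc} exactly.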

It is instructive to note that the Inference from Impressions attack, described in Figure 2, does not require the user U to pay attention to the ad (and certainly not to click on it) in order for the attack to succeed in inferring U's private information. It is enough that the user U connects to Facebook sufficiently often that the ads have a chance of being displayed to U at least once over the observation time period.


### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### Privacy-Preserving Aggregation of Time-Series Data

Privacy-Preserving Aggregation of Time-Series Data Elaine Shi PARC/UC Berkeley elaines@eecs.berkeley.edu Richard Chow PARC rchow@parc.com T-H. Hubert Chan The University of Hong Kong hubert@cs.hku.hk Dawn

### Chapter 21: The Discounted Utility Model

Chapter 21: The Discounted Utility Model 21.1: Introduction This is an important chapter in that it introduces, and explores the implications of, an empirically relevant utility function representing intertemporal

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice,

### Cost of Conciseness in Sponsored Search Auctions

Cost of Conciseness in Sponsored Search Auctions Zoë Abrams Yahoo!, Inc. Arpita Ghosh Yahoo! Research 2821 Mission College Blvd. Santa Clara, CA, USA {za,arpita,erikvee}@yahoo-inc.com Erik Vee Yahoo! Research

### How to build a probability-free casino

How to build a probability-free casino Adam Chalcraft CCR La Jolla dachalc@ccrwest.org Chris Freiling Cal State San Bernardino cfreilin@csusb.edu Randall Dougherty CCR La Jolla rdough@ccrwest.org Jason

### Probability Using Dice

Using Dice One Page Overview By Robert B. Brown, The Ohio State University Topics: Levels:, Statistics Grades 5 8 Problem: What are the probabilities of rolling various sums with two dice? How can you

### Victor Shoup Avi Rubin. fshoup,rubing@bellcore.com. Abstract

Session Key Distribution Using Smart Cards Victor Shoup Avi Rubin Bellcore, 445 South St., Morristown, NJ 07960 fshoup,rubing@bellcore.com Abstract In this paper, we investigate a method by which smart

### Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

### The Goldberg Rao Algorithm for the Maximum Flow Problem

The Goldberg Rao Algorithm for the Maximum Flow Problem COS 528 class notes October 18, 2006 Scribe: Dávid Papp Main idea: use of the blocking flow paradigm to achieve essentially O(min{m 2/3, n 1/2 }

Mechanisms for Fair Attribution Eric Balkanski Yaron Singer Abstract We propose a new framework for optimization under fairness constraints. The problems we consider model procurement where the goal is

### International Journal of Information Technology, Modeling and Computing (IJITMC) Vol.1, No.3,August 2013

FACTORING CRYPTOSYSTEM MODULI WHEN THE CO-FACTORS DIFFERENCE IS BOUNDED Omar Akchiche 1 and Omar Khadir 2 1,2 Laboratory of Mathematics, Cryptography and Mechanics, Fstm, University of Hassan II Mohammedia-Casablanca,

### Evaluation of a New Method for Measuring the Internet Degree Distribution: Simulation Results

Evaluation of a New Method for Measuring the Internet Distribution: Simulation Results Christophe Crespelle and Fabien Tarissan LIP6 CNRS and Université Pierre et Marie Curie Paris 6 4 avenue du président

### A THEORETICAL COMPARISON OF DATA MASKING TECHNIQUES FOR NUMERICAL MICRODATA

A THEORETICAL COMPARISON OF DATA MASKING TECHNIQUES FOR NUMERICAL MICRODATA Krish Muralidhar University of Kentucky Rathindra Sarathy Oklahoma State University Agency Internal User Unmasked Result Subjects

### Factoring & Primality

Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

### Efficient Algorithms for Masking and Finding Quasi-Identifiers

Efficient Algorithms for Masking and Finding Quasi-Identifiers Rajeev Motwani Stanford University rajeev@cs.stanford.edu Ying Xu Stanford University xuying@cs.stanford.edu ABSTRACT A quasi-identifier refers

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22

CS 70 Discrete Mathematics for CS Fall 2006 Papadimitriou & Vazirani Lecture 22 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice, roulette

### 1 Approximating Set Cover

CS 05: Algorithms (Grad) Feb 2-24, 2005 Approximating Set Cover. Definition An Instance (X, F ) of the set-covering problem consists of a finite set X and a family F of subset of X, such that every elemennt

### Strategic Online Advertising: Modeling Internet User Behavior with

2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew

### RE-IDENTIFICATION RISK IN SWAPPED MICRODATA RELEASE

RE-IDENTIFICATION RISK IN SWAPPED MICRODATA RELEASE Krish Muralidhar, Department of Marketing and Supply Chain Management, University of Oklahoma, Norman, OK 73019, (405)-325-2677, krishm@ou.edu ABSTRACT

effective FACEBOOK MARKETPLACE advertising April 2012 Lead Contributors Justin Fontana Search Marketing Supervisor Havas Media justin.fontana@havasmedia.com Rob Griffin Global Director Product Development

### Keywords the Most Important Item in SEO

This is one of several guides and instructionals that we ll be sending you through the course of our Management Service. Please read these instructionals so that you can better understand what you can

### Stationary random graphs on Z with prescribed iid degrees and finite mean connections

Stationary random graphs on Z with prescribed iid degrees and finite mean connections Maria Deijfen Johan Jonasson February 2006 Abstract Let F be a probability distribution with support on the non-negative

### Challenges of Data Privacy in the Era of Big Data. Rebecca C. Steorts, Vishesh Karwa Carnegie Mellon University November 18, 2014

Challenges of Data Privacy in the Era of Big Data Rebecca C. Steorts, Vishesh Karwa Carnegie Mellon University November 18, 2014 1 Outline Why should we care? What is privacy? How do achieve privacy? Big

### 6 Scalar, Stochastic, Discrete Dynamic Systems

47 6 Scalar, Stochastic, Discrete Dynamic Systems Consider modeling a population of sand-hill cranes in year n by the first-order, deterministic recurrence equation y(n + 1) = Ry(n) where R = 1 + r = 1

### Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

### 4: SINGLE-PERIOD MARKET MODELS

4: SINGLE-PERIOD MARKET MODELS Ben Goldys and Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2015 B. Goldys and M. Rutkowski (USydney) Slides 4: Single-Period Market

Single-Period Balancing of Pay Per-Click and Pay-Per-View Online Display Advertisements Changhyun Kwon Department of Industrial and Systems Engineering University at Buffalo, the State University of New

### Triangle deletion. Ernie Croot. February 3, 2010

Triangle deletion Ernie Croot February 3, 2010 1 Introduction The purpose of this note is to give an intuitive outline of the triangle deletion theorem of Ruzsa and Szemerédi, which says that if G = (V,

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Security Games in Online Advertising: Can Ads Help Secure the Web?

Security Games in Online Advertising: Can Ads Help Secure the Web? Nevena Vratonjic, Jean-Pierre Hubaux, Maxim Raya School of Computer and Communication Sciences EPFL, Switzerland firstname.lastname@epfl.ch

### Oscillations of the Sending Window in Compound TCP

Oscillations of the Sending Window in Compound TCP Alberto Blanc 1, Denis Collange 1, and Konstantin Avrachenkov 2 1 Orange Labs, 905 rue Albert Einstein, 06921 Sophia Antipolis, France 2 I.N.R.I.A. 2004

### Credit Card Market Study Interim Report: Annex 4 Switching Analysis

MS14/6.2: Annex 4 Market Study Interim Report: Annex 4 November 2015 This annex describes data analysis we carried out to improve our understanding of switching and shopping around behaviour in the UK

### Here are our Pay per Click Advertising Packages:

Did you know that PPC is the fastest way to drive instant traffic to your website? What can your website do without traffic? Without traffic, you are losing time, opportunities and money to your competition.

### Simulation-Based Security with Inexhaustible Interactive Turing Machines

Simulation-Based Security with Inexhaustible Interactive Turing Machines Ralf Küsters Institut für Informatik Christian-Albrechts-Universität zu Kiel 24098 Kiel, Germany kuesters@ti.informatik.uni-kiel.de

### Hacking-proofness and Stability in a Model of Information Security Networks

Hacking-proofness and Stability in a Model of Information Security Networks Sunghoon Hong Preliminary draft, not for citation. March 1, 2008 Abstract We introduce a model of information security networks.

Optimal energy trade-off schedules Neal Barcelo, Daniel G. Cole, Dimitrios Letsios, Michael Nugent, Kirk R. Pruhs To cite this version: Neal Barcelo, Daniel G. Cole, Dimitrios Letsios, Michael Nugent,

### Section 3 Sequences and Limits, Continued.

Section 3 Sequences and Limits, Continued. Lemma 3.6 Let {a n } n N be a convergent sequence for which a n 0 for all n N and it α 0. Then there exists N N such that for all n N. α a n 3 α In particular

### Competition and Fraud in Online Advertising Markets

Competition and Fraud in Online Advertising Markets Bob Mungamuru 1 and Stephen Weis 2 1 Stanford University, Stanford, CA, USA 94305 2 Google Inc., Mountain View, CA, USA 94043 Abstract. An economic model

### 2.1 Complexity Classes

15-859(M): Randomized Algorithms Lecturer: Shuchi Chawla Topic: Complexity classes, Identity checking Date: September 15, 2004 Scribe: Andrew Gilpin 2.1 Complexity Classes In this lecture we will look

### MTH6120 Further Topics in Mathematical Finance Lesson 2

MTH6120 Further Topics in Mathematical Finance Lesson 2 Contents 1.2.3 Non-constant interest rates....................... 15 1.3 Arbitrage and Black-Scholes Theory....................... 16 1.3.1 Informal

### Chapter 11. Asymmetric Encryption. 11.1 Asymmetric encryption schemes

Chapter 11 Asymmetric Encryption The setting of public-key cryptography is also called the asymmetric setting due to the asymmetry in key information held by the parties. Namely one party has a secret

### Week 5 Integral Polyhedra

Week 5 Integral Polyhedra We have seen some examples 1 of linear programming formulation that are integral, meaning that every basic feasible solution is an integral vector. This week we develop a theory

### [Refer Slide Time: 05:10]

Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture

### Automated SEO. A Market Brew White Paper

Automated SEO A Market Brew White Paper Abstract In this paper, we use the term Reach to suggest the forecasted traffic to a particular webpage or website. Reach is a traffic metric that describes an expected

### Rafael Witten Yuze Huang Haithem Turki. Playing Strong Poker. 1. Why Poker?

Rafael Witten Yuze Huang Haithem Turki Playing Strong Poker 1. Why Poker? Chess, checkers and Othello have been conquered by machine learning - chess computers are vastly superior to humans and checkers

### Applying Machine Learning to Stock Market Trading Bryce Taylor

Applying Machine Learning to Stock Market Trading Bryce Taylor Abstract: In an effort to emulate human investors who read publicly available materials in order to make decisions about their investments,

### 24. The Branch and Bound Method

24. The Branch and Bound Method It has serious practical consequences if it is known that a combinatorial problem is NP-complete. Then one can conclude according to the present state of science that no