A Practical Framework for Privacy-Preserving Data Analytics


Liyue Fan
Integrated Media Systems Center, University of Southern California, Los Angeles, CA, USA
(Work done while interning with Samsung.)

Hongxia Jin
Samsung R&D Research Center, San Jose, CA, USA

ABSTRACT

The availability of an increasing amount of user generated data is transformative to our society. We enjoy the benefits of analyzing big data for public interest, such as disease outbreak detection and traffic control, as well as for commercial interests, such as smart grid and product recommendation. However, the large collection of user generated data contains unique patterns and can be used to re-identify individuals, which has been exemplified by the AOL search log release incident. In this paper, we propose a practical framework for data analytics, while providing differential privacy guarantees to individual data contributors. Our framework generates differentially private aggregates which can be used to perform data mining and recommendation tasks. To alleviate the high perturbation errors introduced by the differential privacy mechanism, we present two methods with different sampling techniques to draw a subset of individual data for analysis. Empirical studies with real-world data sets show that our solutions enable accurate data analytics on a small fraction of the input data, reducing user privacy risk and data storage requirement without compromising the analysis results.

Categories and Subject Descriptors

H.2.7 [Database Management]: Database Administration - Security, integrity, and protection; H.2.8 [Database Management]: Database Applications - Data mining

Keywords

Data Analytics, Differential Privacy, Sampling

Copyright is held by the International World Wide Web Conference Committee (IW3C2). IW3C2 reserves the right to provide a hyperlink to the author's site if the Material is used in electronic media. WWW 2015, May 18–22, 2015, Florence, Italy. ACM /5/5.

1. INTRODUCTION

We live in the age of big data. With an increasing number of people, devices, and sensors connected with digital networks, individual data now can be largely collected and analyzed to understand important phenomena. One example is Google Flu Trends, a service that estimates flu activity by aggregating individual search queries. In the retail market, individual purchase histories are used by recommendation tools to learn trends and patterns. Performing analytics on private data is clearly beneficial, for instance in early detection of disease and in recommendation services. However, users grow concerned about privacy as they share an increasing amount of information regarding their health, location, service usage, and online activities. As a matter of fact, the uniqueness of each user is increased by the big collection of individual data. The AOL data release in 2006 is an unfortunate example of privacy catastrophe [], in which the search logs of an innocent citizen were quickly identified by a newspaper journalist. A recent study by de Montjoye et al. [9] concludes that human mobility patterns are highly unique and four spatio-temporal points are enough to uniquely identify 95% of the individuals.

In order to protect users from re-identification attacks, their private data must be transformed prior to release for analysis. The current state-of-the-art paradigm for privacy-preserving data analysis is differential privacy [], which allows untrusted parties to access private data through aggregate queries. The aggregate statistics are perturbed by a randomized algorithm, such that the output remains roughly the same even if any user is added or removed in the input data. Differential privacy provides a strong guarantee: given the output statistics, an adversary will not be able to infer whether any user is present in the input database. However, this indistinguishability can only be achieved at high perturbation cost.

[Figure 1: Record Distribution of Netflix Users. Axis labels: Users, Records; annotations: Data Loss, Privacy Surplus, cutoff.]
Intuitively, the more data a user contributes to the analysis process, the more perturbation noise is needed to hide his/her presence. In some cases, a user could generate an unbounded amount of data, such as purchase or check-in history, the addition or removal of which may result in unlimited impact on the output. The challenge of enforcing differential privacy is that it incurs a surplus of privacy cost, i.e. high perturbation error, being designed to protect each user according to the highest possible data contribution. In reality, only a very small number of users generate a large amount of personal data, while the rest contribute little data each. As shown in Figure 1, out of 5 users from the Netflix prize competition [2], only 1 user generated around 7 data records, while the majority of users generated much less personal data, less than 2 data records each. If an upper bound is imposed on individual user data contribution, the surplus of privacy, e.g. high perturbation noise, can be reduced at the cost of data loss, i.e. part of the data from those users who contributed more than the threshold.

To limit individual data contribution, some strategies have been adopted by several works [6][25]. The authors of [6] used the first $d$ search queries submitted by each user, and the work in [25] reduced the number of items contained in each transaction to $l$ with smart truncation. However, there has been no discussion on the choice of the bounds, i.e., $d$ and $l$. Furthermore, the choice of actual user records (or items in a single transaction) remains non-trivial for generic applications.

With a rigorous privacy notion, we consider how to analyze individually contributed data to gain a deep understanding of service usage and behavior patterns, for various application domains. We would like to understand the impacts of privacy and data loss on the resulting data analytics, and design algorithms to draw private data accordingly. Example data analytical questions are: "Which places do people visit on Thursdays?" and "What are the most popular movies with female watchers under age 25?" We formally define the tasks as database queries and details are provided in Section 3.

Contributions. In this paper, we address the problem of differentially private data analytics, where each user could contribute a large number of records. We propose a generic framework to generate analysis results on a sampled database, and study two sampling methods as well as the sampling factor in order to achieve a balance between data loss and privacy surplus. We summarize the contributions of this paper as follows:

(1) We propose a generic, sampling-based framework for an important class of data analytical tasks: top-K mining and context-aware recommendation.
We consider the problem of releasing a set of count queries regarding the domain-specific items of interest as well as customizable predicates to answer deep, analytical questions. The count queries are perturbed prior to release such that they satisfy differential privacy.

(2) We design two algorithms that draw a sample of user records from the raw database and generate analysis results on the sampled data. The SRA algorithm randomly samples up to $l$ records per user. The HPA algorithm selects up to $l$ records from each user that are most useful for the specific analytical tasks. The utility of each record can be customized based on the actual application domain. We outline each sampling method and provide pseudocode for easy implementation.

(3) We provide analysis on the accuracy of random sampling, i.e. Mean Squared Error of released counts, with respect to the sampling factor $l$. We conclude that the optimal $l$ value is positively correlated with the privacy constraint. We show that performing record sampling on an individual user's data does not inflict extra privacy leakage. We formally prove that both sampling algorithms satisfy differential privacy.

(4) We conduct extensive empirical studies with various real-world data sets. We compare our approaches with existing differentially private mechanisms and evaluate the accuracy of released count data with three utility metrics. The experimental results show that, although performed on a small sampled database, our methods provide comparable performance to the best existing approaches in MSE and KL-divergence, and superior performance in top-K discovery and context-aware recommendation tasks. The HPA algorithm yields higher precision, while the SRA algorithm preserves well the distributional properties in released data.

We believe that our privacy-preserving framework will enable data analytics for a variety of services, reducing user privacy cost and data storage requirement without compromising output utility.
The rest of the paper is organized as follows: Section 2 briefly surveys the related works on privacy-preserving data publishing and analytics. Section 3 defines the problem and privacy notion. Section 4 presents the technical details of the proposed framework and two sampling algorithms. Theoretical results of privacy guarantees are provided in Section 5. Section 6 describes the data sets and presents a set of empirical studies. Finally, Section 7 concludes the paper and states possible directions for future work.

2. RELATED WORKS

A plethora of differentially private techniques have been developed since the introduction of ɛ-differential privacy in [2]. Here we briefly review the most recent works relevant to our problem.

Differential Privacy. Dwork et al. [2] first proposed ɛ-differential privacy and established the Laplace mechanism to perturb aggregate queries to guarantee differential privacy. Since then, two variants have been proposed and adopted by many works as relaxations of ɛ-differential privacy. The (ɛ, δ)-probabilistic differential privacy [9] achieves ɛ-differential privacy with high probability, i.e. $(1 - \delta)$. The (ɛ, δ)-indistinguishability [, 2] relaxes the bound of ɛ-differential privacy by introducing an additive term δ. Our work adopts the strict definition of ɛ-differential privacy and the Laplace mechanism to release numeric data for analysis.

Data Publication Techniques. Many works have been proposed to publish sanitized data with differential privacy. To list a few representatives among them, there is histogram publication for range queries [7], for a given workload [24], and for sparse data [8]. The majority of data publication methods consider settings where each user contributes only one record, or affects only one released count. In contrast, we focus on those services where each individual may contribute a large number of records and could even have unbounded influence on the released count queries.

Bounding Individual Contribution.
Here we review works established in a similar problem setting, i.e. where individual data contribution is high and therefore the global sensitivity is high. The work of Nissim et al. [2] proposed smooth sensitivity, which measures individual impact on the output statistics in the neighborhood of the database instance. They showed that smooth sensitivity allows a smaller amount of perturbation noise injected to released statistics. However, it does not guarantee ɛ-differential privacy. Proserpio et al. [22] recently proposed to generalize the ɛ-DP definition to weighted datasets, and scale down the weights of data records to reduce sensitivity. Rastogi and Nath [23], Fan and Xiong [3] and Chan et al. [5] studied the problem of sharing time series of counts with differential privacy, where the maximum individual contribution is $T$, the number of time points. The authors of [23] proposed to preserve only $k$ discrete Fourier coefficients of the original count series. The FAST framework in [3] reduced the sensitivity by sampling $M$ points in a count series and predicting at other points. The work [5] proposed the notion of p-sum to ensure each item in the stream only affects a small number of p-sums. Two works by Korolova et al. [6] and Hong et al. [4] addressed the differentially private publication of search logs, where each user could contribute a large search history. The work of [6] keeps the first $d$ queries of each user, while the work of [4] explicitly removes those users whose data change the optimal output by more than a certain threshold. Zeng et al. [25] studied frequent itemset mining with differential privacy and truncated each individual transaction to contain up to $l$ items. Recently, Kellaris and Papadopoulos [5] proposed to release non-overlapping count data by grouping similar columns, i.e. items in our definition. In their work, each user is allowed to contribute no more than one record to each column, thus the maximum individual contribution is bounded by the number of columns. However, the binary representation of user data may not truly convey information about each column, i.e. place of interest or product. For example, when the bit for a user and a location is set, we cannot distinguish whether it was an accidental check-in or the user went there many times due to personal preference.

Sampling Differential Privacy. There have been a few works which studied the relationship between sampling and differential privacy. Chaudhuri and Mishra [6] first showed that the combination of k-anonymity and random sampling can achieve differential privacy with high probability. Li et al. [7] proposed to sample each record with a calibrated probability β and then perform k-anonymity on the sampled data, to achieve (ɛ, δ)-indistinguishability. Both works adopt the random sampling technique which samples a data record with a certain probability. However, when applied in our setting, no guarantee is provided on bounding the individual data in the sampled database.

Our Competitors. After reviewing existing differentially private techniques, we identify three works that allow high individual contribution, release aggregate statistics, and satisfy ɛ-differential privacy. The first is a straightforward application of the Laplace perturbation mechanism [2] to each released count, denoted as LPA. The second is the Fourier transform based algorithm from [23], which can be adapted to share count vectors, denoted as DFT. The third is GS, which is the best method proposed in [5].

3. PRELIMINARIES

3.1 Problem Formulation

Suppose a database $D$ contains data records contributed by a set of $n$ users about $m$ items of interest. Each item could represent in reality a place or a product. Each record in dataset $D$ is a tuple (rid, uid, vid, attrA), where rid is the record ID, uid corresponds to the user who contributed this record, vid is the item which the record is about, and attrA represents contextual/additional information regarding this record. In reality, various information is often available in the actual database, such as transaction time, user ratings and reviews, and user demographic information. In our problem setting, attrA can be an attribute, e.g. dayOfWeek, or a set of attributes, e.g. (Gender, Age), which can be customized to offer deep insight in a specific application domain. Let $h$ denote the number of possible attrA values.

To be more concrete, we select two analytic tasks, i.e. top-K discovery and context-aware recommendation, to illustrate the usability of our solutions. The first task answers questions such as "What are the most popular places in city X?", while the second task aims more specifically at "Recommend good places to visit on a Tuesday!". Moreover, we consider the problem of performing the above tasks on released count data. As in Figure 2, each $V_i$ represents an item of interest, e.g. a restaurant, and each $A_j$ represents a value of the context, e.g. Monday. For each $V_i$, the number of records containing $V_i$ is released. For each edge connecting $V_i$ and $A_j$, the number of records containing $V_i$ and $A_j$ is released. As a result, top-K discovery can be performed on the item counts and context-aware recommendation on the edge counts connected to any context $A_j$. We formally state the problem to investigate below.

[Figure 2: Releasing Counts for Data Analytics]

DEFINITION 1 (ITEM COUNTS). For each item $V_i$ in database $D$, release

$c_i(D) =$ select count(*) from $D$ where vid $= V_i$   (1)

DEFINITION 2 (EDGE COUNTS). For each edge connecting item $V_i$ and attribute $A_j$, release

$c_{i,j}(D) =$ select count(*) from $D$ where vid $= V_i$ and attrA $= A_j$   (2)

PROBLEM 1 (PRIVATE DATA ANALYTICS).
Given database $D$ and privacy parameter ɛ, release a sanitized version of item counts and edge counts, such that the released data satisfies ɛ-differential privacy.

Note that the problem definition, i.e. the counting queries to release, can be customized according to the analytical task to perform. For instance, to understand the correlation between items, the bipartite graph in Figure 2 can be adapted as follows: $A$ nodes will be replaced by items, i.e. $V$ nodes; and each edge $(V_i, V_j)$ represents the number of times that $V_j$ is purchased/watched/visited by users who also purchase/watch/visit $V_i$. Similarly, those counts can be released privately with slight adaptation of our proposed solutions below.

3.2 Privacy Definition

The privacy guarantee provided by our work is differential privacy [4]. Simply put, a mechanism is differentially private if its outcome is not significantly affected by the removal or addition of any user. An adversary thus learns approximately the same information about any individual, irrespective of his/her presence or absence in the original database.

DEFINITION 3 (ɛ-DIFFERENTIAL PRIVACY). A non-interactive privacy mechanism $A : \mathcal{D} \to \mathcal{T}$ satisfies ɛ-differential privacy if for any neighboring databases $D_1$ and $D_2$, and for any set $\tilde{D} \in \mathcal{T}$,

$\Pr[A(D_1) = \tilde{D}] \le e^{\epsilon} \Pr[A(D_2) = \tilde{D}]$   (3)

where the probability is taken over the randomness of $A$.

The privacy parameter ɛ, also called the privacy budget [2], specifies the degree of privacy offered. Intuitively, a lower value of ɛ implies a stronger privacy guarantee and a larger perturbation noise, and a higher value of ɛ implies a weaker guarantee while possibly achieving higher accuracy. The neighboring databases $D_1$ and $D_2$ differ on at most one user.

Laplace Mechanism. Dwork et al. [2] show that ɛ-differential privacy can be achieved by adding i.i.d. noise to the query result $q(D)$:

$\tilde{q}(D) = q(D) + (\tilde{N}_1, \ldots, \tilde{N}_z)$   (4)

$\tilde{N}_i \sim \mathrm{Lap}(0, GS(q)/\epsilon)$ for $i = 1, \ldots, z$   (5)

where $z$ represents the dimension of $q(D)$. The magnitude of $\tilde{N}$ conforms to a Laplace distribution with mean 0 and scale $GS(q)/\epsilon$, where $GS(q)$ represents the global sensitivity [2] of the query $q$. The global sensitivity is the maximum $L_1$ distance between the results of $q$ from any two neighboring databases. Formally, it is defined as follows:

$GS(q) = \max_{D_1, D_2} \lVert q(D_1) - q(D_2) \rVert_1$   (6)

Sensitivity Analysis. Let $M$ denote the maximum number of records any user could contribute and $\mathcal{D}$ denote the domain of database $D$. Let $q_1 = \{c_1, \ldots, c_m\}$ output the item counts for every $V_i$. Let $q_2 = \{c_{1,1}, c_{1,2}, \ldots, c_{m,h}\}$ output the edge counts for every $V_i$ and $A_j$. The following lemmas establish the global sensitivity of $q_1$ and $q_2$, in order to protect the privacy of each individual user. The proofs are straightforward and thus omitted here for brevity.

LEMMA 1 (ITEM COUNTS SENSITIVITY). The global sensitivity of $q_1 : \mathcal{D} \to \mathbb{R}^m$ is $M$, i.e.

$GS(q_1) = M$   (7)

LEMMA 2 (EDGE COUNTS SENSITIVITY). The global sensitivity of $q_2 : \mathcal{D} \to \mathbb{R}^{mh}$ is $M$, i.e.

$GS(q_2) = M$   (8)

Composition. The composition properties of differential privacy provide privacy guarantees for a sequence of computations, which can be applied to mechanisms that require multiple steps.

THEOREM 1 (SEQUENTIAL COMPOSITION [2]). Let $A_i$ each provide $\epsilon_i$-differential privacy. A sequence of $A_i(D)$ over the dataset $D$ provides $(\sum_i \epsilon_i)$-differential privacy.

Table 1: Summary of notations

Symbol | Description
$D$ / $\mathcal{D}$ | Input database / Domain of all databases
$T_k$ | Set of records contributed by user $u_k$ in $D$
$D_R$ / $\mathcal{D}_R$ | SRA sampled database / Domain of $D_R$
$D_G$ / $\mathcal{D}_G$ | HPA sampled database / Domain of $D_G$
$D_E$ / $\mathcal{D}_E$ | HPA sampled database / Domain of $D_E$
$q_1$ / $\tilde{q}_1$ | Query of all item counts / Noisy output of $q_1$
$q_2$ / $\tilde{q}_2$ | Query of all edge counts / Noisy output of $q_2$
$p$ / $\tilde{p}$ | Popularity vector for all items / Estimation of $p$
$M$ | Max records per user allowed in $D$
$l$ | Max records per user allowed in $D_R$ and $D_G$
$d$ | Max records per user allowed in $D_E$

4. PROPOSED SOLUTIONS

Below we describe two sampling-based solutions to privacy-preserving data analytics.
The notations used in the problem definition and our proposed solutions are summarized in Table 1.

4.1 Simple Random Algorithm (SRA)

Our first solution has been inspired by the fact that the maximum number of records contributed by each user, i.e. $M$, could be rather large in real applications. For example, the Netflix user who contributed the most data submitted 7, reviews, as shown in Table 4. In fact, a user could contribute as many records as the domain size, i.e. $m$, as in the total number of movies on Netflix. As a result of the large magnitude of $M$, a very high perturbation noise is required to provide differential privacy, according to the Laplace mechanism. Furthermore, the number of records contributed by each user can be unbounded for many applications, as a user could repeatedly check in at the same location or purchase the same product. In that case, $M$ may not be known without breaching individual user privacy.

In order to mitigate the effect of very large or unbounded individual data contribution, we propose to sample the raw input dataset $D$ and allow up to $l$ records per user in the sampled database. Therefore, the individual contribution to the sampled database is bounded by the fixed constant $l$. The aggregate statistics will be generated from the sampled data and then perturbed correspondingly in order to guarantee differential privacy.

[Figure 3: Outline of SRA Algorithm]

Algorithm 1 Simple Random Algorithm (SRA)
Input: raw dataset $D$, sampling factor $l$, privacy budget ɛ
Output: sanitized answers $\tilde{q}_1$ and $\tilde{q}_2$
/* Simple Random Sampling */
 1: $D_R \leftarrow \emptyset$
 2: for $k = 1, \ldots, n$ do
 3:   $T_k \leftarrow \sigma_{uid=u_k}(D)$   /* $T_k$: records of user $u_k$ */
 4:   if $|T_k| \le l$ do
 5:     $D_R \leftarrow D_R \cup T_k$
 6:   else do
 7:     $\tilde{T}_k \leftarrow$ random sample $l$ records from $T_k$
 8:     $D_R \leftarrow D_R \cup \tilde{T}_k$
/* Generate Private Item Counts */
 9: $q_1(D_R) \leftarrow$ compute count $c_i(D_R)$ for every $i$
10: Output $\tilde{q}_1(D_R) = q_1(D_R) + \mathrm{Lap}(l/\epsilon_1)^m$
/* Generate Private Edge Counts */
11: $q_2(D_R) \leftarrow$ compute count $c_{i,j}(D_R)$ for every $i, j$
12: Output $\tilde{q}_2(D_R) = q_2(D_R) + \mathrm{Lap}(l/\epsilon_2)^{mh}$
The sampling technique used in our solution is simple random sampling without replacement, after which our solution is named. An outline of the SRA algorithm is provided in Figure 3. Given the input database $D$ and a pre-defined sampling factor $l$, the SRA method generates a sampled database $D_R$ by randomly sampling without replacement at most $l$ records for each user in the input database $D$. The sampled database $D_R$ could be different every time the algorithm is run, due to the randomness of sampling. However, it is guaranteed that for every possible sample $D_R$, any user could have no more than $l$ records. The following lemma establishes the sensitivity of $q_1$ and $q_2$ under such constraint.

LEMMA 3 (SAMPLE SENSITIVITY). In the domain of $D_R$, it holds that $GS(q_1) = l$ and $GS(q_2) = l$.

Subsequently, the SRA method computes the query answers to $q_1$ and $q_2$ from the sampled database $D_R$, where all individual count queries $c_i$ and $c_{i,j}$ are evaluated based on the data records in $D_R$. According to the Laplace mechanism, it is sufficient to add perturbation noise from $\mathrm{Lap}(l/\epsilon_1)$ to each item count $c_i(D_R)$ to guarantee $\epsilon_1$-differential privacy. Similarly, adding perturbation noise from $\mathrm{Lap}(l/\epsilon_2)$ to each edge count $c_{i,j}(D_R)$ guarantees $\epsilon_2$-differential privacy. The pseudocode of the SRA method is provided in Algorithm 1.
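The steps of the simple random algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: records are assumed to be `(uid, vid, attr)` tuples, the function names `laplace_noise`, `sample_per_user` and `sra` are hypothetical, and Laplace noise is drawn as the difference of two exponential variates using only the standard library.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) draws follows Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def sample_per_user(records, l, rng):
    """Keep at most l records per user, drawn without replacement."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec[0]].append(rec)  # rec = (uid, vid, attr), assumed layout
    sampled = []
    for recs in by_user.values():
        sampled.extend(recs if len(recs) <= l else rng.sample(recs, l))
    return sampled


def sra(records, items, contexts, l, eps1, eps2, seed=None):
    """Sketch of the sampling-then-perturbation steps described above."""
    rng = random.Random(seed)
    d_r = sample_per_user(records, l, rng)

    # Exact counts on the sampled database only.
    item_counts = Counter(vid for _, vid, _ in d_r)
    edge_counts = Counter((vid, attr) for _, vid, attr in d_r)

    # Each user holds at most l sampled records, so the noise scale is
    # l / eps for every released count (including zero counts).
    noisy_items = {v: item_counts[v] + laplace_noise(l / eps1, rng)
                   for v in items}
    noisy_edges = {(v, a): edge_counts[(v, a)] + laplace_noise(l / eps2, rng)
                   for v in items for a in contexts}
    return d_r, noisy_items, noisy_edges
```

Note that noise is added to every item and every edge in the declared domain, not only to those observed in the sample; releasing only the non-zero counts would reveal which items appear in the data.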

To sum up, SRA injects low Laplace noise into released query results, due to reduced sensitivity in the sampled database. However, the accuracy of released query results is affected by only using $D_R$, a subset of the input data $D$. Intuitively, the more we sample from each user, the closer $q_1(D_R)$ and $q_2(D_R)$ are to the true results $q_1(D)$ and $q_2(D)$, respectively, at the cost of a higher Laplace perturbation error to achieve differential privacy. Below we formally analyze the trade-off between accuracy and privacy for query $q_1$ to study the optimal choice of $l$. Similar analysis can be conducted for query $q_2$ and is thus omitted here for brevity.

DEFINITION 4 (MEAN SQUARED ERROR). Let $\tilde{c}_i$ denote the noisy count released by $\tilde{q}_1(D_R)$ and $c_i$ denote the real count computed by $q_1(D)$, for each item $V_i$. The Mean Squared Error of the noisy count $\tilde{c}_i$ is defined as follows:

$MSE(\tilde{c}_i) = Var(\tilde{c}_i) + (Bias(\tilde{c}_i, c_i))^2$   (9)

THEOREM 2. Given $D_R$ is a simple random sample of $D$ and $\tilde{q}_1(D_R) = q_1(D_R) + \mathrm{Lap}(l/\epsilon_1)^m$, the value of $l$ that minimizes MSE is a monotonically increasing function of $\epsilon_1^2$.

PROOF. See Appendix A.

The above theorem provides a guideline to choose the $l$ value given the privacy budget $\epsilon_1$: when the privacy budget is higher, we can afford to use more private data to overcome the error due to data loss; when the privacy budget is limited, a small number of data records should be taken from each user to reduce the perturbation error introduced by the differential privacy mechanism.

4.2 Hand-Picked Algorithm (HPA)

Observing that a majority of data analytical tasks depend on popular places or products, such as in traffic analysis and recommendation services, data related to popular items should preferably be preserved in the sample database. In other words, some records generated by one user might be more useful for data analytics than the rest.

Table 2: Example Check-in Records

rid | uid | vid | Day-Of-Week
$r_1$ | Alice | Gym | Monday
$r_2$ | Alice | Mary's house | Tuesday
$r_3$ | Alice | de Young Museum | Friday
$r_4$ | Alice | Golden Gate Bridge | Saturday
The following example illustrates the concept of record usefulness.

EXAMPLE 1. Table 2 illustrates Alice's check-in records in the raw database. Among the 4 places Alice has been, de Young Museum and the Golden Gate Bridge are places of interest and attract a large number of visitors. On the other hand, the gym and Mary's house are local and personal to Alice and may not interest other people. Therefore we consider $r_3$ and $r_4$ more useful than $r_1$ and $r_2$ for data analytics. However, $r_1$ and $r_2$ may be chosen by SRA over $r_3$ and $r_4$, due to the simple random sampling procedure.

From Example 1, it can be seen that $r_3$ and $r_4$ should be picked by the sampling procedure over $r_1$ and $r_2$, in order to generate meaningful recommendation results. Therefore, we define the following popularity-based utility score for each private data record and propose to preserve the records with the highest scores for each user.

DEFINITION 5 (UTILITY SCORE). Given record $r$ and $r.vid = V_i$, the utility score of $r$ is defined as follows:

$score(r) = p_i$   (10)

where $p_i$ represents the underlying popularity of item $V_i$.

[Figure 4: Outline of HPA Algorithm]

Table 3: Example Check-in Records Sorted by Utility Score

rid | uid | vid | Day-Of-Week | Utility
$r_4$ | Alice | Golden Gate Bridge | Saturday | .2
$r_3$ | Alice | de Young Museum | Friday | .
$r_1$ | Alice | Gym | Monday | .
$r_2$ | Alice | Mary's house | Tuesday | .

Note that the record utility can be defined in other ways according to the target analytical questions. Our choice of the popularity-based measure is motivated by the tasks of discovering popular places or products, as well as the fact that popular items are less personal/sensitive to individual users. In order to maximize the utility of the sampled database, we propose to greedily pick up to $l$ records with the highest utility scores for each user. Note that a user's records with the same score will have equal chance to be picked. The outline of HPA is provided in Figure 4. Below we describe (1) private estimation of record utility and (2) the greedy sampling procedure.

Popularity Estimation.
For each item $V_i$, the popularity $p_i$ represents the probability of any record $r$ having $r.vid = V_i$, which is often estimated by the relative frequency of such records. However, the estimation of the $p_i$'s from the private user data must not violate the privacy guarantee. We present our privacy-preserving utility estimation in Algorithm 2 from Line 1 to Line 7, which is outlined in the upper half of Figure 4. The utility estimation is also conducted on a sampled database $D_E$ with sampling factor $d$. $D_E$ is obtained by randomly choosing up to $d$ records per user from the raw database $D$. We adopt random sampling here because we do not have prior knowledge about the database at this point. The query $q_1$ is computed based on $D_E$ and each count is perturbed with Laplace noise from $\mathrm{Lap}(d/\epsilon_0)$. The perturbed counts $\{\tilde{c}_i(D_E)\}$ are used to estimate the popularity for each item $V_i$ by the following normalization:

$\tilde{p}_i = \dfrac{\max(\tilde{c}_i(D_E), 0)}{\sum_{i=1}^{m} \max(\tilde{c}_i(D_E), 0)}$   (11)

Since the Laplace perturbation noise is a random variable and therefore could be negative, we replace the negative counts with 0s in computing item popularity. The resulting $\tilde{p}_i$ is used to estimate the utility score of each record $r$ with $r.vid = V_i$. The following lemma establishes the sensitivity of $q_1$ where each user can contribute up to $d$ records. The proof is straightforward and is thus omitted.

LEMMA 4. In the domain of $D_E$, it holds that $GS(q_1) = d$.

Greedy Sampling. The greedy sampling procedure hand-picks up to $l$ records with the highest utility scores among each user's data in $D$. The pseudocode is provided in Algorithm 2 from Line 8 to Line 17. Table 3 illustrates Alice's records sorted by utility score. Since "Gym" and "Mary's house" do not interest the greater public, their scores are likely to be much lower than those of "Golden Gate Bridge" and "de Young Museum". Then the top $l$ records on the sorted list will be put in the sampled database $D_G$. This step is performed on every user's data in the raw database $D$.

LEMMA 5. In the domain of $D_G$, it holds that $GS(q_1) = l$ and $GS(q_2) = l$.

After the greedy sampling step, the results to $q_1$ and $q_2$ will be computed on the sampled database $D_G$. Each individual item count and edge count will be perturbed by Laplace noise from $\mathrm{Lap}(l/\epsilon_1)$ and $\mathrm{Lap}(l/\epsilon_2)$, respectively. We will provide proof of the privacy guarantee in the next section.

Algorithm 2 Hand-Picked Algorithm (HPA)
Input: raw dataset $D$, sampling factor $l$
Output: sanitized answers $\tilde{q}_1$ and $\tilde{q}_2$
/* Popularity Estimation */
/* Random Sample */
 1: $D_E \leftarrow \emptyset$
 2: for $k = 1, \ldots, n$ do
 3:   $T_k \leftarrow \sigma_{uid=u_k}(D)$   /* $T_k$: records of user $u_k$ */
 4:   random sample $d$ records from $T_k$, add to $D_E$
/* Generate Private Item Counts */
 5: $q_1(D_E) \leftarrow$ compute count $c_i(D_E)$ for every $i$
 6: $\tilde{q}_1(D_E) = q_1(D_E) + \mathrm{Lap}(d/\epsilon_0)^m$
/* Estimate Popularity */
 7: $\tilde{p} \leftarrow$ normalize histogram $\tilde{q}_1(D_E)$
/* Greedy Sampling */
 8: $D_G \leftarrow \emptyset$
 9: for $k = 1, \ldots, n$ do
10:   $T_k \leftarrow \sigma_{uid=u_k}(D)$
11:   if $|T_k| \le l$ do
12:     $D_G \leftarrow D_G \cup T_k$
13:   else do
14:     for record $r$ in $T_k$ do
15:       assign $score(r) = \tilde{p}_i$ iff $r.vid = V_i$
16:     $\tilde{T}_k \leftarrow$ pick $l$ records with highest scores from $T_k$
17:     $D_G \leftarrow D_G \cup \tilde{T}_k$
/* Generate Private Item Counts */
18: $q_1(D_G) \leftarrow$ compute count $c_i(D_G)$ for every $i$
19: Output $\tilde{q}_1(D_G) = q_1(D_G) + \mathrm{Lap}(l/\epsilon_1)^m$
/* Generate Private Edge Counts */
20: $q_2(D_G) \leftarrow$ compute count $c_{i,j}(D_G)$ for every $i, j$
21: Output $\tilde{q}_2(D_G) = q_2(D_G) + \mathrm{Lap}(l/\epsilon_2)^{mh}$

The advantage of HPA is that it greedily picks the most valuable data records from each user, without increasing the sample data size, i.e. $l$ records per user. The utility of each data record is estimated privately from the overall data distribution.
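The two HPA stages, private popularity estimation followed by greedy per-user selection, can be sketched in Python as below. This is an illustrative sketch under assumptions rather than the authors' code: records are assumed to be `(uid, vid, attr)` tuples, all function names are hypothetical, ties are broken by shuffling before a stable sort so that equally scored records have equal chance, and the uniform fallback when every clipped count is zero is an assumption not discussed in the text.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # Difference of two i.i.d. Exp(1) draws is Laplace(0, 1); rescale.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def group_by_user(records):
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec[0]].append(rec)  # rec = (uid, vid, attr), assumed layout
    return by_user


def estimate_popularity(records, items, d, eps0, rng):
    """Noisy, clipped, normalized item histogram on a d-per-user sample."""
    d_e = []
    for recs in group_by_user(records).values():
        d_e.extend(recs if len(recs) <= d else rng.sample(recs, d))
    counts = Counter(vid for _, vid, _ in d_e)
    clipped = {v: max(counts[v] + laplace_noise(d / eps0, rng), 0.0)
               for v in items}
    total = sum(clipped.values())
    if total == 0:  # assumed fallback: uniform popularity
        return {v: 1.0 / len(items) for v in items}
    return {v: c / total for v, c in clipped.items()}


def greedy_sample(records, popularity, l, rng):
    """Keep each user's l records with the highest estimated popularity."""
    d_g = []
    for recs in group_by_user(records).values():
        if len(recs) <= l:
            d_g.extend(recs)
            continue
        shuffled = recs[:]
        rng.shuffle(shuffled)  # random order among equally scored records
        shuffled.sort(key=lambda r: popularity.get(r[1], 0.0), reverse=True)
        d_g.extend(shuffled[:l])
    return d_g


def hpa(records, items, contexts, l, d, eps0, eps1, eps2, seed=None):
    rng = random.Random(seed)
    popularity = estimate_popularity(records, items, d, eps0, rng)
    d_g = greedy_sample(records, popularity, l, rng)
    item_counts = Counter(vid for _, vid, _ in d_g)
    edge_counts = Counter((vid, attr) for _, vid, attr in d_g)
    noisy_items = {v: item_counts[v] + laplace_noise(l / eps1, rng)
                   for v in items}
    noisy_edges = {(v, a): edge_counts[(v, a)] + laplace_noise(l / eps2, rng)
                   for v in items for a in contexts}
    return d_g, popularity, noisy_items, noisy_edges
```

The greedy step only ever reads the noisy popularity estimate, never the exact counts, which is what lets the privacy cost of the selection be accounted for by the estimation budget.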
Records with high utility have a higher chance to be picked by greedy sampling. Since the sampled data greatly depends on the relative usefulness among each user's records, it is difficult to analyze the accuracy of released counts. We will empirically evaluate the effectiveness of this approach in Section 6.

Table 4: Data Sets Statistics

 | Gowalla | Foursquare | Netflix | MovieLens
Users | 2,579 | 45,289 | 48,89 | 6,4
Items | 5,2 | 7,967 | 7,7 | 3,76
$|D|$ | 739,6,276,988,48,57,,29
max $|T_k|$ | 4,38,33 7, 2,34
avg $|T_k|$ |
min $|T_k|$ | 1 | 1 | 1 | 20

5. PRIVACY GUARANTEE

In this section, we prove that both the SRA and HPA algorithms are differentially private. We begin with the following lemma, which states that record sampling on each user does not inflict a differential privacy breach.

LEMMA 6. Let $A$ be an ɛ-differentially private algorithm and $S$ be a record sampling procedure which is performed on each user individually. $A \circ S$ is also ɛ-differentially private.

PROOF. See Appendix B.

THEOREM 3. SRA satisfies $(\epsilon_1 + \epsilon_2)$-differential privacy.

PROOF. Let $S_{rand,l}$ denote the random sampling procedure in SRA. $S_{rand,l}$ is therefore a function that takes a raw database and outputs a sampled database, i.e. $S_{rand,l} : \mathcal{D} \to \mathcal{D}_R$. According to the Laplace mechanism and Lemma 3, $\tilde{q}_1 : \mathcal{D}_R \to \mathbb{R}^m$ is $\epsilon_1$-differentially private. By the above Lemma 6, the item counts by SRA, i.e. $\tilde{q}_1 \circ S_{rand,l} : \mathcal{D} \to \mathbb{R}^m$, is $\epsilon_1$-differentially private. Similarly, $\tilde{q}_2 : \mathcal{D}_R \to \mathbb{R}^{mh}$ is $\epsilon_2$-differentially private. The edge counts by SRA, i.e. $\tilde{q}_2 \circ S_{rand,l} : \mathcal{D} \to \mathbb{R}^{mh}$, is also $\epsilon_2$-differentially private. Therefore, the overall SRA computation satisfies $(\epsilon_1 + \epsilon_2)$-differential privacy by Theorem 1.

THEOREM 4. HPA satisfies $(\epsilon_0 + \epsilon_1 + \epsilon_2)$-differential privacy.

PROOF. Let $S_{rand,d}$ denote the random sampling procedure in HPA for popularity estimation, i.e. $S_{rand,d} : \mathcal{D} \to \mathcal{D}_E$. Let $S_{grd,l}$ denote the greedy sampling procedure, i.e. $S_{grd,l} : \mathcal{D} \to \mathcal{D}_G$. According to the Laplace mechanism and Lemma 4, $\tilde{q}_1 : \mathcal{D}_E \to \mathbb{R}^m$ is $\epsilon_0$-differentially private. By Lemma 6, the HPA popularity estimation step, i.e. $\tilde{q}_1 \circ S_{rand,d} : \mathcal{D} \to \mathbb{R}^m$, is $\epsilon_0$-differentially private.
Similarly, we can prove that the HPA item counts $\tilde{q}_1 \circ S_{grd,\ell}: \mathcal{D} \to \mathbb{R}^m$ are $\epsilon_1$-differentially private, and the HPA edge counts $\tilde{q}_2 \circ S_{grd,\ell}: \mathcal{D} \to \mathbb{R}^{mh}$ are $\epsilon_2$-differentially private. Therefore, by Theorem 1, the overall HPA satisfies $(\epsilon_0 + \epsilon_1 + \epsilon_2)$-differential privacy.

6. EXPERIMENTS

Here we present a set of empirical studies. We compare our solutions, the random sampling method and HPA, with three existing approaches: 1) LPA, the baseline method that injects Laplace perturbation noise into each count, 2) DFT, the Discrete Fourier Transform based algorithm proposed in [23], applied to a vector of counts, and 3) GS, the best method with grouping and smoothing proposed in [15], applied to count histograms. Given the overall privacy budget $\epsilon$, we set $\epsilon_1 = \epsilon_2 = \epsilon/2$ for the random sampling method, and $\epsilon_0 = 0.1\epsilon$ and $\epsilon_1 = \epsilon_2 = 0.45\epsilon$ for the HPA method. Without speculating about the optimal privacy allocation, we set $\epsilon_0$ to a small fraction of $\epsilon$, because it is used to protect a small sample of private data for utility score estimation. To achieve the same privacy guarantee, we apply LPA, DFT, and GS to item counts and edge counts separately, with privacy budget $\epsilon/2$ for each application.

Data sets. We conducted our empirical studies with four real-world data sets referred to as Gowalla, Foursquare, Netflix, and MovieLens, each named after its data source. The first two data sets consist of location check-in records. Gowalla was collected among users based in Austin from the Gowalla location-based social network by

Berjani and Strufe [3] between June and October 2. Similarly, Foursquare was collected from Foursquare by Long et al. [18] between February and July 2012. In these two data sets, each record contains a user, a location, and a check-in time-stamp. Since a user can check in at one location many times, the check-in data sets can represent a class of services which value returning behavior, such as buying or browsing. The other two data sets consist of movie ratings, where a movie may not be rated more than once by a user. Netflix is the training data set for the Netflix Prize competition. MovieLens was collected from users of the MovieLens website. Each rating corresponds to a user, a movie, a rating score, and a time-stamp. Moreover, MovieLens also provides user demographic information, such as gender, age, occupation, and zipcode. The properties of the data sets are summarized in Table 4. Note that the minimum individual contribution in MovieLens is 2, which is higher than for the other data sets. This is because MovieLens was initially collected for personalized recommendation, thus users with fewer than 2 records were excluded from the published data set.

Setup. We implemented our random sampling and HPA methods, as well as the baseline LPA and DFT, in Java. We obtained the Java code of GS from the authors of [15]. All experiments were run on a 2.9GHz Intel Core i7 PC with 8GB RAM. Each setting was run 2 times and the average result was reported. The default parameter settings are summarized below: the overall privacy budget $\epsilon$ = , the sampling parameter for HPA popularity estimation $d = \min |T_k|$, and the sampling parameter $\ell$ = for Gowalla, Foursquare, and Netflix and $\ell$ = 3 for MovieLens. Our choice of parameter settings is guided by analytical results and minimal knowledge about the data sets and thus might not be optimal. For LPA and DFT, we set $M$ to be equal to $\max |T_k|$. However, this value may not be known a priori. Strictly speaking, $M$ is unbounded for check-in applications. In this sense, we overestimate the performance of LPA and DFT.

6.1 HPA-Private Popularity Estimation

We first examine the private popularity estimation step of the HPA method regarding its ability to discover the top-K popular items from the noisy counts $\tilde{q}_1(D_E)$. Recall that $D_E$ is generated by randomly sampling $d$ records per user, and the output of $q_1(D_E)$ is then perturbed with noise from $Lap(d/\epsilon_0)$ to guarantee privacy. Given a small privacy budget $\epsilon_0$, it is only meaningful to choose a small $d$ value for accuracy, according to Theorem 2. Therefore, we set $d$ equal to the minimum individual contribution, i.e. $\min |T_k|$, in every data set. In this experiment, we sort all items according to the $\tilde{q}_1(D_E)$ output, and the K items with the highest noisy counts are evaluated against the ground truth discovered from the raw data set. Figure 5 reports the precision results with various K values on Foursquare and Netflix data. As can be seen, from the output of $\tilde{q}_1(D_E)$, we are able to discover more than 6% of top-2 popular locations in Foursquare and 7% of top-2 popular movies on Netflix. When looking at K = , the output of $\tilde{q}_1(D_E)$ captures 4% of the real popular locations and almost 8% of the popular movies. We conclude that HPA popularity estimation provides a solid stepping stone for subsequent greedy sampling, at a very small cost of individual data as well as privacy.

Figure 5: Estimation of Item Popularity by HPA. Panels: (a) Foursquare, (b) Netflix.

Figure 6: Impact of $\ell$ with Foursquare Data Set. Panels: (a) MSE of $\tilde{q}_1$, (b) MSE of $\tilde{q}_2$.

6.2 Impact of Sampling Factor $\ell$

Here we look at $\ell$, the upper bound of individual data contribution required by our solutions, and study its impact on the accuracy of the $\tilde{q}_1$ and $\tilde{q}_2$ output. Mean Squared Error (MSE) is adopted as the metric for accuracy and is calculated between the noisy output of our methods and the true results of $q_1$ and $q_2$ from the raw input data $D$. We ran our random sampling and HPA methods while varying the value of $\ell$, in order to generate sampled databases $D_R$ and $D_G$ of different sizes.
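The experiment described above can be sketched for the random sampling method as follows. This is an illustrative sketch only, assuming records are `(uid, vid)` pairs with integer item ids in `[0, m)`; the helper names are hypothetical and the edge-count query is left out.

```python
import numpy as np
from collections import defaultdict


def random_sample(records, ell, rng):
    """Keep at most ell uniformly chosen records per user (the sample D_R)."""
    by_user = defaultdict(list)
    for uid, vid in records:
        by_user[uid].append(vid)
    sample = []
    for uid, vids in by_user.items():
        if len(vids) <= ell:
            chosen = vids
        else:
            idx = rng.choice(len(vids), size=ell, replace=False)
            chosen = [vids[i] for i in idx]
        sample.extend((uid, v) for v in chosen)
    return sample


def item_counts(records, m):
    """Count how many records refer to each of the m items."""
    counts = np.zeros(m)
    for _, vid in records:
        counts[vid] += 1
    return counts


def mse(released, truth):
    """Mean squared error between released counts and true counts."""
    released = np.asarray(released, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean((released - truth) ** 2))


def mse_for_ell(records, m, ell, eps1, rng):
    """MSE of noisy item counts on D_R against true counts on the raw data."""
    sample = random_sample(records, ell, rng)
    noisy = item_counts(sample, m) + rng.laplace(0.0, ell / eps1, size=m)
    return mse(noisy, item_counts(records, m))
```

Calling `mse_for_ell` over a range of `ell` values and averaging several runs per value gives a curve of the kind the section discusses: sampling loss dominates for small `ell`, while the noise scale `ell / eps1` dominates for large `ell`.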
Figure 6(a) summarizes the results from Foursquare data for item counts, i.e. $\tilde{q}_1$, and Figure 6(b) for edge counts, i.e. $\tilde{q}_2$. In both panels of Figure 6, as the value of $\ell$ increases, the MSE of the noisy output of our methods first drops as the sampled database gets larger. For example, we observe a decreasing trend of MSE as $\ell$ is raised to 3 in Figure 6(a) and as $\ell$ is raised to 5 in Figure 6(b). Beyond these two points, when $\ell$ is increased further, the MSE grows due to the perturbation noise from $Lap(\ell/\epsilon)$. Clearly, there is a trade-off between sample data size and perturbation error. The optimal value of $\ell$ depends on the actual data distribution and the privacy parameter $\epsilon$, according to Theorem 2. This set of results shows that both the random sampling method and HPA achieve minimum MSE with relatively small $\ell$ values, i.e. $\ell$ = 3 for $\tilde{q}_1$ and $\ell$ = 5 for $\tilde{q}_2$. Our findings in Theorem 2 are confirmed, and we conclude that choosing a small upper bound on individual data contribution is beneficial, especially when the privacy budget is limited.

6.3 Comparison of Methods

Here we compare our random sampling and HPA methods with existing approaches, i.e. LPA, DFT, and GS, on all data sets. The utility of item counts and edge counts released by all private mechanisms is evaluated with three metrics. Note that for Gowalla, Foursquare, and Netflix data, each edge connects an item with a day-of-week, from Monday to Sunday. For the MovieLens data set, each edge connects a movie with a (Gender, Age) pair. The domain of Gender is {M, F} and the domain of Age is {Under 25, 25-34, Above 34}. Below we review the results regarding the released item counts and edge counts, for each utility metric.

Mean Squared Error (MSE). This metric provides a generic utility comparison of different methods on the released counts. Figure 7(a) and Figure 8(a) summarize the MSE results for item counts and edge counts, respectively. As can be seen, the baseline LPA yields the highest error in both item counts and edge counts. The

GS method, as studied in the original work [15], is no worse than DFT in every case except for MovieLens item counts. Our random sampling and HPA methods provide the lowest MSE except in three cases, i.e. Netflix item counts and MovieLens item/edge counts. This can be explained by the high average user contribution in these two data sets, where our methods inflict more data loss by limiting individual data in the sampled database.

Figure 7: Utility of Released Item Counts. Panels: (a) Mean Squared Error, (b) KL-Divergence, (c) Top-K.

Figure 8: Utility of Released Edge Counts. Panels: (a) Mean Squared Error, (b) KL-Divergence, (c) Average Top-K.

KL-divergence. The KL-divergence is a common metric widely used to measure the distance between two probability distributions. In this set of experiments, we consider the item/edge counts as data record distributions over the domain of items/edges. Both the released counts and the original counts are normalized to simulate probability distributions. Note that prior to that, zero or negative counts are replaced with a small positive value for continuity without generating many false positives. We compute the KL-divergence of the released distribution with respect to the original distribution for each query and present the results in Figure 7(b) and Figure 8(b). The distributions released by LPA are further from the original data distributions than those of other methods for every data set. As expected, DFT and GS preserve the count distributions well in general, because: 1) the DFT method is designed to capture major trends in data series, and 2) the GS method generates smooth distributions by grouping similar columns. However, in several cases, those two methods fail to provide similar distributions, e.g. on Gowalla and Netflix data. We believe that their performance depends on the actual data distribution, i.e.
whether a significant trend or near-uniform grouping exists and can be well extracted. On the other hand, our random sampling and HPA solutions provide comparable performance to the best existing methods, although they are not optimized to preserve distributional similarities. Furthermore, the random sampling method consistently outperforms HPA in approximating the true distributions, thanks to the nature of the simple random sampling technique.

Top-K Discovery. In this set of experiments, we examine the quality of top-K discovery retrieved by all privacy-preserving mechanisms. For item counts, the top-K popular items are evaluated. For edge counts, the top-K popular items associated with each attribute value are evaluated and the average precision is reported, to simulate discoveries for each day-of-week and each user demographic group. In Figure 7(c), we observe that existing methods fail to preserve the most popular items in any data set. The reason is that the baseline LPA suffers from high perturbation error, and DFT and GS yield over-smoothed released counts and thus cannot distinguish the most popular items from those ranked next to them. In a subsequent experiment, we will see that their performance in top-K discovery slowly recovers when K is large enough. On the other hand, our random sampling and HPA methods greatly outperform existing approaches, and HPA even achieves 100% precision for Netflix data. Similarly, our methods show superior performance in Figure 8(c), with the absolute precision slightly lower due to sparser data distributions. Overall, HPA outperforms the random sampling method by preserving user records with high popularity scores. The only exception, where the random sampling method is better than HPA, is in finding the top-K most popular movies on MovieLens. The reason is that users who contributed fewer than 2 records were excluded from the data set and no movies were preferred by the majority of the remaining users. As for finding the top-K movies for each demographic group, HPA greatly improves over the random sampling method, since users within a demographic group show similar interests.
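The KL-divergence and top-K metrics used in this comparison can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code used in the paper: the replacement value for non-positive counts (`floor`) is a placeholder of our choosing, the divergence is taken as the sum of `p * log(p / q)` with `p` the original and `q` the released distribution, and ties in the ranking are broken by item index.

```python
import numpy as np


def to_distribution(counts, floor=1e-3):
    """Replace non-positive counts with a small value, then normalize."""
    counts = np.asarray(counts, dtype=float)
    counts = np.where(counts > 0, counts, floor)
    return counts / counts.sum()


def kl_divergence(p, q):
    """Sum of p * log(p / q) for two strictly positive distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def top_k(counts, k):
    """Indices of the k largest counts (ties broken by lower index)."""
    order = np.argsort(-np.asarray(counts, dtype=float), kind="stable")
    return set(order[:k].tolist())


def top_k_precision(released, truth, k):
    """Fraction of the released top-k items that are in the true top-k."""
    return len(top_k(released, k) & top_k(truth, k)) / k
```

For example, `kl_divergence(to_distribution(original), to_distribution(released))` compares two count vectors after the same cleaning step, and `top_k_precision(released, original, k)` can be averaged over attribute values to obtain the edge-count variant.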
We further look at the top-K precision of the item counts released by all methods over a range of K values. The results are provided in Figure 9. We can see that the precision of our greedy approach HPA is highest for the smallest K and drops as K increases, since the sampling step only picks a small number of records, i.e. $\ell$ records, from each user with the highest utility score, i.e. item popularity. Our random approach also shows decreasing precision as K increases, due to the data loss caused by sampling. However, the rate of decrease is much slower than that of HPA, because all records of a user have an equal chance of being selected by random sampling. On the contrary, LPA, DFT, and GS show their lowest precision at the smallest K and higher precision as K increases. We conclude that our random sampling and HPA methods can discover the most popular items, and are superior to existing approaches for small K, but do not distinguish less popular items due to the lack of information in the sampled database.

The existing approaches fail to distinguish the most popular items because of perturbation or the smoothing effect of their methods, but might provide good precision for large K.

Figure 9: Comparison of Methods: Top-K Mining. Panels: (a) Gowalla, (b) Foursquare, (c) Netflix, (d) MovieLens.

6.4 Additional Benefits

Data Reduction. One beneficial side effect of limiting individual data contribution is the reduction of data storage space by generating analytics from a sampled database. Figure 10 shows the number of records in the sampled databases used by the random sampling method and HPA compared to that of the raw input. As can be seen, the sampled data is much smaller than the raw input for every data set. For the Netflix data set, our methods perform privacy-preserving analytics and generate useful results on sampled databases with less than 5% of the original data, reducing the data storage requirement without compromising the utility of the output analytics.

Weekly Distribution. We also examine the sampled databases generated by the random sampling method and HPA in terms of the weekly distribution of data records. The percentage of Foursquare check-in records on each day of the week is plotted in Figure 11. As is shown, the percentage of Friday, Saturday, and Sunday check-ins is higher in the sampled databases generated by our methods than in the original data set, while the percentage of Monday-Thursday check-ins is lower than in the original. Since the majority of the users are occasional users and contribute fewer than $\ell$ records, our methods preserve their data completely in the sampled databases. We may infer that occasional users are more likely to use the check-in service on Friday-Sunday. Moreover, the data sampled by the random sampling method is consistently closer to the original data distribution than the data sampled by HPA. We can further infer that users are more likely to check in at popular places on Friday-Sunday.

Movie Recommendation.
An example of context-aware, fine-grained recommendation is to suggest items based on the common interest demonstrated among a user group with similar demographics, such as age and gender. We illustrate the top-10 movie recommendation to male users under the age of 25 with the edge counts released by our solutions on the MovieLens data set. The first column in Table 5 shows the top-10 recommended movies using the original data, while the second and third columns list movies recommended by our privacy-preserving solutions. We observe that some movies recommended by the random sampling method may not interest the target audience, such as "Marvin's Room" and "The Story of Xinghua". Furthermore, the top movie on the random sampling list, i.e. "Phantasm II", is a horror movie and not suitable for an underage audience. On the other hand, the movies recommended by HPA are quite consistent with the original top-10 except for two movies, i.e. "Men in Black" and "The Fugitive", which may interest the target audience as well. We believe that HPA captures more information by greedy sampling and thus can make better recommendations than the random sampling method, especially when users have very diverse interests.

Figure 10: Data Reduction. Number of records, sampled data vs. raw data.

Figure 11: Weekly Distribution with Foursquare Data. Percentage of records per weekday (SUN to SAT) for the original data and the sampled databases.

Table 5: Movie Recommendations to Male, Under 25.
Top Movies | Random Sampling Output | HPA Output
American Beauty | Phantasm II | American Beauty
Star Wars VI | Marvin's Room | Star Wars VI
Star Wars V | All Dogs Go to Heaven | Terminator 2
The Matrix | In the Line of Duty 2 | Star Wars V
Star Wars IV | Star Wars V | Jurassic Park
Terminator 2 | The Slumber Party Massacre III | The Matrix
Saving Private Ryan | The Story of Xinghua | Men in Black
Jurassic Park | American Beauty | The Fugitive
Star Wars I | Shaft | Braveheart
Braveheart | Star Wars I | Saving Private Ryan

7. CONCLUSION AND DISCUSSION

We have proposed a practical framework for privacy-preserving data analytics by sampling a fixed number of records from each user.
We have presented two solutions, i.e. the random sampling method and HPA, which implement the framework with different sampling techniques. Our solutions do not require the input data to be preprocessed, such as by removing users with large or little data. The output analysis results are highly accurate for performing top-K discovery and context-aware recommendations, closing the utility gap between no privacy and existing differentially private techniques. Our solutions benefit from sampling techniques that reduce the individual data contribution to a small constant factor, $\ell$, and thus reduce the perturbation error inflicted by differential privacy. We provided analysis results about the optimal sampling factor with respect to the privacy requirement. We formally proved that both mechanisms satisfy $\epsilon$-differential privacy. Empirical studies with real-world data sets confirm that our solutions enable accurate data analytics on a

small fraction of the input data, reducing user privacy cost and data storage requirement without compromising utility. Potential future work may include the design of a hybrid approach between the random sampling method and HPA which could have the benefits of both. For real-time applications, we would like to consider how to dynamically sample user generated data, in order to further reduce the data storage requirement. Another direction is to apply the proposed sampling framework to solving more complex data analytical tasks, which might involve multiple, overlapping count queries or other statistical queries.

8. ACKNOWLEDGMENTS

We thank the anonymous reviewers for the detailed and helpful comments on the manuscript.

9. REFERENCES

[1] M. Barbaro and T. Zeller. A face is exposed for AOL searcher no. The New York Times, Aug. 26.
[2] J. Bennett and S. Lanning. The Netflix prize. In Proceedings of KDD Cup and Workshop, volume 27, page 35, 27.
[3] B. Berjani and T. Strufe. A recommendation system for spots in location-based online social networks. In Proceedings of the 4th Workshop on Social Network Systems, SNS, pages 4: 4:6, New York, NY, USA, 2. ACM.
[4] A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In Proceedings of the 4th annual ACM symposium on Theory of computing, pages 69 68, New York, 28. ACM.
[5] T.-H. H. Chan, E. Shi, and D. Song. Private and continual release of statistics. ACM Trans. Inf. Syst. Secur., 4(3):26: 26:24, Nov. 2.
[6] K. Chaudhuri and N. Mishra. When random sampling preserves privacy. In Proceedings of the 26th annual international conference on Advances in Cryptology, CRYPTO 6, pages 98 23, Berlin, Heidelberg, 26. Springer-Verlag.
[7] R. Chen, G. Acs, and C. Castelluccia. Differentially private sequential data publication via variable-length n-grams. In Proceedings of the 22 ACM conference on Computer and communications security, CCS 2, pages , 22.
[8] G. Cormode, C. Procopiuc, D. Srivastava, and T. T. L. Tran. Differentially private summaries for sparse data. In Proceedings of the 5th International Conference on Database Theory, ICDT 2, pages 299 3, New York, NY, USA, 22. ACM.
[9] Y.-A. de Montjoye, C. A. Hidalgo, M. Verleysen, and V. D. Blondel. Unique in the Crowd: The privacy bounds of human mobility. Scientific Reports, Mar.
[10] C. Dwork. Differential privacy. In M. Bugliesi, B. Preneel, V. Sassone, and I. Wegener, editors, Automata, Languages and Programming, volume 452 of Lecture Notes in Computer Science, pages 2. Springer Berlin Heidelberg, 26.
[11] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: privacy via distributed noise generation. In Proceedings of the 24th annual international conference on The Theory and Applications of Cryptographic Techniques, EUROCRYPT 6, pages , Berlin, Heidelberg, 26. Springer-Verlag.
[12] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference, pages , Heidelberg, 26. Springer-Verlag.
[13] L. Fan and L. Xiong. An adaptive approach to real-time aggregate monitoring with differential privacy. Knowledge and Data Engineering, IEEE Transactions on, 26(9):294 26, Sept 24.
[14] Y. Hong, J. Vaidya, H. Lu, and M. Wu. Differentially private search log sanitization with optimal output utility. In Proceedings of the 5th International Conference on Extending Database Technology, EDBT 2, pages 5 6, New York, NY, USA, 22. ACM.
[15] G. Kellaris and S. Papadopoulos. Practical differential privacy via grouping and smoothing. In Proceedings of the 39th international conference on Very Large Data Bases, PVLDB 3, pages 3 32, 23.
[16] A. Korolova, K. Kenthapadi, N. Mishra, and A. Ntoulas. Releasing search queries and clicks privately. In Proceedings of the 8th international conference on World wide web, WWW 9, pages 7 8, 29.
[17] N. Li, W. Qardaji, and D. Su. On sampling, anonymization, and differential privacy or, k-anonymization meets differential privacy. In Proceedings of the 7th ACM Symposium on Information, Computer and Communications Security, ASIACCS 2, pages 32 33, 22.
[18] X. Long, L. Jin, and J. Joshi. Towards understanding traveler behavior in location-based social networks. In Global Communications Conference (GLOBECOM), 23 IEEE, 23.
[19] A. Machanavajjhala, D. Kifer, J. Abowd, J. Gehrke, and L. Vilhuber. Privacy: Theory meets practice on the map. In Data Engineering, 28. ICDE 28. IEEE 24th International Conference on, pages , 28.
[20] F. McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. volume 53, pages 89 97, 2.
[21] K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, STOC 7, pages 75 84, New York, NY, USA, 27. ACM.
[22] D. Proserpio, S. Goldberg, and F. McSherry. Calibrating data to sensitivity in private data analysis: A platform for differentially-private analysis of weighted datasets. Proc. VLDB Endow., 7(8): , Apr. 24.
[23] V. Rastogi and S. Nath. Differentially private aggregation of distributed time-series with transformation and encryption. In Proceedings of the 2 ACM SIGMOD International Conference on Management of data, pages , 2.
[24] G. Yuan, Z. Zhang, M. Winslett, X. Xiao, Y. Yang, and Z. Hao. Low-rank mechanism: optimizing batch queries under differential privacy. Proc. VLDB Endow., 5(): , July 22.
[25] C. Zeng, J. F. Naughton, and J.-Y. Cai. On differentially private frequent itemset mining. Proc. VLDB Endow., 6():25 36, Nov

APPENDIX

A. PROOF OF THEOREM 2

PROOF. For item $V_i$, let $c_i$ denote the true count computed by $q_1$ from the sample $D_R$, and let $\bar{c}_i$ denote the count of $V_i$ in the raw data set $D$. The noisy count $\tilde{c}_i$ is derived by adding Laplace noise to $c_i$ as follows:

$$\tilde{c}_i = c_i + \nu_i, \quad (12)$$
$$\nu_i \sim Laplace(0, \ell/\epsilon_1). \quad (13)$$

The MSE of $\tilde{c}_i$ can be re-written as:

$$MSE(\tilde{c}_i) = Var(c_i + \nu_i) + \left(E(c_i + \nu_i - \bar{c}_i)\right)^2 = Var(c_i) + Var(\nu_i) + \left(E(c_i) - E(\bar{c}_i)\right)^2. \quad (14)$$

Note that $c_i$ and $\nu_i$ are mutually independent. Let $p_i$ denote the popularity of item $V_i$, i.e. the probability of any record having $vid = V_i$. For simplicity, we assume that users are mutually independent, records are mutually independent, and every user has $M$ records in the raw data set $D$. To obtain $D_R$, $\ell$ records out of $M$ are randomly chosen for each user in $D$. Thus for any item $V_i$, $c_i$ can be represented as the sum of independent random variables:

$$c_i = \sum_{k=1}^{n} \sum_{r \in T_k} \delta_{r,i} \quad (15)$$
$$\delta_{r,i} = \begin{cases} 1 & \text{if } r.vid = V_i \ \& \ r \in D_R, \\ 0 & \text{otherwise.} \end{cases} \quad (16)$$

The event $\delta_{r,i} = 1$ is equivalent to the event that record $r$ is about $V_i$ and $r$ is sampled in $D_R$ by chance:

$$Pr[\delta_{r,i} = 1] = Pr[r.vid = V_i \ \& \ r \in D_R] = p_i \frac{\ell}{M}. \quad (17)$$

Therefore, we can obtain the following expectation and variance for $c_i$:

$$E(c_i) = \sum_{k=1}^{n} \sum_{r \in T_k} E(\delta_{r,i}) = \sum_{k=1}^{n} \sum_{r \in T_k} p_i \frac{\ell}{M} = n \ell p_i \quad (18)$$
$$Var(c_i) = \sum_{k=1}^{n} \sum_{r \in T_k} Var(\delta_{r,i}) = \sum_{k=1}^{n} \sum_{r \in T_k} p_i \frac{\ell}{M} \left(1 - p_i \frac{\ell}{M}\right) = n \ell p_i \left(1 - p_i \frac{\ell}{M}\right) \quad (19)$$

Similarly, we can obtain the expectation of $\bar{c}_i$:

$$E(\bar{c}_i) = n M p_i. \quad (20)$$

From the above results, and since $Var(\nu_i) = 2\ell^2/\epsilon_1^2$ for Laplace noise, we can re-write Equation 14 as follows:

$$MSE(\tilde{c}_i) = n \ell p_i \left(1 - p_i \frac{\ell}{M}\right) + \frac{2\ell^2}{\epsilon_1^2} + (n \ell p_i - n M p_i)^2 \quad (21)$$

and we can perform the standard least squares method to minimize the MSE. The optimal $\ell$ value is thus:

$$\ell^* = \frac{2 n^2 p_i^2 M - n p_i}{4/\epsilon_1^2 - 2 n p_i^2 / M + 2 n^2 p_i^2} \quad (22)$$

We conclude that the optimal $\ell$ value is a monotonically increasing function of $\epsilon_1^2$.

B. PROOF OF LEMMA 6

PROOF. By definition of differential privacy, we are to prove that for any neighboring raw databases $D_1$ and $D_2$, $A \circ S$ satisfies the following inequality for any $D \in Range(A \circ S)$:

$$Pr[A \circ S(D_1) = D] \le e^{\epsilon} Pr[A \circ S(D_2) = D]. \quad (23)$$

Without loss of generality, we assume $D_2$ contains one more user than $D_1$. Let $u$ denote the user that is contained in $D_2$ but not $D_1$, and let $T$ be user $u$'s set of records in $D_2$. By definition of neighboring databases, we can rewrite $D_2 = D_1 \sqcup T$, where $\sqcup$ denotes a co-product, or disjoint union, of two databases. Let $\hat{D}_1$ denote any possible sampling output of $S(D_1)$. We have:

$$Pr[A \circ S(D_1) = D] = \sum_{\hat{D}_1} Pr[A \circ S(D_1) = D \mid S(D_1) = \hat{D}_1] \, Pr[S(D_1) = \hat{D}_1] = \sum_{\hat{D}_1} Pr[A(\hat{D}_1) = D] \, Pr[S(D_1) = \hat{D}_1] \quad (24)$$

Let $\hat{T}$ denote any possible sampling output of $S(T)$. We note that $\hat{T}$ can take values from the entire domain, so in general:

$$\sum_{\hat{T}} Pr[S(T) = \hat{T}] = 1. \quad (25)$$

Since $S$ is performed independently on each user, we can derive:

$$Pr[S(D_1) = \hat{D}_1] = \sum_{\hat{T}} Pr[S(D_1) = \hat{D}_1] \, Pr[S(T) = \hat{T}] = \sum_{\hat{T}} Pr[S(D_1 \sqcup T) = \hat{D}_1 \sqcup \hat{T}]. \quad (26)$$

Note that since $D_1$ and $T$ are disjoint, the sampling outputs on $D_1$ and $T$ are also independent and disjoint. Therefore,

$$Pr[A \circ S(D_1) = D] = \sum_{\hat{D}_1} Pr[A(\hat{D}_1) = D] \sum_{\hat{T}} Pr[S(D_1 \sqcup T) = \hat{D}_1 \sqcup \hat{T}]$$
$$= \sum_{\hat{D}_1, \hat{T}} Pr[A(\hat{D}_1) = D] \, Pr[S(D_1 \sqcup T) = \hat{D}_1 \sqcup \hat{T}]$$
$$\le e^{\epsilon} \sum_{\hat{D}_1, \hat{T}} Pr[A(\hat{D}_1 \sqcup \hat{T}) = D] \, Pr[S(D_1 \sqcup T) = \hat{D}_1 \sqcup \hat{T}] \quad (27)$$
$$= e^{\epsilon} \sum_{\hat{D}_2} Pr[A(\hat{D}_2) = D] \, Pr[S(D_2) = \hat{D}_2] \quad (28)$$
$$= e^{\epsilon} Pr[A \circ S(D_2) = D]. \quad (29)$$

Line 27 is due to the fact that $A$ is $\epsilon$-differentially private and $\hat{D}_1$ and $\hat{D}_1 \sqcup \hat{T}$ are neighboring databases. In line 28 we change notation and let $\hat{D}_2$ represent $\hat{D}_1 \sqcup \hat{T}$. The proof is hence complete.
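The MSE model from the proof of Theorem 2 in Appendix A can be checked numerically with a short sketch. It assumes the three-term reading of the model (variance of the sampled count, Laplace noise variance $2(\ell/\epsilon_1)^2$, and squared sampling bias) and treats $\ell$ as a continuous variable, whereas in practice it is an integer no larger than $M$. The function names and parameter values are illustrative only.

```python
def mse_model(ell, n, p, M, eps1):
    """Modelled MSE of one noisy item count under per-user sampling of ell records."""
    sampling_var = n * ell * p * (1 - p * ell / M)   # variance of the sampled count
    noise_var = 2 * (ell / eps1) ** 2                # variance of Laplace(0, ell/eps1)
    bias_sq = (n * ell * p - n * M * p) ** 2         # squared bias from dropping records
    return sampling_var + noise_var + bias_sq


def optimal_ell(n, p, M, eps1):
    """Closed-form minimizer of mse_model, from setting its derivative to zero."""
    numerator = 2 * n**2 * p**2 * M - n * p
    denominator = 4 / eps1**2 - 2 * n * p**2 / M + 2 * n**2 * p**2
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical setting: 1000 users, 100 records each, item popularity 1%.
    for eps1 in (0.1, 0.5, 1.0):
        best = optimal_ell(n=1000, p=0.01, M=100, eps1=eps1)
        print(eps1, best, mse_model(best, 1000, 0.01, 100, eps1))
```

Because the model is quadratic in $\ell$ with a positive leading coefficient for these values, the closed form is its unique minimizer, and a larger $\epsilon_1$ gives a larger optimal $\ell$.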

Secure Network Coding with a Cost Criterion

Secure Network Coding with a Cost Criterion Secure Network Coding with a Cost Criterion Jianong Tan, Murie Médard Laboratory for Information and Decision Systems Massachusetts Institute of Technoogy Cambridge, MA 0239, USA E-mai: {jianong, medard}@mit.edu

More information

Face Hallucination and Recognition

Face Hallucination and Recognition Face Haucination and Recognition Xiaogang Wang and Xiaoou Tang Department of Information Engineering, The Chinese University of Hong Kong {xgwang1, xtang}@ie.cuhk.edu.hk http://mmab.ie.cuhk.edu.hk Abstract.

More information

Teamwork. Abstract. 2.1 Overview

Teamwork. Abstract. 2.1 Overview 2 Teamwork Abstract This chapter presents one of the basic eements of software projects teamwork. It addresses how to buid teams in a way that promotes team members accountabiity and responsibiity, and

More information

Finance 360 Problem Set #6 Solutions

Finance 360 Problem Set #6 Solutions Finance 360 Probem Set #6 Soutions 1) Suppose that you are the manager of an opera house. You have a constant margina cost of production equa to $50 (i.e. each additiona person in the theatre raises your

More information

Australian Bureau of Statistics Management of Business Providers

Australian Bureau of Statistics Management of Business Providers Purpose Austraian Bureau of Statistics Management of Business Providers 1 The principa objective of the Austraian Bureau of Statistics (ABS) in respect of business providers is to impose the owest oad

More information

Betting Strategies, Market Selection, and the Wisdom of Crowds

Betting Strategies, Market Selection, and the Wisdom of Crowds Betting Strategies, Market Seection, and the Wisdom of Crowds Wiemien Kets Northwestern University w-kets@keogg.northwestern.edu David M. Pennock Microsoft Research New York City dpennock@microsoft.com

More information

Multi-Robot Task Scheduling

Multi-Robot Task Scheduling Proc of IEEE Internationa Conference on Robotics and Automation, Karsruhe, Germany, 013 Muti-Robot Tas Scheduing Yu Zhang and Lynne E Parer Abstract The scheduing probem has been studied extensivey in

More information

A Similarity Search Scheme over Encrypted Cloud Images based on Secure Transformation

A Similarity Search Scheme over Encrypted Cloud Images based on Secure Transformation A Simiarity Search Scheme over Encrypted Coud Images based on Secure Transormation Zhihua Xia, Yi Zhu, Xingming Sun, and Jin Wang Jiangsu Engineering Center o Network Monitoring, Nanjing University o Inormation

More information

Advanced ColdFusion 4.0 Application Development - 3 - Server Clustering Using Bright Tiger

Advanced ColdFusion 4.0 Application Development - 3 - Server Clustering Using Bright Tiger Advanced CodFusion 4.0 Appication Deveopment - CH 3 - Server Custering Using Bri.. Page 1 of 7 [Figures are not incuded in this sampe chapter] Advanced CodFusion 4.0 Appication Deveopment - 3 - Server

More information

Bite-Size Steps to ITIL Success

Bite-Size Steps to ITIL Success 7 Bite-Size Steps to ITIL Success Pus making a Business Case for ITIL! Do you want to impement ITIL but don t know where to start? 7 Bite-Size Steps to ITIL Success can hep you to decide whether ITIL can

More information

ELEVATING YOUR GAME FROM TRADE SPEND TO TRADE INVESTMENT

ELEVATING YOUR GAME FROM TRADE SPEND TO TRADE INVESTMENT Initiatives Strategic Mapping Success in The Food System: Discover. Anayze. Strategize. Impement. Measure. ELEVATING YOUR GAME FROM TRADE SPEND TO TRADE INVESTMENT Foodservice manufacturers aocate, in

More information

The Radix-4 and the Class of Radix-2 s FFTs

The Radix-4 and the Class of Radix-2 s FFTs Chapter 11 The Radix- and the Cass of Radix- s FFTs The divide-and-conuer paradigm introduced in Chapter 3 is not restricted to dividing a probem into two subprobems. In fact, as expained in Section. and

More information

Practicing Reference... Learning from Library Science *

Practicing Reference... Learning from Library Science * Practicing Reference... Learning from Library Science * Mary Whisner ** Ms. Whisner describes the method and some of the resuts reported in a recenty pubished book about the reference interview written

More information

Network/Communicational Vulnerability

Network/Communicational Vulnerability Automated teer machines (ATMs) are a part of most of our ives. The major appea of these machines is convenience The ATM environment is changing and that change has serious ramifications for the security

More information
