14 Protecting Private Information in Online Social Networks



Jianming He and Wesley W. Chu
Computer Science Department, University of California, USA
{jmhe, wwc}@cs.ucla.edu

Abstract. Because personal information can be inferred from associations with friends, privacy becomes increasingly important as online social network services gain more popularity. Our recent study showed that the causal relations among friends in social networks can be modeled by a Bayesian network, and personal attribute values can be inferred with high accuracy from close friends in the social network. Based on these insights, we propose schemes to protect private information by selectively hiding or falsifying information based on the characteristics of the social network. Both simulation results and analytical studies reveal that selective alterations of the social network (relations and/or attribute values) according to our proposed protection rule are much more effective than random alterations.

14.1 Introduction

With the increasing popularity of Web 2.0, more and more online social networks (OSNs) such as Myspace.com, Facebook.com, and Friendster.com have emerged. People in OSNs have their own personalized space where they not only publish their biographies, hobbies, interests, blogs, etc., but also list their friends. Friends or visitors can visit these personal spaces and leave comments. OSNs provide platforms where people can place themselves on exhibit and maintain connections with friends, and that is why they are so popular with the younger generation.

However, as more people use OSNs, privacy becomes an important issue. When considering the multitude of user profiles and friendships flooding the OSNs (e.g., Myspace.com claims to have about 100 million membership accounts), we realize how easily information can be divulged if people mishandle it [8]. One example is a school policy violation identified on Facebook.com. In November 2005, four students at Northern Kentucky University were fined when pictures of a drinking party were posted on Facebook.com. The pictures, taken in one of NKU's dormitories, were visual proof that the students were in violation of the university's dry campus policy. In this example, people's private activities were disclosed by themselves.
H. Chen and C.C. Yang (Eds.): Intelligence and Security Informatics, SCI 135, pp. 249-273, 2008. springerlink.com, Springer-Verlag Berlin Heidelberg 2008

There is another type of privacy disclosure that is more difficult to identify and prevent. In this case, private data can be indirectly inferred by adversaries. Intuitively, friends tend to share common traits. For example, high school classmates have similar ages and the same hometown, and members of a dance club like dancing. Therefore, to infer someone's hometown or interest in dancing, we can check the values of these attributes of his classmates or club mates. In another example, assume Joe does not wish to disclose his salary. However, a third party, such as an insurance company, uses OSNs to obtain a report on Joe's social network, which includes Joe's friends and office colleagues and their personal information. After looking carefully into this report, the insurance company realizes that Joe has quite a few friends who are junior web developers of a startup company in San Jose. Thus, the insurance company can deduce that most likely Joe is also a programmer (if this information is not provided by Joe himself). By using the knowledge concerning a junior programmer's salary range, the insurance company can then figure out Joe's approximate salary and advertise insurance packages accordingly. Therefore, in this example, Joe's private salary information is indirectly disclosed from Joe's social relations.

Information privacy is one of the most urgent research issues in building next-generation information systems, and a great deal of research effort has been devoted to protecting people's privacy. In addition to recent developments in cryptography and security protocols [ 2] that provide secure data transfer capabilities, there has been work on enforcing industry standards (e.g., P3P [2]) and government policies (e.g., the HIPAA Privacy Rule [9]) to grant individuals control over their own privacy. These existing techniques and policies aim to effectively block direct disclosure of sensitive personal information. However, as we mentioned in the previous examples, private information can also be indirectly deduced by intelligently combining pieces of seemingly innocuous or unrelated information. To the best of our knowledge, none of the existing techniques are able to handle such indirect disclosures. In this chapter, we shall discuss how to prevent the disclosure of private information that can be inferred from social relations. To preserve the inference properties from the social network characteristics, we encode the causality of a social network into a Bayesian network, and then use simulation and analysis to investigate the effectiveness of inference of private information in a social network.
We have conducted an experiment on Epinions.com, which operates in a real environment, to verify the performance improvements gained by using the Bayesian network for inferring private information. Based on the insights obtained from the experiment, a privacy protection rule has been developed. Privacy protection methods derived from the protection rule are proposed and their performance is evaluated.

The chapter is organized as follows. After introducing the background in Sec. 14.2, we propose a Bayesian network approach in Sec. 14.3 to infer private information. Sec. 14.4 discusses simulation experiments for studying the performance of Bayesian inference. Privacy protection rules as well as protection schemes are proposed and evaluated in Sec. 14.5. In Sec. 14.6, we use analysis to show that, based on our protection rules, selective alterations of the social network (social relations and/or attribute values) yield much more effective privacy protection than random alterations. We present some related work on social networks in Sec. 14.7. Finally, future work and conclusions are summarized in Sec. 14.8. Sec. 14.8 is followed by several questions related to the discussion in this chapter.

14.2 Background

A Bayesian network [9, 7, 22] is a graphic representation of the joint probability distribution over a set of variables. It consists of a network structure and a collection of conditional probability tables (CPT). The network structure is represented as a directed acyclic graph (DAG) in which each node corresponds to a random variable and each edge indicates a dependent relationship between connected variables. In addition, each variable (node) in a Bayesian network is associated with a CPT, which enumerates the conditional probabilities for this variable given all the combinations of its parents' values. Thus, for a Bayesian network, the DAG captures causal relations among random variables, and the CPTs quantify these relations. Bayesian networks have been extensively applied to fields such as medicine, image processing, and decision support systems. Since Bayesian networks include the consideration of network structures, we decided to model social networks with Bayesian networks. Basically, we represent an individual in a social network as a node in a Bayesian network, and a relation between individuals in a social network as an edge in a Bayesian network.

14.3 Bayesian Inference via Social Relations

In this section, we propose an approach to map social networks into Bayesian networks, and then illustrate how we use this for attribute inference. The attribute inference is used to predict the private attribute value of a particular individual $Z$, referred to as the target node, from his social network, which consists of the values of the same attribute of his friends. Note that we do not utilize the values of other attributes of $Z$ and $Z$'s friends in this study, though considering such information might improve the prediction accuracy. Instead, we only consider a single attribute so that we can focus on the role of social relations in the attribute inference. The single attribute that we study can be any attribute in general, such as gender, ethnicity, and hobbies, and we refer to this attribute as the target attribute. For simplicity, we consider the value of the target attribute as a binary variable, i.e., either true (or t for short) or false (f). For example, if $Z$ likes reading books, then we consider $Z$'s book attribute value to be true. People are acquainted with each other via different types of relations, and it is not necessary for an individual to have the same attribute values as his friends. Which attributes are common between friends depends on the type of relationship.
For example, diabetes could be an inherited trait in family members, but this would not apply to officemates. Therefore, to perform attribute inference, we need to filter out the unrelated social relations. For instance, we need to remove $Z$'s officemates from his social network if we want to predict his health condition. If the types of social relations that cause friends to connect with one another are specified in the social networks, then the filtering is straightforward. However, in case such information is not given, one possible solution is to classify social relations into different categories and then filter out unrelated social relations based on the type of the categories. In Sec. 14.4 we show such an example while inferring personal interests from data in Epinions.com. For simplicity, in this section we assume that we have already filtered out the unrelated social relations, and the social relations we discuss here are the ones that are closely related to the target attribute.

The attribute inference involves two steps. Before we predict the target attribute value of $Z$, we first construct a Bayesian network from $Z$'s social network, and then apply a Bayesian inference and obtain the probability that $Z$ has a certain attribute value. In this section, we shall first start with a simple case in which the target attribute values of all the direct friends are known. Then we extend the study by considering the case where some friends hide their target attribute values.

14.3.1 Single-Hop Inference

Let us first consider the case in which we know the target attribute values of all the direct friends of $Z$. We define $Z_{ij}$ as the $j$-th friend of $Z$ at $i$ hops away. If a friend can be reached via more than one route from $Z$, we use the shortest path as the value of $i$. Therefore, $Z$ can also be represented as $Z_{00}$. Let $Z_i$ be the set of $Z_{ij}$ ($0 \le j < n_i$), where $n_i$ is the number of $Z$'s friends at $i$ hops away. For instance, $Z_1 = \{Z_{10}, Z_{11}, \ldots, Z_{1(n_1-1)}\}$ is the set of $Z$'s direct friends, who are one hop away. Furthermore, we use the corresponding lowercase variable to represent the target attribute value of a particular person, e.g., $z_{10}$ stands for the target attribute value of $Z_{10}$. An example of a social network with six friends is shown in Fig. 14.1a. In this figure, $Z_{10}$, $Z_{11}$, and $Z_{12}$ are direct friends of $Z_{00}$. $Z_{20}$ and $Z_{30}$ are the direct friends of $Z_{11}$ and $Z_{20}$ respectively. In this scenario, the attribute values of $Z_{10}$, $Z_{11}$, $Z_{12}$, and $Z_{30}$ are known (represented as shaded nodes).

Fig. 14.1. Reduction of a social network (a) into a Bayesian network to infer $Z_{00}$ from his friends via the localization assumption (b), and via the naïve Bayesian assumption (c). The shaded nodes represent friends whose attribute values are known.

Bayesian Network Construction

To construct the Bayesian network, we make the following two assumptions. Intuitively, our direct friends have more influence on us than friends who are two or more hops away. Therefore, to infer the target attribute value of $Z_{00}$, it is sufficient to consider only the direct friends $Z_1$. Knowing the attribute values of friends at multiple hops away provides no additional information for predicting the target attribute value. Formally, we state this assumption as follows.

Localization Assumption. Given the attribute values of the direct friends $Z_1$, friends at more than one hop away (i.e., $Z_i$ for $i > 1$) are conditionally independent of $Z_{00}$.

Based on this assumption, $Z_{20}$ and $Z_{30}$ in Fig. 14.1a can be pruned, and the inference of $Z_{00}$ only involves $Z_{10}$, $Z_{11}$, and $Z_{12}$ (Fig. 14.1b). The next question is how to decide a DAG linking the remaining nodes. If the resulting social network does not contain cycles, a Bayesian network is formed. Otherwise, we must employ more sophisticated techniques to remove cycles, such as the use of auxiliary variables to capture non-causal constraints (exact conversion) and the deletion of edges with the weakest relations (approximation conversion). We adopt the latter approach and make a naïve Bayesian assumption. That is, the attribute value of $Z_{00}$ influences that of $Z_{1j}$ ($0 \le j < n_1$), and there is a direct link pointing from $Z_{00}$ to each $Z_{1j}$. By making this assumption, we consider the inference paths from $Z_{00}$ to $Z_{1j}$ as the primary correlations, and disregard the correlations among the nodes in $Z_1$. Formally, we have:

Naïve Bayesian Assumption. Given the attribute value of the target node $Z_{00}$, the attribute values of direct friends $Z_1$ are conditionally independent of each other.

This naïve Bayesian model has been used in many classification/prediction applications, including textual-document classification. Though it simplifies the correlation among variables, this model has been shown to be quite effective [4]. Thus, we also adopted this assumption in our study. For example, a final DAG is formed as shown in Fig. 14.1c by removing the connection between $Z_{10}$ and $Z_{11}$ in Fig. 14.1b.

Bayesian Inference

After modeling the specific person $Z_{00}$'s social network into a Bayesian network, we use the Bayes decision rule to predict the attribute value of $Z_{00}$. For a general Bayesian network with maximum depth $i$, let $\hat{z}_{00}$ be the value with the maximum conditional (posterior) probability for the attribute value of $Z_{00}$, given the attribute values of other nodes in the network, as in Eq. 14.1:

$$\hat{z}_{00} = \arg\max_{z_{00} \in \{t, f\}} P(Z_{00} = z_{00} \mid Z_1, Z_2, \ldots, Z_i) \qquad (14.1)$$

Since single-hop inference involves only direct friends $Z_1$, which are conditionally independent of each other, the posterior probability can be further reduced using the conditional independence property encoded in the Bayesian network:

$$P(Z_{00} = z_{00} \mid Z_1) = \frac{P(Z_{00} = z_{00}) \prod_{j=0}^{n_1-1} P(Z_{1j} = z_{1j} \mid Z_{00} = z_{00})}{\sum_{z \in \{t, f\}} \left[ P(Z_{00} = z) \prod_{j=0}^{n_1-1} P(Z_{1j} = z_{1j} \mid Z_{00} = z) \right]} \qquad (14.2)$$

where $z_{00}$ and $z_{1j}$ are the attribute values of $Z_{00}$ and $Z_{1j}$ respectively ($0 \le j < n_1$; $z_{00}, z_{1j} \in \{t, f\}$), and the value of each $z_{1j}$ is known.

To compute Eq. 14.2, we need to further learn the conditional probability table (CPT) for each person in the social network. In our study, we apply the parameter estimation [7] technique on the entire network. For every pair of parent $X$ and child $Y$, we obtain Eq. 14.3:

$$P(Y = y \mid X = x) = \frac{\#\text{ of friendship links connecting people with } X = x \text{ and } Y = y}{\#\text{ of friendship links connecting a person with } X = x} \qquad (14.3)$$

where $x, y \in \{t, f\}$. $P(Y = y \mid X = x)$ is the CPT for every pair of friends $Z_{1j}$ and $Z_{00}$ in the network. Since $P(Z_{1j} \mid Z_{00})$ is the same for $0 \le j < n_1$, the $Z_{1j}$ become equivalent to one another, and the posterior probability now depends on $N_{1t}$, the number of direct friends with attribute value t. We can rewrite the posterior probability $P(Z_{00} = z_{00} \mid Z_1)$ as $P(Z_{00} = z_{00} \mid N_{1t} = n_{1t})$. Given $N_{1t} = n_{1t}$, we obtain:

$$P(Z_{00} = z_{00} \mid N_{1t} = n_{1t}) = \frac{P(Z_{00} = z_{00})\, P(Z_{10} = t \mid Z_{00} = z_{00})^{n_{1t}}\, P(Z_{10} = f \mid Z_{00} = z_{00})^{n_1 - n_{1t}}}{\sum_{z} \left[ P(Z_{00} = z)\, P(Z_{10} = t \mid Z_{00} = z)^{n_{1t}}\, P(Z_{10} = f \mid Z_{00} = z)^{n_1 - n_{1t}} \right]} \qquad (14.4)$$

where $z \in \{t, f\}$. After obtaining $P(Z_{00} = t \mid N_{1t} = n_{1t})$ and $P(Z_{00} = f \mid N_{1t} = n_{1t})$ from Eq. 14.4, we predict that $Z_{00}$ has attribute value t if the former value is greater than the latter, and vice versa.

14.3.2 Multi-hop Inference

In single-hop inference, we assume that we know the attribute values of all the direct friends of $Z_{00}$. However, in reality, not all of those attribute values may be observed, since people may hide their sensitive information, and the localization assumption in the previous section is no longer valid. To incorporate more attribute information into our Bayesian network, we propose the following generalized localization assumption.

Generalized Localization Assumption. Given the attribute value of the $j$-th friend of $Z_{00}$ at $i$ hops away, $Z_{ij}$ ($0 \le j < n_i$), the attribute of $Z_{00}$ is conditionally independent of the descendants of $Z_{ij}$.

Fig. 14.2. Reduction of a social network (a) into a Bayesian network to infer $Z_{00}$ from his friends via the generalized localization assumption (b). The shaded nodes represent friends whose attribute values are known.
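The single-hop procedure of Sec. 14.3.1 (CPT estimation in Eq. 14.3 followed by the posterior in Eq. 14.4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the encoding of t/f as `True`/`False`, and the representation of friendship links as `(parent_value, child_value)` pairs are all assumptions made for the example.

```python
from collections import Counter


def estimate_cpt(links):
    """Estimate P(Y=y | X=x) by link counting, in the spirit of Eq. 14.3.

    `links` is a list of (parent_value, child_value) pairs, one per
    friendship link (a hypothetical input format). The result maps
    (y, x) to the estimated P(Y=y | X=x).
    """
    pair_counts = Counter(links)
    parent_counts = Counter(x for x, _ in links)
    return {
        (y, x): pair_counts[(x, y)] / parent_counts[x]
        for x in parent_counts
        for y in (True, False)
    }


def single_hop_posterior(prior_t, p_t_given_t, p_t_given_f, n1, n1t):
    """Return P(Z00 = t | N1t = n1t) following Eq. 14.4.

    prior_t      -- P(Z00 = t)
    p_t_given_t  -- P(friend = t | Z00 = t)
    p_t_given_f  -- P(friend = t | Z00 = f)
    n1, n1t      -- number of direct friends, and how many have value t
    """
    def joint(prior, p_friend_t):
        # Numerator of Eq. 14.4 for one candidate value of Z00.
        return prior * p_friend_t ** n1t * (1 - p_friend_t) ** (n1 - n1t)

    joint_t = joint(prior_t, p_t_given_t)
    joint_f = joint(1 - prior_t, p_t_given_f)
    return joint_t / (joint_t + joint_f)


def predict(prior_t, p_t_given_t, p_t_given_f, n1, n1t):
    """Bayes decision rule: predict t (True) when its posterior is larger."""
    return single_hop_posterior(prior_t, p_t_given_t, p_t_given_f, n1, n1t) > 0.5
```

With no friends observed (`n1 = 0`) the posterior reduces to the prior, which is a quick way to check the sketch against Eq. 14.4.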

This assumption states that if the attribute value of $Z_{00}$'s direct friend $Z_{1j}$ is unknown, then the attribute value of $Z_{00}$ is conditionally dependent on those of the direct friends of $Z_{1j}$. This process continues until we reach a descendant of $Z_{1j}$ with known attribute value. For example, the network structure in Fig. 14.2a is the same as in Fig. 14.1a, but the attribute value of $Z_{11}$ is unknown. Based on the generalized localization assumption, we extend the network by branching to $Z_{11}$'s direct child $Z_{20}$. Since $Z_{20}$'s attribute value is unknown, we further branch to $Z_{20}$'s direct friend $Z_{30}$. The branch terminates here because the attribute value of $Z_{30}$ is known. Thus, the inference network for $Z_{00}$ includes all the nodes in the graph. After applying the naïve Bayesian assumption, we obtain the DAG shown in Fig. 14.2b. Similar to single-hop inference, the resulting DAG in multi-hop inference is a tree rooted at the target node $Z_{00}$. One interpretation of this model is that when we predict the attribute value of $Z_{00}$, we always treat $Z_{00}$ as an egocentric person who has strong influences on his/her friends. Thus, the attribute value of $Z_{00}$ can be reflected by the attributes of friends.

For multi-hop inference, we still apply the Bayes decision rule. Due to additional unknown attribute values such as $z_{11}$, the calculation of the posterior probability becomes more complicated. One common technique for solving this equation is variable elimination [9]. In this chapter, we use this technique to derive the value of $\hat{z}_{00}$ in Eq. 14.1.

14.4 Experimental Study of Bayesian Inference

In the previous section, we discussed the method for performing the attribute inference in social networks. In this section, we study several characteristics of social networks to investigate under what condition and to what extent the value of a target attribute can be inferred by Bayesian inference. Specifically, we study the influence strength between friends, the prior probability of target attributes, and society openness. We use simulations and experiments to evaluate their impact on inference accuracy, which is defined as the percentage of nodes predicted correctly by the inference.

14.4.1 Characteristics of Social Networks

Influence Strength

Analogous to the interaction between inheritance and mutation in biology, we define two types of influence in social relations.
More specifically, for the relationship between every pair of parent $X$ and child $Y$, we define $P(Y = t \mid X = t)$ (or $P(t \mid t)$ for simplification) as inheritance strength. This value measures the degree to which a child inherits an attribute value from his/her parent. A higher value of $P(t \mid t)$ implies that both $X$ and $Y$ will possess the attribute value with a higher probability. On the other hand, we define $P(Y = t \mid X = f)$ (or $P(t \mid f)$) as mutation strength. $P(t \mid f)$ measures the potential that $Y$ develops an attribute value by mutation rather than inheritance. An individual's attribute value is the result of both types of strength. There are two other conditional probabilities between $X$ and $Y$, i.e., $P(Y = f \mid X = t)$ (or $P(f \mid t)$) and $P(Y = f \mid X = f)$ (or $P(f \mid f)$). These two values can be derived from $P(t \mid t)$ and $P(t \mid f)$ respectively: $P(f \mid t) = 1 - P(t \mid t)$ and $P(f \mid f) = 1 - P(t \mid f)$. Therefore, it is sufficient to only consider inheritance and mutation strength.

Prior Probability

Prior probability $P(Z = t)$ (or $P(t)$ for short) is the percentage of people in the social network who have the target attribute value t. When no additional information is provided, we can use the prior probability to predict attribute values for the target nodes: if $P(t) \ge 0.5$, we predict that every target node has value t; otherwise, we predict that it has value f. We call this method naïve inference. The average naïve inference accuracy that can be obtained is $\max(P(t), 1 - P(t))$. In our study, we use it as a baseline for comparison with the Bayesian inference approach. It is worth pointing out that when $P(t \mid t) = P(t \mid f)$, people in a society are in fact independent of each other, and thus $P(t \mid t) = P(t)$. Hence, having additional information about a friend provides no contribution to the prediction for the target node.

Society Openness

We define society openness $O_A$ as the percentage of people in a social network who release their target attribute value $A$. The more people who release their values, the higher the society openness, and the more information observed about attribute $A$. Using society openness, we study the amount of information needed to know about other people in the social network in order to make a correct prediction.

14.4.2 Data Set

For the simulation, we collect 66,766 personal profiles from an online weblog service provider, Livejournal [2], which has 2.6 million active members all over the world. For each member, Livejournal generates a personal profile that specifies the member's biography as well as a list of his friends. Among the collected profiles, there are 4,031,348 friend relationships. The degree of the number of friends follows the power law distribution (Fig. 14.3). About half of the population has fewer than ten direct friends.

Fig. 14.3. Number of direct friends vs. number of members in Livejournal on a log-log scale
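The naïve inference baseline defined in Sec. 14.4.1 is simple enough to state as code. The sketch below is only illustrative; the function names are hypothetical, and `True`/`False` stand for the attribute values t/f.

```python
def naive_predict(prior_t):
    """Predict t (True) for every target node when P(t) >= 0.5, else f (False)."""
    return prior_t >= 0.5


def naive_inference_accuracy(prior_t):
    """Expected accuracy of naive inference: max(P(t), 1 - P(t))."""
    return max(prior_t, 1 - prior_t)
```

For a prior of 0.3, the baseline always predicts f and is right for the 70% of members whose value is f.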

In order to evaluate the inference behaviors for a wide range of parameters, we use a hypothetical attribute and synthesize the attribute values. For each member, we assign a CPT and determine the actual attribute value based on the parent's value and the assigned CPT. The attribute assignment starts from the set of nodes whose in-degree is zero and explores the rest of the network following friendship links. We use the same CPT for each member. For all the experiments, we evaluate the inference performance by varying CPTs. After the attribute assignment, we obtain a social network. To infer each individual, we build a corresponding Bayesian network and then conduct Bayesian inference as described in Sec. 14.3.

14.4.3 Simulation Results

Comparison of Bayesian and Naïve Inference

In this set of experiments, we compare the performance of Bayesian inference to naïve inference. We shall study whether privacy can be inferred from social relations. We fix the prior probability to $P(t) = 0.3$ and vary the inheritance strength $P(t \mid t)$ from 0.1 to 0.9. We perform inference using both approaches on every member in the network. The inference accuracy is obtained by comparing the predicted values with the corresponding actual values. Fig. 14.4 shows the inference accuracy of the two methods as the inheritance strength increases. It is clear that Bayesian inference outperforms naïve inference. The curve for naïve inference fluctuates around 70% because, with $P(t) = 0.3$, the average accuracy we can achieve is 70%. The performance of Bayesian inference varies with $P(t \mid t)$. We achieve a very high accuracy, especially at high inheritance strength. The accuracy reaches 95% when $P(t \mid t) = 0.9$, which is much higher than the 70% accuracy of the naïve inference. Similar trends are observed between these two methods for other prior probabilities as well.

Fig. 14.4. Inference accuracy of Bayesian vs. naïve inference when $P(t) = 0.3$ (x-axis: inheritance strength; y-axis: inference accuracy)

In an equilibrium state, the value of $P(t \mid f)$ can be derived from $P(t)$ and $P(t \mid t)$. When $P(t)$ is fixed, increasing $P(t \mid t)$ results in a decrease in $P(t \mid f)$.
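The attribute synthesis described in Sec. 14.4.2 and the equilibrium remark above can be sketched as follows. Two points are assumptions of this sketch and are not spelled out in the text: "equilibrium" is taken to mean that a child's marginal probability equals the parent's, i.e. $P(t) = P(t)P(t \mid t) + (1 - P(t))P(t \mid f)$, and a node reachable from several parents takes its value from whichever parent is visited first. The function names and the `children` adjacency mapping are hypothetical.

```python
import random


def mutation_strength(prior_t, inheritance):
    """Solve P(t) = P(t)*P(t|t) + (1 - P(t))*P(t|f) for P(t|f).

    Assumes 0 < prior_t < 1 and the equilibrium reading stated in the lead-in.
    """
    return prior_t * (1 - inheritance) / (1 - prior_t)


def assign_attributes(children, roots, prior_t, inheritance, rng):
    """Synthesize True/False attribute values along friendship links.

    children -- dict mapping a node to the nodes it points to
    roots    -- nodes with in-degree zero; their values follow the prior
    rng      -- a random.Random instance, passed in for reproducibility
    """
    mutation = mutation_strength(prior_t, inheritance)
    values = {}
    stack = []
    for root in roots:
        values[root] = rng.random() < prior_t
        stack.append(root)
    while stack:
        parent = stack.pop()
        # The same CPT is used for every member, as in the text.
        p_child_t = inheritance if values[parent] else mutation
        for child in children.get(parent, ()):
            if child not in values:  # simplification: first parent wins
                values[child] = rng.random() < p_child_t
                stack.append(child)
    return values
```

For example, with $P(t) = 0.3$ and $P(t \mid t) = 0.7$ this gives $P(t \mid f) = 0.3 \cdot 0.3 / 0.7$, and raising $P(t \mid t)$ lowers $P(t \mid f)$, matching the remark above.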

Effect of Influence Strength and Prior Probability

Fig. 14.5 shows the accuracy of Bayesian inference when the prior probability $P(t)$ is 0.05, 0.1, 0.3, and 0.5 and the inheritance strength $P(t \mid t)$ varies from 0.1 to 0.9. As $P(t)$ varies, the inference accuracy yields different trends with $P(t \mid t)$. The lowest inference accuracy always occurs when $P(t \mid t)$ is equal to $P(t)$. For example, the lowest inference accuracy (approximately 70%) at $P(t) = 0.3$ occurs when $P(t \mid t)$ is 0.3. At this point, people in the network are independent of one another. The inference accuracy increases as the difference between $P(t \mid t)$ and $P(t)$ increases.

Fig. 14.5. Inference accuracy of Bayesian inference for different prior probabilities (x-axis: inheritance strength; y-axis: inference accuracy)

Society Openness

In the previous experiments, we assumed that society openness is 100%. That is, the attribute values of all the friends of the target node are known. In this set of experiments, we study the inference behavior at different levels of society openness. We randomly hide the attribute values of a certain percentage of members, ranging from 10% to 90%, and then perform Bayesian inference on those members. Fig. 14.6 shows the experimental results for the prior probability $P(t) = 0.3$ and the society openness $O_A$ = 10%, 50%, and 90%. The inference accuracy decreases as the openness decreases, i.e., as the number of members hiding their attribute values increases. For instance, at inheritance strength 0.7, when the openness is decreased from 90% to 10%, the accuracy reduces from 84.6% to 81.5%. However, the reduction in inference accuracy is relatively small (on average less than 5%). We also observe similar trends for other prior probabilities. This phenomenon reveals that randomly hiding friends' attribute values only results in a small effect on the inference accuracy. Therefore, we should consider selectively altering social networks to improve privacy protection.

Fig. 14.6. Inference accuracy of Bayesian inference for different society openness when $P(t) = 0.3$ (x-axis: inheritance strength; y-axis: inference accuracy)

Robustness of Bayesian Inference on False Information

To evaluate the robustness of Bayesian inference when people provide false information, we control the percentage of members (from 0% to 100%) who can randomly set their attribute values (referred to as randomness). Fig. 14.7 shows the impact of randomness on the inference accuracy at prior probability $P(t) = 0.3$ and inheritance strength $P(t \mid t) = 0.7$. At low randomness, we note that the Bayesian inference clearly has a higher accuracy than the naïve inference. For example, when the randomness is 0.1, the inference accuracy of Bayesian and naïve inferences is 79.7% and 72.9% respectively. However, the advantage of Bayesian inference decreases as the randomness increases. This is especially so when the randomness reaches 1.0. At that point, there is almost no difference in the inference accuracy between Bayesian and naïve inferences. This is because people's attribute values no longer follow the causal relations between friends when they randomly negate their attribute values. As a result, Bayesian inference behaves similarly to naïve inference. Thus, from a privacy protection point of view, falsifying personal attribute values can be an effective technique. Based on these characteristics, we will propose several schemes for privacy protection and evaluate their effectiveness in Sec. 14.5.

Fig. 14.7. Inference accuracy of Bayesian inference for different randomness when $P(t) = 0.3$ and $P(t \mid t) = 0.7$ (x-axis: randomness; y-axis: inference accuracy)
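The two random alterations used in these experiments, hiding values to lower society openness and letting a fraction of members set their values at random, can be sketched as below. The names are hypothetical, hidden values are represented by `None`, and "randomly set" is interpreted here as replacing the value with a fair coin flip; the text does not specify the exact randomization, so that choice is an assumption.

```python
import random


def hide_values(values, openness, rng):
    """Keep each member's value visible with probability `openness`.

    Hidden values are returned as None; `values` maps member -> True/False.
    """
    return {
        member: (value if rng.random() < openness else None)
        for member, value in values.items()
    }


def randomize_values(values, randomness, rng):
    """Let a fraction `randomness` of members replace their value at random.

    A selected member's value becomes a fair coin flip (an assumption of
    this sketch); all other members keep their true value.
    """
    return {
        member: ((rng.random() < 0.5) if rng.random() < randomness else value)
        for member, value in values.items()
    }
```

Feeding the output of either function into the inference step, instead of the true values, reproduces the kind of comparison reported in Figs. 14.6 and 14.7, although the exact numbers depend on the network and the random seed.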

14.4.4 Experiments on Epinions.com

To evaluate the performance of Bayesian inference in a real social network, we conduct some experiments on Epinions.com [6]. Epinions is a review website for products including digital cameras, video games, hotels, restaurants, etc. Epinions divides these products into 23 categories and hundreds of subcategories. We consider that people have interests in a particular category if they write reviews on products in this category. In addition, registered members can also specify members in Epinions that they trust. Thus, a social network is formed where people are connected by trust relations. In this trust network, if person A trusts person B, it is very likely that A also likes the products that B is interested in. In this experiment, we use Bayesian inference to predict people's interests in some categories from the friends that they trust, and then compare the prediction with the actual categories of their reviews published on Epinions. The higher the percentage of matches, the better the prediction.

We collect 66,390 personal profiles from Epinions. Each profile represents an individual with his product reviews and the people he trusts. We remove people who have no review and have no friend at all, which reduces the collection to 44,992 personal profiles. On average, each person writes 7 reviews and has reviews in four categories. Among all categories, the most popular ones are movies, electronics, and books. In terms of trust relations, each individual trusts 7 persons on average, and the distribution of the trust relations per user falls into a power law distribution again, as shown in Fig. 14.8.

Before we perform Bayesian inference on Epinions, we need to further prune the social network by filtering out social relations that are not related to the target attribute. Although people in Epinions are connected by trust relations, the persons that an individual trusts may be different from category to category. Since this information is not given in Epinions, we apply a heuristic assumption that friends with similar types of common interests have similar types of relations. We perform a K-means clustering [5] over 23 categories in Epinions. Each cluster represents a group of similar interests. Several examples of clusters are shown in Table 14.1.
For example, electronics and computer hardware are clustered together; online store & services is clustered with music and books, etc. Once we have the clusters, we filter out the social relations if connected people have no common interests with others in the cluster. In other words, when predicting the target attribute values of health, we only consider the social relations where connected persons have a common interest in at least one category in the health cluster, i.e., personal finance, education, or health categories. This filtering process reduces the original social network into a more focused social network.

Fig. 14.8. Number of direct friends vs. number of members in Epinions on a log-log scale

Table 14.1. Examples of the clustered interests in Epinions

Cluster
Health, Personal Finance, Education
Online Store & Services, Music, Books
Restaurants & Gourmet, Movies
Electronics, Computer Hardware

Table 14.2. Inference accuracy comparison between Bayesian and naïve inferences

Target Attribute | $P(t)$ | $P(t \mid t)$ | Accuracy, Naïve Inference | Accuracy, Bayesian Inference
Health | 0.461 | 0.734 | 53.9% | 63.8%
Online Store & Services | 0.522 | 0.735 | 52.2% | 6.6%
Restaurants & Gourmet | 0.432 | 0.667 | 56.8% | 64.2%
Electronics | 0.766 | 0.833 | 76.6% | 76.5%

Once the social network is pruned, we perform Bayesian inference. Table 14.2 compares the inference accuracy of Bayesian and naïve inferences. Note that the openness used in this experiment is 100%. As we can see from this table, Bayesian inference achieves higher prediction accuracy than the naïve inference. For the health category, the inference accuracy of naïve inference is 53.9% and the corresponding accuracy of Bayesian inference is 63.8%. The results of other attributes show a similar trend, except for the electronics category. This is because electronics is a very popular interest with prior probability 0.766. Thus, most people will have this interest themselves, and the influence from friends is comparatively not strong enough.

14.5 Privacy Protection

We have shown that private attribute values can be inferred from social relations. One way to prevent such inference is to alter an individual's social network, which means changing his social relations or the attribute values of his friends. For social relations, we can either hide existing relations or add fraudulent ones. For friends' attributes, we can either hide or falsify their values. Our study on society openness suggests that random changes to a social network have only a small effect on the result of Bayesian inference. Therefore, an effective protection method requires choosing appropriate candidates for alteration.
In this section, we shall study privacy protection schemes. We first present a theorem that captures the causal effect between friends' attribute values in a chain topology. We then apply this theorem to develop our protection schemes. We conduct experiments on the Livejournal network structure and evaluate the performance of the proposed protection schemes.

14.5.1 Causal Effect between Friends' Attribute Values

As mentioned earlier, children's attribute values are the result of the interaction between the inheritance strength $P(t \mid t)$ and the mutation strength $P(t \mid f)$ of their parents. For example, in a family where the inheritance strength is stronger than the mutation strength, children tend to inherit the same attribute value from their parents; thus, the evidence of a child having the attribute value $t$ increases our belief that his/her parent has the same attribute value. On the contrary, when the inheritance strength is weaker than the mutation strength, parents and children are more likely to have opposite attribute values, and the evidence of a child having an attribute value $t$ reduces our belief that his/her parent has the same attribute value. Inspired by this observation, we derive a theorem to quantify the causal effects between friends' attribute values.

Theorem: Given a social network with a chain topology, let $Z_0$ be the target node and $Z_n$ be $Z_0$'s descendant $n$ hops away. Assuming the attribute value of $Z_n$ is the only evidence observed in this chain, and the prior probability $P(t)$ satisfies $0 < P(t) < 1$, we have $P(Z_0 = t \mid Z_n = t) > P(Z_0 = t)$ iff $(P(t \mid t) - P(t \mid f))^n > 0$, and $P(Z_0 = t \mid Z_n = f) > P(Z_0 = t)$ iff $(P(t \mid t) - P(t \mid f))^n < 0$, where $P(t \mid t)$ and $P(t \mid f)$ are the inheritance strength and mutation strength of the network, respectively.

Proof: see Appendix.

This theorem states that when $P(t \mid t) > P(t \mid f)$, the posterior probability $P(Z_0 = t \mid Z_n = t)$ is greater than the prior probability $P(Z_0 = t)$. Thus the evidence of $Z_n = t$ increases our prediction for $Z_0 = t$. On the other hand, when $P(t \mid t) < P(t \mid f)$, whether $P(Z_0 = t \mid Z_n = t)$ is greater than $P(Z_0 = t)$ or not also depends on the value of $n$, i.e., the depth of $Z_n$. When $n$ is even, the evidence that $Z_n = t$ will increase our prediction for $Z_0 = t$. However, when $n$ is odd, the evidence that $Z_n = t$ will decrease our prediction for $Z_0 = t$.

14.5.2 A Privacy Protection Rule

Based on the above theorem, we propose a privacy protection rule as follows. Assume the protection goal is to reduce others' belief that the target node $Z_0$ has the attribute value $t$. We alter the nodes in the social network with attribute value $t$ when $P(t \mid t) > P(t \mid f)$. The alteration could be: (1) hide or falsify the attribute values of friends who satisfy the above conditions, or (2) hide relationships to friends who satisfy the above conditions, or add fraudulent relationships to friends who do not.

On the other hand, when $P(t \mid t) < P(t \mid f)$, we alter nodes with attribute value $t$ when that node is an even number of hops away from the target node; otherwise, we alter nodes with attribute value $f$. To mislead people into believing the target node possesses an attribute value, we can apply these techniques in the opposite way.

Based on the protection rule, we propose the following four protection schemes:

Selectively hiding attribute value (SHA). SHA hides the attribute values of appropriate friends.
Selectively falsifying attribute value (SFA). SFA falsifies the attribute values of appropriate friends.
Selectively hiding relationships (SHR). SHR hides the relationship between the target node and selected direct friends. When all the friend relationships of this individual are hidden, the individual becomes a singleton, and the prediction will be made based on the prior probability.
Selectively adding relationships (SAR). Contrary to hiding relationships in SHR, based on the protection rule, SAR selectively adds fraudulent relationships with people whose attribute values could cause incorrect inference on the target node.

14.5.3 Performance of Privacy Protection

In this section, we conduct a set of controlled experiments to evaluate different schemes for privacy protection. To provide privacy protection for an individual's attribute value (target node), we incrementally alter this individual's social network until the attribute value predicted by inference changes and becomes contrary to its original value. The protection is considered a failure if it fails to change the attribute value prediction and no further alteration can be made.

We use randomly hiding attribute value (RHA) as a baseline to evaluate the performance of the proposed protection schemes. RHA randomly selects a friend in the individual's social network and hides his/her attribute value without following the protection rule. We repeatedly perform such operations with the individual's direct friends. If protection fails after we hide all the direct friends' attribute values, we proceed to hide attribute values of indirect friends (e.g., at two hops away from this individual), and so on.

We have performed simulation experiments on 3 individuals (nodes) in the Livejournal data set. For each node, we apply the above protection schemes and compare their performance. The two metrics used are: the percentage of individuals whose attribute values are successfully protected, and the average number of alterations needed to reach such protection. Fig. 14.9 displays the percentage of successful protection for different inheritance strengths, with the remaining parameter fixed at 0.3. We note that the effectiveness of the selected schemes follows the order: SAR > SFA > SHR > SHA > RHA.
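The protection rule and the incremental alteration procedure described above can be illustrated with a small sketch. It is not the authors' experimental code. It assumes a two-level tree (direct friends only), hides friend relationships only (SHR-style versus random hiding), and assumes that the posterior of the target node takes the usual naive-Bayes form. The function names `posterior`, `value_to_alter` and `alterations_needed` are hypothetical.

```python
import random


def posterior(prior, p_tt, p_tf, n_t, n_f):
    """P(Z0 = t | n_t friends show t, n_f friends show f).

    Assumed naive-Bayes form; p_tt is the inheritance strength P(t|t),
    p_tf is the mutation strength P(t|f).
    """
    like_t = prior * p_tt ** n_t * (1 - p_tt) ** n_f
    like_f = (1 - prior) * p_tf ** n_t * (1 - p_tf) ** n_f
    return like_t / (like_t + like_f)


def value_to_alter(p_tt, p_tf, hops):
    """Protection rule: which friend value supports the belief Z0 = t."""
    if p_tt > p_tf or hops % 2 == 0:
        return "t"
    return "f"


def alterations_needed(friends, prior, p_tt, p_tf, selective, rng=None):
    """Hide direct friends one by one until the prediction for Z0 is no longer t.

    Returns the number of hidden friends, or None when protection fails
    (no candidate is left and the prediction is still t).
    """
    rng = rng or random.Random(0)
    visible = list(friends)
    target = value_to_alter(p_tt, p_tf, hops=1)  # direct friends are one hop away
    hidden = 0
    while posterior(prior, p_tt, p_tf, visible.count("t"), visible.count("f")) >= 0.5:
        candidates = [v for v in visible if v == target] if selective else visible
        if not candidates:
            return None
        visible.remove(rng.choice(candidates))
        hidden += 1
    return hidden
```

With three friends showing t and one showing f, a prior of 0.3, P(t|t) = 0.8 and P(t|f) = 0.2 (illustrative values only), `alterations_needed(["t", "t", "t", "f"], 0.3, 0.8, 0.2, selective=True)` hides friends with value t until the posterior drops below 0.5. Passing `selective=False` gives the random baseline, which may waste alterations on friends whose evidence already works against t.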
We shall now discuss the behavior of these schemes to explain and support our experimental findings. For RHA, SHA, SHR and SFA, the maximum number of alterations is the number of descendants (e.g., for RHA and SHA) and the number of direct friends (e.g., for SFA and SHR) of the target node. Since SAR can add new friend relationships and supports the highest number of alterations on the social network, SAR provides more privacy protection to the target node. The performance of SFA and SHR follows SAR. We can view SFA as a combination of SHR and SAR, i.e., hiding a friend relationship followed by adding a fraudulent relationship. Therefore, the performance of SFA is better than that of SHR. SHA does not perform as well as SFA and SHR because friends at multiple hops away still leave clues for privacy inference. Finally, RHA does not follow the protection rule to take advantage of the properties of the social network, so it yields the worst performance.

Fig. 14.9. Performance comparison of selected schemes based on the percentage of successful protection (x-axis: inheritance strength; remaining parameter fixed at 0.3)

Fig. 14.10. Number of alterations required to successfully protect the attribute value of a target node (x-axis: inheritance strength; remaining parameter fixed at 0.3)

Fig. 14.10 presents the performance based on the average number of alterations required to successfully protect the attribute value of a target node. We noted that RHA has the worst and SFA has the best performance among the proposed schemes. The average numbers of alterations of SHR and SAR are comparable in most cases. Note that the average number of required changes for SAR is higher than that of SHR at an inheritance strength of 0.2. This is because, in the low inheritance strength region, SAR can protect many cases that other schemes cannot protect by adding a large number of fraudulent friend relationships. Finally, SHA performs better than RHA but not as well as the other schemes.

Figs. 14.9 and 14.10 reveal the effectiveness of using the protection rule for deriving privacy protection schemes. Furthermore, SFA can provide successful protection for most cases, yet does not require an excessive number of alterations to the original social network.

14.6 Analysis of RHR and SHR

In the previous section, we demonstrated that selective social network alterations based on the protection rule are more effective than the method that does not follow the protection rule. We shall now use analysis to compare the difference between randomly hiding friend relationships (RHR) and selectively hiding friend relationships (SHR). Specifically, we use the frequency of posterior probability variation after hiding friend relationships as a metric. A hiding scheme that has a high frequency of posterior probability variation is considered more effective in privacy protection than one with a low frequency.

14.6.1 Randomly Hiding Friend Relationships (RHR)

Hiding friend relationships means removing direct friends of the target node. The social network can be represented as a two-level tree with the target node $Z_0$ as the root and the $n$ direct friends $Z_{10}, \ldots, Z_{1(n-1)}$ as leaves. We want to derive the probability distribution of the posterior probability variation due to randomly hiding friend relationships, i.e., the difference between the posterior probabilities before and after hiding friend relationships, and the corresponding probability of this occurrence.

Let random variables $N_t$ and $N'_t$ be the number of friends having attribute value $t$ before and after hiding $h$ friend relationships, where $0 \le h \le n$ and $\max(0, N_t - h) \le N'_t \le \min(N_t, n - h)$. If $N_t = n_t$ and $N'_t = n'_t$, we can compute the posterior probabilities $P(Z_0 = t \mid N_t = n_t)$ and $P(Z_0 = t \mid N'_t = n'_t)$ from Eq. 14.4, respectively. Therefore, the posterior probability variation caused by hiding $h$ friend relationships is (Eq. 14.5):

$$\Delta P(Z_0 = t \mid N_t = n_t, N'_t = n'_t) = \left| P(Z_0 = t \mid N'_t = n'_t) - P(Z_0 = t \mid N_t = n_t) \right|. \qquad (14.5)$$

Now we want to derive the probability that each possible value of $\Delta P$ occurs. In other words, we want to compute the probability of the joint event $N_t = n_t$ and $N'_t = n'_t$ before and after hiding friend relationships, which is equal to (Eq. 14.6):

$$P(N_t = n_t, N'_t = n'_t) = P(N'_t = n'_t \mid N_t = n_t)\, P(N_t = n_t). \qquad (14.6)$$

Initially, if we know $Z_0$'s attribute value is $z \in \{t, f\}$, the probability that $N_t = n_t$ follows the Binomial distribution (Eq. 14.7):

$$P(N_t = n_t \mid Z_0 = z) = \binom{n}{n_t}\, P(t \mid z)^{n_t}\, \bigl(1 - P(t \mid z)\bigr)^{n - n_t}. \qquad (14.7)$$

By un-conditioning on $Z_0$, we obtain (Eq. 14.8):

$$P(N_t = n_t) = \sum_{z \in \{t, f\}} P(N_t = n_t \mid Z_0 = z)\, P(Z_0 = z). \qquad (14.8)$$

We define $h_t$ and $h_f$ as the numbers of removed nodes (i.e., hidden friend relationships) with attribute value $t$ and $f$, respectively ($h_t = n_t - n'_t$ and $h_f = h - h_t$). Then we can compute the conditional probability that $N'_t = n'_t$ given $N_t = n_t$ as (Eq. 14.9):

$$P(N'_t = n'_t \mid N_t = n_t) = \frac{\binom{n_t}{h_t} \binom{n - n_t}{h_f}}{\binom{n}{h}}. \qquad (14.9)$$

In this equation, the numerator represents the number of ways to remove $h_t$ nodes with value $t$ and $h_f$ nodes with value $f$, and the denominator represents the number of combinations when choosing any $h$ nodes from a total of $n$ nodes. Substituting Eq. 14.8 and Eq. 14.9 into Eq. 14.6, we obtain $P(N_t = n_t, N'_t = n'_t)$.

14.6.2 Selectively Hiding Friend Relationships (SHR)

We perform a similar analysis for selectively hiding friend relationships in a two-level tree. Unlike random selection, which randomly selects nodes with attribute values $t$ or $f$, this method follows the protection rule and selects only nodes with the same attribute value to hide. Thus we can compute $\Delta P$ as in the previous section. However, the distribution of posterior probability variation needs to be computed differently. Given $h$, the number of nodes to remove is deterministic. For example, if $N_t = m$ and we remove nodes with attribute value $t$, then $N'_t = m - h$; otherwise $N'_t = m$. Consequently, in the former case (Eq. 14.10):

$$P(N_t = m, N'_t = n'_t) = \begin{cases} P(N_t = m) & \text{if } n'_t = m - h \\ 0 & \text{otherwise} \end{cases} \qquad (14.10)$$

whereas in the latter case (Eq. 14.11):

$$P(N_t = m, N'_t = n'_t) = \begin{cases} P(N_t = m) & \text{if } n'_t = m \\ 0 & \text{otherwise} \end{cases} \qquad (14.11)$$

where $P(N_t = m)$ can be obtained from Eq. 14.8.

14.6.3 Randomly vs. Selectively Hiding Friend Relationships

We first compute the average variation in the posterior probability of both RHR and SHR, as shown in Fig. 14.11. We fix $n$ to be ten and vary $h$ from one to nine. The x-axis is the number of friends that we hide, and the y-axis is the posterior probability variation based on Eq. 14.5. Clearly, SHR has higher posterior probability variation than RHR, especially for the case of a large number of hidden friends.

We also plot the histogram of the posterior probability variation $\Delta P$. We divide the range of posterior probability variation into ten equal-width intervals. Then we compute the probability that the posterior probability variation falls in each interval.
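The analysis of Sections 14.6.1 and 14.6.2 can be sketched numerically as follows. This is a minimal illustration rather than the authors' code, under several stated assumptions: the posterior of the target node (Eq. 14.4, not reproduced in this section) takes the usual naive-Bayes form; the variation is measured as an absolute difference; the SHR average is restricted to the cases in which at least h friends with value t exist; and the mutation strength in the usage example is derived by assuming that every node shares the same prior. All function names are hypothetical.

```python
from math import comb


def posterior(prior, p_tt, p_tf, n_t, n_f):
    """Assumed naive-Bayes posterior P(Z0 = t | n_t friends with t, n_f with f)."""
    like_t = prior * p_tt ** n_t * (1 - p_tt) ** n_f
    like_f = (1 - prior) * p_tf ** n_t * (1 - p_tf) ** n_f
    return like_t / (like_t + like_f)


def prob_n_t(n, n_t, prior, p_tt, p_tf):
    """P(N_t = n_t): binomial given Z0 (Eq. 14.7), un-conditioned on Z0 (Eq. 14.8)."""
    def binom(p):
        return comb(n, n_t) * p ** n_t * (1 - p) ** (n - n_t)
    return prior * binom(p_tt) + (1 - prior) * binom(p_tf)


def rhr_joint(n, h, n_t, n_t_after, prior, p_tt, p_tf):
    """P(N_t = n_t, N'_t = n_t_after) under random hiding (Eqs. 14.6 and 14.9)."""
    h_t = n_t - n_t_after          # removed friends with value t
    h_f = h - h_t                  # removed friends with value f
    if h_t < 0 or h_f < 0:
        return 0.0
    hyper = comb(n_t, h_t) * comb(n - n_t, h_f) / comb(n, h)
    return hyper * prob_n_t(n, n_t, prior, p_tt, p_tf)


def variation(n, h, n_t, n_t_after, prior, p_tt, p_tf):
    """Posterior probability variation (Eq. 14.5, absolute value assumed)."""
    before = posterior(prior, p_tt, p_tf, n_t, n - n_t)
    after = posterior(prior, p_tt, p_tf, n_t_after, n - h - n_t_after)
    return abs(after - before)


def mean_variation_rhr(n, h, prior, p_tt, p_tf):
    """Average variation when h friends are hidden at random."""
    return sum(
        rhr_joint(n, h, a, b, prior, p_tt, p_tf) * variation(n, h, a, b, prior, p_tt, p_tf)
        for a in range(n + 1)
        for b in range(n - h + 1)
    )


def mean_variation_shr(n, h, prior, p_tt, p_tf):
    """Average variation when h friends with value t are hidden (Eq. 14.10),
    normalised over the cases with N_t >= h (an assumption of this sketch)."""
    weights = {m: prob_n_t(n, m, prior, p_tt, p_tf) for m in range(h, n + 1)}
    total = sum(weights.values())
    return sum(w * variation(n, h, m, m - h, prior, p_tt, p_tf)
               for m, w in weights.items()) / total


if __name__ == "__main__":
    prior, p_tt = 0.3, 0.7
    p_tf = prior * (1 - p_tt) / (1 - prior)  # assumption: all nodes share the same prior
    for h in (2, 5, 8):
        print(h, mean_variation_rhr(10, h, prior, p_tt, p_tf),
              mean_variation_shr(10, h, prior, p_tt, p_tf))
```

Binning the values returned by `variation` into ten equal-width intervals, weighted by the corresponding joint probabilities, would give the histogram described in the text.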

Fig. 14.12 shows the histogram of the posterior probability variation for RHR and SHR when the prior probability is 0.3 and the influence strength is 0.7. The x-axis represents the intervals, and the y-axis represents the frequency of the posterior probability variation within the interval. The frequency is derived from Eq. 14.6 for RHR and from Eqs. 14.10 and 14.11 for SHR. For SHR, we remove friends with attribute value $t$. The maximum number of removed friends $k$ (denoted $h$ above) cannot exceed $N_t$. As a result, we do not consider the cases when $N_t < k$, and we normalize the frequency results for selectively hiding friends based on the overall probability that $N_t \ge k$.

Fig. 14.11. Average posterior probability variation for selectively and randomly hiding friend relationships (x-axis: number of hidden nodes; y-axis: posterior probability variation)

For RHR, we observe that the variation is less than 0.1 for 70% to 90% of the cases in Fig. 14.12a. Thus the posterior probability is unlikely to be varied greatly. In contrast, the posterior probability variation in Fig. 14.12b is widely distributed, which means there are noticeable changes in the posterior probability after hiding nodes selectively. This trend is more pronounced when increasing the number of removed friends. For example, when $h = 8$, the frequency of the variation lying between 0.9 and 1.0 is about 28.8%, as compared to .9% in Fig. 14.12a. These results show the effectiveness of using the protection rule for privacy protection.

Fig. 14.12. Frequency of posterior probability variation for (a) randomly hiding friend relationships and (b) selectively hiding friend relationships (curves for h = 2, 4, 6, 8)

14.7 Related Work

Social network analysis has been widely used in many areas, such as sociology, geography, psychology and information science. It primarily focuses on the study of social structures and social network modeling. For instance, Milgram's classic paper [16] in 1967 estimates that every person in the world is only six hops away from one another. The recent success of the Google search engine [3] applies social network ideas to the Internet. In [17], Newman reviews the relationship between graph structures and the dynamic behavior of large networks. The Referral Web project mined social networks from a wide variety of publicly available information [11].

In sociology, social networks are often modeled as an autocorrelation model [5]. In this model, individuals' opinions or behaviors are influenced not only by those of others but also by various other constraints in social networks. It uses a weight matrix to represent people's interactions; however, it is still not very clear how to choose the weight matrix. Leenders suggested building the weight matrix based on network structure information, like node degrees [13]. Our work, on the other hand, models interpersonal relations using conditional probabilities; this depends on both structure information and personal attributes. Furthermore, Domingos and Richardson think that an individual's decision to buy a product is influenced by his friends, and they propose to model social networks as Markov random fields [4]. Because the social networks that they studied are built from a collaborative filtering database, each person is always connected to a fixed number of people who are most similar to him, which in turn forms a structure of stars with a regular degree. In contrast, we collect social networks directly from real online social network service providers. The number of friends of each individual varies. For reasons of scalability and computational cost, we model social networks with Bayesian networks.

In terms of privacy protection, a great deal of effort has been devoted to developing cryptography and security protocols to provide secure data transfer [1, 2]. Additionally, there are also models that have been developed for preserving individual anonymity in data publishing. Sweeney proposes a K-anonymity model which imposes constraints wherein the released information for each person cannot be re-identified from a group smaller than k [18]. In our study, we assume that all the personal information released by the owners can be obtained by anyone in the social network. Under this assumption, we propose techniques to prevent malicious users from inferring private information from social networks.

14.8 Conclusion

We have focused this study on the impact of social relations on privacy disclosure and protection. The causal relations among friends in social networks can be effectively modeled by a Bayesian network, and personal attribute values can be inferred via their social relations. The inference accuracy increases as the influence strength increases between friends. Experimental results with real data from Epinions.com validate our findings that Bayesian inference provides higher inference accuracies than naïve inference.

Based on the interaction between inheritance strength and mutation strength and the network structure, a protection rule is developed to provide guidance via selective network alterations (social relations and/or attribute values) to provide privacy protection. Experimental results show that alterations based on the protection rule are far more effective than random alterations. Because large variations of alterations can be provided by falsifying attribute values, this yields the most effective privacy protection among all the proposed methods.

For future study, we plan to investigate the use of multiple attributes to improve inference and protection. For example, diet and life style can reduce the risk of heart disease. Such multi-attribute semantic relationships can be obtained via domain experts or data mining. We can exploit this information to cluster target interests for inference.

References

1. Abadi, M., Needham, R.: Prudent Engineering Practice for Cryptographic Protocols. Transactions on Software Engineering 22, 6–15 (1995)
2. Bellovin, S.M., Merritt, M.: Encrypted Key Exchange: Password-based Protocols Secure Against Dictionary Attacks. In: IEEE Symposium on Security and Privacy, Oakland, California, May 1992, pp. 72–84 (1992)
3. Brin, S., Page, L.: The Anatomy of a Large-Scale Hypertextual Web Search Engine. In: Proceedings of the Seventh International World Wide Web Conference (1998)
4. Domingos, P., Richardson, M.: Mining the Network Value of Customers. In: Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining (2001)
5. Doreian, P.: Models of Network Effects on Social Actors. In: Freeman, L.C., White, D.R., Romney, K. (eds.) Research Methods in Social Analysis, pp. 295–317. George Mason University Press, Fairfax (1989)
6. Epinions (1999), http://www.epinions.com
7. Friedman, N., Getoor, L., Koller, D., Pfeffer, A.: Learning Probabilistic Relational Models. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI), Stockholm, Sweden (1999)
8. He, J., Chu, W.W., Liu, Z.: Inferring Privacy Information from Social Networks. In: Mehrotra, S., Zeng, D.D., Chen, H., Thuraisingham, B., Wang, F.-Y. (eds.) ISI 2006. LNCS, vol. 3975. Springer, Heidelberg (2006)
9. Heckerman, D.: A Tutorial on Learning Bayesian Networks. Technical Report MSR-TR-95-06 (1995)
10. Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. In: KDD Workshop, pp. 85–96 (1994)
11. Kautz, H., Selman, B., Shah, M.: Referral Web: Combining Social Networks and Collaborative Filtering. Communications of the ACM 40(3), 63–65 (1997)
12. Livejournal (1997), http://www.livejournal.com
13. Leenders, R.T.: Modeling Social Influence Through Network Autocorrelation: Constructing the Weight Matrix. Social Networks 24, 21–47 (2002)
14. Lowd, D., Domingos, P.: Naive Bayes Models for Probability Estimation. In: Proceedings of the Twenty-Second International Conference on Machine Learning (ICML). ACM Press, Bonn (2005)
15. MacQueen, J.B.: Some Methods for Classification and Analysis of Multivariate Observations. In: Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)
16. Milgram, S.: The Small World Problem. Psychology Today (1967)
17. Newman, M.E.: The Structure and Function of Complex Networks. SIAM Review 45(2), 167–256 (2003)
18. Sweeney, L.: K-Anonymity: A Model for Protecting Privacy. International Journal on Uncertainty, Fuzziness and Knowledge-Based Systems 10(5) (2002)
19. U.D. of Health and O. for Civil Rights Human Services: Standards for Privacy of Individually Identifiable Health Information (2003), http://www.hhs.gov/ocr/combinedregtext.pdf
20. Watts, D.J., Strogatz, S.H.: Collective Dynamics of Small-World Networks. Nature (1998)
21. W.W.W.C. (W3C): The Platform for Privacy Preferences (2004), http://www.w3.org/TR/P3P/
22. Zhang, N.L., Poole, D.: Exploiting Causal Independence in Bayesian Network Inference. Journal of Artificial Intelligence Research 5, 301–328 (1996)

Questions for Discussions

1. What are the reasons that the Bayesian network is suitable for modelling social networks for data inference?
2. What are the challenges in using Bayesian networks to model social networks?
3. Why can social networks improve the accuracy of information inference?
4. How does the privacy protection rule protect private attributes in social networks?
5. How can Bayesian inference accuracy be improved using multiple personal attributes?

Appendix

Theorem: Causal Effect Between Friends' Attribute Values in a Chain Network

Given a chain topology, let $Z_0$ be the target node and $Z_n$ be $Z_0$'s descendant $n$ hops away. Assuming that the attribute value of $Z_n$ is the only evidence observed in this chain, and the prior probability $P(t)$ satisfies $0 < P(t) < 1$, we have $P(Z_0 = t \mid Z_n = t) > P(Z_0 = t)$ iff $(P(t \mid t) - P(t \mid f))^n > 0$, and $P(Z_0 = t \mid Z_n = f) > P(Z_0 = t)$ iff $(P(t \mid t) - P(t \mid f))^n < 0$, where $P(t \mid t)$ and $P(t \mid f)$ are the inheritance strength and mutation strength of the network, respectively.

Proof: Let us consider a chain topology, shown in Fig. 14.13. The target node $Z_0$ is the root node, and each descendant except the last one has exactly one child. Consider the simplest example when $n = 1$, i.e., the target node $Z_0$ has only one direct child $Z_1$, as shown in Fig. 14.13a. In this example, the attribute value of $Z_1$ is known.

Fig. 14.13. The chain network structure: (a) the target node with one descendant; (b) the target node with n descendants

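Assuming $Z_1 = t$, from Eq. 14.1 the posterior probability is:

$$P(Z_0 = t \mid Z_1 = t) = \frac{P(t)\, P(t \mid t)}{P(t)\, P(t \mid t) + (1 - P(t))\, P(t \mid f)}. \qquad (14.12)$$

Thus

$$P(Z_0 = t \mid Z_1 = t) > P(t) \;\Leftrightarrow\; P(t \mid t) > P(t)\, P(t \mid t) + (1 - P(t))\, P(t \mid f) \;\Leftrightarrow\; P(t \mid t) - P(t \mid f) > 0. \qquad (14.13)$$

Similarly, when $Z_1 = f$, we can prove $P(Z_0 = t \mid Z_1 = f) > P(t)$ iff $P(t \mid t) - P(t \mid f) < 0$ for $n = 1$.

Now we extend this example to show how the attribute value of a node at depth $n$ affects the prediction for $Z_0$. In Fig. 14.13b, we show a network of $n + 1$ nodes. In this figure, only $Z_0$'s descendant at depth $n$, $Z_n$, has a known value. Fig. 14.14 shows the corresponding conditional probability table for these nodes.

Fig. 14.14. Conditional probability table for nodes in Fig. 14.13b

Let $\alpha_i$ and $\beta_i$ be the distributions of $Z_i$ given $Z_0 = t$ and $Z_0 = f$:

$$\alpha_i = P(Z_i = t \mid Z_0 = t), \qquad \beta_i = P(Z_i = t \mid Z_0 = f), \qquad \text{for } 1 \le i \le n. \qquad (14.14)$$

For example, $\alpha_1 = P(t \mid t)$, $\beta_1 = P(t \mid f)$, and so on. We know

$$P(Z_i = f \mid Z_0 = t) = 1 - \alpha_i, \qquad P(Z_i = f \mid Z_0 = f) = 1 - \beta_i. \qquad (14.15)$$

Further, from Fig. 14.14, we have the following relations:

$$\alpha_i = \alpha_{i-1}\, P(t \mid t) + P(Z_{i-1} = f \mid Z_0 = t)\, P(t \mid f), \qquad \beta_i = \beta_{i-1}\, P(t \mid t) + P(Z_{i-1} = f \mid Z_0 = f)\, P(t \mid f). \qquad (14.16)$$

When $Z_n = t$, the posterior probability is:

$$P(Z_0 = t \mid Z_n = t) = \frac{P(t)\, \alpha_n}{P(t)\, \alpha_n + (1 - P(t))\, \beta_n}. \qquad (14.17)$$

Therefore,

$$P(Z_0 = t \mid Z_n = t) > P(t) \;\Leftrightarrow\; \alpha_n > P(t)\, \alpha_n + (1 - P(t))\, \beta_n \;\Leftrightarrow\; \alpha_n - \beta_n > 0. \qquad (14.18)$$

Based on Eq. 14.16,

$$\alpha_n - \beta_n = \bigl[\alpha_{n-1}\, P(t \mid t) + P(Z_{n-1} = f \mid Z_0 = t)\, P(t \mid f)\bigr] - \bigl[\beta_{n-1}\, P(t \mid t) + P(Z_{n-1} = f \mid Z_0 = f)\, P(t \mid f)\bigr]. \qquad (14.19)$$

Substituting Eq. 14.15 into Eq. 14.19, we have

$$\alpha_n - \beta_n = \bigl(P(t \mid t) - P(t \mid f)\bigr)\, \bigl[\alpha_{n-1} - \beta_{n-1}\bigr]. \qquad (14.20)$$

Recursively, we have

$$\alpha_n - \beta_n = \bigl(P(t \mid t) - P(t \mid f)\bigr)^{n-1}\, \bigl[\alpha_1 - \beta_1\bigr]. \qquad (14.21)$$

Since $\alpha_1 = P(t \mid t)$ and $\beta_1 = P(t \mid f)$, so that $\alpha_1 - \beta_1 = P(t \mid t) - P(t \mid f)$, we obtain

$$\alpha_n - \beta_n = \bigl[P(t \mid t) - P(t \mid f)\bigr]^{n}. \qquad (14.22)$$

Combining Eq. 14.18 and Eq. 14.22, $P(Z_0 = t \mid Z_n = t) > P(Z_0 = t)$ is equivalent to $(P(t \mid t) - P(t \mid f))^n > 0$ when $0 < P(t) < 1$. Similarly, we can show that $P(Z_0 = t \mid Z_n = f) > P(Z_0 = t)$ is equivalent to $(P(t \mid t) - P(t \mid f))^n < 0$.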
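As a numerical sanity check of the theorem, the following sketch propagates the inheritance strength P(t|t) and mutation strength P(t|f) down a chain and compares the posterior of the target node with its prior. It is an illustration under the chain assumptions stated in the theorem (only the node n hops away is observed), not part of the original proof. The names `chain_posterior` and `theorem_holds` are hypothetical.

```python
def chain_posterior(prior, p_tt, p_tf, n, evidence="t"):
    """P(Z0 = t | Zn = evidence) for a chain in which only Zn is observed."""
    # a = P(Zi = t | Z0 = t), b = P(Zi = t | Z0 = f), starting at depth i = 1
    a, b = p_tt, p_tf
    for _ in range(n - 1):
        a = a * p_tt + (1 - a) * p_tf
        b = b * p_tt + (1 - b) * p_tf
    if evidence == "f":
        a, b = 1 - a, 1 - b
    return prior * a / (prior * a + (1 - prior) * b)


def theorem_holds(prior, p_tt, p_tf, n):
    """Compare the direction of the belief change with the sign of (P(t|t) - P(t|f))^n."""
    sign = (p_tt - p_tf) ** n
    raised_by_t = chain_posterior(prior, p_tt, p_tf, n, "t") > prior
    raised_by_f = chain_posterior(prior, p_tt, p_tf, n, "f") > prior
    return raised_by_t == (sign > 0) and raised_by_f == (sign < 0)
```

For instance, with a prior of 0.3, P(t|t) = 0.2 and P(t|f) = 0.8 (illustrative values), evidence of t one hop away lowers the posterior below the prior, while the same evidence two hops away raises it, which matches the even/odd behavior described in Section 14.5.1.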