Computer Science
S 45 Introduction to Computer Security
Topic 6.: Database Inference Control

Outline
  Inference attacks
    Direct attacks (no inference needed)
    Indirect attacks via aggregations
    Tracker attacks
    Inference via linear systems
    Inference via database constraints
  Inference control
    Limited Response Suppression
    Combining results
    Random sample
    Random data perturbation
    Query analysis

Direct Attacks
  [Table: student records with columns Name, Sex, Race, Aid, Fines, Drugs, Dorm; rows for Adams, Bailey, Chin, Dewitt, Earhart, Fein, Groff, Hill, Koch, Liu, Majors; cell values not recoverable]
  Query
    List NAME where SEX= ^ DRUGS=
  Results:

Direct Attacks (Cont'd)
  Query
    List NAME where (SEX= ^ DRUGS=) v (SEX!= ^ SEX!=) v (DORM=AYRES)
  Result=

Direct Attacks (Cont'd)
  Protect against direct attacks
    "n items over k percent" rule
      Data should be withheld if n items represent over k% of the result reported.
      Adopted by the U.S. Census Bureau
      Intuition: do not reveal results where a small number of records make up a large proportion of the category.
    Release only statistics
      Examples: sum, average, count, etc.

Indirect Attacks via Aggregations
  [Table: Sums of Financial Aid by Dorm and Sex, with row and column totals; cell values not recoverable]
  [Table: Female Students Living in [dorm name lost]; Name: Liu]
  Try to infer a sensitive value from a reported sum.
  What can we infer for the female students living in [dorm name lost]?
    Liu's financial aid =
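
The "n items over k percent" rule stated earlier for direct attacks can be sketched as a small release check. This is a minimal illustration, not the Census Bureau's actual procedure: it assumes "n items" means the n largest contributions, that all values are non-negative, and that a withheld result is reported as None. The function names, default parameters, and data are hypothetical.

```python
def dominates(values, n, k_percent):
    """Return True if the n largest values make up more than k% of the total."""
    total = sum(values)
    if total == 0:
        return False  # assumption: an all-zero cell is not treated as dominated
    top_n = sum(sorted(values, reverse=True)[:n])
    return top_n * 100 > k_percent * total


def release_sum(values, n=1, k_percent=80):
    """Report the sum, or None (withheld) when the rule is violated."""
    if dominates(values, n, k_percent):
        return None
    return sum(values)


# Hypothetical cells: in the first, one record accounts for 9000 of 9500.
concentrated = [9000, 300, 200]
spread = [2500, 2000, 3000, 2500]

print(release_sum(concentrated))  # None (withheld)
print(release_sum(spread))        # 10000
```

The choice of n and k is a policy decision; a stricter k withholds more cells and makes the released statistics less useful.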

Indirect Attacks via Aggregations (Cont'd)
  [Table: Count of Financial Aid by Dorm and Sex, with row and column totals; cell values not recoverable]
  [Table: Male Students Living in [dorm names lost]; Names: Adams, Groff]
  With additional counts, what can we further infer?
    Adams's financial aid =
    Groff's financial aid =

Tracker Attacks
  DBMS protection
    Allow aggregation of sensitive attributes only when the number of data items that constitute the aggregate is more than a threshold t.
  Trackers defeat this protection by using additional queries.

Tracker Attacks (Cont'd)
  [Table: student records, as on the first Direct Attacks slide]
  Query
    Sum((Sex=) ^ (Race=) ^ (Dorm=))
    Is this allowed?

Tracker Attacks (Cont'd)
  sum(a ^ b ^ c) = sum(a) - sum(a ^ ~(b ^ c))
  sum((Sex=) ^ (Race=) ^ (Dorm=)) is equivalent to
    sum(Sex=) - sum((Sex=) ^ (Race!= v Dorm!=))

Tracker Attacks (Cont'd)
  [Table: student records, as on the first Direct Attacks slide]
  Count((Sex=) ^ (Race=) ^ (Dorm=)) =

Tracker Attacks (Cont'd)
  q(C) is disallowed
  C = C1 ^ C2
  T = C1 ^ ~C2 (the tracker)
  q(C) = q(C1) - q(T)
    This holds for additive statistics such as count and sum, because the records matching C1 split into those matching C and those matching T.
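
The tracker identity q(C) = q(C1) - q(T), with C = C1 ^ C2 and T = C1 ^ ~C2, can be tried out on a toy database. The sketch below is illustrative only: the records, the attribute values ("X", "Y", "North", "South"), the threshold, and the names guarded_sum and QueryRefused are all hypothetical and are not taken from the slides' table.

```python
# Hypothetical records; not the student table from the slides.
RECORDS = [
    {"name": "P1", "sex": "F", "race": "X", "dorm": "North", "aid": 1000},
    {"name": "P2", "sex": "F", "race": "Y", "dorm": "North", "aid": 2000},
    {"name": "P3", "sex": "F", "race": "X", "dorm": "South", "aid": 3000},
    {"name": "P4", "sex": "F", "race": "Y", "dorm": "South", "aid": 0},
    {"name": "P5", "sex": "M", "race": "X", "dorm": "North", "aid": 4000},
    {"name": "P6", "sex": "M", "race": "Y", "dorm": "South", "aid": 500},
]
THRESHOLD = 1  # threshold t: a query must match more than t records


class QueryRefused(Exception):
    """Raised when a query matches too few records."""


def guarded_sum(predicate):
    """Sum of aid over matching records, refused for small query sets."""
    rows = [r for r in RECORDS if predicate(r)]
    if len(rows) <= THRESHOLD:
        raise QueryRefused(f"only {len(rows)} matching record(s)")
    return sum(r["aid"] for r in rows)


def c1(r):
    return r["sex"] == "F"


def c2(r):
    return r["race"] == "X" and r["dorm"] == "North"


def target(r):   # C = C1 ^ C2, matches a single record
    return c1(r) and c2(r)


def tracker(r):  # T = C1 ^ ~C2
    return c1(r) and not c2(r)


try:
    guarded_sum(target)
    direct_refused = False
except QueryRefused:
    direct_refused = True

# Both queries below pass the threshold check, yet their difference
# is exactly the value the direct query was refused for.
inferred = guarded_sum(c1) - guarded_sum(tracker)
print(direct_refused, inferred)  # True 1000
```

Each of the two permitted queries covers at least three records, so a size threshold alone does not stop the attack.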

Inference via Linear Systems
  Generalization of the tracker attacks
  We can get a sequence of linear equations through a sequence of queries
    Variables: sensitive values
      Q1 = c1 + c2 + c3 + c4 + c5
      Q2 = c1 + c2 + c4
      Q3 = c3 + c4
      Q4 = c4 + c5
      Q5 = c2 + c5
    c5 = ((Q1 - Q2) - (Q3 - Q4)) / 2

Inference via Database Constraints
  Integrity constraints
  Database dependencies
  Key integrity
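
The linear-system attack can be checked numerically. The sketch below assumes the query sets Q1 = c1 + c2 + c3 + c4 + c5, Q2 = c1 + c2 + c4, Q3 = c3 + c4, and Q4 = c4 + c5, so that Q1 - Q2 = c3 + c5 and Q3 - Q4 = c3 - c5. The secret values and the helper name query are hypothetical.

```python
# Hypothetical sensitive values c1..c5, hidden from the attacker.
secret = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}


def query(indices):
    """Answer a sum query over the given set of record indices."""
    return sum(secret[i] for i in indices)


# The attacker only sees these aggregate answers.
Q1 = query([1, 2, 3, 4, 5])
Q2 = query([1, 2, 4])
Q3 = query([3, 4])
Q4 = query([4, 5])

# Q1 - Q2 = c3 + c5 and Q3 - Q4 = c3 - c5, so the difference is 2 * c5.
c5 = ((Q1 - Q2) - (Q3 - Q4)) / 2
print(c5)  # 50.0
```

No single query here covers fewer than two records, which is why a per-query size check does not prevent this kind of inference.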

Integrity Constraints
  C = A + B
  A = public, B = public, and C = secret
  C can be calculated from A and B, i.e., secret information can be calculated from public data

Database Dependencies
  Knowledge about the database could be used to make inferences
    Functional dependencies
    Multi-valued dependencies
    Join dependencies
    etc.

Functional Dependency
  FD: A -> B, that is, for any two tuples in the relation, if they have the same value for A, they must have the same value for B.
  Example
    FD: Rank -> Salary
    Secret information: Name and Salary together
      Query 1: Name and Rank
      Query 2: Rank and Salary
      Combine the answers for Query 1 and Query 2 to reveal Name and Salary together

Inference Controls
  Two ways
    Suppression
      Sensitive data values are not provided
      Query is rejected without response
    Concealing
      The answer provided is close to but not exactly the actual value.
  Both can be applied to either queries or individual items within the database.
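
The functional dependency example (Rank -> Salary) amounts to joining two individually harmless query answers on Rank. The following sketch shows the join; the names, ranks, and salaries are invented for illustration.

```python
# Hypothetical answers to the two permitted queries.
name_rank = [("Alice", "Professor"), ("Bob", "Lecturer"), ("Carol", "Professor")]
rank_salary = [("Professor", 90000), ("Lecturer", 60000)]

# Because Rank -> Salary holds, each rank maps to exactly one salary,
# so a dictionary lookup is enough to perform the join.
salary_by_rank = dict(rank_salary)
name_salary = {name: salary_by_rank[rank] for name, rank in name_rank}

print(name_salary)  # {'Alice': 90000, 'Bob': 60000, 'Carol': 90000}
```

Without the dependency, a rank could map to several salaries and the join would only narrow down the possibilities rather than reveal the exact pairing.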

Limited Response Suppression
  Suppression technique
    Eliminate low-frequency elements
    Does not always work.
  [Table: Students by Dorm and Sex, with row and column totals and low-frequency cells shown as "--"; remaining cell values not recoverable]
  What are the suppressed values?

Combining Results
  Suppression techniques
    Combine rows or columns to protect sensitive values.
    Present results in ranges
    Rounding
  [Tables: Students by Sex and Drug Use, before and after combining drug-use columns; cell values not recoverable]
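
The question posed on the Limited Response Suppression slide (what are the suppressed values?) can be explored with a short sketch: when row and column totals are published, any row or column with a single suppressed cell gives that cell away. The table below, the dorm labels D1 to D3, and the suppression threshold are hypothetical, and the recovery loop is one simple approach rather than a general solver.

```python
def suppress(table, threshold=1):
    """Replace counts at or below the threshold with None (shown as "--")."""
    return {r: {c: (v if v > threshold else None) for c, v in row.items()}
            for r, row in table.items()}


def recover(published, row_totals, col_totals):
    """Fill in suppressed cells wherever a row or column has one unknown."""
    table = {r: dict(row) for r, row in published.items()}
    changed = True
    while changed:
        changed = False
        for r, row in table.items():
            missing = [c for c, v in row.items() if v is None]
            if len(missing) == 1:
                known = sum(v for v in row.values() if v is not None)
                row[missing[0]] = row_totals[r] - known
                changed = True
        for c in col_totals:
            missing = [r for r in table if table[r][c] is None]
            if len(missing) == 1:
                known = sum(table[r][c] for r in table if table[r][c] is not None)
                table[missing[0]][c] = col_totals[c] - known
                changed = True
    return table


# Hypothetical counts of students by sex (rows) and dorm (columns).
counts = {"M": {"D1": 1, "D2": 3, "D3": 1},
          "F": {"D1": 2, "D2": 1, "D3": 3}}
row_totals = {r: sum(row.values()) for r, row in counts.items()}
col_totals = {c: sum(counts[r][c] for r in counts) for c in counts["M"]}

published = suppress(counts)
recovered = recover(published, row_totals, col_totals)
print(recovered == counts)  # True
```

In this toy table every suppressed cell is recovered, which is the sense in which suppressing low-frequency cells does not always work when totals are also released.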

Random Sample
  Concealing technique
    Use a random sample of the database to answer queries.
    The same sample set should be chosen for equivalent queries.
      Prevents averaging attacks

Random Data Perturbation
  Concealing technique
    Perturb the values of the database by a small error.
    Statistical measures such as sum and mean will be close.
    Easier than random sample.
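
Random data perturbation can be illustrated with bounded noise added once to the stored values. The sketch below is one possible reading of the technique: the uniform noise distribution, the error bound of 50, the fixed seed, and the data are assumptions made for the example.

```python
import random


def perturb(values, max_error, seed=0):
    """Return a copy of values, each shifted by noise in [-max_error, max_error]."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [v + rng.uniform(-max_error, max_error) for v in values]


# Hypothetical sensitive values.
true_values = [1200, 800, 1500, 950, 1100]
MAX_ERROR = 50

# Perturb once and store; queries are then answered from the stored copy.
stored = perturb(true_values, MAX_ERROR)

true_mean = sum(true_values) / len(true_values)
noisy_mean = sum(stored) / len(stored)

# Every value is off by at most MAX_ERROR, so the mean is too.
print(abs(noisy_mean - true_mean) <= MAX_ERROR)  # True
```

Because the noise is applied to the stored data rather than freshly to each answer, repeating the same query returns the same result, so averaging repeated answers does not cancel the error.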

Query Analysis
  Suppression technique
    Decide whether a result should be provided by analyzing queries and their implications.
    Need to maintain a query history
    Difficult to know what a user knows through out-of-band channels.

Methodologies of Inference Control
  Suppress obviously sensitive information
    Easy to do, but tends to be overly restrictive
  Track what the user knows
    Very expensive
      Query history
    Cannot deal with conspiracy
  Disguise data
    Sacrifices the quality of data
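
A query history can be used to refuse queries whose answer, combined with an earlier answer, would isolate a small group. The sketch below shows one simple check of this kind; it is an assumption about how such an analysis might look, not a method given in the slides. The class name QueryAuditor, the representation of a query as a set of record ids, and the threshold are hypothetical.

```python
class QueryAuditor:
    """Per-user history of answered query sets (sets of record ids)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.history = []  # frozensets of record ids already answered

    def allow(self, query_set):
        qs = frozenset(query_set)
        # Refuse query sets that are too small on their own.
        if len(qs) <= self.threshold:
            return False
        # Refuse if subtracting a past answer (or being subtracted from one)
        # would isolate a group no larger than the threshold.
        for past in self.history:
            if past < qs and len(qs - past) <= self.threshold:
                return False
            if qs < past and len(past - qs) <= self.threshold:
                return False
        self.history.append(qs)
        return True


auditor = QueryAuditor(threshold=1)
print(auditor.allow({1, 2, 3, 4}))  # True
print(auditor.allow({2, 3, 4}))     # False: differs from a past query by one record
print(auditor.allow({1}))           # False: too small
print(auditor.allow({5, 6}))        # True
```

This check only compares pairs of queries from one user. It would not catch inference that combines three or more queries, and two users pooling their answers would bypass it entirely, which matches the cost and conspiracy concerns listed above.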

Conclusions
  No general technique is available to solve the problem
  Need assurance of protection
  Hard to incorporate outside knowledge