DATA MINING - 1DL360
1 DATA MINING - 1DL360, Fall 2013
An introductory class in data mining
Kjell Orsborn, Uppsala Database Laboratory, Department of Information Technology, Uppsala University, Uppsala, Sweden
10/12/13
2 Introduction to Data Mining: Privacy in data mining (slides and selected papers)
Kjell Orsborn, Department of Information Technology, Uppsala University, Uppsala, Sweden
3 Privacy and security in data mining
- Protecting private data is an important concern for society.
  - Several laws now require explicit consent prior to analysis of an individual's data.
- However, its importance is not limited to individuals.
  - Corporations might also need to protect the privacy of their information, even though sharing it for analysis could benefit the company.
- Clearly, the trade-off between sharing information for analysis and keeping it secret to preserve corporate trade secrets and customer privacy is a growing challenge.
4 Techniques for privacy and security
- Most data mining applications operate under the assumption that all the data is available at a single central repository, called a data warehouse.
  - This poses a huge privacy problem, because breaching the security of that single repository exposes all the data.
- A naive solution to the problem is de-identification: remove all identifying information from the data and release it.
  - Pinpointing exactly what constitutes identifying information is difficult.
  - Worse, even if de-identification is possible and (legally) acceptable, it is extremely hard to do effectively without losing the data's utility.
  - Studies have used externally available public information to re-identify anonymous data and have shown that effectively anonymizing the data requires removing substantial detail.
- Another solution is to avoid centralized warehouses.
  - This requires specialized distributed data mining algorithms, e.g. secure multi-party computation.
  - Accurate methods have been shown for classification and association analysis.
- A third approach is data transformation and perturbation, i.e. modifying data so that it no longer represents real individuals.
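The re-identification risk mentioned on this slide can be illustrated with a small linking attack. The sketch below is only an illustration: the records, the attribute names (`zip`, `birth_year`, `sex`) and the helper `link` are all invented for this example and do not come from the slides.

```python
# Hypothetical toy data: every name and value below is invented for illustration.
deidentified = [
    {"zip": "75105", "birth_year": 1961, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "75105", "birth_year": 1984, "sex": "M", "diagnosis": "asthma"},
    {"zip": "75237", "birth_year": 1984, "sex": "M", "diagnosis": "flu"},
]
public_register = [
    {"name": "Alice", "zip": "75105", "birth_year": 1961, "sex": "F"},
    {"name": "Bob", "zip": "75237", "birth_year": 1984, "sex": "M"},
]

# Attributes that are not identifiers on their own but identify people in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link(released, public, keys=QUASI_IDENTIFIERS):
    """Map each public name to the released record it matches uniquely."""
    matches = {}
    for person in public:
        hits = [r for r in released if all(r[k] == person[k] for k in keys)]
        if len(hits) == 1:  # a unique match re-identifies the record
            matches[person["name"]] = hits[0]
    return matches


reidentified = link(deidentified, public_register)
for name, record in reidentified.items():
    print(name, "->", record["diagnosis"])
```

Removing the name column was not enough here, because the remaining attributes still single out each person when combined with a public list.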
5 Privacy-preserving techniques in data mining
- Most methods use some form of data transformation to preserve privacy.
- Typically, these methods reduce the granularity of the representation in order to reduce the risk of disclosure.
- Randomization techniques
  - Introduce noise
- Group-based anonymization, e.g. k-anonymity
  - Prohibits too detailed queries
- Distributed privacy preservation
  - Prohibits distribution of individual data while supporting aggregate results
- Downgrading application effectiveness
  - Results such as association rules and classifications might violate privacy and can be restricted by association rule hiding, classifier downgrading and query auditing
6 Privacy-preserving techniques in data mining
- Randomization techniques
  - Additive perturbation techniques: introduce noise, e.g. drawn from a statistical distribution
    - Can be attacked by analyzing the correlation structure of the randomized data
    - Can also be attacked by matching the distribution of the randomized data with the distribution of known public information
  - Multiplicative perturbation techniques
    - E.g. applying multidimensional projections to reduce the dimensionality of the data
  - Data swapping
    - Values for different records are swapped while correct aggregate values can still be computed
- The randomization approach is well suited for privacy preservation in data stream mining, since the noise added is independent of the rest of the data.
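A minimal sketch of additive perturbation and data swapping follows, using only the Python standard library. The salary data, the noise level `sigma` and the function names are assumptions made for this illustration, not part of the lecture material.

```python
import random
import statistics


def perturb(values, sigma, rng):
    """Additive perturbation: add independent zero-mean Gaussian noise to each value."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def swap_values(values, rng):
    """Data swapping: permute one attribute across records, keeping its multiset of values."""
    swapped = list(values)
    rng.shuffle(swapped)
    return swapped


rng = random.Random(0)  # fixed seed so the run is repeatable
salaries = [rng.uniform(20_000, 80_000) for _ in range(10_000)]  # invented data

noisy = perturb(salaries, sigma=5_000, rng=rng)
swapped = swap_values(salaries, rng)

# Individual values are distorted, but the aggregate stays close to the truth.
mean_error = abs(statistics.mean(noisy) - statistics.mean(salaries))
largest_change = max(abs(a - b) for a, b in zip(salaries, noisy))
print(f"mean shifted by {mean_error:.1f}, largest single change {largest_change:.1f}")
```

Because each noise term is drawn independently of the other records, the same `perturb` step can be applied to records one at a time as they arrive, which is the property the slide points to for data streams.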
7 Privacy-preserving techniques in data mining
- Group-based anonymization techniques
  - k-anonymity
    - Generalization and/or suppression of attributes to avoid identification of individual data
    - Each release of the data must be such that every combination of values of quasi-identifiers (indirect identifiers) can be indistinguishably matched to at least k respondents.
  - l-diversity
    - In addition to k-anonymity, focuses on maintaining the diversity of sensitive attributes
  - t-closeness model
    - A further enhancement to deal with e.g. skewed data sets
- Potential problems with sequential releases
  - Several releases of data might reveal more details
  - Linking successive releases must be prevented
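The definitions above can be checked mechanically on a small table. The following sketch assumes invented records and a hypothetical `generalize` step (truncating the ZIP code and bucketing age into decades); it measures k as the size of the smallest quasi-identifier group and l as the smallest number of distinct sensitive values within a group.

```python
from collections import Counter, defaultdict


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return min(groups.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any group."""
    values = defaultdict(set)
    for r in rows:
        values[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return min(len(v) for v in values.values())


def generalize(row):
    """Hypothetical generalization: keep a 3-digit ZIP prefix and a 10-year age band."""
    out = dict(row)
    out["zip"] = row["zip"][:3] + "**"
    decade = row["age"] // 10 * 10
    out["age"] = f"{decade}-{decade + 9}"
    return out


QI = ("zip", "age")
raw = [  # invented records
    {"zip": "75105", "age": 34, "disease": "flu"},
    {"zip": "75106", "age": 37, "disease": "asthma"},
    {"zip": "75237", "age": 52, "disease": "flu"},
    {"zip": "75239", "age": 58, "disease": "flu"},
]
released = [generalize(r) for r in raw]

print(k_anonymity(raw, QI))                   # every raw record is unique
print(k_anonymity(released, QI))              # groups of two after generalization
print(l_diversity(released, QI, "disease"))   # one group holds a single disease value
```

The released table is 2-anonymous, yet one group contains only a single disease value, so anyone known to be in that group has their diagnosis disclosed. This is the gap that l-diversity is meant to close.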
8 Privacy-preserving techniques in data mining
- Distributed privacy-preservation
  - Horizontal partitioning (see the example on the following slides)
  - Vertical partitioning (see the example on the following slides)
  - Distributed algorithms for aggregate operations (see the example on the following slides)
  - Distributed algorithms for k-anonymity
  - Semi-honest adversaries
  - Malicious adversaries
9 Distributed data mining
The way the data is distributed also plays an important role in defining the problem, because data can be partitioned into many parts either vertically or horizontally. Vertical partitioning of data implies that although different sites gather information about the same set of entities, they collect different feature sets. Banks, for example, collect financial transaction information, whereas the IRS collects tax information. Figure 2 illustrates vertical partitioning and the kind of useful knowledge we can extract from it. The figure describes two databases, one containing individual medical records and another containing cell-phone information for the same set of people. Mining the joint global database might reveal information such as "cell phones with Li/Ion batteries can lead to brain tumors in diabetics".
10 Distributed data mining
In horizontal partitioning, different sites collect the same set of information but about different entities. Different supermarkets, for example, collect the same type of grocery shopping data. Figure 3 illustrates horizontal partitioning and shows the credit-card databases of two different (local) credit unions. Taken together, we might see that fraudulent customers often have similar transaction histories. However, no credit union has sufficient data by itself to discover the patterns of fraudulent behavior.
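The two data layouts can be made concrete with a few invented records. The sketch below only shows how the global table relates to the local ones; it builds that table in one place, which is exactly what a privacy-preserving distributed algorithm would avoid. All site names, keys and values are assumptions for illustration.

```python
# Vertical partitioning: the same people (keys 1-3), different attributes per site.
medical_site = {
    1: {"diabetic": True, "tumor": True},
    2: {"diabetic": False, "tumor": False},
    3: {"diabetic": True, "tumor": False},
}
phone_site = {
    1: {"battery": "Li/Ion"},
    2: {"battery": "NiCd"},
    3: {"battery": "NiCd"},
}

# Horizontal partitioning: the same attributes, different customers per site.
credit_union_a = [
    {"customer": "a1", "amount": 120, "fraud": False},
    {"customer": "a2", "amount": 9_500, "fraud": True},
]
credit_union_b = [
    {"customer": "b1", "amount": 9_700, "fraud": True},
    {"customer": "b2", "amount": 80, "fraud": False},
]


def join_vertical(*sites):
    """Global table for vertical partitioning: join the sites on the shared entity key."""
    common = set.intersection(*(set(site) for site in sites))
    joined = {}
    for key in sorted(common):
        row = {}
        for site in sites:
            row.update(site[key])
        joined[key] = row
    return joined


def union_horizontal(*sites):
    """Global table for horizontal partitioning: stack the rows of all sites."""
    return [row for site in sites for row in site]


global_vertical = join_vertical(medical_site, phone_site)
global_horizontal = union_horizontal(credit_union_a, credit_union_b)
fraud_cases = [r for r in global_horizontal if r["fraud"]]
print(global_vertical[1])
print(len(fraud_cases), "fraud cases visible only in the combined data")
```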
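11 Secure distributed computation
- The secure sum protocol is a simple example of an (information-theoretically) secure multi-party computation.
- The l sites are arranged in a ring, and the total is assumed to be smaller than n. Site 1 generates a random number R chosen uniformly from [0..n], adds it to its local value x_1, and sends the sum R + x_1 (mod n) to site 2.
- Each following site k adds its local value x_k to the value it receives and sends the result (mod n) to site k+1 (mod l). When the value returns to site 1, that site subtracts R (mod n) to obtain the total.
- A drawback of SMC is the inefficiency and complexity of the model.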
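The ring protocol can be simulated in a single process to see why the intermediate messages reveal nothing on their own. This is a sketch under stated assumptions: the function name `secure_sum` is invented, all sites are simulated in one loop rather than communicating over a network, the random mask is drawn from 0..n-1, and the sites are assumed to be semi-honest and non-colluding.

```python
import secrets


def secure_sum(local_values, n):
    """Simulate the secure sum ring; returns the total and the messages passed along.

    Assumes every local value and the overall total are smaller than n.
    """
    if any(not 0 <= x < n for x in local_values) or sum(local_values) >= n:
        # In a real deployment no party can check the total; it is an agreed bound.
        raise ValueError("local values and their total must lie in [0, n)")

    r = secrets.randbelow(n)              # site 1's secret mask, uniform over 0..n-1
    running = (r + local_values[0]) % n   # site 1 sends R + x_1 (mod n) to site 2
    messages = [running]
    for x in local_values[1:]:            # each later site adds its own value
        running = (running + x) % n
        messages.append(running)
    total = (running - r) % n             # site 1 removes the mask at the end
    return total, messages


total, messages = secure_sum([5, 17, 8], n=1000)
print(total)     # the true sum of the three local values
print(messages)  # masked partial sums; they differ from run to run
```

Each message is a partial sum shifted by a uniformly random mask, so a single site that sees only its incoming message learns nothing about the values before it. Two neighbours of the same site could collude and subtract their messages to recover that site's value, which is one reason practical variants split values into shares.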
12 Privacy-preserving techniques in data mining
- Privacy-preservation of application results
  - Related to disclosure control in statistical databases
- Association rule hiding
  - Distortion
  - Blocking
- Downgrading classifier effectiveness
  - Modifying data so that classification accuracy is reduced while retaining the utility of the data for other applications
- Query auditing and inference control
  - Query auditing: one or more queries from a sequence of queries are denied
  - Query inference control: the underlying data (or the query result) is perturbed so that privacy is preserved
  - See the slides on statistical database security
13 Statistical database security
- Databases often include sensitive information about single individuals that must be protected from unauthorized use.
- However, statistical information should be extractable from the database.
- Statistical database security must prohibit access to individual data elements.
- Three main security mechanisms: conceptual, restriction-based, and perturbation-based.
- Examples:
  - prohibit queries on attribute level
  - allow only queries for statistical aggregation (statistical queries)
  - prohibit statistical queries when the selection from the population is too small
  - prohibit repeated statistical queries on the same tuples
  - introduce distortion into the data
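The rule that a statistical query is refused when its selection is too small can be sketched as follows. The table, the threshold `k`, the exception `QueryRefused` and the function `mean_salary` are invented for this illustration. The upper bound is an addition beyond the slide: it blocks the complementary trick of querying almost everyone and subtracting from the overall figure.

```python
class QueryRefused(Exception):
    """Raised when answering a query would expose too few individuals."""


def mean_salary(rows, predicate, k=2):
    """Answer a mean-salary query only if the query set size lies in [k, N - k]."""
    query_set = [r for r in rows if predicate(r)]
    if not k <= len(query_set) <= len(rows) - k:
        raise QueryRefused(f"query set of size {len(query_set)} is not allowed")
    return sum(r["salary"] for r in query_set) / len(query_set)


employees = [  # invented records
    {"dept": "IT", "age": 31, "salary": 40_000},
    {"dept": "IT", "age": 45, "salary": 50_000},
    {"dept": "IT", "age": 52, "salary": 60_000},
    {"dept": "HR", "age": 29, "salary": 35_000},
    {"dept": "HR", "age": 38, "salary": 45_000},
    {"dept": "HR", "age": 61, "salary": 70_000},
]

print(mean_salary(employees, lambda r: r["dept"] == "IT"))  # aggregate over three people

try:
    mean_salary(employees, lambda r: r["age"] > 60)  # would single out one person
except QueryRefused as err:
    print("refused:", err)
```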
14 Security in statistical databases
- Statistical database security (also called inference control) should prevent protected information from being inferred from the set of allowed and fully legitimate statistical queries (statistical aggregation).
- A security problem occurs when statistical information must be provided without releasing sensitive information concerning individuals.
- The main problem with SDB security is to find a good compromise between the privacy of individuals and the need of organizations for knowledge, information management and analysis.
15 Inference protection techniques
- One can divide inference protection techniques into three main categories: conceptual, restriction-based, and perturbation-based techniques.
- Conceptual techniques
  - Treat the security problem on a conceptual level
  - lattice model
  - conceptual partitioning
16 Inference protection techniques
- Restriction-based techniques
  - Prevent certain types of statistical queries
  - query-set size control
  - expanded query-set size control
  - query-set overlap control
  - audit-based control
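Query-set overlap control can be sketched as a small auditor that remembers which records earlier answered queries touched. The class name `OverlapAuditor` and the threshold `max_overlap` are assumptions for this example; real systems combine this with the size control and face the cost of storing the full query history.

```python
class OverlapAuditor:
    """Refuse a query whose record set overlaps too much with an earlier answered query."""

    def __init__(self, max_overlap):
        self.max_overlap = max_overlap
        self.history = []  # record-id sets of the queries answered so far

    def allow(self, query_set):
        qs = frozenset(query_set)
        if any(len(qs & past) > self.max_overlap for past in self.history):
            return False   # refused queries are not added to the history
        self.history.append(qs)
        return True


auditor = OverlapAuditor(max_overlap=1)
print(auditor.allow({1, 2, 3}))     # first query, nothing to compare against
print(auditor.allow({1, 2, 3, 4}))  # differs from the first by one record only
print(auditor.allow({3, 4, 5}))     # shares a single record with the first
```

Without the auditor, the difference between the answers to the first two queries would isolate the value of record 4.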
17 Inference protection techniques
- Perturbation-based techniques
  - Modify the information that is stored or presented
  - data swapping
  - random-sample queries
  - fixed perturbation
  - query-based perturbation
  - rounding (systematic, random, controlled)
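Two of the rounding variants listed above can be sketched in a few lines. The function names and the rounding base of 5 are assumptions for this illustration; controlled rounding, which additionally keeps rounded table cells consistent with their rounded totals, is not shown.

```python
import math
import random


def systematic_round(value, base):
    """Systematic rounding: always report the nearest multiple of base."""
    return base * math.floor(value / base + 0.5)


def random_round(value, base, rng):
    """Random rounding: round up with probability proportional to the remainder,
    so the reported value equals the true value on average."""
    lower = base * math.floor(value / base)
    remainder = value - lower
    return lower + base if rng.random() < remainder / base else lower


rng = random.Random(1)  # fixed seed so the run is repeatable
print(systematic_round(17, 5))   # nearest multiple of 5
print(random_round(17, 5, rng))  # either the multiple below or the one above

samples = [random_round(17, 5, rng) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to the true value 17
```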
18 Privacy-preserving techniques in data mining
- Limitations of privacy
  - The curse of dimensionality: many privacy-preserving algorithms have problems in high-dimensional space due to sparseness
- Applications of privacy-preserving data mining
  - Medical databases: sensitive information about patients, family members, addresses etc.
  - Bioterrorism: e.g. the need to compare a possible anthrax attack with data from outbreaks of common respiratory diseases
  - Homeland security: the credential validation problem, identity theft, web camera and video surveillance, the watch list problem
  - Genomic privacy: keeping DNA data private while making it available for analysis
More informationFACIAL IMAGE DE-IDENTIFICATION USING IDENTIY SUBSPACE DECOMPOSITION. Hehua Chi 1,2, Yu Hen Hu 2
2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) FACIAL IMAGE DE-IDENTIFICATION USING IDENTIY SUBSPACE DECOMPOSITION Hehua Chi 1,2, Yu Hen Hu 2 1 State Key Laboratory
More informationKnowledge Discovery and Data Mining. Structured vs. Non-Structured Data
Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.
More information1.2: DATA SHARING POLICY. PART OF THE OBI GOVERNANCE POLICY Available at: http://www.braininstitute.ca/brain-code-governance. 1.2.
1.2: DATA SHARING POLICY PART OF THE OBI GOVERNANCE POLICY Available at: http://www.braininstitute.ca/brain-code-governance 1.2.1 Introduction Consistent with its international counterparts, OBI recognizes
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationHomomorphic Encryption Schema for Privacy Preserving Mining of Association Rules
Homomorphic Encryption Schema for Privacy Preserving Mining of Association Rules M.Sangeetha 1, P. Anishprabu 2, S. Shanmathi 3 Department of Computer Science and Engineering SriGuru Institute of Technology
More informationArnab Roy Fujitsu Laboratories of America and CSA Big Data WG
Arnab Roy Fujitsu Laboratories of America and CSA Big Data WG 1 Security Analytics Crypto and Privacy Technologies Infrastructure Security 60+ members Framework and Taxonomy Chair - Sree Rajan, Fujitsu
More informationMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 asistithod@gmail.com
More informationData Warehouse: Introduction
Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of base and data mining group,
More informationIEEE JAVA Project 2012
IEEE JAVA Project 2012 Powered by Cloud Computing Cloud Computing Security from Single to Multi-Clouds. Reliable Re-encryption in Unreliable Clouds. Cloud Data Production for Masses. Costing of Cloud Computing
More informationCYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION
CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION MATIJA STEVANOVIC PhD Student JENS MYRUP PEDERSEN Associate Professor Department of Electronic Systems Aalborg University,
More informationData Formulation Analysis of a Network Marketing Agency
Analysis of Integrated Data without Data Integration Alan F. Karr National Institute of Statistical Sciences karr@niss.org Your Privacy is Threatened!? Problem Formulation Multiple, distributed databases
More informationPrivacy-Preserving In Big Data with Efficient Metrics
Privacy-Preserving In Big Data with Efficient Metrics 1 N.Arunkumar and 2 S.Charumathy 1 PG scholar, 2 PG scholar 1 Department of Computer Science and Engineering,, 2 Department of Computer Science and
More informationIMPROVED MASK ALGORITHM FOR MINING PRIVACY PRESERVING ASSOCIATION RULES IN BIG DATA
International Conference on Computer Science, Electronics & Electrical Engineering-0 IMPROVED MASK ALGORITHM FOR MINING PRIVACY PRESERVING ASSOCIATION RULES IN BIG DATA Pavan M N, Manjula G Dept Of ISE,
More informationData Warehousing and Data Mining
Data Warehousing and Data Mining Winter Semester 2012/2013 Free University of Bozen, Bolzano DM Lecturer: Mouna Kacimi mouna.kacimi@unibz.it http://www.inf.unibz.it/dis/teaching/dwdm/index.html Organization
More informationThe Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
More information