Globe Tech, Inc. 76 Northeastern Blvd., Suite #30B Nashua, NH Fax PrivGuard an eprivacy Solution


Protecting Private Healthcare Information (PHI)

With the widespread adoption of electronic health records (EHR), health-care organizations have in recent years generated and collected an explosion of digital patient data. In tandem with this unprecedented growth of digital data, data-mining techniques have gained popularity in a wide variety of domains. While the health-care industry has benefited from information sharing and data mining, patients are increasingly concerned that these practices invade their privacy. The public and civil libertarian groups share this concern as EHR becomes mainstream and increasingly popular with medical professionals and patients. These growing privacy concerns led to the passage of the Health Insurance Portability and Accountability Act (HIPAA) in 1996.

HIPAA

HIPAA is designed to give patients more control over their personal medical information. It explicitly outlines how medical records can be given to third parties and carries stiff penalties for violations. The impact of HIPAA on medical research is beginning to surface in the research community, with some researchers fearing that it could jeopardize studies of drug safety, medical device validation, and disease prediction and prevention. Although HIPAA was intended to protect patient privacy, it significantly affects medical studies that collect data from a variety of health-care organizations. Because HIPAA guidelines are so cumbersome and the penalties for violations so steep, many organizations, particularly small community hospitals and clinics, may decide it is safer and easier not to provide data for medical research. In response to this concern, the Association of American Medical Colleges plans to compile a database to document the effect of HIPAA on research activities.
PrivGuard was developed to address these concerns and to assist healthcare providers with HIPAA compliance. It automates the process of quickly de-identifying and masking sensitive data while preserving overall data integrity, so that high-quality data mining and research analysis remain possible. PrivGuard grew out of years of innovative research work and integrates various techniques, such as decision trees, linear programming, Bayes estimation, kd-trees, and data masking, to help protect patient privacy. More broadly, PrivGuard allows patient data to be shared safely across health-care organizational boundaries while satisfying compliance requirements and giving analysts quality data for data-mining research that benefits both medical research and society at large.

The PrivGuard Solution

The PrivGuard system provides several data masking algorithms. Data masking is very different from encryption: it does not cipher the data, nor does it require any keys or digital certificates. Instead, data masking changes the data values using noise perturbation, data aggregation, or data swapping. The properties of the data are generally maintained after masking, which supports statistical analysis and data-mining research. Data masking is less resource-intensive than encryption and is used to preserve the privacy of data before it is shared with external organizations, whereas encryption is more useful for protecting data during transmission.

State of the art engineering solutions

Why PrivGuard?

While many privacy protection and data masking solutions are available on the market, PrivGuard is the only application designed to give data owners control over protecting their data from privacy attacks. The solution uses complex masking technology, yet it is simple to use and cost-effective to implement. It does not require expensive and complex encryption technology, yet it protects patient data and allows data analysts and researchers to do high-quality research and analysis. Here are the key features of PrivGuard:

  • Empowerment - allows data owners to take control over their data privacy
  • Powerful Technology - provides a choice of 10 data masking techniques developed from years of research and analysis
  • Scalable Solution - permits increasing or decreasing the level of masking depending on the security desired
  • Open Connectivity - ODBC or JDBC support for all databases and file formats
  • Seamless Integration - with applications, databases and file formats
  • Open Standards - multi-platform support for O/S (Windows, Unix/Linux and Apple)
  • Intuitive GUI - data masking tools require minimal user training

Data Masking Techniques used in PrivGuard

As mentioned earlier, PrivGuard uses powerful and flexible masking techniques for a wide variety of data formats. Its data masking algorithms fall into two categories. The first set of algorithms masks numerical data, while the second set masks categorical (text or Boolean) data. This document gives a detailed description of the algorithms available in PrivGuard, where they are useful, and the options or parameters each algorithm offers for increasing the level of data masking or preserving the originality of the data.

A. Numeric Data Masking

i) Simple Noise Perturbation: This is a univariate perturbation technique that adds random noise to the original data. It does not preserve the relationships between attributes when perturbing data.
Noise Type:
  • Additive: Adds random noise to the data. The noise follows a normal distribution with mean = 0 and a specified variance. Because the noise has a mean of zero, the mean of the data remains approximately the same after the noise is added.
  • Multiplicative: Multiplies the data values by random noise. The noise follows a normal distribution with mean = 1 and a specified variance. Because the noise has a mean of one, the mean of the data remains approximately the same after the multiplication.
Column List: Selects the attributes (or fields) to perturb.
Noise Multiple: This parameter is related to the variance of the noise. The larger the value, the higher the degree of noise in the masked data, which implies a lower disclosure risk but poorer data quality in the masked data.

ii) General Additive Data Perturbation (GADP): This multivariate perturbation technique adds random noise to the original data and attempts to preserve the multivariate distribution of the data. It has no parameter (option) to control the degree of perturbation. The technique is ideal when the data exactly follow a multivariate normal distribution.

Type:
  • GADP: Adds random noise to the data, based on multivariate normal distribution theory.
  • Shuffle: A variant of GADP in which numeric values are swapped instead of being perturbed by random noise.
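To make the Simple Noise Perturbation options in item i) concrete, here is a minimal Python sketch. It is not PrivGuard's implementation. The function name `perturb` and its parameters are hypothetical, and the sketch assumes that the Noise Multiple scales the standard deviation of the noise (relative to the column's own standard deviation in the additive case). The description above only says that the parameter is related to the variance of the noise.

```python
import random
import statistics


def perturb(values, noise_multiple=0.1, noise_type="additive", seed=None):
    """Mask one numeric column with normally distributed random noise.

    Assumption (not from the source): noise_multiple is used as the noise
    standard deviation, scaled by the column's standard deviation for
    additive noise and used directly for multiplicative noise.
    """
    rng = random.Random(seed)
    if noise_type == "additive":
        # Zero-mean noise keeps the column mean approximately unchanged.
        sigma = noise_multiple * statistics.pstdev(values)
        return [v + rng.gauss(0.0, sigma) for v in values]
    if noise_type == "multiplicative":
        # Noise centred on one keeps the column mean approximately unchanged.
        return [v * rng.gauss(1.0, noise_multiple) for v in values]
    raise ValueError("noise_type must be 'additive' or 'multiplicative'")


if __name__ == "__main__":
    ages = [34.0, 51.0, 47.0, 29.0, 63.0]
    print(perturb(ages, noise_multiple=0.2, noise_type="additive", seed=42))
    print(perturb(ages, noise_multiple=0.2, noise_type="multiplicative", seed=42))
```

Each column is perturbed on its own, which is why this univariate approach does not preserve relationships between attributes.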

Column List (GADP and Shuffle): Selects the attributes (or fields) to perturb.

iii) MicroAggregation: This technique first divides the data into groups using sorting and clustering techniques and then masks the data by replacing the original values with group averages. It is a non-parametric approach that requires no knowledge of the statistical distribution of the original data.

Type:
  • Univariate Microaggregation: Groups the data for each attribute based on the sorted values of that attribute. The technique does not consider (or preserve) the relationships between attributes. It runs fast on large data sets.
  • Multivariate Microaggregation: Groups the data using clustering techniques. It attempts to preserve the relationships among all attributes. However, it is slow on large data sets.
Subset Size: The maximum number of records allowed in a group (subset). The larger the value, the higher the degree of masking, which implies a lower disclosure risk but poorer data quality in the masked data.
Column List: Selects the attributes to mask.

iv) KD-Tree-Based Masking: This approach first divides the data into groups using kd-tree-based techniques. It then masks the data by replacing the original values with group averages or by swapping data within the groups. It is a non-parametric approach that requires no knowledge of the statistical distribution of the original data. It attempts to preserve the relationships among all attributes, and it runs fast on large data sets (significantly faster than multivariate microaggregation).

Subset Size: The maximum number of records allowed in a group (subset). The larger the value, the higher the degree of masking, which implies a lower disclosure risk but poorer data quality in the masked data.
Column List: Selects the attributes to mask.
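The grouping-and-averaging idea behind MicroAggregation and KD-Tree-Based Masking can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not PrivGuard's algorithm. The function names are hypothetical. Univariate groups are formed from consecutive sorted values, and the kd-tree variant splits at the median of the attribute with the widest range, a common kd-tree construction that the description above does not specify. Only the group-average form of kd-tree masking is shown, not the within-group swapping form.

```python
from statistics import mean


def univariate_microaggregate(values, subset_size):
    """Replace each value with the mean of its group of sorted neighbours."""
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = [0.0] * len(values)
    # Consecutive runs of at most subset_size sorted values form one group.
    for start in range(0, len(order), subset_size):
        group = order[start:start + subset_size]
        average = mean(values[i] for i in group)
        for i in group:
            masked[i] = average
    return masked


def kdtree_groups(records, subset_size, indices=None):
    """Split records recursively until every group has <= subset_size rows."""
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    if indices is None:
        indices = list(range(len(records)))
    if len(indices) <= subset_size:
        return [indices]
    dims = range(len(records[0]))

    def spread(d):
        column = [records[i][d] for i in indices]
        return max(column) - min(column)

    axis = max(dims, key=spread)  # assumed rule: split the widest attribute
    ordered = sorted(indices, key=lambda i: records[i][axis])
    mid = len(ordered) // 2       # median split keeps the tree balanced
    return (kdtree_groups(records, subset_size, ordered[:mid])
            + kdtree_groups(records, subset_size, ordered[mid:]))


def kdtree_mask(records, subset_size):
    """Replace each record with the centroid of its kd-tree group."""
    masked = [None] * len(records)
    for group in kdtree_groups(records, subset_size):
        centroid = tuple(mean(records[i][d] for i in group)
                         for d in range(len(records[0])))
        for i in group:
            masked[i] = centroid
    return masked
```

In both functions a larger subset size averages over more records, which matches the trade-off described for the Subset Size parameter: lower disclosure risk at the cost of data quality.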
B. Categorical Data Masking

i) Simple Data Swapping: This is a univariate swapping technique that randomly swaps the categorical (text) values of an attribute. It attempts to preserve the frequency distribution of the attribute but does not consider the dependencies across different attributes.

Swapping Proportion: The proportion of the values in each attribute to be swapped. The larger the proportion, the more records are swapped, which implies a lower disclosure risk but poorer data quality in the masked data. Because the swapped values for different attributes may fall in different records, the total proportion of records that have at least one attribute value swapped will normally be larger than this proportion.

ii) Multivariate Data Swapping: A multivariate swapping technique that attempts to preserve the multivariate frequency distributions up to a certain order (see the description of the term "order" below).

Proportion: The proportion of the values in each attribute to be swapped. The larger the proportion, the more records are swapped, which implies a lower disclosure risk but poorer data quality in the masked data.

Order: The number of dimensions (attributes) whose joint distributions are to be preserved.

Order = 1: Preserves univariate frequency distributions. This setting is therefore equivalent to Simple Data Swapping.
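The univariate case (Simple Data Swapping, or Order = 1) can be illustrated with a short Python sketch. It is a simplified illustration rather than PrivGuard's method: `simple_swap` is a hypothetical name, and the sketch assumes that "swapping" means randomly permuting the values found at a randomly chosen proportion of positions in one column.

```python
import random
from collections import Counter


def simple_swap(column, proportion, seed=None):
    """Randomly permute the values at a chosen proportion of positions."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)
    n_swap = int(round(proportion * len(column)))
    positions = rng.sample(range(len(column)), n_swap)
    targets = positions[:]
    rng.shuffle(targets)
    masked = list(column)
    # Move each selected value to another selected position (a permutation),
    # so the multiset of values in the column never changes.
    for src, dst in zip(positions, targets):
        masked[dst] = column[src]
    return masked


if __name__ == "__main__":
    age_band = ["young", "young", "old", "old", "old", "young"]
    gender = ["F", "M", "F", "F", "M", "M"]
    masked_gender = simple_swap(gender, proportion=1.0, seed=1)
    # Univariate counts are identical before and after swapping.
    print(Counter(gender) == Counter(masked_gender))
    # Joint counts with another attribute are not guaranteed to match.
    print(Counter(zip(age_band, gender)))
    print(Counter(zip(age_band, masked_gender)))
```

Because only one attribute is permuted at a time, its own frequency counts are unchanged, while joint counts with other attributes (for example Age and Gender pairs) can change.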

Order = 2: Preserves bivariate frequency distributions. Take life insurance data as an example, with four categorical attributes: Age (A), Gender (G), Location (L) and Income (I). The technique swaps the data so that the joint counts for each value combination of the following pairs of attributes are approximately preserved: A&G, A&L, A&I, G&L, G&I, and L&I. For example, the count for {A = & G = Female} will likely remain the same after swapping.

Order = 3: Preserves trivariate frequency distributions. In the example above, the joint distributions involve the following triples of attributes: A&G&L, A&G&I, A&L&I, and G&L&I.

When the Order is greater than 3, the algorithm becomes very time consuming. Therefore, the algorithm is implemented only up to order 3.

iii) Bayesian-Based Data Swapping: This is a multivariate swapping technique that preserves the multivariate frequency distributions up to any order (see the description of the term "order" under Multivariate Data Swapping). The attributes are assumed to be conditionally independent (the Naïve Bayes assumption). The algorithm runs faster than Multivariate Data Swapping for higher order requirements. In addition, this technique is optimal in preserving univariate distributions (via a Linear Programming method).

Proportion: The proportion of the values in each attribute to be swapped. The larger the proportion, the more records are swapped, which implies a lower disclosure risk but poorer data quality in the masked data.

iv) Decision-Tree-Based Data Swapping: This approach first divides the data into groups using decision-tree-based techniques. It then masks the data by swapping the values within the groups. The attribute subject to masking must be categorical. However, the other attributes may be categorical or numeric, and the technique attempts to preserve the relationships among all attributes (categorical and numeric).
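The within-group swapping step of Decision-Tree-Based Data Swapping can be sketched in Python as below. This sketch does not build a decision tree. It assumes the grouping is already available through a caller-supplied function, here the hypothetical parameter `leaf_of`, which stands in for the leaf assignment of a fitted tree. The function name `swap_within_groups` is also hypothetical, and the swapping proportion is left out for brevity.

```python
import random
from collections import defaultdict


def swap_within_groups(records, target, leaf_of, seed=None):
    """Shuffle the `target` attribute among records that share a group.

    records: list of dicts, one per record.
    target:  name of the categorical attribute to mask.
    leaf_of: hypothetical stand-in for a fitted decision tree; it maps a
             record to a group (leaf) label using the other attributes.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[leaf_of(record)].append(index)
    masked = [dict(record) for record in records]  # leave the input intact
    for members in groups.values():
        values = [records[i][target] for i in members]
        rng.shuffle(values)  # swapping happens only inside one group
        for i, value in zip(members, values):
            masked[i][target] = value
    return masked


if __name__ == "__main__":
    patients = [
        {"age": 25, "diagnosis": "flu"},
        {"age": 31, "diagnosis": "asthma"},
        {"age": 58, "diagnosis": "diabetes"},
        {"age": 64, "diagnosis": "flu"},
    ]
    # A numeric attribute defines the groups; a categorical one is masked.
    by_age = lambda r: "under_40" if r["age"] < 40 else "40_plus"
    print(swap_within_groups(patients, "diagnosis", by_age, seed=5))
```

Since values move only between records in the same group, the counts of the masked attribute within each group stay the same, which is how the grouping helps retain the relationship between the masked attribute and the attributes that define the groups.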
Support for mixed attribute types is a key difference between Decision-Tree-Based Data Swapping and the other categorical data swapping techniques (which require all attributes to be categorical) and KD-Tree-Based Masking (which works for numeric attributes only). The algorithm runs fast on large data sets.

Random Seed: Used in swapping.
Proportion: The proportion of the values in each attribute to be swapped. The larger the proportion, the more records are swapped, which implies a lower disclosure risk but poorer data quality in the masked data.
Note: Currently, this algorithm can mask only one attribute at a time. Further work is needed to extend it to masking multiple attributes simultaneously.

PrivGuard Technical References

J.F. Traub, Y. Yemini, and H. Wozniakowski, "The statistical security of a statistical database," ACM Transactions on Database Systems, vol. 9, no. 4.
C.K. Liew, U.J. Choi, and C.J. Liew, "A data distortion by probability distribution," ACM Transactions on Database Systems, vol. 10, no. 3.
K. Muralidhar, R. Parsa, and R. Sarathy, "A general additive data perturbation method for database security," Management Science, vol. 45, no. 10.
K. Muralidhar and R. Sarathy, "Data shuffling: A new masking approach for numerical data," Management Science, vol. 52, no. 5, 2006.

D. Defays and P. Nanopoulos, "Panels of enterprises and confidentiality: The small aggregates method," Proceedings of Statistics Canada Symposium 92 on Design and Analysis of Longitudinal Surveys, Ottawa, Canada, November.
J. Domingo-Ferrer and J.M. Mateo-Sanz, "Practical data-oriented microaggregation for statistical disclosure control," IEEE Transactions on Knowledge and Data Engineering, vol. 14, no. 1.
X.-B. Li and S. Sarkar, "A tree-based data perturbation approach for privacy-preserving data mining," IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 9.
X.-B. Li and S. Sarkar, "Protecting privacy against re-identification by record linkage," Proceedings of the 16th Annual Workshop on Information Technologies and Systems (WITS 2006), Milwaukee, WI, 2006.
S.P. Reiss, "Practical data-swapping: The first steps," ACM Transactions on Database Systems, vol. 9, no. 1.
X.-B. Li and S. Sarkar, "Privacy protection in data mining: A perturbation approach for categorical data," Information Systems Research, vol. 17, no. 3.
X.-B. Li and S. Sarkar, "Protecting privacy against classification attacks in data mining," Proceedings of the 15th Annual Workshop on Information Technologies and Systems (WITS 2005), Las Vegas, NV.

Globe Tech, Inc. All Rights Reserved. PrivGuard is a trademark of Globe Tech, Inc. All other trademarks or service marks are the property of their respective owners.


More information

Predictive Analytics Powered by SAP HANA. Cary Bourgeois Principal Solution Advisor Platform and Analytics

Predictive Analytics Powered by SAP HANA. Cary Bourgeois Principal Solution Advisor Platform and Analytics Predictive Analytics Powered by SAP HANA Cary Bourgeois Principal Solution Advisor Platform and Analytics Agenda Introduction to Predictive Analytics Key capabilities of SAP HANA for in-memory predictive

More information

Encrypting Network Traffic

Encrypting Network Traffic Encrypting Network Traffic Mark Lomas Computer Security Group University of Cambridge Computer Laboratory Encryption may be used to maintain the secrecy of information, to help detect when messages have

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

A Review of Anomaly Detection Techniques in Network Intrusion Detection System A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In

More information

The De-identification Maturity Model Authors: Khaled El Emam, PhD Waël Hassan, PhD

The De-identification Maturity Model Authors: Khaled El Emam, PhD Waël Hassan, PhD A PRIVACY ANALYTICS WHITEPAPER The De-identification Maturity Model Authors: Khaled El Emam, PhD Waël Hassan, PhD De-identification Maturity Assessment Privacy Analytics has developed the De-identification

More information

A Q&A with the Commissioner: Big Data and Privacy Health Research: Big Data, Health Research Yes! Personal Data No!

A Q&A with the Commissioner: Big Data and Privacy Health Research: Big Data, Health Research Yes! Personal Data No! A Q&A with the Commissioner: Big Data and Privacy Health Research: Big Data, Health Research Yes! Personal Data No! Ann Cavoukian, Ph.D. Information and Privacy Commissioner Ontario, Canada THE AGE OF

More information

MUTI-KEYWORD SEARCH WITH PRESERVING PRIVACY OVER ENCRYPTED DATA IN THE CLOUD

MUTI-KEYWORD SEARCH WITH PRESERVING PRIVACY OVER ENCRYPTED DATA IN THE CLOUD MUTI-KEYWORD SEARCH WITH PRESERVING PRIVACY OVER ENCRYPTED DATA IN THE CLOUD A.Shanthi 1, M. Purushotham Reddy 2, G.Rama Subba Reddy 3 1 M.tech Scholar (CSE), 2 Asst.professor, Dept. of CSE, Vignana Bharathi

More information

REMOTE ACCESS TO A HEALTHCARE FACILITY AND THE IT PROFESSIONAL S OBLIGATIONS UNDER HIPAA AND THE HITECH ACT

REMOTE ACCESS TO A HEALTHCARE FACILITY AND THE IT PROFESSIONAL S OBLIGATIONS UNDER HIPAA AND THE HITECH ACT REMOTE ACCESS TO A HEALTHCARE FACILITY AND THE IT PROFESSIONAL S OBLIGATIONS UNDER HIPAA AND THE HITECH ACT ARE YOUR AUTHENTICATION, ACCESS, AND AUDIT PARADIGMS UP TO DATE? BY KERRY ARMSTRONG, PRIVACY,

More information

Decision Trees from large Databases: SLIQ

Decision Trees from large Databases: SLIQ Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values

More information

A SECURE DECISION SUPPORT ESTIMATION USING GAUSSIAN BAYES CLASSIFICATION IN HEALTH CARE SERVICES

A SECURE DECISION SUPPORT ESTIMATION USING GAUSSIAN BAYES CLASSIFICATION IN HEALTH CARE SERVICES A SECURE DECISION SUPPORT ESTIMATION USING GAUSSIAN BAYES CLASSIFICATION IN HEALTH CARE SERVICES K.M.Ruba Malini #1 and R.Lakshmi *2 # P.G.Scholar, Computer Science and Engineering, K. L. N College Of

More information

DATA MINING - 1DL105, 1DL025

DATA MINING - 1DL105, 1DL025 DATA MINING - 1DL105, 1DL025 Fall 2009 An introductory class in data mining http://www.it.uu.se/edu/course/homepage/infoutv/ht09 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Keywords-- Cloud computing, Encryption, Data integrity, Third Party Auditor (TPA), RC5 Algorithm, privacypreserving,

Keywords-- Cloud computing, Encryption, Data integrity, Third Party Auditor (TPA), RC5 Algorithm, privacypreserving, Volume 3, Issue 11, November 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Third Party

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

Composite performance measures in the public sector Rowena Jacobs, Maria Goddard and Peter C. Smith

Composite performance measures in the public sector Rowena Jacobs, Maria Goddard and Peter C. Smith Policy Discussion Briefing January 27 Composite performance measures in the public sector Rowena Jacobs, Maria Goddard and Peter C. Smith Introduction It is rare to open a newspaper or read a government

More information

Making confident decisions with the full spectrum of analysis capabilities

Making confident decisions with the full spectrum of analysis capabilities IBM Software Business Analytics Analysis Making confident decisions with the full spectrum of analysis capabilities Making confident decisions with the full spectrum of analysis capabilities Contents 2

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm

Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm Markus Goldstein and Andreas Dengel German Research Center for Artificial Intelligence (DFKI), Trippstadter Str. 122,

More information

Legal Insight. Big Data Analytics Under HIPAA. Kevin Coy and Neil W. Hoffman, Ph.D. Applicability of HIPAA

Legal Insight. Big Data Analytics Under HIPAA. Kevin Coy and Neil W. Hoffman, Ph.D. Applicability of HIPAA Big Data Analytics Under HIPAA Kevin Coy and Neil W. Hoffman, Ph.D. Privacy laws and regulations such as the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule can have a significant

More information

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014 Maximierung des Geschäftserfolgs durch SAP Predictive Analytics Andreas Forster, May 2014 Legal Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed

More information

DATA VERIFICATION IN ETL PROCESSES

DATA VERIFICATION IN ETL PROCESSES KNOWLEDGE ENGINEERING: PRINCIPLES AND TECHNIQUES Proceedings of the International Conference on Knowledge Engineering, Principles and Techniques, KEPT2007 Cluj-Napoca (Romania), June 6 8, 2007, pp. 282

More information

Operation Count; Numerical Linear Algebra

Operation Count; Numerical Linear Algebra 10 Operation Count; Numerical Linear Algebra 10.1 Introduction Many computations are limited simply by the sheer number of required additions, multiplications, or function evaluations. If floating-point

More information

HIPAA and Network Security Curriculum

HIPAA and Network Security Curriculum HIPAA and Network Security Curriculum This curriculum consists of an overview/syllabus and 11 lesson plans Week 1 Developed by NORTH SEATTLE COMMUNITY COLLEGE for the IT for Healthcare Short Certificate

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

PRESENTS... How to Access Remote SourceSafe Fast & Securely?

PRESENTS... How to Access Remote SourceSafe Fast & Securely? PRESENTS... How to Access Remote SourceSafe Fast & Securely? This article focuses on the growing problem for development teams who try to use Microsoft Visual SourceSafe (VSS) remotely. The paper will

More information

REFERENCE 5. White Paper Health Insurance Portability and Accountability Act: Security Standards; Implications for the Healthcare Industry

REFERENCE 5. White Paper Health Insurance Portability and Accountability Act: Security Standards; Implications for the Healthcare Industry REFERENCE 5 White Paper Health Insurance Portability and Accountability Act: Security Standards; Implications for the Healthcare Industry Shannah Koss, Program Manager, IBM Government and Healthcare This

More information

1. Secure 128-Bit SSL Communication 2. Backups Are Securely Encrypted 3. We Don t Keep Your Encryption Key VERY IMPORTANT:

1. Secure 128-Bit SSL Communication 2. Backups Are Securely Encrypted 3. We Don t Keep Your Encryption Key VERY IMPORTANT: HOW IT WORKS 1. Secure 128-Bit SSL Communication All communications between Offsite Backup Server and your computer are transported in a 128-bit SSL (Secure Socket Layer) channel. Although all your backup

More information

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION EXECUTIVE SUMMARY Oracle business intelligence solutions are complete, open, and integrated. Key components of Oracle business intelligence

More information

Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

More information

3D Interactive Information Visualization: Guidelines from experience and analysis of applications

3D Interactive Information Visualization: Guidelines from experience and analysis of applications 3D Interactive Information Visualization: Guidelines from experience and analysis of applications Richard Brath Visible Decisions Inc., 200 Front St. W. #2203, Toronto, Canada, rbrath@vdi.com 1. EXPERT

More information

IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE OPERATORS

IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE OPERATORS Volume 2, No. 3, March 2011 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at www.jgrcs.info IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE

More information

FACT SHEET: Ransomware and HIPAA

FACT SHEET: Ransomware and HIPAA FACT SHEET: Ransomware and HIPAA A recent U.S. Government interagency report indicates that, on average, there have been 4,000 daily ransomware attacks since early 2016 (a 300% increase over the 1,000

More information

De-Identification Framework

De-Identification Framework A Consistent, Managed Methodology for the De-Identification of Personal Data and the Sharing of Compliance and Risk Information March 205 Contents Preface...3 Introduction...4 Defining Categories of Health

More information

Current Developments of k-anonymous Data Releasing

Current Developments of k-anonymous Data Releasing Current Developments of k-anonymous Data Releasing Jiuyong Li 1 Hua Wang 1 Huidong Jin 2 Jianming Yong 3 Abstract Disclosure-control is a traditional statistical methodology for protecting privacy when

More information

Enterprise Organization and Communication Network

Enterprise Organization and Communication Network Enterprise Organization and Communication Network Hideyuki Mizuta IBM Tokyo Research Laboratory 1623-14, Shimotsuruma, Yamato-shi Kanagawa-ken 242-8502, Japan E-mail: e28193@jp.ibm.com Fusashi Nakamura

More information

RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS

RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for

More information

ARGUS: SOFTWARE FOR STATISTICAL DISCLOSURE CONTROL OF MICRODATA 1

ARGUS: SOFTWARE FOR STATISTICAL DISCLOSURE CONTROL OF MICRODATA 1 ARGUS: SOFTWARE FOR STATISTICAL DISCLOSURE CONTROL OF MICRODATA 1 A.G. De Waal, A. J. Hundepool and L.C.R.J. Willenborg 2 ABSTRACT In recent years Statistics Netherlands has developed a prototype version

More information

HIPAA Training 2010. For Research Investigators and Study Staff

HIPAA Training 2010. For Research Investigators and Study Staff HIPAA Training 2010 For Research Investigators and Study Staff HIPAA IS... THE HEALTH INSURANCE PORTABILITY & ACCOUNTABILITY ACT OF 1996 Portability Created to ensure access to health coverage Allows for

More information

Closing the data privacy gap: Protecting sensitive data in non-production environments

Closing the data privacy gap: Protecting sensitive data in non-production environments Enterprise Data Management Solutions February 2008 IBM Information Management software Closing the data privacy gap: Protecting sensitive data in non-production environments Page 2 Contents 2 Executive

More information

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

More information

Performing Data Mining in (SRMS) through Vertical Approach with Association Rules

Performing Data Mining in (SRMS) through Vertical Approach with Association Rules Performing Data Mining in (SRMS) through Vertical Approach with Association Rules Mr. Ambarish S. Durani 1 and Miss. Rashmi B. Sune 2 MTech (III rd Sem), Vidharbha Institute of Technology, Nagpur, Nagpur

More information

A Survey of Quantification of Privacy Preserving Data Mining Algorithms

A Survey of Quantification of Privacy Preserving Data Mining Algorithms A Survey of Quantification of Privacy Preserving Data Mining Algorithms Elisa Bertino, Dan Lin, and Wei Jiang Abstract The aim of privacy preserving data mining (PPDM) algorithms is to extract relevant

More information

Remote Access to a Healthcare Facility and the IT professional s obligations under HIPAA and the HITECH Act

Remote Access to a Healthcare Facility and the IT professional s obligations under HIPAA and the HITECH Act Remote Access to a Healthcare Facility and the IT professional s obligations under HIPAA and the HITECH Act Are your authentication, access, and audit paradigms up to date? Table of Contents Synopsis...1

More information

Is Privacy Still an Issue for Data Mining? (Extended Abstract)

Is Privacy Still an Issue for Data Mining? (Extended Abstract) Is Privacy Still an Issue for Data Mining? (Extended Abstract) Chris Clifton Wei Jiang Mummoorthy Muruguesan M. Ercan Nergiz Department of Computer Science Purdue University 305 North University Street

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

MSCA 31000 Introduction to Statistical Concepts

MSCA 31000 Introduction to Statistical Concepts MSCA 31000 Introduction to Statistical Concepts This course provides general exposure to basic statistical concepts that are necessary for students to understand the content presented in more advanced

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

HIPAA: AN OVERVIEW September 2013

HIPAA: AN OVERVIEW September 2013 HIPAA: AN OVERVIEW September 2013 Introduction The Health Insurance Portability and Accountability Act of 1996, known as HIPAA, was enacted on August 21, 1996. The overall goal was to simplify and streamline

More information