1 Privacy: Legal Aspects of Big Data and Information Security. Presentation at the 2nd National Open Access Workshop, October 2013, Izmir, Turkey. John N. Gathegi, University of South Florida, Tampa, FL; Visiting Professor, Hacettepe University, Ankara, Turkey
2 Characteristics of big data and data mining: --Big data refers to massive amounts of seemingly unrelated data collected from a variety of sources that are aggregated in massive data depository systems.
3 --usually data sets too big for common database software to manage or process.
4 --3 defining features (Rubinstein, 2013):
1. availability of massive data continuously collected in multiple ways, including:
-online
-mobile devices
-location tracking
-data sharing apps
5 -smart environment interactions and monitoring (e.g. Internet of Things)
--big data increasingly will be derived from the Internet of Things.
-Web 2.0 user-generated data, including personal information sharing
6 2. use of high-speed, high-transfer-rate computers with massive storage capability, utilizing the cloud computing model
3. use of new computational frameworks... for storing and analyzing this huge volume of data.
--summary: more data, faster computers, new analytic techniques
7 Data mining: extraction of information from massive amounts of data, leading to unexpected new knowledge, associations, patterns, and meanings that were previously buried in the data. Requires massively complex data mining algorithms and statistical methods to analyze the data.
8 --Think of Google: --email data (Gmail); search data; personal information; web navigation data; geographic location data; voice communication data; video communication data; image management and processing data; translation data
9 --major benefits to industry and society in the area of innovations and service delivery (e.g. medical research, traffic management), but also some downsides, especially in the area of privacy.
10 --Think of Facebook: nearly a billion users uploading personal information
Rubinstein (2013) notes several intertwined trends that are presenting great challenges to privacy:
-the popularity of social networking sites that permit individuals to voluntarily share personal data
-the growth of cloud computing
-the ubiquity of mobile devices and of physical sensors that transmit geo-location information
-the growing use of data mining technologies enabling the aggregation and analysis of data from multiple sources
--Add to this Open Access and you have a problem!
11 According to Nicholas Terry (2012): Data aggregation and customer profiling are hardly news. The developments that mark out big data are the scale of the data collection and the increasing sophistication of predictive analytics.
12 --problems: data mining; profiling (cookies are not the primary concern anymore)
-finding hidden correlations, enabling interesting predictions
--right to be forgotten (addressed somewhat in Europe but almost ignored in the US)
--subverted by the ability to re-identify data subjects using non-personal data
-blurring the line between personal and non-personal data
-data aggregation to provide anonymity loses its meaning
13 Consider this --purchase by Walmart in 2012 of Social Calendar (a Facebook application). Walmart already had ShopyCat, a Facebook app of its own that is a gift-recommendation service. Why purchase and not build its own?
14 Points to --weakest link: over-reliance on informed consent (most people do not read or understand disclosures, and have no idea about the subsequent use, or even custody, of their personal information)
15 Other BD problems
--also allows automated decision-making about individuals, e.g., creditworthiness, insurance eligibility, etc.
--the process is opaque and affords little chance for individual feedback or correction of the underlying data
--BD users are unable to provide adequate notice of purpose and use of data to individuals, since they cannot tell in advance what they will find
--users cannot effectively consent to the use of their information because they cannot monitor the correlations made possible by the data mining
16 --dangers of predictive analysis
-Target's analysis producing a pregnancy prediction score based on women customers' purchase patterns (identification of pregnancy and prediction of due date), e.g., a daughter was sent baby ads, upsetting her father
-pre-crime police departments (as in the movie Minority Report) apprehending criminals based on prediction of their future deeds (thought police?)
-redlining certain neighborhoods (for insurance purposes, social services, etc.)
17 --Tene and Polonetsky (2013) make the very salient points that: In a big data world, what calls for scrutiny is often not the accuracy of the raw data but rather the accuracy of the inferences drawn from the data. Inaccurate, manipulative or discriminatory conclusions may be drawn from perfectly innocuous, accurate data.
18 --de-identification is often reversible
--privacy v. societal benefit, e.g., Tene and Polonetsky (2013) pose the following question: what if the analysis of de-identified online search engine logs enables identification of a life-threatening epidemic in x% of cases, saving y lives, assuming a z% chance of re-identification for a certain subset of search engine users? Should such an analysis be permitted?
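The z% in that question has to be estimated somehow. As a minimal sketch, and not a method taken from Tene and Polonetsky, one rough proxy is how many records in a de-identified set are unique on the attributes that remain (the idea behind k-anonymity). The rows and attribute choices below are invented for illustration.

```python
from collections import Counter

# Hypothetical de-identified log rows: (region, age band, device type).
# All values are made up for this illustration.
rows = [
    ("Izmir", "20-29", "mobile"),
    ("Izmir", "20-29", "mobile"),
    ("Izmir", "30-39", "desktop"),
    ("Ankara", "20-29", "mobile"),
    ("Ankara", "20-29", "mobile"),
    ("Ankara", "20-29", "mobile"),
    ("Ankara", "60-69", "desktop"),
    ("Tampa", "40-49", "tablet"),
]


def k_anonymity(records):
    """Size of the smallest group sharing one attribute combination."""
    return min(Counter(records).values())


def share_unique(records):
    """Fraction of records whose attribute combination appears only once."""
    counts = Counter(records)
    return sum(1 for r in records if counts[r] == 1) / len(records)


k = k_anonymity(rows)
unique_share = share_unique(rows)
print(f"k = {k}, unique share = {unique_share:.1%}")
```

A unique record is not automatically re-identified; it is only the kind of record that an outside data source could single out, so the share is better read as a warning sign than as the z% itself.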
19 No surprise that it is in the health area that privacy has received the most sympathy and attention. But even here, the US, for example, has depended on HIPAA, which is supposed to protect against disclosure of patient data. However, as Terry (2012) points out, HIPAA protects against disclosure, not against collection! He notes that a lot of traditional health information circulates in a mainly HIPAA-free zone.
20 --Harvard researchers collected data on Facebook users to study changes in their interests and friendships over time. They released the data to the world for research because it was supposed to be anonymous. Other researchers quickly found that they could de-anonymize parts of the dataset.
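One common way such de-anonymization can happen is a linkage attack: records stripped of names are joined to a public source on the attributes both share. The sketch below illustrates that general idea only; it is not a reconstruction of what was done to the Harvard dataset, and every record, name, and field in it is hypothetical.

```python
from collections import defaultdict

# Hypothetical "anonymized" release: names removed, but quasi-identifiers
# (ZIP code, birth year, gender) left in place.
released = [
    {"zip": "33620", "birth_year": 1991, "gender": "F", "interest": "jazz"},
    {"zip": "33620", "birth_year": 1992, "gender": "M", "interest": "chess"},
    {"zip": "33612", "birth_year": 1991, "gender": "F", "interest": "poker"},
]

# Hypothetical public directory carrying names and the same attributes.
public = [
    {"name": "Alice", "zip": "33620", "birth_year": 1991, "gender": "F"},
    {"name": "Bob", "zip": "33620", "birth_year": 1992, "gender": "M"},
    {"name": "Carol", "zip": "33612", "birth_year": 1991, "gender": "F"},
    {"name": "Dana", "zip": "33612", "birth_year": 1991, "gender": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Tuple of quasi-identifier values used to link the two sources."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(released_rows, public_rows):
    """Return {name: released_row} for rows matching exactly one person."""
    names_by_key = defaultdict(list)
    for person in public_rows:
        names_by_key[key(person)].append(person["name"])
    matches = {}
    for row in released_rows:
        candidates = names_by_key.get(key(row), [])
        if len(candidates) == 1:  # unambiguous link
            matches[candidates[0]] = row
    return matches


matches = reidentify(released, public)
for name, row in matches.items():
    print(name, "->", row["interest"])
```

In this toy data two of the three released rows link to a single named person, while the third stays ambiguous because two people share its attributes. Removing names alone therefore protects only those who happen to blend into a group.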
21 On the other hand: Stanford researchers discovered that taking an antidepressant drug together with a cholesterol-reducing drug raised patients' blood glucose to diabetic levels. They analyzed data in adverse effect reporting data sets to create a symptomatic footprint for diabetes-inducing drugs, then searched for this footprint in interactions between pairs of drugs. Four pairs with this effect were found, among them Paxil and Pravachol. Next they examined Bing search engine logs to see whether people who searched for both drugs were more likely to also report the symptoms than those who searched for only one drug. They found support in the data and potentially saved the lives of 1 million Americans.
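The search-log step described above amounts to comparing two rates: how often symptom terms appear among users who searched for both drugs, and how often they appear among users who searched for only one. A minimal sketch of that comparison follows. The search histories and symptom terms are invented, and the researchers' actual method and statistics are not reproduced here.

```python
# Hypothetical, hand-made search histories: one set of query terms per user.
histories = [
    {"paxil", "pravachol", "blurry vision"},
    {"paxil", "pravachol", "fatigue", "thirst"},
    {"paxil", "pravachol"},
    {"paxil", "headache"},
    {"paxil", "thirst"},
    {"pravachol"},
    {"pravachol", "recipes"},
    {"paxil"},
]

DRUG_A, DRUG_B = "paxil", "pravachol"
SYMPTOMS = {"blurry vision", "fatigue", "thirst"}  # illustrative terms only


def symptom_rate(users):
    """Share of users whose history contains at least one symptom term."""
    if not users:
        return 0.0
    hits = sum(1 for terms in users if terms & SYMPTOMS)
    return hits / len(users)


# Users who searched for both drugs versus exactly one of them.
both = [h for h in histories if DRUG_A in h and DRUG_B in h]
only_one = [h for h in histories if (DRUG_A in h) != (DRUG_B in h)]

rate_both = symptom_rate(both)
rate_one = symptom_rate(only_one)
print(f"both drugs: {rate_both:.0%}, one drug: {rate_one:.0%}")
```

A real analysis would need far larger samples and a significance test before reading anything into the gap. The sketch also shows why the privacy question arises: the comparison only works because each user's queries stay linked together.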
22 Industry not the only BD driver --In 2012 President Obama deployed a Big Data R&D initiative to advance the science and technology of managing, analyzing, visualizing and extracting information from large, diverse, distributed, and heterogeneous data sets. Terry (2012) also notes that in the future BD will come from less structured sources including "[w]eb-browsing data trails, social network communications, sensor data and surveillance data." Much of it is "exhaust data," or data created unintentionally as a byproduct of social networks, web searches, smartphones, and other online behaviors.
23 This means that with industry, social behavior, and government behind it, BD is only going to grow larger, and the privacy problems associated with it are going to grow not in tandem, but exponentially.
24 Ethics
Look beyond the law: the ethics of BD research
--does availability make it ethical?
--research ethics boards have insufficient understanding of the process of anonymizing and mining data, or the errors that can lead to data becoming personally identifiable
--effects may not be realized until many years into the future
--data contributors (e.g. social networkers) usually do not have researchers as their audience
--many have no idea of the processes currently gathering and using their data
--difference between being in public and being public
25 --even in the area of litigation, electronic discovery can uncover both criminal acts and non-criminal embarrassing acts
26 Conclusions
--BD is here to stay
--increasingly happening in the cloud, and with open access
--erasing the notion of a public/private space distinction
27 Hierarchy in the BD World
3 classes of people in the Big Data world (Manovich, 2011):
(1) those who create data (consciously or by leaving digital footprints)
(2) those who have the means to collect it
(3) those who have the expertise to analyze it (smallest group, and most privileged)
-A pyramid?
28 Tene and Polonetsky (2013) note that presently the benefits of big data do not accrue to individuals whose data is harvested, only to big businesses that use such data:
--those who aggregate and mine this data neither view their informational assets as public goods held on trust nor seem particularly interested in protecting the privacy of their data subjects. The truth lies in the opposite because the big data business model is selling information about their data subjects.
29 To make it less of a pyramid, they advocate the empowerment of individuals in controlling their information by giving them meaningful rights to access their data in a usable, machine-readable format.
Advantages:
-unleash innovation for user-side applications and services
-give an incentive to users to participate in the data economy (by aligning their own self-interest with broader societal goals)
30 What you think about this proposal will have to be a debate we are willing to undertake, today or another day! Thank you!