Morality, Ethics, and the Social Compact in a Big Data World. Presentation to Big Data Unconference, June 19, 2015. Jerrard B. Gaertner, CPA, CA IT, CGEIT, CIPP/IT, CISA, CISSP, CIA, CFI, I.S.P., ITCP
Your Presenter
My CV
CPA, CA, CA IT, CISA, CISSP, CGEIT, CFI, CIPP/IT, CIA, I.S.P.
IT strategist (big data and predictive analytics); systems assurance and certification professional; security and privacy practitioner
Co-founder: Managed Analytic Services (Vice President, Risk, Compliance and Security); Digital Legacy Institute (Executive Director); data Privacy Partners (Partner)
Adjunct Professor of Computer Science (Ryerson University)
Special Advisor: Risk Management, Analytics (U. of Toronto)
Past President, Canadian Information Processing Society
Author of 3 legal texts (Thomson Carswell)
Graduate of MIT
Big Data and analytics strategy and consulting with a business value orientation; data science as a service; analytics training and certification; domain expertise in fraud detection, continuous risk monitoring, continuous auditing, computer security, insurance, AML, expense control and financial/regulatory compliance
Canadian Information Processing Society (CIPS): National and provincial societies; established by statute; 5,000 members in Canada; Canada's IT voice internationally (United Nations, IFIP, FEAPO); professionalism in IT; protection of the public; advocacy on technology issues; certifications (national and international); college and university accreditation; public education
Digital Legacy Institute (www.digitallegacyinstitute.com): Research and standard setting; works with ISDLP to help accredit DL professionals worldwide; open to lawyers, accountants, trustees, executors, forensic practitioners, security and privacy experts and other interested parties; protection of the public; advocacy on DL technology issues; college and university accreditation; public education
Digital Legacy Institute (www.digitallegacyinstitute.com): Starting up, needs volunteers!
The 3 Y's
1. The Big Data world may exacerbate not only economic, but also political, social and even educational inequality (or not).
2. The oft-remarked death of privacy may actually occur as Big Data technology and the IoT evolve.
3. The transformative power of Big Data for social good may well be overwhelmed by the search for profit unless ethical and moral considerations of Big Data become an integral part of CSR, similar to green and evolving carbon standards.
Objective: Get Big Data professionals (and others) thinking about the Big Data world being created (good? bad?) and about how their individual actions can help raise awareness and ensure that this transformative technology ultimately benefits us all. https://www.youtube.com/watch?v=lb13ynu3iac#
Really High Level Fly-By of the Topic
What We Mean by The Big Data World
1. Ubiquitous data recording, transmission and storage: all formats, all modalities, all activities (including biometrics and IoT)
2. Application of readily available, capable and relatively easy to use/automate analytic tools for prediction, pattern recognition, modelling, segmentation, semantic/contextual analysis and visualization of the data
3. Seamless integration of neural networks, artificial intelligence and machine learning with analytic tools
4. Enabled by virtually free storage and data transmission and by robust technological infrastructure (open source and commercial)
Morality: The differentiation of intentions, decisions and actions between those that are good or right and those that are bad or wrong. Morality can be a body of standards or principles derived from a code of conduct from a particular philosophy, religion, or culture, or it can derive from a standard that a person believes should be universal. Morality may also be specifically synonymous with "goodness" or "rightness."
Ethics I: Ethics is the branch of philosophy that involves systematizing, defending, and recommending concepts of right and wrong conduct. As a branch of philosophy, ethics investigates the questions "What is the best way for people to live?" and "What actions are right or wrong in particular circumstances?"
Ethics II: In practice, ethics seeks to resolve questions of human morality by defining concepts such as good and evil, right and wrong, virtue and vice, justice and crime. As a field of intellectual enquiry, moral philosophy is also related to the fields of moral psychology, descriptive ethics, and value theory.
Social Compact (A Somewhat Archaic Term): A usually implicit agreement among the members of an organized society, or between the governed and the government, defining and limiting the rights and duties of each. The idea that power derives from the people, who must consent to be governed.
Definitions: Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. Challenges include data capture, curation, storage, search, sharing, transfer, analysis and visualization. Big Data is about identifying contributing processes and factors (correlation) without necessarily understanding completely what these factors are or whether they are causative.
Definitions: Big data is a shorthand label that typically means applying sophisticated statistical tools, or the techniques of artificial intelligence, such as machine learning, to vast new troves of data beyond that captured in standard databases. The new data sources include web browsing data trails, social network communications, sensor data and surveillance data. - Steve Lohr, New York Times reporter, "How Big Data Became So Big," August 12, 2012
Big data is characterized by the volume, variety and velocity of its data. Some people also include validity or veracity as a fourth V. Big data solutions optimally consist of innovative, cost-effective forms of information processing, providing enhanced insight and decision making. - Adapted from Gartner Inc., S. Sicular
Big Data World: Ethics and Morality; Social Compact, Legal and Regulatory, Inalienable Rights
10 Questions: 1. Does the use of big data to micro-target voters, and to refine candidates' platforms so as to be optimally palatable to the greatest number of voters, enhance, detract from or have little impact on the exercise of democracy? Logical extension: Asimov's Foundation.
Is this the Future of Democracy?
10 Questions: 2. As Big Data permits us to know more and more about the end consumer, the following progression may occur: first we satisfy needs (shelves are stocked), then we anticipate needs (get pumpkins before Halloween), then we extrapolate needs more broadly (make sure to get costumes too), then finally we create needs, based on our intimate understanding of the target consumer (create the new holiday "Thanksturkeyween," which falls between Halloween and Thanksgiving and provides another opportunity to buy pumpkins, turkeys and costumes). When does marketing become manipulation?
10 Questions: 3. Detailed tracking of students' progress provides an opportunity to intervene earlier if difficulties arise. It also provides an opportunity to out-stream troublesome students who cost the system more (time, effort, resources) and may drag down standardized test scores. How do we balance the good of the many (the system) with the good of the few, or the one?
10 Questions: 4. Predictive policing provides an opportunity to crime-proof a neighbourhood at risk of being invaded by gangs (or homeless people, or prostitutes, or undesirables). However, the gangs (or homeless people, or prostitutes, or undesirables) simply move to another neighbourhood, one which was not identified in the prediction. Who decides which neighbourhood is protected and which neighbourhood gets the crime? Who decides if we draw the line at gangs, sex workers or the homeless? Does predictive policing provide a politically expedient way of burying a social problem, or at least of making sure it doesn't upset the powerful?
10 Questions: 5. How much value do proprietary stock market models add? Do they (including HST) simply tilt the odds unfairly toward the well-connected and large-portfolio clients?
10 Questions: 6. The hypocrisy of differential enforcement of laws (at the policing level) and differential access to justice based on social status and wealth is well recognized. The Minister's mistress is rarely arrested for soliciting, whereas the desperate single mother begging in front of an upscale restaurant surely will be. Big Data provides an opportunity to monitor individual behaviour (at work, online, financially, recreationally) and to identify infractions which previously would have gone unnoticed. Does this provide an opportunity for more effective law enforcement, or rather for more effective differential enforcement of laws as a tool of power and social manipulation?
10 Questions: 7. Will people with bad genes, bad driving habits or bad finances, or who live in bad neighbourhoods, continue to benefit from the insurance and social philosophy of shared risks, or will Big Data be used to benefit the unneedy and penalize those who most require risk-sharing? Will "pay according to your risk profile" replace "all for one and one for all"?
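The contrast raised in Question 7 can be made concrete with a small arithmetic sketch. The four policyholders and their expected annual claims below are invented purely for illustration and do not come from the presentation or from any insurer's data; the sketch only compares one pooled premium with premiums set to each person's own expected claims.

```python
# Hypothetical expected annual claims per policyholder (illustrative values only).
expected_claims = {
    "low_risk_a": 200,
    "low_risk_b": 400,
    "medium_risk": 600,
    "high_risk": 2800,
}

# Shared risk ("all for one and one for all"): everyone pays the average claim.
pooled_premium = sum(expected_claims.values()) / len(expected_claims)

# Pay according to your risk profile: each person pays their own expected claims.
risk_based_premium = dict(expected_claims)

# Positive change = pays more once pooling is replaced by risk-based pricing.
premium_change = {
    name: risk_based_premium[name] - pooled_premium for name in expected_claims
}

for name, change in premium_change.items():
    print(f"{name}: pooled {pooled_premium:.0f}, "
          f"risk-based {risk_based_premium[name]}, change {change:+.0f}")
```

Under these assumed numbers the total collected is the same either way; only who pays changes, with the high-risk policyholder's premium rising from 1,000 to 2,800 while the lowest-risk premium falls to 200.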
10 Questions: 8. Big Data can power extremely rapid and granular dynamic pricing, from road tolls to volume discounts. As this type of pricing becomes the norm, who is most likely to benefit and who is most likely to pay more?
Farecast
10 Questions: 9. Continuous, ubiquitous collection of data, even if not directly related to an individual, can easily be used to characterize families and neighbourhoods. In many cases, indirect data (such as VISA transactions) can nevertheless be used to identify, track and characterize individuals with great specificity. Have we lost the battle for personal privacy and the right to control our own information, or does some hope remain? (Some scholars believe that without privacy, democracy itself cannot effectively exist.)
10 Questions: 10. Some scholars believe that Open Government empowers enterprise users commercially, while not providing a similar advantage to the citizenry. Furthermore, very few embodiments of Open Government are truly open: censorship, purposeful mal-formatting of data, omissions and inconsistent data make analysis of government effectiveness and honesty (the main reason for Open Government!) very difficult. What types of mechanisms, standards and the like might be instituted to address these concerns?
Privacy: Issues of data ownership (and data value) have not been agreed. In fact, there are vast differences even among US states. The cost of monitoring an individual has dropped from $275/hr. to about $0.06/hr. over the last 5 years. Big Data puts all data assets in one basket, nominally increasing the risk of a data breach. In addition, security and privacy controls are often overlooked for offline data lakes.
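The scale of the monitoring cost drop quoted above can be worked out directly. The sketch below uses only the two hourly figures from the slide; the assumption of round-the-clock monitoring over a 365-day year is added here for illustration.

```python
# Hourly monitoring costs as quoted on the slide.
cost_before_per_hour = 275.00
cost_now_per_hour = 0.06

# Assumption for illustration: continuous monitoring, 24 hours a day for 365 days.
hours_per_year = 24 * 365

reduction_factor = cost_before_per_hour / cost_now_per_hour
cost_before_per_year = cost_before_per_hour * hours_per_year
cost_now_per_year = cost_now_per_hour * hours_per_year

print(f"Cost fell by a factor of about {reduction_factor:,.0f}")
print(f"One year of continuous monitoring: ${cost_before_per_year:,.0f} before, "
      f"${cost_now_per_year:,.2f} now")
```

On these figures, a year of continuous monitoring that would once have cost millions of dollars now costs a few hundred, which is why monitoring can shift from selected individuals to whole populations.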
Big Data has Special Risks
1. Concentration creates high-value targets
2. Where did each element come from? Is it accurate, unique, current? Data quality issues are significant
3. Lower established reliability, less familiarity and greater inherent complexity increase the risk of error
4. Logical analysis and process re-performance are not always possible. Untestable processing leaves residual risk
5. The ETL process can be complex and time consuming
6. Online and offline processes pose different risks
7. Big Data sometimes falls between the cracks in the application of security and privacy policies
Big Data Risks: Concentration, Conversion (ETL) and Data Quality Risks; Staff Lack Familiarity and Training; Lack of Proven Reliability and 3rd Party Certification; Difficult to Test in Conventional Ways; Few Security and Privacy Tools; Architectural Complexity; Unrealistic Expectations and Pressure to Produce
Accuracy and Reliability: Don't forget that garbage-in, garbage-out applies to Big Data as well. Issues of accuracy, timeliness, provenance and completeness can affect the conclusions drawn and the accuracy of models. Applying statistical tools and techniques does not eliminate the risk of results tainted by bias; often the bias is simply more difficult to detect (for example, in the sampling methodology). More data and more sophisticated algorithms do not guarantee better results (as many seem to believe).
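A minimal sketch of the point about sampling methodology follows. The population and both samples are invented for illustration: a large sample that systematically misses part of the population is compared with a much smaller sample spread evenly across it.

```python
from statistics import mean

# Hypothetical population: 10,000 people with values 1..10,000 (arbitrary units).
population = list(range(1, 10_001))
true_mean = mean(population)

# Large but biased sample: the bottom 40% of the population is never observed
# (for example, people who leave no digital trail).
biased_sample = [x for x in population if x > 4_000]

# Small but representative sample: every 100th person across the whole range.
representative_sample = population[49::100]

biased_error = abs(mean(biased_sample) - true_mean)
representative_error = abs(mean(representative_sample) - true_mean)

print(f"True mean: {true_mean}")
print(f"Biased sample: n={len(biased_sample)}, error={biased_error}")
print(f"Representative sample: n={len(representative_sample)}, "
      f"error={representative_error}")
```

In this constructed case the biased sample is 60 times larger than the representative one, yet its estimate of the mean is far worse. The bias comes from how the records were collected, so adding more records gathered the same way would not reduce it.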
Miscellaneous: Big Data may be changing experimental science (away from hypothesis-based science). [I doubt this and believe the changes are not causing a drastic rethink of the scientific method.] Many believe Big Data is increasing the digital divide between the techno-savvy and others. Recognition of a new asset category (data), and of the value of nascent patterns and insights as well. IoT will result in huge increases in the amount of data available for analysis.
Miscellaneous: Big Data remains a process and not a technology or technique. Many implementations fail because they do not recognize this.
Implementation Requires: People; Policies and Procedures; Knowledge and Skill; Time and Money; Receptive Culture; Leadership and Commitment; Monitoring and Feedback; Incentives and Disincentives; Success Stories
Big Data offers competitive advantage: Analytics and Big Data are providing more and more examples of improved decision making, more focused business strategy, better marketing, more efficient processes, faster detection and correction of anomalies, superior R&D, greater profitability, more satisfied stakeholders and better outcomes in general, versus conventional decision making, intuition, and less formal analysis.
Commercial Successes - Goldcorp
Amazing Growth and Competitive Advantage: Monsanto Buys Weather Big Data Company Climate Corporation For Around $1.1B. Climate Corp less than 2 years old! http://techcrunch.com/2013/10/02/monsanto-acquires-weatherbig-data-company-climate-corporation-for-930m/
Common Applications: Fraud and money laundering detection; continuous auditing; marketing and sentiment analysis; credit scoring; quality control; investment management; customer service improvement; increasing productivity of existing assets; fleet and logistics optimization; recommendations; call centre management; diagnosis
That's all for now
Questions?
Contact Information: Jerry Gaertner, 416 505-0307, 658 Danforth Avenue, Suite 209, Toronto ON M4J 5B9, jerrard.gaertner@managedanalytics.com, jgaertner@cips.ca