1 Data Governance in Large-scale Biomedical Research: Tools, Methods and Policy Goals Pilar N. Ossorio, Ph.D., JD University of Wisconsin Morgridge Institute for Research
2 Road Map What is governance? From the specific to the general Specific: sharing sensitive human data through controlled access repositories General: some observations on the future of governing human data in biomedical research
3 Governance A Very Brief Introduction
4 What is governance? Processes and interactions through which a collection of actors seeking to solve a problem or operate an institution generate, reinforce, revise, and enforce rules, policies, and norms of behavior.
5 Governance Values To achieve legitimacy (for all public gov systems) Institutions and rules should be: fair, efficient, effective, well-reasoned, responsive, justified, locally-informed, adaptive (problem solving) Processes should be: deliberative, inclusive and participatory, transparent, and reasonable Decision makers should have accountability Identify other values specific to biomedical research and health care Goal: Simultaneously advance as many values as possible; prioritize, balance, or trade off as necessary.
6 Biomed Research & Data Governance Privacy and Respect Advance scientific knowledge Promote Human Flourishing Improve healthcare systems, improve health Achieve a politically and socially functional set of power relations in society
7 Governance Tools Statutes & Regulations Contracts (e.g., data use agreements) Professional and institutional policies and norms Reputational incentives Citizen monitoring Market mechanisms Certifications, insurance, other??? Technical infrastructure
8 The Particular Controlled Access Data Sharing
9 Sharing Human Data Fundamental tension: Promote science through sharing while protecting the interests of people whose data are in the repositories. Minimize informational harm and ensure that risks are justified. Respect & promise keeping Advancing science
10 Resolving the Tension Open Access Controlled Access
13 Repository Governance at a Glance Data Producers (data deposition agreements and rules; IRB review) → Repository/NIH (data curation, quality control, policies for data access and use, DACs, compliance assessments) → Data Users (data use agreements and rules)
14 Reporting on data accessed up to Dec. 1, studies had deposited datasets; 2221 investigators had accessed at least one dataset; 924 publications cited a dataset from dbGaP
15 GWAS/GDS Data Access & Use Different datasets have different data use restrictions because: They came from studies with different consent processes and promises to participants; Different ICs have slightly different policies for accessing data for which they are responsible; Large, multisite studies and common fund projects can have their own specific data sharing policies, particularly for prepublication data sharing.
16 The Project Interviewed 30 dbGaP users between 2009 and … Additional background interviews with DAC members. Total data users: 30. Gender: female 11, male 19. Professional affiliation: biomedical 17, computational 11, other 2. Institution: university 25, firm 5.
17 Research Questions Do data users think controlled access strikes a reasonable balance between promoting science and protecting/respecting data sources? Do data users know the use restrictions on dbGaP data generally, and on their data in particular? What factors influence users' willingness or ability to follow the rules? What steps do researchers take to assure compliance?
18 One Key Finding Biomedical scientists and data scientists had very different experiences with and views of the controlled access process!
19 I think the burdensomeness [of controlled access] is not necessarily a bad thing. It's, it's a filter... And so, so people have to show a commitment to being serious about the data, which probably is correlative with being serious about its protection. And you also, usually, have to have time and money to be able to do this. So, grant and teaching release, and so other people have said, "Yes, you are a credible researcher," in order to be able to go through this process.
20 Even if the mechanism for allowing controlled access seems great on paper, the implementation is complicated... This was the NIH people reviewing our requests and for reasons that are beyond me turning us down. I don't have time for this! And so it was a real turn down. And, it sounds like why... from my point of view it's why, why, why should I be denied access to this data!
21 Bioscientists vs Data Scientists Bioscientists are accustomed to other oversight regimes that impose delays on research while data scientists are much less familiar with such oversight. Bioscientists are more likely to plan research with delays in mind. Bioscientists are more likely to have expertise at navigating oversight processes, or they have somebody with such knowledge in their local professional networks.
22 Bioscientists vs Data Scientists Bioscientists are more likely to have a commitment to studying one particular disease... They may need fewer datasets, and even if they access many, those datasets are often overseen by the same data access committee. A bioscientist's longstanding commitment to studying a particular disease might make her/him more willing to persevere in the face of obstacles to data access.
23 Different Professional Networks! M.E.J. Newman (2004) PNAS, 101 (Supp.1):
24 Bioscientists vs Data Scientists Different professional networks! Bioscientists have someone they can call when things go wrong; NIH is not a bunch of faceless bureaucrats to them. Bioscientists are more likely to know someone who played a role in developing the policy. A bioscientist who spends months getting access to the data in dbGaP has a competitive advantage over her peers, whereas a data scientist may be at a professional disadvantage when compared to peers who work on completely open data.
26 Compliance and Professionalism Why do people comply with data use restrictions? Serious bioscientists do not violate participant trust or poison the well for their peers, so they follow the rules for protecting participants. Serious data scientists do not permit breaches of data security or data use restrictions. [Diagram labels: Biomedical, Computational, Serious Scientist, Compliance Behavior]
27 Two Lessons Learned Inclusiveness, participation, and transparency matter in research data governance! Inattention by the designers of the GWAS/GDS policy to the networked/nodal nature of data governance has led to some of the most significant delays in data access.
28 Broadening the Vision: Data Governance Going Forward
29 dbGaP is so 2014!
30 Going Forward There will be increasing complexity and pluralism with respect to: The types of data used in biomedical research; The routes by which researchers will access data; The types of actors who will be involved in data governance; The computational approaches to sharing and analyzing data; Size of studies and complexity of study design.
31 Complexity and Governance Hierarchical, integrated, comprehensive governance regimes with extensive formal law → networked, departure from hierarchy, more public-private partnerships, fewer formal and enforceable rules and more soft law → experimentalist governance
32 Experimentalist Governance Openness to participation by a very broad range of actors or potential actors; Extensive deliberation; Broadly agreed common problem and articulation of a framework setting forth open ended goals; Implementation by actors with local or contextual knowledge; Continuous feedback, data collection or reporting, and monitoring of progress/effectiveness; Evolution of rules.
33 Tailor Data Governance to Fit
34 Conceptualizing the Enterprise Frame the problem carefully: Privacy v. utility OR proper channeling of the right kind and amount of data to the right people? Who owns the data? OR who controls its collection, use, and redistribution, and how do we make that decision?
35 Legal Complexities of Research Data Governance U.S. Federal Law: HIPAA, FCRA, GINA, COPPA, Privacy Act, Federal Trade Comm. Act, IP Law. State medical info privacy law, other state law, and state regulations. U.S. Regulations: Common Rule, HIPAA Privacy Rule, HIPAA Security Rule, FDA medical device regulations, FDA Health IT regs.
36 Law How can compliance with a patchwork of sector-specific laws be simplified? Do we need some harmonization? Are there conflicts among the applicable laws? Do we need special data use authority for health care researchers or the health care industry? If so, who would receive such authorities? Office of the President, Big Data: Seizing Opportunities, Preserving Values, at ig_data_privacy_report_may_1_2014.pdf
37 Law Or, do we need uniform (not sector-specific), but very flexible, data use and privacy laws? To what extent do or should fair information practices principles undergird all data use? Notice/Awareness; Choice/Consent; Access (my access to info about me); Integrity/Security; Enforcement/Redress
38 Acknowledgements Michael Kloman: transcriptions; Yao Zhou and Audrey Tluczek: coding; Sara Heyn: research; The Data Discussion Group; Center for Predictive Computational Phenotyping. Interviews supported by NIH R03HG from NHGRI.