Data platforms to support research, evaluation & practice. David V Ford Professor of Health Informatics School of Medicine, Swansea University



Similar documents
Data Appliance Sailing to Data Islands

Health Information Research Unit. Secure Anonymised Information Linkage (SAIL) system

Big Data and the social sciences a perspective from the ESRC. Peter Elias

Dundee e-health Centre of Excellence

Project Assured Data Access. Henry Hughes

Individual Referencing Systems. Anthea Springbett Programme Principal SHIS-R NHS Information Services Division

Informatics: Opportunities & Applications. Professor Colin McCowan Robertson Centre for Biostatistics and Glasgow Clinical Trials Unit

SAIL Person-level Export File Structure & Data Transfer

SAIL Address-level Export File Structure & Data Transfer

Farr Institute of Health Informatics Research Harnessing Data for Health Science and e-health Innovation

The Research Capability Programme. Peter Knight, Group Programme Director

SAIL Person-level Export File Structure & Data Transfer

Big Data for health. Farr Institute, Administrative Data Research Centres, Medical Bioinformatics. 9 July Jacky Pallas, UCL

Health and Social Care Information Centre

SAIL Address-level Export File Structure & Data Transfer

ESRC Big Data Network Phase 2: Business and Local Government Data Research Centres Welcome, Context, and Call Objectives

Driving ehealth Innovation: The Economic Development Model by David V Ford. Med-e-Tel Luxembourg, 6 8 April 2011

UK Biobank. Integrating electronic health records into the UK Biobank Resource. Version

Developments to the NHAIS System

Digital Health: Catapulting Personalised Medicine Forward STRATIFIED MEDICINE

ehealth Research HEADLINES

Assessing the medical costs of injury using linked anonymised datasets. The impossible is often the untried

Secure networking and AAAI

Advanced research computing

Information Governance. A Clinician s Guide to Record Standards Part 1: Why standardise the structure and content of medical records?

CONSUMER DATA RESEARCH CENTRE DATA SERVICE USER GUIDE. Version: August 2015

Documenting the research life cycle: one data model, many products

CORPORATE OVERVIEW. Big Data. Shared. Simply. Securely.

Stakeholders meeting. Ethical protocols and standards for research in Social Sciences today

Enhancing Discoverability of Public Health and Epidemiology Research Data: Summary

THE UNIVERSITY OF MANCHESTER PARTICULARS OF APPOINTMENT FACULTY OF MEDICAL & HUMAN SCIENCES INSTITUTE OF POPULATION HEALTH

BioGrid s use of Business Analytics for Collaborative Medical Research. Maureen Turner, CEO, BioGrid Australia

Electronic Health Records Research in a Health Sector with Multiple Provider Types

Keystones for supporting collaborative research using multiple data sets in the medical and bio-sciences

Effective Data Integration - where to begin. Bryte Systems

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Data Management Planning

The Internet of ANYthing

Internet of Services. What is the Future Internet for us? Pnina Vortman IBM Haifa Research Laboratory

SURE: Helping to get the most out of routinely collected data

Civica Health & Social Care

Metadata driven framework for the Canada Research Data Centre Network

The UK Administrative Data Research Network: Improving Access for Research and Policy. Report from the Administrative Data Taskforce December 2012

Business Intelligence Solutions Architect Band 8a

A discussion of information integration solutions November Deploying a Center of Excellence for data integration.

Big Data and Analytics: Challenges and Opportunities

Achieving a Single Patient View. Eric Williams Software Practice Sun Microsystems UK Ltd.

A Commercial Approach to De-Identification Dan Wasserstrom, Founder and Chairman De-ID Data Corp, LLC

Embedding Customized Data Visualization and Analysis

Autonomy Consolidated Archive

Should we be making better use of public data in health research? Paul Boyle. The value of routine administrative data

HP Adaptive Backup and Recovery

Big Data Visualization and Dashboards

Migrating to TM1. The future of IBM Cognos Planning, Forecasting and Reporting

WebFOCUS Cloud Express. The WebFOCUS Cloud Express service is delivered as a managed G-Cloud service by Amtex Solutions Ltd.

Integrating Netezza into your existing IT landscape

UK Data Service: Data-driven research - Access, analyse, evidence

Report to UCLH Board Bryan Williams MD FRCP FESC FAHA

COMP9321 Web Application Engineering

G-Cloud Healthcare Analytics Service. October G-Cloud. service definitions

Linking Administrative and Survey Data for Statistical Purposes

Microsoft Business Intelligence solution. What makes Microsoft BI difference

Distributed Networking

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Deputy Director, Mental Health and Protection of Rights Division, Scottish Government

Monitoring the health of our Nation s children

ONS Big Data Project Progress report: Qtr 1 Jan to Mar 2014

SharePoint Seminar. Explore and Learn. Integrated Business IT Services.

Intelligent document management for the legal industry

Ad Hoc Analysis of Big Data Visualization

, Head of IT Strategy and Architecture. Application and Integration Strategy

Geo Analysis, Visualization and Performance with JReport 13

Making Data Citable The German DOI Registration Agency for Social and Economic Data: da ra

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing.

Making communication in healthcare effective and compliant Integration via the HL7 Interface Engine by Somnath Mukherjee

Luncheon Webinar Series May 13, 2013

From Fishing to Attracting Chicks

Searching biomedical data sets. Hua Xu, PhD The University of Texas Health Science Center at Houston

INTERNET DATA SAFE SOLUTIONS TURNKEY AND CUSTOM MADE

1 Executive Summary Document Structure Business Context... 5

Establishing a Data and Information Governance Program at the Cancer Institute NSW. Data Governance 2012 Narelle Grayson

Data management plan

How To Create A Health Research Data Repository

NHS public health functions agreement Service specification no.20 NHS Newborn Hearing Screening Programme

Data Quality: Review of Arrangements at the Velindre Cancer Centre Velindre NHS Trust

Top 5 reasons to choose HP Information Archiving

Big Data Analytics Service Definition G-Cloud 7

OpenAIRE Research Data Management Briefing paper

North Highland Data and Analytics. Data Governance Considerations for Big Data Analytics

Using Version Control and Configuration Management in a SAS Data Warehouse Environment

Data Quality Arrangements in the Screening Division Public Health Wales NHS Trust. Issued: August 2012 Document reference: 348A2012

Perceptive Software: Corporate Profile

SHARING RESEARCH DATA POLICY, INFRASTRUCTURE, PEOPLE

Civica Health & Social Care Paris EPR and Case Management. Transform the way you work

Health Information and Quality Authority. To drive continuous improvements in the quality and safety of health and social care in Ireland

SAS CLOUD ANALYTICS MAY 2015

Big Data 101: Harvest Real Value & Avoid Hollow Hype

Big data coming soon... to an NSI near you. John Dunne. Central Statistics Office (CSO), Ireland

1 About This Proposal

Transcription:

Data platforms to support research, evaluation & practice David V Ford Professor of Health Informatics School of Medicine, Swansea University

Outline 1. Swift overview of SAIL Databank as used in Wales 2. Everything we know in a box Research Data Appliances

The SAIL Databank A national e-research platform for Wales David Ford Professor of Health Informatics College of Medicine Swansea University

SAIL: 5 Minute Overview SAIL= Secure Anonymised Information Linkage Databank of >9 billion recordings on the people of Wales Data on >5 million people 500+ feeder systems from Wales, inc >350 GP (75%) Much data goes back 10-20 years All pre-linked 5m+ investment in high performance IT Strong privacy protection & IG Remote access from anywhere, given approvals Industrial strength, reusable infrastructure.

SAIL: Data resources Core holdings: Annual District Birth Extract (ADBE) Annual District Death Extract (ADDE) Bowel Screening Wales (BSW) Breast Test Wales (BTW) Cervical Screening Wales (CSW) Congenital Anomaly Register and Information Service (CARIS) Emergency department Data Set (EDDS) National Community Child Health Database (NCCHD) Outpatient Dataset (OPD) Patient Episode Database for Wales (PEDW) Primary Care GP dataset (360 practices, 2m people, 75% of practices) Welsh Cancer Intelligence and Surveillance Unit (WCISU) Welsh Demographic Service (WDS) Project-specific holdings: Clinical trials participants Conventional cohort participants Cross sectional survey participants Many others Reference data: Data quality reports Extract histories Coding and mapping information Metadata Organisation codes

SAIL features 1. Built to consume (new data) swiftly 2. Secure sharing for projects, based on views and a UKSERP instance (more later). 3. Total population coverage 4. Used for observational research, trial feasibility, outcome data for trials, extending cohorts, natural experiments.. 5. Lots of trial and cohort study participants embedded in SAIL

SAIL: 5 Minute Overview 1. Currently supporting externally funded projects with value > 90m 2. >100 Peer-reviewed publications to date 3. 140+ approved and active SAIL projects. 4. Over 300 registered users, with 150 active today from across the world 5. 100 staff in Swansea working on Health Informaticsrelated projects 6. Average 35 day turnaround from application to data 7. Applications open to all

SAIL: 5 Minute Overview 1. Now new UK Centre of Excellence in HI CIPHER Centre for Improvement in Population Health through E-records Research ( 4.4m). Partners: Cardiff Bristol, Sussex, Australia, Canada 2. CIPHER now one of 4 UK-wide Farr Institute Centres (+ 5m) 3. New award: CADRE Centre for Administrative Research & Evaluation ( 8.0m) ESRC, as part of the ESRC ADRN and its Big data initiative. Partners: Cardiff

SAIL: 5 Minute Overview New Data Science building @ Swansea Security and privacy protection from the ground up Funded by MRC (Farr); ESRC (ADRN) and Welsh Government High security home to Farr@Swansea and CADRE Office space for NHS and other public sector staff and industry to work alongside university staff

Research Data Appliances & UKSERP Everything we know, in a box

5 Minute Overview Now in beta test: 1. UK Secure E-Research Platform (UKSeRP) based on the SAIL Gateway massively extended to provide a secure platform for data sharing across the UK not just SAIL data (for Farr and ADRN). 270001 certified and IG Toolkit compliant 2. National Research Data Appliances: New concentrator technology to gift to NHS and other organisations, including automated matching, anonymisation, data management, metadata capture, data quality assessment, etc. 3. New focus on the capture and analysis of electronic free text data (on-board NLP in the Appliances) 4. Funded by MRC Farr institute and ESRC ADRN grants

Research Data Appliance Brings SAIL's capabilities onto combined hardware and software Shrink wrapped, ready to go. Any size from data stick to massive! Easy to use, low expertise barrier Multiple Appliances work together as a larger whole Purposes: concentrate data, make it research ready Provide utility to data owners and partners Provided free (limited numbers) to our collaborators

RDA Usecase Bring data together within one IG defined organisation: Upload Link Anonymise (optional) Measure Catalogue Share Pool Manage & Report Analyse All under strong IG control

A Simple RDA deployment model Bring data together within one IG defined organisation: Upload Link Anonymise (optional) Catalogue Share Report Analyse RDA3 This could be a UKSe-RP instance Bring data together within one IG defined organisation: Upload Link Anonymise (optional) Catalogue Share Report Analyse RDA1 Bring data together within one IG defined organisation: Upload Link Anonymise (optional) Catalogue Share Report Analyse RDA2

Research Data Appliances Concepts (Presented at SHIP 2013) Benefits to health partners: Data concentrator with file store and database capabilities On board ETL Very easy to use (low technical barriers) Built on best privacy practice principles On-board high quality probabilistic identity matching system Integrated automated Metadata Catalogue and data quality reporting Full IG control locally - local data for local reuse. Robust standardised anonymisation (as required) NLP facilities option for unstructured (free text) data stores

Self management

Data schema automatically computed based on data contained in uploaded file

Publish Dataset Depend on Configuration/Capabilities. Data will now be available

Data Catalogue Key Component Additional points following previous sessions: All DA carry a DC, DS can inherit from other DS DC entries, DC related to Programme/Security domain. DC s replicate to Regional/Global DC. Road map: DC used to define and create DS

A Dataset Contact Specific version & Date Request VIMO All section attach files Theme / Type / Level Tags

A Dataset (cont.) DDI, SPSS, SAS, STATA

Data Catalogue a specific table

Key features of relevance Appliances federate with each other to create a sharing network Network can be hierarchical or peer-to-peer Full IG controls available at every point Metadata builds automatically and publishes to a global catalogue Data quality measurement automated NLP address rich, free text datasets, converting them to SNOMEDCT High quality identity reconciliation automatic, de-identification optional UKSERP provides scalable, performant analytics platform, with full IG controls

Questions?