Karl Lum Partner, LabKey Software klum@labkey.com. Evolution of Connectivity in LabKey Server





Connecting Data to LabKey Server
  Lowering the barrier to connect scientific data to LabKey Server
    Increased flexibility in routing data
    Cooperating with other systems
    Giving users more options with their data
  Historical look at LabKey Server connectivity
  Focus on some recent changes (REDCap, FreezerPro, ETL)
  Future directions

2003-2005 Data Pipelines
  LabKey Software focused on proteomics
  CPAS server processing MS2 runs through the data pipeline
    Data uploaded through the browser and results saved to the database
    Pipeline tasks could parse specific data formats
  Analysis of flow data
    FCS files processed via the pipeline
  Data entered into LabKey Server tables through web forms
  LabKey Server was the database of record

2005 Connectivity Summary
  [Diagram of LabKey Server connectivity: Data Pipeline, Form Entry, Java Module]

2006 Study and Assay Data
  Collaboration with SCHARP on the Atlas Portal
    Many data types associated with HIV/AIDS research
    Lots of study and assay data
    CRF and specimen data imported through the pipeline
    Assay data consisted of machine-generated data files
  Assay framework and GPAT
    Imports data from spreadsheets or tab-separated text files
    No built-in specialized analysis or visualizations
    Appropriate for both raw and analyzed results
    Tool to infer fields from first file
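The "infer fields from first file" idea can be illustrated with a minimal Python sketch. This is not LabKey Server's implementation; the function names, the type labels, the date format, and the sample rows are all assumptions made up for illustration. It picks the narrowest type that every non-empty value in a column can be parsed as.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest (hypothetical) type label that fits every non-empty value."""
    def all_parse(parser):
        for v in values:
            if not v:  # skip blank or missing cells
                continue
            try:
                parser(v)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "double"
    # Assumption: dates are written as YYYY-MM-DD.
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "dateTime"
    return "string"


def infer_fields(tsv_text):
    """Map each column name in a tab-separated file to an inferred type label."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    return {name: infer_type([row[name] for row in rows])
            for name in reader.fieldnames}


# Made-up sample file, for illustration only.
sample = (
    "ParticipantId\tVisitDate\tCD4\tViralLoad\n"
    "P001\t2006-03-14\t512\t1.5e4\n"
    "P002\t2006-03-15\t\t980\n"
)
fields = infer_fields(sample)
```

Note that a column with no values at all falls through to "integer" in this sketch; a real tool would likely default such a column to text.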

2007-2008 APIs and Simple Modules
  Needed many custom applications for Atlas
    Java modules were complex to build and maintain
    Build custom applications without the module overhead
  LabKey APIs & Simple Modules
    Lowered the extensibility barrier
    Insert, update, delete programmatically
    Module-based assays allowed easy entry into the assay framework
  Lists
    Create tables in LabKey Server and integrate with existing data
    Easily import file-based data through the browser
    Tools to infer fields from files

2008 Connectivity Summary
  [Diagram of LabKey Server connectivity: Client API, Data Pipeline, Form Entry, File upload, Java Module, Simple Module]

2008-2009 External Schemas
  Support for connecting to data sources not in the LabKey Server schema
    Relocating the data is no longer required
    LabKey Server security could be applied
    Editing of external tables through the LabKey Server UI can be enabled
  Supported data sources: SAS, PostgreSQL, Microsoft SQL Server, Oracle, MySQL

2010-2012 APIs and Remote Connections
  LabKey Software continues to refine APIs
    Additional language bindings for Perl and Python
    Polish module-based tools
  Remote connections
    LabKey Server as an external data source
    Connectivity through the LabKey Server API
    Folder-level granularity

2012 Connectivity Summary
  [Diagram of LabKey Server connectivity: Client API, External SQL Data Sources, Data Pipeline, Form Entry, File upload, Java Module, Simple Module, Remote Server]

2013-2014 External Application Integration
  REDCap
    Web application for building and managing online surveys and databases
    Developed and distributed by Vanderbilt University
    Popular in the academic and research community for designing clinical and translational research databases

2013-2014 External Application Integration
  International Center of Excellence for Malaria Research (ICEMR) at the University of Washington
    Demographic and clinical data in REDCap
    Wanted their REDCap data integrated into their LabKey Server
      Visualizations
      Queries
      Integration with experimental data

2013-2014 External Application Integration
  Data needed to be synchronized from REDCap to the LabKey Server
  REDCap API allowed programmatic and secure access to the projects of interest
  Data is extracted and saved in a format that can be imported into a LabKey Server study
  Scheduled automatic import
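The "extract and save in an importable format" step might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the actual REDCap integration: it assumes records have already been fetched as a list of dictionaries, that the subject identifier lives in a field named `record_id`, and that the target file is tab-separated with a `ParticipantId` column. The function name and sample records are hypothetical.

```python
import csv
import io


def records_to_tsv(records, id_field, columns):
    """Write exported records as tab-separated text keyed by ParticipantId.

    records  -- list of dicts, one per exported record (assumed shape)
    id_field -- name of the field holding the subject identifier
    columns  -- data columns to carry over, in output order
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["ParticipantId"] + list(columns))
    for rec in records:
        # Missing values become empty cells rather than raising an error.
        writer.writerow([rec[id_field]] + [rec.get(c, "") for c in columns])
    return out.getvalue()


# Made-up records, for illustration only.
exported = [
    {"record_id": "1", "age": "34", "sex": "F"},
    {"record_id": "2", "age": "41"},
]
tsv = records_to_tsv(exported, "record_id", ["age", "sex"])
```

A scheduled job could run the fetch and this conversion together, then hand the resulting file to the study import.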

2013-2014 External Application Integration
  FreezerPro
    Commercial web application for frozen specimen inventory management
    Supports various sample types
    Tracks location and availability of specimens
    Allows user-defined fields
    Users can create custom reports and export data

2013-2014 External Application Integration
  Novo Nordisk Type 1 Diabetes Research Center
    Uses FreezerPro to manage their research specimens
    Needed their specimen inventory integrated into LabKey Server
      Combine with experimental data
      Queries
      Visualization

2013-2014 External Application Integration
  API access to the remote FreezerPro server
  LabKey Server uses secure storage to encrypt the FreezerPro credentials
  Inventory information is imported directly into LabKey Server
    Uses the data pipeline
    Study specimen repository
  Users control field mapping, filtering, and the synchronization schedule

2013-2014 ETL Framework
  ETL stands for extract, transform, and load
  Developed as part of HIDRA (Hutch Integrated Data Repository & Archive)
  Goals of building a LabKey Server ETL framework
    Provenance: understanding the origin of the data, knowing when and how it got there
    Auditing
    Security: integration into the LabKey Server security model
    Flexible data integration strategy

2013-2014 ETL Framework
  Built on top of pipelines
  Functionality
    Query-based ETLs
    Stored procedures
    Remote sources
    Checkers (identify whether work is to be done)
    Scheduling
    Logging output

2013-2014 ETL Framework
  ETLs are module based
  An ETL consists of a set of transform steps
  Key components of a transform
    Source table or query
    Destination table
    Filter strategy: identifies rows to transform and whether there is work to do
    Schedule

2013-2014 ETL Framework
  Filter strategies: choose which rows to move to the target table
    Select all
      Just get all the data, every time
    Last modified
      Rows with a date/time column newer than the last run
      Records the most recent value
    Run filter
      Checks a specified column, especially an incrementing integer column
      Any rows with a higher value than last time are transformed
      Useful for rows written by previous ETLs
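The three filter strategies can be sketched in a few lines of Python. This is a toy model under assumptions, not the LabKey Server code: rows are plain dictionaries, the column names `modified` and `run_id` are made up, and `state` is a hypothetical dictionary standing in for whatever the framework persists between runs.

```python
from datetime import datetime


def select_all(rows, state):
    """Select all: return every source row on every run."""
    return list(rows)


def modified_since(rows, state, column="modified"):
    """Last modified: return rows newer than the value recorded by the last run."""
    last = state.get("last_modified")
    picked = [r for r in rows if last is None or r[column] > last]
    if picked:
        # Record the most recent value so the next run starts from here.
        state["last_modified"] = max(r[column] for r in picked)
    return picked


def run_filter(rows, state, column="run_id"):
    """Run filter: return rows whose incrementing integer exceeds the last one seen."""
    last = state.get("last_run", 0)
    picked = [r for r in rows if r[column] > last]
    if picked:
        state["last_run"] = max(r[column] for r in picked)
    return picked


# Made-up source rows, for illustration only.
source = [
    {"id": 1, "run_id": 1, "modified": datetime(2014, 1, 1)},
    {"id": 2, "run_id": 2, "modified": datetime(2014, 1, 2)},
]
state = {}
first_run = modified_since(source, state)   # both rows are new
second_run = modified_since(source, state)  # nothing changed, so no work to do
```

An empty result doubles as the "is there work to do" check: when a strategy returns no rows, the transform can be skipped.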

2013-2014 ETL Framework
  Target options: how to add data to the target table
    truncate
      Delete all rows and add the selected ones
    append
      Add new rows to the target table
      Will fail if there are duplicate primary keys
    merge
      Update or insert
      Matches on primary keys
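The difference between the three target options is easiest to see on a small in-memory table. The Python sketch below is an illustration only: the target table is modeled as a dictionary keyed by primary key, and the function name `load`, the `key` parameter, and the sample rows are assumptions rather than anything from LabKey Server.

```python
def load(target, rows, option, key="id"):
    """Apply selected rows to a target table modeled as {primary_key: row}."""
    if option == "truncate":
        # Delete everything, then add the selected rows.
        target.clear()
        for r in rows:
            target[r[key]] = dict(r)
    elif option == "append":
        # Validate first so a duplicate key fails without a partial load.
        seen = set(target)
        for r in rows:
            if r[key] in seen:
                raise ValueError(f"duplicate primary key: {r[key]!r}")
            seen.add(r[key])
        for r in rows:
            target[r[key]] = dict(r)
    elif option == "merge":
        # Update the row when the key matches, insert it otherwise.
        for r in rows:
            target.setdefault(r[key], {}).update(r)
    else:
        raise ValueError(f"unknown target option: {option!r}")
    return target


# Made-up rows, for illustration only.
table = {1: {"id": 1, "status": "old"}}
load(table, [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}], "merge")
```

After the merge, row 1 is updated in place and row 2 is inserted; running the same rows with `append` would raise instead, because key 1 already exists.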

2013-2014 ETL Framework
  Schedule options: when to run the transform
    Poll option
      Check at a defined interval
    Cron option
      Can be used to check at a particular time of day
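The two schedule options differ in how the next run time is derived. A minimal Python sketch, assuming a fixed polling interval and using a simple daily time-of-day trigger as a stand-in for a full cron expression (the function names are hypothetical):

```python
from datetime import datetime, timedelta


def next_poll(last_check, interval):
    """Poll option: the next check is a fixed interval after the last one."""
    return last_check + interval


def next_daily(now, hour, minute=0):
    """Cron-style option: the next occurrence of a given time of day."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


now = datetime(2014, 10, 1, 23, 30)
poll_at = next_poll(now, timedelta(minutes=15))
nightly_at = next_daily(now, hour=2)
```

Polling suits sources that change unpredictably during the day, while a time-of-day trigger suits a nightly load that should run when the source system is quiet.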

Connectivity Summary
  [Diagram of LabKey Server connectivity: Client API, External SQL Data Sources, Data Pipeline, Form Entry, File upload, Java Module, Simple Module, ETL, Remote Server, External Systems]

Future Directions
  Other connection strategies LabKey is investigating
    DatStat: online data and study management software
    i2b2: informatics framework that will enable clinical researchers to use existing clinical data for discovery research
    Caisis: open source cancer data management system

Karl Lum klum@labkey.com Any questions?