Big Data Visualization for Genomics. Luca Vezzadini Kairos3D

Why GenomeCruzer?
- The amount of DNA sequencing data is growing
- Modern hardware produces billions of values per sample
- Scientists need to handle hundreds of thousands of variables
- Traditional tools are not suited to handle such complexity
- Partial views of the data set, difficulty correlating different data sets, long computation times, several tools needed in a single session, ...
- GenomeCruzer builds on a 3D Big Data analytics platform
- Interactive 3D scene that enables capturing, visualizing and navigating hundreds of thousands of data items
Kairos3D MIMOS 2012 Conference 1

Project information
- Developed by Kairos3D
- Based on a proprietary software platform
- Derived from experience with Big Data applications
- Designed with the Institute for Cancer Research and Treatment (IRCC)
- Part of Fondazione Piemontese per la Ricerca sul Cancro
- Strong knowledge in cancer research and bioinformatics
- Tested on well-known international data sets
- Project web site: genomecruzer.com

Current state of the art
- The 2D heat map: a tabular data set where each column is a sample (normally a patient) and each row is a gene.
- Colors represent the measurement for each gene in each sample.
- Rows and columns can be grouped into clusters (e.g. male/female patients).
- Different measurements (e.g. gene expression and copy number) each require a separate heat map.
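As a rough illustration of the heat map layout described above, the sketch below stores measurements as a gene-by-sample table and maps each value to a color. The gene names, sample names, values, and the blue-white-red scale are all hypothetical; the slides do not specify a color scale.

```python
# Hypothetical toy data: rows are genes, columns are samples (patients).
genes = ["GENE_A", "GENE_B", "GENE_C"]
samples = ["P1", "P2", "P3", "P4"]
expression = [
    [-2.0, -1.0, 0.0, 1.0],
    [0.5, 1.5, 2.0, -0.5],
    [0.0, 0.0, -1.5, 2.0],
]


def value_to_rgb(value, vmin=-2.0, vmax=2.0):
    """Map a measurement to a blue-white-red color (an assumed scale)."""
    # Clamp to [vmin, vmax], then normalize to [0, 1].
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    if t < 0.5:
        k = t / 0.5  # blue -> white
        return (int(255 * k), int(255 * k), 255)
    k = (t - 0.5) / 0.5  # white -> red
    return (255, int(255 * (1 - k)), int(255 * (1 - k)))


# One color per (gene, sample) cell: this grid is the heat map.
heat_map = [[value_to_rgb(v) for v in row] for row in expression]
```

Each additional measurement type (for example copy number) would need its own table and its own color grid, which is the limitation the slide points out.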

GenomeCruzer view [1/2]
- A view from above, which looks like a regular heat map
- Clusters are more visibly separated

GenomeCruzer view [2/2]
- A full 3D view
- Two data sets are now displayed together (one mapped to color, the other to height)
- The user can select which data set is mapped to each parameter
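The idea of showing two data sets in one view can be sketched as follows, assuming both are gene-by-sample tables of the same shape. The function name `build_bars`, the dictionary fields, and the sample values are hypothetical and are not taken from GenomeCruzer itself.

```python
def build_bars(color_data, height_data, max_height=1.0):
    """Pair two same-shaped gene x sample tables into 3D bar descriptors."""
    same_rows = len(color_data) == len(height_data)
    if not same_rows or any(
        len(a) != len(b) for a, b in zip(color_data, height_data)
    ):
        raise ValueError("both data sets must have the same shape")

    # Normalize heights to [0, max_height] over the whole table.
    flat = [v for row in height_data for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data

    bars = []
    for r, (c_row, h_row) in enumerate(zip(color_data, height_data)):
        for c, (c_val, h_val) in enumerate(zip(c_row, h_row)):
            bars.append({
                "row": r,
                "col": c,
                "color_value": c_val,
                "height": max_height * (h_val - lo) / span,
            })
    return bars


# Hypothetical values: expression drives color, copy number drives height.
expression = [[0.2, 1.4], [-0.8, 0.1]]
copy_number = [[2, 4], [1, 2]]
bars = build_bars(color_data=expression, height_data=copy_number)

# Swapping the arguments swaps which data set drives which parameter.
swapped = build_bars(color_data=copy_number, height_data=expression)
```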

Main features [1/2]
- Interactive 3D scene with data walls
- Each wall displays a different type of information
- Relations among wall elements are displayed
- The user can select items on a wall; the system updates values on all related items, including those on other walls
- Hierarchical data model to convey clustering information
- Level of detail (LOD), both automatic and manual
- 3D view is optimized for better readability
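One way a hierarchical data model could support level of detail is sketched below: at a coarse level each cluster is shown as a single aggregated element, and at a fine level every item is shown. The distance threshold, the use of the mean as the aggregate, and all names are assumptions made for illustration; the slides do not describe how GenomeCruzer selects or computes its LOD.

```python
from statistics import mean

# Hypothetical hierarchy: cluster name -> {item name -> value}.
clusters = {
    "cluster_1": {"GENE_A": 1.0, "GENE_B": 3.0},
    "cluster_2": {"GENE_C": -2.0, "GENE_D": 0.0, "GENE_E": 2.0},
}


def choose_lod(camera_distance, threshold=10.0, manual=None):
    """Return 'cluster' or 'item'; a manual choice overrides the automatic one."""
    if manual is not None:
        return manual
    return "cluster" if camera_distance > threshold else "item"


def visible_elements(clusters, lod):
    """Return the elements to draw at the given level of detail."""
    if lod == "cluster":
        # Coarse view: one aggregated value per cluster (mean is assumed).
        return {name: mean(items.values()) for name, items in clusters.items()}
    # Fine view: every individual item.
    return {gene: v for items in clusters.values() for gene, v in items.items()}


far_view = visible_elements(clusters, choose_lod(camera_distance=50.0))
forced_detail = visible_elements(
    clusters, choose_lod(camera_distance=50.0, manual="item")
)
```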

Main features [2/2]
- Statistical analysis functions
- The user can select items on any wall
- The system updates values on all related items, including those on other walls
- The user can select which operations to apply
- For example: selecting a group of patients updates the values on all genes and gene groups by computing the average value of each gene over the selected patients
- Available on desktop and laptop computers
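The patient-selection example above can be sketched as a small aggregation step. The data, the function names, and the way gene groups are derived from gene values are hypothetical; only the idea of averaging each gene over the selected patients comes from the slide.

```python
from statistics import mean

# Hypothetical data: per-gene measurements, one value per sample.
samples = ["P1", "P2", "P3", "P4"]
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [0.0, 0.0, 2.0, 2.0],
}
gene_groups = {"group_1": ["GENE_A", "GENE_B"]}


def update_gene_values(expression, samples, selected, operation=mean):
    """Apply the chosen operation to each gene over the selected samples."""
    idx = [samples.index(s) for s in selected]
    return {
        gene: operation([values[i] for i in idx])
        for gene, values in expression.items()
    }


def update_group_values(gene_values, gene_groups, operation=mean):
    """Propagate the updated gene values to their gene groups."""
    return {
        group: operation([gene_values[g] for g in members])
        for group, members in gene_groups.items()
    }


# Selecting patients P3 and P4 updates every gene, then every gene group.
gene_values = update_gene_values(expression, samples, ["P3", "P4"])
group_values = update_group_values(gene_values, gene_groups)
```

Passing a different callable as `operation` (for example `max`) stands in for the user choosing which operation to apply.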

LOD example

Data walls (beyond heat map)

A three-wall view
- Selection on a wall updates values on the other walls

Scientific applications
- IRCC has prepared 3 case studies using GenomeCruzer
- More details on genomecruzer.com
- Based on colon cancer data sets from The Cancer Genome Atlas
- GenomeCruzer greatly simplifies integrative analysis
- Simultaneous visualization of gene sequencing, copy number, expression or methylation data
- No other easy way to correlate two linked data sets
- Fast screening of working hypotheses (search for recurrent patterns)
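To make "correlating two linked data sets" concrete, the sketch below computes a per-gene Pearson correlation between copy number and expression across the same samples. Pearson correlation is an assumption chosen for illustration; the slides do not say which measure, if any, GenomeCruzer computes, and the data here are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Hypothetical linked data sets: same genes, same samples, two measurements.
copy_number = {"GENE_A": [1, 2, 3, 4], "GENE_B": [2, 2, 2, 2]}
expression = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [0.5, 1.0, 0.2, 0.9]}

# One correlation value per gene; a high absolute value flags a recurrent pattern.
correlations = {
    gene: pearson(copy_number[gene], expression[gene]) for gene in copy_number
}
```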

Future work
- A discovery release is currently available for free
- Includes the 3 case studies and a video tutorial
- The scientific dissemination process is ongoing
- Feedback from the international community is needed
- Next planned improvements include:
  - Generalized data input/output system
  - Extended UI interaction to create and edit clusters
  - Port to tablet devices (iOS and Android)
  - Address a wider user base (biologists, pharma industries, ...)

GenomeCruzer video