GOBII. Genomic & Open-source Breeding Informatics Initiative

1 GOBII Genomic & Open-source Breeding Informatics Initiative

2 My Background
BS Animal Science, University of Tennessee
MS Animal Breeding, University of Georgia
  Random regression models for longitudinal traits
PhD Statistical Genetics, University of Georgia
  Feature selection and prediction algorithms
Dow AgroSciences
  Quantitative Geneticist ( )
  Quantitative Genetics Group Leader ( )
  Development and implementation of global trial analysis system
  Development and implementation of genomic selection into NA corn breeding program

3 Genomic Data: More Data, More Information?
Genomic data is becoming increasingly cost-effective to generate.
  High-volume and high-dimensional data
  Need effective data management tools
  Need analysis pipelines to turn data into information
Genomic information does not replace phenotypic information.
  Must have quality multi-year and multi-environment data to take full advantage of genomic information
  Must be able to integrate genomic and phenotypic information
  Must have well-designed training datasets to achieve the needed prediction accuracies

4 Genomic Selection
$R = \dfrac{i \, r \, \sigma_g}{L}$
where $i$ = selection intensity, $r$ = selection accuracy, $\sigma_g$ = genetic standard deviation, $L$ = generation interval
[Diagram labels: Phenotype, Environment, Genotype; Train, Predict; $i$, $\sigma_g$, $r$, $L$]
Potential Advantages of Genomic Selection
  Early discarding: first-stage screening based on genomic information
  Incorporate genomic information into early-stage trials and multi-year evaluations
  Early recycling: reduce stages to variety release
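As a worked illustration of the relationship on this slide, R = i · r · σ_g / L, the sketch below compares two scenarios. All numbers are hypothetical and chosen only to show how a shorter generation interval can offset a lower accuracy; they are not taken from the presentation.

```python
def response_per_unit_time(i, r, sigma_g, L):
    """Expected response to selection: R = i * r * sigma_g / L."""
    if L <= 0:
        raise ValueError("generation interval L must be positive")
    return i * r * sigma_g / L


# Hypothetical scenario values (assumptions for illustration only).
# i = 1.755 is roughly the selection intensity for keeping the top 10%.
phenotypic = response_per_unit_time(i=1.755, r=0.7, sigma_g=1.0, L=6)
genomic = response_per_unit_time(i=1.755, r=0.5, sigma_g=1.0, L=3)

print(f"phenotypic selection: {phenotypic:.4f} sigma_g units per year")
print(f"genomic selection:    {genomic:.4f} sigma_g units per year")
```

In this made-up comparison the genomic scenario has lower accuracy but halves the generation interval, so its response per year is higher.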

5 r Accuracy Key Drivers Genetic Architecture and Heritability Model Training Population Data When properly implemented, is genomic selection accurate enough to drive increased genetic gain? Yes*

6 Z. Lin et al. Crop & Pasture Science 2014

7 [Figure: histogram of accuracy; x-axis: Accuracy, y-axis: Frequency]

8 Correlation = Discarding: Lose ~0.5% Picking Winners Advance ~33%

9 Correlation = Discarding: Lose ~9% Picking Winners Advance ~20%

10 Correlation = Discarding: Lose ~21% Picking Winners Advance ~8%
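The trade-off these three slides illustrate (how many good lines are lost by discarding on predictions, and how many winners are picked) can be explored with a small simulation. This is only a sketch: the correlation values, the 50% discard fraction, and the 10% "winner" threshold are assumptions, since the slide values are not preserved in this transcription.

```python
import math
import random


def simulate(rho, n=20000, discard_frac=0.5, top_frac=0.1, seed=1):
    """Simulate true and predicted values with correlation rho.

    Returns (lost, picked):
      lost   = share of the true top lines that fall in the discarded group
      picked = share of the true top lines that are also in the predicted top
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        true_value = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        predicted = rho * true_value + math.sqrt(1.0 - rho * rho) * noise
        pairs.append((true_value, predicted))

    n_top = int(n * top_frac)
    n_discard = int(n * discard_frac)
    by_true = sorted(range(n), key=lambda k: pairs[k][0], reverse=True)
    by_pred = sorted(range(n), key=lambda k: pairs[k][1], reverse=True)

    true_top = set(by_true[:n_top])
    discarded = set(by_pred[n - n_discard:])  # lowest predicted values
    pred_top = set(by_pred[:n_top])

    lost = len(true_top & discarded) / n_top
    picked = len(true_top & pred_top) / n_top
    return lost, picked


for rho in (0.8, 0.5, 0.3):  # hypothetical accuracies
    lost, picked = simulate(rho)
    print(f"rho={rho}: lose {lost:.1%} of true top lines, pick {picked:.1%}")
```

As accuracy drops, discarding on predictions loses more of the truly best lines and picking winners captures fewer of them, which is the pattern the slides describe.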

11 Modifying the Funnel
[Diagram labels: Training, Prediction; funnel stages: Early Stage Screening, Characterization, Release; $i$, $\sigma_g$, $L$]
Widen the funnel
  Discard lines with a low likelihood of success, or lacking key traits, based on genomic information.
  Can increase the number of lines screened without increasing the yield trial plot load (heavier nursery plot load)
  Increase selection intensity
Shorten the funnel
  As the accuracy of genomic predictions increases, it becomes possible to replace the first stage of screening with GS and make recycling decisions earlier.
  Reduce the generation interval

12 Key Components Breeding Strategy Phenotypic Information Data Management (BMS) Analysis Pipelines Skilled Breeders Genomic Information Data Management (GOBII)

13 GOBII
Mission
  To work closely with CGIAR centers to develop open-source capabilities and enable the implementation of genomic and marker-assisted selection for staple crops in the developing world.
Vision
  Effective deployment of genomic information in breeding programs has the potential to significantly increase genetic gain in key crop performance traits. This can lead to staple crop varieties with improved yields and better adaptation to growing conditions in South Asia and Sub-Saharan Africa, bringing us closer to providing a sustainable and reliable food supply.

14 Key Components Breeding Strategy Phenotypic Information Data Management (BMS) Analysis Pipelines Skilled Breeders Genomic Information Data Management (GOBII)

15 Execution and Implementation
Many Transformative Efforts Fail
  Many failed initiatives have great strategies
  They fall apart in the execution
Need to have clearly defined objectives
  Define the most critical elements and focus on those (must-haves).
  Clearly defined deliverables aligned to those critical elements
Action
  Avoid planning paralysis
Engagement
Commitment

16 Initial Phase Strategy
Prioritize initial deliverables based on
  Urgency of the need across CG centers
  Technical feasibility (look for low-hanging fruit)
  Dependencies on other deliverables
Leverage existing components to the fullest extent possible
Direct all user interaction through an API (focus on BRAPI), allowing the development team to switch out components on the back end with minimal user disruption
Quickly piecing together a system to meet the immediate needs of users should buy time to develop a truly next-generation solution for Phase 2 implementation
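The idea of routing all user interaction through an API, so that back-end components can be swapped with minimal disruption, can be sketched as follows. Every name here (`MarkerStore`, `PairKeyedStore`, `RowKeyedStore`, `allele_matrix`) is hypothetical and is not part of BRAPI or of any GOBII code; the sketch only shows two interchangeable back ends behind one client-facing function.

```python
from typing import Dict, List, Protocol, Tuple


class MarkerStore(Protocol):
    """Hypothetical back-end contract: one call per (sample, marker)."""

    def get_calls(self, samples: List[str], markers: List[str]) -> List[List[str]]:
        ...


class PairKeyedStore:
    """Back end A: calls stored in a dict keyed by (sample, marker)."""

    def __init__(self, calls: Dict[Tuple[str, str], str]) -> None:
        self._calls = calls

    def get_calls(self, samples, markers):
        # "N" marks a missing call in this sketch.
        return [[self._calls.get((s, m), "N") for m in markers] for s in samples]


class RowKeyedStore:
    """Back end B: one dict of marker -> call per sample."""

    def __init__(self, rows: Dict[str, Dict[str, str]]) -> None:
        self._rows = rows

    def get_calls(self, samples, markers):
        return [[self._rows.get(s, {}).get(m, "N") for m in markers] for s in samples]


def allele_matrix(store: MarkerStore, samples: List[str], markers: List[str]) -> dict:
    """Single entry point for clients; storage details stay hidden behind it."""
    return {
        "samples": samples,
        "markers": markers,
        "calls": store.get_calls(samples, markers),
    }


store_a = PairKeyedStore(
    {("line_A", "m1"): "A/A", ("line_A", "m2"): "A/T", ("line_B", "m1"): "T/T"}
)
store_b = RowKeyedStore({"line_A": {"m1": "A/A", "m2": "A/T"}, "line_B": {"m1": "T/T"}})

result_a = allele_matrix(store_a, ["line_A", "line_B"], ["m1", "m2"])
result_b = allele_matrix(store_b, ["line_A", "line_B"], ["m1", "m2"])
print(result_a == result_b)
```

Because clients only see `allele_matrix`, replacing one store with the other does not change what they receive.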

17 [Architecture diagram components: Sequence Data File Store; Meta Data DB; Pipeline: Genomic Variant Calls and Imputation; BRAPI; LIMS; Marker Variant DB; Client Side Application and GUI; Field Trial Management System]

18 Work Packages
WP1 Breeding Workflow Mapping/Project Prioritizations
WP2 Data Warehouse/DataMart
WP3 Server Application: Data Analysis Pipelines
WP4 Genomic API/ETL
WP5 Client Application(s): Breeder Tools

19 Breeding Workflow Mapping/Project Prioritizations
Breeding processes and strategy for each breeding program
  Line development process and timelines
  Key decision points
  Key traits
  GS and MAS strategies
Understand marker workflows
  How marker data is pulled and filtered
  Common marker analyses
  Where markers are deployed in the breeding process
Set initial prioritizations
  Understand critical marker needs that are not being met with current systems

20 Data Warehouse/DataMart
Sequence Data
  Compressed FASTQ files
Meta Data
  Relational database linking sample information to compressed FASTQ files
  Sample and marker meta information
    Support basic BRAPI marker statistic calls
  Physical and linkage map information
    Support BRAPI genomic maps calls
Marker Calls
  Set up initial solution using currently implemented marker DBs
    Support BRAPI allele matrix call
  Select, mock up, and test large matrix store DB solutions: PostgreSQL/Citus, MonetDB, Cassandra, HBase, MongoDB
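A relational metadata database that links sample information to compressed FASTQ files could look roughly like the sketch below. The table names, column names, and file paths are all hypothetical; the example uses an in-memory SQLite database purely for illustration and says nothing about the schema GOBII actually adopted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sample (
        sample_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE,
        crop      TEXT NOT NULL
    );
    CREATE TABLE fastq_file (
        file_id   INTEGER PRIMARY KEY,
        sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
        path      TEXT NOT NULL
    );
    """
)

# Hypothetical samples and file locations.
conn.executemany(
    "INSERT INTO sample (name, crop) VALUES (?, ?)",
    [("line_A", "maize"), ("line_B", "rice")],
)
conn.executemany(
    "INSERT INTO fastq_file (sample_id, path) "
    "SELECT sample_id, ? FROM sample WHERE name = ?",
    [
        ("store/line_A_R1.fastq.gz", "line_A"),
        ("store/line_A_R2.fastq.gz", "line_A"),
        ("store/line_B_R1.fastq.gz", "line_B"),
    ],
)


def fastq_paths_for_crop(db, crop):
    """Select FASTQ files by sample metadata with a parameterized SQL query."""
    rows = db.execute(
        "SELECT f.path FROM fastq_file AS f "
        "JOIN sample AS s ON s.sample_id = f.sample_id "
        "WHERE s.crop = ? ORDER BY f.path",
        (crop,),
    ).fetchall()
    return [row[0] for row in rows]


maize_files = fastq_paths_for_crop(conn, "maize")
print(maize_files)
```

Keeping the files on disk and only their paths in the database means the metadata can be queried quickly while the large sequence files stay in the file store.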

21 Server Application
Variant Calling and Imputation Pipeline
  Leverage existing pipeline(s)
File Selection Tool
  Based on SQL queries of meta data
Analysis
  G matrix calculations (possibly using TASSEL implementations)
  Calculations of LD
  PCoA decompositions
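For readers unfamiliar with the G matrix mentioned on this slide, here is a minimal sketch of one common way to compute a genomic relationship matrix, VanRaden's first method, G = ZZ' / (2 Σ p(1 − p)). This choice of method is an assumption; the slide does not say which formulation would be used, and this is not the TASSEL implementation. Genotypes are coded as 0, 1, or 2 copies of an allele and the toy data are made up.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from a lines x markers 0/1/2 matrix.

    Z holds genotypes centered by twice the allele frequency; the result is
    Z Z' scaled by 2 * sum(p * (1 - p)) over markers.
    """
    n_lines = len(genotypes)
    n_markers = len(genotypes[0])
    freqs = [
        sum(row[j] for row in genotypes) / (2.0 * n_lines) for j in range(n_markers)
    ]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic; G is undefined")
    centered = [
        [row[j] - 2.0 * freqs[j] for j in range(n_markers)] for row in genotypes
    ]
    return [
        [
            sum(centered[a][j] * centered[b][j] for j in range(n_markers)) / scale
            for b in range(n_lines)
        ]
        for a in range(n_lines)
    ]


# Toy data: 3 lines x 3 markers (hypothetical).
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]
g_matrix = vanraden_g(toy_genotypes)
for row in g_matrix:
    print([round(value, 3) for value in row])
```

The pure-Python loops keep the sketch dependency-free; a real pipeline working on thousands of lines and markers would use an optimized linear algebra library instead.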

22 Genomic API/ETL
API
  BRAPI implementation via web interface
  Custom GOBII API calls when needed
ETL
  Mapping common queries to DB schemas
  Pull large blocks of data, filtering on sample and marker characteristics
  Pull lines carrying haplotypes of interest
Client Application(s)
  Visualizations
    PCoA
    Connection to Flapjack
    LD matrices and LD decay
  SNP calling pipeline
  File selection tool
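The "pull lines carrying haplotypes of interest" query can be sketched as a simple filter over marker calls. The data layout (a dict of line name to marker calls), the function name, and the example alleles are hypothetical; a production ETL layer would run an equivalent filter against the marker database.

```python
from typing import Dict, List


def lines_with_haplotype(
    calls: Dict[str, Dict[str, str]], haplotype: Dict[str, str]
) -> List[str]:
    """Return the lines whose calls match every marker allele in the haplotype.

    A line with a missing call at any haplotype marker is not returned.
    """
    return sorted(
        line
        for line, by_marker in calls.items()
        if all(by_marker.get(marker) == allele for marker, allele in haplotype.items())
    )


# Hypothetical marker calls for three lines.
marker_calls = {
    "line_A": {"m1": "A", "m2": "T", "m3": "G"},
    "line_B": {"m1": "A", "m2": "C", "m3": "G"},
    "line_C": {"m1": "A", "m2": "T"},
}

carriers = lines_with_haplotype(marker_calls, {"m1": "A", "m2": "T"})
print(carriers)
```

Treating a missing call as a non-match is a design choice in this sketch; an imputation step upstream would change which lines qualify.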

23 Thank You


More information
