Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources
1 Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources. Ping Zhang (IBM T. J. Watson Research Center), Pankaj Agarwal (GlaxoSmithKline), Zoran Obradovic (Temple University)
2 Terms and ideas: Drug, Chemical Compound; Drug Targets, Target Proteins, Off Targets; Indicated Effects, Side Effects; Drug Indication, Indicated Diseases
3 Timescale: drug discovery and development
4 Why drugs fail...
5 Drug repositioning: Drug repositioning (also known as drug repurposing, drug re-profiling, therapeutic switching, and drug re-tasking) is the application of known drugs and compounds to new indications (i.e., new diseases).
6 Shorter timelines & less risk
7 Computational drug repositioning: If two drugs $d_x$ and $d_y$ are found to be similar, and $d_y$ is used to treat disease $s$, then $d_x$ is a repositioning candidate for treating $s$. Chemical properties: [Keiser et al., Nature 2009], [Swamidass, Brief Bioinform 2011]. Biological properties: [Li et al., PLoS CB 2009], [Kotelnikova et al., JBCB 2010]. Phenotypic properties: [Campillos et al., Science 2008], [Hu and Agarwal, PLoS One 2009], [Yang and Agarwal, PLoS One 2011]. Goal: integrate multiple drug data sources for better solutions.
8 Computing similarity of drug chemical structures: Collected 1007 approved small-molecule drugs from DrugBank with their chemical structure information. Used CDK (the Chemistry Development Kit) to encode each drug as an 881-dimensional binary substructure vector as defined in PubChem. Tanimoto similarity: the number of substructures two molecules have in common divided by the number of substructures present in at least one of them.
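The Tanimoto similarity described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `tanimoto`, the 8-bit toy fingerprints (standing in for the 881-bit PubChem vectors), and the convention of returning 0 for two all-zero vectors are assumptions made here.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length binary vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both vectors
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    # Assumed convention: two empty fingerprints have similarity 0.
    return both / either if either else 0.0


# Hypothetical 8-bit fingerprints standing in for 881-bit PubChem vectors.
fp_x = [1, 1, 0, 1, 0, 0, 1, 0]
fp_y = [1, 0, 0, 1, 0, 1, 1, 0]
print(tanimoto(fp_x, fp_y))  # 3 shared bits out of 5 set in either vector
```

The same function applies unchanged to any pair of binary profiles of equal length.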
9 Computing similarity of drug protein targets: Mapped DrugBank target information to UniProt and extracted 3152 relationships between 1007 drugs and 775 proteins.
$$sim^{target}(d_x, d_y) = \frac{1}{|P(d_x)|\,|P(d_y)|} \sum_{i=1}^{|P(d_x)|} \sum_{j=1}^{|P(d_y)|} g\big(P_i(d_x), P_j(d_y)\big)$$
where, for a drug $d$, $P(d)$ denotes its set of target proteins, $P_i(d)$ is the $i$-th protein in that set, and $|P(d)|$ is the size of the set. The sequence similarity function $g$ of two proteins is calculated as a Smith-Waterman sequence alignment score.
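The target-based similarity is the mean of the pairwise protein scores over all target pairs of the two drugs. The sketch below assumes the pairwise function `g` is supplied by the caller; here it is a hypothetical lookup table (`toy_scores`, `toy_g`) with made-up values, not a real Smith-Waterman alignment, and returning 0 when a drug has no known targets is also an assumption.

```python
def target_similarity(targets_x, targets_y, g):
    """Average of g(p, q) over every target p of drug x and target q of drug y."""
    if not targets_x or not targets_y:
        return 0.0  # assumed convention when a drug has no known targets
    total = sum(g(p, q) for p in targets_x for q in targets_y)
    return total / (len(targets_x) * len(targets_y))


# Hypothetical pairwise protein scores (illustrative values only).
toy_scores = {("P1", "P2"): 0.4, ("P1", "P3"): 0.1, ("P2", "P3"): 0.2}


def toy_g(p, q):
    """Symmetric lookup; identical proteins score 1.0, unknown pairs 0.0."""
    if p == q:
        return 1.0
    return toy_scores.get((p, q), toy_scores.get((q, p), 0.0))


# Drug x targets P1 and P2; drug y targets P2 and P3.
sim_xy = target_similarity(["P1", "P2"], ["P2", "P3"], toy_g)
print(sim_xy)  # mean of the four pairwise scores 0.4, 0.1, 1.0, 0.2
```

Passing `g` as a parameter keeps the averaging step independent of how the alignment scores are produced or normalized.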
10 Computing similarity of drug side-effect profiles: Side-effect keywords were obtained from the SIDER database (information from drugs' package inserts). Each drug was represented by a 1385-dimensional binary side-effect profile whose elements encode the presence (1) or absence (0) of each side-effect keyword. Tanimoto similarity is then used to measure side-effect similarity. Obtained relationships between 613 drugs and 1385 side effects. 394 drugs from the DrugBank approved list could NOT be mapped to SIDER drug names; their missing side-effect profiles were imputed from chemical structure information, using a method similar to [Pauwels et al., BMC Bioinformatics 2011].
11 Computing prediction score from a single data source: Obtained each drug's known use(s) from the National Drug File Reference Terminology. Constructed a gold set of 3250 treatment relationships between 799 drugs and 719 diseases.
$$f^{i}(d_x, s) = \sum_{d_y \in N_k(d_x)} sim^{i}(d_x, d_y) \cdot C\big(s \in indications(d_y)\big)$$
$C$ is a characteristic function that returns 1 if $d_y$ has disease indication $s$ and 0 otherwise, and $N_k(d_x)$ is the set of $k$ nearest neighbors of drug $d_x$ according to the metric $sim^{i}$, which is determined by the type of the $i$-th data source. [Figure: query drug $d_x$ and the neighborhood of $d_x$]
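The single-source score sums the similarities between the query drug and those of its k nearest neighbors that are indicated for the disease. The following is a minimal sketch under stated assumptions: the function name `knn_score`, the drug names, the disease labels, and the similarity values are all hypothetical, and the similarity is passed in as a function so any of the three measures could be plugged in.

```python
def knn_score(query, disease, sim, indications, k):
    """Sum sim(query, d) over the k nearest neighbours d that treat `disease`."""
    others = [d for d in indications if d != query]
    # k nearest neighbours = the k drugs most similar to the query.
    neighbours = sorted(others, key=lambda d: sim(query, d), reverse=True)[:k]
    # The membership test plays the role of the characteristic function C.
    return sum(sim(query, d) for d in neighbours if disease in indications[d])


# Hypothetical similarities of four drugs to the query drug "dx".
toy_sim = {"a": 0.9, "b": 0.7, "c": 0.5, "d": 0.1}
# Hypothetical known indications.
toy_indications = {
    "dx": set(),
    "a": {"RA"},
    "b": {"asthma"},
    "c": {"RA"},
    "d": {"RA"},
}


def sim_to_query(query, other):
    """Toy similarity: only similarities to the single query drug are defined."""
    return toy_sim[other]


score_ra = knn_score("dx", "RA", sim_to_query, toy_indications, k=3)
score_asthma = knn_score("dx", "asthma", sim_to_query, toy_indications, k=3)
print(score_ra, score_asthma)
```

With k=3 the neighbors are a, b, and c, so drug d does not contribute to the RA score even though it is indicated for RA.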
12 Combining multiple measures: A new drug repositioning framework: Similarity-based LArge-margin learning of Multiple Sources (SLAMS)
13 Large margin method: Given $m$ scores for a drug-disease pair $(d, s)$, we propose a large-margin method that calculates the final score $f_E$ as a weighted average of the individual scores. The weight vector $w$, used to integrate the $m$ predictions, is found by solving an optimization problem.
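The integration step, a weighted average of the m single-source scores, can be illustrated as below. The transcript does not reproduce the optimization problem that yields the weights, so this sketch takes the weights as given: the function name `combine_scores`, the example scores and weights, and the normalization by the sum of the weights are assumptions, not the authors' learned values.

```python
def combine_scores(scores, weights):
    """Weighted average of per-source scores for one drug-disease pair."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per score")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * f for w, f in zip(weights, scores)) / total_weight


# Hypothetical scores from the chemical, target, and side-effect sources.
scores = [1.4, 0.6, 0.9]
# Hypothetical weights; in SLAMS these would come from the large-margin step.
weights = [0.5, 0.2, 0.3]

final_score = combine_scores(scores, weights)
simple_average = combine_scores(scores, [1, 1, 1])  # equal weights
print(final_score, simple_average)
```

Setting all weights equal reduces the combination to a plain mean of the per-source scores.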
14 Method comparison: PREDICT (Gottlieb et al., Mol. Sys. Biol. 2011): uses similarity measures as features and learns a logistic regression classifier to yield a classification score. Simple Average: assumes that each data source is equally informative and thus simply averages all k-NN prediction scores. SLAMS: the algorithm proposed in this study, which uses a large-margin method to automatically weigh and integrate multiple data sources.
15 Data source comparison: Distribution of SLAMS weights for chemical, biological, and phenotypic data sources.
16 Analysis of novel predictions: False-positive (FP) drug-disease associations are those predicted by our method but not present in the training set. Some FP associations may be genuinely false, but others may be true and can be considered drug repositioning candidates in real-world drug discovery. Of the 4066 drug-disease associations found in ClinicalTrials.gov (not included in the training set), our FP associations cover 21%. Our predictions therefore overlap with drug-disease associations tested in clinical trials, suggesting that the predicted drugs may be regarded as valuable repositioning candidates for further drug discovery research. All data sets and predicted drug-disease associations are available at
17 Examples of FP predictions for Rheumatoid Arthritis
18 Conclusion: We proposed SLAMS, a new drug repositioning framework that integrates chemical, biological, and phenotypic properties. The method allows easy integration of additional drug information sources. It also ranks the drug information sources by their contribution to the prediction, paving the way for prioritizing data sources and building more reliable drug repositioning models.
19 Future work: integrate more information sources
20 Thank you! Questions? Ping Zhang: Pankaj Agarwal: Zoran Obradovic: