Nathan Brown. The Application of Consensus Modelling and Genetic Algorithms to Interpretable Discriminant Analysis.
1 The Application of Consensus Modelling and Genetic Algorithms to Interpretable Discriminant Analysis
Nathan Brown
Workshop Chemoinformatics in Europe: Research and Teaching, 30th May 2006
2 Discriminant Analysis Using a GA
- Predictive vs. Diagnostic Modelling
- Discriminant Analysis with a Genetic Algorithm
- Consensus and Splice Modelling
- Experimental Studies
  - MDDR: 1130 renin and 636 COX inhibitors [1]
  - Oral drugs: 1082 FDA-approved drugs [2]
1. Hert, J.; Willett, P.; Wilton, D. J.; Acklin, P.; Azzaoui, K.; Jacoby, E.; Schuffenhauer, A. Comparison of Fingerprint-Based Methods for Virtual Screening Using Multiple Bioactive Reference Structures. J. Chem. Inf. Comput. Sci. 2004, 44.
2. Vieth, M.; Siegel, M. G.; Higgs, R. E.; Watson, I. A.; Robertson, D. H.; Savin, K. A.; Durst, G. L.; Hipskind, P. A. Characteristic Physical Properties and Structural Fragments of Marketed Oral Drugs. J. Med. Chem. 2004, 47.
6 June 2006
3 Predictive versus Diagnostic Models
- Highly predictive models tend to obscure what is important for the property being modelled
- Highly interpretable models tend to have less predictive power
- However, both objectives are very important
- We want highly predictive models that can also guide our decision-making processes
* Adapted from a diagram by Richard Lewis
4 Discriminant Analysis
- Supervised learning
  - The dependent variable is known for the dataset and is used, with the independent variables, to train the model
- Optimize separation of classes
  - Evolve weights for binned descriptors
  - Score solutions according to their ability to separate objects
- Discover descriptor ranges that are important for discrimination, which can then be applied to make informed decisions
1. Gillet, V. J.; Willett, P.; Bradshaw, J. Identification of Biological Activity Profiles Using Substructural Analysis and Genetic Algorithms. J. Chem. Inf. Comput. Sci. 1998, 38.
5 Chromosome Encoding
- Selection of $N$ descriptors
- Each descriptor $i$ is partitioned into $B_i$ bins
- Each bin can take any value in the range $\{0, \dots, W\}$
- Chromosome length is then $\sum_{i=1}^{N} B_i$
[Figure: example chromosome with $N = 3$ descriptors (PSA, MW, ClogP) and $B = \{4, 7, 5\}$]
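The encoding on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: the bin edges, the value of W, the assignment of bin counts to PSA, MW and ClogP, and the rule that a molecule's score is the sum of the weights of the bins it falls into are all assumptions made for the example.

```python
import random
from bisect import bisect_right

# Hypothetical layout: the bin counts follow the slide (B = {4, 7, 5}),
# but the edges and the mapping to descriptors are invented for illustration.
BIN_EDGES = {
    "PSA": [40.0, 80.0, 120.0],                         # 3 edges -> 4 bins
    "MW": [200.0, 250.0, 300.0, 350.0, 400.0, 500.0],   # 6 edges -> 7 bins
    "ClogP": [0.0, 1.5, 3.0, 4.5],                      # 4 edges -> 5 bins
}
MAX_WEIGHT = 10  # W, an assumed value


def chromosome_length(bin_edges):
    """Total number of bins over all descriptors (sum of B_i)."""
    return sum(len(edges) + 1 for edges in bin_edges.values())


def random_chromosome(bin_edges, max_weight, rng):
    """One integer weight in 0..max_weight per bin."""
    return [rng.randint(0, max_weight) for _ in range(chromosome_length(bin_edges))]


def score_molecule(chromosome, bin_edges, descriptors):
    """Assumed scoring rule: add up the weight of the bin hit by each descriptor."""
    score = 0
    offset = 0
    for name, edges in bin_edges.items():
        bin_index = bisect_right(edges, descriptors[name])
        score += chromosome[offset + bin_index]
        offset += len(edges) + 1
    return score
```

With this layout the chromosome has 4 + 7 + 5 = 16 genes, and a GA would evolve the integer weights while the bin edges stay fixed.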
6 Descriptor Selection
- Calculate physicochemical descriptors
- Cluster descriptors (not objects)
- Select descriptors that are:
  - more orthogonal, and
  - more interpretable for the medicinal chemist
- Some dataset dependency
7 Fitness Functions 1
- Initial Enhancement (IE)
  - Emphasises enrichment in top NACT% of recalled molecules, i.e. mean rank of all actives recalled after NACT
- Global Enhancement (GE)
  - Emphasises enrichment of all actives in recalled molecules, i.e. mean rank of all actives
- Maximum Difference Enhancement (MDE)
  - Emphasises maximum difference in scores between the two classes
8 Fitness Functions 2
- The existing fitness function used a combination of evaluations:
  - Number of actives in the top N%
  - Average rank of actives over the entire ranking
- Maximised Difference Enhancement (MDE)
  - The difference between the average ranks of the two classes being discriminated
- MDE tends to maximise the separation between the two classes globally, while also rewarding rankings that place more interesting molecules in the initial part of the ranking
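The rank-based quantities named on the two fitness-function slides can be illustrated as follows. This is a hedged sketch: the function names are hypothetical, labels are assumed to be 1 for active and 0 for inactive, and the values are raw ranks rather than the normalised enhancement scores plotted later in the talk.

```python
def rank_by_score(scores, labels):
    """Return the labels ordered from highest to lowest score (rank 1 first)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def actives_in_top(ranked_labels, fraction):
    """Number of actives (label 1) in the top fraction of the ranking."""
    cutoff = max(1, int(len(ranked_labels) * fraction))
    return sum(ranked_labels[:cutoff])


def mean_rank(ranked_labels, wanted):
    """Average 1-based rank of the molecules carrying the wanted label."""
    ranks = [pos for pos, label in enumerate(ranked_labels, start=1) if label == wanted]
    return sum(ranks) / len(ranks)


def mean_rank_difference(ranked_labels):
    """Mean rank of inactives minus mean rank of actives (larger is better)."""
    return mean_rank(ranked_labels, 0) - mean_rank(ranked_labels, 1)


# Six molecules, two of them active.
ranked = rank_by_score([9, 2, 7, 4, 1, 8], [1, 0, 1, 0, 0, 0])
print(actives_in_top(ranked, 0.5))   # actives among the top 3
print(mean_rank(ranked, 1))          # mean rank of the actives
print(mean_rank_difference(ranked))  # separation between the classes
```

A GA fitness function in this spirit would score each chromosome by ranking the training molecules with it and then evaluating one of these measures.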
9 Consensus Models
- Aim to reduce the stochastic effects of using a single chromosome
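The slide does not say how the chromosomes are combined. One simple possibility, shown here purely as an assumption, is to average the bin weights position by position over the best chromosomes from several independent GA runs; `consensus_chromosome` is a hypothetical name.

```python
def consensus_chromosome(chromosomes):
    """Average weights position by position across equal-length chromosomes."""
    length = len(chromosomes[0])
    if any(len(c) != length for c in chromosomes):
        raise ValueError("all chromosomes must have the same length")
    return [sum(weights) / len(chromosomes) for weights in zip(*chromosomes)]


# Three runs that disagree on the first and last bin but agree on the middle one.
runs = [[0, 4, 10], [2, 4, 6], [4, 4, 2]]
print(consensus_chromosome(runs))
```

Averaging keeps the model in the same bin-weight form as a single chromosome, so it can still be read descriptor by descriptor.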
10 Splice Models
- Essentially a manual recombination operator, used to build a better solution model based on feedback and intuition
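A manual recombination of this kind might look like the sketch below, which copies each descriptor's block of bin weights from a model chosen by the user. The function name, the model names and the per-descriptor granularity are assumptions; the slide only states that the recombination is done by hand.

```python
def splice(models, bin_counts, choice):
    """Build a chromosome by taking each descriptor's bin weights from a chosen model.

    models:     model name -> chromosome (list of weights)
    bin_counts: descriptor -> number of bins, in chromosome order
    choice:     descriptor -> name of the model to copy that descriptor from
    """
    spliced = []
    offset = 0
    for descriptor, n_bins in bin_counts.items():
        source = models[choice[descriptor]]
        spliced.extend(source[offset:offset + n_bins])
        offset += n_bins
    return spliced


models = {"M1": [1, 2, 3, 4, 5], "M5": [9, 8, 7, 6, 5]}
bin_counts = {"PSA": 2, "MW": 3}
# Take the PSA weights from M5 and the MW weights from M1.
print(splice(models, bin_counts, {"PSA": "M5", "MW": "M1"}))
```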
11 Renin Consensus Discrimination Model
[Figure: Global Enhancement (axis from 0 to 1) by model. Bars: M1TR, M1TS1, M1TS2, CM1TS1, CM1TS2, M5TR, M5TS1, M5TS2, CM5TS1, CM5TS2, M6TR, M6TS1, M6TS2, CM6TS1, CM6TS2. Legend: Single Model Result, Consensus Model Result.]
13 Renin Splice Discrimination Model
[Figure: Global Enhancement by model. Bars: M1TS1, M1TS2, CM1TS1, CM1TS2, M5TS1, M5TS2, CM5TS1, CM5TS2, M6TS1, M6TS2, CM6TS1, CM6TS2, SM6TR, SM6TS1, SM6TS2. Legend: Single Model Result, Consensus Model Result, Splice Model Result.]
15 COX Splice Discrimination Model
[Figure: Global Enhancement by model. Bars: M1TR, M1TS1, M1TS2, CM1TS1, CM1TS2, M2TR, M2TS1, M2TS2, CM2TS1, CM2TS2. Legend: Single Model Result, Consensus Model Result.]
17 Comparative Study: Oral vs. Non-Oral Drugs
- Oral vs. non-oral drugs dataset [1]
- GA model compared with models generated with:
  - Naïve Bayes Classifier (NBC)
  - Support Vector Machines (SVM)
- Investigating:
  - Consistency of results
  - Interpretation of models
1. Vieth, M.; Siegel, M. G.; Higgs, R. E.; Watson, I. A.; Robertson, D. H.; Savin, K. A.; Durst, G. L.; Hipskind, P. A. Characteristic Physical Properties and Structural Fragments of Marketed Oral Drugs. J. Med. Chem. 2004, 47.
18 Oral Drug Discrimination Model
[Figure: Global Enhancement on Training Set, Test Set 1 and Test Set 2 for the GA, NBC and SVM models.]
19 Model Interpretability
- Model weights indicate:
  - Important descriptors
  - Important ranges
- Used to guide decision-making processes:
  - Similarity searching
  - Filtering rules
- Rules are focused on the domain of interest
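Reading descriptor ranges off the model weights could be sketched as below. The helper names, the example edges and the choice to report only the single highest-weighted bin per descriptor are assumptions for illustration; open-ended first and last bins are represented with infinities.

```python
import math


def bin_ranges(edges):
    """Turn interior bin edges into (low, high) ranges, open-ended at both extremes."""
    bounds = [-math.inf, *edges, math.inf]
    return list(zip(bounds[:-1], bounds[1:]))


def important_ranges(chromosome, bin_edges):
    """For each descriptor, report the range and weight of its highest-weighted bin."""
    result = {}
    offset = 0
    for name, edges in bin_edges.items():
        ranges = bin_ranges(edges)
        weights = chromosome[offset:offset + len(ranges)]
        best = max(range(len(ranges)), key=lambda k: weights[k])
        result[name] = (ranges[best], weights[best])
        offset += len(ranges)
    return result


# Hypothetical two-descriptor model: MW has 3 bins, ClogP has 2.
edges = {"MW": [300.0, 500.0], "ClogP": [2.0]}
print(important_ranges([1, 9, 3, 7, 2], edges))
```

Ranges extracted this way could serve as candidate filtering rules, for example preferring molecules whose MW falls in the highest-weighted bin.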
20 Conclusions
- Consensus and splice models provide consistently improved results
- GA models provide interpretability greater than or similar to that of the other methods applied here
  - Models are transparent as to which descriptors, and which of their ranges, are of greatest importance in discriminating
- Indications that the GA and NBC methods could be applied in combination
  - Investigation of complementarity
1. Ganguly, M.; Brown, N.; Schuffenhauer, A.; Ertl, P.; Gillet, V. J.; Greenidge, P. A. Introducing the Consensus Modeling Concept in Genetic Algorithms: Application to Interpretable Discrimination Analysis. Submitted to J. Chem. Inf. Model.
21 Areas the Student Covered
- Cluster analysis
- Druglikeness
- Discriminant analysis
- Variable selection
- Genetic algorithms
- Statistical learning methods
- Java programming
- Method development
22 What does the student gain?
- Coding and adapting software
- Tackling everyday challenges of research
- Performing research in industry
- Application-context drug research
- Empowered to pursue their own research
23 What do the mentors gain?
- Freedom to pursue an avenue of interest
- Developing skills in student mentoring
- A new viewpoint with new ideas
- Assisting in training the next generation of scientists
24 Acknowledgements
- University of Sheffield: Milan Ganguly, Val Gillet, Peter Willett
- UCSF: Jérôme Hert
- Cheminformatics: Peter Ertl, Stephen Jelfs
- Computer-Aided Drug Discovery: Paulette Greenidge, Richard Lewis, Nikolaus Stiefl
- Molecular & Library Informatics: Kamal Azzaoui, Edgar Jacoby, Ansgar Schuffenhauer
