Knowledge Graph: Connecting Big Data Semantics. Ying Ding Indiana University
|
|
|
- Lauren Small
- 10 years ago
- Views:
Transcription
1 Knowledge Graph: Connecting Big Data Semantics Ying Ding Indiana University
2 Outline Vision Use Case: VIVO Ontology Use Case: Chem2Bio2RDF Challenges
3 VISION
4 Vision Changes in Search Strings vs. things
5 Vision Changes in Search Relation matters: connecting things/entities
6 Vision Changes in Search Subgraph: Context is king
7 Vision Changes in Search Future search: string entity relation subgraph Filippo Menczer & Elinor Ostrom
8 Entities Entities are everywhere Entities on the Web: person, location, organization, book, music (vivoweb.org) Entities in medicine: gene, drug, disease, protein, side effect (chem2bio2rdf.org)
9 VIVO
10 VIVO: National networking of scientists VIVO: $12.5M funded by National Institute of Health to enable national networking of scientists 9/1/2009 8/31/2012, with one year extension partners (Univ of Florida, Cornell Univ, Indiana University, Washington Univ, Scripps, Weill Cornell, Ponce Medical School) It utilizes Semantic Web technologies to model scientists and provides federated search to enhance the discovery of researchers and collaborators across the country Together with its sister project eagle i ($13M), they will provide the semantic portals to network people and share resources.
11
12 VIVO Ontology: Modeling Network of Scientists Network Structure: People: foaf:person, foaf:organization, Output: vivo:informationresources Relationship: vivo:role Academic Setting: Research (bibo:document, vivo:grant, vivo:project, vivo:software, vivo:dataset, vivo:researchlaboratory) Teaching (vivo:teacherrole, vivo:course) Service (vivo:service, vivo:editorrole, vivo:organizerrole, ) Expertise (skos:concept)
13
14
15 Relationships have nuances The VIVO ontology supports representing rich information about relationships and how they change over time description and duration of a person s participation in a project or event current and former employment, with titles and dates author order in a publication Implemented as classes whose members we call context nodes
16
17 VIVO ontology localization Different localization required by different institutions UF, Cornell, IU, WASHU, Scripps, MED Cornell How to make localization: Adding local namespace: indiana: indiana/ core: Local classes are the subclasses of the VIVO Core foaf:person core:non academic indiana:professional Staff indiana: AdministrativeServices
18 Modeling examples: Research Scenario: Prof. Katy Börner coauthored with Nianli, Russell, Angela for the following publication: Börner, Katy, Ma, Nianli, Duhon, Russell J., Zoss, Angela M. (2009) Open Data and Open Code for S&T Assessment. IEEE Intelligent Systems. 24(4), pp , July/August.
19 Modeling examples: Research < rdf:type < >. < < < < > rdf:type < < < < < rdf:type <
20 Modeling examples: Research < rdf:type < < < < < rdf:type < < < 2. < < <
21 RDF Graph core:facultym ember rdf:type individual:pers on25557 individual:per son rdf:type core:nonacade mic core:authorinauthorship core:authorship core:authorinauthorship individual:n74 rdf:type rdf:type individual:n28 81 core:authorrank 2 core:linkedinformationresource core:linkedinformationresource individual:n7109 rdf:type gy/bibo/article
22 Applications Querying semantic data SPARQL query builder onto.slis.indiana.edu/sparql/ Federated Search VIVO Search
23 CHEM2BIO2RDF
24 Big Data in Life Sciences There is now an incredibly rich resource of public information relating compounds, targets, genes, pathways, and diseases. Just for starters there is in the public domain information on: 69 million compounds and 449,392 bioassays (PubChem) 59 million compound bioactivities (PubChem Bioassay) 4,763 drugs (DrugBank) 9 million protein sequences (SwissProt) and 58,000 3D structures (PDB) 14 million human nucleotide sequences (EMBL) 22 million life sciences publications 800,000 new each year (PubMed) Multitude of other sets (drugs, toxicogenomics, chemogenomics, metagenomics ) Even more important are the relationships between these entities. For example a chemical compound can be linked to a gene or a protein target in a multitude of ways: Biological assay with percent inhibition, IC50, etc Crystal structure of ligand/protein complex Co occurrence in a paper abstract Computational experiment (docking, predictive model) Statistical relationship System association (e.g. involved in same pathways cellular processes)
25 How to take advantage of big data? New biomedical insights Knowledge discovery processes Integrative Tools & Algorithms Networks of data & relationships Nuclear receptors: PPAR gamma, PXR SPARQL query builder Association Search & pathfinding ChemoHub: network predictive models Topic models & ranking WENDI & Chemogenomic Explorer Plotviz 3D visualization Chem2Bio2RDF PubMedNet Databases & Publications Compounds, Drugs, Proteins, Genes, Pathways, Diseases, Side Effects, Publications
26 Text CSV Table HTML XML Patient Disease Tissue Cell Pathway DNA RNA Protein Drug
27 Text CSV Table HTML XML RDF need a data format! Patient Disease Tissue need semantics! bindto Cell Pathway DNA RNA Protein Drug
28 Chem2Bio2RDF NCI Human Tumor Cell Lines Data PubChem Compound Database PubChem Bioassay Database PubChem Descriptions of all PubChem bioassays Pub3D: A similarity searchable database of minimized 3D structures for PubChem compounds Drugbank MRTD: An implementation of the Maximum Recommended Therapeutic Dose set Medline: IDs of papers indexed in Medline, with SMILES of chemical structures ChEMBL chemogenomics database KEGG Ligand pathway database Comparative Toxicogenomics Database PhenoPred Data HuGEpedia: an encyclopedia of human genetic variation in health and disease. 31m chemical structures 59m bioactivity data points 3m/19m publications ~5,000 drugs
29 Dereferenable URI Bio2RDF Browsing PlotViz: Visualization Cytoscape Plugin Chem2Bio2RDF RDF Triple store LODD Linked Path Generation and Ranking uniprot Others SPARQL ENDPOINTS Third party tools
30 Relating Pathways to Adverse Drug Reactions
31 RDF alone is not enough Need standardization Troglitazone binds to PPARG Romozins binds to PPARG Romozins is another name of Troglitazone
32 Chem2Bio2OWL
33 33
34 RDF Search Target for Troglitazone PREFIX c2b2r: PREFIX bp: < PREFIX rdfs: < select distinct?target from < where {?chemical rdfs:label?drugname ; c2b2r:hasinteraction?interaction.?interaction c2b2r:hastarget [bp:name?target]; c2b2r:drugtarget true. FILTER (str(?drugname)="troglitazone") } Mashed Chem2Bio2RDF Annotated Chem2Bio2OWL
35 SEMANTIC GRAPH MINING: PATH FINDING ALGORITHM Dijkstra s algorithm 22
36 Bio LDA Latent Dirichlet Allocation (LDA) The core of the group of powerful statistical modeling techniques for automated extraction of latent topics from large document collections Bio LDA Extended LDA model with Bio terms as latent variable Bio terms: compound, gene, drug, disease, protein, side effect, pathways Calculate bio term entropies over topics Use the Kullback Leibler divergence as the non symmetric distance measure for two bioterms over topics
37 Example: Topic 10 Apply Bio LDA on 336,899 PubMed article abstracts in 2009 and extract 50 topics
38 Diversity subgraph Fig. Ranked association graphs between myocardial infarction and Troglitazone 38
39 Thiazolinediones (TZDs) revolutionary treatment for type II Diabetes Troglitazone (Rezulin): withdrawn in 2000 (liver disease) Rosiglitazone (Avandia): restricted in 2010 (cardiac disease) Rosiglitazone bound into PPAR γ Pioglitazone:???? (does decrease blood sugar levels, was associated with bladder tumors and has been withdrawn in some countries.)
40 PPARG: TZD target SAA2: Involved in inflammatory response implicated in cardiovascular disease (Current Opinion in Lipidology 15,3,, ) APOE: Apolipoprotein E3 essential for lipoprotein catabolism. Implicated in cardiovascular disease. ADIPOQ: Adiponectin involved in fatty acid metabolism. Implicated in metabolic syndrome, diabetes and cardiovascular disease CYP2C8: Cytochrome P450 present in cardiovascular tissue and involved in metabolism of xenobiotics CDKN2A: Tumor suppression gene SLC29A1: Membrane transporter
41 Semantic Prediction
42 ? Drug1 Target 1 Substructure Side effect Chemical ontology Gene expression profile bind Drug 2 From Ligand perspective
43 ? Drug 1 Target 1 bind Sequence 3D structure GO Ligand Target 2 From target perspective
44 Example: Troglitazone and PPARG Association score: Association significance: 9.06 x 10 6 => missing link predicted
45 Topology is important for association hassubstructure hassubstructure bind Cmpd 1 Cmpd 2 Protein 1 hassubstructure hassubstructure bind Cmpd 1 Cmpd 2 Protein 1
46 Semantics is important for association Cmpd 1 bind Protein bind bind Cmpd 2 2 Protein 1 Cmpd1 bind Protein 2 hasgo GO: hasgo Protein 1 Cmpd1 bind Protein 2 PPI Protein 1 Cmpd1 hassideeffect hyperten sion hasside ffect Cmpd 2 bind Protein 1 Cmpd1 hassubstructure substruct ure1 hassubstructure Cmpd 2 bind Protein 1
47 SLAP Pipeline Path filtering
48 Cross check with SEA SEA analysis (Nature 462, , 2009) predicts 184 new compound target pairs, 30 of which were experimentally tested 23 of these pairs were experimentally validated (<15uM) including 15 aminergic GPCR targets and 8 which crossed major receptor classification boundaries 9 of the aminergic GPCR target pairings were correctly predicted by SLAP (p<0.05) for the other 6 compounds were not present in our set 1 of the 8 cross boundary pairs was predicted
49 Assessing drug similarity from biological function Took 157 drugs with 10 known therapeutic indications, and created SLAP profiles against 1,683 human targets Pearson correlation between profiles > 0.9 from SLAP was used to create associations between drugs Drugs with the same therapeutic indication unsurprisingly cluster together Some drugs with similar profile have different indications potential for use in drug repurposing?
50 Challenges Generating entities: converting strings to things Using URI to identify/integrate entities (RDF) Using common schemas to represent semantics (ontologies) Managing relations: Model properties of relations Search and rank relations Handling context: tricky Triples vs. Quads Provenance: who says what, data provenance, how (process) provenance, workflow provenance
51 Challenges Others: Query efficiency, Data security, Data quality, Big Data + Big Challenge Unlimited Potential Connect Share Discover
52 Thanks
Big Data in Drug Discovery
Big Data in Drug Discovery David J. Wild Assistant Professor & Director, Cheminformatics Program Indiana University School of Informatics and Computing [email protected] - http://djwild.info Epochs in
Integrating Medicinal Chemistry and Computational Chemistry: The Molecular Forecaster Approach
Integrating Medicinal Chemistry and Computational Chemistry: The Molecular Forecaster Approach Molecular Forecaster Inc. www.molecularforecaster.com Company Profile Founded in 2010 by Dr. Eric Therrien
ProteinQuest user guide
ProteinQuest user guide 1. Introduction... 3 1.1 With ProteinQuest you can... 3 1.2 ProteinQuest basic version 4 1.3 ProteinQuest extended version... 5 2. ProteinQuest dictionaries... 6 3. Directions for
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
LDIF - Linked Data Integration Framework
LDIF - Linked Data Integration Framework Andreas Schultz 1, Andrea Matteini 2, Robert Isele 1, Christian Bizer 1, and Christian Becker 2 1. Web-based Systems Group, Freie Universität Berlin, Germany [email protected],
dixa a data infrastructure for chemical safety Jos Kleinjans Dept of Toxicogenomics Maastricht University
dixa a data infrastructure for chemical safety Jos Kleinjans Dept of Toxicogenomics Maastricht University Current protocol for chemical safety testing Short Term Tests for Genetic Toxicity Bacterial Reverse
Healthcare Analytics. Aryya Gangopadhyay UMBC
Healthcare Analytics Aryya Gangopadhyay UMBC Two of many projects Integrated network approach to personalized medicine Multidimensional and multimodal Dynamic Analyze interactions HealthMask Need for sharing
From Data to Foresight:
Laura Haas, IBM Fellow IBM Research - Almaden From Data to Foresight: Leveraging Data and Analytics for Materials Research 1 2011 IBM Corporation The road from data to foresight is long? Consumer Reports
Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013
Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013 James Maltby, Ph.D 1 Outline of Presentation Semantic Graph Analytics Database Architectures In-memory Semantic Database Formulation
Web-Based Genomic Information Integration with Gene Ontology
Web-Based Genomic Information Integration with Gene Ontology Kai Xu 1 IMAGEN group, National ICT Australia, Sydney, Australia, [email protected] Abstract. Despite the dramatic growth of online genomic
Semantic Data Management. Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies
Semantic Data Management Xavier Lopez, Ph.D., Director, Spatial & Semantic Technologies 1 Enterprise Information Challenge Source: Oracle customer 2 Vision of Semantically Linked Data The Network of Collaborative
Using Open Source software and Open data to support Clinical Trial Protocol design
Using Open Source software and Open data to support Clinical Trial Protocol design Nikolaos Matskanis, Joseph Roumier, Fabrice Estiévenart {nikolaos.matskanis, joseph.roumier, fabrice.estievenart}@cetic.be
> Semantic Web Use Cases and Case Studies
> Semantic Web Use Cases and Case Studies Case Study: Applied Semantic Knowledgebase for Detection of Patients at Risk of Organ Failure through Immune Rejection Robert Stanley 1, Bruce McManus 2, Raymond
Integrating Bioinformatics, Medical Sciences and Drug Discovery
Integrating Bioinformatics, Medical Sciences and Drug Discovery M. Madan Babu Centre for Biotechnology, Anna University, Chennai - 600025 phone: 44-4332179 :: email: [email protected] Bioinformatics
Using the Grid for the interactive workflow management in biomedicine. Andrea Schenone BIOLAB DIST University of Genova
Using the Grid for the interactive workflow management in biomedicine Andrea Schenone BIOLAB DIST University of Genova overview background requirements solution case study results background A multilevel
Semantic Modeling with RDF. DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo
DBTech ExtWorkshop on Database Modeling and Semantic Modeling Lili Aunimo Expected Outcomes You will learn: Basic concepts related to ontologies Semantic model Semantic web Basic features of RDF and RDF
Extraction and Visualization of Protein-Protein Interactions from PubMed
Extraction and Visualization of Protein-Protein Interactions from PubMed Ulf Leser Knowledge Management in Bioinformatics Humboldt-Universität Berlin Finding Relevant Knowledge Find information about Much
org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers.
org.rn.eg.db December 16, 2015 org.rn.egaccnum Map Entrez Gene identifiers to GenBank Accession Numbers org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank
In Vivo and In Vitro Screening for Thyroid Hormone Disruptors
In Vivo and In Vitro Screening for Thyroid Hormone Disruptors Kevin M. Crofton National lhealth and Environmental Research Laboratory California Environmental Protection Agency Human Health Hazard Indicators
Processing Genome Data using Scalable Database Technology. My Background
Johann Christoph Freytag, Ph.D. [email protected] http://www.dbis.informatik.hu-berlin.de Stanford University, February 2004 PhD @ Harvard Univ. Visiting Scientist, Microsoft Res. (2002)
Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge
Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge http://euadr-project.org PEDRO LOPES [email protected] University of Manchester October
Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences
Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences It s not information overload, it s filter failure. Clay Shirky Life Sciences organizations face the challenge
How To Use Data Analysis To Get More Information From A Computer Or Cell Phone To A Computer
Applying Big Data approaches to Competitive Intelligence challenges THOMSON REUTERS IP & SCIENCE PHARMA CI EUROPE CONFERENCE & EXHIBITION TIM MILLER 19 FEBRUARY 2014 BIG DATA, NOT JUST ABOUT VOLUMES Patient
Version 1 2015. Module guide. Preliminary document. International Master Program Cardiovascular Science University of Göttingen
Version 1 2015 Module guide International Master Program Cardiovascular Science University of Göttingen Part 1 Theoretical modules Synopsis The Master program Cardiovascular Science contains four theoretical
Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources
Computational Drug Repositioning by Ranking and Integrating Multiple Data Sources Ping Zhang IBM T. J. Watson Research Center Pankaj Agarwal GlaxoSmithKline Zoran Obradovic Temple University Terms and
VIVO Dashboard A Drupal-based tool for harvesting and executing sophisticated queries against data from a VIVO instance
VIVO Dashboard A Drupal-based tool for harvesting and executing sophisticated queries against data from a VIVO instance! Paul Albert, Miles Worthington and Don Carpenter Chapter I: The Problem Administrators
Diabetes and Drug Development
Diabetes and Drug Development Metabolic Disfunction Leads to Multiple Diseases Hypertension ( blood pressure) Metabolic Syndrome (Syndrome X) LDL HDL Lipoproteins Triglycerides FFA Hyperinsulinemia Insulin
LINKED OPEN DRUG DATA FROM THE HEALTH INSURANCE FUND OF MACEDONIA
LINKED OPEN DRUG DATA FROM THE HEALTH INSURANCE FUND OF MACEDONIA Milos Jovanovik, Bojan Najdenov, Dimitar Trajanov Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University Skopje,
A leader in the development and application of information technology to prevent and treat disease.
A leader in the development and application of information technology to prevent and treat disease. About MOLECULAR HEALTH Molecular Health was founded in 2004 with the vision of changing healthcare. Today
PPInterFinder A Web Server for Mining Human Protein Protein Interaction
PPInterFinder A Web Server for Mining Human Protein Protein Interaction Kalpana Raja, Suresh Subramani, Jeyakumar Natarajan Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar
Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives
Vad är bioinformatik och varför behöver vi det i vården? a bioinformatician's perspectives [email protected] 2015-05-21 Functional Bioinformatics, Örebro University Vad är bioinformatik och varför
Information Extraction from Patents: Combining Text- and Image-Mining. Martin Hofmann-Apitius
Information Extraction from Patents: Combining Text- and Image-Mining Martin Hofmann-Apitius Bonn-Aachen International Centre for Information Technology (B-IT) September 25, 2007 Status Report: Major Achievements
Identification of rheumatoid arthritis and osteoarthritis patients by transcriptome-based rule set generation
Identification of rheumatoid arthritis and osterthritis patients by transcriptome-based rule set generation Bering Limited Report generated on September 19, 2014 Contents 1 Dataset summary 2 1.1 Project
BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS
BBSRC TECHNOLOGY STRATEGY: TECHNOLOGIES NEEDED BY RESEARCH KNOWLEDGE PROVIDERS 1. The Technology Strategy sets out six areas where technological developments are required to push the frontiers of knowledge
Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments
Using Ontologies in Proteus for Modeling Data Mining Analysis of Proteomics Experiments Mario Cannataro, Pietro Hiram Guzzi, Tommaso Mazza, and Pierangelo Veltri University Magna Græcia of Catanzaro, 88100
LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model
LINKED DATA EXPERIENCE AT MACMILLAN Building discovery services for scientific and scholarly content on top of a semantic data model 22 October 2014 Tony Hammond Michele Pasin Background About Macmillan
IT services for analyses of various data samples
IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical
SPARQL UniProt.RDF. Get these slides! Tutorial plan. Everyone has had some introduction slash knowledge of RDF.
SPARQL UniProt.RDF Everyone has had some introduction slash knowledge of RDF. Jerven Bolleman Developer Swiss-Prot Group Swiss Institute of Bioinformatics Get these slides! https://sites.google.com/a/jerven.eu/jerven/home/
City Data Pipeline. A System for Making Open Data Useful for Cities. [email protected]
City Data Pipeline A System for Making Open Data Useful for Cities Stefan Bischof 1,2, Axel Polleres 1, and Simon Sperl 1 1 Siemens AG Österreich, Siemensstraße 90, 1211 Vienna, Austria {bischof.stefan,axel.polleres,simon.sperl}@siemens.com
Pipeline Pilot Enterprise Server. Flexible Integration of Disparate Data and Applications. Capture and Deployment of Best Practices
overview Pipeline Pilot Enterprise Server Pipeline Pilot Enterprise Server (PPES) is a powerful client-server platform that streamlines the integration and analysis of the vast quantities of data flooding
Analysis of Illumina Gene Expression Microarray Data
Analysis of Illumina Gene Expression Microarray Data Asta Laiho, Msc. Tech. Bioinformatics research engineer The Finnish DNA Microarray Centre Turku Centre for Biotechnology, Finland The Finnish DNA Microarray
BIOINF 525 Winter 2016 Foundations of Bioinformatics and Systems Biology http://tinyurl.com/bioinf525-w16
Course Director: Dr. Barry Grant (DCM&B, [email protected]) Description: This is a three module course covering (1) Foundations of Bioinformatics, (2) Statistics in Bioinformatics, and (3) Systems
Bench to Bedside Clinical Decision Support:
Bench to Bedside Clinical Decision Support: The Role of Semantic Web Technologies in Clinical and Translational Medicine Tonya Hongsermeier, MD, MBA Corporate Manager, Clinical Knowledge Management and
Regulatory Issues in Genetic Testing and Targeted Drug Development
Regulatory Issues in Genetic Testing and Targeted Drug Development Janet Woodcock, M.D. Deputy Commissioner for Operations Food and Drug Administration October 12, 2006 Genetic and Genomic Tests are Types
ChemCloud - Chemical e-science Information Cloud. Adrian Paschke, Freie Universitaet Berlin Stephan Heineke, FIZ CHEMIE
ChemCloud - Chemical e-science Information Cloud Adrian Paschke, Freie Universitaet Berlin Stephan Heineke, FIZ CHEMIE 1 About FIZ CHEMIE 1830 Founding of Pharmaceutisches Centralblatt Reestablished 1981
HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM. Aniket Bochare - [email protected]. CMSC 601 - Presentation
HETEROGENEOUS DATA INTEGRATION FOR CLINICAL DECISION SUPPORT SYSTEM Aniket Bochare - [email protected] CMSC 601 - Presentation Date-04/25/2011 AGENDA Introduction and Background Framework Heterogeneous
Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management
Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management Ram Soma 2, Amol Bakshi 1, Kanwal Gupta 3, Will Da Sie 2, Viktor Prasanna 1 1 University of Southern California,
2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: [email protected]) Dr. G.N. Singh Department of Physics and
A Primer of Genome Science THIRD
A Primer of Genome Science THIRD EDITION GREG GIBSON-SPENCER V. MUSE North Carolina State University Sinauer Associates, Inc. Publishers Sunderland, Massachusetts USA Contents Preface xi 1 Genome Projects:
We have big data, but we need big knowledge
We have big data, but we need big knowledge Weaving surveys into the semantic web ASC Big Data Conference September 26 th 2014 So much knowledge, so little time 1 3 takeaways What are linked data and the
Visualizing Networks: Cytoscape. Prat Thiru
Visualizing Networks: Cytoscape Prat Thiru Outline Introduction to Networks Network Basics Visualization Inferences Cytoscape Demo 2 Why (Biological) Networks? 3 Networks: An Integrative Approach Zvelebil,
Lecture 11 Data storage and LIMS solutions. Stéphane LE CROM [email protected]
Lecture 11 Data storage and LIMS solutions Stéphane LE CROM [email protected] Various steps of a DNA microarray experiment Experimental steps Data analysis Experimental design set up Chips on catalog
Clinical and research data integration: the i2b2 FSM experience
Clinical and research data integration: the i2b2 FSM experience Laboratory of Biomedical Informatics for Clinical Research Fondazione Salvatore Maugeri - FSM - Hospital, Pavia, italy Laboratory of Biomedical
RETRIEVING SEQUENCE INFORMATION. Nucleotide sequence databases. Database search. Sequence alignment and comparison
RETRIEVING SEQUENCE INFORMATION Nucleotide sequence databases Database search Sequence alignment and comparison Biological sequence databases Originally just a storage place for sequences. Currently the
Lecture Outline. Introduction to Databases. Introduction. Data Formats Sample databases How to text search databases. Shifra Ben-Dor Irit Orr
Introduction to Databases Shifra Ben-Dor Irit Orr Lecture Outline Introduction Data and Database types Database components Data Formats Sample databases How to text search databases What units of information
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology
University of Glasgow - Programme Structure Summary C1G5-5100 MSc Bioinformatics, Polyomics and Systems Biology Programme Structure - the MSc outcome will require 180 credits total (full-time only) - 60
Subgraph Patterns: Network Motifs and Graphlets. Pedro Ribeiro
Subgraph Patterns: Network Motifs and Graphlets Pedro Ribeiro Analyzing Complex Networks We have been talking about extracting information from networks Some possible tasks: General Patterns Ex: scale-free,
excellent graph matching capabilities with global graph analytic operations, via an interface that researchers can use to plug in their own
Steve Reinhardt 2 The urika developers are extending SPARQL s excellent graph matching capabilities with global graph analytic operations, via an interface that researchers can use to plug in their own
Bioinformatics: course introduction
Bioinformatics: course introduction Filip Železný Czech Technical University in Prague Faculty of Electrical Engineering Department of Cybernetics Intelligent Data Analysis lab http://ida.felk.cvut.cz
Pathway Analysis : An Introduction
Pathway Analysis : An Introduction Experiments Literature and other KB Data Knowledge Structure in Data through statistics Structure in Knowledge through GO and other Ontologies Gain insight into Data
Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud: Active Hiring a Use Case Study
Revealing Trends and Insights in Online Hiring Market Using Linking Open Data Cloud: Active Hiring a Use Case Study Amar-Djalil Mezaour 1, Julien Law-To 1, Robert Isele 3, Thomas Schandl 2, and Gerd Zechmeister
InSyBio BioNets: Utmost efficiency in gene expression data and biological networks analysis
InSyBio BioNets: Utmost efficiency in gene expression data and biological networks analysis WHITE PAPER By InSyBio Ltd Konstantinos Theofilatos Bioinformatician, PhD InSyBio Technical Sales Manager August
Gene Silencing Oligos (GSOs) Third Generation Antisense
Gene Silencing Oligos (GSOs) Third Generation Antisense Walter R. Strapps, Ph.D. Executive Director, RNA Therapeutics Idera Pharmaceuticals Cambridge, MA NASDAQ: IDRA www.iderapharma.com Idera is a leader
i2b2 Clinical Research Chart
i2b2 Clinical Research Chart Shawn Murphy MD, Ph.D. Griffin Weber MD, Ph.D. Michael Mendis Vivian Gainer MS Lori Phillips MS Rajesh Kuttan Wensong Pan MS Henry Chueh MD Susanne Churchill Ph.D. John Glaser
Managing enterprise applications as dynamic resources in corporate semantic webs an application scenario for semantic web services.
Managing enterprise applications as dynamic resources in corporate semantic webs an application scenario for semantic web services. Fabien Gandon, Moussa Lo, Olivier Corby, Rose Dieng-Kuntz ACACIA in short
An industry perspective on deployed semantic interoperability solutions
An industry perspective on deployed semantic interoperability solutions Ralph Hodgson, CTO, TopQuadrant SEMIC Conference, Athens, April 9, 2014 https://joinup.ec.europa.eu/community/semic/event/se mic-2014-semantic-interoperability-conference
Text Mining for Health Care and Medicine. Sophia Ananiadou Director National Centre for Text Mining www.nactem.ac.uk
Text Mining for Health Care and Medicine Sophia Ananiadou Director National Centre for Text Mining www.nactem.ac.uk The Need for Text Mining MEDLINE 2005: ~14M 2009: ~18M Overwhelming information in textual,
Analysis of the colorectal tumor microenvironment using integrative bioinformatic tools
MLECNIK Bernhard & BINDEA Gabriela Analysis of the colorectal tumor microenvironment using integrative bioinformatic tools INSERM U872, Jérôme Galon Team15: Integrative Cancer Immunology Cordeliers Research
Semantic Stored Procedures Programming Environment and performance analysis
Semantic Stored Procedures Programming Environment and performance analysis Marjan Efremov 1, Vladimir Zdraveski 2, Petar Ristoski 2, Dimitar Trajanov 2 1 Open Mind Solutions Skopje, bul. Kliment Ohridski
Protein-responsive ribozyme switches in eukaryotic cells
Protein-responsive ribozyme switches in eukaryotic cells Andrew B. Kennedy, James V. Vowles, Leo d Espaux, and Christina D. Smolke Presented by Marianne Linz and Jennifer Thornton March 11, 2015 Synthetic
An Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials
ehealth Beyond the Horizon Get IT There S.K. Andersen et al. (Eds.) IOS Press, 2008 2008 Organizing Committee of MIE 2008. All rights reserved. 3 An Ontology Based Method to Solve Query Identifier Heterogeneity
AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE
ACCELERATING PROGRESS IS IN OUR GENES AGILENT S BIOINFORMATICS ANALYSIS SOFTWARE GENESPRING GENE EXPRESSION (GX) MASS PROFILER PROFESSIONAL (MPP) PATHWAY ARCHITECT (PA) See Deeper. Reach Further. BIOINFORMATICS
GenBank, Entrez, & FASTA
GenBank, Entrez, & FASTA Nucleotide Sequence Databases First generation GenBank is a representative example started as sort of a museum to preserve knowledge of a sequence from first discovery great repositories,
The Ontology and Architecture for an Academic Social Network
www.ijcsi.org 22 The Ontology and Architecture for an Academic Social Network Moharram Challenger Computer Engineering Department, Islamic Azad University Shabestar Branch, Shabestar, East Azerbaijan,
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
