Beyond Health 2.0: the semantic web and intelligent systems
Erik van Mulligen PhD, Marc Weeber PhD, Ravi Kalaputapu PhD
Erasmus University Medical Center, Rotterdam, The Netherlands
Knewco Inc, New York, United States of America
Netherlands Institute for Health Sciences (NIHES)
Netherlands Center for BioInformatics (NBIC)
Health Users
John has been using Celebrex for a few years to relieve the pain of his rheumatoid arthritis. Recently he developed stomach complaints, in particular pain. To prepare for a visit to his general practitioner, he is browsing the internet for related information. Can he combine information from different sources (health sites, scientific literature, pharmaceutical sites) well enough to be well informed? What tools would he need to locate and link the right information?
Rationale
- Provide health information consumers (lay people, patients, scientists) with relevant, useful additional information while they consult information on the web.
- Create a semantic web on top of existing information sources that links topics across sites and databases and across modalities (text, video).
- Help health information consumers find relevant, reliable information in the information avalanche.
- Much useful and relevant legacy data and web pages
- Semantic web technology still under development
- Approaches to overlay semantics on the current web
- Combination strategies
Overlay the current web with a semantics layer
[Diagram: semantic web, mapping, web 1.0 & 2.0]
Semantic Mining
- Observational data: EHR, BioBanks, Studies
- Peer reviewed data: Literature, Guidelines, Protocols
- Triple store (RDF/OWL):
  - celebrex causes Upper Gastro-Intestinal Bleeding
  - celecoxib causes upper gastrointestinal hemorrhage
- Ontology:
  - Diseases: C001 = Upper Gastro-Intestinal Bleeding, Upper gastrointestinal hemorrhage
  - Drugs: Celecoxib, Celebrex
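The slide above shows two differently worded statements collapsing onto one triple once the ontology maps term variants to concepts. A minimal sketch of that idea in plain Python follows. It is not the authors' implementation: the concept ID `D001`, the `TripleStore` class, and the assignment of each statement to a source ("EHR", "Literature") are assumptions made for illustration; only `C001` and the term variants come from the slide.

```python
from collections import defaultdict

# Hypothetical mini-ontology: concept ID -> known term variants (lower-cased).
# C001 is taken from the slide; D001 is an identifier invented for this sketch.
ONTOLOGY = {
    "C001": ["upper gastro-intestinal bleeding", "upper gastrointestinal hemorrhage"],
    "D001": ["celecoxib", "celebrex"],
}

# Reverse index: term variant -> concept ID.
TERM_TO_CONCEPT = {term: cid for cid, terms in ONTOLOGY.items() for term in terms}


def normalize(term):
    """Return the concept ID for a term variant, or None if the term is unknown."""
    return TERM_TO_CONCEPT.get(term.strip().lower())


class TripleStore:
    """Toy in-memory store that keeps concept-level triples and their sources."""

    def __init__(self):
        # (subject_id, predicate, object_id) -> set of source labels
        self._sources = defaultdict(set)

    def add(self, subject, predicate, obj, source):
        s, o = normalize(subject), normalize(obj)
        if s is None or o is None:
            raise KeyError(f"unknown term in triple: {subject!r} / {obj!r}")
        self._sources[(s, predicate, o)].add(source)

    def triples(self):
        return sorted(self._sources)

    def sources(self, triple):
        return sorted(self._sources.get(triple, ()))


store = TripleStore()
# The source labels below are assumed; the slide does not say which
# statement came from which kind of data.
store.add("celebrex", "causes", "Upper Gastro-Intestinal Bleeding", source="EHR")
store.add("celecoxib", "causes", "upper gastrointestinal hemorrhage", source="Literature")

print(store.triples())                           # [('D001', 'causes', 'C001')]
print(store.sources(("D001", "causes", "C001")))  # ['EHR', 'Literature']
```

Keeping the sources per triple means evidence from observational and peer reviewed data accumulates on the same statement instead of producing near-duplicate triples.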
EHR, BioBanks, Studies, Literature, Guidelines, Protocols
Source text: "Nonsteroidal anti-inflammatory drugs (NSAIDs) are commonly used, but have risks associated with their use, including significant upper gastrointestinal tract bleeding. Older persons, persons taking anticoagulants, and persons with a history of upper gastrointestinal tract bleeding associated with NSAIDs are at especially high risk."
Relations identified in the text and stored in the triple store (RDF/OWL):
- nonsteroidal anti-inflammatory drugs causes upper gastrointestinal tract bleeding
- older persons increase risk upper gastrointestinal tract bleeding
- anticoagulants increase risk upper gastrointestinal tract bleeding
EHR, BioBanks, Studies, Literature, Guidelines, Protocols
Ontology development
- NCBO
- Unified Medical Language System
- SNOMED CT
OWL/RDF triple formalisms
- nano publication
- aggregation methods: association, mutual information
Specific projects
- EU-ADR: detecting new side effects for drugs from observational data
- OpenPHACTS: combining triples for drug discovery
- CALBC: harmonization & alignment of different NER systems in a large corpus
- Semantic MedLine: semantic relations between entities in PubMed
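The slide above names mutual information as one aggregation method but does not say how it is computed. One common choice is pointwise mutual information (PMI) over co-occurrence counts, sketched here under that assumption. The function name `pmi` and the counts in the example are made up for illustration and are not results from any of the projects listed.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (in bits) of two concepts.

    n_xy    -- number of documents mentioning both concepts
    n_x     -- number of documents mentioning concept x
    n_y     -- number of documents mentioning concept y
    n_total -- total number of documents
    """
    if n_x <= 0 or n_y <= 0 or n_total <= 0:
        raise ValueError("marginal counts and total must be positive")
    if n_xy < 0:
        raise ValueError("co-occurrence count cannot be negative")
    if n_xy == 0:
        return float("-inf")  # the concepts never co-occur
    # log2( p(x, y) / (p(x) * p(y)) ) rewritten in terms of raw counts
    return math.log2((n_xy * n_total) / (n_x * n_y))


# Illustrative counts only: a drug and an event that each appear in
# 100 of 1000 documents.
print(pmi(10, 100, 100, 1000))  # 0.0 (co-occur exactly as often as chance predicts)
print(pmi(50, 100, 100, 1000) > 0)  # True (co-occur more often than chance)
```

A score near zero suggests the pair co-occurs about as often as independence would predict, so only pairs with clearly positive scores would be candidates for an aggregated drug-event triple.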
Example: EU-ADR
- Medical databases: 30 million persons (IT, NL, UK, DK)
- Mapping of events and drugs
- Data mining
- Data extraction: periodic
- Signal detection
- Signal substantiation
- Development of extraction tools
- Literature
- Known side effects
- Retrospective and prospective signal validation
- Pathway analysis
- In-silico simulation
Showing additional information in text
Mapping web page to semantics
- named entity recognition on the fly, mapping term variants to the same concept
- disambiguation on the fly using context
- identifying semantic relations/triples relevant for the user
- identifying the most relevant entities on a page
Triple store (RDF/OWL)
- celebrex causes Upper Gastro-Intestinal Bleeding
- celecoxib causes upper gastrointestinal hemorrhage
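The first step listed above, recognizing entities on the fly and mapping term variants to the same concept, can be approximated with a dictionary lookup. The sketch below assumes a small hand-written lexicon and a regular-expression matcher; the lexicon contents, the ID `D001`, and the `annotate` function are hypothetical. Context-based disambiguation and relation finding are not covered.

```python
import re

# Hypothetical lexicon: lower-cased term variant -> concept ID.
# C001 appears on the slides; D001 is invented for this sketch.
LEXICON = {
    "celebrex": "D001",
    "celecoxib": "D001",
    "upper gastro-intestinal bleeding": "C001",
    "upper gastrointestinal hemorrhage": "C001",
}

# Longest variants first, so a multi-word term is preferred over any
# shorter variant that might be contained in it.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in sorted(LEXICON, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return (start, end, surface_form, concept_id) for each lexicon hit."""
    return [
        (m.start(), m.end(), m.group(0), LEXICON[m.group(0).lower()])
        for m in _PATTERN.finditer(text)
    ]


page = "John takes Celebrex; celecoxib may cause upper gastrointestinal hemorrhage."
for start, end, surface, concept in annotate(page):
    print(start, end, surface, concept)
```

The character offsets let a page overlay highlight the original wording, while the concept ID is what gets looked up in the triple store.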
Adding semantics
Health Users
John has been using Celebrex for a few years to relieve the pain of his rheumatoid arthritis. Using the semantic layer he now knows that Celebrex is the brand name for celecoxib, which belongs to the family of nonsteroidal anti-inflammatory drugs. This family of drugs is known to cause upper gastrointestinal bleeding. He will ask his general practitioner whether there are alternatives that don't have these particular side effects.
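John's chain of reasoning (brand name, then generic drug, then drug family, then side effect) amounts to following links in the triple store. The sketch below shows one way such a lookup could work. The predicate names `brand_name_of` and `member_of` and the function `known_side_effects` are assumptions for illustration, and treating a family-level side effect as applying to every member is a simplification.

```python
# Triples mirroring the statements in John's scenario; predicate names are hypothetical.
TRIPLES = [
    ("celebrex", "brand_name_of", "celecoxib"),
    ("celecoxib", "member_of", "nonsteroidal anti-inflammatory drugs"),
    ("nonsteroidal anti-inflammatory drugs", "causes", "upper gastrointestinal bleeding"),
]

# Predicates that lead from a specific drug to a more general entity.
HIERARCHY = {"brand_name_of", "member_of"}


def known_side_effects(drug, triples=TRIPLES):
    """Walk up the hierarchy from `drug` and collect (entity, side_effect) pairs."""
    seen, stack, effects = set(), [drug], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against cycles in the data
        seen.add(node)
        for subject, predicate, obj in triples:
            if subject != node:
                continue
            if predicate == "causes":
                effects.append((node, obj))
            elif predicate in HIERARCHY:
                stack.append(obj)
    return effects


print(known_side_effects("celebrex"))
# [('nonsteroidal anti-inflammatory drugs', 'upper gastrointestinal bleeding')]
```

Returning the entity that carries the side effect, not only the effect itself, lets the overlay explain to the reader why the warning applies to the drug on the page.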
Requirements
- A rich enough ontology and triple store that connects topics
- On-the-fly analysis of web pages to identify health topics
- Term variations
- Disambiguation / page analysis
- Benchmarking (CALBC, I2B2, BioCreative, TREC)
- Linkage with different information sources
- Information available at the point of reading
Semantic Enrichment
- Easy deployment
  - Enrichment provided by site
  - On-demand enrichment
- User monitoring / intelligent systems
  - Based on context, highlight different topics
  - Populating relevant linked information, depending on context
  - Client-side user tracking to determine context
Business model
- Advertisement
- Licensing by site owner
- Licensing by end-user (app store)
- Open (source) architecture
Next
- Extending context-specific population of linked information
  - Drugs -> side effects
  - Disease -> treatments, guidelines
- Linking with online electronic health records
- Linking with social media (patient organizations, patients with same disease, patients like me)
- Deeper NLP
Thanks for your attention! I'm happy to take questions either now, if time permits, or by e-mail: e.vanmulligen@erasmusmc.nl