Structuring Medical Records with Apache Stanbol. Rafa Haro, Senior Software Engineer, Athento; Antonio Pérez Morales, Senior Software Engineer, Ixxus
Transcription
1 Structuring Medical Records with Apache Stanbol. Rafa Haro, Senior Software Engineer, Athento; Antonio Pérez Morales, Senior Software Engineer, Ixxus
2 Committer, PMC Apache Stanbol, Apache ManifoldCF Topics: Document Analysis, NLP, Machine Learning, Semantic Technologies, ECM Apache Stanbol, Apache ManifoldCF Topics: ECM, Semantic Search, ETL, Machine Learning
3 Apache Stanbol provides a set of reusable components for semantic content management. It extends existing CMSs with a number of semantic services. Traditional Semantic CMS
4 Software Architecture for Semantically Enabled CM and ECM systems
5 Apache Stanbol Story
- Started within the FP7 European project IKS (Interactive Knowledge Stack)
- The IKS project brought together an open source community for defining and building platforms in the semantic CMS space
- Incubated in November 2010
- Successfully promoted within the CMS and ECM industry through the IKS Early Adopters Program
- Graduated to top-level Apache project in October 2012
6 What is a Semantic CMS?
Traditional CMS:
- Atomic unit: Document
- Properties as metadata (key-value schemas)
- Keyword search
- Document management
- Document types
- Document workflow
Semantic CMS:
- Atomic unit: Entity
- Semantic metadata (RDF)
- Semantic search
- Knowledge management
- Entity management
- Ontologies
Source: "What Apache Stanbol Can Do for You?", Fabian Christ, ApacheCon Europe 2012
7 Key Points
- Designed to bring semantic technologies to existing CMSs
- Non-intrusive set of RESTful semantic services
- Extremely modular: use only the modules you need
- Main features:
  - Multilingual content enhancement: structure content through semantic metadata
  - Knowledge base management
  - Knowledge models and reasoning
  - Semantic indexing and search
8 Stanbol Components
- Stanbol components provide: a RESTful API, Java APIs and OSGi services
- Stanbol components do NOT depend on each other; however, they can be easily combined to
- Apache Stanbol Component Layer (diagram): Enhancer (with Enhancement Engines), EntityHub, Ontology Manager, Reasoners, ContentHub, FactStore, CMS Adapter, Rules
9 Stanbol Components (II)
- Enhancer: extracts knowledge from unstructured parsed content
- EntityHub: manages domain entities and topics (knowledge bases)
- ContentHub: semantic indexing and search over your semantically enhanced content
- CMS Adapter: syncs your CMS with Apache Stanbol (JCR/CMIS)
- Ontology Manager: manages your formal domain knowledge
- Reasoners & Rules: apply domain knowledge to improve or validate extracted information; refactor or refine knowledge to align it with public schemas such as schema.org
10 Built on Top of Apache
- Apache Felix as OSGi environment
- Apache Sling launchers and OSGi tools
- Apache Maven for building
- Apache Clerezza as RDF framework
- Apache Jena as triple store
- Apache Solr for knowledge base management
- Apache Tika for converting input
- Apache OpenNLP for NLP processing
11 Integration Scenarios
- Stand-alone server (Stanbol launchers)
- Web application (servlet container)
- Embedded within an OSGi environment
Source: "What Apache Stanbol Can Do for You?", Fabian Christ, ApacheCon Europe 2012
12 Project Current Status Incubation (Nov 2010) Apache Stanbol incubating (Aug 2012) Graduation (October 2012) IKS Project Ending (Dec 2012) Apache Stanbol (March 2014) Apache Stanbol (October 2016) Contributions (commits) to Trunk Since Incubation
13 Project Current Status (II)
- 22 PMC members (last addition Jul 2016)
- 26 committers (last addition May 2015)
- 3-5 active committers in the last 2 years
- dev@stanbol.apache.org: 228 subscribers
- Activity has been gradually decreasing
- 3 major releases
Source: Apache Stanbol Committee Report Helper (
14 Stanbol Enhancer RDF
15 Stanbol Enhancer (II)
16 Stanbol Enhancer (III)
17 Stanbol Enhancement Chains
- Define how content is processed by the Enhancer through an ExecutionPlan
- Different implementations:
  - ListChain: sequential, in-order execution of the enhancement engines. Parallel execution of engines is not supported
  - WeightedChain: the ExecutionPlan is calculated from the engines' order metadata. Parallel execution of engines is allowed
- API:
  - /enhancer: executes the default chain
  - /enhancer/chain/{chain-name}: executes a concrete named chain
  - /enhancer/engine/{engine-name}: executes a concrete named engine
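The three endpoints above can be addressed from plain Java. The following is a minimal sketch, not taken from the slides: the base URL (a local launcher on port 8080), the `text/turtle` response format, and the class and method names are assumptions made for illustration. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) requests for the Enhancer endpoints listed on the slide. */
final class EnhancerRequests {
    // Assumption: a Stanbol launcher running locally; adjust host and port to your setup.
    static final String BASE_URL = "http://localhost:8080";

    /** /enhancer: the default chain. */
    static URI defaultChain() {
        return URI.create(BASE_URL + "/enhancer");
    }

    /** /enhancer/chain/{chain-name}: a concrete named chain (name is not URL-encoded here). */
    static URI namedChain(String chainName) {
        return URI.create(BASE_URL + "/enhancer/chain/" + chainName);
    }

    /** /enhancer/engine/{engine-name}: a single named engine. */
    static URI singleEngine(String engineName) {
        return URI.create(BASE_URL + "/enhancer/engine/" + engineName);
    }

    /** POSTs plain text and asks for the enhancement results as RDF (Turtle assumed). */
    static HttpRequest enhance(URI endpoint, String text) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "text/plain; charset=UTF-8")
                .header("Accept", "text/turtle")
                .POST(HttpRequest.BodyPublishers.ofString(text))
                .build();
    }
}
```

A request built with `EnhancerRequests.enhance(EnhancerRequests.namedChain("hexin"), text)` could then be sent with `java.net.http.HttpClient`; the chain name `hexin` is hypothetical.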
18 Current Enhancement Engines
- Preprocessing:
  - Tika Engine: content type detection; text extraction from several document formats; metadata extraction from several document formats
- Natural Language Processing:
  - Language Detection (different implementations)
  - Sentence Detection (OpenNLP, SmartCN, REST)
  - Tokenizer (OpenNLP, SmartCN, REST)
  - POS Tagging (OpenNLP, REST)
  - Chunking (OpenNLP, REST)
  - NER (OpenNLP, OpenCalais, REST)
- Entity Linking:
  - Named Entity Linking
  - EntityHub Linking Engine
  - FST (Lucene Finite State Transducer) Linking Engine
  - Entity Co-mention
- Commercial Engines (OpenCalais, Zemanta, CELI)
- Sentiment Analysis
- Disambiguation: DBpedia Spotlight, Solr MLT based
- PostProcessing: Dereferencing
19 Stanbol EntityHub
20 Stanbol EntityHub (II)
- Manages multiple entity sources (knowledge bases)
- Allows fast entity lookup using Apache Solr
- Referenced Site (remote LD + local caches) vs. Managed Site (entity CRUD API over manually configured sites)
- API:
  - Query for entities (used by the entity linking engines):
    curl -X POST -d "name=lyon&limit=10" \
  - CRUD for managed sites
- LDPath support for:
  - Graph path retrieval (used for dereferencing)
  - Schema translation
  - Simple reasoning
- LDPath examples:
    friend-names = foaf:knows/foaf:name
    schema:name = rdfs:label[@en];
21 Use Case: Hexin Project - Structuring Medical Records
- R&D project for Sergas (Galician Public Health Office)
- Clinical data analysis platform supporting:
  - Clinical assistance
  - Epidemiology studies
  - Medical research
- Big data approach for analyzing both structured historical clinical data and unstructured medical records
- Medical records are written in Spanish and Galician
22 Hexin: Architecture (diagram; labels: Data Source, ETL, URX, Big Data (HDFS + Hive), Event Detection Process, PatientId, Date, Structured Events, Semantic Events, Symptoms (Cough, Unrest, Fever>38), Reference Cases Detection Process, New Case, Rules, Cassandra, BI, Patient, Validation, Analysis)
23 Hexin: Semantic Tagging
24 Hexin: Objective
Paciente diabético desde los 5 años y con EPOC moderada grado 2 de la GOLD
(English: Patient diabetic since the age of 5, with moderate COPD, GOLD grade 2)
25 Hexin: Solution Design
- Structure medical records using the Apache Stanbol Enhancer
- Custom ontology: Symptoms, Diseases, Diagnosis Tests, Family and Personal History
- Custom enhancement chain: Language Detection > NLP > Entity Linking > Negation Detection > Fact Extraction
26 Hexin: Ontology
27 Hexin: Ontology Indexing
- To support the entity linking process against the Hexin ontology, an EntityHub site must be created. Two options:
  - ManagedSite: full CRUD storage (dynamic)
  - ReferencedSite: read-only remote site + local index
- Stanbol EntityHub Indexing Tool: RDF > Jena TDB > Solr index
- Configure custom namespaces, mappings and properties, for example:
    hexin:*
    hexin:label > rdfs:label
- The tool generates an OSGi bundle with the Yard and YardSite default configurations
- Copy the index to the Stanbol /datafiles folder and install the bundle using the Apache Felix OSGi Web Console
28 Hexin: Enhancement Chain
- Chain (diagram): Lang. Detect., OpenNLP-Sent., OpenNLP-Token, OpenNLP-POS, OpenNLP-Chunker, Hexin Linking, Fact Extract., Negex
- Legend:
  - Custom Hexin engine: implemented for the project
  - Entity linking engine: available in Stanbol, with a custom configuration for this use case
  - NLP engines: available in Stanbol, default configuration
  - Pre-processing engine: available in Stanbol
29 Hexin: Linking
30 Hexin: Linking (II)
31 Hexin:
public class MyEngine implements EnhancementEngine {

    public void activate(ComponentContext c) {
        // initialize, configure, ...
    }

    public int canEnhance(ContentItem item) {
        if (...item matches our expectations...) {
            return ENHANCE_SYNCHRONOUS;
        } else {
            return CANNOT_ENHANCE;
        }
    }

    public void computeEnhancements(ContentItem item) {
        // run the engine and add results to the item's
        // RDF graph based on the item's InputStream
    }
}
Build and deployment (diagram):
- Maven build: maven-bundle-plugin adds OSGi metadata; maven-scr-plugin adds services metadata
- Result: an OSGi bundle whose MANIFEST.MF carries the OSGi metadata
- The MyEngine service is registered by OSGi
- Install in Stanbol: no restart needed
32 NLP at Apache Stanbol
33 NLP at Apache Stanbol (II)
- Browsable map of spans
- Spans sorted by natural order
- Iterator-based API that allows concurrent modifications
- Annotations supported at span level:
  - POS annotation: PosTag with tag (e.g. NE) and lexical category (e.g. Noun)
  - Phrase annotation (chunks): PhraseTag with tag (e.g. NP) and lexical category (e.g. NounPhrase)
  - Sentiment annotation: SentimentTag: Double
- Example (diagram): "Stanbol is an Amazing Tool", annotated with Token, Sentence and Chunk spans
- Span types: Token, Chunk, Sentence, Text Section, Analyzed Text
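The span model described on this slide can be illustrated with a small self-contained sketch. It does not use the Stanbol classes: `SpanSketch`, `Span` and `covered` are hypothetical names, and the ordering used here (start ascending, longer spans first, then span type) is an assumption about what "natural order" means. A `ConcurrentSkipListSet` stands in for a sorted structure whose iterators tolerate concurrent modification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

final class SpanSketch {
    // Span types as named on the slide, from widest to narrowest.
    enum SpanType { TEXT, TEXT_SECTION, SENTENCE, CHUNK, TOKEN }

    static final class Span implements Comparable<Span> {
        final SpanType type;
        final int start;
        final int end;

        Span(SpanType type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        // Assumed natural order: by start, then longer spans first, then by type.
        @Override
        public int compareTo(Span o) {
            if (start != o.start) return Integer.compare(start, o.start);
            if (end != o.end) return Integer.compare(o.end, end);
            return type.compareTo(o.type);
        }
    }

    /** Returns the text of all spans of the wanted type enclosed by the outer span. */
    static List<String> covered(NavigableSet<Span> spans, String text, Span outer, SpanType wanted) {
        List<String> out = new ArrayList<>();
        for (Span s : spans.tailSet(outer, false)) {
            if (s.start >= outer.end) break;          // past the enclosing span
            if (s.type == wanted && s.end <= outer.end) {
                out.add(text.substring(s.start, s.end));
            }
        }
        return out;
    }

    /** The slide's example sentence with one chunk ("an Amazing Tool"). */
    static List<String> demo() {
        String text = "Stanbol is an Amazing Tool";
        NavigableSet<Span> spans = new ConcurrentSkipListSet<>();
        spans.add(new Span(SpanType.SENTENCE, 0, 26));
        int[][] tokens = {{0, 7}, {8, 10}, {11, 13}, {14, 21}, {22, 26}};
        for (int[] t : tokens) spans.add(new Span(SpanType.TOKEN, t[0], t[1]));
        Span chunk = new Span(SpanType.CHUNK, 11, 26);
        spans.add(chunk);
        return covered(spans, text, chunk, SpanType.TOKEN);
    }
}
```

Because enclosing spans sort before the spans they contain, the tokens of a chunk or sentence can be collected with a single forward scan that starts at the enclosing span.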
34 Hexin Custom Engine: Negex
- Context/Negex: algorithm for negation detection
- Based on trigger terms + regex

public abstract class AbstractNegexDetector implements NegexDetector {
    public Set<IRI> detectNegations(String language, Graph metadata, AnalysedText at) throws NegexException {}
    protected abstract boolean isNegated(String language, String concept, String sentence);
}

Reference: Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. Oct 2001;34(5):
35 Hexin Custom Engine: Negex (II)
Trigger types:
- Pre-condition negation terms (e.g. absence of)
- Pseudo-negation terms (e.g. no increase)
- Pre-condition possibility phrases (e.g. rule him out)
- Post-condition negation terms (e.g. unlikely)
- Termination terms (e.g. but, however)
Implementation available under Apache License 2.0
Engine implementation challenges:
- Entity annotations are the targets
- The relationships between AnalyzedText and entity annotations are currently obfuscated
- Glue code is needed to locate the spans of entity annotations using the START and END properties of the text annotations
- Once the sentence containing an entity annotation is located, it is used as context, along with the entity surface form (mention), for applying the algorithm
- Negation is returned as a custom property of the TextAnnotation (negated = True or False)
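How the trigger types interact can be shown with a heavily simplified sketch. This is not the project's implementation and not the full NegEx algorithm: it ignores possibility phrases, uses only the English example terms from the slide (plus "no", added here for illustration), and all names are hypothetical. A concept counts as negated if a pre-condition trigger precedes it, or a post-condition trigger follows it, within the scope bounded by termination terms; pseudo-negations are masked first.

```java
import java.util.List;
import java.util.Locale;

final class NegexSketch {
    // Illustrative trigger lists: the slide's examples, plus "no" as an assumed extra.
    static final List<String> PSEUDO = List.of("no increase");
    static final List<String> PRE = List.of("absence of", "no");
    static final List<String> POST = List.of("unlikely");
    static final List<String> TERMINATION = List.of("but", "however");

    static boolean isNegated(String concept, String sentence) {
        // Normalize to lower-case words separated and padded by single spaces.
        String s = " " + sentence.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", " ").trim() + " ";
        String c = " " + concept.toLowerCase(Locale.ROOT).trim() + " ";
        int at = s.indexOf(c);
        if (at < 0) return false;

        // Mask pseudo-negations with same-length filler so offsets stay valid.
        for (String p : PSEUDO) {
            s = s.replace(" " + p + " ", " " + "_".repeat(p.length()) + " ");
        }
        String before = afterLastTermination(s.substring(0, at + 1));
        String after = beforeFirstTermination(s.substring(at + c.length() - 1));
        return containsAny(before, PRE) || containsAny(after, POST);
    }

    static boolean containsAny(String scope, List<String> triggers) {
        for (String t : triggers) {
            if (scope.contains(" " + t + " ")) return true;
        }
        return false;
    }

    /** Keeps only the text after the last termination term. */
    static String afterLastTermination(String scope) {
        int cut = 0;
        for (String t : TERMINATION) {
            int i = scope.lastIndexOf(" " + t + " ");
            if (i >= 0) cut = Math.max(cut, i + t.length() + 1);
        }
        return scope.substring(cut);
    }

    /** Keeps only the text before the first termination term. */
    static String beforeFirstTermination(String scope) {
        int cut = scope.length();
        for (String t : TERMINATION) {
            int i = scope.indexOf(" " + t + " ");
            if (i >= 0) cut = Math.min(cut, i + 1);
        }
        return scope.substring(0, cut);
    }
}
```

For "Absence of fever but persistent cough", this sketch reports "fever" as negated and "cough" as not negated, because "but" ends the scope of "absence of".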
36 Hexin Custom Engine: Fact Extraction
Paciente diabético desde los 5 años y con EPOC moderada grado 2 de la GOLD
(English: Patient diabetic since the age of 5, with moderate COPD, GOLD grade 2)
37 Hexin Custom Engine: Fact Extraction (II)
- In-context entity fact extraction
- Facts are returned as entity RDF metadata, like the rest of the entity properties
- Different implementations of context (all extracted from the AnalyzedText structure):
  - Sentence context (default and usually enough)
  - Window-of-text context
  - Paragraph context
- Rule-based approach: regex over raw text or over the POS tag sequence
- ENTITY reserved word -> OR expression for all ENTITY labels
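38 Hexin Custom Engine: Fact Extraction (III)
Supported expressions (N is a number; "desde los N años" = since the age of N, "a los N años" = at the age of N):
- (diabetes | diabético | DM) desde los N años
- (diabetes | diabético | DM) a los N años
- Debut (diabetes | diabético | DM) a los N años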
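The three supported expressions can be covered by one rule template in which the reserved word ENTITY is expanded into an OR expression over the entity labels. The sketch below is an illustration under assumptions, not the project's rule engine: the template syntax, the class name `OnsetAgeRule` and the method names are invented for this example.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class OnsetAgeRule {
    // Hypothetical rule template. ENTITY is replaced by the OR of all entity labels.
    // "a\u00f1os" is "años", written as an escape to stay encoding-independent.
    static final String TEMPLATE =
            "(?:debut\\s+)?\\bENTITY\\s+(?:desde|a)\\s+los\\s+(\\d+)\\s+a\u00f1os";

    /** Expands the template for the given entity labels. */
    static Pattern compile(List<String> labels) {
        String alternation = labels.stream()
                .map(Pattern::quote)                       // labels are literals, not regex
                .collect(Collectors.joining("|", "(?:", ")"));
        return Pattern.compile(TEMPLATE.replace("ENTITY", alternation),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    /** Returns the age N if the sentence matches one of the supported expressions. */
    static OptionalInt onsetAge(String sentence, List<String> labels) {
        Matcher m = compile(labels).matcher(sentence);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }
}
```

Applied to the example sentence from the Objective slide with the labels diabetes, diabético and DM, the captured group holds the age at onset, which could then be attached to the entity as a fact.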
39 Hexin Custom Engine: Fact Extraction (IV)
POS-based rules:
- Example: Diabetes diagnosed when he was 5 years old
- POS tags: NNS VB WRB PRP VBD CD NNS JJ
- Rule: ENTITY \s VB * VB[be] (CD) years old
- Or simply: ENTITY \s VB * VB[be] (CD)
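One way to read the simpler POS rule is as a regular expression over a "token/TAG" serialization of the sentence: the entity, then a verb, then any tokens, then a form of "be" tagged as a verb, then a cardinal number that is captured. The sketch below follows that reading. The serialization format, the list of "be" forms and all names are assumptions for illustration and are not taken from the project.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PosRuleSketch {
    /** Serializes tokens and tags as "token/TAG token/TAG ..." so one regex sees both. */
    static String tagged(List<String> tokens, List<String> tags) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(tokens.get(i)).append('/').append(tags.get(i));
        }
        return sb.toString();
    }

    /** Assumed reading of: ENTITY \s VB * VB[be] (CD). Returns the captured CD token. */
    static Optional<String> cardinalAfterBe(String entityLabel, String taggedSentence) {
        Pattern rule = Pattern.compile(
                Pattern.quote(entityLabel) + "/\\S+"          // ENTITY with any tag
                + " \\S+/VB\\w*"                              // a verb (VB, VBD, ...)
                + "(?: \\S+/\\S+)*?"                          // * : any tokens, lazily
                + " (?:be|is|are|was|were|been)/VB\\w*"       // VB[be]
                + " (\\S+)/CD",                               // (CD): captured number
                Pattern.CASE_INSENSITIVE);
        Matcher m = rule.matcher(taggedSentence);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

With the tokens and tags of the slide's example, `tagged(...)` produces "Diabetes/NNS diagnosed/VB when/WRB he/PRP was/VBD 5/CD years/NNS old/JJ", and the rule captures the cardinal token that follows "was".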
40 Thanks for your attention!
ICT Seventh Framework Programme (ICT FP7) Grant Agreement No: 318497 Data Intensive Techniques to Boost the Real Time Performance of Global Agricultural Data Infrastructures D5.3.2b Automatic Rigorous
More informationClinical Mapping (CMAP) Draft for Public Comment
Integrating the Healthcare Enterprise 5 IHE Patient Care Coordination Technical Framework Supplement 10 Clinical Mapping (CMAP) 15 Draft for Public Comment 20 Date: June 1, 2015 Author: PCC Technical Committee
More informationGenomeSpace Architecture
GenomeSpace Architecture The primary services, or components, are shown in Figure 1, the high level GenomeSpace architecture. These include (1) an Authorization and Authentication service, (2) an analysis
More informationSan Jose State University
San Jose State University Fall 2011 CMPE 272: Enterprise Software Overview Project: Date: 5/9/2011 Under guidance of Professor, Rakesh Ranjan Submitted by, Team Titans Jaydeep Patel (007521007) Zankhana
More informationCombining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery
Combining SAWSDL, OWL DL and UDDI for Semantically Enhanced Web Service Discovery Dimitrios Kourtesis, Iraklis Paraskakis SEERC South East European Research Centre, Greece Research centre of the University
More informationStratioDeep. An integration layer between Cassandra and Spark. Álvaro Agea Herradón Antonio Alcocer Falcón
StratioDeep An integration layer between Cassandra and Spark Álvaro Agea Herradón Antonio Alcocer Falcón StratioDeep An integration layer between Cassandra and Spark Álvaro Agea Herradón Antonio Alcocer
More informationLabelTranslator - A Tool to Automatically Localize an Ontology
LabelTranslator - A Tool to Automatically Localize an Ontology Mauricio Espinoza 1, Asunción Gómez Pérez 1, and Eduardo Mena 2 1 UPM, Laboratorio de Inteligencia Artificial, 28660 Boadilla del Monte, Spain
More informationCache Configuration Reference
Sitecore CMS 6.2 Cache Configuration Reference Rev: 2009-11-20 Sitecore CMS 6.2 Cache Configuration Reference Tips and Techniques for Administrators and Developers Table of Contents Chapter 1 Introduction...
More informationThe Prolog Interface to the Unstructured Information Management Architecture
The Prolog Interface to the Unstructured Information Management Architecture Paul Fodor 1, Adam Lally 2, David Ferrucci 2 1 Stony Brook University, Stony Brook, NY 11794, USA, pfodor@cs.sunysb.edu 2 IBM
More informationFull-text Search in Intermediate Data Storage of FCART
Full-text Search in Intermediate Data Storage of FCART Alexey Neznanov, Andrey Parinov National Research University Higher School of Economics, 20 Myasnitskaya Ulitsa, Moscow, 101000, Russia ANeznanov@hse.ru,
More informationHow To Make Sense Of Data With Altilia
HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to
More informationHive Interview Questions
HADOOPEXAM LEARNING RESOURCES Hive Interview Questions www.hadoopexam.com Please visit www.hadoopexam.com for various resources for BigData/Hadoop/Cassandra/MongoDB/Node.js/Scala etc. 1 Professional Training
More informationUIMA: Unstructured Information Management Architecture for Data Mining Applications and developing an Annotator Component for Sentiment Analysis
UIMA: Unstructured Information Management Architecture for Data Mining Applications and developing an Annotator Component for Sentiment Analysis Jan Hajič, jr. Charles University in Prague Faculty of Mathematics
More informationTHE EUROPEAN DATA PORTAL
European Public Sector Information Platform Topic Report No. 2016/03 UNDERSTANDING THE EUROPEAN DATA PORTAL Published: February 2016 1 Table of Contents Keywords... 3 Abstract/ Executive Summary... 3 Introduction...
More informationInformation as a Service in a Data Analytics Scenario A Case Study
2008 IEEE International Conference on Web Services Information as a Service in a Analytics Scenario A Case Study Vishal Dwivedi, Naveen Kulkarni SETLabs, Infosys Technologies Ltd { Vishal_Dwivedi, Naveen_Kulkarni}@infosys.com
More informationHow RAI's Hyper Media News aggregation system keeps staff on top of the news
How RAI's Hyper Media News aggregation system keeps staff on top of the news 13 th Libre Software Meeting Media, Radio, Television and Professional Graphics Geneva - Switzerland, 10 th July 2012 Maurizio
More informationOWB Users, Enter The New ODI World
OWB Users, Enter The New ODI World Kulvinder Hari Oracle Introduction Oracle Data Integrator (ODI) is a best-of-breed data integration platform focused on fast bulk data movement and handling complex data
More informationHigh-Speed In-Memory Analytics over Hadoop and Hive Data
High-Speed In-Memory Analytics over Hadoop and Hive Data Big Data 2015 Apache Spark Not a modified version of Hadoop Separate, fast, MapReduce-like engine In-memory data storage for very fast iterative
More informationArchitectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
More informationArchitecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
More informationDE-20489B Developing Microsoft SharePoint Server 2013 Advanced Solutions
DE-20489B Developing Microsoft SharePoint Server 2013 Advanced Solutions Summary Duration Vendor Audience 5 Days Microsoft Developer Published Level Technology 21 November 2013 300 Microsoft SharePoint
More informationHow To Develop An Open Play Context Framework For Android (For Android)
Dynamix: An Open Plug-and-Play Context Framework for Android Darren Carlson and Andreas Schrader Ambient Computing Group / Institute of Telematics University of Lübeck, Germany www.ambient.uni-luebeck.de
More information