Open Source Techniques push Enterprise Search & Search Driven Applications and especially foster the application of Text Analytics


1 Open Source Techniques push Enterprise Search & Search Driven Applications and especially foster the application of Text Analytics
Exploring the Future of Enterprise Search, IPTS, Seville, October 2011
Dr. Christoph Goller, Director Research, IntraFind AG

2 Emergence of Lucene and Solr
Lucene / Solr
- Built in the late 1990s by Doug Cutting; Apache release in 2001
- State-of-the-art Java library for indexing and ranking, with many ports
- Wide acceptance by 2005, mostly by technology organizations and inside products
- Solr (2006): Lucene as a Web Service, with Converters, Connectors, and UIMA and Carrot2 integration
- Open Source: developers mainly from the US and Europe
- 4,000+ sites: Apple, Cisco, EMC, HP, IBM, LinkedIn, MySpace, CNET, Netflix, Salesforce, Twitter, eBay, Immoscout, Gov, Wikipedia

3 Lucene/Solr Strengths
Strengths:
- Best-practice segmented index (like Google, Fast)
- Best-practice, flexible ranking (term/field/doc boosts, function queries, custom scoring)
- Best overall query performance and complete query capabilities (Boolean Operations, Wildcards, Similarity Search, Efficient Range Queries, Geo Search, Synonyms, Spell-Check, Hit-Snippets, Hit-Highlighting, Proximity Operators)
- Multilingual Stemmers, Filters, Memory Mapped Indexes, Near Real-Time Search, Cache Management, Replication, Faceting, Grouping, Autocomplete
- Rapid Innovation, Extensible Architecture, complete control (open source)
CORE TECHNOLOGY AS GOOD OR BETTER THAN ANY OTHER AND OPEN SOURCE

4 Lucene/Solr Weaknesses
Weaknesses:
- No formal support, limited access to training & consulting
- Lack of stringent integrated QA
- Certain features for Enterprise Search missing:
  - Secure Search
  - Graphical user and administration interfaces
  - Integrated search of structured and unstructured content
  - Connectors and stable Converters
  - Viewer Component (preview or view docs in their original layout)
- Search Quality (intent, semantics, NLP, Linguistics)?
- Text Analytics?

5 Competitive Situation
- Full-text search has become a commodity
- Market strength and features of established Enterprise Search providers will keep them going for a while, but:
  - Very hard to justify high prices, especially for large applications
  - Very hard to justify closed and proprietary technology
- Good Enterprise Search solutions have to offer more than just full-text search
- Companies like IntraFind can concentrate on solving the real problems in Enterprise Search

6 IntraFind Software AG

7 IntraFind Business Model
- Founding of the company: October 2000
- More than 700 customers, mainly in Germany, Austria, and Switzerland
- Partner Network (> 30 VAR & embedding partners)
- Employees: 20
- Lucene Committers: B. Messer, C. Goller
Our Open Source Search Business:
- Product Company: ifinder, Topic Finder, Knowledge Map, Tagging Service
- Products are a combination of Open Source Components and in-house Development
- Support (up to 7x24), Services, Training, Stable API
- Relevancy, automatic generation of meta information:
  - Linguistic Analyzers for most European Languages
  - Named Entity Recognition
  - Text Classification
  - Semantic Search
  - Clustering

8 Morphological Analyzer vs. Stemming
Morphological Analyzer:
- Lemmatizer: maps words to their base forms
  - English: going -> go (Verb), bought -> buy (Verb), bags -> bag (Noun), bacteria -> bacterium (Noun)
  - German: lief -> laufen (Verb), rannte -> rennen (Verb), Bücher -> Buch (Noun), Taschen -> Tasche (Noun)
- Decomposer: decomposes compound words into their components
  - Kinderbuch (children's book) -> Kind (Noun) Buch (Noun)
  - Versicherungsvertrag (contract of insurance) -> Versicherung (Noun) Vertrag (Noun)
  - Holztisch (wooden table), Glastisch (table made of glass)
Stemmer: usually a simple algorithm
- going -> go
- king -> k ???
- Messer -> mess ???
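
The contrast on this slide can be sketched in a few lines of Python. The lexicon, the compound table, and the suffix list below are toy assumptions made up for illustration. They are neither IntraFind's analyzer nor any published stemming algorithm, and the function names (`lemmatize`, `decompose`, `naive_stem`) are hypothetical.

```python
# Toy lexicon: surface form -> (lemma, part of speech). Illustrative only.
LEXICON = {
    "going": ("go", "Verb"),
    "bought": ("buy", "Verb"),
    "bags": ("bag", "Noun"),
    "bacteria": ("bacterium", "Noun"),
    "lief": ("laufen", "Verb"),
    "Bücher": ("Buch", "Noun"),
}

# Toy compound table for the decomposer. Illustrative only.
COMPOUNDS = {
    "Kinderbuch": ["Kind", "Buch"],
    "Versicherungsvertrag": ["Versicherung", "Vertrag"],
}


def lemmatize(word):
    """Look the word up in the lexicon; unknown words are returned unchanged."""
    lemma, _pos = LEXICON.get(word, (word, None))
    return lemma


def decompose(word):
    """Split a known compound into its parts; other words stay whole."""
    return COMPOUNDS.get(word, [word])


def naive_stem(word):
    """Strip the first matching suffix without consulting any lexicon."""
    for suffix in ("ing", "er", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ("going", "king", "Messer", "bacteria"):
        print(w, "| lemma:", lemmatize(w), "| stem:", naive_stem(w))
```

Because the stemmer never checks whether the result is a word, it reduces "king" to "k" and "Messer" to "Mess", while the lexicon lookup leaves both untouched and still maps "bacteria" to "bacterium".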

9 Advantages of Morphological Analysis
- Combines high Recall with high Precision for Search Applications
- Improves subsequent statistical methods
- Better suited as descriptions for semantic search / faceting / clustering than artificial stems; categories used for phrase detection
- Reliable Lookup in Lexicon Resources:
  - Thesaurus / Ontologies
  - Cross-lingual search
- Available Languages: German, English, Spanish, French, Italian, Dutch, Russian, Polish, Serbo-Croatian, Greek, Chinese, Arabic, Pashto

10 Named Entity Recognition (NER)
Automated extraction of information from unstructured data:
- People names
- Company names
- Brands from product lists
- Technical key figures from technical data (raw materials, product types, order IDs, process numbers, eclass categories)
- Names of streets and locations
- Currency and accounting values
- Dates
- Phone numbers, addresses, hyperlinks
Search query: Which people / companies strongly correlate with each other?
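
Some of the entity types listed above (dates, currency values, hyperlinks) follow regular surface patterns, so a rule-based extractor is enough to illustrate the idea. The sketch below is a minimal example under that assumption. The patterns, labels, and the function name `extract_entities` are hypothetical and deliberately narrow, and recognizing people or company names usually needs lexicons or trained models rather than patterns like these.

```python
import re

# Hypothetical, deliberately narrow patterns for three entity types.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b|\b\d{4}-\d{2}-\d{2}\b"),
    "MONEY": re.compile(r"(?:EUR|USD)\s?\d+(?:[.,]\d+)*"),
    "URL": re.compile(r"https?://[^\s,;]+"),
}


def extract_entities(text):
    """Return (label, matched_text, start_offset) tuples sorted by offset."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((label, match.group(0), match.start()))
    return sorted(found, key=lambda item: item[2])


if __name__ == "__main__":
    sample = ("Invoice dated 2011-10-27 over EUR 1,250.00, "
              "see http://example.com/inv for details.")
    for label, value, offset in extract_entities(sample):
        print(offset, label, value)
```

The extracted values can then be stored as metadata fields, which is what makes facets and entity-aware queries possible.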

11 Named Entities: Applications in Search
Applications:
- Facets
- Search for Experts
- Additional Query Types
Index Structure:
- Additional Tokens on the same position: N_PersonName, N_Peter Müller, N_Peter, N_Müller
- Search for a person named "Brown"
- Search for a company near "founded" and "Bill Gates"
Question Answering / Natural Language Queries (Demo)
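
The index structure described above can be illustrated with a small positional index in plain Python. This is a sketch under stated assumptions, not Lucene code: `index_tokens` and `near` are hypothetical helpers, the type marker and the full-name token are placed at the position of the entity's first word, and each name part is placed at its own position. In Lucene, stacking tokens on one position is typically done by emitting them with a position increment of 0.

```python
from collections import defaultdict


def index_tokens(tokens, entities):
    """Build a term -> positions map that also contains entity marker tokens.

    tokens:   list of words; the list index is the token position
    entities: list of (start, end, entity_type) spans over tokens, end exclusive
    """
    postings = defaultdict(list)
    for pos, tok in enumerate(tokens):
        postings[tok].append(pos)
    for start, end, etype in entities:
        full_name = " ".join(tokens[start:end])
        postings["N_" + etype].append(start)       # type marker, e.g. N_PersonName
        postings["N_" + full_name].append(start)   # whole name, e.g. N_Bill Gates
        for pos in range(start, end):
            postings["N_" + tokens[pos]].append(pos)  # each name part
    return postings


def near(postings, term_a, term_b, max_distance):
    """True if some occurrence of term_a lies within max_distance of term_b."""
    return any(
        abs(a - b) <= max_distance
        for a in postings.get(term_a, [])
        for b in postings.get(term_b, [])
    )


if __name__ == "__main__":
    tokens = "Microsoft was founded by Bill Gates".split()
    entities = [(0, 1, "CompanyName"), (4, 6, "PersonName")]
    postings = index_tokens(tokens, entities)
    # "A company near 'founded' and 'Bill Gates'"
    print(near(postings, "N_CompanyName", "founded", 3)
          and near(postings, "N_CompanyName", "N_Bill Gates", 5))
```

Because the markers live in the same positional index as ordinary words, the existing proximity operators work on them unchanged. A query for a person named Brown becomes a lookup of `N_PersonName` and `Brown` at the same position.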

12 Text Classification
Automatic Assignment of Documents to Topics based on their content
- Learning Phase: Example Documents + Definition of Topics 1..N -> Topic Learner -> Classifier Rules (Parameters)
- Classification Phase: New Document -> Topic Classifier -> Topic Association
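
The two phases can be sketched with a small multinomial Naive Bayes classifier. The slide does not say which learning algorithm is used, so Naive Bayes with add-one smoothing is a stand-in chosen here for brevity. The function names `learn` and `classify` and the example topics are likewise assumptions.

```python
import math
from collections import Counter, defaultdict


def learn(examples):
    """Learning phase: examples is a list of (text, topic) pairs.

    Returns the classifier parameters as a plain dict.
    """
    word_counts = defaultdict(Counter)  # topic -> word -> count
    doc_counts = Counter()              # topic -> number of example documents
    vocabulary = set()
    for text, topic in examples:
        words = text.lower().split()
        word_counts[topic].update(words)
        doc_counts[topic] += 1
        vocabulary.update(words)
    return {"word_counts": word_counts, "doc_counts": doc_counts,
            "vocabulary": vocabulary}


def classify(model, text):
    """Classification phase: return the most probable topic for a new document."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_topic, best_score = None, -math.inf
    for topic, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][topic]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


if __name__ == "__main__":
    model = learn([
        ("cheap pills buy now", "spam"),
        ("win money now", "spam"),
        ("meeting agenda for monday", "ham"),
        ("project status meeting", "ham"),
    ])
    print(classify(model, "buy cheap money"))
    print(classify(model, "status meeting"))
```

The dict returned by `learn` plays the role of the "Classifier Rules (Parameters)" box: it is the only thing the classification phase needs.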

13 Applications of Text Classification
- News: Newsletter-Management System
- Spam-Filtering / Mail Classification
- Product Classification (Online Shops), ECLASS / UNSPSC
- Subject Area Assignment: Libraries & Publishing Companies
- Opinion Mining / Sentiment Detection
- Part of our Tagging Services

14 Knowledge Map
Knowledge Map = graphical way of visualizing enterprise knowledge
- Quick access to existing enterprise knowledge via setting filters or entering search terms
- Browse (navigate and explore) through your enterprise content
- Enables a 360-degree view of the existing enterprise knowledge
- Consideration of individual user authorizations
- Combine available meta-data (SAP invoice number, order number) and entities from unstructured data (e.g. minutes of SAP Team Meeting in PDF)

15 Enterprise Search via the Knowledge Map
- Tab browsing
- Full-text search combined with Knowledge Maps
- Search-while-you-type based on filters (in this example: filter "author")
- Content sources provide filtering capabilities within the Knowledge Map

16 Semantic Search
- Compute (on the fly) strongly correlated words and phrases
- Presented as Contextor / Facet
- Used to expand or restrict search
- OntologyNet results after a search for "ifinder"
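
One common way to compute "strongly correlated words" for a query term is pointwise mutual information (PMI) over document-level co-occurrence. The slide does not state which correlation measure the product uses, so the following is only a sketch of the general idea. The function name `correlated_terms` and the sample documents are made up.

```python
import math
from collections import Counter


def correlated_terms(documents, query_term, top_n=3):
    """Rank terms by pointwise mutual information with query_term.

    documents: list of token lists. Co-occurrence is counted once per document.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    doc_freq = Counter(term for doc in doc_sets for term in doc)
    hits = [doc for doc in doc_sets if query_term in doc]
    co_freq = Counter(
        term for doc in hits for term in doc if term != query_term
    )
    scores = {}
    for term, joint in co_freq.items():
        # PMI = log( P(term, query) / (P(term) * P(query)) )
        scores[term] = math.log(joint * n_docs / (doc_freq[term] * len(hits)))
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


if __name__ == "__main__":
    docs = [
        ["ifinder", "search", "lucene"],
        ["ifinder", "search", "tagging"],
        ["lucene", "index"],
        ["weather", "report"],
    ]
    print(correlated_terms(docs, "ifinder", top_n=2))
```

The returned terms are candidates for the facet-style display: adding one to the query restricts the search, while OR-ing it in expands the search. Plain PMI favors rare terms, so a production system would likely add a minimum-frequency threshold.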

17 Questions?
Dr. Christoph Goller, Director Research
IntraFind Software AG, Fraunhofer Strasse, Planegg, Germany
Interested in becoming an IntraFind Partner? Just contact us!

Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications

Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications Morphological Analysis and Named Entity Recognition for your Lucene / Solr Search Applications Berlin Berlin Buzzwords 2011, Dr. Christoph Goller, IntraFind AG Outline IntraFind AG Indexing Morphological

More information

Intrafind Text Analytics. Research Department, Dr. Christoph Goller

Intrafind Text Analytics. Research Department, Dr. Christoph Goller Research Department, Dr. Christoph Goller IntraFind Software AG IntraFind Software AG Founding of the company: October 2000 More than 700 customers mainly in Germany, Austria, and Switzerland Partner Network

More information

Solr-based Search & Automatic Tagging at Zeit Online where Meta Data come from. ApacheCon Europe 2012 Dr. Christoph Goller, IntraFind Software AG

Solr-based Search & Automatic Tagging at Zeit Online where Meta Data come from. ApacheCon Europe 2012 Dr. Christoph Goller, IntraFind Software AG Solr-based Search & Automatic Tagging at Zeit Online where Meta Data come from ApacheCon Europe 2012 Dr. Christoph Goller, IntraFind Software AG IntraFind Software AG Solr-based Search & Automatic Tagging

More information

ifinder ENTERPRISE SEARCH

ifinder ENTERPRISE SEARCH DATA SHEET ifinder ENTERPRISE SEARCH ifinder - the Enterprise Search solution for company-wide information search, information logistics and text mining. CUSTOMER QUOTE IntraFind stands for high quality

More information

Text Classification based on Lucene and LibSVM / LibLinear. Berlin Buzzwords, June 4th, 2012, Dr. Christoph Goller, IntraFind Software AG

Text Classification based on Lucene and LibSVM / LibLinear. Berlin Buzzwords, June 4th, 2012, Dr. Christoph Goller, IntraFind Software AG Text Classification based on Lucene and LibSVM / LibLinear Berlin Buzzwords, June 4th, 2012, Dr. Christoph Goller, IntraFind Software AG Outline IntraFind Software AG Introduction to Text Classification

More information

High quality, low maintenance content tagging @ ZEIT Online Breno Faria, Christoph Goller

High quality, low maintenance content tagging @ ZEIT Online Breno Faria, Christoph Goller High quality, low maintenance content tagging @ ZEIT Online Breno Faria, Christoph Goller About Us IntraFind Software AG Elasticsearch Partner (we also do consulting) Specialist for Information Retrieval

More information

CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING

CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. Is there valuable

More information

Information Retrieval Elasticsearch

Information Retrieval Elasticsearch Information Retrieval Elasticsearch IR Information retrieval (IR) is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches

More information

PROMT Technologies for Translation and Big Data

PROMT Technologies for Translation and Big Data PROMT Technologies for Translation and Big Data Overview and Use Cases Julia Epiphantseva PROMT About PROMT EXPIRIENCED Founded in 1991. One of the world leading machine translation provider DIVERSIFIED

More information

Search and Real-Time Analytics on Big Data

Search and Real-Time Analytics on Big Data Search and Real-Time Analytics on Big Data Sewook Wee, Ryan Tabora, Jason Rutherglen Accenture & Think Big Analytics Strata New York October, 2012 Big Data: data becomes your core asset. It realizes its

More information

Flattening Enterprise Knowledge

Flattening Enterprise Knowledge Flattening Enterprise Knowledge Do you Control Your Content or Does Your Content Control You? 1 Executive Summary: Enterprise Content Management (ECM) is a common buzz term and every IT manager knows it

More information

Search and Information Retrieval

Search and Information Retrieval Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search

More information

Why are Organizations Interested?

Why are Organizations Interested? SAS Text Analytics Mary-Elizabeth ( M-E ) Eddlestone SAS Customer Loyalty M-E.Eddlestone@sas.com +1 (607) 256-7929 Why are Organizations Interested? Text Analytics 2009: User Perspectives on Solutions

More information

Text Analytics Evaluation Case Study - Amdocs

Text Analytics Evaluation Case Study - Amdocs Text Analytics Evaluation Case Study - Amdocs Tom Reamy Chief Knowledge Architect KAPS Group http://www.kapsgroup.com Text Analytics World October 20 New York Agenda Introduction Text Analytics Basics

More information

Computer-Based Text- and Data Analysis Technologies and Applications. Mark Cieliebak 9.6.2015

Computer-Based Text- and Data Analysis Technologies and Applications. Mark Cieliebak 9.6.2015 Computer-Based Text- and Data Analysis Technologies and Applications Mark Cieliebak 9.6.2015 Data Scientist analyze Data Library use 2 About Me Mark Cieliebak + Software Engineer & Data Scientist + PhD

More information

How To Make Sense Of Data With Altilia

How To Make Sense Of Data With Altilia HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to

More information

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS

SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS Enterprise Data Problems in Investment Banks BigData History and Trend Driven by Google CAP Theorem for Distributed Computer System Open Source Building Blocks: Hadoop, Solr, Storm.. 3548 Hypothetical

More information

Text Analytics Software Choosing the Right Fit

Text Analytics Software Choosing the Right Fit Text Analytics Software Choosing the Right Fit Tom Reamy Chief Knowledge Architect KAPS Group http://www.kapsgroup.com Text Analytics World San Francisco, 2013 Agenda Introduction Text Analytics Basics

More information

www.inovoo.com EMC APPLICATIONXTENDER 8.0 Real-Time Document Management

www.inovoo.com EMC APPLICATIONXTENDER 8.0 Real-Time Document Management www.inovoo.com EMC APPLICATIONXTENDER 8.0 Real-Time Document Management 02 EMC APPLICATIONXTENDER 8.0 EMC ApplicationXtender (AX) is a web-based real-time document management system which stores, manages

More information

Optimizing Multilingual Search With Solr

Optimizing Multilingual Search With Solr www.basistech.com info@basistech.com 617-386-2090 Optimizing Multilingual Search With Solr Pg. 1 INTRODUCTION Today s search application users expect search engines to just work seamlessly across multiple

More information

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo

Selecting a Taxonomy Management Tool. Wendi Pohs InfoClear Consulting #SLATaxo Selecting a Taxonomy Management Tool Wendi Pohs InfoClear Consulting #SLATaxo InfoClear Consulting What do we do? Content Analytics Strategy and Implementation, including: Taxonomy/Ontology development

More information

C o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER

C o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER INTRODUCTION TO SAS TEXT MINER TODAY S AGENDA INTRODUCTION TO SAS TEXT MINER Define data mining Overview of SAS Enterprise Miner Describe text analytics and define text data mining Text Mining Process

More information

Corporate Presentation

Corporate Presentation Corporate Presentation AIM INVESTOR DAY II Edizione Palazzo Mezzanotte 15 aprile 2015 Stefano Spaggiari, CEO, Expert System Company Overview 2011 1992 Born the first software Errata Corrige 2000 Cogito

More information

DIGITAL MARKETING TRAINING

DIGITAL MARKETING TRAINING DIGITAL MARKETING TRAINING Digital Marketing Basics Keywords Research and Analysis Basics of advertising What is Digital Media? Digital Media Vs. Traditional Media Benefits of Digital marketing Latest

More information

CONCEPTCLASSIFIER FOR SHAREPOINT

CONCEPTCLASSIFIER FOR SHAREPOINT CONCEPTCLASSIFIER FOR SHAREPOINT PRODUCT OVERVIEW The only SharePoint 2007 and 2010 solution that delivers automatic conceptual metadata generation, auto-classification and powerful taxonomy tools running

More information

Using Apache Solr for Ecommerce Search Applications

Using Apache Solr for Ecommerce Search Applications Using Apache Solr for Ecommerce Search Applications Rajani Maski Happiest Minds, IT Services SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. 2 Copyright Information This document

More information

Apache Lucene. Searching the Web and Everything Else. Daniel Naber Mindquarry GmbH ID 380

Apache Lucene. Searching the Web and Everything Else. Daniel Naber Mindquarry GmbH ID 380 Apache Lucene Searching the Web and Everything Else Daniel Naber Mindquarry GmbH ID 380 AGENDA 2 > What's a search engine > Lucene Java Features Code example > Solr Features Integration > Nutch Features

More information

Semaphore Overview. A Smartlogic White Paper. Executive Summary

Semaphore Overview. A Smartlogic White Paper. Executive Summary Semaphore Overview A Smartlogic White Paper Executive Summary Enterprises no longer face an acute information access challenge. This is mainly because the information search market has matured immensely

More information

Personal Archive User Guide

Personal Archive User Guide Personal Archive User Guide Personal Archive gives you an unlimited mailbox and helps you quickly and easily access your archived email directly from Microsoft Outlook or Lotus Notes. Since Personal Archive

More information

CLOUD ANALYTICS: Empowering the Army Intelligence Core Analytic Enterprise

CLOUD ANALYTICS: Empowering the Army Intelligence Core Analytic Enterprise CLOUD ANALYTICS: Empowering the Army Intelligence Core Analytic Enterprise 5 APR 2011 1 2005... Advanced Analytics Harnessing Data for the Warfighter I2E GIG Brigade Combat Team Data Silos DCGS LandWarNet

More information

Introduction to IR Systems: Supporting Boolean Text Search. Information Retrieval. IR vs. DBMS. Chapter 27, Part A

Introduction to IR Systems: Supporting Boolean Text Search. Information Retrieval. IR vs. DBMS. Chapter 27, Part A Introduction to IR Systems: Supporting Boolean Text Search Chapter 27, Part A Database Management Systems, R. Ramakrishnan 1 Information Retrieval A research field traditionally separate from Databases

More information

Anotaciones semánticas: unidades de busqueda del futuro?

Anotaciones semánticas: unidades de busqueda del futuro? Anotaciones semánticas: unidades de busqueda del futuro? Hugo Zaragoza, Yahoo! Research, Barcelona Jornadas MAVIR Madrid, Nov.07 Document Understanding Cartoon our work! Complexity of Document Understanding

More information

Automated Multilingual Text Analysis in the Europe Media Monitor (EMM) Ralf Steinberger. European Commission Joint Research Centre (JRC)

Automated Multilingual Text Analysis in the Europe Media Monitor (EMM) Ralf Steinberger. European Commission Joint Research Centre (JRC) Automated Multilingual Text Analysis in the Europe Media Monitor (EMM) Ralf Steinberger European Commission Joint Research Centre (JRC) https://ec.europa.eu/jrc/en/research-topic/internet-surveillance-systems

More information

Technical Report. The KNIME Text Processing Feature:

Technical Report. The KNIME Text Processing Feature: Technical Report The KNIME Text Processing Feature: An Introduction Dr. Killian Thiel Dr. Michael Berthold Killian.Thiel@uni-konstanz.de Michael.Berthold@uni-konstanz.de Copyright 2012 by KNIME.com AG

More information

OpenText Output Transformation Server

OpenText Output Transformation Server OpenText Output Transformation Server Seamlessly manage and process content flow across the organization OpenText Output Transformation Server processes, extracts, transforms, repurposes, personalizes,

More information

IBM Content Analytics with Enterprise Search, Version 3.0

IBM Content Analytics with Enterprise Search, Version 3.0 IBM Content Analytics with Enterprise Search, Version 3.0 Highlights Enables greater accuracy and control over information with sophisticated natural language processing capabilities to deliver the right

More information

DBpedia German: Extensions and Applications

DBpedia German: Extensions and Applications DBpedia German: Extensions and Applications Alexandru-Aurelian Todor FU-Berlin, Innovationsforum Semantic Media Web, 7. Oktober 2014 Overview Why DBpedia? New Developments in DBpedia German Problems in

More information

INTERNET MARKETING. SEO Course Syllabus Modules includes: COURSE BROCHURE

INTERNET MARKETING. SEO Course Syllabus Modules includes: COURSE BROCHURE AWA offers a wide-ranging yet comprehensive overview into the world of Internet Marketing and Social Networking, examining the most effective methods for utilizing the power of the internet to conduct

More information

Multi language e Discovery Three Critical Steps for Litigating in a Global Economy

Multi language e Discovery Three Critical Steps for Litigating in a Global Economy Multi language e Discovery Three Critical Steps for Litigating in a Global Economy 2 3 5 6 7 Introduction e Discovery has become a pressure point in many boardrooms. Companies with international operations

More information

Chapter 6. Attracting Buyers with Search, Semantic, and Recommendation Technology

Chapter 6. Attracting Buyers with Search, Semantic, and Recommendation Technology Attracting Buyers with Search, Semantic, and Recommendation Technology Learning Objectives Using Search Technology for Business Success Organic Search and Search Engine Optimization Recommendation Engines

More information

Web to Print Knowledge Experience. A Case Study of the Government of Hessen, Germany s Half-Time Report

Web to Print Knowledge Experience. A Case Study of the Government of Hessen, Germany s Half-Time Report Web to Print Knowledge Experience A Case Study of the Government of Hessen, Germany s Half-Time Report Halbzeitbilanz The Half Time Report An transparency initiative by the government of Hessen, Germany

More information

Taxonomies for Auto-Tagging Unstructured Content. Heather Hedden Hedden Information Management Text Analytics World, Boston, MA October 1, 2013

Taxonomies for Auto-Tagging Unstructured Content. Heather Hedden Hedden Information Management Text Analytics World, Boston, MA October 1, 2013 Taxonomies for Auto-Tagging Unstructured Content Heather Hedden Hedden Information Management Text Analytics World, Boston, MA October 1, 2013 About Heather Hedden Independent taxonomy consultant, Hedden

More information

Corporate Presentation. Mercato AIM - Italia: opportunità e rischi per emittenti ed investitori

Corporate Presentation. Mercato AIM - Italia: opportunità e rischi per emittenti ed investitori Corporate Presentation Mercato AIM - Italia: opportunità e rischi per emittenti ed investitori Stefano Spaggiari, CEO, Expert System 3 giugno 2015 Company Overview 2011 1992 Born the first software Errata

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Live Office. Personal Archive User Guide

Live Office. Personal Archive User Guide Live Office Personal Archive User Guide Document Revision: 14 Feb 2012 Personal Archive User Guide Personal Archive gives you an unlimited mailbox and helps you quickly and easily access your archived

More information

Jobsket ATS. Empowering your recruitment process

Jobsket ATS. Empowering your recruitment process Jobsket ATS Empowering your recruitment process WELCOME TO JOBSKET ATS Jobsket ATS is a recruitment and talent acquisition software package built on top of innovation. Our software improves recruitment

More information

ELPUB Digital Library v2.0. Application of semantic web technologies

ELPUB Digital Library v2.0. Application of semantic web technologies ELPUB Digital Library v2.0 Application of semantic web technologies Anand BHATT a, and Bob MARTENS b a ABA-NET/Architexturez Imprints, New Delhi, India b Vienna University of Technology, Vienna, Austria

More information

aloe-project.de White Paper ALOE White Paper - Martin Memmel

aloe-project.de White Paper ALOE White Paper - Martin Memmel aloe-project.de White Paper Contact: Dr. Martin Memmel German Research Center for Artificial Intelligence DFKI GmbH Trippstadter Straße 122 67663 Kaiserslautern fon fax mail web +49-631-20575-1210 +49-631-20575-1030

More information

Semantic SharePoint. Technical Briefing. Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company

Semantic SharePoint. Technical Briefing. Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company Semantic SharePoint Technical Briefing Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company What is Semantic SP? a joint venture between iquest and Semantic Web Company, initiated in

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

Get the most value from your surveys with text analysis

Get the most value from your surveys with text analysis PASW Text Analytics for Surveys 3.0 Specifications Get the most value from your surveys with text analysis The words people use to answer a question tell you a lot about what they think and feel. That

More information

Exalead CloudView. Semantics Whitepaper

Exalead CloudView. Semantics Whitepaper Exalead CloudView TM Semantics Whitepaper Executive Summary As a partner in our business, in our life, we want the computer to understand what we want, to understand what we mean, when we interact with

More information

Enhancing Lotus Domino search

Enhancing Lotus Domino search Enhancing Lotus Domino search Efficiency & productivity through effective information location 2009 Diegesis Limited Enhanced Search for Lotus Domino Efficiency and productivity - effective information

More information

Full-text Search in Intermediate Data Storage of FCART

Full-text Search in Intermediate Data Storage of FCART Full-text Search in Intermediate Data Storage of FCART Alexey Neznanov, Andrey Parinov National Research University Higher School of Economics, 20 Myasnitskaya Ulitsa, Moscow, 101000, Russia ANeznanov@hse.ru,

More information

DIGITAL INVESTOR DAY Milano@Park Hyatt 5 Febbraio 2015 - ore 9.00

DIGITAL INVESTOR DAY Milano@Park Hyatt 5 Febbraio 2015 - ore 9.00 Corporate Presentation DIGITAL INVESTOR DAY Milano@Park Hyatt 5 Febbraio 2015 - ore 9.00 Historical Background 2011 1992 Born the first software Errata Corrige 2000 Creation of the US-based subsidiary

More information

Enhancing File System Search

Enhancing File System Search Enhancing File System Search Efficiency & productivity through effective information location Get started from just 6,000 2009 Diegesis Limited Enhanced Search for Windows and UNIX File Systems Efficiency

More information

MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts

MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts Julio Villena-Román 1,3, Sara Lana-Serrano 2,3 1 Universidad Carlos III de Madrid 2 Universidad Politécnica de Madrid 3 DAEDALUS

More information

Get results with modern, personalized digital experiences

Get results with modern, personalized digital experiences Brochure HP TeamSite What s new in TeamSite? The latest release of TeamSite (TeamSite 8) brings significant enhancements in usability and performance: Modern graphical interface: Rely on an easy and intuitive

More information

Information Access Platforms: The Evolution of Search Technologies

Information Access Platforms: The Evolution of Search Technologies Information Access Platforms: The Evolution of Search Technologies Managing Information in the Public Sphere: Shaping the New Information Space April 26, 2010 Purpose To provide an overview of current

More information

EC Wise Report: Unlocking the Value of Deeply Unstructured Data. The Challenge: Gaining Knowledge from Deeply Unstructured Data.

EC Wise Report: Unlocking the Value of Deeply Unstructured Data. The Challenge: Gaining Knowledge from Deeply Unstructured Data. EC Wise Report: Unlocking the Value of Deeply Unstructured Data Feedback from the Market: Forest Rim enables significant improvements in the quality of semantic information derived from text data. This

More information

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014 Maximierung des Geschäftserfolgs durch SAP Predictive Analytics Andreas Forster, May 2014 Legal Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed

More information

itunes Store Publisher User Guide Version 1.1

itunes Store Publisher User Guide Version 1.1 itunes Store Publisher User Guide Version 1.1 Version Date Author 1.1 10/09/13 William Goff Table of Contents Table of Contents... 2 Introduction... 3 itunes Console Advantages... 3 Getting Started...

More information

Search Big Data with MySQL and Sphinx. Mindaugas Žukas www.ivinco.com

Search Big Data with MySQL and Sphinx. Mindaugas Žukas www.ivinco.com Search Big Data with MySQL and Sphinx Mindaugas Žukas www.ivinco.com Agenda Big Data Architecture Factors and Technologies MySQL and Big Data Sphinx Search Server overview Case study: building a Big Data

More information

IT services for analyses of various data samples

IT services for analyses of various data samples IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical

More information

CENG 734 Advanced Topics in Bioinformatics

CENG 734 Advanced Topics in Bioinformatics CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the

More information

Multimedia Translations

Multimedia Translations Multimedia Translations Our extensive background as a translation company made us realize that words are no longer enough. The digital age requires the information to use the fastest channels available,

More information

www.coveo.com Unifying Search for the Desktop, the Enterprise and the Web

www.coveo.com Unifying Search for the Desktop, the Enterprise and the Web wwwcoveocom Unifying Search for the Desktop, the Enterprise and the Web wwwcoveocom Why you need Coveo Enterprise Search Quickly find documents scattered across your enterprise network Coveo is actually

More information

- Update 2.0.3. IBM Content Navigator Experience Platform. Sven Hapke Leading Technical Professional, Enterprise Content Management

- Update 2.0.3. IBM Content Navigator Experience Platform. Sven Hapke Leading Technical Professional, Enterprise Content Management New insights. Better outcomes. IBM Content Navigator Experience Platform - Update 2.0.3 Sven Hapke Leading Technical Professional, Enterprise Content Management Introducing IBM Content Navigator IBM s

More information
