Introduction to Epinomy Big Data Semantics

The Promise and Challenge of Big Data

"The application of big data in industrial settings is driving a productivity revolution."
    - Jeff Immelt, CEO, GE

"Companies that move quickly to capitalize on the potential of Big Data will often gain first mover advantage, enabling them to innovate in ways that are difficult to replicate. Organizations that delay the Big Data journey risk being leapfrogged by more data-savvy competitors."
    - Capitalizing on the promise of Big Data, PwC, January 2013

"While Big Data technologies and techniques are unlocking secrets previously hidden in enterprise data, the largest source of potential insight remains largely untapped. Unstructured content represents as much as eighty percent of an organization's total information assets."
    - Darin Stewart, Gartner, May 1, 2013

"Big Data analytics must reckon the importance and criticality of metadata. A good metadata management solution must provide visibility across multiple solutions and bring business users into the fold for a collaborative, active metadata management process. The importance of metadata cannot be overstated."
    - Gautham Vemuganti, Infosys Labs Briefings, 2013

The Business Problem

Enterprise content is hard to find
- 80% of enterprise content is unstructured
- Structured and unstructured data do not play well together
- Knowledge workers spend around 25% of their time searching for information
- Solutions that work on the Web do not work in the enterprise

What's the point of big data if you can't find it?

Big Data Landscape

Epinomy addresses 3 types of Big Data:
1. Big Text: unstructured or semi-structured (natural language, full text, grammatical, semantic)
2. Big Tables: structured data (tabular, rows, columns, relational, rigid schema)
3. Big Meta: data about data (taxonomies, ontologies, concepts, facets, dimensions, etc.)

How Does Applied Relevance Approach Enterprise Big Data Problems?

- Enterprise documents are an enterprise's products
- Documents (products) can be labeled with descriptors (metadata) that detail the content of a document
- Descriptors (taxonomies/ontologies/dimensions) become the structure of the enterprise documents
- Metadata and dimensions allow documents to be readily found based on content
- Normalizing taxonomy/ontology terms (used to tag unstructured documents) and dimensions (from structured data) provides a common language for improved enterprise search
- The taxonomy/ontology/dimensional relationships allow for the discovery of other relevant documents (faceted search)
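
To make the faceted search idea concrete, here is a minimal Python sketch. It is not Epinomy's implementation (which the deck describes as a MarkLogic XQuery solution); the document records, facet names, and the helpers `filter_docs` and `facet_counts` are all hypothetical and exist only to illustrate how metadata tags can narrow a result set and suggest further refinements.

```python
from collections import Counter

# Hypothetical documents tagged with taxonomy terms (metadata).
# All ids, facet names, and values are illustrative assumptions.
DOCS = [
    {"id": "d1", "tags": {"topic": "Turbines", "region": "EMEA"}},
    {"id": "d2", "tags": {"topic": "Turbines", "region": "APAC"}},
    {"id": "d3", "tags": {"topic": "Contracts", "region": "EMEA"}},
]


def filter_docs(docs, selections):
    """Keep documents whose tags match every selected facet value."""
    return [
        d for d in docs
        if all(d["tags"].get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(d["tags"][facet] for d in docs if facet in d["tags"])


# Select one facet value, then show the remaining choices for another facet.
hits = filter_docs(DOCS, {"region": "EMEA"})
print([d["id"] for d in hits])
print(dict(facet_counts(hits, "topic")))
```

The counts per facet value are what an "Amazon style" sidebar would display next to each refinement link.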

Epinomy Makes Enterprise Content Findable

Epinomy provides the tools to manage and execute a managed metadata strategy to unlock the secrets buried in unstructured enterprise documents/data.
1. Taxonomy/ontology management
2. Rules-based auto-tagging of unstructured documents and data
3. Faceted search for improved access and discovery of documents/data

Epinomy - Components

Relationship Manager
- Taxonomy Manager
- Term Rules Editor
- Tagging Preview
- Suggested Topics
Auto-tagging Engine
- Semantic Enrichment
- Alert Manager
Search
- Faceted Search (Amazon style)
- Visualization
MarkLogic 7 Server

Relationship Manager

- Taxonomy/Ontology Manager
- Term Rules Editor
- Tagging Preview
- Arbitrary Relationships (Triple Store)
- Suggested Topics

Auto-tagging Engine

- Semantic Enrichment
- Alert Manager
- Metadata for Search
- Rule-based
- Deterministic
- Real-time
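
As a rough illustration of what "rule-based" and "deterministic" tagging can mean, the Python sketch below applies a fixed set of term rules to a piece of text. The rule table, its regular expressions, and the `auto_tag` function are hypothetical; the deck does not describe Epinomy's actual rule syntax or engine.

```python
import re

# Hypothetical term rules: taxonomy term -> regex patterns that trigger it.
# These rules are assumptions for illustration only.
TERM_RULES = {
    "Big Data": [r"\bbig data\b"],
    "Metadata": [r"\bmetadata\b", r"\bdata about data\b"],
    "Faceted Search": [r"\bfacet(?:ed)?\s+(?:search|navigation)\b"],
}


def auto_tag(text, rules=TERM_RULES):
    """Return the sorted taxonomy terms whose rules match the text.

    Deterministic: the same text and rules always produce the same tags.
    """
    return sorted(
        term
        for term, patterns in rules.items()
        if any(re.search(p, text, re.IGNORECASE) for p in patterns)
    )


print(auto_tag("Metadata drives faceted navigation over Big Data."))
```

Because the tags depend only on the text and the rule table, a tagging decision can be traced back to the specific rule that fired, which is the practical benefit of a deterministic approach over a statistical one.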

Search

- Search
- Faceted Navigation
- Visualization
- Discovery

Faceted Search (Amazon Style)

The Future of Epinomy (coming in late 2013)

- Combined search of both structured and unstructured Big Data through a unified Epinomy faceted search interface using a common set of taxonomies/ontologies
- Tools to normalize the taxonomies/ontologies used to auto-tag unstructured data and the dimensions used to define structured data (row and column labels), giving a normalized organization for search
- Point & click management of data sources and feeds in Epinomy
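
One way to picture the normalization step is a lookup that maps both taxonomy terms and column labels onto a shared vocabulary. The Python sketch below assumes a hand-built synonym table; the table contents and the helpers `normalize` and `to_facets` are hypothetical and say nothing about how the planned Epinomy tools work.

```python
# Hypothetical synonym table: lower-cased label -> canonical concept.
# Entries are illustrative assumptions, not Epinomy data.
CANONICAL = {
    "customer": "Customer",
    "client": "Customer",
    "cust_name": "Customer",
    "region": "Region",
    "sales_region": "Region",
    "territory": "Region",
}


def normalize(label):
    """Map a taxonomy term or column label to its canonical concept.

    Unknown labels are returned unchanged (whitespace-trimmed).
    """
    key = label.strip().lower().replace(" ", "_")
    return CANONICAL.get(key, label.strip())


def to_facets(record):
    """Rewrite a record's keys using the canonical vocabulary."""
    return {normalize(k): v for k, v in record.items()}


# Tags from an unstructured document and a row from a structured table
# end up with the same facet names, so one search can cover both.
doc_tags = {"Client": "Acme", "Territory": "EMEA"}
table_row = {"cust_name": "Acme", "sales_region": "EMEA"}
print(to_facets(doc_tags))
print(to_facets(table_row))
```

Once both kinds of source share facet names, a single faceted interface can filter documents and rows with the same selection.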
Point & Click Management of Data Sources/Feeds

Why MarkLogic?

100% Native
- Built specifically for MarkLogic v7
- MarkLogic XQuery solution
- No external servers required
- Super scalable
- Secure
- Fast
- Efficient
- Flexible
- Successful customer base in Government, Media and Financial Services