Enterprise Information Catalog. Self Service Data Discovery through Enterprise Information Catalog

Safe Harbor
The information being provided today is for informational purposes only. The development, release and timing of any Informatica product or functionality described today remain at the sole discretion of Informatica and should not be relied upon in making a purchasing decision. Statements made today are based on currently available information, which is subject to change. Such statements should not be relied upon as a representation, warranty or commitment to deliver specific products or functionality in the future.

Agenda
- Value Proposition
- Features and Functionality
- Demo
- Product Architecture
- Questions

Value Proposition

Market Trends Driving Next-Gen Metadata Management
- More Data, Many Systems
- Dark Data
- Increasing Regulations
- From Truth to Trust

Live Data Map
Knowledge graph of enterprise metadata assets
Metadata Services
- Indexed catalog of extracted metadata
- Data profile & statistical information
- Enrichment of content
  - Discovered data domains
  - Derived and discovered relationships
  - Human input & behavior
  - Classification, clustering
- Open APIs
System Characteristics
- Massive Scale
- High Availability
- Extensible
- High Load & Search Performance

A Common Foundation for Data Intelligence Applications
- Enterprise Information Catalog: Unified view into enterprise data assets
- Secure@Source: Enterprise-wide visibility into sensitive data risks
- Intelligent Data Lake: Self-service data preparation on big data
Universal Metadata Services + Storage Layer
- Live Data Map: Knowledge graph of enterprise metadata assets

Live Data Map: Foundation for Data Intelligence
[Diagram: Live Data Map, a knowledge graph of all enterprise data assets, at the center]
- Use cases: Data Discovery, Sensitive Data Tracking, Stewardship & Governance, Smart Suggestions
- Capabilities: Exploration, Semantic Search, Relationship Discovery, Recommendations, 360 degree views, User Ratings
- Graph contents: Map, Relationships, Rules, EIC Catalog, Glossary, Statistics, Ratings
- Metadata sources: all Informatica repositories; 3rd party BI, modeling, big data, RDBMS; applications; business glossary & context; user ratings, feedback, operational stats

Enterprise Information Catalog: Vision
Enterprise Information Catalog enables business and IT users to realize the full potential of their enterprise data assets by providing a unified metadata view that includes technical metadata, business context, user annotations, relationships, data quality and usage.

Enterprise Information Catalog
Unified view into enterprise information assets
- Business-user oriented solution
- Semantic search with dynamic facets
- Data lineage
- Change impact
- Relationships discovery
- High level data profiling
- Data domains
- Custom attributes with business classifications
- Broad metadata source connectivity
- Big data scale

With Big Data come bigger questions
- Where is data of this type?
- How did it get here?
- What data was used to create this attribute?
- Is my report using the right data?
- Who owns this dataset?
- Is this dataset good for my analysis?

Enterprise Information Catalog Powered By Live Data Map
1. Data Classification: Use machine learning to tag semantics to data elements in data sets
  - Domain Discovery
  - Smart Domains (Roadmap)
  "Without associating semantics to technical assets in the catalog, it will be useless. However, we also don't have an army of people who can perform those associations for us."
2. Data Discovery: "Google for Enterprise Metadata", a large-scale distributed metadata index
  - Semantic Search
  - Dynamic Facets
  "We have 20000+ databases and no idea what is in them."
3. Data Governance: Extract and infer lineage relationships
  - Lineage
  - Business Glossary Integration (Roadmap)
  BCBS 239: Principle 9 - "A bank should develop an inventory and classification of risk data items which includes a reference to the concepts used to elaborate the reports."
INFORMATICA CONFIDENTIAL DO NOT DISTRIBUTE
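
The domain discovery idea in item 1 can be sketched with simple pattern rules. This is only an illustration: the slide describes machine learning, whereas the sketch below swaps in hand-written regular expressions, and the rule names, `DOMAIN_RULES`, `discover_domain` and the 0.8 threshold are all assumptions rather than anything from the product.

```python
import re

# Hypothetical domain rules; the slide describes machine learning,
# so treat these regexes as a stand-in for illustration only.
DOMAIN_RULES = {
    "Email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US ZIP Code": re.compile(r"^\d{5}(-\d{4})?$"),
}

def discover_domain(values, threshold=0.8):
    """Return the first domain whose rule matches at least `threshold` of the values."""
    sample = [v for v in values if v]  # ignore empty and missing values
    if not sample:
        return None
    for domain, rule in DOMAIN_RULES.items():
        matched = sum(1 for v in sample if rule.match(v))
        if matched / len(sample) >= threshold:
            return domain
    return None

# Tag a column from a sample of its values.
zip_domain = discover_domain(["94105", "10001-1234"])
```

A threshold below 1.0 lets a column be tagged even when a few values are dirty, which is usually the case with real data.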

Enterprise Information Catalog Powered By Live Data Map
4. Broad Connectivity: Scanners for DB, DWH, ETL, BI, Big Data, Applications and more
  - Purpose Built Connectors
  - Scanner SDK (Roadmap)
  "We have data management software from multiple vendors in our IT environment. Need to see metadata from all for an end-to-end picture."
5. Big Data Scale: Deployed on Hadoop and internally uses Titan, Solr and Spark
  - Parallel Metadata Ingestion
  - 24X7 Availability
  "Mission critical system with 1 Billion Objects in 4 Applications and growing."
6. Crowdsourced Annotations: Typed and free-form user annotations
  - Custom Attributes
  - Business Classifications
  Leverage Wisdom of the Crowds: Enrich datasets with tribal knowledge, making classifications, comments and more available to everyone in the organization.

Features and Functionality

Data Discovery
Data discovery through a powerful search engine to find relevant data
- Advanced keyword search with token matching to find the most relevant data assets in the catalog
- Search Auto-Complete provides suggestions as the user types into the field
- Intelligent Facets are provided based on the search results, allowing users to narrow the search to the most relevant data assets
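
Token matching with result-driven facets can be illustrated with a toy example. This is a minimal sketch, not the product's search engine: the asset records, the field names `type` and `source`, and the `search` function are all hypothetical.

```python
from collections import Counter

# Hypothetical catalog entries; field names are illustrative only.
ASSETS = [
    {"name": "customer_orders", "type": "Table", "source": "Oracle"},
    {"name": "customer_address", "type": "Table", "source": "Hive"},
    {"name": "orders_by_region", "type": "Report", "source": "Tableau"},
]

def tokens(text):
    """Split an asset name or a query into a set of lowercase tokens."""
    return set(text.lower().replace("_", " ").split())

def search(query, assets, filters=None):
    """Return assets whose name contains every query token, plus facet counts."""
    wanted = tokens(query)
    hits = [a for a in assets if wanted <= tokens(a["name"])]
    # Apply any facet values the user has already selected.
    for field, value in (filters or {}).items():
        hits = [a for a in hits if a[field] == value]
    # Facets are counted over the current hits, so they change with each query.
    facets = {f: Counter(a[f] for a in hits) for f in ("type", "source")}
    return hits, facets

hits, facets = search("customer", ASSETS)
```

Calling `search("orders", ASSETS, {"type": "Report"})` shows the narrowing step: the facet value selected by the user is applied as a filter on top of the keyword match.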

Data Lineage
Interactively trace data origin through summarized lineage views for business users
- A simplified view of lineage that highlights the end points and not the transformations in between
- Drill down to expand any lineage path to see more details
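
One way to picture a summarized lineage view is as a graph in which transformation nodes are collapsed, leaving only links between data assets. The sketch below assumes a simple edge list; the node names, `TRANSFORMATIONS` and `summarize` are hypothetical and do not reflect how the product stores lineage.

```python
# Hypothetical detailed lineage: each edge points from upstream to downstream.
EDGES = [
    ("crm.customers", "map_clean_names"),
    ("map_clean_names", "stg.customers"),
    ("stg.customers", "map_load_dw"),
    ("map_load_dw", "dw.dim_customer"),
    ("dw.dim_customer", "Customer Report"),
]
TRANSFORMATIONS = {"map_clean_names", "map_load_dw"}

def summarize(edges, hidden):
    """Collapse hidden (transformation) nodes so only data assets are linked."""
    downstream = {}
    for src, dst in edges:
        downstream.setdefault(src, []).append(dst)

    def visible_targets(node, seen):
        # Walk through hidden nodes until a visible asset is reached.
        out = []
        for nxt in downstream.get(node, []):
            if nxt in seen:
                continue  # guards against cycles and duplicate paths
            seen.add(nxt)
            if nxt in hidden:
                out.extend(visible_targets(nxt, seen))
            else:
                out.append(nxt)
        return out

    summary = []
    for node in downstream:
        if node not in hidden:
            for target in visible_targets(node, set()):
                summary.append((node, target))
    return summary

summary = summarize(EDGES, TRANSFORMATIONS)
```

Drill-down then amounts to showing the original edges for one selected path instead of the collapsed link.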

360 Relationship Views
Discover related datasets by uncovering relationships
- Get a 360-degree view of a data asset using the relationship view, which includes related tables, views, domains and reports
- Expand relationship circles to get more details on relationship types and objects

Integrated Data Profiling Statistics
Understand data quality statistics before using data sets for analysis
- Profiling statistics are available in data asset views
- Detailed profiling statistics for columns include value distributions, patterns, data types and data domains
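
To make "value distributions" and "patterns" concrete, here is a minimal column-profiling sketch. It only illustrates the kind of statistics involved; the `profile` and `pattern` helpers and the choice of statistics are assumptions, not the product's profiling engine.

```python
import re
from collections import Counter

def pattern(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)

def profile(values):
    """Compute simple profiling statistics for one column of strings."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(2),  # value distribution
        "patterns": Counter(pattern(v) for v in non_null),
    }

stats = profile(["94105", "94105", "10001", "", "N/A"])
```

In this sample the pattern counts expose the stray "N/A" entry among otherwise five-digit values, which is the kind of data quality signal a user would want before analysis.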

Custom Attributes: Create and Assign
Leveraging Wisdom of Crowds
- Define custom attributes in Live Data Map Administrator and choose the applicable data assets
- Users assign values to the attributes in the search results or asset views

Custom Attributes - Search and Filter
Leveraging Wisdom of Crowds
- Custom attribute values are searchable so that users can get to annotated data assets quickly
- You can also choose to filter the search results by values of custom attributes

Business Classifications
Provide Business Context
- Extract classifications from Business Glossary
- Use them to annotate datasets in EIC

Enterprise Information Catalog DEMO

Live Data Map - Architecture
[Architecture diagram, top to bottom]
- Applications: EIC, Data Governance, Data Security, IDL, Informatica and 3rd Party
- REST API
- Services: Search, Smart Tags, Admin, Lineage, Job Management, Relationships, Evolution, Scheduler
- Scanners, Plugins, Data Profiler Plugin, Inference Analyzers, Ingestion Service
- Processing: Data Profiling Engine, Hadoop Grid (Yarn)
- Storage: Titan, MRS, PWH, HBASE

Big Data Scale
Designed for Large Scale
- Internally uses a fully managed Hadoop cluster to support enterprise-scale deployments
- Can also be deployed on an existing Hadoop cluster
- Graph technologies to store and query large enterprise knowledge graphs
High Load and Search Performance
- Supports parallel metadata ingestion to quickly update the catalog with multiple sources
- High-speed indexing to provide the most updated catalog content to users
- Distributed indexes to provide unmatched search performance over millions of data assets
High Availability
- Fault tolerance and high availability to provide 24X7 catalog uptime

Supported Connectivity
- Big Data: Cloudera Navigator, Hive (Cloudera/Hortonworks/MapR)
- Informatica: Informatica PowerCenter, Informatica BDM, User Scanner, Business Glossary Classifications
- Business Intelligence: IBM Cognos, SAP Business Objects, Tableau
- Database: Oracle, DB2, DB2 for z/OS, SQL Server, Sybase, JDBC, Teradata, Netezza
- Cloud: Salesforce

Enterprise Information Catalog Benefits
Catalog all data and processing assets in the enterprise
- All Data Sources: Databases, DW, Big Data, ETL Tools, BI reports and more.
- Data Sources for All: Make data source discovery available to everyone, including data scientists, data analysts, enterprise architects and developers, reducing data silos and increasing collaboration.
Find and explore the most relevant datasets for your data needs
- Find Available Datasets: Data discovery through a powerful search engine to find relevant data from the catalog.
- Explore Data Potential: Understand underlying semantics and correlations across datasets to quickly explore data potential.
- Increase Productivity: Spend less time searching for data and more time analyzing it.
Enrich datasets by capturing and sharing context across the organization
- Leverage Wisdom of the Crowds: Enrich datasets with tribal knowledge, making classifications, comments and more available to everyone in the organization.
- Provide Business Context: Increase IT-Business collaboration by providing the right business context to technical data assets.

Questions & Answers
- Gaurav Pathak, Director, Product Management
- G. Srinivasa Raghavan, Director, Development
- Sanjeev Cherian, Director, Development
- Vamsi Krishna, Senior Manager, Development
- Darren Wrigley, Product Specialist

User Groups
Informatica User Groups are a great way for you to invest in your professional development and learn about new Informatica offerings.
- Local Chapter Leaders manage each IUG online and via in-person meetings
- Network and socialize
- Find and share content, best practices & tips
- Learn about the latest technologies and solutions from Informatica
- Discover how colleagues and peers use Informatica
https://network.informatica.com/welcome/
LEARN MORE AT IW16: Go to the Solutions Expo, Informatica Pavilion / Ecosystem & Innovation Area
- Talk to regional user group leaders
- Learn about meeting plans
- Join your regional user group
When: Monday 6:00pm to 8:30pm, Tuesday 10:45am to 2:15pm, Wednesday 10:30am to 1:45pm
Where: Moscone West Hall, Level One

Enterprise Information Catalog
Addressing Key User Challenges
Personas: Data Analysts, IT, Data Architects
- "I can't easily discover and explore data without IT help"
- "I have no insight into what data sets are related or where the data come from"
- "I need a self-service environment where I can find data & information about it easily"
- "I have to deliver data assets to my business users"
- "The business wants an easy way for consumers to discover and use existing data assets in the organization"
- "I need a single catalog for the organization reducing data silos"
- "I have too much redundant data sprawl"
- "I want a single place for all users to discover and enrich enterprise data assets."

LDM Administrator

Add Catalog Resource
- Define connection settings
- Provide metadata extract and profiling settings
- Schedule scanner runs

Resource Library
See configured resources and schedules in the Library view with filters.

Connection Assignments
Connection assignment through an easy-to-use interface

Custom Attributes: Create and Assign
- Define custom attributes in Live Data Map Administrator and choose the applicable data assets
- Users assign values to the attributes in the search results or asset views

Reusable Configurations
Create reusable configurations for DIS settings used in profiling.

Monitor Tasks
Monitor task status and view logs