End to End Solution to Accelerate Data Warehouse Optimization
Franco Flore, Alliance Sales Director - APJ
Big Data Is Driving Key Business Initiatives
Increase profitability, innovation, customer satisfaction, and competitiveness
Reduce adverse outcomes like fraud, crime, downtime, and security breaches
Data Management Teams Face Dual Pressures: Growing Data Supply
Data sources: Relational, Mainframe; Documents, Emails; Social Media, Web Logs; Machine Device, Cloud
90% of the world's data was created in the last two years
Enterprise apps are now using petabytes of data center storage
Pressure to manage efficiently
Data Management Teams Face Dual Pressures: Growing Data Demand
Data consumers: Data Scientists; Operational Dashboards; Mobile Users; Automated Alerts
There are more mobile devices than people on Earth
Enterprises have more data consumers than ever before
Pressure to deliver performance
Data Warehouses Not Optimized for Modern Needs
Diagram labels: SOURCE DATA (Databases, Files, Servers & Mainframe, Social, Sensor data); INGEST (Batch, Replicate, Stream, Archive); ETL on MPP & Grid Servers (Extract, Transform, Load); ELT Pushdown to EDW or scripting; DATA WAREHOUSE (Transform, Query)
New sources of data like social media and sensors are not easily stored and processed in data warehouses
30-50% of DW CPU capacity is used by ELT processing
30-70% of data in the DW is unused or infrequently used
Opportunity: Hadoop Is an Efficient, Scalable Platform
Flexible data modeling
Scalable to large datasets
Efficient, based on commodity servers and storage
Enterprises are adopting Hadoop to augment data warehouses and drive more compelling analytical outcomes
But Hadoop Can Be Hard to Adopt
Can't re-use existing skills when platforms change
Can't re-use existing processes to drive scalability and repeatability
Challenges in Realizing the Potential of Hadoop
of enterprises face challenges in adopting Hadoop for better analytics
Slow to Collect, Hard to Re-Direct, Manual to Perfect
Data Warehouse Optimization Solution
Diagram labels: Data Sources (Relational, Mainframe; Documents and Emails; Social Media, Web Logs; Machine Device, Cloud); Extract Transform Load (ETL); Data Warehouse (DW); BI/Analytics; HDFS on Cisco UCS Servers
Offload ETL Workload
1. As ETL jobs become overwhelming
2. Offload some ETL jobs to Hadoop
Offload Data
3. As data volume grows too large
4. Offload infrequently used data to Hadoop
Deliver Enriched BI/Analytics
5. Federate data from DW and Hadoop
6. Deliver richer & deeper data for analytics
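Step 4 on this slide (offload infrequently used data to Hadoop) implies some way of deciding which warehouse tables count as infrequently used. The slide does not say how that decision is made, so the following is only a minimal sketch under stated assumptions: the `TableStats` record, the `offload_candidates` and `offload_fraction` helpers, the 90-day idle threshold, and the sample table names are all hypothetical and are not part of any Informatica, Hadoop, or Cisco product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class TableStats:
    """Hypothetical usage record for one warehouse table."""
    name: str
    size_gb: float
    last_queried: date


def offload_candidates(tables, today, max_idle_days=90):
    """Return tables not queried in the last max_idle_days, largest first.

    The 90-day default is an assumed threshold, not a recommendation.
    """
    cutoff = today - timedelta(days=max_idle_days)
    idle = [t for t in tables if t.last_queried < cutoff]
    return sorted(idle, key=lambda t: t.size_gb, reverse=True)


def offload_fraction(tables, candidates):
    """Share of total warehouse storage held by the candidate tables."""
    total = sum(t.size_gb for t in tables)
    if total == 0:
        return 0.0
    return sum(t.size_gb for t in candidates) / total


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    tables = [
        TableStats("sales_2009", 400.0, date(2013, 1, 15)),
        TableStats("clickstream_raw", 250.0, date(2014, 1, 2)),
        TableStats("orders_current", 350.0, date(2014, 6, 29)),
    ]
    picked = offload_candidates(tables, today=date(2014, 6, 30))
    for t in picked:
        print(t.name, t.size_gb)
    print(offload_fraction(tables, picked))
```

A report like this could be compared against the 30-70% unused-data range quoted earlier in the deck before any data is actually moved.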
Contexti Analytics Platform (Powered by Hadoop)
Data inputs: Transaction / ERP Data; Foot Traffic / Sensor Data; Public Data / Demographic Data; Geospatial Data; Clickstream / Web Data; Social Media Data; Wi-Fi & Mobile Data
Analytics Platform: On-Demand or Custom, Powered by Hadoop
Outcomes: GET INSIGHTS on Customers, Competitors, Sales & Operations; DRIVE GROWTH; INNOVATE & COMPETE with Data-Driven Decisions
Customers
Contexti Analytics Platform (MapR/Cisco) Architecture & Design
Data Sources: Application Systems; Semi & Unstructured Data; Structured Data
Stream: Stream Data Acquisition; Raw Data Stream (In-Memory); CEP / Stream Analytics; Real-Time Incremental Views
Batch: File Acquisition; RDBMS Data Ingestion; Data Ingestion; Raw & Enhanced Data Sets (HDFS / MFS); Deep Analytics; Machine Learning; Feature Generation; Batch Pre-computed Views
Serve: Data Access API; Guided & Ad-Hoc Analysis; Reporting, Search & Query
Data Consumers: RDBMS & MPP Platforms
Cross-cutting: Metadata Management; Operations & Provisioning; Governance
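The architecture slide lists both "Batch Pre-computed Views" and "Real-Time Incremental Views" behind a "Data Access API", but it does not describe how a query combines them. One common pattern is to answer each query by adding the batch result to the increment accumulated since the last batch run. The sketch below illustrates that pattern only; the function names, the event shape, and the simple per-key count are assumptions made for illustration and are not taken from the Contexti platform.

```python
from collections import Counter


def build_batch_view(stored_events):
    """Batch path: recompute counts per key from the full stored data set."""
    return Counter(event["key"] for event in stored_events)


def update_realtime_view(realtime_view, event):
    """Stream path: fold one newly arrived event into the incremental view."""
    realtime_view[event["key"]] += 1


def query(key, batch_view, realtime_view):
    """Serving path: combine the pre-computed and incremental counts."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)


if __name__ == "__main__":
    # Made-up events, for illustration only.
    stored = [{"key": "store_1"}, {"key": "store_1"}, {"key": "store_2"}]
    batch_view = build_batch_view(stored)

    realtime_view = Counter()
    update_realtime_view(realtime_view, {"key": "store_1"})

    print(query("store_1", batch_view, realtime_view))
    print(query("store_3", batch_view, realtime_view))
```

In this kind of design the real-time view is normally reset whenever a new batch view is published, so that events are not counted twice; that step is omitted here for brevity.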
Many Products, Components & Integration Considerations
Contexti IP: Codified Solutions from 3 Years & 40+ Projects
Analytics Platform, Contexti Platform, Partners
AUTOMATED & OPTIMIZED solution deployment: 40+ Components, 200+ Configurations
SPEED-TO-MARKET: Contexti 1 day, others 3+ months
MANAGED / AS-A-SERVICE
Unleash the Power of Big Data: Informatica Developers Are Now Hadoop Developers
Products: PowerCenter Big Data Edition; Data Quality Big Data Edition; Vibe Data Stream for Machine Data
Data sources: Relational, Mainframe; Documents and Emails; Social Media, Web Logs; Machine Device, Cloud
Operations: Load, Replicate, Stream, Archive; Profile, Parse, ETL, Cleanse, Match; Load, Services, Events, Topics
Targets: Analytics Teams; Data Warehouse; Analytics & Op Dashboards; Mobile Apps; Alerts
Example: This is your transformation
Example: This is your transformation with Informatica
Informatica Big Data Edition
Faster to Collect: 200+ High Performance Connectors; 100+ Pre-Built Parsers; Complex Data Parsing; Multiple Styles of Data Ingestion
Easier to Re-Direct: Simple Visual Environment; Optimized Execution; Re-use existing skills; Easy to monitor & secure; Easy to mask sensitive data
Surer to Perfect: Extensive Data Quality; Data Domain Discovery; In-Depth Metadata Mgmt; Business Glossary; End-to-end Data Lineage
Connectivity: Messaging, Web Services; MPP Appliances; Mainframe and Midrange; Relational, NoSQL, Flat Files; Unstructured Data; XML Standards; SaaS; Social Media; Industry Standards; Packaged Apps
Customer's Benefit
The entire Informatica platform executes on Hadoop without a developer having to know Pig, Hive, MapReduce, etc.
This allows you to develop your Big Data projects five times faster, resulting in lower development costs and, most importantly, getting your Big Data projects out the door in 2 months instead of 10.
Changing the Analytics Equation: Shift Effort from Data Preparation to Data Analysis
Chart: Hand Coding
Chart labels: Time available for data analysis; Time spent on data preparation (parse, profile, cleanse, transform, match)
Develop new products and services faster and cheaper
Free up analysts to focus on analysis
Allow more available & affordable Informatica developers to handle data preparation
Changing the Analytics Equation: Shift Effort from Data Preparation to Data Analysis
Chart: Hand Coding vs. With Informatica
Chart labels: Time available for data analysis; Time spent on data preparation (parse, profile, cleanse, transform, match)
Develop new products and services faster and cheaper
Free up analysts to focus on analysis
Allow more available & affordable Informatica developers to handle data preparation
Proven Technology Leadership - Gartner
Enterprise Data Integration, Cloud Data Integration, Data Quality, Data Masking, Data Archiving, Master Data Management
Proven Technology Leadership - Forrester
Data Virtualization, Enterprise ETL, Cloud Data Integration, Master Data Management, Product Information Management, Data Governance Tools, Big Data Streaming Analytics Platforms
What Are Customers Doing with Informatica and Big Data?
Better Customer Experience
A Financial Services company sought to enrich customer data with online weblog data
Informatica provided: fast weblog parsing, quick enrichment, lineage tracking
Data warehousing costs were reduced
Higher Service Levels for the Supply Chain
A Manufacturer sought to proactively identify issues before downtime affected costs and service levels
Informatica provided: real-time ingestion of sensor data, efficient preparation of data
Potential maintenance issues predicted before they happen
Newer Products and Services
An energy company sought to deliver better service to its customers
Informatica provided: ingestion from smart meters, enrichment of customer information
More accurate forecasting of customer demand and more customized delivery of innovative services
Thank You Franco Flore Alliance Sales Director - APJ