Application Areas for Big Data & Analytics. Dr. Thomas Keil, Program Marketing Manager Business Analytics, SAS Institute GmbH
SAS Institute
"SAS is the first company to call when you need to solve complex business problems." Dr. James H. Goodnight, CEO and founder of SAS
Founded in 1976 in Cary, North Carolina
12,000 employees worldwide in 400 SAS offices in 52 countries
Since 1982, with 470 employees in 6 German offices
About 50,000 companies and organizations from all industries rely on SAS
Revenue (Germany): 114 million
Revenue (international): US$ 2.43 billion
Investment in R&D: > 20%
Gartner Hype Cycle. Big Data: Hype or Trend?
Big Data: What Is It About? Source: IDC White Paper, sponsored by EMC, "As the Economy Contracts, the Digital Universe Expands", May 2009.
How Does Big Data Arise? Mobile transactions; sensor data; pervasive computing; ever-advancing digitalization; users generate content, partly automated; machines generate data.
Big Data: A Challenge. Volume, velocity, heterogeneity. What matters is recognizing and using the value of the data!
SAS Delivers Solutions
Capture: big data; more variables; data management
Analyze: complex calculations; variants; exploratory analysis
Implement: fast results; monitoring; integration into business processes
Unsolvable problems / new opportunities / unknown risks
Copyright 2011, SAS Institute Inc. All rights reserved.
High-Performance Computing
SAS Analytics
SAS High-Performance Computing: SAS Grid Computing; SAS In-Database; SAS In-Memory Analytics
Architecture flexibility: Desktop, SMP, MPP, Grid
Deployment flexibility: On Premise, Cloud, Appliance
McKinsey Study Shows Big Value. Source: McKinsey Global Institute, May 2011: Big data: The next frontier for innovation, competition and productivity http://www.mckinsey.com/mgi/publications/big_data/pdfs/mgi_big_data_full_report.pdf
Example: Retail, Marketing Optimization. Offers to customers by product and channel: Visa Classic, Visa Gold and Home Equity Loan, each via Direct Mail, Call Center or Branch. Millions of customers; hundreds of offers. Previously: 5 h 29 min. SAS High Performance Analytics: 2 min. Speedup: 164x.
Example: Bank, Risk Calculation. 45,000 financial instruments; 100,000 market parameters; two time series; 8.8 billion value-at-risk calculations. Previously: 18 h. SAS High Performance Analytics: < 3 min. Speedup: 360x.
Example: Retail, Pricing. 73 million items; more than 800 stores; hundreds of millions of pricing decisions every week; two to three years of historical information; ~3 terabytes of data; 80% hardware savings! Previously: 27 h 15 min. SAS High Performance Analytics: 1 h 15 min. Speedup: 22x.
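The speedup factors quoted on the three example slides follow directly from the before/after runtimes. A minimal sketch of that arithmetic, assuming the "< 3 min" figure is treated as exactly 3 minutes (the helper names are illustrative and not from the slides):

```python
def to_minutes(hours=0, minutes=0):
    """Convert an hours/minutes duration into minutes."""
    return hours * 60 + minutes

# (before, after) runtimes in minutes, as quoted on the example slides.
examples = {
    "marketing optimization": (to_minutes(5, 29), to_minutes(0, 2)),
    "risk calculation": (to_minutes(18), to_minutes(0, 3)),  # "< 3 min" taken as 3 min
    "pricing": (to_minutes(27, 15), to_minutes(1, 15)),
}

def speedup(before, after):
    """Speedup factor = old runtime / new runtime."""
    return before / after

speedups = {name: speedup(b, a) for name, (b, a) in examples.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```

The computed factors (164.5, 360.0 and 21.8) agree with the rounded 164x, 360x and 22x on the slides; because the risk runtime is an upper bound, 360x is a lower bound for that example.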
Example: Healthcare, Risk Structure Compensation (Risikostrukturausgleich). Source: GKV-Spitzenverband (talk by Mathias Kleinschmidt, 13 September 2011, Big Data Workshop, Fraunhofer Institute, St. Augustin)
Example: Geodata, Customer Segmentation. Source: infas geodaten (talk by Ludger Hertig, 13 September 2011, Big Data Workshop, Fraunhofer Institute, St. Augustin)
Application Scenarios for Every Industry
In Summary
Prepare ever more data with ever better analyses in ever shorter time for decision makers, and make it available everywhere.
Ever more data with ever better analyses: SAS core competence in Business Analytics
In ever shorter time: High Performance Computing
Prepared for decision makers: Business Analytics Framework
Available everywhere: Mobile BI
Strategy: SAS High Performance Analytics
Thank you for your attention. Dr. Thomas Keil, Program Marketing Manager Business Analytics, SAS Institute GmbH, thomas.keil@ger.sas.com, Tel. +49 6221 415-1268, Mobile: +49 173 6500790. Stay in touch: XING: Business Analytics mit SAS
Backup: Further Examples
SAS Grid: Customer Churn and Cross-sell/Up-sell (Telecommunications)
Challenges: Grow market share; improve marketing efficacy
Solution: Grid-based deployment; better analytic processes and controls
Benefits: 15% improvement in marketing campaigns; processing time reduced from 11 hours to 10 seconds
"We have jobs that used to take 11 hours to run. In the new analytics environment, they are running in around 10 seconds." Sandra Hogan, Director of Customer Intelligence
SAS In-Database: Customer Buying Behaviors (Retail Marketing Services)
Challenges: Quickly profile customer behavior in real time; better targeting of customer offers; analyze and gain insight on 2.5 petabytes of data
Solution: In-database processing; managed analytical environment
Benefits: Coupon redemption rate increased from 10% to 25%; model scoring reduced from 4.5 hours to 60 seconds
"Catalina is now able to create more flexible, robust models that take advantage of complex analytic data preparation steps and methods without the need for manual recoding of our custom in-database scoring routines." Eric Williams, Chief Technology Officer for Catalina Marketing
SAS In-Database Customer Success Story: Propensity to Pay
Telecom company: providing performance gains by refactoring
Business issue: Improve collection of unpaid accounts
Technical limitation: 3-hour SLA
Solution: SAS In-Database Scoring
Past approach: Daily process begins with flat file creation at 6:30 am; SLA delivered at ~9:30 am. File transferred to SQL Server, limited to ~350K customer records based on specific criteria. 300-step process to support the data mining life cycle. 30 minutes to score ~350K customers. Runs in ~3 hours.
In-database approach: Daily process begins at 4:00 am with EDW load. All operational data loaded directly to EDW. No flat file or intermediate processing is needed. 10-step process. Scoring and customer selection done in-database against ALL customer rows. 4 minutes to score ~40M customers. Runs in 12 minutes.
Result: Business process change leading to BETTER targeting and $1M to $3M extra collections a month.
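The before/after figures of this success story can be put on a common scale by comparing scoring throughput (customers scored per minute) and end-to-end runtime. A small sketch using only the numbers quoted above; the function and variable names are illustrative assumptions:

```python
def throughput(records, minutes):
    """Records scored per minute."""
    return records / minutes

# Figures quoted in the success story (approximate, as on the slide).
past_rate = throughput(350_000, 30)        # past approach: ~350K customers in 30 min
in_db_rate = throughput(40_000_000, 4)     # in-database: ~40M customers in 4 min

scoring_gain = in_db_rate / past_rate      # ratio of scoring throughput
runtime_gain = (3 * 60) / 12               # whole process: ~3 hours vs. 12 minutes

print(f"past approach: {past_rate:,.0f} customers/min")
print(f"in-database:   {in_db_rate:,.0f} customers/min")
print(f"scoring throughput gain: {scoring_gain:.0f}x, runtime gain: {runtime_gain:.0f}x")
```

Under these figures the whole process runs about 15 times faster, while scoring throughput rises by a far larger factor because the in-database run covers all customer rows instead of a preselected subset.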
Backup: Technical Implementation
Key Business Challenges
Underutilized resources; support incremental growth; unnecessary data movement; guarantee uptime and continuity; increasing costs; growth in data and user volumes and in complexity; slow time to results; slow response time; limited analysis due to lack of resources; low productivity
SAS High Performance Computing: SAS Grid Computing; SAS In-Database; SAS In-Memory Analytics; Event Stream Processing
SAS High Performance Computing: SAS Grid Computing
WHY (Value)
IT: Provide a centrally managed SAS environment to the enterprise; scale-out server infrastructure
Business: Reduce time to results; guaranteed uptime and continuity of services; increased user flexibility
WHAT (Functionality): High availability; workload management; distributed enterprise job scheduling; scalability for a multi-user environment
HOW (Technology): SAS Grid Manager, software to manage a SAS distributed (grid) environment. SAS jobs, or parts of a job (step boundaries), are split up to run in parallel across multiple servers in a managed grid environment. Shared physical storage is used within the environment.
Partner: Platform Computing
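The idea of splitting a job at step boundaries and running the independent parts in parallel can be illustrated with a short sketch. This is not SAS Grid Manager code: a local thread pool stands in for the grid nodes, and the step names and workloads are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_step(step):
    """Run one independent job step; here a stand-in computation (sum of squares)."""
    name, data = step
    return name, sum(x * x for x in data)

# Hypothetical job split into three independent steps at its step boundaries.
steps = [
    ("step_a", range(0, 1000)),
    ("step_b", range(1000, 2000)),
    ("step_c", range(2000, 3000)),
]

def run_on_grid(steps, workers=3):
    """Dispatch the steps to a pool of workers and collect results by step name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(run_step, steps))

results = run_on_grid(steps)
print(results)
```

In a real grid the workers would be separate servers reading from shared storage, and a workload manager would handle scheduling and failover; the sketch only shows the split-and-collect pattern.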
SAS High Performance Computing: SAS In-Database
WHY and WHAT (IT and Business)
Value: Integrate with the current IT infrastructure, leveraging the current data warehouse investment
Functionality: Enable data governance; reduce data redundancy; reduce information latency; increase hardware utilization; streamlined analytic processes for better decisions; reduced time to results; minimize data preparation; accelerate data discovery; increase number of models generated; develop complex models to improve outcomes
HOW (Technology): SAS Scoring Accelerator, SAS Analytics Accelerator, SAS/ACCESS. Ability to provide select data preparation, data exploration, predictive analytics and scoring capabilities inside the data warehouse. The analytical computations within a SAS step boundary run in parallel, leveraging the MPP (massively parallel processing) and partitioned shared-nothing capabilities of the database.
Partners: Aster Data, EMC (Greenplum), IBM DB2, Netezza, Oracle, and Teradata
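The principle behind in-database scoring, evaluating the model where the data lives instead of exporting rows to a separate scoring process, can be sketched with a plain SQL expression. The example below uses Python's built-in sqlite3 as a stand-in for the data warehouse; the table, columns and coefficients are hypothetical, and this is not the SAS Scoring Accelerator interface.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the enterprise data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, balance REAL, days_overdue INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 120.0, 10), (2, 900.0, 45), (3, 40.0, 90)],
)

# Hypothetical coefficients of a linear propensity score.
INTERCEPT, W_BALANCE, W_OVERDUE = 0.5, 0.001, -0.004

# The score is computed by the database engine against all rows;
# only the (id, score) pairs leave the database.
scored = conn.execute(
    "SELECT id, ? + ? * balance + ? * days_overdue AS score "
    "FROM customers ORDER BY score DESC",
    (INTERCEPT, W_BALANCE, W_OVERDUE),
).fetchall()

for customer_id, score in scored:
    print(customer_id, round(score, 2))
```

On an MPP database the same expression would be evaluated in parallel on each data partition, which is where the reduction in data movement and latency comes from.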
SAS Scoring Accelerator and SAS Model Manager: Model Management and Deployment
Diagram labels: Analytic Models; SAS Model Manager; SAS Scoring Accelerator; Model Monitoring; Database; SAS Functions; SAS Formats; Enterprise Miner Models
Adds business value: Consistent model development and validation; understanding of model strategy and lifetime value
Improves production process: Efficient deployment of models in a timely manner
Enforces governance process: Audit trails for regulatory compliance
Monitor the performance of models; provide qualitative overlay on test results; create reports detailing model performance
Acceptable; Recalibration; Redevelopment
Reduced data movement and latency; eliminate model score code rewrite and model revalidation efforts; achieve higher model scoring performance and faster time to results; consolidate data to improve regulatory compliance; better manage, provision and govern data
SAS High Performance Computing: SAS In-Memory Analytics
WHY (Value; IT and Business): Provide a dedicated high-performance, scalable, managed appliance for high-end analytics; solve complex and time-critical business problems involving big data in near real time
WHAT (Functionality): Dedicated managed system (includes hardware and software); designed and scoped for high performance computing and analytics; near-real-time results; handle large volumes of data and complex calculations; optimized for specific customer business issues
HOW (Technology): SAS High Performance Markdown Optimization, SAS High Performance Risk, SAS High Performance Analytics, SAS High Performance Merchandise Planning. Analytical computations and data are co-located in a distributed framework. In-memory analytics for rapid execution are able to read from and persist to distributed storage.
Partners: Hewlett Packard (HP) for High Performance Solutions; Teradata and EMC/Greenplum for High Performance Analytics
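Co-locating computation and data means each node works on its own in-memory partition and only small summaries travel between nodes. A minimal sketch of that pattern for a mean and (population) variance; the partitions and function names are illustrative assumptions, not part of any SAS product:

```python
# Each inner list stands for the rows held in memory on one node.
partitions = [
    [4.0, 8.0, 6.0],
    [10.0, 2.0],
    [5.0, 7.0, 9.0, 3.0],
]

def local_summary(rows):
    """Computed on the node that holds the rows: count, sum and sum of squares."""
    return len(rows), sum(rows), sum(x * x for x in rows)

def combine(summaries):
    """Merge the per-node summaries into a global mean and population variance."""
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    variance = total_sq / n - mean ** 2
    return n, mean, variance

n, mean, variance = combine([local_summary(p) for p in partitions])
print(n, mean, round(variance, 4))
```

Only three numbers per partition are exchanged regardless of partition size, which is why this style of computation scales to large data volumes. For numerically demanding data a more stable variance-merging formula would be preferable to the sum-of-squares shortcut used here.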
Backup: Levels of Analytics
1 STANDARD REPORTS. Answer the questions: What happened? When did it happen? Example: Monthly or quarterly financial reports. We all know about these. They're generated on a regular basis and describe just what happened in a particular area. They're useful to some extent, but not for making long-term decisions.
2 AD HOC REPORTS. Answer the questions: How many? How often? Where? Example: Custom reports that describe the number of hospital patients for every diagnosis code for each day of the week. At their best, ad hoc reports let you ask the questions and request a couple of custom reports to find the answers.
3 QUERY DRILLDOWN (OR OLAP). Answer the questions: Where exactly is the problem? How do I find the answers? Example: Sort and explore data about different types of cell phone users and their calling behaviors. Query drilldown allows for a little bit of discovery. OLAP lets you manipulate the data yourself to find out how many, what color and where.
4 ALERTS. Answer the questions: When should I react? What actions are needed now? Example: Sales executives receive alerts when sales targets are falling behind. With alerts, you can learn when you have a problem and be notified when something similar happens again in the future. Alerts can appear via e-mail, RSS feeds or as red dials on a scorecard or dashboard.
5 STATISTICAL ANALYSIS. Answer the questions: Why is it happening? What opportunities am I missing? Example: Banks can discover why an increasing number of customers are refinancing their homes. Here we can begin to run some complex analytics, like frequency models and regression analysis. We can begin to look at why things are happening using the stored data and then begin to answer questions based on the data.
6 FORECASTING. Answer the questions: What if these trends continue? How much is needed? When will it be needed? Example: Retailers can predict how demand for individual products will vary from store to store. Forecasting is one of the hottest markets and hottest analytical applications right now. It applies everywhere. In particular, forecasting demand helps supply just enough inventory, so you don't run out or have too much.
7 PREDICTIVE MODELING. Answer the questions: What will happen next? How will it affect my business? Example: Hotels and casinos can predict which VIP customers will be more interested in particular vacation packages. If you have 10 million customers and want to do a marketing campaign, who's most likely to respond? How do you segment that group? And how do you determine who's most likely to leave your organization? Predictive modeling provides the answers.
8 OPTIMIZATION. Answer the questions: How do we do things better? What is the best decision for a complex problem? Example: Given business priorities, resource constraints and available technology, determine the best way to optimize your IT platform to satisfy the needs of every user. Optimization supports innovation. It takes your resources and needs into consideration and helps you find the best possible way to accomplish your goals.
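To make the step from statistical analysis to forecasting concrete, the sketch below fits a straight-line trend by ordinary least squares and extends it one period ahead ("What if these trends continue?"). The weekly demand figures are made up for illustration, and a linear trend is only one simple forecasting approach among many.

```python
def fit_trend(values):
    """Ordinary least squares fit of y = intercept + slope * t for t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept, slope

def forecast(values, steps_ahead=1):
    """Extend the fitted trend line steps_ahead periods past the last observation."""
    intercept, slope = fit_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)

# Hypothetical weekly demand for one product in one store.
weekly_demand = [100, 112, 118, 130]

intercept, slope = fit_trend(weekly_demand)
next_week = forecast(weekly_demand)
print(f"trend: {slope:.1f} units/week, forecast for next week: {next_week:.1f}")
```

The same fitted model serves both levels: the slope answers the descriptive question of how fast demand is changing, and the extrapolation answers how much will be needed next.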