September 10-13, 2012, Orlando, Florida
Solving Big Data Problems in Real Time with CEP and Dashboards: Patterns and Tips
Kevin Wilson, Karl Kwong
Learning Points
- Big data is a reality, and organizations must embrace it to succeed
- Complex Event Processing (CEP) technology gives us a new way to look at big data: real-time micro-trending
- CEP supports data processing patterns that are very useful but difficult to implement in a traditional database model
- Leveraging big data in real time will change the way organizations run
Recent Explosion of Data
[Diagram: data sources grouped by type, with "Opportunities" and "Velocity" as labels]
- Communication: emails, instant messages, documents, mobile, tweets
- Things: GPS, speed, sensors, smart meters, temperature
- Transactions: service calls, inventory movements, sales orders
IDC predicts the digital universe will grow to 2.7 zettabytes by the end of 2012, up 48% from 2011
Putting it into Perspective
- An average human hair is about 70 µm in diameter: very small
- Let's say 1 byte of data = 1 human hair
- 2.7 zettabytes' worth of hair laid side by side would stretch about 1.9 × 10^17 m:
  - enough to circle the Earth roughly 4.7 billion times
  - or to go to the Sun and back roughly 630 thousand times
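The hair comparison above can be checked with a few lines of arithmetic. This is a minimal sketch: the hair width and the 2.7 zettabyte figure come from the slide, while the decimal definition of a zettabyte, the Earth's equatorial circumference and the mean Earth-Sun distance are assumptions added here.

```python
# From the slide: one byte is represented by one hair, 70 micrometres wide.
HAIR_WIDTH_M = 70e-6
DATA_BYTES = 2.7e21                    # 2.7 zettabytes, assuming 1 ZB = 10**21 bytes

# Assumed reference distances (not stated on the slide).
EARTH_CIRCUMFERENCE_M = 40_075_000.0   # equatorial circumference
EARTH_SUN_DISTANCE_M = 149.6e9         # mean Earth-Sun distance


def hair_line_length_m(n_bytes: float, hair_width_m: float = HAIR_WIDTH_M) -> float:
    """Length in metres of n_bytes hairs laid side by side."""
    return n_bytes * hair_width_m


length_m = hair_line_length_m(DATA_BYTES)
earth_laps = length_m / EARTH_CIRCUMFERENCE_M
sun_round_trips = length_m / (2 * EARTH_SUN_DISTANCE_M)

print(f"total length:           {length_m:.3g} m")
print(f"laps around the Earth:  {earth_laps:.3g}")
print(f"round trips to the Sun: {sun_round_trips:.3g}")
```

Changing the constants shows how sensitive the comparison is to the assumed hair width and distances.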
Common Sources of Big Data in the Enterprise
Enterprises
- Web: weblogs, click stream events, web transactions
- ERP: B2B transactions, B2C transactions
- Contact center: emails, telephony
Industries
- Telecom: call detail records (CDR)
- Utilities: smart meters
- Manufacturing: equipment health
Common Characteristics
- High volume and velocity
- Streaming sources
- Operational in nature
Demands a new way to look at this data!
Two Common Approaches to Big Data
In-memory Analytical Appliances (SAP HANA)
- Load data into large main memory
- Store data in an optimized format
- Aggregations over large sets (a few million rows) can be done in seconds
- Data analysis tools can interface with the appliance to provide typical data analysis
MapReduce and Distributed File Systems (Hadoop)
- Take advantage of distributed processing to transform data
- Aggregation and transformation of extremely large sets (multiple billions of rows) can be done in hours
- Data can then be fed into more traditional data analysis systems
A Different Way to Look at Big Data
What is addressed by in-memory and MapReduce?
- Volume of data
- Processing time
- Analysis
What do we gain from big data today?
- Higher resolution (more records) can be used for analysis
- Trending can be done over longer periods
So what is missing?
- Both approaches focus on historical analysis
- Insights are more suitable for strategic and tactical decisions
We need a way to cope with big data and answer: what is happening right now?
Different Way to Trend
Analytical Trending
- Examples: quarterly sales performance, annual customer satisfaction, monthly branch queue time
- Typical aggregation: years, quarters, months, weeks
- Supports strategic and tactical decisions: strategic investments, compensation and rewards, weekly staffing, corporate performance
Real-time Micro-trending
- Examples: max wait time for an agent, banner ad click rate, failed inspection rate
- Typical aggregation: days, hours, minutes; rolling or sliding window
- Supports operational or time-sensitive decisions: overtime approval, agent allocation, cross-sell or up-sell, fraud detection
Rolling or Sliding Window Aggregation
[Chart: Avg. plotted against Min, aggregated over fixed periods (0-9, 10-19, 20-29 and 0-4, 5-9, ... 25-29)]
Traditional analysis relies on landmark aggregation periods
- The longer the period, the less up to date the result
- Resolution is reduced
- Does not reflect what is happening right now
Increasing aggregation frequency
- Reduces latency
- Increases resolution
- Still doesn't tell us what is happening right now
At some point, shortening the aggregation period breaks down
- Too short an aggregation period exposes noise in the data
- We lose visibility of the general data movement
Is there a way to hide noise and lower latency?
Sliding window approach
- Aggregate over a logical time or event window
- Computation is done continuously
- Filters out noise
- Takes the most recent data into account
- Always tells us what is happening right now!
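The difference between landmark periods and a sliding window can be sketched in plain Python. This is an illustrative sketch only, not how an event stream processor is implemented: the function names, the 5-minute window and the sample readings are all hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp in minutes, observed value)


def sliding_average(events: Iterable[Event], window: float) -> Iterator[Event]:
    """Yield (timestamp, average over the trailing `window` minutes) per event."""
    buf: Deque[Event] = deque()
    total = 0.0
    for ts, value in events:
        buf.append((ts, value))
        total += value
        # Drop events that are no longer inside the window (ts - window, ts].
        while buf[0][0] <= ts - window:
            total -= buf.popleft()[1]
        yield ts, total / len(buf)


def landmark_average(events: Iterable[Event], period: float) -> Iterator[Event]:
    """Yield (period start, average) once for each fixed period that has data."""
    bucket, total, count = None, 0.0, 0
    for ts, value in events:
        current = ts // period
        if bucket is not None and current != bucket:
            # A new period has started, so the previous one can be reported.
            yield bucket * period, total / count
            total, count = 0.0, 0
        bucket = current
        total += value
        count += 1
    if count:
        yield bucket * period, total / count


# Hypothetical readings with one spike (9.0) at minute 6.
readings = [(0, 3.0), (2, 3.2), (4, 3.4), (6, 9.0), (8, 3.6), (10, 3.8), (12, 3.5)]

for ts, avg in sliding_average(readings, window=5):
    print(f"minute {ts:>2}: sliding avg = {avg:.2f}")
for start, avg in landmark_average(readings, period=5):
    print(f"period starting {start:>2}: landmark avg = {avg:.2f}")
```

The sliding average produces a fresh value on every event, while the landmark average only reports a period once the next one has begun.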
New Class of Software is Needed
[Diagram spanning two questions: "What is happening now?" and "What has happened"]
Diagram labels: BPM, Complex Event Processing (CEP), Alerts / Notifications, SOA / ESB, Streams, Apps, Context, History, KPIs / Goals, Semantic Layer, Visualize, Analyze, Data Warehouse, Data Mart, Performance Management
SAP Sybase Event Stream Processor
[Diagram labels: Input Streams, Studio (Authoring), Market Events, Transactions, Process Events, Reference Data, Dashboards, Applications, Message Bus, Sybase IQ]
- Unlimited number of input streams
- Input events in native formats
- Incoming data is processed as it arrives, according to business logic defined with high-level authoring tools
- Stream output to apps and dashboards
- Range of built-in adapters for out-of-the-box connectivity
- Java, C++ and .NET APIs for custom integration
Continuous Computation Language (CCL)
- CCL is the primary method to interact with SAP Sybase ESP
- It is an extension to Structured Query Language (SQL)
- It adds keywords for defining and manipulating time windows and related operations
- CCL allows continuous processing of high volumes of streaming data

    Insert Into StreamSummary
    Select Max(Price) as High, Min(Price) as Low,
           First(Price) as Open, Last(Price) as Close
    From StreamFeed Keep Every 1 minute
    Group By Symbol
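For readers without access to ESP, the intent of the CCL query above can be approximated in plain Python. This is a rough analogue under stated assumptions, not CCL and not the ESP API: it reads a finite, time-ordered list of ticks instead of a live stream, it treats each one-minute bucket as its own window, and the function name, tick layout and sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Tick = Tuple[float, str, float]  # (timestamp in seconds, symbol, price)


def ohlc_per_minute(ticks: Iterable[Tick]) -> List[dict]:
    """Summarise time-ordered ticks into one row per (minute, symbol)."""
    bars: Dict[Tuple[int, str], dict] = {}
    for ts, symbol, price in ticks:
        key = (int(ts // 60), symbol)
        bar = bars.get(key)
        if bar is None:
            # First tick of this minute for this symbol opens the bar.
            bars[key] = {"Minute": key[0], "Symbol": symbol,
                         "Open": price, "High": price,
                         "Low": price, "Close": price}
        else:
            bar["High"] = max(bar["High"], price)
            bar["Low"] = min(bar["Low"], price)
            bar["Close"] = price  # last tick seen so far in the bar
    return [bars[key] for key in sorted(bars)]


# Hypothetical feed: three SAP ticks and one IBM tick in minute 0, one SAP tick in minute 1.
feed = [(0, "SAP", 10.0), (20, "SAP", 12.0), (30, "IBM", 5.0),
        (59, "SAP", 9.0), (61, "SAP", 11.0)]

for row in ohlc_per_minute(feed):
    print(row)
```

A stream processor would emit each summary row as its window closes rather than after the whole feed has been read; the grouping and aggregation logic is the part this sketch is meant to illustrate.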
Window-based Processing Patterns
CCL enables some powerful window-based data processing concepts beyond continuous metrics
- Occurrence detection
  - Detect 1 to N occurrences of a condition over a time period
  - Useful in fraud detection and intrusion detection
  - Example: detect excessive use of a smart cash card over a short period of time
- Absence detection
  - Test for the absence of a certain event over a given period
  - Useful in transportation and logistics scenarios
  - Example: match order, packing and shipping records over a set SLA period; the absence of an event triggers an alert
- Threshold crossing
  - Detect when a value crosses a predefined threshold
  - Supports up, down or dual-direction threshold violations
  - Multiple thresholds can be combined to create complex alarm conditions
  - Example: combine multiple thresholds such as wait time, drop rate and skill set to set off a critical alert to reallocate or call in additional agents
- Condition-based stream splitting
- State management
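The first three patterns above (occurrence detection, absence detection and threshold crossing) can be sketched in plain Python to make their logic concrete. These are simplified illustrations, not CCL: all function names, parameters and sample values are hypothetical, and timestamps are plain numbers in arbitrary units.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, List, Tuple


def detect_occurrences(timestamps: Iterable[float], n: int, period: float) -> Iterator[float]:
    """Yield each timestamp at which at least n events fall within the trailing period."""
    recent: Deque[float] = deque()
    for ts in timestamps:
        recent.append(ts)
        while recent[0] <= ts - period:
            recent.popleft()
        if len(recent) >= n:
            yield ts


def detect_absence(orders: Dict[str, float], shipments: Dict[str, float],
                   sla: float, now: float) -> List[str]:
    """Return ids of orders whose SLA deadline has passed without a timely shipment."""
    late = []
    for order_id, ordered_at in orders.items():
        deadline = ordered_at + sla
        shipped_at = shipments.get(order_id)
        if now >= deadline and (shipped_at is None or shipped_at > deadline):
            late.append(order_id)
    return sorted(late)


def detect_crossings(values: Iterable[float], threshold: float) -> Iterator[Tuple[int, str]]:
    """Yield (index, 'up' or 'down') whenever the series crosses the threshold."""
    previous = None
    for i, value in enumerate(values):
        if previous is not None:
            if previous <= threshold < value:
                yield i, "up"
            elif previous > threshold >= value:
                yield i, "down"
        previous = value


# Occurrence: three or more card uses within 60 time units.
print(list(detect_occurrences([0, 10, 20, 200, 210, 215, 218], n=3, period=60)))

# Absence: orders not shipped within an SLA of 60, evaluated at time 100.
print(detect_absence({"A": 0, "B": 5, "C": 50, "D": 10}, {"A": 20, "B": 70}, sla=60, now=100))

# Threshold crossing in both directions around a threshold of 5.
print(list(detect_crossings([3, 4, 6, 7, 5, 2, 8], threshold=5)))
```

Each function keeps only the state it needs (a short queue, a lookup table, or the previous value), which is the kind of window-scoped state that is awkward to maintain with periodic batch queries.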
Power Dashboards with New Insights
Create new real-time dashboards with real-time continuous metrics and alerts!
Real-time dashboards allow users to:
- Assess the current environment quickly
  - Provide a quick summary of the situation
  - Show only what's relevant and important for the job
- Comprehend the severity of the situation (or opportunity)
  - Show current information vs. projected or historic data
  - Reflect impact across activities or processes, or project status (red/yellow/green)
  - Show an appropriate time window and appropriate detail for what is being measured
- Act in time
  - Display prominent but relevant alerts
  - Point to specific actions
Putting It All Together
Demo
Some Questions Answered by Micro-trending Big Data
- Financial firm: I want to track the current value and net gain of all my positions, and monitor my aggregate exposures in real time
- E-commerce: I want to customize offers based on current behavior to improve conversion rates
- Telecom provider: I want to alert Customer Service when an individual customer has just experienced their 4th dropped call in 2 hours
- Healthcare: I want to be alerted when resources, both staff and equipment, are not available in the right place at the right time
Spot emerging threats or opportunities before it's too late
React to changing conditions sooner
Make decisions based on more timely information
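As one illustration, the financial question above amounts to keeping a small piece of state per position and updating it on every price tick. The sketch below is a simplified assumption of how that could look in plain Python; the class names, fields and sample holdings are hypothetical and do not come from any SAP product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Position:
    symbol: str
    quantity: float
    cost_basis: float        # total amount paid for the position
    last_price: float = 0.0  # most recent price seen on the stream

    @property
    def value(self) -> float:
        return self.quantity * self.last_price

    @property
    def net_gain(self) -> float:
        return self.value - self.cost_basis


class Portfolio:
    def __init__(self, positions: Iterable[Position]) -> None:
        self.positions: Dict[str, Position] = {p.symbol: p for p in positions}

    def on_tick(self, symbol: str, price: float) -> Dict[str, float]:
        """Apply one price tick and return the refreshed portfolio totals."""
        position = self.positions.get(symbol)
        if position is not None:  # ticks for symbols we do not hold are ignored
            position.last_price = price
        return self.snapshot()

    def snapshot(self) -> Dict[str, float]:
        held = self.positions.values()
        return {"value": sum(p.value for p in held),
                "net_gain": sum(p.net_gain for p in held)}


# Hypothetical holdings and a single incoming tick.
portfolio = Portfolio([Position("SAP", 100, 5000.0, 50.0),
                       Position("IBM", 10, 1500.0, 150.0)])
print(portfolio.on_tick("SAP", 55.0))
```

The totals are recomputed on every tick here for clarity; a production system would more likely adjust them incrementally.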
Relevance in Every Industry
- Financial: capital markets, banking fraud prevention
- Telecommunications: operations monitoring, mediation, proactive churn management
- Utilities: smart grid applications, demand management
- Retail / consumer products: real-time click stream analysis, customer sentiment analysis, supply chain management
- Hospitality / service: on-line gaming, customer experience and loyalty
- Healthcare: e-care, asset tracking
- Transportation: location-based monitoring, customer satisfaction / loyalty
- Public sector: situational awareness for public safety, homeland security
Key Takeaways
- CEP complements HANA and MapReduce in managing big data
- Real-time micro-trending of big data supports informed operational decision making
- CEP provides powerful data processing capabilities that are difficult to achieve with traditional databases
- Combining CEP and big data gives organizations a definite advantage
Questions?
Thank you for participating. Please provide feedback on this session by completing a short survey via the event mobile application. SESSION CODE: 0715 Learn more year-round at www.asug.com