2 Using Drools and other JBoss projects to build real time mobile app performance analytics Prabhat
3 Who is Prabhat Jha? Loves to Build Stuff of value. Co-founder of InstaOps Director of Software Development at Apigee Member of JBoss Data Grid (Infinispan), JBoss EPP, Richfaces team. Founder of Eejot, a non profit to help improve computer literacy in LinkedIn: prabhatkjha
4 To do List: A brief history of InstaOps Quick Demo Different technology stacks used to build InstaOps Architecture Evolution good, bad and ugly J Lessons
5 InstaOps A software start up from Austin, Texas Founders Alan Ho and Prabhat Jha Bootstrapped Goal : Help developers build better mobile apps based on experiences from building better web
6 Factors Affecting App Perf Device Platform Device Model Network Carrier Network Type Location App Store Policy
8 Why it s not an easy problem? Multi-tenant App with handful of users to millions of users. App with average usage less than a minute to few minutes to hour Flaky network connection Battery Optimization Limited Data Plan Not chronological data Some data could arrive after few weeks. SDK needs to be fail safe
12 Got the Min Viable Product (MVP) SDK sending data in real time. Visualization Closing the loop
13 Hibernate hibernate.hbm2ddl.auto Batch insert Flexibility to switch between HQL and native SQL ResultTransformer, aliastobean Problem with Projection and where clause (i.e this.blah) Explicitly identifying attributes type (.addscalar..) Enum little bit of problem
15 Seam Better support for conditional page flow Helps tie Front End and Back End together Custom Authentication, Custom Authorization, SSO, User Sign up Customization
16 RDBMS (MySQL) Familiar ORM Tooling Cloud Easier Custom Report FB, Twitter etc used and still use. Easier White labeling of the product
17 Performance & Availability Loss of data when REST server is down. Problem with high volume apps. Aggregation for UI/Visualization painfully slow.
18 Introduce Queue Amazon SQS High availability Not expensive Read lock Fine grain access control From REST to Polling SQS Buffer hence increases availability
19 Batching and CEP Batching & Read Lock from SQS Aggregation using stateful CEP with Drools Local aggregation Optimal active sessions, session length calculation Less stress on database DB stores local aggregates
20 Problems with Stateful CEP - Data Loss - OOM - Need Clustering & Synchronization for auto scale in and out - Not mature clustering/replication tech for public cloud - Lacking tools & resources to automate
21 Stateless CEP Local Aggregation using stateless CEP using Drools High volume app does not affect other apps Some Database overhead Some action moved to SDK such as new session detection Amazon Beanstalk Auto Scaling
22 Final Architecture
23 Drools Code Snippet for Carrier Performance rule "NetworkMetricsCarrierRule" no-loop true when $nm : ClientNetworkMetrics( $networkcarrier : networkcarrier ) from entry-point "clientnetworkmetricsstream" not (exists NetworkMetricsCarrier(type == $networkcarrier)) then insert( new NetworkMetricsCarrier($networkCarrier)); end
24 Drools Code Snippet for Local Aggregation rule "CompactNetworkMetricCarrierGenerator" when $networkcarrier : NetworkMetricsCarrier() $nm : ClientNetworkMetrics(endMinute!= null, networkcarrier == $networkcarrier.type) from entry-point "clientnetworkmetricsstream" not (exists (CompactNetworkMetrics(appId == $nm.appid,endminute == $nm.endminute, networkcarrier ==$networkcarrier.type) from entry-point "compactnetworkmetricsstream_bycarrier")) then log.info("detected new ClientNetworkMetrics to trigger new local aggregation
25 Drools Code Snippet for Local Aggregation $metric : CompactNetworkMetrics() from entry-point "compactnetworkmetricsstream_bycarrier" $sumnumsamples : Number() from accumulate (ClientNetworkMetrics(appId == $metric.appid, endminute == $metric.endminute, networkcarrier == $metric.networkcarrier, $numsamples : numsamples ) from entry-point "clientnetworkmetricsstream",
26 Send Alerts rule "SimpleAlarm" salience -600 no-loop true when AppErrorsAppIds($appId : type); $assertlogs : ArrayList(size > 0) from collect (ClientLog(logLevel == "A", appid == $appid) from entry-point "clientlogstream") then log.info("detected critical errors for " + $appid); alarmservice.sendcriticalerrors($assertlogs, $appid); end
27 Denormalized Data No Joins Optimized for Insert Not really schema less but sort of helps towards that
28 Minimal Indexing Optimized for Insert
29 Data Pruning
30 Problems with Current Architecture Near real time about 2 min delay. New aggregation criteria requires new everything Cloud based only. Wasted cycles in polling Data Pruning is expensive. n entries in DB for local aggregation where n is number of injestor machines.
31 FAQs Why no NO-SQL databases?
32 Lessons Learned Tools/Services used properly can solve new and unknown problem space Don t boil the ocean. Cloud is awesome!!
33 Lessons Learned Developers have difficulty making up their minds and like stuff free. J Early to market may be. Multi-tenancy is hydra with new tentacles popping up all the time. Big data solution does not have to be based on NoSQL, Hadoop and other buzz words.
34 Give it a shot!! It s
35 Future Changes Full Text Search using Elastic Search or Solr or LogStash (Still evaluating) Use No-SQL likely Cassandra for Raw
White Paper Cloud-Based SCADA Systems: The Benefits & Risks Is Moving Your SCADA System to the Cloud Right For Your Company? White Paper Is Moving Your SCADA System to the Cloud Right for Your Company?
MASARYK UNIVERSITY FACULTY OF INFORMATICS Best Practices in Scalable Web Development MASTER THESIS Martin Novák May, 2014 Brno, Czech Republic Declaration Hereby I declare that this paper is my original
Cost aware real time big data processing in Cloud Environments By Cristian Montero Under the supervision of Professor Rajkumar Buyya and Dr. Amir Vahid A minor project thesis submitted in partial fulfilment
For Big Data Analytics There s No Such Thing as Too Big The Compelling Economics and Technology of Big Data Computing March 2012 By: 4syth.com Emerging big data thought leaders Forsyth Communications 2012.
A Fresh Graduate s Guide to Software Development Tools and Technologies Chapter 1 Cloud Computing CHAPTER AUTHORS Wong Tsz Lai Hoang Trancong Steven Goh PREVIOUS CONTRIBUTORS: Boa Ho Man; Goh Hao Yu Gerald;
WHITE PAPER 1ntroduction... 2 Zenoss Enterprise: Functional Overview... 3 Zenoss Architecture: Four Tiers, Model-Driven... 6 Issues in Today s Dynamic Datacenters... 12 Summary: Five Ways Zenoss Enterprise
Getting Started with New Relic: A Newbie s Table of Contents INTRODUCTION: Hello There, Newbie CHAPTER 1: Application Monitoring Overview CHAPTER 2: Real User Monitoring (RUM) CHAPTER 3: Transaction Traces
OPEN DATA CENTER ALLIANCE : sm Big Data Consumer Guide SM Table of Contents Legal Notice...3 Executive Summary...4 Introduction...5 Objective...5 Big Data 101...5 Defining Big Data...5 Big Data Evolution...7
Customer Cloud Architecture for Big Data and Analytics Executive Overview Using analytics reveals patterns, trends and associations in data that help an organization understand the behavior of the people
The Top 5 AWS EC2 Performance Problems How to detect them, why they occur and how to resolve them Alexis Lê-Quôc, CTO, Datadog Mike Fiedler, Director of Technical Operations, Datadog Carlo Cabanilla, Senior
Jazz Performance Monitoring Guide Author: Daniel Toczala, Jazz Jumpstart Manager The goal of this performance monitoring guide is to provide the administrators and architects responsible for the implementation
How AWS Pricing Works May 2015 (Please consult http://aws.amazon.com/whitepapers/ for the latest version of this paper) Page 1 of 15 Table of Contents Table of Contents... 2 Abstract... 3 Introduction...
Database Performance Study By Francois Charvet Ashish Pande Please do not copy, reproduce or distribute without the explicit consent of the authors. Table of Contents Part A: Findings and Conclusions p3
3 Big Data: Challenges and Opportunities Roberto V. Zicari Contents Introduction... 104 The Story as it is Told from the Business Perspective... 104 The Story as it is Told from the Technology Perspective...
Top SQL Server Issues ABSTRACT: By Andy McDermid September 2014 In a complex database environment, keeping tabs on the health and stability of each system is critical to ensure data availability, accessibility,
An Oracle White Paper March 2013 Big Data Analytics Advanced Analytics in Oracle Database Advanced Analytics in Oracle Database Disclaimer The following is intended to outline our general product direction.
Guide to Selecting a New IP Business Phone System A guide to identifying, selecting, purchasing and installing a new IP business phone system. By Trevor Jones, Director of Marketing & Product Development,
Securing Enterprise Applications Version 1.1 Updated: November 20, 2014 Securosis, L.L.C. 515 E. Carefree Highway Suite #766 Phoenix, AZ 85085 T 602-412-3051 firstname.lastname@example.org www.securosis.com Author
About this guide Deep Security provides a single platform for server security to protect physical, virtual, and cloud servers as well as hypervisors and virtual desktops. Tightly integrated modules easily
Page 1 of 10 Making Optimal Use of JMX in Custom Application Monitoring Systems With any new technology, best practice documents are invaluable in helping developers avoid common errors and design quality
On Designing and Deploying Internet-Scale Services James Hamilton Windows Live Services Platform ABSTRACT The system-to-administrator ratio is commonly used as a rough metric to understand administrative
Securing Traditional and Cloud-Based Datacenters With Next-generation Firewalls February 2015 Table of Contents Executive Summary 3 Changing datacenter characteristics 4 Cloud computing depends on virtualization
GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and
Best Practices in Deploying and Managing DataStax Enterprise 1 Table of Contents Table of Contents... 2 Abstract... 3 DataStax Enterprise: The Fastest, Most Scalable Distributed Database Technology for
Web Scale IT in the Enterprise It all starts with the data Issue 1 2 Q&A With Claus Moldt, Former Global CIO for SalesForce.com and David Roth, CEO of AppFirst 6 From the Gartner Files: Building a Modern
DATAMEER USE CASES EBOOK Top Five High-Impact Use Cases for Big Data Analytics You ve been collecting data for years. Learn how to use it to grow your business and gain a competitive edge. INTRODUCTION
Application Performance Management for Enterprise Applications White Paper from ManageEngine Web: Email: email@example.com Table of Contents 1. Introduction 2. Types of applications used