1 Designing and exploiting a BIG DATA PLATFORM
[Figure labels: SQL, BIG DATA PLATFORM]
2 Purpose: Capturing all business data and powering smart applications that turn data into profits
[Figure labels: Big Data Platform, Smart Applications]
3 Smart Applications turn data into profits by delivering:
- the most relevant LOYALTY offers for EACH CLIENT
- the BEST PRICE for each product at all times
- the most engaging MOBILE CONTENT or COUPON for each client
- the best PRODUCT ASSORTMENT for each store all the time
- the right check-out COUPON for each client in real time
- the most RELEVANT 1-TO-1 COMMUNICATION at all times
- the right amount of INVENTORY in each store every day
- the most RELEVANT WEBSHOP CONTENT for each client
Smart apps deliver > 10 of profits for every 1 of cost
4 Why BIG DATA PLATFORM? SQL (status quo) vs. Big Data Platform
- Profitability: generate > 10 of profits for every 1 of cost
- Capacity: never lose or dump data again
- Speed: decrease latency from hours to seconds
- Cost: 1/100th of SQL's cost
- Capability: massively increase the depth of information captured
5 Acting as a bridge between Transactions and Decisions
Transactional world: data from Supply Chain, Marketing, Operations and all relevant data sources.
The bridge: unlimited capacity, event sourcing, agility & resilience, very low cost, ROI.
Decisional world: smart applications exploiting the data, making operational decisions, high degree of automation, high ROI, strong profitability.
6 Removing the performance limitations of SQL
- Unlimited storage: close to unlimited, affordable storage.
- True availability: stable and high performance no matter how much data is stored and how many applications are pulling data.
- Preserving the full history of your data: record every change to your data and enable much smarter decisions (event sourcing).
- High agility: rapid iterations over the data.
- Simplicity: data is documented and truly easy to consume.
- Serviceability: emphasis on self-service patterns while avoiding table clutter and cryptic fields.
- Rapid & simple implementation: scoping and implementation in weeks.
1/100th of the price of SQL storage at massively superior performance
7 Smart Apps take operational decisions and create ROI
The purpose of the Data Platform is to serve smart apps: applications that deliver intelligence over the data made available. Examples:
- Choosing the right amount of inventory for each reference in each store.
- Choosing the right price for each reference in each store.
- Choosing the right coupon for each loyalty card holder.
Each question is addressed by its own dedicated smart app. These apps can be developed internally or by 3rd parties.
The Data Platform is built to offer maximal agility in the process: plugging an app on top should be a matter of hours.
The goal of these smart apps is to directly generate operational decisions (e.g. adjusting a price, launching a re-order, choosing a coupon) that are then fed directly into the transactional system. Smart apps that operate over the Data Platform should not be confused with Business Intelligence.
8 All experts in the company are empowered to initiate and use smart apps
The number of people or teams involved with these smart apps can be very large:
- The Data Platform makes it very easy to consume clean and practical data.
- The performance of data access is both very stable and very high.
- Expertise that exists among all the teams of the retailer can be preserved and leveraged.
- The Data Platform is precisely designed not to be the bottleneck of individual data initiatives.
The whole setup emphasizes self-service. After the initial coaching, consumer teams are able to work autonomously (without the Data Platform team).
The platform will leverage the expertise of all experts in the company
9 The Pitfalls of classic data approaches
True granular intelligence requires the full depth and history of data for really smart decisions. Classic approaches in commerce fail mainly for 3 reasons:
- Lack of agility (speed): being data smart implies being able to iterate rapidly (weekly or even daily) over the data. Classic approaches are too slow: the world changes faster than results get delivered.
- Ongoing loss of information: a lot of data is lost in subtle yet critical ways due to the performance limits of classic solutions (e.g. historical inventory levels, sales history beyond x months, etc.).
- Lack of affordable scalability: classic solutions limit the amount of data that can be stored and processed, while being extremely expensive. Unlimited scalability should require neither expensive hardware nor expensive teams to run.
10 A Custom Design ensures 100% fit and max performance
Custom Design: why not use a big framework/application for all clients?
- 90% of the development effort is attributable to the creation of connectors to the various relevant data sources. This effort is always required.
- No packaged framework is ever a perfect fit for your business. The result is unnecessary complexity and an impedance mismatch: teams end up trying to recycle features that were never intended for that exact use.
- A customization from the bottom up increases performance and fit compared to a pre-packaged approach.
Example: by adopting a domain-specific data format, it is possible to store and process 1 year of checkout receipts for a network of 1000 stores on a smartphone. See
11 Event Sourcing (I): Capture Much More
SQL implies adopting tabular storage and its subtle yet very strong limitations. In SQL, only the present state of each data field is preserved. The history of each data point is lost.
Example, Stock-On-Hand History: in most ERPs, the SKU is associated with a table that matches SKU ID and SKU Stock-on-Hand. However, the history that has led to this stock-on-hand situation is lost. Many important questions therefore cannot be addressed:
- What was the stock on hand at any point in time?
- What was the list of stock corrections applied to the SKU over time?
- How many units have been discarded because they reached the expiration date?
- What was the true service level of a product over time?
When using Event Sourcing, all tables are replaced by a single (potentially very long) list of events. The whole path that has led to the present situation is recorded. Each event can truly capture the intent behind each data change.
Example, Price Change: instead of just entering a new price X for a SKU/location, it becomes possible to capture "Price moved to X because competitor just lowered to Y". This type of information is hugely valuable to build smart apps and create ROI.
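The stock-on-hand questions above can be illustrated with a minimal sketch. It assumes a hypothetical event shape (`StockEvent`, with invented field names and reasons) and is not the platform's actual format; it only shows that replaying an append-only list answers "what was on hand at time T" and "how many units expired".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class StockEvent:
    """One immutable entry in the event list (field names are hypothetical)."""
    at: datetime
    sku: str
    delta: int    # units added (+) or removed (-)
    reason: str   # the intent behind the change


def stock_on_hand(events: Iterable[StockEvent], sku: str,
                  as_of: Optional[datetime] = None) -> int:
    """Replay the events of one SKU up to `as_of` (inclusive)."""
    return sum(e.delta for e in events
               if e.sku == sku and (as_of is None or e.at <= as_of))


def units_by_reason(events: Iterable[StockEvent], sku: str, reason: str) -> int:
    """Total units moved for one SKU under a given reason."""
    return sum(abs(e.delta) for e in events
               if e.sku == sku and e.reason == reason)


# Illustrative data only.
log = [
    StockEvent(datetime(2013, 3, 1), "SKU-1", +100, "delivery"),
    StockEvent(datetime(2013, 3, 2), "SKU-1", -30, "sale"),
    StockEvent(datetime(2013, 3, 5), "SKU-1", -5, "expired"),
    StockEvent(datetime(2013, 3, 6), "SKU-1", +2, "stock correction"),
]

print(stock_on_hand(log, "SKU-1"))                        # current level
print(stock_on_hand(log, "SKU-1", datetime(2013, 3, 3)))  # level on 3 March
print(units_by_reason(log, "SKU-1", "expired"))           # discarded units
```

A table holding only the current stock-on-hand value could answer the first question but not the other two.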
12 ... at a fraction of the Latency and 1/100th of the price
SQL is by design very slow at querying large datasets. A relational database trying to address a SQL query in real time has no other option but to sequentially iterate over a massive chunk of data. It simply cannot be made fast.
The event sourcing approach consumes the events as they come in, so the result is always up to date and latency is extremely low. See the Loyalty data example at the end of the presentation.
Scalability & Cost: storing billions of events is simple, and extremely well suited to cloud storage. In practice, the monthly storage cost is approximately 1/100th of the cost of SQL.
Storage cost example, 1 TB of data: 100 /month instead of /month in SQL storage
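As a rough illustration of "consuming the events as they come in", the sketch below keeps a running result that is updated once per event, and compares it with a query that re-reads the full history. The event shape and the class name `SalesProjection` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

SaleEvent = Tuple[str, int]  # (sku, units sold); a hypothetical event shape


class SalesProjection:
    """Running units-sold per SKU, updated once per incoming event."""

    def __init__(self) -> None:
        self.units: Dict[str, int] = defaultdict(int)

    def apply(self, event: SaleEvent) -> None:
        sku, qty = event
        self.units[sku] += qty          # constant work per event

    def read(self, sku: str) -> int:
        return self.units.get(sku, 0)   # no scan at query time


def scan_total(history: Iterable[SaleEvent], sku: str) -> int:
    """Baseline for comparison: re-read the whole history on every query."""
    return sum(qty for s, qty in history if s == sku)


history = [("A", 2), ("B", 1), ("A", 3)]
projection = SalesProjection()
for event in history:
    projection.apply(event)

# Both approaches agree; only the cost at query time differs.
assert projection.read("A") == scan_total(history, "A") == 5
```

The read path does a single dictionary lookup regardless of how many events have been stored, whereas the scan grows with the history.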
13 Cloud Computing is a Must-Have for the Data Platform
Unprecedented TCO
Public clouds (Windows Azure in particular) offer an unprecedented total cost of ownership to access computing resources. Ballpark: 100 /month per TB of storage. (Internal initiatives cannot come close to the economies of scale of a public cloud.)
Availability and auto-scaling
The cloud offers two properties that are simply invaluable for a Data Platform:
1. Auto-scaling: the infrastructure will adapt (*) to the workload pressure. Performance is always exactly the same (no matter whether it is the first day of the month or the middle of the night).
2. High availability and fault-tolerance: the cloud allocates computing resources and abstracts away hardware failures. No need to worry any more about failing hard drives and the myriad of similar hardware glitches that happen all the time.
* However, auto-scaling works only if the software architecture has been natively designed for the cloud. Achieving this is exactly one of the aspects covered by Lokad.
While the Data Platform is custom software, we strongly suggest adopting a public cloud, as it is an essential ingredient both to massively lower the costs and to massively increase the agility of the project.
14 Exposing Smart Results, not Business Intelligence
Automation is a requirement for cost efficiency & scalability
When numbers are read by people, they are very expensive. In retail, any software that produces numbers that are expected to be read by people is fundamentally non-scalable. The goal of the smart apps is to generate operational decisions (e.g. moving a price down) that are fed directly into the transactional systems. Smart apps that operate over the Data Platform should not be confused with Business Intelligence.
The reliability of smart app outputs is ensured by the Data Platform
ERP systems are likely to expect very reliable data sources. The Data Platform offers a way to collect the results from the smart apps and to expose them. This introduces reliability even when the underlying smart app's analysis is not reliable. By doing so, the Data Platform makes those results suitable for production use through the existing transactional systems.
The Data Platform serves as a dedicated abstraction layer that helps retail experts focus on their core domain instead of IT technicalities.
15 API: We favor REST
The projections are made accessible through an API (application programming interface). We favor one very specific flavor of API: the REST API. There are many practical benefits of having APIs:
- The technology behind the API (i.e. the Data Platform itself) can be radically different from the technology powering the smart apps. Retail is vast; "one size fits all" is not a reasonable position for any relatively large retailer.
- It creates overall access patterns that are much easier to document and much easier to consume.
- It allows very specific access rules to be tuned as needed, which can be extremely valuable when plugging 3rd party companies into the Data Platform.
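To make the idea concrete, here is a minimal sketch of projections exposed over a REST-style endpoint with per-key access rules, using only Python's standard library. The URL layout (`/projections/<family>/<id>`), the `X-Api-Key` header, the keys and the data are all assumptions for illustration, not the platform's actual API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Optional, Set, Tuple

# Hypothetical projection data and access rules, for illustration only.
PROJECTIONS: Dict[str, dict] = {
    "loyalty/card-42": {"purchases_3y": 17, "avg_basket": 31.5},
    "inventory/sku-1": {"on_hand": 67},
}
ACCESS_RULES: Dict[str, Set[str]] = {
    "internal-key": {"loyalty", "inventory"},
    "partner-key": {"inventory"},   # a 3rd party sees inventory only
}


def handle_get(path: str, api_key: Optional[str]) -> Tuple[int, dict]:
    """Route GET /projections/<family>/<id>; return (HTTP status, JSON body)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "projections":
        return 404, {"error": "unknown resource"}
    family, key = parts[1], f"{parts[1]}/{parts[2]}"
    if family not in ACCESS_RULES.get(api_key or "", set()):
        return 403, {"error": "access denied"}
    if key not in PROJECTIONS:
        return 404, {"error": "not found"}
    return 200, PROJECTIONS[key]


class ProjectionHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around handle_get."""

    def do_GET(self) -> None:
        status, body = handle_get(self.path, self.headers.get("X-Api-Key"))
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8080) -> HTTPServer:
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), ProjectionHandler)
```

With the server running, a smart app written in any technology could issue `GET /projections/inventory/sku-1` with its key and receive JSON, without knowing how the projection is stored or computed.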
16 Application Example: Loyalty Card Data Storage
Challenge: maintaining a projection of all loyalty card holders with half a dozen dimensions such as:
- The number of purchases in the last 3 years
- The average basket size
- Demographics
SQL: the above projection is quasi impossible to run as a SQL query. The query has to iterate over every single transaction of the last 3 years, which proves extremely time intensive.
Cloud Data Platform: retrieving such a projection can be done in seconds. (*)
* A possible work-around is a SQL table dedicated to the client profiles and nightly batches that update this table with the data of the day. However:
- It is convoluted and requires a database expert to devise a strategy to deal with the problem.
- It creates confusion between tables containing input data and intermediate computations.
- It creates data duplication and amplifies the overall scalability problems.
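A minimal sketch of such a loyalty projection follows, assuming receipts arrive as events in chronological order. The class and function names are hypothetical, the 3-year window is approximated as 3 × 365 days, and demographics are left out.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Deque, Dict, Tuple

WINDOW = timedelta(days=3 * 365)   # "last 3 years", approximated


@dataclass
class CardProfile:
    """Per-card running state (a hypothetical structure for this sketch)."""
    receipts: Deque[Tuple[datetime, float]] = field(default_factory=deque)
    total: float = 0.0

    def add(self, at: datetime, amount: float) -> None:
        self.receipts.append((at, amount))
        self.total += amount

    def expire(self, now: datetime) -> None:
        """Drop receipts that have fallen out of the window."""
        while self.receipts and self.receipts[0][0] < now - WINDOW:
            _, amount = self.receipts.popleft()
            self.total -= amount

    def snapshot(self, now: datetime) -> dict:
        self.expire(now)
        n = len(self.receipts)
        return {"purchases_3y": n,
                "avg_basket": self.total / n if n else 0.0}


profiles: Dict[str, CardProfile] = {}


def on_receipt(card_id: str, at: datetime, amount: float) -> None:
    """Called once per checkout event, in chronological order."""
    profiles.setdefault(card_id, CardProfile()).add(at, amount)


# Illustrative data only.
on_receipt("card-42", datetime(2010, 1, 5), 80.0)   # older than the window
on_receipt("card-42", datetime(2012, 6, 1), 20.0)
on_receipt("card-42", datetime(2013, 2, 1), 40.0)

print(profiles["card-42"].snapshot(datetime(2013, 6, 1)))
# {'purchases_3y': 2, 'avg_basket': 30.0}
```

Reading a profile touches only that card's state, so the cost of a lookup does not depend on the total number of transactions stored.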
17 Application Example: Inventory Tracking
Status Quo: in SQL, only the present state of each data field is preserved. The history of each data point is lost.
The problem: knowing what was on hand at any point in time is very valuable for many decisions and smart analytics, e.g.:
- Correction of electronic inventory records: increased accuracy
- Out-of-shelf monitoring: increased accuracy
- Inventory optimization: measure true availability and performance
SQL: SQL storage does not allow preserving the history of on-hand inventory levels over time.
Cloud Data Platform: the full history of on-hand levels is available. It is used for out-of-shelf monitoring, for the automatic correction of records and for tracking inventory performance.
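As a sketch of what the preserved history enables, the code below derives the on-hand level over time from stock movements and measures the share of a period during which the item was in stock. The data shape and function names are assumptions for this example, and "availability" here is a deliberately simple definition (at least one unit on hand).

```python
from datetime import datetime
from typing import List, Tuple

LevelChange = Tuple[datetime, int]   # (timestamp, change in on-hand units)


def on_hand_history(changes: List[LevelChange]) -> List[Tuple[datetime, int]]:
    """Turn stock movements into the on-hand level after each movement."""
    level, history = 0, []
    for at, delta in sorted(changes):
        level += delta
        history.append((at, level))
    return history


def availability(changes: List[LevelChange],
                 start: datetime, end: datetime) -> float:
    """Share of [start, end) during which at least one unit was on hand."""
    in_stock = 0.0
    level, cursor = 0, start
    for at, new_level in on_hand_history(changes):
        if at >= end:
            break
        if at > cursor:
            if level > 0:
                in_stock += (at - cursor).total_seconds()
            cursor = at
        level = new_level
    if level > 0:
        in_stock += (end - cursor).total_seconds()
    return in_stock / (end - start).total_seconds()


# Illustrative data: sold out on 4 March, replenished on 6 March.
movements = [
    (datetime(2013, 3, 1), +10),
    (datetime(2013, 3, 4), -10),
    (datetime(2013, 3, 6), +5),
]
print(availability(movements, datetime(2013, 3, 1), datetime(2013, 3, 11)))
```

In this example the item is in stock for 8 of the 10 days, a figure that cannot be recovered from a table holding only the current level.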
18 Application Example: Receipt and Coupon Storage
Status Quo: limitations on storage capacity cause the loss of data history. Limitations on latency make reasonable queries impossible. Examples:
- Receipts can only be recorded for a few months; history is truncated
- Querying receipts is time consuming, expensive and limited
- Loyalty coupons cannot be accessed or created in real time
Cloud Data Platform: the platform allows the efficient storage of all data and full history, including events. The data is accessible from all parts of the company, even mobile devices. Low latency makes the data usable for smart apps that provide value to staff and customers.
19 Application Example: eCommerce Data Hub
Status Quo: eCommerce managers know the value of data for their business. Massive amounts of data are generated each day. Personalization is the next challenge. However, ambition is far ahead of the status quo.
- Storage capacity: large amounts of data generated by web analytics and operations
- Latency: near real-time access often required
- Accessibility: smart exploration (smart applications)
Cloud Data Platform: the platform allows the efficient storage of all data and full history, including events from all relevant sources. The data is accessible from all parts of the company and all applications. Low latency provides real-time capabilities. Smart applications on top increase profitability. Examples:
- More relevant product suggestions / more relevant personalization of the webshop
- The best price for each product at any point in time
- The most efficient inventory optimization for minimal operating cost and highest availability
20 Required Investment < 100k
Lokad can help you roll out your own BIG DATA PLATFORM, and plug in and/or build smart apps that generate direct ROI.
50k for a scoping mission
- Scope the usages that would benefit from the Data Platform.
- Clarify the vision about the data: how it should be collected, structured and exposed.
- Set up the proper collaborative tools, development tools and processes to carry on with this Data Platform initiative.
- If required: help hire the 1 or 2 developers that will be needed internally to run the Data Platform.
20k for drafting a prototype Data Platform
- Goal: set up a minimal project that could be extended to your technical teams, with the architecture and design patterns in place to kickstart the project.
- Plug in two identified data sources.
- Set up an event storage over Windows Azure.
- Set up sample projections.
- Set up sample APIs.
20k for drafting a prototype smart app
- Goal: illustrate that ROI can be generated through the Data Platform.
- Devise a statistical analysis (prototype) to address an existing problem for the retailer.
21 Thank you!
Contact: Joannes Vermorel, Founder; Matthias Steinberg, CEO
Head Office: 10 rue P. de Champaigne, 75013 Paris, France
German Office: Wöhlertstrass 12/ Berlin Germany