NeuroBayes: Big Data Predictive Analytics for High Energy Physics & "Real Life"
Prof. Dr. Michael Feindt, Karlsruhe Institute of Technology
Founder & Chief Scientific Advisor, Blue Yonder GmbH & Co KG
Blue Yonder provides predictions based on data: scientifically sound, with quantified uncertainty, testable and falsifiable.
Data Mining and conventional Business Intelligence: WHAT HAS HAPPENED? WHY? (DATA → PATTERN)
Predictive Analytics: WHAT WILL HAPPEN? (PATTERN → PREDICTIONS)
Influence on business performance
Blue Yonder is unique: a Predictive Analytics Suite combining statistical algorithms and neural networks.
Experienced, award-winning physicists and computer scientists from renowned institutions such as CERN make up the development team: more than 70 Ph.D.s.
[Figure: probability density p of sales, with the expected value E(X) and the optimal order quantity marked]
General Overview
NeuroBayes application areas: Fundamental Research, Insurance Premium Optimization, Sales Forecasts, Sports event predictions, Fraud Detection, Patient treatment optimization, ...
Our roots: 30 years of elementary particle physics, peaking at the LHC at CERN. Built to understand how exactly our universe works.
Photos: CERN (1960), CERN (2005). LHC: 27 km circumference. Photo: CERN
Our Background: High Energy Physics
Fundamental research at the forefront of science
The Large Hadron Collider: 100 m underground, 27 km circumference
Very strong worldwide competition on getting results from data requires very strong statistical methods: fast & robust multivariate algorithms
Photo: CERN
NeuroBayes
Fundamental research at the forefront of science
[Diagram: Input, Preprocessing, Postprocessing, Output, with significance control]
Invented in 2000 for the reconstruction of b quark fragmentation in the DELPHI experiment. Further development at Phi-T, later Blue Yonder. Several hundred successful applications in the DELPHI, CDF II, Belle, CMS, ATLAS, LHCb, H1 and AMS experiments. More than 400 person-years of development.
Robust and fast algorithm for the reconstruction (= prediction) of conditional probability densities and for classification, with extreme generalization ability by means of Bayesian regularization.
NeuroBayes example: The LHCb trigger, very fast intelligent decisions with NeuroBayes
At the LHC (CERN), per experiment: 40,000,000 events per second, which translates into 1 petabyte (1,000,000,000,000,000 bytes) of raw data per second. But only 1 PB of interesting data per year can be stored, so an online reduction by 1 : 10,000,000 is needed.
At the LHCb experiment, 30,000 instances of NeuroBayes running in real time 24/7 filter out the interesting events without introducing lifetime bias.
Photo: CERN
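The reduction factor quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: the two data rates come from the slide, while the live time of about 10^7 seconds per year is an assumption added here for illustration and is not stated in the talk.

```python
# Rough consistency check of the reduction factor quoted above.
RAW_PB_PER_SECOND = 1.0        # from the slide: about 1 PB/s of raw data
STORABLE_PB_PER_YEAR = 1.0     # from the slide: about 1 PB/year can be stored

SECONDS_PER_CALENDAR_YEAR = 365 * 24 * 3600
ASSUMED_LIVE_SECONDS_PER_YEAR = 1.0e7   # assumption, not from the slide


def reduction_factor(live_seconds_per_year):
    """Ratio of raw data produced per year to data that can be stored."""
    raw_pb_per_year = RAW_PB_PER_SECOND * live_seconds_per_year
    return raw_pb_per_year / STORABLE_PB_PER_YEAR


reduction_continuous = reduction_factor(SECONDS_PER_CALENDAR_YEAR)
reduction_live = reduction_factor(ASSUMED_LIVE_SECONDS_PER_YEAR)

print(f"continuous running: 1 : {reduction_continuous:.2e}")
print(f"assumed live time:  1 : {reduction_live:.2e}")
```

Either way the result is of the order of ten million, in line with the figure on the slide.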
NeuroBayes example: Full reconstruction of B mesons at the Japanese B factory experiment Belle
Fundamental research at the forefront of science
» Belle experiment at KEK, Japan
» 400 physicists from all over the world
» 10 years of data taking and analysis
» World record luminosity
» More than 100 publications
» Automatic hierarchical reconstruction system built from 72 NeuroBayes networks reconstructed about 1100 different reactions with a factor 2 larger efficiency than all previous analyses
» Much cleaner signal
» Work performed by 2 PhD students and 1 master student
» Corresponds to 500 normal PhD theses
» Corresponds to another 10 years of data taking
Future: intelligent decisions directly on the sensor (Belle II pixel detector), before big data reaches any computer
Goal: find (classify) all relevant pixel information
Big Data challenge: process approx. 10 Gbit/s
Solution: NeuroBayes on hardware
NeuroBayes @ hardware*: 200 million decisions per second → 5 ns for one decision
Belle II experiment: uses 40 boards → 8 billion decisions per second
*Features a dedicated hardware board: NeuroBayes on FPGA (Field Programmable Gate Array, XILINX Virtex6 VLX75T)
Clock frequency: 250 MHz
Approx. 1 decision per clock cycle (fully pipelined architecture)
Probability decision output possible
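The throughput figures above fit together arithmetically, as the following small sketch shows. All input numbers are taken from the slide; the variable names are chosen here for illustration only.

```python
# Arithmetic behind the hardware throughput figures quoted above.
DECISIONS_PER_SECOND_PER_BOARD = 200e6   # from the slide
BOARDS = 40                              # from the slide
CLOCK_HZ = 250e6                         # from the slide

time_per_decision_ns = 1e9 / DECISIONS_PER_SECOND_PER_BOARD
total_decisions_per_second = DECISIONS_PER_SECOND_PER_BOARD * BOARDS
clock_period_ns = 1e9 / CLOCK_HZ
decisions_per_clock_cycle = DECISIONS_PER_SECOND_PER_BOARD / CLOCK_HZ

print(f"time per decision:   {time_per_decision_ns:.1f} ns")
print(f"all boards together: {total_decisions_per_second:.1e} decisions/s")
print(f"clock period:        {clock_period_ns:.1f} ns")
print(f"decisions per cycle: {decisions_per_clock_cycle:.2f}")
```

At 250 MHz one clock cycle lasts 4 ns, so 200 million decisions per second corresponds to 0.8 decisions per cycle, which matches the slide's "approx. 1 decision per clock cycle".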
NeuroBayes from Science to Industry
Predictive Analytics in High Energy Physics
[Figure: particle properties as inputs (energy, momentum, direction, type, sub-detector, distance, ...), e.g. Kaon, Calo; output is a probability density p with expected value E(X)]
Use all available and relevant information as input, e.g. measurements from the various sub-detectors. NeuroBayes will extract statistically significant patterns in the data to derive the prediction. The prediction returns the best estimator for a measurement, including a statistically sound estimate of the expected spread.
NeuroBayes from Science to Industry
Predictive Analytics in industry
[Figure: article properties as inputs, e.g. in retail: article size, picture size, colour, brand, previous sales, sales price, ...; output is a probability density p of the predicted sales with expected value E(X)]
Use all available and relevant information as input, e.g. article properties, previous sales, etc. NeuroBayes will extract statistically significant patterns in the data to derive the prediction. The prediction returns e.g. the most probable sales rate, including a statistically sound estimate of the expected spread.
NeuroBayes allows data-driven analysis and forecasts both in science and industry.
We are familiar with your questions
BIG DATA: Each day, companies receive giant quantities of structured and unstructured data from different sources. Growth of the data volume: +40% (McKinsey).
PREDICTIVE ANALYTICS: Pattern recognition in data for the prediction of risks and developments, to provide a forward-looking basis for decision-making. Increase of operating margins: +60% (McKinsey).
We are the pioneers SOCIAL MEDIA PRODUCT RECOMMENDATION RETARGETING PREDICTIVE MAINTENANCE AREAS OF APPLICATION PREDICTION OF RISKS SALES PREDICTIONS DYNAMIC PRICING CUSTOMER ANALYSIS MERCHANDISE PLANNING
Technology Overview
[Diagram components: OPERATIONAL SYSTEM (environment); WEB UI & SERVICES; NEUROBAYES SYSTEM with High Performance Data Transformation, Trainer, Expert, Predictions and an industry-specific Simulator; inputs: STREAM-BASED INPUT, BATCH INPUT, HISTORICAL TRANSACTION DATA; EXPERTISE (industry-specific models); outputs: industry-specific OUTPUT, REAL-TIME FEED]
NeuroBayes - a neural network of the 2nd generation and much more
NeuroBayes is an algorithm capable of forecasting whole probability density functions.
[Figure: forecast for an individual patient compared with the average patient]
NeuroBayes System: Working principle
Training: Historic Data Record (a = ..., b = ..., c = ..., ..., t = !) → NeuroBayes Teacher → Expertise (Expert System)
Prediction: Current Data Record (a = ..., b = ..., c = ..., ..., t = ?) → NeuroBayes Expert → Probability density function for target quantity t
Forecasts conditional on many features
The forecast depends on several features, each of which can take several values.
Features may be arbitrarily correlated.
Features can be ordered or unordered sets, or continuous variables.
This results in a complex, high-dimensional space which is impossible to treat with classical methods.
Example: What's the right dose for a patient if she is 56 years old, slightly overweight, works out 2 days a week, enjoys late dinners, has already been treated for 2 other diseases, etc.?
Preprocessing
Preprocessing is an extremely important step before training a network. It involves steps like:
» smoothing out statistical fluctuations and outliers in the input variables
» transforming variables to unified characteristics (mean, width)
» decorrelating variables
» finding variables with significant impact and throwing out the others
The benefits of a powerful preprocessing algorithm include:
» increased robustness
» improved network training results (minima are easier to find)
» increased training speed
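The preprocessing steps listed above can be illustrated with generic textbook techniques. The sketch below is not the NeuroBayes preprocessing, whose details are not given in the talk. It assumes quantile clipping for outliers, standardization to zero mean and unit width, an eigen-decomposition of the covariance matrix for decorrelation, and a simple correlation threshold for variable selection. The function name `preprocess` and its parameters are hypothetical.

```python
import numpy as np


def preprocess(X, y, clip_quantiles=(0.01, 0.99), min_abs_corr=0.05, eps=1e-12):
    """Generic preprocessing sketch (assumed techniques, not NeuroBayes itself)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # 1. Tame outliers: clip every input variable to its own quantile range.
    low, high = np.quantile(X, clip_quantiles, axis=0)
    X = np.clip(X, low, high)

    # 2. Unify mean and width: zero mean and unit standard deviation.
    #    (Assumes no input variable is constant.)
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    # 3. Decorrelate: rotate into the eigenbasis of the covariance matrix
    #    and rescale each component to unit variance.
    eigval, eigvec = np.linalg.eigh(np.atleast_2d(np.cov(X, rowvar=False)))
    Z = (X @ eigvec) / np.sqrt(eigval + eps)

    # 4. Keep only components with a noticeable linear correlation to the target.
    corr = np.array([np.corrcoef(Z[:, j], y)[0, 1] for j in range(Z.shape[1])])
    keep = np.abs(corr) >= min_abs_corr
    return Z[:, keep], keep


# Toy data: two correlated informative inputs and one pure-noise input.
rng = np.random.default_rng(0)
n = 5000
a = rng.normal(size=n)
b = 0.8 * a + 0.6 * rng.normal(size=n)
c = rng.normal(size=n)
target = (a + 0.5 * rng.normal(size=n) > 0).astype(float)

Z, keep = preprocess(np.column_stack([a, b, c]), target, min_abs_corr=0.1)
print("components kept:", int(keep.sum()), "of", keep.size)
```

After the rotation the kept components are uncorrelated with unit variance, which gives the subsequent training a better conditioned problem to work on.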
Training a network
The following process describes how a neural net is trained:
1. Start with initial values for the weights
2. Measure the number of correct forecasts
3. Vary the weights in order to increase the number of correct forecasts
4. After some loops, an optimal set of weights will (hopefully) be found
5. Save the weights and the topology => This is your expertise (!)
This method corresponds to an optimization in a high-dimensional and complex space, an extremely hard task (!)
NB: Finding the global maximum can never be guaranteed.
NB: This is why the preprocessing is so important: it creates a smooth surface.
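The five training steps can be sketched for the simplest possible network, a single sigmoid neuron. This is a minimal illustration under stated assumptions, not the NeuroBayes training procedure: forecast quality is measured with the cross-entropy loss rather than a count of correct forecasts, and the weights are varied by plain gradient descent. The function name `train_single_neuron` is hypothetical.

```python
import numpy as np


def train_single_neuron(X, y, learning_rate=0.5, n_loops=500, seed=0):
    """Train one sigmoid neuron by gradient descent (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # 1. Start with initial values.
    w = rng.normal(scale=0.1, size=n_features)
    b = 0.0

    # 4. Repeat for a fixed number of loops.
    for _ in range(n_loops):
        # 2. Measure the forecasts: predicted probabilities for the known targets.
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))

        # 3. Vary the weights so that the forecasts improve
        #    (step along the negative gradient of the cross-entropy loss).
        w -= learning_rate * (X.T @ (p - y)) / n_samples
        b -= learning_rate * np.mean(p - y)

    # 5. The returned weights are the "expertise" to be saved.
    return w, b


# Toy data: the target is true when the sum of both inputs is positive.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(400, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(float)

weights, bias = train_single_neuron(X_train, y_train)
accuracy = np.mean(((X_train @ weights + bias) > 0) == (y_train == 1))
print(f"fraction of correct forecasts: {accuracy:.3f}")
```

A single neuron has a convex loss surface, so this toy case avoids the local-optimum problem mentioned in the notes; networks with hidden layers do not.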
Illustration of the optimization problem in two dimensions
Find the deepest valley in only two dimensions:
» you cannot look at each valley
» valleys are not smooth
» you only have limited time
Now imagine this task for 100 dimensions: only local minima can be found.
26.09.13 Blue Yonder at BASF
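The valley picture can be reproduced numerically. In the sketch below, gradient descent is started from nine different points on a two-dimensional surface with many valleys. The surface `f` is a hypothetical example chosen for illustration, not one taken from the talk.

```python
import numpy as np


def f(p):
    """Hypothetical bumpy surface: ripples on top of a shallow bowl."""
    x, y = p
    return np.sin(3 * x) * np.cos(3 * y) + 0.1 * (x**2 + y**2)


def grad_f(p):
    """Analytic gradient of f."""
    x, y = p
    return np.array([
        3 * np.cos(3 * x) * np.cos(3 * y) + 0.2 * x,
        -3 * np.sin(3 * x) * np.sin(3 * y) + 0.2 * y,
    ])


def descend(start, step=0.01, n_steps=3000):
    """Plain gradient descent: always walk downhill from the start point."""
    p = np.array(start, dtype=float)
    for _ in range(n_steps):
        p -= step * grad_f(p)
    return p


starts = [(x0, y0) for x0 in (-2.0, 0.0, 2.0) for y0 in (-2.0, 0.0, 2.0)]
end_points = [descend(s) for s in starts]
values = [float(f(p)) for p in end_points]
best_value = min(values)

for s, v in zip(starts, values):
    print(f"start {s}: ends in a valley of depth {v:.3f}")
print(f"deepest valley found: {best_value:.3f}")
```

Different starting points end in valleys of different depth, and nothing in a single run tells you whether a deeper valley exists elsewhere. Restarting from several points, as done here, helps in two dimensions but quickly becomes unaffordable as the number of dimensions grows.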
What is so special about forecasting whole density functions?
» The shape of the distribution becomes visible (e.g. non-Gaussian)
» The uncertainty of the forecast becomes visible and can be extracted as a number (variance, MAD of the distribution)
» Different estimators may be used, such as mean, mode, median or p% quantile
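Once a full density is available, the estimators listed above are simple to extract. The sketch below assumes the forecast is given as density values on a regular grid. The skewed example density (a gamma shape) is an illustrative choice, not NeuroBayes output.

```python
import numpy as np

# Assumed forecast: a skewed density proportional to t^2 * exp(-t / 4) on a grid.
t = np.linspace(0.0, 80.0, 8001)
dt = t[1] - t[0]
density = t**2 * np.exp(-t / 4.0)
density /= density.sum() * dt            # normalize so the density integrates to 1

cdf = np.cumsum(density) * dt            # cumulative distribution on the grid


def quantile(p):
    """Smallest grid value whose cumulative probability reaches p."""
    return float(t[np.searchsorted(cdf, p)])


mean = float(np.sum(t * density) * dt)
mode = float(t[np.argmax(density)])
median = quantile(0.5)
variance = float(np.sum((t - mean) ** 2 * density) * dt)
q90 = quantile(0.9)

print(f"mode {mode:.2f} < median {median:.2f} < mean {mean:.2f}")
print(f"variance {variance:.1f}, 90% quantile {q90:.2f}")
```

For a skewed distribution the three central estimators differ noticeably, so the choice between them matters and depends on the cost of over- and underestimating.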
Neural Networks: concepts adopted from neuroscience
» The human brain solves complex problems very efficiently, detects patterns and stores information (memory)
» It consists of approximately 10^11 neurons and 10^14 connections
» Oversimplified: a neuron starts sending signals when the activity of the neurons feeding into it reaches a certain threshold
NeuroBayes output is interpretable as a Bayesian posterior
Posterior = Likelihood × Prior / Evidence, i.e. $P(T \mid D) = \frac{P(D \mid T)\, P(T)}{P(D)}$ for a theory $T$ and data $D$
NB1: The posterior is the probability that the theory is right given the data.
NB2: Prior distributions need to be applied carefully. A non-informative prior distribution is not flat (tax authorities know about that).
Bayesian regularisation at each analysis step is essential in order to avoid overfitting and to select models with good generalisation properties!
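A small numeric example shows how strongly the prior influences the posterior. The numbers below are hypothetical and chosen only to illustrate Bayes' theorem for a true/false hypothesis; the function name `posterior` is likewise an assumption.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' theorem for a binary hypothesis.

    prior:               P(T), probability of the theory before seeing the data
    likelihood_if_true:  P(D | T)
    likelihood_if_false: P(D | not T)
    """
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence


# Same data, two different priors (hypothetical numbers).
rare = posterior(prior=0.01, likelihood_if_true=0.9, likelihood_if_false=0.05)
even = posterior(prior=0.50, likelihood_if_true=0.9, likelihood_if_false=0.05)

print(f"posterior with a 1% prior:  {rare:.3f}")
print(f"posterior with a 50% prior: {even:.3f}")
```

With identical evidence, the posterior is about 15% for the rare hypothesis and about 95% for the even one, which is why the prior has to be chosen with care.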
Individualized Classification & Density Forecasts
» Classification
  » Target will be true / false
  » NeuroBayes delivers a number that can be interpreted as a Bayesian posterior probability
» Probability density forecast
  » Individual PDF for each event, with asymmetric uncertainties
Retail: Optimization of item sales predictions
Solution:
» Provision of item sales predictions on a daily basis
» Predictions for calculation of the return quota
» Creation of detailed merchandise planning suggestions
Result:
» Improvement of predictions by 40%
» Inventory improvement in the double-digit million range per year
"A self-learning system such as NeuroBayes suits our dynamic business model. Our prediction quality is increasing constantly and the sales quantities predicted are becoming ever more precise. The solution helps us adjust early on to future developments." Michael Sinn, Director Purchasing Support
Sales Forecast Fashion
Example: OTTO Group
Per item:
» Sales forecast
» Two estimates of the spread (68% and 95% confidence intervals)
[Figure axis: Sales [units]]
ROI calculations for the Otto Group
Perishable goods in Supermarkets
Meat, fruit & veg, bread, dairy, ...
Around 7% of all perishable foods (e.g. meat, fruit & veg., etc.) have to be disposed of in German supermarkets.
That's about 89M tons of food wasted per year.
Grocery Chain: Auto Replenishment
Predictions from Blue Yonder vs. in-house solution
Overconfidence and gut feeling produced up to 40% higher write-offs in stores not fully automated by Blue Yonder.
[Figure: percentage of write-offs per calendar week, CW 06 to CW 14; series: Blue Yonder forecasted (actual), in-house solution forecast (actual)]
e.g. Individual risk predictions for car insurances:
» Accident probability
» Claims distribution
» Large claim prediction
» Contract cancellation prediction
→ Successfully implemented at
Correlations to target variable Ramler II-Plot
NeuroBayes delivers precise prognoses for the customer-individual number and size of claims
Premium differentiation: NeuroBayes adjusts the premium to the customer-individual risk
Customer structure optimisation: retain your good customers and take on the bad customers
Profitability improvement: simultaneously increase your total premium volume and decrease your claims rate with a fairer tariff system
[Figure labels: risk, number of customers, normalised premium, premium volume, claims rate; series: old tariff, previous tariff, NeuroBayes]
Private health insurance: claims per year are anything but normally distributed...
NeuroBayes has the solution for difficult distributions of the type
$f(t) = (1 - P)\,\delta(t) + P\,f(t \mid t > 0)$
Many insured persons (fraction $1 - P$) do not generate any claim.
When there is at least one claim (fraction $P$), the costs are distributed according to $f(t \mid t > 0)$. This distribution has fat tails (extremely high claims).
Difficult to handle by classical methods.
NeuroBayes calculates the individualised Bayesian probability density for each insured person $x$.
NeuroBayes has the solution for difficult distributions of the type
$f(t \mid x) = (1 - P(x))\,\delta(t) + P(x)\,f(t \mid t > 0, x)$
Insured person $x$ will have no claims with probability $1 - P(x)$.
If insured person $x$ has any claim, the costs will be distributed according to $f(t \mid t > 0, x)$.
$\delta(t)$ = Dirac delta "function" (distribution)
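The structure of this distribution, a spike at zero plus a fat-tailed continuous part, can be simulated directly. The sketch below assumes a log-normal shape for the claim sizes, which is an illustrative choice only; the talk does not state which shape NeuroBayes uses. The function names and parameter values are hypothetical.

```python
import numpy as np


def sample_claims(p_claim, log_mean, log_sigma, size, rng):
    """Draw yearly claim costs from a zero-inflated distribution.

    With probability 1 - p_claim the cost is exactly 0 (the delta spike);
    otherwise it follows an assumed log-normal distribution.
    """
    has_claim = rng.random(size) < p_claim
    amounts = rng.lognormal(mean=log_mean, sigma=log_sigma, size=size)
    return np.where(has_claim, amounts, 0.0)


def expected_claim(p_claim, log_mean, log_sigma):
    """E[t] = P * E[t | t > 0], using the log-normal mean exp(mu + sigma^2 / 2)."""
    return p_claim * np.exp(log_mean + 0.5 * log_sigma**2)


rng = np.random.default_rng(0)
claims = sample_claims(p_claim=0.3, log_mean=6.0, log_sigma=1.0, size=200_000, rng=rng)

fraction_without_claim = float(np.mean(claims == 0.0))
sample_mean = float(claims.mean())
analytic_mean = float(expected_claim(0.3, 6.0, 1.0))

print(f"fraction without any claim: {fraction_without_claim:.3f}")
print(f"sample mean {sample_mean:.1f} vs analytic mean {analytic_mean:.1f}")
```

For an individualised forecast, `p_claim` and the shape parameters would depend on the insured person's features, which is the role of the conditioning on $x$ in the formula above.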
Healthcare insurance: long-term prediction from anamnesis
NeuroBayes vs. expert estimation (risk premium loading)
» Expert estimations are at best random, and for patients with a long history even systematically wrong.
» NeuroBayes forecasts costs correctly and significantly beats expert estimations, more than 10 years into the future.
Revenue Forecast
Example: dm, a large German drug-store chain
Key Challenge:
» Revenue prediction for each individual store
» Used for staff planning
» Up to ½ year in advance
» Keep track of opening times, public holidays, weather, ...
[Figure annotations: Forecast = 1.02 Sales; Easter, Ascension, Whitsun]
Revenue Forecast
Example: dm, a large German drug-store chain
Forecasts for individual stores
» Prediction of the full probability density function
» Precise forecast of the expected revenue, including the expected spread (68% and 95% confidence intervals)
[Figure: probability density function of revenue]
Further Examples
» Churn management in telecommunications
  » Identify customers who have a high risk of cancelling their monthly contract
  » Forecast of targeted promotions and individual measures to prevent churn
  "Blue Yonder beats all our churn prediction models. The more complicated and challenging the task the better. NeuroBayes outperforms the competition."
» Churn management for daily newspapers
  » Identify 67% of all customers likely to cancel their contract by targeting the 10% of customers with the highest predicted risk
[Figure axis: fraction of cancellations]
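A figure such as "67% of cancellations found in the top 10% of customers" is a capture rate: rank customers by predicted risk, select the top fraction, and count how many of the actual cancellations fall into that selection. The sketch below shows the calculation on made-up toy data; the function name `capture_rate` is hypothetical.

```python
import numpy as np


def capture_rate(scores, churned, top_fraction=0.10):
    """Share of all actual churners found among the highest-scored customers."""
    scores = np.asarray(scores, dtype=float)
    churned = np.asarray(churned, dtype=bool)
    n_top = max(1, int(round(top_fraction * len(scores))))
    top = np.argsort(-scores)[:n_top]          # indices of the highest scores
    return churned[top].sum() / churned.sum()


# Toy example: 10 customers with predicted churn risk; two of them actually churned.
toy_scores = [0.90, 0.80, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.05]
toy_churned = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

for fraction in (0.1, 0.2, 0.3):
    rate = capture_rate(toy_scores, toy_churned, top_fraction=fraction)
    print(f"top {fraction:.0%} of customers contain {rate:.0%} of churners")
```

A model without predictive power would capture about 10% of the churners in the top 10%, so the capture rate relative to the selected fraction measures how much the ranking helps.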
Further Examples
Risk management: Among all customers who have a high risk of not paying their debts, identify those who are most likely to pay their outstanding debts.
Many more
[Figure: results after 35 days and 60 days; test group (NeuroBayes) vs. control group (conventional approach)]
Prognosis of sports events from historical data: NeuroNetzer
Results: Probabilities for home win, tie, guest win
Blue Yonder: Awards for Big Science Startup
DLD 2013: Best Enterprise Solution; Retail Technology Award 2012; 3-time winner of the Data Mining Cup; bwcon Hightech Award 2012; Finalist 2012; Finalist 2013; Special Prize Deutsche Boerse 2012
Disclaimer
This Presentation (the "Presentation") has been prepared by Blue Yonder GmbH & Co KG (collectively, with any officer, director, employee, advisor or agent of any of them, the "Preparers") for the purpose of setting out certain confidential information in respect of Blue Yonder's business activities and strategy. References to the Presentation include any information which has been or may be supplied in writing or orally in connection with the Presentation or in connection with any further inquiries in respect of the Presentation. This Presentation is for the exclusive use of the recipients to whom it is addressed. This Presentation and the information contained herein is confidential. In addition to the terms of any confidentiality undertaking that a recipient may have entered into with Blue Yonder, by its acceptance of the Presentation, each recipient agrees that it will not, and it will procure that each of its agents, representatives, advisors, directors or employees (collectively, "Representatives"), will not, and will not permit any third party to, copy, reproduce or distribute to others this Presentation, in whole or in part, at any time without the prior written consent of Blue Yonder, and that it will keep confidential all information contained herein not already in the public domain and will use this Presentation for the sole purpose of familiarizing itself with certain limited background information concerning Blue Yonder and its business strategy and activities. The foregoing confidentiality obligation shall be legally binding for the recipient indefinitely. This Presentation is not intended to serve as basis for any investment decision. If a recipient has signed a confidentiality undertaking with Blue Yonder, this Presentation also constitutes Confidential Information for the purposes of such undertaking.
While the information contained in this Presentation is believed to be accurate, the Preparers have not conducted any investigation with respect to such information. The Preparers expressly disclaim any and all liability for representations or warranties, expressed or implied, contained in, or for omissions from, this Presentation or any other written or oral communication transmitted to any interested party in connection with this Presentation so far as is permitted by law. In particular, but without limitation, no representation or warranty is given as to the achievement or reasonableness of, and no reliance should be placed on, any projections, estimates, forecasts, analyses or forward looking statements contained in this Presentation which involve by their nature a number of risks, uncertainties or assumptions that could cause actual results or events to differ materially from those expressed or implied in this Presentation. Only those particular representations and warranties which may be made in a definitive written agreement, when and if one is executed, and subject to such limitations and restrictions as may be specified in such agreement, shall have any legal effect. By its acceptance hereof, each recipient agrees that none of the Preparers nor any of their respective Representatives shall be liable for any direct, indirect or consequential loss or damages suffered by any person as a result of relying on any statement in or omission from this Presentation, along with other information furnished in connection therewith, and any such liability is expressly disclaimed. Except to the extent otherwise indicated, this Presentation presents information as of the date hereof. The delivery of this Presentation shall not, under any circumstances, create any implication that there will be no change in the affairs of Blue Yonder after the date hereof. 
In furnishing this Presentation, the Preparers reserve the right to amend or replace this Presentation at any time and undertake no obligation to update any of the information contained in the Presentation or to correct any inaccuracies that may become apparent. This Presentation shall remain the property of Blue Yonder. Blue Yonder may, at any time, request that any recipient, or its Representatives, promptly deliver to Blue Yonder or, if directed in writing by Blue Yonder, destroy all confidential information relating to this Presentation received in written, electronic or other tangible form whatsoever, including without limitation all copies, reproductions, computer diskettes or written materials which contain such confidential information. At such time, all other notes, analyses or compilations constituting or containing confidential information in the recipient's, or its Representatives', possession shall be destroyed. Such destruction shall be certified to Blue Yonder by the recipient in writing. Neither the dissemination of this Presentation nor any part of its contents is to be taken as any form of commitment on the part of the Preparers or any of their respective affiliates to enter any contract or otherwise create any legally binding obligation or commitment. The Preparers expressly reserve the right, in their absolute discretion, without prior notice and without any liability to any recipient to terminate discussions with any recipient or any other parties. The distribution of this Presentation in certain jurisdictions may be restricted by law and, accordingly, recipients of this Presentation represent that they are able to receive this Presentation without contravention of any unfulfilled registration requirements or other legal restrictions in the jurisdiction in which they reside or conduct business.