Big Data Big Business - Achieving advantage through technology
Transcription
1 Big Data Big Business - Achieving advantage through technology Prof. Dr. Michael Feindt, Karlsruhe Institute of Technology KIT Chief Scientific Advisor, Blue Yonder GmbH & Co KG European Life & Health Tour, London, October 12, 2012
2 Big Data
Google, Facebook: unstructured data from the web, MapReduce.
For many users it means something quite different:
- Gigantic databases
- Technology at CERN and other particle accelerators
- Grid computing
- Predictive analytics, NeuroBayes
- Data-driven decision making in companies
Also very interesting for insurers! Statistically relevant insight instead of gut feeling.
3 Predictive Analytics - the IT topic of the coming years
- Gigantic value of data stored in data warehouses
- Optimisation and automation of strategic and, especially, regularly recurring operational decisions
- Predictive analytics software uses the information in company databases, combines it with external data sources, and processes it with state-of-the-art mathematical methods to calculate predictions about the future as probability densities. On this basis it makes optimal decisions.
4
5 Big Data at CERN: 40 million collisions per second. 1 PByte = $10^{15}$ bytes of data per second.
Prof. Dr. Michael Feindt, KIT and Blue Yonder, Swiss Re Life and Health Insurance Tour, London, Oct. 12, 2012
6 Data rates
Trigger: data reduction to 1 in 10 million. 1 PB per year is stored and made available to thousands of physicists worldwide → Grid.
7 1 petabyte = $10^{15}$ bytes. If 1 bit corresponds to one leaf, 1 PByte corresponds to all leaves on Earth.
8 NeuroBayes
A high-tech algorithm from experimental high energy physics that learns complex dependencies from companies' historical databases and uses them to predict the future.
It is based on an artificial neural network, but is much more.
Extremely high generalisation ability (i.e. the future reality is well described by the predicted probability density).
9 NeuroBayes in elementary particle physics
- Discovery of new particles and new reactions
- Used in the online trigger at LHCb
- Full automation of complex scientific analyses (KEK: the equivalent of about 500 PhD theses with 72 NeuroBayes networks; efficiency +100% compared to the manual work of 400 scientists in 10 years)
10 Knowledge from world-class research in high energy physics (CERN, Fermilab, KEK), applied to problems in the economy
- Particle collisions at the largest particle accelerators in the world
- collisions per second
- Gigantic data volumes: petabytes of raw data, to be distributed worldwide
- 1 interesting event per 10 million collisions
11 Brick-and-mortar, mail-order and online retail
Better sales predictions and optimised merchandise planning per article, to improve the ability to deliver and to leave fewer unsold articles at the end of the season.
What if fashion in the right colour and size were never sold out?
12 1000 stores, 5000 fresh articles
- Up to 5 million sales predictions per day
- Up to 1.5 billion sales predictions per year
- Fully automatic calculation of the optimal purchase quantity per store, article and day, with automatic ordering
What if a large retailer knew exactly how much fruit of each sort would be sold on which day?
13 Millions of customers, hundreds of millions of historical transactions
- Fair, risk-adjusted tariffs
- Precise overview of an insurer's risk
What if an insurer knew individual risks months and years into the future?
14 A company with strong brains
Founded in 2008, Blue Yonder with its NeuroBayes Suite is one of the leading vendors of prediction and pattern recognition software (predictive analytics).
» Unique predictive analytics suite combining statistical algorithms with highly optimised neural networks.
» The development team consists of distinguished and experienced physicists and computer scientists from famous research institutes such as CERN.
» NeuroBayes has its origin in experimental particle physics and was developed in more than 400 man-years.
15 The difference: better algorithms / employees / results
- CyberChampion Award: winner in the category High Potentials. CyberChampions honours young and expanding companies in the technology region around Karlsruhe.
- Data Mining Cup: winner 2006, auction prices ("ebay"); winner 2009, sales prediction ("Libri"); winner 2010, intelligent couponing ("Amazon").
- Top Product Trading: after bronze in 2011, the readers of the journal handelsjournal voted NeuroBayes silver in 2012 in the category cost effectiveness.
- Retail Technology Award: Best Enterprise Solution for Blue Yonder customer OTTO.
- bwcon CyberOne Award 2012: Blue Yonder is the most innovative among the medium-sized and growth companies in Baden-Württemberg.
16 Neural Networks
The information (the knowledge, the expertise) is in the connections between the nerve cells. Each neuron takes fuzzy decisions (fuzzy logic).
NeuroBayes
> learns extremely fast from historical data (weeks → minutes)
> is extremely robust
> suppresses statistical noise (high generalisation ability)
> can make binary decisions (classify)
> can calculate complete probability densities
> can predict the future reliably
17 Prediction of the complete probability density
- Expectation value
- Mode
- Standard deviation (volatility)
- Deviation from the normal distribution (heavy tail)
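The quantities listed on this slide can all be read off a predicted density once it is available in tabulated form. The following is a minimal sketch, not NeuroBayes code: the function name `density_summary`, the equally spaced grid, and the choice of excess kurtosis as the measure of "deviation from the normal distribution" are assumptions made for illustration, and the example density is made up.

```python
import math

def density_summary(grid, density):
    """Summarise a probability density tabulated on an equally spaced grid."""
    step = grid[1] - grid[0]
    norm = sum(density) * step                        # area under the tabulated curve
    probs = [d * step / norm for d in density]        # probability per grid cell
    mean = sum(t * p for t, p in zip(grid, probs))    # expectation value
    var = sum((t - mean) ** 2 * p for t, p in zip(grid, probs))
    std = math.sqrt(var)                              # standard deviation (volatility)
    mode = grid[max(range(len(grid)), key=lambda i: density[i])]
    # Excess kurtosis: 0 for a normal distribution, positive for heavier tails.
    fourth = sum((t - mean) ** 4 * p for t, p in zip(grid, probs))
    excess_kurtosis = fourth / var ** 2 - 3.0
    return {"mean": mean, "mode": mode, "std": std, "excess_kurtosis": excess_kurtosis}

# Hypothetical example: the skewed density t * exp(-t) on the interval [0, 40].
grid = [0.01 * i for i in range(4001)]
density = [t * math.exp(-t) for t in grid]
summary = density_summary(grid, density)
print(summary)
```

For this skewed example the mode lies well below the expectation value, which is why a single point estimate hides information that the full density carries.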
18 Turnaround prediction for a distance-selling company
19
20
21
22 The <phi-t> mouse game, or: even your "free will" is predictable
23 Technology overview: NeuroBayes
» System integration on the basis of standard protocols and interfaces
» High-performance, scalable data processing architecture
» Handles data in batch or real-time streaming mode
» Training at run time is possible without degradation of performance
24 NeuroBayes applications for insurers
25 E.g. individual risk predictions for car insurance:
- Accident probability
- Claims distribution
- Large claim prediction
- Contract cancellation prediction
Successfully implemented at
26 Correlations to target variable Ramler II-Plot
27 NeuroBayes constructs risk-optimal tariff systems
The majority of good customers pay too much and thus subsidise the bad customers, who do not pay enough: 40% pay too much, 60% pay too little.
NeuroBayes adjusts the premium to the individual risk (at constant overall premium): 57% pay less than before, 43% pay more.
After an increase of the new NeuroBayes premium by 10%: 50% pay less than before, 50% pay more.
Figures: number of customers versus risk/premium and versus normalised premium, with the regions "premium too high" and "premium too low" marked.
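The idea of adjusting premiums to individual risk while keeping the overall premium constant can be sketched in a few lines. This is an illustration only: the slide does not say how the premium is derived from the predicted risk, so the sketch assumes premiums proportional to the predicted expected claims, and the function names and the five-customer portfolio are made up.

```python
def rebalance_premiums(old_premiums, expected_claims, uplift=1.0):
    """Risk-proportional premiums whose total equals the old total times `uplift`."""
    scale = uplift * sum(old_premiums) / sum(expected_claims)
    return [scale * risk for risk in expected_claims]

def fraction_paying_less(old_premiums, new_premiums):
    """Share of customers whose new premium is below their old premium."""
    cheaper = sum(1 for old, new in zip(old_premiums, new_premiums) if new < old)
    return cheaper / len(old_premiums)

# Hypothetical portfolio: flat old tariff, made-up predicted expected claims.
old = [100.0] * 5
risk = [40.0, 60.0, 95.0, 105.0, 200.0]

same_total = rebalance_premiums(old, risk)               # overall premium unchanged
raised_total = rebalance_premiums(old, risk, uplift=1.1) # overall premium +10%

print(fraction_paying_less(old, same_total))    # 3 of 5 customers pay less
print(fraction_paying_less(old, raised_total))  # 2 of 5 customers still pay less
```

In this toy portfolio a majority pays less at constant total premium, and some customers still pay less after the total is raised by 10%, which is the same qualitative pattern as the percentages on the slide.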
28 NeuroBayes delivers precise predictions of the number and size of claims for each individual customer
- Premium differentiation: NeuroBayes adjusts the premium to the customer's individual risk.
- Customer structure optimisation: retain your good customers and take the bad customers.
- Profitability improvement: simultaneously increase your total premium volume and decrease your claims rate with a fairer tariff system.
Figures: number of customers versus risk and versus normalised premium, and premium volume and claims rate, each comparing the old tariff with NeuroBayes.
29 Private health insurance: claims per year are anything but normally distributed
NeuroBayes has the solution for difficult distributions of the type
$f(t) = (1 - P)\,\delta(t) + P\,f(t \mid t > 0)$
- Many insured persons (fraction $1 - P$) do not generate any claim.
- When there is at least one claim (fraction $P$), the claims are distributed according to $f(t \mid t > 0)$. This distribution has fat tails (extremely high claims).
- Difficult to handle with classical methods.
30 NeuroBayes calculates the individualised Bayesian probability density for each insured person x
NeuroBayes has the solution for difficult distributions of the type
$f(t \mid x) = (1 - P(x))\,\delta(t) + P(x)\,f(t \mid t > 0, x)$
- Insured person x will have no claims with probability $1 - P(x)$.
- If insured person x has any claim, the costs will be distributed according to $f(t \mid t > 0, x)$.
- $\delta(t)$ is the Dirac delta "function" (distribution).
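To get a feeling for such a mixture of a spike at zero and a fat-tailed part, one can simulate it. The sketch below is a toy under stated assumptions: the log-normal shape for the density of positive claims and the parameter values for one hypothetical insured person are invented, and nothing here reflects how NeuroBayes estimates P(x) or the conditional density.

```python
import random

def sample_claim(p_claim, mu, sigma, rng):
    """Draw one yearly claim amount from (1 - P) * delta(t) + P * f(t | t > 0)."""
    if rng.random() >= p_claim:
        return 0.0                          # delta-function part: no claim at all
    return rng.lognormvariate(mu, sigma)    # assumed fat-tailed density for t > 0

rng = random.Random(42)
p_claim, mu, sigma = 0.3, 6.0, 1.2          # made-up parameters for one person x
claims = [sample_claim(p_claim, mu, sigma, rng) for _ in range(100_000)]

zero_fraction = sum(1 for c in claims if c == 0.0) / len(claims)
mean_claim = sum(claims) / len(claims)
largest_claim = max(claims)

print(f"fraction without claim: {zero_fraction:.3f}")
print(f"mean claim: {mean_claim:.1f}, largest claim: {largest_claim:.1f}")
```

About 70% of the simulated years show no claim, while the largest simulated claim is far above the mean, which is the combination that makes a single normal distribution a poor description.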
31 Evaluation of prediction methods
Typical classical prediction methods, e.g. generalised linear models, return one value (a point estimator).
- Often the interpretation is not unique.
- Often the value must be calibrated.
- Mostly no uncertainty or distribution of the truth around this value is predicted.
NeuroBayes can do more: from the probability density, the economically optimal point estimator can be determined.
Often used quality criterion: mean square deviation or $R^2$.
- Not robust for distributions with "fat tails".
- Economically irrelevant (the insurer pays an amount of money, not its square).
- Much better: median absolute deviation (MAD).
The quality of prediction results should be evaluated by the following criteria:
> high individualisation
> generalisation ability (no overtraining, i.e. the individualisation turns out to be correct)
> correct prediction of expectation values (autocalibration)
> correct prediction of uncertainty in the form of credibility intervals
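The robustness argument can be made concrete with a small numerical comparison. This is a sketch with invented numbers: the function names are illustrative, and the median absolute deviation is taken here as the median of the absolute differences between truth and prediction.

```python
import statistics

def mean_square_deviation(truth, prediction):
    return statistics.fmean([(t - p) ** 2 for t, p in zip(truth, prediction)])

def median_absolute_deviation(truth, prediction):
    return statistics.median([abs(t - p) for t, p in zip(truth, prediction)])

# Made-up claims with a constant prediction of 100.
truth = [100.0, 120.0, 90.0, 110.0, 100.0]
prediction = [100.0] * 5

# The same data plus one extreme claim from the fat tail.
truth_tail = truth + [50_000.0]
prediction_tail = prediction + [100.0]

msd_clean = mean_square_deviation(truth, prediction)
msd_tail = mean_square_deviation(truth_tail, prediction_tail)
mad_clean = median_absolute_deviation(truth, prediction)
mad_tail = median_absolute_deviation(truth_tail, prediction_tail)

print(msd_clean, msd_tail)   # the single extreme claim dominates the squared criterion
print(mad_clean, mad_tail)   # the median absolute deviation does not change
```

One extreme claim changes the mean square deviation by more than six orders of magnitude, so a comparison of two models on that criterion would mostly reflect how each happens to treat that one case.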
32 Individualisation as large as possible
The area between the lift chart and the diagonal (Gini coefficient) should be as large as possible.
Sort the customers according to predicted risk. Select the fraction x with the largest predictions. y = fraction of cumulated claims in this selection.
NeuroBayes: Gini = 0.41. Classical model: Gini = 0.32. Maximum area = 0.47.
The NeuroBayes prediction individualises better!
Prof. Dr. Michael Feindt, NeuroBayes-Prognosen als Basis für risikogerechte Wechselmodelle in der PKV (NeuroBayes predictions as the basis for risk-adequate switching models in private health insurance)
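The construction described on this slide (sort by predicted risk, accumulate the true claims, measure the area above the diagonal) can be written down directly. The sketch below follows that description with a trapezoid rule; the function name and the six-customer data set are made up, and ties between predictions are simply kept in input order.

```python
def lift_chart_area(predicted, claims):
    """Area between the lift chart and the diagonal for one set of predictions."""
    # Customers with the largest predicted risk come first.
    order = sorted(range(len(claims)), key=lambda i: predicted[i], reverse=True)
    total = sum(claims)
    n = len(claims)
    area_under_curve = 0.0
    cumulated = 0.0
    y_prev = 0.0
    for i in order:
        cumulated += claims[i]
        y = cumulated / total                        # fraction of cumulated claims
        area_under_curve += 0.5 * (y_prev + y) / n   # trapezoid of width 1/n
        y_prev = y
    return area_under_curve - 0.5                    # subtract the area under the diagonal

# Hypothetical data: true claims and an imperfect prediction of the risk.
claims = [0.0, 0.0, 0.0, 10.0, 30.0, 60.0]
model = [5.0, 1.0, 2.0, 20.0, 15.0, 40.0]

model_area = lift_chart_area(model, claims)
max_area = lift_chart_area(claims, claims)   # sorting by the truth itself

print(model_area, max_area)
```

Sorting by the true claims gives the largest area any prediction can reach on this data, which plays the role of the maximum area quoted on the slide.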
33 Judgement of prediction methods
Good generalisation ability (no overtraining): the Gini coefficient (area between lift chart and diagonal) on the test sample (green) is compatible with the expectation from the training sample.
Lift chart for true claims: the forbidden regions correspond to a sorting power better than the truth (impossible) or worse than random.
The area in this case is compatible with the expectation (even a bit better).
Figure panels: test data set and training data set.
34 Check of calibration (NeuroBayes diagonal plot)
Test sample, NeuroBayes expectation value prediction: the mean value of the truth of all insured persons with a mean prediction in the 100 bin really is 100, so the prediction is correct!
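A calibration check of this kind amounts to binning the cases by their predicted mean and comparing each bin with the mean of the truth that later materialised. The following sketch uses simulated data that is calibrated by construction; the function name, the bin width and the exponential shape of the simulated truth are assumptions for illustration only.

```python
import random
from collections import defaultdict

def calibration_table(predicted_mean, truth, bin_width):
    """Mean of the truth in bins of the predicted mean, keyed by bin centre."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pred, true in zip(predicted_mean, truth):
        index = int(pred // bin_width)
        sums[index] += true
        counts[index] += 1
    return {(index + 0.5) * bin_width: sums[index] / counts[index]
            for index in sorted(counts)}

# Simulated test sample: the truth scatters widely around each predicted mean.
rng = random.Random(1)
predictions = [rng.uniform(40.0, 160.0) for _ in range(60_000)]
truths = [rng.expovariate(1.0 / p) for p in predictions]

table = calibration_table(predictions, truths, bin_width=20.0)
for centre, mean_truth in table.items():
    print(f"predicted about {centre:6.1f}: mean of truth {mean_truth:6.1f}")
```

On a diagonal plot these pairs would lie on the diagonal; a model that only sorts well but is not calibrated would show a systematic offset or slope instead.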
35 Quality of prediction
Classical model: the sorting seems sensible and there is some correlation. However, without further calibration it is unusable as a prediction of the mean value.
Plot legend:
- Red points: mean value of the truth in the bin. They should lie on the diagonal.
- Green region: contains 68% of the entries with the given mean prediction.
- Yellow region: contains 95% of the entries with the given mean prediction.
- The regions should be as narrow as possible.
Figure panels: classical model and NeuroBayes prediction, each for data with invoice amount > 0.
36 Quality of prediction: test of individual credibility intervals
In the future we will know the truth. We can already predict now that it will lie
- in 68% of all cases in the predicted 1σ credibility interval,
- in 95% of all cases in the predicted 2σ credibility interval.
Typical test result:
- Fraction of entries in the 1σ interval: 68% expected, 67.9% measured
- Fraction of entries in the 2σ interval: 95% expected, 94.3% measured
Tests in many very different applications: NeuroBayes credibility intervals are very reliable. Most classical methods cannot calculate reliable confidence or credibility intervals.
Experience from NeuroBayes private health insurance (PKV) projects: credibility intervals for a prediction 2 years ahead are about 9% larger than for the next year.
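The interval test described here is a simple counting exercise once the truth is known. The sketch below simulates it under a strong simplification: each case gets a symmetric Gaussian interval with its own centre and width, whereas the intervals on the slide come from full predicted densities. All names and numbers are illustrative.

```python
import random

def coverage(truth, centre, sigma, n_sigma):
    """Fraction of cases whose truth lies within n_sigma * sigma of the prediction."""
    inside = sum(1 for t, c, s in zip(truth, centre, sigma)
                 if abs(t - c) <= n_sigma * s)
    return inside / len(truth)

# Simulated cases, each with an individual prediction and individual uncertainty.
rng = random.Random(3)
centres = [rng.uniform(100.0, 1000.0) for _ in range(100_000)]
sigmas = [0.2 * c for c in centres]
truths = [rng.gauss(c, s) for c, s in zip(centres, sigmas)]

one_sigma = coverage(truths, centres, sigmas, 1.0)
two_sigma = coverage(truths, centres, sigmas, 2.0)

print(f"1-sigma interval: {one_sigma:.3f} measured, about 0.68 expected")
print(f"2-sigma interval: {two_sigma:.3f} measured, about 0.95 expected")
```

If the predicted widths were too narrow or too wide, the measured fractions would fall below or above the expected values, which is what makes this a direct test of the quoted uncertainties.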
37 The prediction of quantiles is reliable over the complete width
Multi-quantile test: fraction of insured persons, in bins of e.g. the expected mean costs, whose true claims (future information) will lie below the predicted 30%, 20%, 10%, ... quantile.
The true costs are distributed over the complete width of the predicted probability density just as predicted. The predicted quantiles are reliably reconstructed.
Attention, Bayes' theorem! If tests are performed separately in subsamples, the separation must not depend on future information. Allowed are e.g. separations by sex, age, tariff and any predicted quantities, in short any information known at prediction time.
Figure axis: predicted mean costs.
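The warning about subsamples can be demonstrated with simulated data. In this sketch the predicted density of each case is assumed to be exponential with an individual mean, so the 30% quantile has a closed form; the names and numbers are made up. The test is run once on the full sample, once on a subsample selected by a predicted quantity (allowed), and once on a subsample selected by the true outcome (not allowed).

```python
import math
import random

def fraction_below(truth, predicted_quantile):
    """Fraction of cases whose truth lies below the predicted quantile."""
    below = sum(1 for t, q in zip(truth, predicted_quantile) if t < q)
    return below / len(truth)

rng = random.Random(7)
level = 0.30
means = [rng.uniform(50.0, 150.0) for _ in range(50_000)]
truths = [rng.expovariate(1.0 / m) for m in means]
# 30% quantile of an exponential density with mean m.
q30 = [-m * math.log(1.0 - level) for m in means]

overall = fraction_below(truths, q30)

# Allowed split: uses only information known at prediction time.
allowed = [(t, q) for t, q, m in zip(truths, q30, means) if m > 100.0]
# Forbidden split: selects on the future truth itself.
forbidden = [(t, q) for t, q in zip(truths, q30) if t > 100.0]

allowed_fraction = fraction_below(*zip(*allowed))
forbidden_fraction = fraction_below(*zip(*forbidden))

print(overall, allowed_fraction, forbidden_fraction)
```

The first two fractions come out close to 0.30, while the subsample selected on the truth gives a fraction of zero even though every individual prediction is correct. The apparent failure is caused by the selection, not by the prediction.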
38 Long-term prediction from the medical history (anamnesis)
Target here: the probability that an insured person will claim more than the average for his or her age and sex.
- Simple NeuroBayes text analysis of the anamnesis at the start of the contract (regularised naive Bayes approach)
- Measure for classical modelling: percentage risk loading
Comparison of sorting power for different time horizons:
- The Bayes analysis of the anamnesis makes significant predictions even more than 10 years in advance.
- The usual risk loading factors show almost no correlation to the truth and are, in the long term, even worse than random.
39 Usage of NeuroBayes allows large improvements
Customer-individual claims distribution
- The probability distributions of individual insured persons can vary considerably.
- NeuroBayes calculates an individual probability distribution for each single customer. This allows a prediction of all relevant quantiles, thresholds and other statistical quantities.
Sorting power
- NeuroBayes successfully sorts insured persons according to expected claims.
- NeuroBayes shows by far the best sorting power. The rank correlation coefficient is 3 times larger compared to classical (GLM) prediction models.
r_sp: Spearman rank correlation coefficient
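The Spearman rank correlation coefficient used as the measure of sorting power depends only on the order of the values, not on their size, which suits fat-tailed claims. A minimal self-contained sketch follows; the helper names and the five-customer example are invented, and tied values receive the average of their ranks.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical predicted risks and the claims that actually occurred.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
claims = [0.0, 10.0, 5.0, 40.0, 1000.0]
r_sp = spearman(predicted, claims)
print(r_sp)
```

The single very large claim has no more influence on the coefficient than any other correctly ordered value, since only its rank enters.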
40 Big Data and Predictive Analytics
Data-driven individualisation of reliable predictions is possible using state-of-the-art statistical methods and software on large data sets.
- Prediction of individual risks: fairer tariffs that are more profitable and, at the same time, more attractive for the majority of clients
- Individual customer scoring
- Individual optimisation of cross-selling
- Contract cancellation predictions, churn management
Insurers use only a small part of the treasure sleeping in their databases. Gigantic economic opportunities!
NeuroBayes Big Data Predictive Analytics for High Energy Physics & "Real Life
NeuroBayes Big Data Predictive Analytics for High Energy Physics & "Real Life Prof. Dr. Michael Feindt Karlsruhe Institute of Technology Founder & Chief Scientific Advisor, Blue Yonder GmbH&Co KG Blue
More informationThe Best from Two Worlds The Blue Yonder View on Data Analytics
The Best from Two Worlds The Blue Yonder View on Data Analytics Prof. Dr. Michael Feindt IEKP, Karlsruhe Institute of Technology Founder, Phi-T GmbH Founder & Chief Scientific Advisor, Blue Yonder GmbH
More informationMaximum Likelihood vs. Least Squares
Precondition Basis Maximum Likelihood vs. Least Squares pdf exactly known Height of pdf Mean and variance known Deviation from mean Efficiency Complexity Robustness Correlated measurements Special case
More informationNeuroBayes An advanced statistical tool for high energy physics and business
NeuroBayes An advanced statistical tool for high energy physics and business Prof. Dr. Michael Feindt CETA - Centrum für Elementarteilchen- und Astroteilchenphysik IEKP, Universität Karlsruhe Phi-T GmbH,
More informationSoftware for data analysis and accurate forecasting. Forecasts for Guaranteed Profits. The Predictive Analytics Software for Insurance Companies
Software for data analysis and accurate forecasting Forecasts for Guaranteed Profits The Predictive Analytics Software for Insurance Companies About Blue Yonder Blue Yonder, established in 2008, is the
More informationBlue Yonder Research Papers
Blue Yonder Research Papers Why cutting edge technology matters for Blue Yonder solutions Prof. Dr. Michael Feindt, Chief Scientific Advisor Abstract This article gives an overview of the stack of predictive
More informationNeural networks in data analysis
ISAPP Summer Institute 2009 Neural networks in data analysis Michal Kreps ISAPP Summer Institute 2009 M. Kreps, KIT Neural networks in data analysis p. 1/38 Outline What are the neural networks 1 Basic
More informationSoftware for data analysis and accurate forecasting. Forecasts for Certain Profits. The Predictive Analytics Software for Insurance Companies
Software for data analysis and accurate forecasting Forecasts for Certain Profits The Predictive Analytics Software for Insurance Companies About Blue Yonder Thanks to its highly successful NeuroBayes
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More informationBig Data. Fast Forward. Putting data to productive use
Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize
More informationProf. Dr. Michael Feindt KCETA - Centrum für Elementarteilchen- und Astroteilchenphysik IEKP, Universität Karlsruhe, KIT Phi-T GmbH, Karlsruhe
NeuroBayes et al.: professional methods for optimised reconstruction algorithms and statistical analysis Prof. Dr. Michael Feindt KCETA - Centrum für Elementarteilchen- und Astroteilchenphysik IEKP, Universität
More informationImproving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
More informationWhy is Internal Audit so Hard?
Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets
More informationData-Driven Decisions: Role of Operations Research in Business Analytics
Data-Driven Decisions: Role of Operations Research in Business Analytics Dr. Radhika Kulkarni Vice President, Advanced Analytics R&D SAS Institute April 11, 2011 Welcome to the World of Analytics! Lessons
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationINTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr.
INTELLIGENT ENERGY MANAGEMENT OF ELECTRICAL POWER SYSTEMS WITH DISTRIBUTED FEEDING ON THE BASIS OF FORECASTS OF DEMAND AND GENERATION Chr. Meisenbach M. Hable G. Winkler P. Meier Technology, Laboratory
More informationAutomated decision-making along the product life cycle saves OTTO millions
Customer Case Study RETAIL Automated decision-making along the product life cycle saves OTTO millions OTTO is a leader in Smart Data in German retail Overview Customer Online retailer for fashion and lifestyle
More informationDanny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank
Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»
More informationA Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
More informationNeural Network and Genetic Algorithm Based Trading Systems. Donn S. Fishbein, MD, PhD Neuroquant.com
Neural Network and Genetic Algorithm Based Trading Systems Donn S. Fishbein, MD, PhD Neuroquant.com Consider the challenge of constructing a financial market trading system using commonly available technical
More informationAnalecta Vol. 8, No. 2 ISSN 2064-7964
EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,
More informationOutline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationAdvanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
More informationCONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationChapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationIntroduction to Engineering Using Robotics Experiments Lecture 17 Big Data
Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems
More informationDriving Insurance World through Science - 1 - Murli D. Buluswar Chief Science Officer
Driving Insurance World through Science - 1 - Murli D. Buluswar Chief Science Officer What is The Science Team s Mission? 2 What Gap Do We Aspire to Address? ü The insurance industry is data rich but ü
More informationInsurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.
Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics
More informationCorporate Brochure. The best forecasts with Big Data. Software for data analysis and accurate forecasting
Corporate Brochure The best forecasts with Big Data Software for data analysis and accurate forecasting Content About Blue Yonder 3 Analyzing and Using Big Data 4 Blue Yonder Portfolio 5 Demand Planning
More informationBusiness Intelligence and Decision Support Systems
Chapter 12 Business Intelligence and Decision Support Systems Information Technology For Management 7 th Edition Turban & Volonino Based on lecture slides by L. Beaubien, Providence College John Wiley
More informationUsing Adaptive Random Trees (ART) for optimal scorecard segmentation
A FAIR ISAAC WHITE PAPER Using Adaptive Random Trees (ART) for optimal scorecard segmentation By Chris Ralph Analytic Science Director April 2006 Summary Segmented systems of models are widely recognized
More informationFrom Big Data to Smart Data Thomas Hahn
Siemens Future Forum @ HANNOVER MESSE 2014 From Big to Smart Hannover Messe 2014 The Evolution of Big Digital data ~ 1960 warehousing ~1986 ~1993 Big data analytics Mining ~2015 Stream processing Digital
More informationData Science Center Eindhoven. Big Data: Challenges and Opportunities for Mathematicians. Alessandro Di Bucchianico
Data Science Center Eindhoven Big Data: Challenges and Opportunities for Mathematicians Alessandro Di Bucchianico Dutch Mathematical Congress April 15, 2015 Contents 1. Big Data terminology 2. Various
More informationUsing reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators
More informationWebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
More informationHigh-Performance Analytics
High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends
More informationSURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
More informationSpin-Off from Physics Research to Business
From Delphi to Phi-T Spin-Off from Physics Research to Business Prof. Dr. Michael Feindt KCETA - Centrum für Elementarteilchen- und Astroteilchenphysik IEKP, Universität Karlsruhe, Karlsruhe Institute
More informationMagruder Statistics & Data Analysis
Magruder Statistics & Data Analysis Caution: There will be Equations! Based Closely On: Program Model The International Harmonized Protocol for the Proficiency Testing of Analytical Laboratories, 2006
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationHow To Use Blue Yonder'S Predictive Analytics Software
Blue Yonder in practice Successfully realize Industry 4.0 s potential with accurate forecasts and automated decision-making Examples of applications of Blue Yonder Predictive Analytics in industry Blue
More informationGrabbing Value from Big Data: Mining for Diamonds in Financial Services
Financial Services Grabbing Value from Big Data: Mining for Diamonds in Financial Services How financial services companies can harness the innovative power of big data 2 Grabbing Value from Big Data:
More informationA Property and Casualty Insurance Predictive Modeling Process in SAS
Paper 11422-2016 A Property and Casualty Insurance Predictive Modeling Process in SAS Mei Najim, Sedgwick Claim Management Services ABSTRACT Predictive analytics is an area that has been developing rapidly
More informationPractice#1(chapter1,2) Name
Practice#1(chapter1,2) Name Solve the problem. 1) The average age of the students in a statistics class is 22 years. Does this statement describe descriptive or inferential statistics? A) inferential statistics
More informationStatistical Challenges with Big Data in Management Science
Statistical Challenges with Big Data in Management Science Arnab Kumar Laha Indian Institute of Management Ahmedabad Analytics vs Reporting Competitive Advantage Reporting Prescriptive Analytics (Decision
More informationBIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
More informationData Mining mit der JMSL Numerical Library for Java Applications
Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale
More informationMonitoring chemical processes for early fault detection using multivariate data analysis methods
Bring data to life Monitoring chemical processes for early fault detection using multivariate data analysis methods by Dr Frank Westad, Chief Scientific Officer, CAMO Software Makers of CAMO 02 Monitoring
More informationBI SURVEY. The world s largest survey of business intelligence software users
1 The BI Survey 12 KPIs and Dashboards THE BI SURVEY 12 The Customer Verdict The world s largest survey of business intelligence software users 11 This document explains the definitions and calculation
More informationChoices, choices, choices... Which sequence database? Which modifications? What mass tolerance?
Optimization 1 Choices, choices, choices... Which sequence database? Which modifications? What mass tolerance? Where to begin? 2 Sequence Databases Swiss-prot MSDB, NCBI nr dbest Species specific ORFS
More informationHow to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
More informationReal-time PCR: Understanding C t
APPLICATION NOTE Real-Time PCR Real-time PCR: Understanding C t Real-time PCR, also called quantitative PCR or qpcr, can provide a simple and elegant method for determining the amount of a target sequence
More informationBig Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
More informationAP Physics 1 and 2 Lab Investigations
AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks
More informationBig Data and utility function in bank services. Nikolay K. Vitanov 1
Big Data and utility function in bank services Selected aspects Nikolay K. Vitanov 1 1 Institute of Mechanics, Bulgarian Academy of Sciences Sofia, 16. 06. 2015 Vitanov (BAS) Big Data and utility function
More informationEPSRC Cross-SAT Big Data Workshop: Well Sorted Materials
EPSRC Cross-SAT Big Data Workshop: Well Sorted Materials 5th August 2015 Contents Introduction 1 Dendrogram 2 Tree Map 3 Heat Map 4 Raw Group Data 5 For an online, interactive version of the visualisations
More informationBehavioral Segmentation
Behavioral Segmentation TM Contents 1. The Importance of Segmentation in Contemporary Marketing... 2 2. Traditional Methods of Segmentation and their Limitations... 2 2.1 Lack of Homogeneity... 3 2.2 Determining
More informationBig Data and Analytics:
responsive, credible, flexible Big Data and Analytics: New data sources create transformation opportunities Mike Davis Principal Analyst All images acknowledged msmd advisors Ltd 2012 1 Running order Why
Congrats to Game Winners. How can computation use data to solve problems? What topics have we covered in CS 202? Part 1: Completed!
CS 202: Introduction to Computation UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department Professor Andrea Arpaci-Dusseau How can computation use data to solve problems? Congrats to Game Winners
DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
See the wood for the trees
See the wood for the trees Dr. Harald Schöning Head of Research The world is becoming digital society government economy Digital Society Digital Government Digital Enterprise 2 Data is Getting Bigger
An Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
Navigating the big data challenge
Navigating the big data challenge Do you have lots of data but few insights? By Rasmus Wegener and Velu Sinha Rasmus Wegener is a partner with Bain & Company in Atlanta. Velu Sinha is a partner in Bain
Real-Time PCR: Understanding C T
Application note. Real-Time PCR: Understanding C T. [Figure: amplification plot of Rn versus cycle number, with the threshold and the exponential phase marked.]
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
Using Predictive Maintenance to Approach Zero Downtime
SAP Thought Leadership Paper Predictive Maintenance Using Predictive Maintenance to Approach Zero Downtime How Predictive Analytics Makes This Possible Table of Contents 4 Optimizing Machine Maintenance
How the Past Changes the Future of Fraud
How the Past Changes the Future of Fraud Addressing payment card fraud with models that evaluate multiple risk dimensions through intelligence Card fraud costs the U.S. card payments industry an estimated
Big Data Strategies Creating Customer Value In Utilities
Big Data Strategies Creating Customer Value In Utilities National Conference ICT For Energy And Utilities Sofia, October 2013 Valery Peykov Country CIO Bulgaria Veolia Environnement 17.10.2013 One Core
AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zettabytes (10^21 bytes) this
A Comparison of Decision Tree and Logistic Regression Model Xianzhe Chen, North Dakota State University, Fargo, ND
Paper D02-2009 A Comparison of Decision Tree and Logistic Regression Model Xianzhe Chen, North Dakota State University, Fargo, ND ABSTRACT This paper applies a decision tree model and logistic regression
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
CoolaData Predictive Analytics
CoolaData Predictive Analytics About CoolaData CoolaData empowers online companies to become proactive and predictive without having to develop, store, manage or monitor data themselves. It is an
Example: Credit card default, we may be more interested in predicting the probability of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Problem Solving and Data Analysis
Chapter 20 Problem Solving and Data Analysis The Problem Solving and Data Analysis section of the SAT Math Test assesses your ability to use your math understanding and skills to solve problems set in
Data Mining and Neural Networks in Stata
Data Mining and Neural Networks in Stata 2nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini and Maurizio Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it
Big Data, Official Statistics and Social Science Research: Emerging Data Challenges
Big Data, Official Statistics and Social Science Research: Emerging Data Challenges Professor Paul Cheung Director, United Nations Statistics Division Building the Global Information System Elements of
2015 Analyst and Advisor Summit. Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist
2015 Analyst and Advisor Summit Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist Agenda Key Facts Offerings and Capabilities Case Studies When to Engage
Easily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
Data Mining and Visualization
Data Mining and Visualization Jeremy Walton NAG Ltd, Oxford Overview Data mining components Functionality Example application Quality control Visualization Use of 3D Example application Market research
KNIME UGM 2014 Partner Session
KNIME UGM 2014 Partner Session DYMATRIX Stefan Weingaertner DYMATRIX CONSULTING GROUP 1 Agenda 1 Company Introduction 2 DYMATRIX Customer Intelligence Offering 3 PMML2SQL / PMML2SAS Converter 4 Uplift
Predict the Popularity of YouTube Videos Using Early View Data
Algorithmic Trading Session 1 Introduction. Oliver Steinki, CFA, FRM
Algorithmic Trading Session 1 Introduction Oliver Steinki, CFA, FRM Outline An Introduction to Algorithmic Trading Definition, Research Areas, Relevance and Applications General Trading Overview Goals
Streaming Analytics and the Internet of Things: Transportation and Logistics
Streaming Analytics and the Internet of Things: Transportation and Logistics FOOD WASTE AND THE IoT According to the Food and Agriculture Organization of the United Nations, every year about a third of
Data Centric Computing Revisited
Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data
Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com
Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 Contents 1 Ambiverse: Text to Knowledge 1.1 Text is all Around
About The Express Software Identification Database (ESID)
About The Express Software Identification Database (ESID) The Express Software Identification Database (ESID) is a comprehensive catalog of commercial and free PC software applications that run on Windows
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
AMS 5 CHANCE VARIABILITY
AMS 5 CHANCE VARIABILITY The Law of Averages When tossing a fair coin the chances of tails and heads are the same: 50% and 50%. So if the coin is tossed a large number of times, the number of heads and
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
ANALYTICS BUILT FOR INTERNET OF THINGS
ANALYTICS BUILT FOR INTERNET OF THINGS Big Data Reporting is Out, Actionable Insights are In In recent years, it has become clear that data in itself has little relevance, it is the analysis of it that
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Predictive modelling around the world 28.11.13
Predictive modelling around the world 28.11.13 Agenda Why this presentation is really interesting Introduction to predictive modelling Case studies Conclusions Why this presentation is really interesting
Signature Verification Why xyzmo offers the leading solution.
Dynamic (Biometric) Signature Verification The signature is the last remnant of the hand-written document in a digital world, and is considered an acceptable and trustworthy means of authenticating all
Big Data Introduction, Importance and Current Perspective of Challenges
International Journal of Advances in Engineering Science and Technology Available online at www.ijaestonline.com ISSN: 2319-1120 Big Data Introduction, Importance and Current Perspective of Challenges
Improve Cooperation in R&D. Catalyze Drug Repositioning. Optimize Clinical Trials. Respect Information Governance and Security
SINEQUA FOR LIFE SCIENCES DRIVE INNOVATION. ACCELERATE RESEARCH. SHORTEN TIME-TO-MARKET. 6 Ways to Leverage Big Data Search & Content Analytics for a Pharmaceutical Company Improve Cooperation in R&D Catalyze
Insights. Did we spot a black swan? Stochastic modelling in wealth management
Insights Did we spot a black swan? Stochastic modelling in wealth management The use of financial economic models has come under significant scrutiny over the last 12 months in the wake of credit and equity