Objectives
The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation. Portions © 2013 Project Botticelli Ltd & entire material © 2012 Microsoft Corp unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
Register on projectbotticelli.com Introduction to BI & Big Data DAX MDX Data Mining
Big data, or just complex data? Data: volume, velocity, variety, complexity (preparing, interpreting)
Today's big data, tomorrow's little data. Complexity vs. current capabilities. FAA International Flight Service Station, Honolulu, Hawaii, 1964 (Public Domain Image)
Common big data scenarios, by domain
Financial services: Modeling true risk; Threat analysis and fraud detection; Trade surveillance; Credit scoring and analysis
Media & Entertainment: Recommendation engines; Ad targeting; Search quality; Abuse and click fraud detection
Retail: Point of sales transaction analysis; Customer churn analysis; Sentiment analysis
Telecommunications: Customer churn prevention; Network performance optimization; Call Detail Record (CDR) analysis; Network failure prediction
Government: Cyber security (botnets, fraud); Traffic congestion and re-routing; Environmental monitoring; Antisocial monitoring via social media
Healthcare: Genomics research; Cancer research; Health pandemics early detection; Air quality monitoring
Big data + traditional BI = power & simplicity. Big, fast, or complex data; Microsoft HDInsight; SQL Server (tabular, multidimensional, relational DW, or PDW); interaction, exploration, visualisation
Microsoft HDInsight: Apache Hadoop distribution; developed by Hortonworks & Microsoft; integrated with Microsoft BI
Hadoop Principles: practical method for massive parallelisation of analytical data processing
Part 1: the job DEMO
Hadoop Principles: Data
Hadoop Principles: MapReduce
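As a minimal sketch of the MapReduce principle named on this slide, the following Python runs the three phases (map, shuffle, reduce) sequentially in a single process. The function name `map_reduce`, the sample log lines, and the status-code example are illustrative assumptions, not part of the original deck; a real Hadoop job would distribute the map and reduce calls across the nodes of a cluster.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run map, shuffle, and reduce sequentially in one process."""
    # Map: each record yields zero or more (key, value) pairs.
    # Shuffle: collect all values that share a key.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce: collapse each key's list of values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def status_mapper(line):
    # Hypothetical log format (an assumption): "<path> <status>"
    _path, status = line.split()
    yield status, 1


def count_reducer(_key, values):
    return sum(values)


# Made-up sample input, for illustration only.
sample_log = ["/home 200", "/missing 404", "/about 200"]
status_counts = map_reduce(sample_log, status_mapper, count_reducer)
```

Because each mapper call sees only one record and each reducer call sees only one key, both phases can in principle run on many machines at once, which is the property the slide refers to.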
Hadoop cluster
Hadoop cluster Buster Cluster, an early research project by Miles Osborne, University of Edinburgh, School of Informatics. Picture used with permission. http://homepages.inf.ed.ac.uk/miles/
Hadoop cluster: cloud rent-a-Hadoop-cluster, or: supercomputer for cents. Windows Azure HDInsight
Processing logic in HDInsight
JS MapReduce Wordcount
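The slide's JavaScript listing is not included in this text. As a rough stand-in, here is a word count sketched in Python in the style of a streaming mapper and reducer, where the reducer receives its input already sorted by key. The function names and the two sample lines are assumptions made for illustration, and `sorted()` plays the role that the framework's shuffle-and-sort step would play on a cluster.

```python
from itertools import groupby
from operator import itemgetter


def wordcount_map(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def wordcount_reduce(sorted_pairs):
    """Sum the counts per word; the input must be sorted by word."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def wordcount(lines):
    # sorted() stands in for the framework's shuffle-and-sort step.
    pairs = sorted(wordcount_map(lines))
    return dict(wordcount_reduce(pairs))


# Made-up sample input, for illustration only.
counts = wordcount(["big data", "Big fast data"])
```

The reducer relies on `groupby`, which only merges adjacent equal keys, so the sort between the two phases is essential rather than cosmetic.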
Pig Latin Example: It's All Parallel! [see http://pig.apache.org/docs/r0.7.0/tutorial.html]
Reusing processing logic libraries: collaborative filtering, recommenders, clustering, singular value decomposition, parallel frequent pattern mining, naive Bayes, decision tree
Part 2: the results DEMO
From HDInsight to attractive Microsoft BI
Operationalising Hadoop
Summary projectbotticelli.com video PPTs articles rafal.net