R YOU READY FOR PYTHON? Sunday 19th April, 2015


1 R YOU READY FOR PYTHON? Sunday 19th April, 2015

2 THIS IS NOT A PYTHON VS R TALK credits -

3 WHO ARE WE?
Danilo Maurizio, Advanced Analytics Division: data must be attended to give them confidence
Gianluca Emireni, Advanced Analytics Senior: a life spent between good (statistics) and evil (IT)
"@horsa.it, ".join(["danilo.maurizio","gianluca.emireni",""])[:-2]

4 WHO ARE WE? :-)

5 THE MOST COMPLETE REFERENCE :-)
Data scientist: Average programmer looking for a job that pays as much as what a top programmer would get. Sometimes also goes by the name "data analyst".
Statistician: Mathematician who can't program.
"Correlation does not imply causation": We looked at the wrong data set and can't draw any conclusions from it. Often represented in a graph to create the illusion of adding value.
Machine Learning: Statistical technique used by the sales and marketing department of big data vendors to secure their yearly bonus. (Also see "Our company is big data ready".)
Hadoop: Open-source software used for distributed computing. Data scientists seem to have a quota to drop the name every two sentences when talking big data, but most only know the logo is a yellow elephant.
"There is a significant effect but...": Sentence start used by data scientists or statisticians when they've put weeks of work into their analysis, the results look fishy and not as expected, and there is no time to redo the analysis.

6 spurious correlation - ON TORTURING DATA

7 WE PREFER THIS: DRAWING INFERENCES FROM THE DATA
"Data scientist" is sometimes used as an excuse for ignorance, as in "I don't understand probability and all that stuff, but I don't need to because I'm a data scientist, not a statistician."
"Data science" could be a useful umbrella term for statistics, machine learning, decision theory, etc.
Also, the title "data scientist" is rightfully associated with people who have better computational skills than statisticians typically have.
John D. Cook:

8 WHAT WE DO
We draw inference from data using:
- R
- Python
- legacy stats suites (SAS, SPSS, SAP predictive stack, STATA)

9 COMFORTABLE WITH OPEN SOURCE
project     estimated effort (COCOMO model)
python      $ 15 Mio.
pandas      $ 2.5 Mio.
R           $ 12 Mio.
RStudio     $ 6 Mio.
Spark       $ 5.3 Mio.
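The slide only says that the figures come from a COCOMO model; it does not say which variant or which inputs were used. As a rough illustration, here is a minimal sketch of the Basic COCOMO effort formula (organic mode, effort in person-months = 2.4 * KLOC^1.05). The line count and the yearly salary in the usage lines are hypothetical placeholders, not the values behind the slide.

```python
def cocomo_basic_effort(kloc, a=2.4, b=1.05):
    """Basic COCOMO effort in person-months (organic-mode coefficients by default)."""
    return a * kloc ** b


def cocomo_cost(kloc, yearly_salary):
    """Turn the effort estimate into a cost, given an assumed average yearly salary."""
    person_years = cocomo_basic_effort(kloc) / 12.0
    return person_years * yearly_salary


# Hypothetical inputs: 200 thousand lines of code, 55,000 USD per person-year.
estimated_cost = cocomo_cost(200, 55_000)
print(f"estimated cost: {estimated_cost:,.0f} USD")
```

Because the exponent is greater than 1, doubling the code size more than doubles the estimated effort, which is why large projects reach multi-million figures.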

10 More and more often we come back to the central question: how do we balance and mix the best of R and Python?

11 WHEN PYTHON We slowly and gradually moved all of our data management to Python (ETL and data movement)

12 AND WHEN R while remaining tied to R for statistical learning

13 PURSUING THE BEST BALANCE

14 RANDOM FORESTS THROUGH SCIKIT-LEARN ON REVERSE LOGISTICS
Fashion e-commerce has a huge problem with return rates - most of us have wives and credit cards :-)
Do purchase history and carts hold enough information to train a model? (We know that feature design/selection is the most time-consuming activity.)
We tried hard and succeeded :-)
Most important features: fit index, cart entropy, past shoppers' attitude, transaction value, product quantity, ...
Now deployed in real time as a web service with millisecond response times, using flask, circus and scikit-learn.
Used to dynamically set the shipping price.
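The slide does not show the model code. A minimal sketch of what training a random forest on return data might look like with scikit-learn follows. Everything about the data is an assumption: the feature names are hypothetical labels loosely based on the slide, and the orders are synthetic, generated by an invented rule only so that the example runs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names, loosely following the slide.
FEATURES = ["fit_index", "cart_entropy", "past_return_rate",
            "transaction_value", "product_quantity"]


def make_synthetic_orders(n=2000, seed=0):
    """Generate fake orders; the rule linking features to returns is invented."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, len(FEATURES)))
    # Invented rule: a good fit lowers, a high past return rate raises, the odds of a return.
    logit = -1.0 - 2.0 * X[:, 0] + 3.0 * X[:, 2]
    p = 1.0 / (1.0 + np.exp(-logit))
    y = (rng.random(n) < p).astype(int)
    return X, y


def train_return_model(X, y, seed=0):
    """Fit a random forest that predicts whether an order is returned."""
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X, y)
    return model


X, y = make_synthetic_orders()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
model = train_return_model(X_train, y_train)

# Probability of the "returned" class for each held-out order.
return_proba = model.predict_proba(X_test)[:, 1]

# Features ranked by the forest's impurity-based importance.
ranked = sorted(zip(FEATURES, model.feature_importances_), key=lambda t: -t[1])
```

The `predict_proba` output is the quantity a pricing rule could consume; how the probability maps to a shipping price is not described in the slides.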

15 WHY PYTHON? We love the caret library, an R counterpart of scikit-learn, but it is much easier to serve this kind of model over the web on a Python stack. 150 lines of code are enough to deliver a JSON document with a shopping bag's return probability.
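To give an idea of how little code such a service needs, here is a minimal Flask sketch, not the authors' implementation. The route name, the payload field and the scoring function are hypothetical; the placeholder scorer stands in for a call to a trained model's `predict_proba`, and process management (circus in the original setup) is not shown.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_return_probability(features):
    """Placeholder scorer; a real service would call a trained model here."""
    # Hypothetical rule so the sketch runs without a model file.
    score = 0.1 + 0.5 * float(features.get("past_return_rate", 0.0))
    return max(0.0, min(1.0, score))


@app.route("/return-probability", methods=["POST"])
def return_probability():
    """Read a JSON description of the shopping bag and answer with a JSON document."""
    payload = request.get_json(force=True)
    proba = predict_return_probability(payload)
    return jsonify({"return_probability": proba})
```

Flask's built-in test client can exercise the endpoint without starting a server, for example `app.test_client().post("/return-probability", json={"past_return_rate": 0.4})`.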

16 WHY R? In some contexts, CRAN (the R package repository) offers very mature packages able to solve whole classes of statistical problems. For example, time series forecasting and statistical matching (causal inference) are the kind of problems where R outperforms Python in terms of completeness, documentation, depth, ... The forecast library (Rob J. Hyndman) is the best-in-class package for time series modeling, not only for its code quality but also for its theoretical and methodological support. The MatchIt library and the CEM library (Iacus, King, Porro) offer a full range of techniques to perform statistical matching.

17 flickr CC - BETTER TOGETHER

18 WHY TOGETHER? The wide range of problems we are called on to face leads us to use both languages together: sometimes Python leads the analysis, other times R does. R and Python glued together:

19 flickr CC - WE ARE IN GOOD COMPANY

20 DO YOU KNOW STACKOVERFLOW? How many times, while googling for help, have you been led to Stack Overflow? Did you notice that R users and pydata users meet in a huge number of threads?

21 STACKOVERFLOW - #PYDATA

22 Disclaimer: not done with matplotlib YET ANOTHER TAG CLOUD #1/2

23 STACKOVERFLOW - #RSTATS

24 Disclaimer: not done with ggplot2 YET ANOTHER TAG CLOUD #2/2

25 QUESTIONS
How large is the group of Stack Overflow users active in both Python and R Q&A?
Is there any difference between the behavior of polyglots and that of their purist colleagues?
Are they, in the end, any smarter?

26 DATA CAPTURE Thanks to the StackExchange data explorer* we designed a set of queries to harvest information about questions and answers related to the tags associated with the Python data stack (scipy, numpy, pandas, scikit-learn) and with R (r, rstat, r-faq). On top of this data we built a user registry, labeling each user according to their joint (in)activity in these two domains. (*)
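The labeling step can be sketched in a few lines. This is only an illustration of the idea: the input shape (pairs of user id and tag list) and the label strings are assumptions, and the tag sets are copied from the slide.

```python
# Tag sets as listed on the slide.
PY_TAGS = {"scipy", "numpy", "pandas", "scikit-learn"}
R_TAGS = {"r", "rstat", "r-faq"}


def label_users(posts):
    """Build a user registry from (user_id, tags) pairs.

    Each user is labeled by the domains in which they posted:
    "pythonista", "useR", or "polyglot" when active in both.
    """
    active_py, active_r = set(), set()
    for user_id, tags in posts:
        tags = set(tags)
        if tags & PY_TAGS:
            active_py.add(user_id)
        if tags & R_TAGS:
            active_r.add(user_id)

    registry = {}
    for user_id in active_py | active_r:
        if user_id in active_py and user_id in active_r:
            registry[user_id] = "polyglot"
        elif user_id in active_py:
            registry[user_id] = "pythonista"
        else:
            registry[user_id] = "useR"
    return registry


# Hypothetical sample of posts.
sample_posts = [
    (1, ["pandas", "python"]),
    (2, ["r", "ggplot2"]),
    (3, ["numpy"]),
    (3, ["r-faq"]),
    (4, ["javascript"]),
]
registry = label_users(sample_posts)
```

Users who never posted under either tag set (user 4 in the sample) are simply left out of the registry.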

27 A STEADY GROWTH

28 GROUPS OVERLAP useRs Pythonistas

29 POLYGLOTS Stackoverflow users you can find in both groups

30 AND THEIR ACTIVITY useRs Pythonistas Q&A volumes

31 AVERAGE INTERACTIONS PER USER

32 WHO OWNS THE KNOWLEDGE? Pareto point of view for Pythonistas

33 WHO OWNS THE KNOWLEDGE? Pareto point of view for Pythonistas
- Pythonistas are users active on Stack Overflow for #pydata-related #tags
- 80% of the total number of answers is given by [...] (core) users
- more than [...] users have never answered a question

34 WHO OWNS THE KNOWLEDGE? Pareto point of view for useRs

35 WHO OWNS THE KNOWLEDGE? Pareto point of view for useRs
- useRs are users active on Stack Overflow for #rstats-related #tags
- 80% of the total number of answers is given by fewer than [...] (core) users
- more than [...] users have never answered a question
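The Pareto view on the two slides above boils down to two counts: how many of the most active users are needed to cover 80% of all answers, and how many users never answered. A minimal sketch, assuming the data is available as a mapping from user id to number of answers (a hypothetical input shape; the sample numbers are invented):

```python
def core_users_for_share(answers_per_user, share=0.8):
    """Smallest number of users whose answers cover `share` of all answers."""
    counts = sorted(answers_per_user.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0
    running = 0
    for n_users, count in enumerate(counts, start=1):
        running += count
        if running >= share * total:
            return n_users
    return len(counts)


def never_answered(answers_per_user):
    """Number of users with zero answers."""
    return sum(1 for count in answers_per_user.values() if count == 0)


# Invented sample: four users, one of whom only asks questions.
sample = {"ann": 50, "bob": 35, "cid": 15, "dee": 0}
core = core_users_for_share(sample)
silent = never_answered(sample)
```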

36 POLYGLOTS ARE PROBLEM SOLVERS?
We tried to explain the probability of a question being successfully closed with an accepted answer, using the following regressors:
- year and month of the question
- view counts, score, favorites and comments, summed by post
- number of answers given by polyglots and number of answers given by Pythonistas/useRs
- plus some tags used as dummies
The main question is: are polyglots smarter than R and Python purists?
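The slides do not name the model. A logistic regression is one natural choice for a binary outcome such as "has an accepted answer", so the sketch below assumes it. The data is synthetic, with the signs of the effects built in by hand, so it illustrates the mechanics of estimating coefficients and their z-scores and does not replicate the talk's results. Only two of the regressors are included.

```python
import numpy as np


def fit_logit(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    Returns the coefficients (intercept first) and their standard errors,
    taken from the inverse of the observed information matrix.
    """
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None])
        gradient = X.T @ (y - p)
        beta = beta + np.linalg.solve(hessian, gradient)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))


# Synthetic posts: the effect sizes below are invented for the example.
rng = np.random.default_rng(42)
n = 5000
polyglot_answers = rng.poisson(0.5, n)
purist_answers = rng.poisson(1.5, n)
true_logit = -0.5 + 0.8 * polyglot_answers - 0.2 * purist_answers
accepted = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

beta, se = fit_logit(np.column_stack([polyglot_answers, purist_answers]), accepted)
z_scores = beta / se  # |z| > 1.96 is significant at roughly the 5% level
```

A coefficient is called significantly greater than zero when its z-score is large and positive, which is the kind of statement the following slides make about polyglots' answers.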

37 POLYGLOTS VS PYTHONISTAS While answers from Pythonistas tend to reduce the probability of a post being closed, the effect of polyglots' contributions is positive and statistically significantly greater than zero.

38 POLYGLOTS vs useRs R-based posts also benefit from polyglots' interventions, which show a positive effect on the probability of a post being successfully closed.

39 TOWARDS A LESS TAUTOLOGICAL QUESTION flickr cc -

40 HOW LONG DOES IT TAKE TO ANSWER A QUESTION? How does the presence of Pythonistas, useRs and polyglots affect question lifetime?

41 YOUR #PYDATA QUESTIONS WILL BE ANSWERED IN 200* MINUTES OR NEVER
Most questions are solved almost immediately: 50% are closed within 40 minutes.
*3rd quartile

42 PYTHON QUESTIONS LIFETIME The presence of polyglots contributes to reducing question resolution time (Kaplan-Meier survival estimates)
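A Kaplan-Meier estimate treats each question as "surviving" until it is answered, and treats questions that are still open as censored observations. The following is a minimal pure-Python sketch of the estimator, under the assumption that each question comes with a duration in minutes and a flag telling whether it was answered; the sample values are invented.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve as a list of (time, survival) pairs.

    durations: minutes until the question was answered or observation ended
    observed:  True if the question was answered (event), False if censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all questions sharing the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            removed += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented sample: five questions, two of them still unanswered.
curve = kaplan_meier([10, 20, 20, 40, 60], [True, True, False, True, False])
```

Here the survival value at time t is the estimated share of questions still unanswered after t minutes; comparing the curves of different groups shows which group gets answers faster.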

43 YOUR #RSTATS QUESTIONS WILL BE ANSWERED IN 136* MINUTES OR NEVER
Most questions are solved almost immediately: 50% are closed within 32 minutes.
*3rd quartile

44 R QUESTIONS LIFETIME The presence of useRs and polyglots ensures the lowest response time (Kaplan-Meier survival estimates)

45 PYTAGS QUESTIONS LIFETIME 1/3

46 PYTAGS QUESTIONS LIFETIME 2/3

47 PYTAGS QUESTIONS LIFETIME 3/3

48 from greeting import thankyou

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2 DATA SCIENCE CURRICULUM Before class even begins, students start an at-home pre-work phase. When they convene in class, students spend the first eight weeks doing iterative, project-centered skill acquisition.

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Unlocking the True Value of Hadoop with Open Data Science

Unlocking the True Value of Hadoop with Open Data Science Unlocking the True Value of Hadoop with Open Data Science Kristopher Overholt Solution Architect Big Data Tech 2016 MinneAnalytics June 7, 2016 Overview Overview of Open Data Science Python and the Big

More information

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015

R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015 R Tools Evaluation A review by Analytics @ Global BI / Local & Regional Capabilities Telefónica CCDO May 2015 R Features What is? Most widely used data analysis software Used by 2M+ data scientists, statisticians

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

More information

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics

The Data Engineer. Mike Tamir Chief Science Officer Galvanize. Steven Miller Global Leader Academic Programs IBM Analytics The Data Engineer Mike Tamir Chief Science Officer Galvanize Steven Miller Global Leader Academic Programs IBM Analytics Alessandro Gagliardi Lead Faculty Galvanize Businesses are quickly realizing that

More information

Big Data Paradigms in Python

Big Data Paradigms in Python Big Data Paradigms in Python San Diego Data Science and R Users Group January 2014 Kevin Davenport! http://kldavenport.com kldavenportjr@gmail.com @KevinLDavenport Thank you to our sponsors: Setting up

More information

Google AdWords vs Google Analytics: Dissecting Remarketing Lists. Written by Carrie Albright, Senior Account Manager. hanapinmarketing.

Google AdWords vs Google Analytics: Dissecting Remarketing Lists. Written by Carrie Albright, Senior Account Manager. hanapinmarketing. Google AdWords vs Google Analytics: Dissecting Remarketing Lists Written by Carrie Albright, Senior Account Manager In PPC, the power of remarketing is undeniable. Being able to interact with those who

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically

More information

How To Write A Data Analysis Project

How To Write A Data Analysis Project Section 1. Data Analytics Lifecycle Overview The Data Analytics Lifecycle is designed specifically for Big Data problems and data science projects. The lifecycle has six phases, and project work can occur

More information

What is Data Science? Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014

What is Data Science? Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 What is Data Science? { Data, Databases, and the Extraction of Knowledge Renée T., @becomingdatasci, November 2014 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa

More information

Data Science Certificate Program

Data Science Certificate Program Information Technologies Programs Data Science Certificate Program Accelerate Your Career extension.uci.edu/datascience Offered in partnership with University of California, Irvine Extension s professional

More information

Big Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify

Big Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify Big Data at Spotify Anders Arpteg, Ph D Analytics Machine Learning, Spotify Quickly about me Quickly about Spotify What is all the data used for? Quickly about Spark Hadoop MR vs Spark Need for (distributed)

More information

SAP SE - Legal Requirements and Requirements

SAP SE - Legal Requirements and Requirements Finding the signals in the noise Niklas Packendorff @packendorff Solution Expert Analytics & Data Platform Legal disclaimer The information in this presentation is confidential and proprietary to SAP and

More information

Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A. www.statlit.org/pdf/2014-schield-dsi2-slides.pdf 1

Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A. www.statlit.org/pdf/2014-schield-dsi2-slides.pdf 1 Assessing the Proposed 2014 Statistics Curriculum 9/22/2013 V0A 1 Business Analytics vs. Data Science by Milo Schield Member: International Statistical Institute US Rep: International Statistical Literacy

More information

What is Data Science? Girl Develop It! Meetup Renée M. P. Teate, March 2015

What is Data Science? Girl Develop It! Meetup Renée M. P. Teate, March 2015 What is Data Science? { Girl Develop It! Meetup Renée M. P. Teate, March 2015 Let s start with: What is Data? http://upload.wikimedia.org/wikipedia/commons/f/f0/darpa _Big_Data.jpg https://encryptedtbn2.gstatic.com/images?q=tbn:and9gcs9dku3_tzi-swwyaqee5y0ehuvoiznsya_raknubbd0jyxpx7pw

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

How To Perform Predictive Analysis On Your Web Analytics Data In R 2.5

How To Perform Predictive Analysis On Your Web Analytics Data In R 2.5 How to perform predictive analysis on your web analytics tool data June 19 th, 2013 FREE Webinar by Before we start... www Q & A? Our speakers Carolina Araripe Inbound Marketing Strategist @Tatvic http://linkd.in/yazvvn

More information

Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm!

Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm! Session 85 IF, Predictive Analytics for Actuaries: Free Tools for Life and Health Care Analytics--R and Python: A New Paradigm! Moderator: David L. Snell, ASA, MAAA Presenters: Brian D. Holland, FSA, MAAA

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

Confidently Anticipate and Drive Better Business Outcomes

Confidently Anticipate and Drive Better Business Outcomes SAP Brief Analytics s from SAP SAP Predictive Analytics Objectives Confidently Anticipate and Drive Better Business Outcomes See the future more clearly with predictive analytics See the future more clearly

More information

Ibis: Scaling Python Analy=cs on Hadoop and Impala

Ibis: Scaling Python Analy=cs on Hadoop and Impala Ibis: Scaling Python Analy=cs on Hadoop and Impala Wes McKinney, Budapest BI Forum 2015-10- 14 @wesmckinn 1 Me R&D at Cloudera Serial creator of structured data tools / user interfaces Mathema=cian MIT

More information

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users.

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users. Bonus Chapter Ten Major Predictive Analytics Vendors In This Chapter Angoss FICO IBM RapidMiner Revolution Analytics Salford Systems SAP SAS StatSoft, Inc. TIBCO This chapter highlights ten of the major

More information

Extend your analytic capabilities with SAP Predictive Analysis

Extend your analytic capabilities with SAP Predictive Analysis September 9 11, 2013 Anaheim, California Extend your analytic capabilities with SAP Predictive Analysis Charles Gadalla Learning Points Advanced analytics strategy at SAP Simplifying predictive analytics

More information

CORPORATE OVERVIEW. Big Data. Shared. Simply. Securely.

CORPORATE OVERVIEW. Big Data. Shared. Simply. Securely. CORPORATE OVERVIEW Big Data. Shared. Simply. Securely. INTRODUCING PHEMI SYSTEMS PHEMI unlocks the power of your data with out-of-the-box privacy, sharing, and governance PHEMI Systems brings advanced

More information

A Non-Geek s. Hadoop and the Enterprise Data Warehouse. best practices. big DATA. by Tamara dull. a SAS Best Practices white paper

A Non-Geek s. Hadoop and the Enterprise Data Warehouse. best practices. big DATA. by Tamara dull. a SAS Best Practices white paper big DATA A Non-Geek s Big Data Playbook Hadoop and the Enterprise Data Warehouse a SAS Best Practices white paper best practices T H O UGHT PROVOKING BUSINESS by Tamara dull A Non-Geek s Big Data Playbook:

More information

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data 100 001 010 111 From Raw Data to 10011100 Actionable Insights with 00100111 MATLAB Analytics 01011100 11100001 1 Access and Explore Data For scientists the problem is not a lack of available but a deluge.

More information

Data structures for statistical computing in Python Wes McKinney SciPy 2010 McKinney () Statistical Data Structures in Python SciPy 2010 1 / 31 Environments for statistics and data analysis The usual suspects:

More information

Digital Analytics Checkup:

Digital Analytics Checkup: Digital Analytics Checkup: How to evaluate the impact of your web analytics data A Digital Marketing Depot White Paper Executive Summary Marketing organizations are being inundated with a greater volume,

More information

Maximize Revenues on your Customer Loyalty Program using Predictive Analytics

Maximize Revenues on your Customer Loyalty Program using Predictive Analytics Maximize Revenues on your Customer Loyalty Program using Predictive Analytics 27 th Feb 14 Free Webinar by Before we begin... www Q & A? Your Speakers @parikh_shachi Technical Analyst @tatvic Loves js

More information

Augmented Search for IT Data Analytics. New frontier in big log data analysis and application intelligence

Augmented Search for IT Data Analytics. New frontier in big log data analysis and application intelligence Augmented Search for IT Data Analytics New frontier in big log data analysis and application intelligence Business white paper May 2015 IT data is a general name to log data, IT metrics, application data,

More information

AcademyR Course Catalog

AcademyR Course Catalog AcademyR Course Catalog Table of Contents Our Philosophy...3 Courses Listed by Role Data Analyst...4 Data Scientist...6 R Programmer...9 Statistician.... 10 BI Developer... 11 System Administrator... 12

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

RESEARCH NOTE NETSUITE S IMPACT ON MANUFACTURING COMPANY PERFORMANCE

RESEARCH NOTE NETSUITE S IMPACT ON MANUFACTURING COMPANY PERFORMANCE Document K59 RESEARCH NOTE NETSUITE S IMPACT ON MANUFACTURING COMPANY PERFORMANCE THE BOTTOM LINE When Nucleus analysts investigated the use of NetSuite by manufacturers, they found these companies were

More information

Explode Six Direct Marketing Myths

Explode Six Direct Marketing Myths White Paper Explode Six Direct Marketing Myths Maximize Campaign Effectiveness with Analytic Technology Table of contents Introduction: Achieve high-precision marketing with analytics... 2 Myth #1: Simple

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Business Plan Strategy. John Debrincat

Business Plan Strategy. John Debrincat Business Plan Strategy John Debrincat Agenda Business models Plan to succeed Mission Strategy Technology Engagement Stakeholders Business Models Web Influencing Off-line Sales 80% of all web-influenced

More information

Augmented Search for Software Testing

Augmented Search for Software Testing Augmented Search for Software Testing For Testers, Developers, and QA Managers New frontier in big log data analysis and application intelligence Business white paper May 2015 During software testing cycles,

More information

MACHINE LEARNING IN HIGH ENERGY PHYSICS

MACHINE LEARNING IN HIGH ENERGY PHYSICS MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!

More information

Cloud Big Data Architectures

Cloud Big Data Architectures Cloud Big Data Architectures Lynn Langit QCon Sao Paulo, Brazil 2016 About this Workshop Real-world Cloud Scenarios w/aws, Azure and GCP 1. Big Data Solution Types 2. Data Pipelines 3. ETL and Visualization

More information

Machine Learning for Understanding User Behaviours. Semi-Supervised Learning Applied to Click Streams

Machine Learning for Understanding User Behaviours. Semi-Supervised Learning Applied to Click Streams Machine Learning for Understanding User Behaviours Semi-Supervised Learning Applied to Click Streams Goals Motivation for semi-supervised learning and log analytics Overall Methodology Leveraging Hadoop

More information

White Paper. Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices.

White Paper. Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices. White Paper Thirsting for Insight? Quench It With 5 Data Management for Analytics Best Practices. Contents Data Management: Why It s So Essential... 1 The Basics of Data Preparation... 1 1: Simplify Access

More information

Big Analytics: A Next Generation Roadmap

Big Analytics: A Next Generation Roadmap Big Analytics: A Next Generation Roadmap Cloud Developers Summit & Expo: October 1, 2014 Neil Fox, CTO: SoftServe, Inc. 2014 SoftServe, Inc. Remember Life Before The Web? 1994 Even Revolutions Take Time

More information

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014 Maximierung des Geschäftserfolgs durch SAP Predictive Analytics Andreas Forster, May 2014 Legal Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed

More information

Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate

Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Description The Helzberg School of Management has launched two graduate-level certificates: one in Data

More information

RESEARCH NOTE NETSUITE S IMPACT ON E-COMMERCE COMPANIES

RESEARCH NOTE NETSUITE S IMPACT ON E-COMMERCE COMPANIES Document L17 RESEARCH NOTE NETSUITE S IMPACT ON E-COMMERCE COMPANIES THE BOTTOM LINE Nucleus Research analyzed the activities of online retailers using NetSuite to assess the impact of the software on

More information

Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012

Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012 Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012 A Bright Idea Informatics/Analytics on Small and Big Data It works for: Old companies (GE, P&G, Marriott, Bank

More information

Statistical/ IT Skills

Statistical/ IT Skills Statistical/ IT Skills A Data Scientist must have or be able to quickly acquire a detailed knowledge and understanding of Big Data statistical methodology, concepts and research as they apply to the production

More information

Certificate Program in Applied Big Data Analytics in Dubai. A Collaborative Program offered by INSOFE and Synergy-BI

Certificate Program in Applied Big Data Analytics in Dubai. A Collaborative Program offered by INSOFE and Synergy-BI Certificate Program in Applied Big Data Analytics in Dubai A Collaborative Program offered by INSOFE and Synergy-BI Program Overview Today s manager needs to be extremely data savvy. They need to work

More information

KnowledgeSEEKER POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE

KnowledgeSEEKER POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE POWERFUL SEGMENTATION, STRATEGY DESIGN AND VISUALIZATION SOFTWARE Most Effective Modeling Application Designed to Address Business Challenges Applying a predictive strategy to reach a desired business

More information

Customer Case Study. Automatic Labs

Customer Case Study. Automatic Labs Customer Case Study Automatic Labs Customer Case Study Automatic Labs Benefits Validated product in days Completed complex queries in minutes Freed up 1 full-time data scientist Infrastructure savings

More information

SEYMOUR SLOAN IDEAS THAT MATTER

SEYMOUR SLOAN IDEAS THAT MATTER SEYMOUR SLOAN IDEAS THAT MATTER The value of Big Data: How analytics differentiate winners A DATA DRIVEN FUTURE Big data is fast becoming the term keeping senior executives up at night. The promise of

More information

ANACONDA. Open Source Modern Analytics Platform Powered by Python ANACONDA DELIVERS OPEN ENTERPRISE PYTHON KEY FEATURES WHY YOU LL LOVE ANACONDA

ANACONDA. Open Source Modern Analytics Platform Powered by Python ANACONDA DELIVERS OPEN ENTERPRISE PYTHON KEY FEATURES WHY YOU LL LOVE ANACONDA 1 Open Source Modern Analytics Platform Powered by Python KEY FEATURES 100% Open Source Modern Analytics Platform Powered by Python Single click installation Package management Works with Windows, OS X,

More information

SOCIAL MEDIA CAMPAIGNS

SOCIAL MEDIA CAMPAIGNS 5 BEST SOCIAL MEDIA CAMPAIGNS to Drive Customers Through Your Sales Funnel Stages a publication 5 Best Social Media Campaigns to Drive Customers Through Your Sales Funnel Stages 2 5 Best Social Media Campaigns

More information

THE THREE "Rs" OF PREDICTIVE ANALYTICS

THE THREE Rs OF PREDICTIVE ANALYTICS THE THREE "Rs" OF PREDICTIVE As companies commit to big data and data-driven decision making, the demand for predictive analytics has never been greater. While each day seems to bring another story of

More information

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.

More information

Big Data to trade bonds/fx & Python demo on FX intraday vol

Big Data to trade bonds/fx & Python demo on FX intraday vol Big Data to trade bonds/fx & Python demo on FX intraday vol Saeed Amen, Quantitative Strategist Managing Director & Co-founder of The Thalesians @thalesians / commentary around finance saeed@thalesians.com

More information

A Simple Guide to Churn Analysis

A Simple Guide to Churn Analysis A Simple Guide to Churn Analysis A Publication by Evergage Introduction Thank you for downloading A Simple Guide to Churn Analysis. The goal of this guide is to make analyzing churn easy, meaning you wont

More information

SAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise

SAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise Frequently Asked Questions SAP HANA Vora SAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise SAP HANA Vora software enables digital businesses to innovate and compete through in-the-moment

More information

Johan Hallberg Research Manager / Industry Analyst IDC Nordic Services & Sourcing Digital Transformation Global CIO Agenda

Johan Hallberg Research Manager / Industry Analyst IDC Nordic Services & Sourcing Digital Transformation Global CIO Agenda IDC s Big Data Predictions 2015 Johan Hallberg Research Manager / Industry Analyst IDC Nordic Services & Sourcing Digital Transformation Global CIO Agenda Big Data Opportunity: The Need for Deep Personalization

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

Intermediate Advanced All

Intermediate Advanced All DAY 1: JUNE 14 INSPIRE ME S & PANELS SPEECH BIG DATA = YOUR COMPETITIVE WEAPON 9:30 AM 10:15 AM Big Data has gone from trendy to critical over the past few years. It is now woven into every sector and

More information

Quantified Self: Analyzing the Big Data of our Daily Life. Andreas Schreiber <Andreas.Schreiber@dlr.de> PyData Berlin 2014

Quantified Self: Analyzing the Big Data of our Daily Life. Andreas Schreiber <Andreas.Schreiber@dlr.de> PyData Berlin 2014 DLR.de Chart 1 Quantified Self: Analyzing the Big Data of our Daily Life Andreas Schreiber PyData Berlin 2014 DLR.de Chart 2 Introduction Scientist, Head of department Co-Founder,

More information

web analytics ...and beyond Not just for beginners, We are interested in your thoughts:

web analytics ...and beyond Not just for beginners, We are interested in your thoughts: web analytics 201 Not just for beginners, This primer is designed to help clarify some of the major challenges faced by marketers today, such as:...and beyond -defining KPIs in a complex environment -organizing

More information
