Big Data Use Cases. At Salesforce.com. Narayan Bharadwaj Director, Product Management


1 Big Data Use Cases At Salesforce.com Narayan Bharadwaj Director, Product Management

2 Safe harbor Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include but are not limited to risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-Q for the most recent fiscal quarter ended July 31. These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

3 Agenda Big Data use cases Technology Use case discussion Collaborative Filtering Q&A

4 Got Cloud Data? 130k customers Millions of users 800 million transactions/day Terabytes/day

5 Technology

6 Big Data Ecosystem

7 Data Science tools ecosystem Apache Pig Version=0.9.1

8 : Prashant Kommireddi Lars : Ian Varley

9 Big Data Use Cases Product Metrics User behavior analysis Capacity planning Monitoring intelligence Collections Query Runtime Prediction Early Warning System Collaborative Filtering Search Relevancy Internal App Product feature

10 Product Metrics

11 Product Metrics Problem Statement Track feature usage/adoption across 130k+ customers Eg: Accounts, Contacts, Visualforce, Apex, Track standard metrics across all features Eg: #Requests, #UniqueOrgs, #UniqueUsers, AvgResponseTime, Track features and metrics across all channels API, UI, Mobile Primary audience: Executives, Product Managers
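The standard metrics named on this slide can be illustrated with a minimal sketch. It assumes a hypothetical log record of (feature, channel, org ID, user ID, response time in ms); the record layout, the sample rows, and the function name `feature_metrics` are illustrative assumptions, not the actual Salesforce log format or pipeline.

```python
from collections import defaultdict

# Hypothetical log records: (feature, channel, org_id, user_id, response_ms).
LOGS = [
    ("Accounts", "UI", "org1", "u1", 120),
    ("Accounts", "API", "org1", "u2", 80),
    ("Accounts", "UI", "org2", "u3", 100),
    ("Contacts", "Mobile", "org1", "u1", 200),
]


def feature_metrics(logs):
    """Aggregate #Requests, #UniqueOrgs, #UniqueUsers, AvgResponseTime per feature."""
    acc = defaultdict(
        lambda: {"requests": 0, "orgs": set(), "users": set(), "total_ms": 0}
    )
    for feature, _channel, org, user, ms in logs:
        entry = acc[feature]
        entry["requests"] += 1
        entry["orgs"].add(org)
        entry["users"].add(user)
        entry["total_ms"] += ms
    return {
        feature: {
            "requests": e["requests"],
            "unique_orgs": len(e["orgs"]),
            "unique_users": len(e["users"]),
            "avg_response_ms": e["total_ms"] / e["requests"],
        }
        for feature, e in acc.items()
    }


METRICS = feature_metrics(LOGS)
```

Grouping by (feature, channel) instead of feature alone would give the per-channel breakdown (API, UI, Mobile) that the slide also asks for.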

12 Product Metrics Pipeline User Input (Page Layout) Collaboration (Chatter) Reports, Dashboards Feature Metrics (Custom Object) Trend Metrics (Custom Object) API Workflow Formula Fields API Client Machine Java Program Pig script generator Hadoop Workflow Log Pull Log Files

13 Visualization (Reports & Dashboards)

14 Visualization (Reports & Dashboards)

15 Collaborate, Iterate (Chatter)

16 User Behavior Analysis

17 Problem Statement How do we reduce number of clicks on the user interface? What are the top user click path sequences? What are the user clusters/personas? Approach: Markov transition for click path, D3.js visuals K-means (unsupervised) clustering for user groups

18 Markov Transitions for "Setup" pages

19 K-means clustering of "Setup" pages
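The Markov part of this approach can be sketched as follows. This is a minimal illustration under stated assumptions: the page names and sessions are hypothetical, and the function `transition_probabilities` is not from the talk. It estimates first-order transition probabilities by counting consecutive page pairs in each click path.

```python
from collections import Counter, defaultdict

# Hypothetical click paths, one list of page names per user session.
SESSIONS = [
    ["Home", "Setup", "Users", "Profiles"],
    ["Home", "Setup", "Users"],
    ["Home", "Setup", "Security"],
]


def transition_probabilities(sessions):
    """Estimate first-order Markov probabilities P(next page | current page)."""
    counts = defaultdict(Counter)
    for path in sessions:
        # Each consecutive pair of pages is one observed transition.
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return {
        page: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for page, nexts in counts.items()
    }


P = transition_probabilities(SESSIONS)
```

Sorting each row of `P` by probability gives the most likely next click from a page, which is one way to surface the top click path sequences.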

20 Collaborative Filtering Jed Crosby

21 Collaborative Filtering Problem Statement Show similar files within an organization Content-based approach Community-based approach

22 Popular File

23 Related File

24 We found this relationship using item-to-item collaborative filtering. Amazon published this algorithm in Amazon.com Recommendations: Item-to-Item Collaborative Filtering, by Gregory Linden, Brent Smith, and Jeremy York, IEEE Internet Computing, January-February. At Salesforce, we adapted this algorithm for Hadoop, and we use it to recommend files to view and users to follow.

25 Example: CF on 5 files Annual Report Vision Statement Dilbert Comic Darth Vader Cartoon Disk Usage Report

26 View History Table Miranda (CEO) Annual Report Vision Statement Dilbert Cartoon Darth Vader Cartoon Disk Usage Report Bob (CFO) Susan (Sales) Chun (Sales) Alice (IT)

27 Relationships between the files Annual Report Vision Statement Dilbert Cartoon Darth Vader Cartoon Disk Usage Report

28 Relationships between the files Annual Report 2 Vision Statement Dilbert Cartoon Darth Vader Cartoon 1 1 Disk Usage Report

29 Sorted relationships for each file
Annual Report: Dilbert (2), Vision Stmt. (2)
Vision Statement: Dilbert (3), Annual Rpt. (2), Darth Vader (1)
Dilbert Cartoon: Vision Stmt. (3), Darth Vader (3), Annual Rpt. (2), Disk Usage (1)
Darth Vader Cartoon: Dilbert (3), Vision Stmt. (1), Disk Usage (1)
Disk Usage Report: Dilbert (1), Darth Vader (1)
The popularity problem: notice that Dilbert appears first in every list. This is probably not what we want. The solution: divide the relationship tallies by file popularities.

30 Normalized relationships between the files Annual Report .82 Vision Statement Dilbert Cartoon Darth Vader Cartoon Disk Usage Report

31 Sorted relationships for each file, normalized by file popularities
Annual Report: Vision Stmt. (.82), Dilbert (.63)
Vision Statement: Annual Report (.82), Dilbert (.77), Darth Vader (.33)
Dilbert Cartoon: Darth Vader (.77), Vision Stmt. (.77), Annual Report (.63), Disk Usage (.45)
Darth Vader Cartoon: Dilbert (.77), Disk Usage (.58), Vision Stmt. (.33)
Disk Usage Report: Darth Vader (.58), Dilbert (.45)
High relationship tallies AND similar popularity values now drive closeness.

32 The item-to-item CF algorithm 1) Compute file popularities 2) Compute relationship tallies and divide by file popularities 3) Sort and store the results
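The three steps can be sketched in memory, without Hadoop, as follows. Two assumptions are labeled loudly: the view history is a reconstruction (only Miranda's views are spelled out in the deck; the other rows are one assignment that reproduces the tallies on the earlier slides), and "divide by file popularities" is read as dividing by the square root of the product of the two popularities, which matches the sqrt(3/5) value in the Dilbert/Darth Vader example. The function name `similar_files` and the alphabetical tie-break are illustrative choices.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Assumed view history. Only Miranda's row is stated in the deck; the other
# rows are one assignment consistent with the relationship tallies shown.
VIEWS = {
    "Miranda": ["Annual Report", "Vision Statement", "Dilbert"],
    "Bob": ["Annual Report", "Vision Statement", "Dilbert"],
    "Susan": ["Vision Statement", "Dilbert", "Darth Vader"],
    "Chun": ["Dilbert", "Darth Vader"],
    "Alice": ["Dilbert", "Darth Vader", "Disk Usage"],
}


def similar_files(views, top_n=2):
    # 1) File popularity = number of distinct users who viewed the file.
    popularity = Counter(f for files in views.values() for f in set(files))

    # 2) Tally co-views per file pair (alphabetical order avoids double
    #    counting), then divide by sqrt(popularity[a] * popularity[b]).
    tallies = Counter()
    for files in views.values():
        tallies.update(combinations(sorted(set(files)), 2))
    related = {}
    for (a, b), count in tallies.items():
        score = count / sqrt(popularity[a] * popularity[b])
        related.setdefault(a, []).append((b, score))
        related.setdefault(b, []).append((a, score))

    # 3) Sort neighbors by descending score (ties broken by name), keep top_n.
    return {
        f: sorted(neighbors, key=lambda p: (-p[1], p[0]))[:top_n]
        for f, neighbors in related.items()
    }


SIMILAR = similar_files(VIEWS)
```

With this assumed history, the top neighbor of Annual Report is Vision Statement with a score that rounds to .82, in line with the normalized table.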

33 MapReduce Overview Map Shuffle Reduce (adapted from http://code.google.com/p/mapreduce-framework/wiki/MapReduce)

34 1. Compute File Popularities <user, file> Inverse identity map <file, List<user>> Reduce <file, (user count)> Result is a table of (file, popularity) pairs that you store in the Hadoop distributed cache.

35 Example: File popularity for Dilbert (Miranda, Dilbert), (Bob, Dilbert), (Susan, Dilbert), (Chun, Dilbert), (Alice, Dilbert) Inverse identity map <Dilbert, {Miranda, Bob, Susan, Chun, Alice}> Reduce (Dilbert, 5)
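The Dilbert example can be traced with a small local simulation. This is a sketch, not Hadoop code: a dictionary stands in for the shuffle phase, and the function names `inverse_map`, `popularity_reduce`, and `run_popularity_job` are illustrative assumptions.

```python
from collections import defaultdict

# <user, file> input pairs from the Dilbert example.
VIEW_PAIRS = [
    ("Miranda", "Dilbert"),
    ("Bob", "Dilbert"),
    ("Susan", "Dilbert"),
    ("Chun", "Dilbert"),
    ("Alice", "Dilbert"),
]


def inverse_map(user, file):
    """Mapper: swap key and value so the shuffle groups users by file."""
    yield file, user


def popularity_reduce(file, users):
    """Reducer: popularity is the number of distinct users who viewed the file."""
    yield file, len(set(users))


def run_popularity_job(pairs):
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for user, file in pairs:
        for key, value in inverse_map(user, file):
            grouped[key].append(value)
    return dict(
        result
        for file, users in grouped.items()
        for result in popularity_reduce(file, users)
    )


POPULARITY = run_popularity_job(VIEW_PAIRS)
```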

36 2a. Compute relationship tallies - find all relationships in view history table <user, file> Identity map <user, List<file>> Reduce <(file1, file2), Integer(1)>, <(file1, file3), Integer(1)>, <(file(n-1), file(n)), Integer(1)> Relationships have their file IDs in alphabetical order to avoid double counting.

37 Example 2a: Miranda's (CEO) file relationship votes (Miranda, Annual Report), (Miranda, Vision Statement), (Miranda, Dilbert) Identity map <Miranda, {Annual Report, Vision Statement, Dilbert}> Reduce <(Annual Report, Dilbert), Integer(1)>, <(Annual Report, Vision Statement), Integer(1)>, <(Dilbert, Vision Statement), Integer(1)>
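Miranda's example can be reproduced with a short local sketch. It is an illustration under assumptions: grouping by user in a dictionary stands in for the identity map plus shuffle, and the function name `relationship_votes` is hypothetical. Sorting each user's files before pairing is what puts the file IDs of every relationship in alphabetical order.

```python
from collections import defaultdict
from itertools import combinations

# <user, file> pairs for Miranda, taken from the example.
MIRANDA_VIEWS = [
    ("Miranda", "Annual Report"),
    ("Miranda", "Vision Statement"),
    ("Miranda", "Dilbert"),
]


def relationship_votes(pairs):
    """Emit one <(file1, file2), 1> vote per pair of files viewed by the same user."""
    files_by_user = defaultdict(set)  # identity map + shuffle: group files by user
    for user, file in pairs:
        files_by_user[user].add(file)
    votes = []
    for user in sorted(files_by_user):
        # Sorted input makes combinations() emit each pair in alphabetical order,
        # so (A, B) and (B, A) can never both appear.
        for pair in combinations(sorted(files_by_user[user]), 2):
            votes.append((pair, 1))
    return votes


VOTES = relationship_votes(MIRANDA_VIEWS)
```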

38 2b. Tally the relationship votes - just a word count, where each relationship occurrence is a word <(file1, file2), Integer(1)> Identity map <(file1, file2), List<Integer(1)>> Reduce: count and divide by popularities <file1, (file2, similarity score)>, <file2, (file1, similarity score)> Note that we emit each result twice, one for each file that belongs to a relationship.

39 Example 2b: the Dilbert/Darth Vader relationship <(Dilbert, Vader), Integer(1)>, <(Dilbert, Vader), Integer(1)>, <(Dilbert, Vader), Integer(1)> Identity map <(Dilbert, Vader), {1, 1, 1}> Reduce: count and divide by popularities <Dilbert, (Vader, sqrt(3/5))>, <Vader, (Dilbert, sqrt(3/5))>
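The Dilbert/Darth Vader example can be checked with a minimal sketch. Assumptions: the popularity table is passed in as a plain dictionary (standing in for the distributed cache lookup), Vader's popularity of 3 is inferred from the sqrt(3/5) result rather than stated on the slide, and the function name `tally_and_normalize` is illustrative.

```python
from collections import defaultdict
from math import sqrt

# Popularities from step 1. Dilbert's value is given in the deck; Vader's is
# an inference: 3 / sqrt(5 * 3) equals the sqrt(3/5) shown in the example.
POPULARITY = {"Dilbert": 5, "Vader": 3}

# Three votes for the same relationship, as in the example.
VOTES = [(("Dilbert", "Vader"), 1)] * 3


def tally_and_normalize(votes, popularity):
    """Count votes per relationship, then divide by sqrt(pop1 * pop2)."""
    counts = defaultdict(int)
    for pair, one in votes:
        counts[pair] += one
    results = []
    for (file1, file2), count in sorted(counts.items()):
        score = count / sqrt(popularity[file1] * popularity[file2])
        # Emit the result twice, once keyed by each file in the relationship.
        results.append((file1, (file2, score)))
        results.append((file2, (file1, score)))
    return results


SCORES = tally_and_normalize(VOTES, POPULARITY)
```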

40 3. Sort and store results <file1, (file2, similarity score)> Identity map <file1, List<(file2, similarity score)>> Reduce <file1, {top n similar files}> Store the results in your location of choice

41 Example 3: Sorting the results for Dilbert <Dilbert, (Annual Report, .63)>, <Dilbert, (Vision Statement, .77)>, <Dilbert, (Disk Usage, .45)>, <Dilbert, (Darth Vader, .77)> Identity map <Dilbert, {(Annual Report, .63), (Vision Statement, .77), (Disk Usage, .45), (Darth Vader, .77)}> Reduce <Dilbert, {Darth Vader, Vision Statement}> (Top 2 files) Store results
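The sorting step for Dilbert can be sketched as below. The function name `top_n_similar` is hypothetical, and the alphabetical tie-break between equal scores is an assumption; the deck does not say how ties are ordered, although this choice happens to reproduce the order shown in the example.

```python
from collections import defaultdict

# <file1, (file2, similarity score)> records from the Dilbert example.
DILBERT_SCORES = [
    ("Dilbert", ("Annual Report", 0.63)),
    ("Dilbert", ("Vision Statement", 0.77)),
    ("Dilbert", ("Disk Usage", 0.45)),
    ("Dilbert", ("Darth Vader", 0.77)),
]


def top_n_similar(scored, n=2):
    """Group scores by file, sort by descending score, keep the n best names."""
    by_file = defaultdict(list)  # identity map + shuffle: group by file1
    for file1, (file2, score) in scored:
        by_file[file1].append((file2, score))
    return {
        file1: [
            name
            for name, _ in sorted(related, key=lambda p: (-p[1], p[0]))[:n]
        ]
        for file1, related in by_file.items()
    }


TOP = top_n_similar(DILBERT_SCORES)
```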

42 Appendix Cosine formula and normalization trick to avoid the distributed cache: $\cos\theta_{AB} = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{A}{\|A\|} \cdot \frac{B}{\|B\|}$ Mahout has CF. Asymptotic order of the algorithm is $O(M N^2)$ in the worst case, but is helped by sparsity.
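The identity behind the normalization trick can be checked numerically. This sketch rests on one reading of the slide: if each file's view vector is scaled to unit length first, the plain dot product already equals the cosine, so the tally step needs no popularity lookup. The Vader vector is an assumed view pattern with three viewers, and the helper names are illustrative.

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return sqrt(dot(a, a))


def unit(a):
    """Scale a vector to length 1."""
    length = norm(a)
    return [x / length for x in a]


# Binary view vectors over users (Miranda, Bob, Susan, Chun, Alice); 1 = viewed.
DILBERT = [1, 1, 1, 1, 1]
VADER = [0, 0, 1, 1, 1]  # assumed: three viewers, all of whom also viewed Dilbert

# Direct form: A . B / (|A| |B|), which needs both popularities at tally time.
COS_DIRECT = dot(DILBERT, VADER) / (norm(DILBERT) * norm(VADER))

# Pre-normalized form: (A / |A|) . (B / |B|), where the tally is a plain sum.
COS_PRENORMALIZED = dot(unit(DILBERT), unit(VADER))
```

For binary view vectors the norm of a file's vector is the square root of its popularity, which is why both forms agree with the sqrt(3/5) score in the Dilbert/Darth Vader example.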

43 Narayan Bharadwaj Director, Product



Data Algorithms. Mahmoud Parsian. Tokyo O'REILLY. Beijing. Boston Farnham Sebastopol Data Algorithms Mahmoud Parsian Beijing Boston Farnham Sebastopol Tokyo O'REILLY Table of Contents Foreword xix Preface xxi 1. Secondary Sort: Introduction 1 Solutions to the Secondary Sort Problem 3 Implementation

More information

Ankush Cluster Manager - Hadoop2 Technology User Guide

Ankush Cluster Manager - Hadoop2 Technology User Guide Ankush Cluster Manager - Hadoop2 Technology User Guide Ankush User Manual 1.5 Ankush User s Guide for Hadoop2, Version 1.5 This manual, and the accompanying software and other documentation, is protected

More information

Has been into training Big Data Hadoop and MongoDB from more than a year now

Has been into training Big Data Hadoop and MongoDB from more than a year now NAME NAMIT EXECUTIVE SUMMARY EXPERTISE DELIVERIES Around 10+ years of experience on Big Data Technologies such as Hadoop and MongoDB, Java, Python, Big Data Analytics, System Integration and Consulting

More information

A bit about Hadoop. Luca Pireddu. March 9, 2012. CRS4Distributed Computing Group. luca.pireddu@crs4.it (CRS4) Luca Pireddu March 9, 2012 1 / 18

A bit about Hadoop. Luca Pireddu. March 9, 2012. CRS4Distributed Computing Group. luca.pireddu@crs4.it (CRS4) Luca Pireddu March 9, 2012 1 / 18 A bit about Hadoop Luca Pireddu CRS4Distributed Computing Group March 9, 2012 luca.pireddu@crs4.it (CRS4) Luca Pireddu March 9, 2012 1 / 18 Often seen problems Often seen problems Low parallelism I/O is

More information

Salesforce Admin Course Content: Chapter 1 CRM Introduction Introduction to CRM? Why CRM?

Salesforce Admin Course Content: Chapter 1 CRM Introduction Introduction to CRM? Why CRM? Salesforce Admin Course Content: Chapter 1 CRM Introduction Introduction to CRM? Why CRM? Chapter 2 Introduction to Cloud Computing & Salesforce.com Cloud Computing - Overview What is Software-as-a-Service

More information

Performance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp

Performance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp Performance Management in Big Data Applica6ons Michael Kopp, Technology Strategist NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribu6on 2) Correct Schema and Access paperns 3)

More information

Exchange of experience from a SuccessFactors LMS Implementa9on

Exchange of experience from a SuccessFactors LMS Implementa9on Exchange of experience from a SuccessFactors LMS Implementa9on Seen from a user perspective Hanne Vasshus Ask Competency Management Cau9onary Statement The following presenta9on includes forward- looking

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

Machine- Learning Summer School - 2015

Machine- Learning Summer School - 2015 Machine- Learning Summer School - 2015 Big Data Programming David Franke Vast.com hbp://www.cs.utexas.edu/~dfranke/ Goals for Today Issues to address when you have big data Understand two popular big data

More information

Oracle Cloud Strategy

Oracle Cloud Strategy Oracle Cloud Strategy Mark Hurd June 25, 2014 Copyright 2014 Oracle and/or its affiliates. All rights reserved. Oracle Confiden?al Internal/Restricted/Highly Restricted 6 Safe Harbor Statement "Safe Harbor"

More information

Deploying Hadoop with Manager

Deploying Hadoop with Manager Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution

More information

Chatter Answers Implementation Guide

Chatter Answers Implementation Guide Chatter Answers Implementation Guide Salesforce, Winter 16 @salesforcedocs Last updated: October 16, 2015 Copyright 2000 2015 salesforce.com, inc. All rights reserved. Salesforce is a registered trademark

More information

The Evolu*on of Service Management

The Evolu*on of Service Management The Evolu*on of Extending Disciplines Across the Enterprise Michael Jones Regional CTO - Architecture Michael.Jones@servicenow.com 2015 Now All Rights Reserved 1 How work gets done today! Emails Spreadsheets

More information

What's New in SAS Data Management

What's New in SAS Data Management Paper SAS034-2014 What's New in SAS Data Management Nancy Rausch, SAS Institute Inc., Cary, NC; Mike Frost, SAS Institute Inc., Cary, NC, Mike Ames, SAS Institute Inc., Cary ABSTRACT The latest releases

More information

10605 BigML Assignment 4(a): Naive Bayes using Hadoop Streaming

10605 BigML Assignment 4(a): Naive Bayes using Hadoop Streaming 10605 BigML Assignment 4(a): Naive Bayes using Hadoop Streaming Due: Friday, Feb. 21, 2014 23:59 EST via Autolab Late submission with 50% credit: Sunday, Feb. 23, 2014 23:59 EST via Autolab Policy on Collaboration

More information

Streaming items through a cluster with Spark Streaming

Streaming items through a cluster with Spark Streaming Streaming items through a cluster with Spark Streaming Tathagata TD Das @tathadas CME 323: Distributed Algorithms and Optimization Stanford, May 6, 2015 Who am I? > Project Management Committee (PMC) member

More information

Recommendation Tool Using Collaborative Filtering

Recommendation Tool Using Collaborative Filtering Recommendation Tool Using Collaborative Filtering Aditya Mandhare 1, Soniya Nemade 2, M.Kiruthika 3 Student, Computer Engineering Department, FCRIT, Vashi, India 1 Student, Computer Engineering Department,

More information

Kaseya Fundamentals Workshop DAY THREE. Developed by Kaseya University. Powered by IT Scholars

Kaseya Fundamentals Workshop DAY THREE. Developed by Kaseya University. Powered by IT Scholars Kaseya Fundamentals Workshop DAY THREE Developed by Kaseya University Powered by IT Scholars Kaseya Version 6.5 Last updated March, 2014 Day Two Overview Day Two Lab Review Patch Management Configura;on

More information

Salesforce Integration

Salesforce Integration Salesforce Integration 2015 Bomgar Corporation. All rights reserved worldwide. BOMGAR and the BOMGAR logo are trademarks of Bomgar Corporation; other trademarks shown are the property of their respective

More information

Big Data Spatial Analytics An Introduction

Big Data Spatial Analytics An Introduction 2013 Esri International User Conference July 8 12, 2013 San Diego, California Technical Workshop Big Data Spatial Analytics An Introduction Marwa Mabrouk Mansour Raad Esri iu UC2013. Technical Workshop

More information

Student Project 2 - Apps Frequently Installed Together

Student Project 2 - Apps Frequently Installed Together Student Project 2 - Apps Frequently Installed Together 42matters is a rapidly growing start up, leading the development of next generation mobile user modeling technology. Our solutions are used by big

More information

BIG DATA SOLUTION DATA SHEET

BIG DATA SOLUTION DATA SHEET BIG DATA SOLUTION DATA SHEET Highlight. DATA SHEET HGrid247 BIG DATA SOLUTION Exploring your BIG DATA, get some deeper insight. It is possible! Another approach to access your BIG DATA with the latest

More information

Making big data simple with Databricks

Making big data simple with Databricks Making big data simple with Databricks We are Databricks, the company behind Spark Founded by the creators of Apache Spark in 2013 Data 75% Share of Spark code contributed by Databricks in 2014 Value Created

More information

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Introduction to Big Data! with Apache Spark UC#BERKELEY# Introduction to Big Data! with Apache Spark" UC#BERKELEY# This Lecture" The Big Data Problem" Hardware for Big Data" Distributing Work" Handling Failures and Slow Machines" Map Reduce and Complex Jobs"

More information

Distributed Computing and Big Data: Hadoop and MapReduce

Distributed Computing and Big Data: Hadoop and MapReduce Distributed Computing and Big Data: Hadoop and MapReduce Bill Keenan, Director Terry Heinze, Architect Thomson Reuters Research & Development Agenda R&D Overview Hadoop and MapReduce Overview Use Case:

More information

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763 International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

MPS & VPS: Not Just for Hos1ng!

MPS & VPS: Not Just for Hos1ng! MPS & VPS: Not Just for Hos1ng! Ivan Hur) Sr. Product Manager Verio Inc Privileged and Confiden/al: NDA Required for External Disclosure 2/11/10 1 Privileged and Confiden/al: NDA Required for External

More information

Oracle Data Miner (Extension of SQL Developer 4.0)

Oracle Data Miner (Extension of SQL Developer 4.0) An Oracle White Paper October 2013 Oracle Data Miner (Extension of SQL Developer 4.0) Generate a PL/SQL script for workflow deployment Denny Wong Oracle Data Mining Technologies 10 Van de Graff Drive Burlington,

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

Hadoop Development & BI- 0 to 100

Hadoop Development & BI- 0 to 100 Development Master the Data Analysis tools like Pig and hive Data Science Hadoop Development & BI- 0 to 100 Build a recommendation engine Hadoop Development - 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

THE STATE OF THE DATA WAREHOUSE

THE STATE OF THE DATA WAREHOUSE March 2015 Sponsored by Introduction As the volume and types of business data have increased at a phenomenal pace, and the cost to store that data has plummeted, businesses have looked to data analytics

More information