Big Data in Banking: Hype or Future?


1 Big Data in Banking: Hype or Future? prof. David Martens

2 Agenda
What's new about Big Data?
Marketing applications in banking
o Response modeling using payment data
o Customer acquisition using browsing data
Risk management applications in banking
o Retail default prediction using Facebook data
o SME default prediction using board-of-director data
Challenges for the future in banking

3

4 What is Big Data?

5 What is Big Data?
Big Data: data so large that traditional data processing systems cannot handle it
o Storage and analysis

6 What is Big Data?
Hadoop
o Open-source framework for data-intensive distributed processing on commodity hardware
o Derived from the MapReduce and Google File System (GFS) papers
Big Data > Hadoop

7 What is Big Data?
Google
o 24,000 TB per day (2009)
Pinterest
o 20 TB per day
Twitter
o 12 GB per day or 800 tweets per second (2010)
PrediCube, spinoff UA
o 30 GB per day
Bank payment data
o 5-10 GB for all payment data of one year
Is banking data really big?
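To put the closing question in perspective, the figures on this slide can be compared directly. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB) and takes the upper end of the 5-10 GB range for bank payment data; the helper name `bank_years_per_day` is hypothetical.

```python
GB_PER_TB = 1000  # assumption: decimal units, the slide does not specify

# Daily volumes as quoted on the slide, converted to GB.
daily_volume_gb = {
    "Google (2009)": 24_000 * GB_PER_TB,
    "Pinterest": 20 * GB_PER_TB,
    "PrediCube": 30,
    "Twitter (2010)": 12,
}

BANK_GB_PER_YEAR = 10  # assumption: upper end of the 5-10 GB range


def bank_years_per_day(daily_gb, yearly_gb=BANK_GB_PER_YEAR):
    """How many years of bank payment data fit into one day of another source."""
    return daily_gb / yearly_gb


for source, gb in daily_volume_gb.items():
    years = bank_years_per_day(gb)
    print(f"{source}: one day is about {years:,.1f} years of bank payment data")
```

Under these assumptions, a single day of PrediCube browsing data already corresponds to roughly three years of payment data.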

8 What is Data Mining?
Data mining: automatic extraction of knowledge from data
Setting the scene with credit scoring example
[Diagram: Data, Data mining technique, Pattern, Predictions]

9 What is Big Data?
Transaction table (columns: Transaction ID, Items); item sets:
o Bread, Milk, Apple
o Bread, Milk, Eggs, Pen
o Cold Drink, Chocolate, Milk
o Bread, Orange
o Fish, Vegetables
o Paper, Pencil
o Meat, Oil, Milk
Data mining +
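A transaction table like this one is the classic input for market-basket style data mining. The following is a minimal sketch, not the method shown in the talk: it assumes the seven item sets as read from the slide (the grouping into baskets is inferred) and uses plain pair counting. The function names `pair_support` and `confidence` are illustrative.

```python
from collections import Counter
from itertools import combinations

# Item sets as read from the slide; the basket boundaries are an assumption.
transactions = [
    {"Bread", "Milk", "Apple"},
    {"Bread", "Milk", "Eggs", "Pen"},
    {"Cold Drink", "Chocolate", "Milk"},
    {"Bread", "Orange"},
    {"Fish", "Vegetables"},
    {"Paper", "Pencil"},
    {"Meat", "Oil", "Milk"},
]


def pair_support(baskets):
    """Count in how many baskets each unordered item pair occurs together."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
print(support[("Bread", "Milk")])                 # baskets with both items
print(confidence(transactions, "Bread", "Milk"))  # rule "Bread -> Milk"
```

On this toy table, Bread and Milk appear together in two baskets, and two of the three baskets with Bread also contain Milk.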

10 What is Big Data?

11 What is Big Data? Banking
The story of Signet Bank
o 1990
o Fairbank and Morris
o Model profitability, not just default
o No interest from big banks
o Signet Bank invested in data assets
o Huge success, credit operations spinoff
Now Capital One
Defining and leveraging data assets

13 Case Study 1: Mining Payment Data
From payment data to pseudo-social network
o 21 million transactions
o Response modeling
o Significant improvement over traditional modeling
[Figure: network linking customers (John, Alex, An, Pete, Jeff) to payment receivers (Little Bookstore, DeliC, Amazon, SportCenterX, EnergyInc)]
[David Martens, Foster Provost, Pseudo-social network targeting from consumer transaction data, New York University - Stern School of Business - Working paper CeDER Patent application PCT/US2011/028175]

14 Case Study 1: Mining Payment Data
From payment data to pseudo-social network
o 21 million transactions
o Response modeling
o Anonymized!
[Figure: network with anonymized identifiers: customers X00123, X00560, X10353, X00056, X11333; payment receivers M0011, M8463, M9963, M1365, M8005]
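The idea of turning payment data into a pseudo-social network can be sketched as follows: two customers are linked when they paid the same receiver, and a customer is scored by how many of their neighbors already responded. This is a simplified illustration and not necessarily the method of the cited working paper; the payment pairs are hypothetical (only the identifier style follows the slide), and `build_pseudo_social_network` and `neighbor_score` are illustrative names.

```python
from collections import defaultdict

# Hypothetical (customer, payment receiver) pairs; not the real links from the slide.
payments = [
    ("X00123", "M0011"), ("X00123", "M8463"),
    ("X00560", "M0011"), ("X00560", "M9963"),
    ("X10353", "M8463"), ("X10353", "M9963"),
    ("X00056", "M1365"),
    ("X11333", "M1365"), ("X11333", "M8005"),
]


def build_pseudo_social_network(pairs):
    """Link customers who paid a common receiver; weight = number of shared receivers."""
    customers_by_receiver = defaultdict(set)
    for customer, receiver in pairs:
        customers_by_receiver[receiver].add(customer)
    weights = defaultdict(int)
    for customers in customers_by_receiver.values():
        ordered = sorted(customers)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                weights[(first, second)] += 1
    return dict(weights)


def neighbor_score(network, customer, known_responders):
    """Weighted share of a customer's neighbors that are known responders."""
    total = 0
    hits = 0
    for (first, second), weight in network.items():
        if customer in (first, second):
            other = second if first == customer else first
            total += weight
            if other in known_responders:
                hits += weight
    return hits / total if total else 0.0


network = build_pseudo_social_network(payments)
print(network)
print(neighbor_score(network, "X00560", known_responders={"X00123"}))
```

Because only anonymized identifiers are needed to build the links, the approach works on the anonymized data shown on this slide.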

15 Case Study 1: Mining Payment Data
Application: response modeling
o Target variable on two products: pension fund and long-term deposit account
o Socio-demographic data: 289 variables (socio-demographic, product possession, product use, customer behavior)
When sending the offer to the top 1%: 3 times more conversions!
Variety of (already available) data improves performance
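A "top 1%" comparison of this kind is typically computed by ranking customers by model score and counting conversions among the highest-ranked ones. The sketch below assumes the comparison is between a baseline model and a new model on the same customers (the slide does not state the baseline); the toy scores and the function names `conversions_in_top` and `relative_gain` are hypothetical.

```python
def conversions_in_top(scores, converted, fraction):
    """Count conversions among the top `fraction` of customers ranked by score."""
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, converted), key=lambda pair: pair[0], reverse=True)
    return sum(outcome for _, outcome in ranked[:n_top])


def relative_gain(baseline_scores, new_scores, converted, fraction):
    """Ratio of conversions found by the new model versus the baseline."""
    base = conversions_in_top(baseline_scores, converted, fraction)
    new = conversions_in_top(new_scores, converted, fraction)
    return new / base if base else float("inf")


# Toy data: 10 customers, 1 = converted. Top 30% is used because the sample is tiny.
converted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
baseline = [0.9, 0.2, 0.1, 0.8, 0.7, 0.3, 0.3, 0.2, 0.1, 0.1]
improved = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1]

print(conversions_in_top(baseline, converted, 0.3))
print(conversions_in_top(improved, converted, 0.3))
print(relative_gain(baseline, improved, converted, 0.3))
```

In this toy example the improved scores place all three converters in the top group, against one for the baseline, which is a threefold gain.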

16 Case Study 1: Mining Payment Data
Is Bigger Data Better?
SD: using the bank's structured data
Bigger is not better!
[Plot: x-axis: size of data set used for training (% of 1.2 million consumers in total)]
[Junqué de Fortuny, Martens and Provost, Predictive Modeling with Big Data: Is Bigger Really Better? Big Data Journal 1(4): ]

17 Case Study 1: Mining Payment Data
Is Bigger Data Better?
Bigger is better!
[Two plots: x-axis: % of a data set of 1.2 million consumers used for training]
o PSN: prediction based on data on fine-grained behavior
o SD: traditional predictive modeling based on socio-demographic data
o PSN + SD: ensemble model combining both
Volume of (already available) data improves performance

18 Case Study 2: Customer Acquisition
Predict product interest based on web browsing data
o Show ad only to those who are predicted to be interested
o Spinoff of UA
Data: 1 billion records of persons visiting webpages
Results: % conversions
Work done with Dieter Devlaminck (PrediCube) www.predicube.com

20 Case Study 3: Default prediction with Facebook data
Default prediction with Facebook data for micro-finance
o In collaboration with NY-based Lenddo
o Philippines, $ loans
o Facebook data (opt-in): Messages, Likes, Sociodemo, Tags, Friends
Work done with Sofie De Cnudde, Ellen Tobback, Julie Moeyersoms, Marija Stankova (Universiteit Antwerpen) and Vinayak Javaly (Lenddo)

21 Case Study 3: Default prediction with Facebook data
Friends network
[Figure: friends network showing a cluster of befriended defaulters and a cluster of befriended non-defaulters; legend: friends, defaulter, non-defaulter]

22 Case Study 3: Default prediction with Facebook data
Liking data
[Figure: liking data; legend: Facebook page like, defaulter, non-defaulter]

23 Case Study 3: Default prediction with Facebook data
Facebook data is very predictive of default
[Plot: accuracy of predictions]
Behavioral data is more valuable than social network data

24 Case Study 4: SME default prediction
SME default prediction
Data (Belfirst)
o SMEs
o Financial + board members and managers
Network of companies
Work done with Ellen Tobback, Julie Moeyersoms, Marija Stankova (Universiteit Antwerpen)

25 Case Study 4: SME default prediction
Predictions based on profile of connected companies
[Figure: example network of connected defaulting SMEs (marked "Def"), due to two directors]

26 Case Study 4: SME default prediction Big Data? Just 1% of the data

27 Case Study 4: SME default prediction Sample network of connected defaulting SMEs, with three clusters due to three persons

28 Case Study 4: SME default prediction
Predictive power for default prediction
o If we consider the riskiest 1% of SMEs, we can find 58% more defaulters
Variety of (already available) data improves performance and insight

30 Privacy! Hey, you're having a baby! Target

31 Privacy vs data as an asset - a spectrum
o Using and selling all data
o Selling payment data
o Selling socio-demo data
o Selling anonymized data
o Using Facebook data
o Tracking users online
o Using anonymized payment data for marketing
o Using anonymized payment data for risk management
o Using socio-demo and credit data for risk management
o Nothing

32 Competition from Google & Co? Mainly in payment Data Fees and data asset Leveraging their existing data asset

33 Competition from FinTech?
o Niche products
o Loyal Belgians: ability of banks to retain their customers
o FinTech startups: collaborate or acquire

34 References
Data Science for Business
o By Foster Provost and Tom Fawcett
o O'Reilly
Predictive Analytics: Techniques and Applications in Credit Risk Modelling
o By Tony van Gestel, Bart Baesens and David Martens
o Oxford University Press

35 Conclusion
Big Data in Banking: Hype or Future?
o Both!
Big Data
o Banks already quite advanced in data analyses
o Don't be fooled by the hype
What to do?
o Define and leverage data assets!
o Collaborate with emerging FinTech startups
o Main challenge: attracting data scientists

36 Q&A
Prof. dr. ir. David Martens
Applied Data Mining
Faculteit Toegepaste Economische Wetenschappen
Universiteit Antwerpen

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Hadoop & SAS Data Loader for Hadoop

Hadoop & SAS Data Loader for Hadoop Turning Data into Value Hadoop & SAS Data Loader for Hadoop Sebastiaan Schaap Frederik Vandenberghe Agenda What s Hadoop SAS Data management: Traditional In-Database In-Memory The Hadoop analytics lifecycle

More information

Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014

Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014 What is Big Data? Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rijmenam, Think Bigger, 2014 Data in the Twentieth Century and before In 1663,

More information

Big Data, Applied. Overview of Big Data from our experiences at Telefonica s digital services. Jose Luis Agundez. @ciberjos

Big Data, Applied. Overview of Big Data from our experiences at Telefonica s digital services. Jose Luis Agundez. @ciberjos Big Data, Applied Overview of Big Data from our experiences at Telefonica s digital services Jose Luis Agundez @ciberjos September 2014 Agenda 1. What s the big deal with Big Data? A new information business

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

More information

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 FREE echapter C H A P T E R1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 percent of the data in the

More information

The big data revolution

The big data revolution The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

More information

Distributed Systems. Lec 2: Example use cases: Cloud computing, Big data, Web services

Distributed Systems. Lec 2: Example use cases: Cloud computing, Big data, Web services Distributed Systems Lec 2: Example use cases: Cloud computing, Big data, Web services 1 Example Use Cases Cloud computing (today) What it means and how it began Big data (today) Role of distributed systems

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Big data and its transformational effects

Big data and its transformational effects Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven

More information

Leveraging Big Social Data

Leveraging Big Social Data Leveraging Big Social Data Leveraging Big Social Data New ways of processing and analyzing Big Data have led to innovations across many industries from software that can diagnose Parkinson s to earthquake

More information

Leveraging unstructured data for improved decision making: A retail banking perspective

Leveraging unstructured data for improved decision making: A retail banking perspective View Point Leveraging unstructured data for improved decision making: A retail banking perspective - Sowmya Ramachandran and Kalyan Malladi Overview Up until now, despite possessing a large stash of structured

More information

Driving Better Marketing Results with Big Data and Analytics David Corrigan, IBM, Director of Product Marketing

Driving Better Marketing Results with Big Data and Analytics David Corrigan, IBM, Director of Product Marketing Driving Better Marketing Results with Big Data and Analytics David Corrigan, IBM, Director of Product Marketing Optimizing Marketing with Big Data and Analytics Leverage Social Media Datacentric Marketing

More information

Big Analytics: A Next Generation Roadmap

Big Analytics: A Next Generation Roadmap Big Analytics: A Next Generation Roadmap Cloud Developers Summit & Expo: October 1, 2014 Neil Fox, CTO: SoftServe, Inc. 2014 SoftServe, Inc. Remember Life Before The Web? 1994 Even Revolutions Take Time

More information

The? Data: Introduction and Future

The? Data: Introduction and Future The? Data: Introduction and Future Husnu Sensoy Global Maksimum Data & Information Technologies Global Maksimum Data & Information Technologies The Data Company Massive Data Unstructured Data Insight Information

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Harnessing Digital. November 2014

Harnessing Digital. November 2014 Harnessing Digital November 2014 Who is WSI? Founded in 1995 World s largest digital agency network 1000+ offices Operating in 87 Countries 2014 WSI. All rights reserved. Our Corporate Partners Award Winning

More information

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

Big Data and Analytics

Big Data and Analytics Big Data and Analytics Industry Landscape Big Data Everywhere! BIG DATA Data that is TOO LARGE & TOO COMPLEX for conventional data tools to capture, store and analyze. The 3V s of Big Data Shares traded

More information

Social Networks in Data Mining: Challenges and Applications

Social Networks in Data Mining: Challenges and Applications Social Networks in Data Mining: Challenges and Applications SAS Talks May 10, 2012 PLEASE STAND BY Today s event will begin at 1:00pm EST. The audio portion of the presentation will be heard through your

More information

SOCIAL MEDIA FOR MSMEs A turning point. By DR. PRALAY DEY National Small Industries Corporation (NSIC)

SOCIAL MEDIA FOR MSMEs A turning point. By DR. PRALAY DEY National Small Industries Corporation (NSIC) SOCIAL MEDIA FOR MSMEs A turning point By DR. PRALAY DEY National Small Industries Corporation (NSIC) IMPORTANCE OF MSMEs in India 44+ MILLIONS UNITS 45% manufacturing output MSMEs in India 40% EXPORT

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Open source Google-style large scale data analysis with Hadoop

Open source Google-style large scale data analysis with Hadoop Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

More information

Social media has CHANGED THE WORLD as we know it by connecting people, ideas and products across the globe.

Social media has CHANGED THE WORLD as we know it by connecting people, ideas and products across the globe. Social Media for Retailers: Six Social Media Marketing Tips to Drive Online Sales........................................................ 2 Social media has CHANGED THE WORLD as we know it by connecting

More information

Harnessing the True Power of Data

Harnessing the True Power of Data Harnessing the True Power of Data Find out how financial institutions can leverage big data to better understand and transform the customer experience. 10101010101100011010100010110011100101010 0110001101010001011001110010101010001100

More information

A PERFORMANCE ANALYSIS of HADOOP CLUSTERS in OPENSTACK CLOUD and in REAL SYSTEM

A PERFORMANCE ANALYSIS of HADOOP CLUSTERS in OPENSTACK CLOUD and in REAL SYSTEM A PERFORMANCE ANALYSIS of HADOOP CLUSTERS in OPENSTACK CLOUD and in REAL SYSTEM Ramesh Maharjan and Manoj Shakya Department of Computer Science and Engineering Dhulikhel, Kavre, Nepal lazymesh@gmail.com,

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Copyright 2014, Neudesic. All rights reserved.

Copyright 2014, Neudesic. All rights reserved. 2 Accelerating Modernization Across the Enterprise Cloud Computing User Experience Enterprise Mobility Customer Relationship Management Business Analysis Managed Services Custom Application Development

More information

What is Big Data? Concepts, Ideas and Principles. Hitesh Dharamdasani

What is Big Data? Concepts, Ideas and Principles. Hitesh Dharamdasani What is Big Data? Concepts, Ideas and Principles Hitesh Dharamdasani # whoami Security Researcher, Malware Reversing Engineer, Developer GIT > George Mason > UC Berkeley > FireEye > On Stage Building Data-driven

More information

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current

More information

Big Data Big Deal? Salford Systems www.salford-systems.com

Big Data Big Deal? Salford Systems www.salford-systems.com Big Data Big Deal? Salford Systems www.salford-systems.com 2015 Copyright Salford Systems 2010-2015 Big Data Is The New In Thing Google trends as of September 24, 2015 Difficult to read trade press without

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

Big Data. Fast Forward. Putting data to productive use

Big Data. Fast Forward. Putting data to productive use Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme Big Data Analytics Prof. Dr. Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany 33. Sitzung des Arbeitskreises Informationstechnologie,

More information

Introduction to Big Data the four V's

Introduction to Big Data the four V's Chapter 1: Introduction to Big Data the four V's This chapter is mainly based on the Big Data script by Donald Kossmann and Nesime Tatbul (ETH Zürich) Big Data Management and Analytics 15 Goal of Today

More information

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Computing in clouds: Where we come from, Where we are, What we can, Where we go

Computing in clouds: Where we come from, Where we are, What we can, Where we go Computing in clouds: Where we come from, Where we are, What we can, Where we go Luc Bougé ENS Cachan/Rennes, IRISA, INRIA Biogenouest With help from many colleagues: Gabriel Antoniu, Guillaume Pierre,

More information

Professional Diploma in Digital Marketing

Professional Diploma in Digital Marketing Professional Diploma in Digital Marketing Agenda Day 1: Day 2: Day 3: Day 4: Day 5: to Digital Marketing Search Engine Optimisation Search Engine Marketing Email Marketing Digital Display Advertising Mobile

More information

Big Data and Open Data

Big Data and Open Data Big Data and Open Data Bebo White SLAC National Accelerator Laboratory/ Stanford University!! bebo@slac.stanford.edu dekabytes hectobytes Big Data IS a buzzword! The Data Deluge From the beginning of

More information

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

More information

Data Mining in the Swamp

Data Mining in the Swamp WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all

More information

Hadoop Usage At Yahoo! Milind Bhandarkar (milindb@yahoo-inc.com)

Hadoop Usage At Yahoo! Milind Bhandarkar (milindb@yahoo-inc.com) Hadoop Usage At Yahoo! Milind Bhandarkar (milindb@yahoo-inc.com) About Me Parallel Programming since 1989 High-Performance Scientific Computing 1989-2005, Data-Intensive Computing 2005 -... Hadoop Solutions

More information

Big Data Efficiencies That Will Transform Media Company Businesses

Big Data Efficiencies That Will Transform Media Company Businesses Big Data Efficiencies That Will Transform Media Company Businesses TV, digital and print media companies are getting ever-smarter about how to serve the diverse needs of viewers who consume content across

More information

Big Data Specialized Studies

Big Data Specialized Studies Information Technologies Programs Big Data Specialized Studies Accelerate Your Career extension.uci.edu/bigdata Offered in partnership with University of California, Irvine Extension s professional certificate

More information

U.S. Mobile Benchmark Report

U.S. Mobile Benchmark Report U.S. Mobile Benchmark Report ADOBE DIGITAL INDEX 2014 80% 40% Methodology Report based on aggregate and anonymous data across retail, media, entertainment, financial service, and travel websites. Behavioral

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Sell to the World! Part Two. Alex Kramer Business & Entrepreneurship Center

Sell to the World! Part Two. Alex Kramer Business & Entrepreneurship Center Sell to the World! Part Two Alex Kramer Business & Entrepreneurship Center Sponsored by: Alex Kramer Director, BEC at Cabrillo College alkramer@cabrillo.edu Round Robin: If you re looking for give me a

More information

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

Colleen s Interview With Ivan Kolev

Colleen s Interview With Ivan Kolev Colleen s Interview With Ivan Kolev COLLEEN: [TO MY READERS] Hello, everyone, today I d like to welcome you to my interview with Ivan Kolev (affectionately known as Coolice). Hi there, Ivan, and thank

More information

WHITE PAPER: DATA DRIVEN MARKETING DECISIONS IN THE RETAIL INDUSTRY

WHITE PAPER: DATA DRIVEN MARKETING DECISIONS IN THE RETAIL INDUSTRY WHITE PAPER: DATA DRIVEN MARKETING DECISIONS IN THE RETAIL INDUSTRY By: Dan Theirl Rubikloud Technologies Inc. www.rubikloud.com Prepared by: Laura Leslie Neil Laing Tiffany Hsiao SUMMARY: Data-driven

More information

Outline. What is Big data and where they come from? How we deal with Big data?

Outline. What is Big data and where they come from? How we deal with Big data? What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,

More information

Big Data and Privacy in a Digital World

Big Data and Privacy in a Digital World Big Data and Privacy in a Digital World Jose-Luis Agundez @ciberjos Blog: http://blog.digital.telefonica.com/?s=ciberjos 25th February 2015 300+ Million customers, 20+ countries, 90 year history Source:

More information

The Pioneer in Social Targeting. Marketing to Social Connections on the Web

The Pioneer in Social Targeting. Marketing to Social Connections on the Web The Pioneer in Social Targeting Marketing to Social Connections on the Web Powered by the explosion of social media on the web, Media6Degrees has pioneered a radical approach to ad targeting that moves

More information

Big Data & the Cloud: The Sum Is Greater Than the Parts

Big Data & the Cloud: The Sum Is Greater Than the Parts E-PAPER March 2014 Big Data & the Cloud: The Sum Is Greater Than the Parts Learn how to accelerate your move to the cloud and use big data to discover new hidden value for your business and your users.

More information

Opportunities with Predictive Analytics. Greg Leflar, Vice President greg.leflar@parivedasolutions.com

Opportunities with Predictive Analytics. Greg Leflar, Vice President greg.leflar@parivedasolutions.com Opportunities with Predictive Analytics Greg Leflar, Vice President greg.leflar@parivedasolutions.com Opportunities for Predictive Analytics We help you separate the Value from the Hype The field of predictive

More information

A Berkeley View of Big Data

A Berkeley View of Big Data A Berkeley View of Big Data Ion Stoica UC Berkeley BEARS February 17, 2011 Big Data is Massive Facebook: 130TB/day: user logs 200-400TB/day: 83 million pictures Google: > 25 PB/day processed data Data

More information

EXECUTIVE REPORT. Big Data and the 3 V s: Volume, Variety and Velocity

EXECUTIVE REPORT. Big Data and the 3 V s: Volume, Variety and Velocity EXECUTIVE REPORT Big Data and the 3 V s: Volume, Variety and Velocity The three V s are the defining properties of big data. It is critical to understand what these elements mean. The main point of the

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D. Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology

More information

Hadoop for Enterprises:

Hadoop for Enterprises: Hadoop for Enterprises: Overcoming the Major Challenges Introduction to Big Data Big Data are information assets that are high volume, velocity, and variety. Big Data demands cost-effective, innovative

More information

CPS 216: Advanced Database Systems (Data-intensive Computing Systems) Shivnath Babu

CPS 216: Advanced Database Systems (Data-intensive Computing Systems) Shivnath Babu CPS 216: Advanced Database Systems (Data-intensive Computing Systems) Shivnath Babu A Brief History Relational database management systems Time 1975-1985 1985-1995 1995-2005 Let us first see what a relational

More information

MapReduce, Hadoop and Amazon AWS

MapReduce, Hadoop and Amazon AWS MapReduce, Hadoop and Amazon AWS Yasser Ganjisaffar http://www.ics.uci.edu/~yganjisa February 2011 What is Hadoop? A software framework that supports data-intensive distributed applications. It enables

More information

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Agenda: Overview of big data; Distributed computation; User experience; Data management. Big data: What is it? Big Data is a loosely defined

Leveraging Big Data. A case study from Thomson Reuters

About the speakers: Chawapong Suriyajan, Development Group Leader; Sakol Suwinaitrakool, Senior Solution Architect. FOLLOW US: facebook.com/thomsonreutersthailand

Trends in Business Intelligence

The Impact of SaaS and Social Data. Shawn P. Rogers, Vice President Research ~ Business Intelligence Practice, Enterprise Management Associates. December 16, 2010. Speaker Shawn

Executive Summary. Introduction. Defining Big Data. The Importance of Big Data. Building a Big Data Platform...

Executive Summary; Introduction; Defining Big Data; The Importance of Big Data; Building a Big Data Platform; Infrastructure Requirements; Solution Spectrum; Oracle's Big Data

The key to knowing the best price is to fully understand consumer behavior.

A price optimization tool designed for small to mid-size companies to optimize infrastructure and determine the perfect price point per item in any given week. DEBORAH WEINSWIG, Executive Director - Head,

Big Data Scoring. April 2014

"There was 5 exabytes of information created between the dawn of civilization through 2003; that much information is now created every 2 days." - Eric Schmidt, Google CEO. Why

Hadoop IST 734 SS CHUNG

Introduction. What is Big Data? Bulk amount, unstructured. Lots of applications need to handle huge amounts of data (on the order of 500+ TB per day). If a regular machine needs to

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg

Objective of Talk: 1. Deliver a Primer on Big Data. 2. How does this emerging topic apply to Quality? 3.

Firebird meets NoSQL (Apache HBase) Case Study

Firebird Conference 2011, Luxembourg, 25.11.2011 - 26.11.2011. Thomas Steinmaurer DI, +43 7236 3343 896, thomas.steinmaurer@scch.at, www.scch.at. Michael Zwick DI

Social media has changed the world as we know it by connecting people, ideas and products across the globe.

Social Media for Retailers: Six Social Media Marketing Tips to Drive Online Sales

Harnessing the True Power of Data

EBOOK. Find Out How Financial Institutions Can Leverage Big Data to Better Understand and Transform the Customer Experience.

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

SOLUTION BRIEF. Business Challenge: Today's digital marketing teams are overwhelmed by the volume and variety of customer interaction

Tap into Hadoop and Other No SQL Sources

Presented by: Trishla Maru. What is Big Data really? The Three Vs of Big Data According to Gartner. Volume: Orders of magnitude bigger than conventional data

INTRO TO BIG DATA. Djoerd Hiemstra. http://www.cs.utwente.nl/~hiemstra. Big Data in Clinical Medicine, 30 June 2014

WHY BIG DATA? Source: http://en.wikipedia.org/wiki/mount_everest 19 May 2012: 234 people

Outstanding performance track record

Thematic approach uncovers best opportunities; Dynamically play into real world developments; Outstanding performance track record; Experienced management team. Global Themes that Shape the Future, April 2013

BIG DATA TRENDS AND TECHNOLOGIES

THE WORLD OF DATA IS CHANGING. Cloud. WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools.

CIS 4930/6930 Spring 2014 Introduction to Data Science /Data Intensive Computing. University of Florida, CISE Department Prof.

Daisy Zhe Wang. Map/Reduce: Simplified Data Processing on Large Clusters. Parallel/Distributed

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com

Outline: From big data to predictive analytics; Predictive Analytics vs. BI; Intelligent platforms; What can we do with it; The modeling process; Example

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014. Schedule: I. Introduction to big data

Age of Big data. Presented by: Mohammad Iqbal BCM -2014

Agenda: Big? Big evolution from Big? Units (name, symbol, value): Kilobyte, KB, 10^3; Megabyte, MB, 10^6; Gigabyte, GB, 10^9; Terabyte, TB, 10^12; Petabyte, PB, 10^15. BIG DATA: So large

Online analytics survey

SCREENERS. Qa. Can I confirm that your business has a website? Yes (1) / No (2). Any companies responding "No" were screened out from the survey. Qb. Can I confirm that you are in a position

The Next Wave of Data Management. Is Big Data The New Normal?

Table of Contents: Introduction; Separating Reality and Hype; Why Are Firms Making IT Investments In Big Data?; Trends In Data Management

AppSymphony White Paper

Secure Self-Service Analytics for Curated Digital Collections. Introduction: Optensity, Inc. offers a self-service analytic app composition platform, AppSymphony, which enables data

Social Media. Marketing Guide B2B

Introduction: Social media has revolutionised how people communicate and consume information online. By harnessing the power of the social media buzz and effectively incorporating

The Trends and Roadblocks in Retail e-commerce: A Recap of the 2012 etail West Conference by Mogreet

Each year, as technology and consumer adoption continue to evolve, retailer e-commerce teams are facing

Big Data Big Data/Data Analytics & Software Development

Danairat T., danairat@gmail.com, 081-559-1446. Agenda: Big Data Overview; Business Cases and Benefits; Hadoop Technology Architecture; Big Data Development

# Not a part of 1Z0-061 or 1Z0-144 Certification test, but very important technology in BIG DATA Analysis

Section 9: Case Study. Objectives of this Session: The Motivation For Hadoop; What problems exist with traditional large-scale computing systems; What requirements an alternative approach should have; How

BITKOM & NIK - Big Data: Where Are the Opportunities for Small and Medium-Sized Businesses?

The Big Data Buzz: big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database

Advanced Big Data Analytics with R and Hadoop

REVOLUTION ANALYTICS WHITE PAPER. 'Big Data' Analytics as a Competitive Advantage: Big Analytics delivers competitive advantage in two ways compared to the traditional

Annex: Concept Note. Big Data for Policy, Development and Official Statistics New York, 22 February 2013

Friday Seminar on Emerging Issues. How is Big Data different from just very large databases? Traditionally,

BIG DATA AND ANALYTICS

Björn Bjurling, bgb@sics.se; Daniel Gillblad, dgi@sics.se; Anders Holst, aho@sics.se. Swedish Institute of Computer Science. AGENDA: What is big data and analytics? and why one must bother

A Brief Outline on Bigdata Hadoop

Twinkle Gupta 1, Shruti Dixit 2. RGPV, Department of Computer Science and Engineering, Acropolis Institute of Technology and Research, Indore, India. Abstract: Bigdata is

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

A PATH FOR HORIZING YOUR INNOVATIVE WORK. A REVIEW ON BIG DATA MANAGEMENT AND ITS SECURITY. PRUTHVIKA S. KADU 1, DR. H. R.

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

TDWI Vancouver. Agenda: What really is Big Data? How do we separate hype from reality? How does that relate