The Library (Big) Data Scientist

1 The Library (Big) Data Scientist. IFLA/ALA webinar: Big Data: new roles and opportunities for new librarians, June 15th 2016. IFLA Big Data Special Interest Group (SIG). Wouter Klapwijk, Stellenbosch University, SIG convenor

2 IFLA Big Data SIG
Proposed at WLIC 2014, Lyon
Petitioned for at WLIC 2015, Cape Town
Endorsed by the IFLA Professional Committee in December 2015
SIG sponsor: IT Section
Objectives:
1. Provide a focus point for developing ideas regarding Big Data as it affects libraries
2. Provide a platform within IFLA to assess and develop the avenues of response from IFLA to this developing area

3 Deconstructing Big Data and data science
This talk is based on a number of everyday questions:
1. What does data science mean?
   - Is it only happening in tech places like Facebook and Google?
2. What are Data Scientists?
   - Can librarians also be Data Scientists?
3. Is data science the science of Big Data?
   - What is the relationship between Big Data and data science?
4. Exactly what is Big Data anyway?
   - Just how big is Big? Or is Big relative? Is library data Big?

4 Data science
A set of fundamental principles that guide the extraction of knowledge from data
The "civil engineering of data": turning data into data products
Goal of data science:
- to improve decision-making, for the betterment of organizations and society at large

5 Relation to other engineering concepts
Data mining
- helps accomplish data science goals via technologies that incorporate data science principles
- but its techniques are much more extensive than the set of principles comprising data science
Data warehousing
- a facilitating technology for data mining
- but not always included as part of data mining

6 Relation to other computing concepts
Data processing
- is more general than data science
- there is processing involved in all aspects of computing
Big Data technologies
- are often used for data processing in support of data mining techniques
- and other data science activities

7 Science or Craft?
- The term data science has existed for over 30 years; it is a field unto itself
- Its foundation rests in the century-old practices of Statistics and Mathematics and, since the mid-20th century, also Computer Science
- It is not just a rebranding of Statistics and Machine Learning in the context of the tech industry
- Much of the field's development is happening in industry, and not in academia

8 Examples of data science products
Domain: Example
Internet: Recommendation systems (Amazon = books; Facebook = friends)
Finance: Credit ratings
Education: Personalized learning and assessment
Government: Policies based on data

9 Practicing data science: What do Data Scientists do?

10 The Data Scientist
Two aspects to consider:
1. Understand what they DO in business
2. Understand which SKILLS they must possess

11 What do they DO?
1. They ask questions
   - Probing, being curious
2. They (try to) solve problems
   - Analytical thinking, making new discoveries
3. They cultivate (new) soft skills
   - Communicating and visualizing data

12 What do they DO?
1. They ask questions
   - Probing, being curious
2. They (try to) solve problems
   - Analytical thinking, making new discoveries
3. They cultivate (new) soft skills
   - Communicating and visualizing data
How much of the above do you, as a librarian, do?

13 The library professional's genes?
Is it in our pedigree to continuously ask questions?
Do we have the taste and mindset for analytical thinking?
Do we only do ad hoc analysis, or do we prefer an ongoing conversation with data?
Is there enough of a Business Analyst or Social Scientist in us?

14 Which SKILLS do they need?
Data Scientist: soft skills, hard skills, domain-specific skills

15 Which SKILLS do they need?
Data Scientist:
- Soft skills: Communication
- Hard skills: Linear algebra, Statistics, Artificial Intelligence, Machine Learning
- Domain-specific skills: Understand the business, e.g. librarianship

16 Data Scientists are team members
Statistician, Mathematician, Systems administrator, Data programmer, Social scientist?
Different skills are embedded across multi-disciplinary team members

17 The Data Scientist team profile
Statistics, Mathematics, Computer Science, Domain expertise
It is important to understand data science even if you never intend to do it yourself

18 Practicing data science: What does the craft of data science look like?

19 3 disciplinary areas: SOURCES, ANALYTICS, VISUALIZATION
Diagram labels: DATA; COLLECT, CLEAN, INTEGRATE, PROCESS; COMMUNICATE

20 3 disciplinary areas: SOURCES, ANALYTICS, VISUALIZATION
Diagram labels: DATA; COLLECT, CLEAN, INTEGRATE, PROCESS; COMMUNICATE
Roles shown: Systems Administrator, Data Programmer, App designer
Each disciplinary area requires different skills

21 SOURCES
Eras: Database ( ), Data Warehouse ( ), Big Data ( )
Milestones:
- First integrated datastore (Bachman), 1963
- Relational data model (Codd), 1970
- SQL (Boyce & Chamberlin)
- First commercial RDBMS (Oracle), 1979
- DB2 (IBM), 1983
- First KDD workshop, 1989
- First KDD data mining conference (Fayyad, Shapiro), 1995
- NoSQL (Evans), 2009

22 SOURCES
SMALL DATA
- Databases, Data Warehouses
- Quantitative and qualitative
- Mostly structured and indexical
BIG DATA
- Mostly unstructured (80%)
- Various sources
- Needs to be related and combined
- Metadata, long tail data, A/V, logs, social, incomplete
Data taxonomy: some data are neither just big nor just small

23 SOURCES: Small data
- The term denotes the opposite of Big Data
- Data usually housed in databases and data warehouses
- Usually structured, qualitative and indexical in nature
- Examples: library data, research data (RDM)
- Research data = primary data

24 SOURCES: Big data
Datasets that are too large for traditional data processing and storage systems (3 Vs, 4 Vs, 5 Vs)
Classified into 3 classes of "datafication":
1. Directed data (e.g. surveillance data)
2. Automated data (e.g. device-generated data)
3. Volunteered data (e.g. social network data)

25 ANALYTICS
Eras: Database ( ), Data Warehouse ( ), Big Data ( )
Milestones:
- First integrated datastore (Bachman), 1963
- Relational data model (Codd), 1970
- SQL (Boyce & Chamberlin)
- First commercial RDBMS (Oracle), 1979
- DB2 (IBM), 1983
- First KDD workshop, 1989
- First KDD data mining conference (Fayyad, Shapiro), 1995
- NoSQL (Evans), 2009
Analytics labels: Business Intelligence (BI), DATA DELUGE, Data Science

26 ANALYTICS
There are 4 broad classes of analytics (often used in combination):
1. Data mining and pattern recognition
   - AI, Machine Learning, Data Mining
2. Data visualization and visual analytics
   - App development
3. Statistical analysis
   - Statistical techniques and principles (regression, etc.)
4. Prediction, simulation, and optimization
   - Algorithms
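
As one illustration of class 3 (statistical analysis), the sketch below fits a simple least-squares regression line in plain Python. The visit and loan figures and the names `fit_line`, `visits` and `loans` are hypothetical; they are not from the talk.

```python
# Hypothetical monthly figures (not real library data).
visits = [100, 200, 300, 400]
loans = [30, 50, 70, 90]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(visits, loans)
predicted = slope * 500 + intercept  # extrapolate to 500 visits
print(f"loans = {slope:.2f} * visits + {intercept:.2f}")
print(f"predicted loans at 500 visits: {predicted:.0f}")
```

The same fit is available in R or in Python's scientific libraries; the hand-written version is used here only to keep the example free of dependencies.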

27 Practicing data science: In Libraries

28 1. Data at Scale
Value and insight can be extracted from small data by scaling them up into larger datasets, for reuse through digital data infrastructures
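
A minimal sketch of the idea, assuming two small per-branch datasets that share an identifier: combining them produces a larger dataset that can answer questions neither can alone. The branch names, ISBNs, loan counts and the `combine` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical small datasets: loan counts per title from two branch libraries.
branch_a = [
    {"isbn": "9780000000001", "loans": 12},
    {"isbn": "9780000000002", "loans": 3},
]
branch_b = [
    {"isbn": "9780000000001", "loans": 7},
    {"isbn": "9780000000003", "loans": 5},
]


def combine(*datasets):
    """Merge any number of small datasets into one total per ISBN."""
    totals = defaultdict(int)
    for dataset in datasets:
        for record in dataset:
            totals[record["isbn"]] += record["loans"]
    return dict(totals)


combined = combine(branch_a, branch_b)
for isbn, loans in sorted(combined.items()):
    print(isbn, loans)
```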

29 2. Analyzing Exhaust data
Exhaust data = data produced as a by-product of the main function of a device or system
Most exhaust data is transient in nature: it is never examined and simply discarded!
Example: the log of a self-checkout unit
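
To make the self-checkout example concrete, here is a small sketch that counts checkouts per hour from a log. The log format (`<timestamp> <event> <item type>`), the sample lines and the `loans_per_hour` function are assumptions made for illustration; real units will log differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log format: "<ISO timestamp> <event> <item type>"
SAMPLE_LOG = """\
2016-06-15T09:05:12 CHECKOUT book
2016-06-15T09:47:30 CHECKOUT dvd
2016-06-15T10:02:01 RETURN book
2016-06-15T10:15:44 CHECKOUT book
2016-06-15T14:31:09 CHECKOUT book
"""


def loans_per_hour(log_text):
    """Count CHECKOUT events per hour of day, skipping malformed lines."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # not in the assumed three-field format
        timestamp, event, _item_type = parts
        if event != "CHECKOUT":
            continue
        hour = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S").hour
        counts[hour] += 1
    return counts


hourly = loans_per_hour(SAMPLE_LOG)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]} loan(s)")
```

Even a tally this simple turns a discarded log into input for staffing and opening-hours decisions.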

30 Example of analyzing Exhaust data
Structured and unstructured data: VISITS, PATRONS, LOANS, LOCATIONS, Books, DVDs, DIGITIZED BOOKS, Journals
Actionable insights:
- Better forecasts for future library planning
- Better usage of systems and resources
- Productivity gain with better decision-making

31 Examples of Analysis and Visualization
Library analytics toolkit
- Harvard University: https://osc.hul.harvard.edu/liblab/projects/libraryanalytics-toolkit
Text analytics
- Google Books Ngram Viewer: https://books.google.com/ngrams
Open source implementation
- Bookworm: http://bookworm.culturomics.org

32 In summary
The fundamental principle of data science is that data, and the capability to extract useful knowledge from it, should be regarded as a key strategic asset.
Libraries must learn to start thinking data-analytically.
Do we only use gut and intuition, or also data and rigor, in our decision-making?

33 In summary
You can apply the same principles, tools and techniques to small data as you would to big data: "the tools of data science are as appropriate for gigabyte as they are for petabyte scale datasets" (https://datascience.berkeley.edu/about/what-is-data-science/)

34 In summary
Challenges for librarians:
- There is a shortage of Big Data talent
- The Big Data SIG is attempting to understand and frame Big Data problems
Opportunities for librarians:
- Grow your data analytical skills
- Attend online courses: Khan Academy, Coursera, Software Carpentry, digital books
- There are free software tools: R (SQL Server 2016 includes R), Python, app visualization tools

35 Thank you


More information

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

The University of Jordan

The University of Jordan The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S

More information

Research at the Department of Computer Science and Software Engineering. Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014

Research at the Department of Computer Science and Software Engineering. Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014 Research at the Department of Computer Science and Software Engineering Professor Yong Yue BEng, PhD, CEng, FIET, FIMechE 17 October 2014 Research Areas Ar%ficial intelligence Robo%cs Data mining Image

More information

Founda'onal IT Governance A Founda'onal Framework for Governing Enterprise IT Adapted from the ISACA COBIT 5 Framework

Founda'onal IT Governance A Founda'onal Framework for Governing Enterprise IT Adapted from the ISACA COBIT 5 Framework Founda'onal IT Governance A Founda'onal Framework for Governing Enterprise IT Adapted from the ISACA COBIT 5 Framework Steven Hunt Enterprise IT Governance Strategist NASA Ames Research Center Michael

More information

Power to the People: Analy0cs for All

Power to the People: Analy0cs for All Arijit Sengupta CEO, BeyondCore, Inc. Power to the People: Analy0cs for All " Ten patents related to Advanced Analytics, Privacy/Security and BPaaS. " Previously worked at Oracle, Microsoft, Yankee Group

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Web Services and Development of Semantic Applications

Web Services and Development of Semantic Applications Web Services and Development of Semantic Applications Trish Whetzel Outreach Coordinator THE NATIONAL CENTER FOR BIOMEDICAL ONTOLOGY Na#onal Center for Biomedical Ontology Mission To create software for

More information

Achieving Business Value through Big Data Analytics Philip Russom

Achieving Business Value through Big Data Analytics Philip Russom Achieving Business Value through Big Data Analytics Philip Russom TDWI Research Director for Data Management October 3, 2012 Sponsor 2 Speakers Philip Russom Research Director, Data Management, TDWI Brian

More information

The Data Reservoir. 10 th September 2014. Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Informa4on Solu4ons

The Data Reservoir. 10 th September 2014. Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Informa4on Solu4ons Mandy Chessell FREng CEng FBCS Dis4nguished Engineer, Master Inventor Chief Architect, Solu4ons The Reservoir 10 th September 2014 A growing demand Business Teams want Open access to more informa4on More

More information

The Elusive U,lity Customer: How Big Data & Analy,cs Connects U,li,es & Their Customers

The Elusive U,lity Customer: How Big Data & Analy,cs Connects U,li,es & Their Customers The Place Analy,cs Leaders Turn to for Answers Member.U(lityAnaly(cs.com The Elusive U,lity Customer: How Big & Analy,cs Connects U,li,es & Their Customers Mike Smith Vice President, U(lity Analy(cs Ins(tute

More information

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization

More information

Bringing Big Analytics to the Masses Neal Leavitt

Bringing Big Analytics to the Masses Neal Leavitt Bringing Big Analytics to the Masses Neal Leavitt CS846 short paper presentation Song Wang 1 2015/9/29 Motivation Agenda Issues for Small Business Analytics for all Drawbacks Summary 2 2015/9/29 Motivation

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

More information

Oracle Big Data Strategy Simplified Infrastrcuture

Oracle Big Data Strategy Simplified Infrastrcuture Big Data Oracle Big Data Strategy Simplified Infrastrcuture Selim Burduroğlu Global Innovation Evangelist & Architect Education & Research Industry Business Unit Oracle Confidential Internal/Restricted/Highly

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

Disrupting The Market: Predictive Analytics As A Service

Disrupting The Market: Predictive Analytics As A Service Disrupting The Market: Predictive Analytics As A Service 0 Problem 8.7 Billion Connected Devices 1 Growing 25% Annually What Does This Data Tell Us About Sensor Use? 1 Study conducted by Cisco 1 Solution

More information

Big Data Analytics. Big Data is

Big Data Analytics. Big Data is Big Data Analytics Big Data Analytics Forrester defines big data as the frontier of a firm s ability to store, process, and access (SPA) all the data it needs to operate effectively, make decisions, reduce

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

Integra(ng Data Analy(cs into a Risk- Based Audit Plan. Presented by: Andrew Simpson, MBA, Chief Operating Officer, CaseWare Analytics

Integra(ng Data Analy(cs into a Risk- Based Audit Plan. Presented by: Andrew Simpson, MBA, Chief Operating Officer, CaseWare Analytics Integra(ng Data Analy(cs into a Risk- Based Audit Plan Presented by: Andrew Simpson, MBA, Chief Operating Officer, CaseWare Analytics Drivers of Risk Management Risk is high on the agenda for boards today

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Big Data. Lyle Ungar, University of Pennsylvania

Big Data. Lyle Ungar, University of Pennsylvania Big Data Big data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. McKinsey Data Scientist: The Sexiest Job of the 21st Century -

More information

Promo%ng Your OCS Business through Digital & Social Media. Presented by: John Healy

Promo%ng Your OCS Business through Digital & Social Media. Presented by: John Healy Promo%ng Your OCS Business through Digital & Social Media Presented by: John Healy Who Is John Healy? jhealy@healyco.com 25+ years in marke;ng, communica;ons & PR consul;ng Specialty: Helping clients balance

More information

Extend your analytic capabilities with SAP Predictive Analysis

Extend your analytic capabilities with SAP Predictive Analysis September 9 11, 2013 Anaheim, California Extend your analytic capabilities with SAP Predictive Analysis Charles Gadalla Learning Points Advanced analytics strategy at SAP Simplifying predictive analytics

More information

UCLA: Data management, governance, and policy issues

UCLA: Data management, governance, and policy issues UCLA: Data management, governance, and policy issues Chris

More information

The DATA Difference Targe.ng for Stronger ROI!

The DATA Difference Targe.ng for Stronger ROI! The DATA Difference Targe.ng for Stronger ROI! Presented by: Dr. John Leininger Department of Graphic Communica

More information