Big Data and Analytics: Challenges and Opportunities
1 Big Data and Analytics: Challenges and Opportunities
Dr. Amin Beheshti, Lecturer and Senior Research Associate, University of New South Wales, Australia (Service Oriented Computing Group, CSE)
Talk: Sharif University of Technology, 4 July 2015
2 We are Generating Vast Amounts of Data!!
Remote patient monitoring; product sensors; healthcare; social media; manufacturing; books, music, videos, etc.; retail; real-time location data; digitalization of artefacts; location-based services.
3 We are Generating Vast Amounts of Data!!
- Airbus A380: generates 10 TB every 30 minutes.
- Twitter: generates approximately 12 TB of data per day.
- Facebook: data grows by over 500 TB daily.
- New York Stock Exchange: 1 TB of data every day.
4 We are Generating Vast Amounts of Meta-data!!
Provenance, Data Versioning, Privacy, Security.
We are tracing everything: Who did What? When? Where?
6 We are Generating Vast Amounts of Meta-data!!
- Reading a book: e.g. a Kindle tracks what you are reading, when you are reading it, how often you read it, etc.
- Listening to music: e.g. an mp3 player tracks what you are listening to, when and how often, in what order, etc.
- Smartphones: e.g. an iPhone tracks our location, our speed, what apps we are using, who we are ringing, etc.
8 Big Data and Big Meta-Data
Big share, comment, review, crowdsource, etc.
9 So, What is Big Data?
Big data refers to our ability to collect and analyse the ever-expanding amounts of data and meta-data that we are generating every second!
Challenges: Capture, Storage, Search, Sharing, Transfer, Analysis, Visualization, etc.
11 What Makes it Big Data?
- Volume: the vast amounts of data generated every second.
- Velocity: the speed at which new data is generated and moves around.
- Variety: the increasingly different types of data.
- Veracity: the quality of data, e.g. the messiness of the data. Requires detecting and correcting noisy and inconsistent data.
- Value: Statistical, Events, Correlation, Hypothetical.
12 Challenges: How to Store and Process?
Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of storage and processing.
On-hand database management tools? Traditional data processing applications?
13 Challenges: Big Data Storage
NoSQL databases:
- Employ less constrained consistency models.
- Simple retrieval and appending operations.
- Significant performance benefits.
Examples: Key-value Store, Document Store, Graph Database.
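To make the three NoSQL families named on this slide more concrete, here is a minimal sketch that imitates each one with plain Python data structures. It does not use any real database; the keys, documents, edge labels, and the `neighbours` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Key-value store: opaque values looked up by key (simple retrieval).
kv_store = {}
kv_store["user:42"] = b'{"name": "Alex"}'

# Document store: schema-less documents that can be filtered by field.
documents = [
    {"_id": 1, "type": "paper", "title": "Big Data", "venue": "VLDB"},
    {"_id": 2, "type": "author", "name": "Alex"},
]
papers = [doc for doc in documents if doc.get("type") == "paper"]

# Graph database: nodes connected by labelled, directed edges.
edges = defaultdict(list)
edges["Alex"].append(("author-of", "Big Data"))
edges["Big Data"].append(("published-in", "VLDB"))


def neighbours(node):
    """Return the nodes reachable from `node` in one hop."""
    return [target for _label, target in edges[node]]
```

In this toy form, the key-value lookup never inspects the value, the document query inspects fields without a fixed schema, and the graph query follows edges rather than joining tables.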
14 Challenges: Big Data Storage (Graphs are Everywhere)
[Figure: graph examples: social network; collaborative filtering (Netflix users and movies); probabilistic analysis; text analysis (Wiki docs and words)]
15 Challenges: Big Data Analytics (Graphs are Everywhere)
[Figure: graph examples: social network; collaborative filtering (Netflix users and movies); probabilistic analysis; text analysis (Wiki docs and words)]
17 Challenges: Big Data Processing
Apache Hadoop: an open source framework that uses a simple programming model to enable distributed processing of large data sets on clusters of computers.
Apache Hadoop solution: Distributed File System (HDFS), MapReduce, Pig, HCatalog.
Who uses Hadoop? Amazon, Facebook, Google, IBM, New York Times, Yahoo!
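The "simple programming model" behind Hadoop's MapReduce can be sketched on a single machine. The following is a plain-Python word count, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

For example, `word_count(["big data", "Big graphs"])` counts "big" twice because the map step lower-cases each word before emitting it.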
19 Challenges: Big Data Processing
Apache Spark: Fast and Expressive Cluster Computing Engine, compatible with Apache Hadoop.
- Efficient: in-memory storage.
- Usable: rich APIs in Java, Scala, Python.
- Resilient Distributed Dataset (RDD): Spark's data storage model.
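The idea behind an RDD (lazy transformations recorded as a lineage, results optionally kept in memory) can be illustrated without Spark itself. The `MiniRDD` class below is a hypothetical single-machine toy, not Spark's API: it only mimics the names `map`, `filter`, `cache`, and `collect`, and it ignores partitioning and distribution entirely.

```python
class MiniRDD:
    """Toy stand-in for an RDD: a data source plus a lineage of lazy steps."""

    def __init__(self, source, lineage=()):
        self._source = source    # callable returning an iterable, so data can be recomputed
        self._lineage = lineage  # tuple of (kind, function) transformation steps
        self._cache = None       # in-memory copy of the result, if requested

    def map(self, fn):
        # Transformations only extend the lineage; nothing is computed yet.
        return MiniRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, predicate):
        return MiniRDD(self._source, self._lineage + (("filter", predicate),))

    def cache(self):
        # Eager here for simplicity; Spark's own cache() is lazy.
        self._cache = self.collect()
        return self

    def collect(self):
        # Action: replay the lineage over the source (or reuse the cache).
        if self._cache is not None:
            return list(self._cache)
        data = iter(self._source())
        for kind, fn in self._lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


even_squares = MiniRDD(lambda: range(5)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
```

Because every `MiniRDD` keeps its source and lineage, a lost result can always be rebuilt by calling `collect()` again, which is the intuition behind the "resilient" part of the name.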
20 Challenges: Big Data Integration
Example scenario: Business Processes (BPs).
Sources: Workflows, IT Systems, Web Services, People.
Output: BPs execution log.
22 Challenges: Big Data Integration
The Big Data world is messy, schema-less, and complex.
Less than 10% of the Big Data world is genuinely relational.
e.g. Linked Data
23 Challenges: Big Data Integration
Big Data-as-a-Service:
- Effective processing of big data within acceptable processing time.
- Easy access to the big data and the big data analysis results.
API Engineering: ProgrammableWeb (APIs, Mashups and the Web as Platform); DataSift, CSDL.
24 Challenges: Big data requires a broad set of skills
- Math and Operations Research Expertise: develop analytic algorithms.
- Data Experts: data architecture, management, governance, policy.
- Decision Making, Executive and Management: apply information to solve business issues.
- Tool Developers: mask complexity and analytics to lower skills boundaries.
- Visualization Expertise: interpret data sets, determine correlations, and present in meaningful ways.
- Industry Vertical Domain Expertise: develop hypotheses, identify relevant business issues, ask the right questions.
25 Challenges: Big Data Analytics
Analytics can be defined in many ways, but what matters is the purpose of analytics. Most definitions agree on the following:
Analytics is used to gain insights from data in order to make better decisions, using mathematical or scientific methods.
Data (manage the data) -> Analyse -> Insight (understand the data) -> Decide -> Action (act on the data)
29 Challenges: Big Data Analytics
OLAP is an approach to answering multi-dimensional analytical queries swiftly.
Problem: extending existing OLAP techniques to the analysis of graphs is not straightforward; key business insights remain hidden in the interactions among objects.
Solution: On-Line Analytical Processing on Graphs.
Examples:
- Beheshti et al., Scalable Graph-based OLAP Analytics over Process Execution Data, DAPD Journal (2015).
- Beheshti et al., A Framework and a Language for On-Line Analytical Processing on Graphs, WISE Conference (2012).
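As background for the "multi-dimensional analytical queries" mentioned on this slide, here is a minimal sketch of a conventional OLAP roll-up over flat records. It is not the graph OLAP approach of the cited papers; the `roll_up` function and the sample facts are hypothetical and only show what aggregating a measure along chosen dimensions means.

```python
from collections import defaultdict


def roll_up(facts, dimensions, measure):
    """Sum `measure` over all facts that share the same values for `dimensions`."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[dimension] for dimension in dimensions)
        totals[key] += fact[measure]
    return dict(totals)


# Hypothetical fact records: one row per (venue, year, author) combination.
facts = [
    {"venue": "VLDB", "year": 2012, "author": "Alex", "papers": 1},
    {"venue": "VLDB", "year": 2012, "author": "Sam", "papers": 2},
    {"venue": "WISE", "year": 2012, "author": "Alex", "papers": 1},
]

papers_by_venue = roll_up(facts, ["venue"], "papers")
papers_by_author = roll_up(facts, ["author"], "papers")
```

Each call groups the same facts along a different dimension. The slide's point is that this kind of grouping is well understood for tabular facts but does not carry over directly when the interesting information lies in the relationships between objects.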
31 Challenges: Big Data Analytics (Graph Data Model)
Nodes (Entities, Folders, and Paths)
Entities: structured or unstructured, typed or untyped data objects, e.g. Paper, Author, Venue.
32 Challenges: Big Data Analytics (Graph Data Model)
Nodes (Entities, Folders, and Paths)
Folder Nodes: contain a set of inter-related entities (e.g. events). A folder can be a placeholder for the result of a given query, e.g. a set of related authors or a set of related papers.
33 Challenges: Big Data Analytics (Graph Data Model)
Nodes (Entities, Folders, and Paths)
Path Nodes: contain a set of paths, where a path is a transitive relationship between two entities, e.g. Alex -author-of-> Paper: Big Data -published-in-> VLDB.
34 Challenges: Big Data Analytics (Graph Data Model)
Relationships: a relationship is a directed link between a pair of entities. It can be explicit (e.g. triggered-by) or implicit (e.g. member-of a Folder).
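The graph data model described on these slides (entities, folders, paths, and explicit or implicit relationships) can be sketched with a few Python dataclasses. This is an illustrative reading of the slides, not the authors' implementation: the class names, fields, and the `find_paths` helper are hypothetical, and entities are reduced to plain names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A directed, labelled link between two nodes (identified by name)."""
    source: str
    label: str
    target: str
    explicit: bool = True


@dataclass
class Folder:
    """A folder node: a named set of inter-related entities."""
    name: str
    members: list = field(default_factory=list)

    def implicit_relationships(self):
        # Folder membership yields implicit member-of relationships.
        return [Relationship(m, "member-of", self.name, explicit=False) for m in self.members]


def find_paths(relationships, start, end, path=()):
    """Return every chain of relationships leading from `start` to `end`."""
    if start == end and path:
        return [list(path)]
    found = []
    for rel in relationships:
        if rel.source == start and rel not in path:
            found.extend(find_paths(relationships, rel.target, end, path + (rel,)))
    return found


relationships = [
    Relationship("Alex", "author-of", "Big Data"),
    Relationship("Big Data", "published-in", "VLDB"),
]
related_authors = Folder("related-authors", ["Alex"])
paths = find_paths(relationships, "Alex", "VLDB")
```

Here `paths` holds the single transitive path from the slide's example, and the folder produces a `member-of` link that was never stored explicitly.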
35 Challenges: Big Data Analytics (Graph OLAP)
39 Challenges: Big Data Analytics
Big Data Analytics benefits from: NLP, Machine Learning (pattern recognition, learning), KG.
NLP example: Beheshti S.M.R., et al., A Systematic Review and Comparative Analysis of Cross-Document Coreference Resolution Methods and Tools, Computing Journal (2015), submitted.
42 Challenges: Big Data Analytics (Graph Data Model)
Big Data Analytics benefits from:
- NLP
- Machine Learning: pattern recognition, learning, extraction, classification, enrichment, linking, etc.
- Knowledge Graph (KG): KG construction, open source data.
44 Big Data Leadership!!
- Industry has been in the lead: Google, Amazon, Yahoo!, etc.
- University researchers have been left behind, due to lack of access to large-scale cluster computing facilities.
- Government agencies are making heavy investments.
- Investments in big-data computing will have extraordinary near-term and long-term benefits.
- Cloud computing must be considered a strategic resource.
45 Big Data: Opportunities
- Varieties of Data: text, social media, networks, multimedia, machine data, sensors.
- Analytics: organizing Big Data, navigating through data, summarizing Big Data, process analytics, support for decision-making.
- Integration: integrating enterprise and public data, linking data/context, entity extraction and integration, Knowledge Graph.
- Big Data Performance: in memory, new benchmarks and architecture.
- User Experience: automation and intelligent guidance, visualizing with analytics, interacting with analytics, storytelling.
47 Conclusion
- Why is Big Data different from the Very Large Datasets of the past? Meta-data!!
- Having the ability to analyse Big Data is of limited value if users cannot understand the analysis.
- How can industry and academia collaborate towards solving Big Data challenges?
- What is big today may not be big tomorrow!
48 Questions / Suggestions
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University
More informationLambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015
Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
More informationKeywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop
Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning
More informationDepartment of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15
Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 15 Big Data Management V (Big-data Analytics / Map-Reduce) Chapter 16 and 19: Abideboul et. Al. Demetris
More informationHadoop. http://hadoop.apache.org/ Sunday, November 25, 12
Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using
More informationBig Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014
White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page
More informationHarnessing the Data Flood: Oracle s Visionary Platform from Device to Data Center. Chris Baker Senior Vice President Worldwide ISV/OEM Java Sales
Harnessing the Data Flood: Oracle s Visionary Platform from Device to Data Center Chris Baker Senior Vice President Worldwide ISV/OEM Java Sales Canvas Lumber Compass Sextant 1851 America s Cup The oldest
More informationAlexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
More informationApache Hadoop. Alexandru Costan
1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open
More informationFast Data in the Era of Big Data: Twitter s Real-
Fast Data in the Era of Big Data: Twitter s Real- Time Related Query Suggestion Architecture Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, Jimmy Lin Presented by: Rania Ibrahim 1 AGENDA Motivation
More informationThe Need for Training in Big Data: Experiences and Case Studies
The Need for Training in Big Data: Experiences and Case Studies Guy Lebanon Amazon Background and Disclaimer All opinions are mine; other perspectives are legitimate. Based on my experience as a professor
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationCiteSeer x in the Cloud
Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar
More informationLuncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
More informationCIS492 Special Topics: Cloud Computing د. منذر الطزاونة
CIS492 Special Topics: Cloud Computing د. منذر الطزاونة Big Data Definition No single standard definition Big Data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms,
More informationWhere is... How do I get to...
Big Data, Fast Data, Spatial Data Making Sense of Location Data in a Smart City Hans Viehmann Product Manager EMEA ORACLE Corporation August 19, 2015 Copyright 2014, Oracle and/or its affiliates. All rights
More informationBIG DATA ANALYTICS For REAL TIME SYSTEM
BIG DATA ANALYTICS For REAL TIME SYSTEM Where does big data come from? Big Data is often boiled down to three main varieties: Transactional data these include data from invoices, payment orders, storage
More informationOracle Big Data Spatial and Graph
Oracle Big Data Spatial and Graph Oracle Big Data Spatial and Graph offers a set of analytic services and data models that support Big Data workloads on Apache Hadoop and NoSQL database technologies. For
More informationManifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
More informationUnderstanding Your Customer Journey by Extending Adobe Analytics with Big Data
SOLUTION BRIEF Understanding Your Customer Journey by Extending Adobe Analytics with Big Data Business Challenge Today s digital marketing teams are overwhelmed by the volume and variety of customer interaction
More information