Research trends relevant to data warehousing and OLAP include [Cuzzocrea et al.]: Combining the benefits of RDBMS and NoSQL database systems



Similar documents
Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov

COMP9321 Web Application Engineering

locuz.com Big Data Services

Big Data and Analytics: Challenges and Opportunities

How To Turn Big Data Into An Insight

A Professional Big Data Master s Program to train Computational Specialists

How To Create A Data Science System

The basic data mining algorithms introduced may be enhanced in a number of ways.

BIG DATA TRENDS AND TECHNOLOGIES

Integrating a Big Data Platform into Government:

Understanding traffic flow

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

The 4 Pillars of Technosoft s Big Data Practice

SAP and Hortonworks Reference Architecture

Data Mining with Big Data e-health Service Using Map Reduce

Where is... How do I get to...

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method

Sanjeev Kumar. contribute

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

New Features in Oracle Application Express 4.1. Oracle Application Express Websheets. Oracle Database Cloud Service

The Future of Business Analytics is Now! 2013 IBM Corporation

Navigating the Big Data infrastructure layer Helena Schwenk

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

CitusDB Architecture for Real-Time Big Data

1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India

Introduction to Data Mining

Topics in basic DBMS course

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料

Using Open Source NoSQL technologies in Designing Systems for Delivering Electric Vehicle Data Analytics.

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

(b) How data mining is different from knowledge discovery in databases (KDD)? Explain.

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Building Confidence in Big Data Innovations in Information Integration & Governance for Big Data

Reference Architecture, Requirements, Gaps, Roles

Next-Generation Cloud Analytics with Amazon Redshift

Augmented Search for IT Data Analytics. New frontier in big log data analysis and application intelligence

Big Data and Healthcare Payers WHITE PAPER

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

High Performance Data Management Use of Standards in Commercial Product Development

Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence

Complex, true real-time analytics on massive, changing datasets.

A Novel Cloud Based Elastic Framework for Big Data Preprocessing

How To Handle Big Data With A Data Scientist

Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.

UNIVERSITY OF INFINITE AMBITIONS. MASTER OF SCIENCE COMPUTER SCIENCE DATA SCIENCE AND SMART SERVICES

Real-time distributed Complex Event Processing for Big Data scenarios

How To Use Big Data For Telco (For A Telco)

Information Management course

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario

A TECHNICAL WHITE PAPER ATTUNITY VISIBILITY

Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

White Paper. How Streaming Data Analytics Enables Real-Time Decisions

Managing Data in Motion

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Complexity and Scalability in Semantic Graph Analysis Semantic Days 2013

ANALYTICS STRATEGY: creating a roadmap for success

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

Business Intelligence meets Big Data: An Overview on Security and Privacy

BIG DATA IN BUSINESS ENVIRONMENT

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

Framework and key technologies for big data based on manufacturing Shan Ren 1, a, Xin Zhao 2, b

BIG. Big Data Analysis John Domingue (STI International and The Open University) Big Data Public Private Forum

The big data business model: opportunity and key success factors

Towards Privacy aware Big Data analytics

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April ISSN

Updating Your Microsoft SQL Server 2008 BI Skills to SQL Server 2008 R2

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper

Amplify Serviceability and Productivity by integrating machine /sensor data with Data Science

Big Data and Analytics (Fall 2015)

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Delivering secure, real-time business insights for the Industrial world

New Eco-Systems in the software and service domain in the Cloud area

A Checklist for Big Data Success. Steve Ackley Sandeep Kumar Gupta

Alejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer

تعریف Big Data. موضوعات مطرح در حوزه : Big Data. 1. Big Data Foundations

Integrating SAP and non-sap data for comprehensive Business Intelligence

TRAINING PROGRAM ON BIGDATA/HADOOP

Play with Big Data on the Shoulders of Open Source

Comprehensive Analytics on the Hortonworks Data Platform

Chapter ML:XI. XI. Cluster Analysis

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

Advanced Big Data Analytics with R and Hadoop

BIG DATA TECHNOLOGY. Hadoop Ecosystem

Big Data on the Open Cloud

High-Performance Analytics

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP

Optimizing the Hybrid Cloud

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Transcription:

DATA WAREHOUSING RESEARCH TRENDS Research trends relevant to data warehousing and OLAP include [Cuzzocrea et al.]: Data source heterogeneity and incongruence Filtering out uncorrelated data Strongly unstructured nature of data sources High scalability Combining the benefits of RDBMS and NoSQL database systems Query optimization issues in Hadoop-based systems Integrating multidimensional data models and techniques with Hadoop visualisation of big data analytics 1

DATA MINING RESEARCH TRENDS Data mining research trends include exploration of the following [Han & Kamber]: Application exploration Scalable and interactive data mining methods Integration of data mining with search engines, database, data warehouse and cloud database systems Mining social and information networks Mining spatiotemporal, moving-objects and cyber-physical systems Mining multimedia, text, and web data Mining biological and biomedical data 2

Data mining with software engineering and system engineering Visual and audio data mining Distributed data mining and real-time data stream mining Privacy protection and information security in data mining The above areas for both data warehousing and data mining largely have a computer technology focus. One of our original definitions of a data warehouse was... a collection of technologies aimed at enabling the knowledge worker to make better and faster decisions. Hence, business goals also need to be addressed in considering the use of these technologies. Also, it is increasingly recognised that the complexity of the data and multiple technologies used by a knowledge worker introduces its own problems. 3

PRACTICAL LESSONS FROM LARGE-SCALE DATA MINING The importance of domain knowledge. Look and know your data. Many data mining projects fail because of the lack of adequate data: suitable data pre-processing, sampling are crucial. Understand the business goals when developing a mining model. Important metrics in training and deployment of mining models: offline training time, online prediction time, model size or memory footprint. Understand the impact of architecture and scalability issues on algorithm design and implementation. Importance of people in considering the above. Importance of two under-researched areas: visualisation real-time interaction with large datasets. 4

BIG DATA ANALYTICS TRENDS AND FUTURE DIRECTIONS Challenges include [Assunção et al.]: Data management Extracting meaningful content quickly out of large volumes of data, especially when it is unstructured. Aggregating and correlating streaming data from multiple sources. Storage enabling easy migration between data centres and cloud providers. Programming models optimized for streaming and/or multidimensional data. Engines combining applications with multiple programming models in a single execution abstraction. Optimizing resource usage and energy consumption during execution of such applications. 5

Model building and scoring Techniques utilising the elasticity and scale of cloud systems. Standards and APIs supporting prediction and wider analytics as services. Visualisation and user interaction Efficient techniques supporting real-time visualisation. Cost-effective devices for large-scale visualisation. Other challenges Business models enabling analytics and analytics as a service on the cloud. Supporting replicability of analytics. Incorporating human expertise into the analytics process, particularly in cloud environments. Location and context-aware techniques for collecting, processing, analysing and visualising large-scale mobile and sensor data. 6

Reading (Optional) V Gopalkrishnan et al, Big Data, Big Business: Bridging the Gap, Proc. Big Mine 12, August 2012. Y Chen et al., Practical Lessons of Data Mining at Yahoo!, Proc. CIKM 09, 1047-1055, 2009. J Lin & D Ryaboy, Scaling Big Data Mining Infrastructure: The Twitter Experience, ACM SIGKDD Exploration, 14(2), 6-19, 2012. M D. Assunção et al., Big Data computing and clouds: Trends and future directions, J. Parallel and Distributed Computing, 79-80, 3-15, 2015. 7