ISSN: Page 120


Satellite Conference ICSTSD 2016 International Conference on Science and Technology for Sustainable Development, Kuala Lumpur

Modernization of Data Warehousing through Open Source Softwares

Preeti P. Deshmukh
Department of Master in Computer Application, Prof. Ram Meghe Institute of Technology & Research, Badnera, Amravati

Abstract: This paper investigates how different open source software can be used to modernize traditional data warehousing techniques for big data scale. A data warehouse stores current and historical data that are used to create analytical reports for knowledge workers throughout the enterprise. The data warehouse plays a critical role in storing, managing, and processing information at big data scale. Nowadays many platforms, for example Hadoop, Cassandra, MongoDB, and other NoSQL systems, are offered as replacements for the data warehouse, so modernization is important for data warehousing. Enterprises are demanding business value from Big Data. As a result, the ability to blend petabytes and exabytes of data from historical and streaming sources becomes a necessity, and data warehouse modernization becomes a top priority. Because the data warehouse is often thought of as the heart of an enterprise's Big Data and analytics strategies, modernizing it has a potentially powerful and very positive effect on the bottom-line impact of new technologies, platforms, tools, and practices. No matter what modernization strategy is in play, all data warehouses require significant adjustments to the logical and systems architectures of the extended data warehouse environment.

Keywords: Open source, DWE, DW, MPP

Introduction:

Definition: A data warehouse is a database designed to enable business intelligence activities: it exists to help users understand and enhance their organization's performance. It is designed for query and analysis rather than for transaction processing, and it usually contains historical data derived from transaction data, although it can include data from other sources. Data warehouses separate the analysis workload from the transaction workload and enable an organization to consolidate data from several sources.

Fig-1: General Architecture of Traditional Data warehousing

This helps in maintaining historical records and in analyzing the data to gain a better understanding of the business and to improve it.

Traditional data warehouses are under pressure from the growing weight of unpredictable volumes of data, the expansive variety of data types, and the real-time processing speed at which data is used to grow and operate the business. These changes are so profound that Gartner reports, "Data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing." The modern enterprise needs a logical architecture that can scale smoothly to meet these volume demands, with real-time processing power and the ability to manage any data type, so that the business is rapidly connected to valuable insights. This means that the traditional data warehouse needs to evolve into a modern data warehouse.

Real-Time Data: The traditional data warehouse was designed to store and analyze historical information on the assumption that data would be captured now and analyzed later. System architectures focused on scaling relational data up with larger hardware and on processing to an operations schedule based on sanitized data. Yet the speed at which data is captured, processed, and used is increasing. Companies are using real-time data to change, build, or optimize their businesses as well as to sell, transact, and engage in dynamic, event-driven processes like market trading. The traditional data warehouse simply was not architected to support near real-time processing.
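The definition given in the Introduction (a database that consolidates data from several sources and is queried for analysis rather than used for transaction processing) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a warehouse engine. The table name sales_fact, the two source systems, and all rows are hypothetical and chosen only for illustration; a real warehouse would load data through an ETL process rather than from in-memory lists.

```python
import sqlite3

# Hypothetical rows exported from two transaction systems (illustrative data only).
# Each row is (sale_date, product, quantity, unit_price).
STORE_SALES = [("2016-01-05", "laptop", 2, 900.0), ("2016-02-11", "phone", 5, 300.0)]
WEB_SALES = [("2016-01-20", "laptop", 1, 950.0), ("2016-02-02", "phone", 3, 310.0)]


def build_warehouse(sources):
    """Consolidate rows from every named source into one in-memory fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_fact ("
        "source TEXT, sale_date TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    for name, rows in sources.items():
        # Tag each row with the system it came from so history stays traceable.
        conn.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
            [(name, *row) for row in rows],
        )
    conn.commit()
    return conn


def revenue_by_product(conn):
    """Analytical query: total revenue per product across all sources."""
    cursor = conn.execute(
        "SELECT product, SUM(quantity * unit_price) AS revenue "
        "FROM sales_fact GROUP BY product ORDER BY product"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse({"store": STORE_SALES, "web": WEB_SALES})
    for product, revenue in revenue_by_product(warehouse):
        print(product, revenue)
```

The query runs against the consolidated copy, so the analysis workload never touches the source transaction systems, which is the separation the definition describes.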

Traditional Data Warehouse Problems:

The traditional data warehouse suffered from a number of problems:
1. High licensing and storage costs meant that only large companies could afford to implement data warehouse solutions.
2. Large and expensive teams of database administrators and support staff were required to tune and manage the data warehouse so that it could supply constantly changing BI information.
3. Recent requirements for predictive analytics have resulted in very slow query performance, which frustrates business users who want to test their business models quickly.
4. Small and medium-sized companies often could not afford to implement BI and data warehouse solutions and were therefore at a considerable competitive disadvantage.

What was required was a new type of data warehouse that was very fast at performing advanced analytical queries and, at the same time, had a low Total Cost of Ownership (TCO), so that it could be used by both small and medium-sized enterprises (SMEs) and large companies.

The data warehousing and business intelligence market can be described by a curve, with different component technologies at different points along it.

Figure-2: Data warehousing technology markets graph

Open Source Data warehousing

Definition: An open source data warehouse is a specialized database built entirely on open source software code that supports enterprise, production-grade data analytics and reporting. An open source data warehouse should also support large-scale exploratory analytics and data science workloads, including machine learning.

Open source data warehousing significantly reduces Total Cost of Ownership. With open source data warehousing, there are no software licensing costs and no expensive proprietary hardware to purchase. The code is free and generally runs on inexpensive commodity hardware. Open source data warehouses are also an ideal environment for the important but less complex workloads that many enterprises currently run on expensive proprietary appliances, such as large-scale extract, transform, and load (ETL) workloads. With the speed of business today, practitioners require more agile, more powerful approaches to data warehousing and analytics that are also cost-effective to scale. Only open source data warehousing can meet those requirements.

Open source data warehousing also relates to the larger big data technology stack, which is itself based on open source technology. Data warehousing is an important part of the big data stack, and open source data warehousing in particular is a perfect complement to the other open source technologies in that stack, such as Hadoop. It is also complementary to other big data technologies from a workload perspective, providing flexible, high-performance analytics and reporting capabilities that support other important workloads such as streaming and unstructured data analysis.

Data warehouses support mission-critical workloads, so it is important to select an open source data warehouse with an active, growing community that is continuously developing the code base. It is also critical to pick an open source data warehouse that has at least one, and preferably several, trusted vendors backing it with world-class support.

Need of Data warehouse modernization

DW modernization assumes many forms, from server upgrades and tweaks to data models, to adding new platforms to the extended data warehouse environment (DWE), to replacing the primary DW platform. Modernization may involve using previously untapped features, such as in-memory databases, in-database analytics, real-time functions, and data federation or virtualization. Systems integrated with the DW need attention, too. Analytics, reporting, and data integration are also modernizing, and the DW is under pressure to provision data in ways that enable modern end-user practices such as visualization, advanced analytics, data prep, and self-service data access. The arrival of big data has made such provisioning more business-critical and more difficult.

In recent surveys by TDWI Research, roughly half of respondents report that they will replace their primary data warehouse (DW) platform and/or analytic tools within three years. Ripping out and replacing a DW or analytics platform is expensive for IT budgets and intrusive for business users. This raises the question: what circumstances would lead so many people down such a dramatic path? The answer is that many organizations need a more modern DW platform to address a number of new and future business and technology requirements. In a nutshell, organizations that seek to modernize their data warehouse environment do so to improve advanced analytics, scale, speed, productivity, or economics. Each of these five reasons has multiple meanings, all five are interrelated, and users rank them in different orders of priority according to their needs. Even so, the list constitutes the top five reasons for data warehouse modernization, and it can provide some guidance for users facing modernization.
1. Advanced analytics. Many organizations have invested heavily in reporting and OLAP, but now they need to invest in advanced forms of analytics to leverage big data, find new customer segments, and stay competitive.
2. Speed. Organizations likewise need the data warehouse and related systems to operate faster, because speed contributes to scale, supports agile development and discovery analytics, and brings analytics closer to real-time business operations.
3. Scale. This continues to be an issue with big data and other burgeoning enterprise datasets, as well as with growing numbers of concurrent users, reports, analyses, and data structures.
4. Productivity. Traditional requirements gathering, prototyping, and development take months, which is too long for a modern business. That is why agile development methods are now the norm in DW/BI and analytics. Likewise, users are adopting agile tool types, including those for data exploration and discovery, data profiling, and data visualization.
5. Costs. The good news is that modernization is not only a chance to increase speeds and feeds in the data warehouse environment but also a golden opportunity to rethink overall DW costs, as users seek to save money in some areas (storage, CPUs, upgrades, admin) so that they can invest in others (new data platforms, analytic tools, and developing new solutions).

Modernizing Data Warehouse

In the world of revolutionary, game-changing big data developments, data warehouse modernization may sound like an evolutionary development. But it is something that can be executed today, with existing data warehouse skills, and it represents a simple first step toward gleaning immediate business value and organizational agility from big data technologies. The move to agile development methods is one of the strongest trends in data warehousing. Similar trends involve lean, logical, and virtual methods. Modernization also affects users' skills, staffing, and team structure.

Data warehouse modernization is already under way. According to the TDWI report's survey, 76% of data warehouses are evolving briskly, and 89% of respondents say modernization is an opportunity for innovation. The leading drivers behind data warehouse modernization include realigning the data warehouse with new business goals, increasing the data warehouse's scale for big data, enabling new analytics applications, and embracing new tools or data types and their attendant practices. The chief beneficiaries of modernization include analytics, business management, and real-time operations. The leading barriers involve problems with governance, staffing, funding, designs, and platforms.

The rise of the multiplatform DWE is an evolution of the DW system architecture. Hence, changes at the system architecture level are the most common form of DW modernization (53% of users surveyed). At one end, this involves simple upgrades and patches for hardware and software servers or tools.

Introduction to MPP DATA WAREHOUSES

The data warehouse plays a critical role in storing, managing, and processing information at big data scale. This might sound counterintuitive, especially now that Hadoop, Cassandra, MongoDB, and other NoSQL platforms are marketed as replacements for the data warehouse. True, one or more SQL query engines exist for all of these platforms, but a SQL query engine does not a data warehouse make. If all we want to do is take some flat files and execute SQL across them, that does not actually require a database. It requires a translator from SQL to execution. "In order to design and build a massively parallel processing [MPP] database, some of the most difficult problems to solve are maintaining consistency across a huge database that runs on a distributed cluster, where you have concurrent access to that data," notes Ivan Novick, a product manager with Pivotal Software, Inc., which markets Greenplum, an open source MPP database.

MPP (Massively Parallel Processing)-based databases, especially software-only options, provide a cost-effective, scale-out data warehouse environment that allows companies to benefit from Moore's Law improvements in the performance-to-cost ratio of x86 processors. MPP databases provide a non-intrusive analytical platform and data warehouse for data discovery and exploratory work over massive amounts of data. Built on inexpensive commodity clusters, MPP databases can extend, complement, or even replace parts of an existing data warehouse, managing massive volumes of detailed data while providing agile query, reporting, dashboards, and analytics.
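The scale-out idea behind an MPP database (spread the data over many nodes, let each node work on its own share, then combine the partial results) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Greenplum or any other product is implemented: the function names are hypothetical, the "segments" are plain in-process lists, and the thread pool only stands in for separate servers (CPython threads do not give true CPU parallelism for this kind of work).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def distribute(rows, num_segments):
    """Hash-distribute (key, value) rows across segments by their key."""
    segments = [[] for _ in range(num_segments)]
    for key, value in rows:
        # crc32 is used because it is stable across runs, unlike built-in hash().
        segments[crc32(key.encode("utf-8")) % num_segments].append((key, value))
    return segments


def partial_sum(segment_rows):
    """Work done independently on one segment: a local GROUP BY key, SUM(value)."""
    totals = defaultdict(float)
    for key, value in segment_rows:
        totals[key] += value
    return dict(totals)


def parallel_group_sum(rows, num_segments=4):
    """Scatter rows, aggregate each segment concurrently, then merge the partials."""
    segments = distribute(rows, num_segments)
    with ThreadPoolExecutor(max_workers=num_segments) as pool:
        partials = list(pool.map(partial_sum, segments))
    merged = defaultdict(float)
    for partial in partials:
        for key, total in partial.items():
            merged[key] += total
    return dict(merged)


if __name__ == "__main__":
    sales = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("north", 1.0)]
    print(parallel_group_sum(sales, num_segments=3))
```

The sketch deliberately leaves out the hard parts named in the quotation above, namely keeping data consistent across the cluster under concurrent access; those are what distinguish an MPP database from a simple parallel query engine.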

Figure-3: MPP (Massively Parallel Processing) Architecture

The number of ACID-compliant (atomic, consistent, isolated, durable) MPP analytical data warehouses is remarkably small. This does not mean that MPP is a prohibitively expensive proposition, however: the commodification of MPP software and server hardware, with a little assist from the world of open source software, has made MPP performance surprisingly affordable and cost-effective.

In the era of big data, MPP database systems are ideally suited for many, if not most, analytical workloads. They are able to support high concurrency rates and, in some cases, new kinds of advanced NoSQL analytics. For example, some MPP platforms can parallelize and run different types of algorithms in the context of the database engine. The MPP data warehouses of today use a cluster of servers to store and process data, so a machine learning algorithm can use the CPUs of all of the servers in that cluster to do the analysis in parallel.

Hadoop and other NoSQL platforms have positive, distinctive roles to play in the big data architectures of today and tomorrow. NoSQL platforms are well suited for storing and managing multistructured data (e.g., text files, multimedia content, and binary objects), as well as for storing relational data at truly massive scale. The MPP data warehouse, however, is ideal for dynamic, mixed workloads, as well as for storing, managing, and processing information at big data volumes. Pivotal's Greenplum database is an open source MPP data warehouse. Greenplum itself is based on the open source PostgreSQL database, which has a rich open source pedigree.

Open Source MPP Data Warehouse

Open source software has lowered the proverbial bar with respect to cost of entry, cost of maintenance, and total cost of ownership (TCO) in many once-specialized markets. Take the open source GNU-Linux operating system, which turns 23 this year. A quarter of a century ago, the dominant UNIX operating systems were proprietary and ran on costly RISC hardware. GNU-Linux is not technically UNIX, but it is UNIX-like, and its market share now trounces that of its proprietary UNIX rivals.

Here is another example: the open source R statistical programming environment. Disciplines do not get much more specialized than statistics and data mining, which were dominated for decades by proprietary vendors such as SAS Institute Inc. and the former SPSS Inc. (now IBM SPSS). R has not just mounted a challenge to the dominance of SAS and SPSS; it has arguably already won. Most graduates of college business, engineering, social science, and, of course, statistics programs learned their craft on R, not on proprietary platforms.

An MPP database can scale up (within a single SMP node) across all of the available cores in a server node. However, an MPP database also scales horizontally, in the sense that it is distributed across multiple SMP nodes in a cluster. When an MPP database processes a query, each of the nodes in the cluster independently processes a piece of the query, so instead of, say, 24 cores, an MPP database can muster 192 cores, 384 cores, 768 cores, and more.

Open Source OpenStack:

OpenStack is a massively scalable, open cloud computing platform designed for deploying and managing public, private, and hybrid cloud solutions through a single control plane. This open source project has evolved quickly, and many early adopters, including Intel, are using it to orchestrate large pools of compute, storage, and networking resources and to provide IT as a service to end users. The OpenStack community includes more than 18,000 members and 1,300 active contributors. Intel is a platinum member and a top contributor, and it is committed to making OpenStack easier to deploy and more capable through code contributions, blueprints, reference architectures, and other technical resources.

Column-based Open Source Data Warehouse

Business managers are increasingly aware that there is an urgent requirement for business intelligence (BI) that will spot trends about customers, products, markets, and suppliers in order to identify new business opportunities, manage risk, or formulate strategic initiatives that can lead to increased profitable revenues. Furthermore, in the current recession companies should consider upgrading their BI capabilities so that they can explore the how, why, and when, and not just the what, of the data stored within their organisations. What is needed is advanced data analytics that discovers useful patterns within rapidly changing customer buying preferences, creating cross-marketing opportunities and leading to increased customer revenues.

At the heart of advanced analytics is the Data Warehouse (DW). However, most data warehouse implementations were not designed for advanced data analytics. Advanced data analytics requires fast access to data so that data mining and statistical queries can be conducted in minutes rather than hours.
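The section above names column-based storage without describing it, so the following sketch rests on an assumption about what is meant: in a column-oriented layout the values of each column are stored together, and an analytical query that aggregates one column reads only that column instead of every full row. The data, column names, and function names are hypothetical and serve only to illustrate the layout.

```python
# Hypothetical order rows: (order_id, product, price). Illustrative data only.
ROWS = [(1, "laptop", 900.0), (2, "phone", 300.0), (3, "laptop", 950.0)]
COLUMN_NAMES = ("order_id", "product", "price")


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def sum_row_store(rows, index):
    """Row layout: every whole row is visited even though one field is needed."""
    return sum(row[index] for row in rows)


def sum_column_store(columns, name):
    """Column layout: only the requested column is scanned."""
    return sum(columns[name])


if __name__ == "__main__":
    columns = to_columns(ROWS, COLUMN_NAMES)
    print(sum_row_store(ROWS, 2), sum_column_store(columns, "price"))
```

Both functions return the same total; the difference lies in how much data has to be read, which matters when a table has many columns and a query touches only a few of them.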

Conclusion:

For most organizations, the reporting and analytics that data warehousing supports are mission-critical to the business, so it is important that they select a hardened, battle-tested, and reliable open source data warehouse. In the world of revolutionary, game-changing big data developments, data warehouse modernization may sound like an evolutionary development. But it is something that can be executed today, with existing data warehouse skills, and it represents a simple first step toward gleaning immediate business value and organizational agility from big data technologies, with open source technology supporting continuous development and modification.

References:
[1] Jiawei Han, Micheline Kamber, Data Mining: Concepts & Techniques, 3rd ed., Chapter 4: Data Warehousing and Online Analytical Processing.
[2] TDWI Best Practices Report, March 2016, Data Warehouse Modernization in the Age of Big Data Analytics.
[3] TDWI e-book, April 2016, Shaping the Future of Data Warehousing through Open Source Software.
[4] The Microsoft Modern Data Warehouse, white paper.


TECHNOLOGY TRANSFER PRESENTS JOHN O BRIEN MODERN DATA PLATFORMS APRIL 14-15 2014 RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231 ROME (ITALY) TECHNOLOGY TRANSFER PRESENTS JOHN O BRIEN MODERN DATA PLATFORMS APRIL 14-15 2014 RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231 ROME (ITALY) info@technologytransfer.it www.technologytransfer.it MODERN DATA

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Tap into Big Data at the Speed of Business

Tap into Big Data at the Speed of Business SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics

More information

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions G-Cloud Big Data Suite Powered by Pivotal December 2014 G-Cloud service definitions TABLE OF CONTENTS Service Overview... 3 Business Need... 6 Our Approach... 7 Service Management... 7 Vendor Accreditations/Awards...

More information

Top 10 Automotive Manufacturer Makes the Business Case for OpenStack

Top 10 Automotive Manufacturer Makes the Business Case for OpenStack Top 10 Automotive Manufacturer Makes the Business Case for OpenStack OPENSTACK WHITE PAPER Contributors: SOLINEA: Francesco Paola, CEO Seth Fox, Vice President Operations Brad Vaughan, Vice President Service

More information

A Study on Big-Data Approach to Data Analytics

A Study on Big-Data Approach to Data Analytics A Study on Big-Data Approach to Data Analytics Ishwinder Kaur Sandhu #1, Richa Chabbra 2 1 M.Tech Student, Department of Computer Science and Technology, NCU University, Gurgaon, Haryana, India 2 Assistant

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

How to Enhance Traditional BI Architecture to Leverage Big Data

How to Enhance Traditional BI Architecture to Leverage Big Data B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...

More information

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances

High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances Highlights IBM Netezza and SAS together provide appliances and analytic software solutions that help organizations improve

More information

2015 Ironside Group, Inc. 2

2015 Ironside Group, Inc. 2 2015 Ironside Group, Inc. 2 Introduction to Ironside What is Cloud, Really? Why Cloud for Data Warehousing? Intro to IBM PureData for Analytics (IPDA) IBM PureData for Analytics on Cloud Intro to IBM dashdb

More information

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business

More information

BIG DATA-AS-A-SERVICE

BIG DATA-AS-A-SERVICE White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Traditional BI vs. Business Data Lake A comparison

Traditional BI vs. Business Data Lake A comparison Traditional BI vs. Business Data Lake A comparison The need for new thinking around data storage and analysis Traditional Business Intelligence (BI) systems provide various levels and kinds of analyses

More information

Why DBMSs Matter More than Ever in the Big Data Era

Why DBMSs Matter More than Ever in the Big Data Era E-PAPER FEBRUARY 2014 Why DBMSs Matter More than Ever in the Big Data Era Having the right database infrastructure can make or break big data analytics projects. TW_1401138 Big data has become big news

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX 1 Successful companies know that analytics are key to winning customer loyalty, optimizing business processes and beating their

More information

Applied Business Intelligence. Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA

Applied Business Intelligence. Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA Applied Business Intelligence Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA Agenda Business Drivers and Perspectives Technology & Analytical Applications Trends Challenges

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

The Internet of Things and Big Data: Intro

The Internet of Things and Big Data: Intro The Internet of Things and Big Data: Intro John Berns, Solutions Architect, APAC - MapR Technologies April 22 nd, 2014 1 What This Is; What This Is Not It s not specific to IoT It s not about any specific

More information

BIG DATA & DATA SCIENCE

BIG DATA & DATA SCIENCE BIG DATA & DATA SCIENCE ACADEMY PROGRAMS IN-COMPANY TRAINING PORTFOLIO 2 TRAINING PORTFOLIO 2016 Synergic Academy Solutions BIG DATA FOR LEADING BUSINESS Big data promises a significant shift in the way

More information

Cost-Effective Business Intelligence with Red Hat and Open Source

Cost-Effective Business Intelligence with Red Hat and Open Source Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1 Agenda Introductions Quick survey What is BI?: reporting,

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

Einsatzfelder von IBM PureData Systems und Ihre Vorteile. Einsatzfelder von IBM PureData Systems und Ihre Vorteile demirkaya@de.ibm.com Agenda Information technology challenges PureSystems and PureData introduction PureData for Transactions PureData for Analytics

More information

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method

Improving Data Processing Speed in Big Data Analytics Using. HDFS Method Improving Data Processing Speed in Big Data Analytics Using HDFS Method M.R.Sundarakumar Assistant Professor, Department Of Computer Science and Engineering, R.V College of Engineering, Bangalore, India

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

An Accenture Point of View. Oracle Exalytics brings speed and unparalleled flexibility to business analytics

An Accenture Point of View. Oracle Exalytics brings speed and unparalleled flexibility to business analytics An Accenture Point of View Oracle Exalytics brings speed and unparalleled flexibility to business analytics Keep your competitive edge with analytics When it comes to working smarter, organizations that

More information

Analytics & Data Warehousing Reader Challenges & Priorities Survey

Analytics & Data Warehousing Reader Challenges & Priorities Survey Analytics & Data Warehousing Reader Challenges & Priorities Survey Business Applications 2013 Respondent company size 52% from companies with over 1,000 employees (similar to past years surveys) 17% 13%

More information

Cray: Enabling Real-Time Discovery in Big Data

Cray: Enabling Real-Time Discovery in Big Data Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence Appliances and DW Architectures John O Brien President and Executive Architect Zukeran Technologies 1 TDWI 1 Agenda What

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

EMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data

EMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data EMC Greenplum Driving the Future of Data Warehousing and Analytics Tools and Technologies for Big Data Steven Hillion V.P. Analytics EMC Data Computing Division 1 Big Data Size: The Volume Of Data Continues

More information

Ten Cornerstones of a Modern Data Warehouse Environment

Ten Cornerstones of a Modern Data Warehouse Environment Ten Cornerstones of a Modern Data Warehouse Environment May 2015 Mike Lamble, CEO Clarity Solution Group Business Analytics Data Clarity Solution Group Unique Perspective Largest US consultancy focused

More information

Modern Data Warehousing

Modern Data Warehousing Modern Data Warehousing Cem Kubilay Microsoft CEE, Turkey & Israel Time is FY15 Gartner Survey April 2014 Piloting on premise 15% 10% 4% 14% 57% 2014 5% think Hadoop will replace existing DW solution (2013:

More information

Next Generation Data Warehousing Appliances 23.10.2014

Next Generation Data Warehousing Appliances 23.10.2014 Next Generation Data Warehousing Appliances 23.10.2014 Presentert av: Espen Jorde, Executive Advisor Bjørn Runar Nes, CTO/Chief Architect Bjørn Runar Nes Espen Jorde 2 3.12.2014 Agenda Affecto s new Data

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

SAP HANA FAQ. A dozen answers to the top questions IT pros typically have about SAP HANA

SAP HANA FAQ. A dozen answers to the top questions IT pros typically have about SAP HANA ? SAP HANA FAQ A dozen answers to the top questions IT pros typically have about SAP HANA??? Overview If there s one thing that CEOs, CFOs, CMOs and CIOs agree on, it s the importance of collecting data.

More information

Big Data and Healthcare Payers WHITE PAPER

Big Data and Healthcare Payers WHITE PAPER Knowledgent White Paper Series Big Data and Healthcare Payers WHITE PAPER Summary With the implementation of the Affordable Care Act, the transition to a more member-centric relationship model, and other

More information

Getting Started & Successful with Big Data

Getting Started & Successful with Big Data Getting Started & Successful with Big Data @Pentaho #BigDataWebSeries 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555 Your Hosts Today Davy Nys VP EMEA & APAC Pentaho Paul

More information

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard Hadoop and Relational base The Best of Both Worlds for Analytics Greg Battas Hewlett Packard The Evolution of Analytics Mainframe EDW Proprietary MPP Unix SMP MPP Appliance Hadoop? Questions Is Hadoop

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

Big Data - Infrastructure Considerations

Big Data - Infrastructure Considerations April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright

More information

Please give me your feedback

Please give me your feedback Please give me your feedback Session BB4089 Speaker Claude Lorenson, Ph. D and Wendy Harms Use the mobile app to complete a session survey 1. Access My schedule 2. Click on this session 3. Go to Rate &

More information

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse IBM Analytics Just the facts: Four critical concepts for planning the logical data warehouse 1 2 3 4 5 6 Introduction Complexity Speed is businessfriendly Cost reduction is crucial Analytics: The key to

More information

Infrastructure Matters: POWER8 vs. Xeon x86

Infrastructure Matters: POWER8 vs. Xeon x86 Advisory Infrastructure Matters: POWER8 vs. Xeon x86 Executive Summary This report compares IBM s new POWER8-based scale-out Power System to Intel E5 v2 x86- based scale-out systems. A follow-on report

More information

Welcome. Host: Eric Kavanagh. eric.kavanagh@bloorgroup.com. The Briefing Room. Twitter Tag: #briefr

Welcome. Host: Eric Kavanagh. eric.kavanagh@bloorgroup.com. The Briefing Room. Twitter Tag: #briefr The Briefing Room Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com Twitter Tag: #briefr The Briefing Room Mission! Reveal the essential characteristics of enterprise software, good and bad! Provide

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

Practical Approaches to Big Data & Analytics: From Infrastructure to

Practical Approaches to Big Data & Analytics: From Infrastructure to 2014 Cisco and/or its affiliates. All rights reserved. Practical Approaches to Big Data & Analytics: From Infrastructure to Applications Kapil Bakshi Distinguished Architect, Cisco System Digital Government

More information

Lowering the Total Cost of Ownership (TCO) of Data Warehousing

Lowering the Total Cost of Ownership (TCO) of Data Warehousing Ownership (TCO) of Data If Gordon Moore s law of performance improvement and cost reduction applies to processing power, why hasn t it worked for data warehousing? Kognitio provides solutions to business

More information

The Principles of the Business Data Lake

The Principles of the Business Data Lake The Principles of the Business Data Lake The Business Data Lake Culture eats Strategy for Breakfast, so said Peter Drucker, elegantly making the point that the hardest thing to change in any organization

More information

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM David Chappell SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Business

More information