A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel


BIG DATA IS BIG NEWS

Big data is garnering a significant amount of industry attention, but unfortunately much of this attention is focused erroneously on handling large volumes of multi-structured data. This leads to a skewed and narrow perspective about the value of big data to organizations. Big data involves not just multi-structured data, but any type of electronic data that exists in, or can be acquired by, an organization. Also, big data is not so much about the volume or type of data, but more about how you extract business value from that data: it is about the business analytics that can be generated from big data to help improve business efficiency and competitiveness.

Confusion about big data and its business value also exists because big data involves a set of overlapping data management and analytic technologies, including relational DBMSs and non-relational systems such as Hadoop. When combined, these technologies represent a new generation of innovation, and the business value of big data comes from extending the existing analytics ecosystem to incorporate these innovative data management and analytic advances.

Where Are We Today?

Big data represents a continuum of high-performance technologies that have evolved over the past five decades to support IT business transaction and business analytics workloads that stretch the limits of hardware and software capabilities. The first business transaction processing systems introduced in the early 1960s, for example, were custom built and optimized to handle the workloads involved. Early airline reservation systems are an example here. In turn, the introduction and use of point-of-sale terminals, automated teller machines, mobile phones and the Internet have also led to workloads that often require optimized hardware and software systems to support performance needs.

The picture is similar with business analytics and data warehousing, where there have also been dramatic increases in workload requirements. The first multi-terabyte data warehouse, for example, was deployed in the early 1990s, but today this size of warehouse is common, and many companies now have data warehouses that handle multiple petabytes of data. Again, optimized systems are often required to handle the workloads involved. This is especially the case when the processing involves complex analytical models and algorithms.

Some organizations have business analytics workloads that do not require optimized systems, and in this situation a single platform may be adequate for supporting most workloads. However, as the need grows to handle new types of data and support more sophisticated analyses, even these organizations are faced with either cutting back on the amount of data that can be managed and the types of analyses that can be run, or extending their business analytics ecosystems to add optimized hardware and software platforms for handling high-performance workloads.

Copyright 2012 BI Research, All Rights Reserved.

WHAT THEN IS BIG DATA?

Big data represents data management and analytic solutions that could not previously be supported because of technology performance limitations, the high costs involved, or limited information. Big data solutions allow organizations to build optimized systems that improve performance, reduce costs, and allow new types of data to be captured for analysis. Big data involves two important new data management technologies:

- Analytic relational systems that are optimized for supporting complex analytic processing against both structured and multi-structured data. These systems are evolving to support not only relational data, but also other types of data structures. These systems may be offered as software-only solutions or as custom hardware/software appliances.
- Non-relational systems (such as the Hadoop distributed computing environment) that are particularly well suited to the processing of large amounts of multi-structured data. There are many different types of non-relational systems, including distributed file systems, document management systems, and database and analytic systems for handling complex data such as graph data.

When combined, these big data technologies can support the management and analysis of the many types of electronic data that exist in organizations, regardless of volume, variety, or volatility. They are used in conjunction with four important advances in business analytics:

- New and improved analytic techniques and algorithms that increase the sophistication of existing analytic models and results and allow the creation of new types of analytic applications.
- Enhanced data visualization techniques that make large volumes of data easier to explore and understand.
- Analytics-driven business processes that improve the speed of decision making and enable close to real-time business agility. These processes involve new rules-based applications that use a combination of both transaction and analytic processing.
- Stream processing systems that filter and analyze data in motion (sensor data, for example) as it flows through IT systems and across IT networks.

The four advances in business analytics enable users to make more accurate and faster decisions, and also answer questions that were not previously possible for cost reasons or because of technology limitations. They also enable data scientists to investigate big data to look for new data patterns and identify new business opportunities. This investigation work is usually done using a separate investigative computing platform or sandbox. The results from investigative computing may lead to new and improved analytic models and analyses, or new built-for-purpose analytic applications and systems.
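As a rough illustration of the stream processing idea listed above (filtering and analyzing data in motion), the following minimal Python sketch discards implausible sensor readings as they arrive and keeps a rolling average over the most recent valid values. The function name `rolling_average`, the window size, and the validity range are illustrative assumptions and are not taken from any particular product.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_average(readings: Iterable[float], window: int = 3,
                    low: float = -50.0, high: float = 150.0) -> Iterator[float]:
    """Yield the mean of the last `window` valid readings as each one arrives."""
    recent = deque(maxlen=window)  # holds only the most recent valid values
    for value in readings:
        # Filter step: discard readings outside the assumed plausible range.
        if not (low <= value <= high):
            continue
        recent.append(value)
        # Analyze step: emit an updated average without storing the whole stream.
        yield sum(recent) / len(recent)


if __name__ == "__main__":
    sensor_feed = [20.0, 22.0, 999.0, 24.0, 26.0]  # 999.0 is a faulty reading
    print(list(rolling_average(sensor_feed)))
```

Because the function is a generator, each result is produced as soon as its reading arrives, which mirrors the idea of analyzing data while it flows rather than after it has been stored.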

We can see then that big data involves a number of different components that work together to provide a richer and more powerful analytics ecosystem. Such an ecosystem is illustrated in Figure 1.

Figure 1. Analytics ecosystem for big data

BARRIERS TO SUCCESS

Given the complexity and number of components involved in big data, there are several barriers that have to be overcome in order to be successful in deploying and gaining business value from new data management and business analytics technologies.

- Educating IT and the business about the use cases and business benefits of big data. The use of big data is expanding rapidly and there are an increasing number of use cases that cover a wide range of applications in various industries. At present, most big data solutions are geared to addressing specific line-of-business (LOB) needs, rather than being deployed enterprise wide.
- Understanding and selecting the components that are required to build and support a big data ecosystem. The objective of this paper is to help here by reviewing this ecosystem, identifying its key components, and explaining how these components are likely to evolve over time. Although an organization should develop an overall big data plan and ecosystem, the components that make up the ecosystem will be implemented in a phased manner to support the various workloads associated with the use cases that apply to that organization.
- The amount of data integration and level of data movement required in a big data environment. As illustrated in Figure 1, the analytics ecosystem for big data may involve several different software and hardware systems. The availability of open data and metadata interfaces, well-designed data integration tools, and high-speed data connectors between these distributed systems is a critical success factor in implementing big data solutions. Data management components that are currently distributed across these multiple systems are likely to be consolidated over time onto a single platform to reduce data movement. For example, a relational DBMS and Hadoop server could co-exist on the same system, or in the future, their capabilities could be combined into a single integrated DBMS environment. This consolidation is discussed further in the section "An Analytics Ecosystem for the Future" below. Another possible future option to reduce data movement is an intelligent workload facility that reroutes certain parts of a workload (a query, for example) to the system where the data resides.
- Developing data governance and data quality management processes to support big data. Given the growth in the amount of data that organizations will be analyzing, it will become impossible to guarantee the quality of every piece of information delivered to business users. Instead, it will become necessary to segment data based on quality and security requirements, and handle governance accordingly. It will also be important that users are made fully aware of the quality level of any given piece of information delivered to them.
- The immaturity of new non-relational systems and the level of IT development and administration resources and skills required for supporting them. Although technologies such as Hadoop are immature, vendors are nevertheless rapidly enhancing these systems to improve their analytical processing capabilities, add development and administration tools, reduce implementation and administration effort, and improve system reliability, availability and security. It must be noted, however, that existing relational DBMS products have undergone significant development effort to support an enterprise-level analytical processing environment, and new non-relational systems will require similar resources to achieve the same level of maturity. However, not all big data use cases require an enterprise-quality analytical processing environment, and this is why most organizations are likely to use multiple products.
- Providing business users with a single and seamless user interface. Given the distributed nature of the ecosystem illustrated in Figure 1, business users could be faced with having to use a multitude of analytic tools and interfaces to access and analyze the data they need to do their jobs. The ecosystem must therefore evolve into providing an integrated toolset and seamless interface to the big data environment. This is discussed in more detail in the section "An Analytics Ecosystem for the Future" below.
- Lack of skills for enabling data science and investigative computing projects. From an analytics perspective, this is likely to be one of the biggest barriers to success. While vendors are making their advanced analytics capabilities more approachable, there is still a limit to the extent to which such features can be made easily accessible to less skilled users. Rather than looking for individuals who have a complete set of data science skills, organizations should instead build data science teams where the team as a unit has the required skills. For less experienced users, keeping the level of new skills required to a minimum is also important.
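The intelligent workload facility raised in the list above, which would reroute parts of a workload to the system where the data resides, can be pictured with a small Python sketch. Everything in it is hypothetical: the catalog contents, the system names, and the `route` function are invented for illustration, and a real facility would work from the ecosystem's metadata rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical metadata catalog: data set name -> system that stores it.
CATALOG = {
    "sales": "relational_dbms",
    "customers": "relational_dbms",
    "web_logs": "hadoop",
}


def route(datasets):
    """Group the data sets a workload touches by the system holding them."""
    plan = defaultdict(list)
    for name in datasets:
        if name not in CATALOG:
            raise KeyError(f"no catalog entry for data set {name!r}")
        plan[CATALOG[name]].append(name)
    return dict(plan)


if __name__ == "__main__":
    # A workload touching sales, web logs, and customers is split into parts,
    # each of which would be sent to the system where its data lives.
    print(route(["sales", "web_logs", "customers"]))
```

The sketch only decides where each part of the work should run; moving intermediate results between systems and combining them is left out.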

AN ANALYTICS ECOSYSTEM FOR THE FUTURE

The ecosystem illustrated in Figure 1 shows how the various components of a big data environment coexist in a distributed environment consisting of multiple interconnected systems. As outlined in the section "Barriers to Success" above, this distributed environment can involve a number of issues that need to be considered. A key issue concerns the level of data and metadata integration and data movement involved in a big data ecosystem. For business analytics, one of the main issues is providing business users with an integrated toolset and seamless interface to the ecosystem. Vendors are working on both short-term and long-term solutions to help solve these issues.

Short-Term Needs

In the short term, the data portability, connector and interface features supplied by vendors will help address many of these issues. The quality and performance of these features will be key distinguishing factors between products. Three key requirements here are:

- The ability to ingest and transform any type of data into and between the data management components of the ecosystem. In some cases this may be done using batch processes, but where low-latency data is required, there will be a need to stream data continuously, or in micro-batches, into and between the data management components. Transformation power and performance will distinguish products here.
- A common metadata interface to identify the data managed by the analytics ecosystem. This may be achieved by using data virtualization techniques to access the metadata where it is physically stored and managed. In the longer term, a single repository that documents the data, its location and physical definition, and its business meaning is required. However, this single repository has always been an elusive IT objective because of the number of data stores and vendor products used by most organizations.
- A common set of interfaces to the complete set of data managed by the ecosystem no matter where it resides. These interfaces should make the location of the data transparent to users and applications. Current approaches to achieving this include data virtualization and data abstraction layers in existing business analytics tools. However, such capabilities need to be moved into the data management environment to provide a set of optimized services for accessing and processing all of the data in the ecosystem.

Given that big data extends existing analytical capabilities, these common interfaces should support current analytic toolsets and data languages. The most common data language used today is SQL, and SQL support should be the starting point for any given interface. Many non-relational products, however, do not support SQL, or support only a limited subset of SQL. To overcome this issue vendors will need to support other languages such as R or Pig, or even programming languages such as Java. In some cases, applications may use these languages directly. In other situations, to reduce the development effort involved, the interfaces will need to translate SQL into one of those languages, or invoke pre-built routines and functions written using these languages. Compatibility and performance will distinguish products here.

Longer-Term Goals

In the longer term, to support more sophisticated and complex workloads, it will be necessary to consolidate key components of the ecosystem onto a single system to eliminate the overheads caused by network interactions. This consolidated system may consist simply of the various components co-located on the same hardware, or it may also involve integrating the components into a single hybrid data management platform optimized for analytic processing.

Providing a single data management and analytic platform will of course require major enhancements to an existing relational DBMS product. The advantage of this approach is that not only will new big data capabilities be leveraged for new types of data, but also existing analytic toolsets and data languages will continue to operate without change. The requirements for a single platform can be best explained by examining the architecture of a relational DBMS and reviewing the enhancements required to extend it to support big data.

The Logical View of Data

In a relational DBMS, users and applications see data in the form of tables. These tables are defined and accessed using SQL. Three important SQL extensibility features provide the underpinnings for analyzing big data: data-type extensions, analytic-function extensions, and external tables.

Over the years, vendors have steadily increased the range of data types that can be managed by a relational DBMS, including, for example, XML documents, geospatial data, text, and multi-media.
Some products also provide a user-defined type capability, which allows both vendors and customers to define their own data types to the system. This is useful for adding support for complex multi-structured data formats.

The essential requirement here of course is that these new types of data can be analyzed using SQL. This is where analytic-function extensions come into play. Vendors have increased the number of pre-defined analytic functions they include in their products, and in some cases have added support for third-party function libraries. Certain products also include a user-defined function capability, which allows new functions to be developed for manipulating new types of data. Both vendor-supplied and user-defined functions are used to encapsulate complex analytic processing, which makes this processing accessible to a broader set of users. These functions are embedded into the DBMS, which means they can exploit the parallel processing capabilities of the system. Analytic-function extensibility is also sometimes used to implement the Hadoop MapReduce distributed programming model in relational DBMS products.

An external table capability allows data that is external to the relational DBMS to be accessed using SQL. This capability is useful for accessing data that is managed by an external file system. It can also be used to modify and create these external data files. In a Hadoop environment, data managed by the Hadoop Distributed File System (HDFS) could therefore be accessed and created using the external table capability of a relational DBMS.

The Physical Management of Data

The key to success in extending a relational DBMS to support big data lies of course in how the underlying physical architecture implements the logical view of data discussed above. Performance is of the utmost importance here.

The DBMS query optimizer maps the logical view of data to the underlying physical storage structures used to manage the data. The optimizer's job is to provide physical data independence and to determine the most appropriate way to physically access and process data. The optimizer plays a major role in the performance of the system, and significant amounts of research and development have gone into designing efficient optimizer technology and in integrating this technology at execution time with DBMS workload management. Products vary considerably in optimizer quality, in their ability to extend the optimizer to efficiently manage user-defined types and functions, and in handling complex workloads. Optimizer quality and extensibility and sophisticated workload management are essential if a product is to support big data with good performance.

The way the data is physically stored and managed in data blocks and data files also plays a significant role in performance. Initially relational DBMSs stored data in a row-based format in data files on disk. Indexes were then created to provide direct access to the data where required. Over time more advanced mechanisms and options have been introduced.
Examples include various column-based data block formats, improved data compression and encryption, enhanced buffer management, new indexing techniques, in-memory tables, hybrid storage where data can be stored in optimized file systems and on different speed devices based on usage, and parallel processing architectures with high-speed data interconnects.

All of the above physical storage options apply equally to big data. Certain types of big data, however, may require enhanced or separate physical storage options. This is nothing new. Some relational DBMSs, for example, have separate storage managers and optimized data stores for XML-formatted data and multi-media. Each of these storage managers works in conjunction with the query optimizer (see Figure 2). This approach could be used to support multi-structured data such as graph data, document data or data imported from systems such as Hadoop. These new big data storage managers, when coupled with external tables, give organizations the flexibility to use SQL to transparently access data that has been moved into the relational DBMS and also data that resides in external data systems.

We can see from the above discussion that supporting big data involves extending a relational DBMS to not only support new types of data and new analytical techniques, but also to provide a sophisticated physical architecture that can provide the required performance.
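The user-defined function capability described under The Logical View of Data can be tried on a small scale with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for an analytic relational DBMS; it does not provide the parallel processing the paper refers to. The `clicks` table, its rows, and the `domain` function are made up for illustration.

```python
import sqlite3
from urllib.parse import urlparse


def domain(url):
    """Extract the host name from a URL string (None stays None)."""
    return urlparse(url).hostname if url else None


conn = sqlite3.connect(":memory:")
# Register the Python function so that SQL statements can call it by name.
conn.create_function("domain", 1, domain)

conn.execute("CREATE TABLE clicks (url TEXT)")
conn.executemany(
    "INSERT INTO clicks VALUES (?)",
    [("http://example.com/a",), ("http://example.com/b",),
     ("http://example.org/",)],
)

# The URL parsing is encapsulated in the function; the user only writes SQL.
rows = conn.execute(
    "SELECT domain(url), COUNT(*) FROM clicks "
    "GROUP BY domain(url) ORDER BY domain(url)"
).fetchall()
print(rows)
conn.close()
```

The point of the sketch is the division of labor: the parsing logic lives in one registered function, and anyone who can write a GROUP BY query can use it without knowing how it works.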

Figure 2. Next generation analytics system for big data

Summary

In the short term, big data can be supported by providing efficient data interchange connectors and common interfaces to the various systems shown in Figure 1. In the longer term, for certain types of applications, it can be seen from the discussion above that there are several advantages to providing a single system for supporting big data. There are two ways of building such a system. The first is by co-locating critical components on the same hardware platform and then using a common set of application interfaces to those components and the data they manage. The second approach is to extend a relational DBMS to support big data using a single data management and analytics platform. This second approach allows the full power of the relational DBMS environment to be applied to big data solutions, while at the same time supporting existing applications.

In summary, key requirements for a next generation analytics ecosystem for big data include:

- An integrated set of open interfaces to all of the processes, data and metadata in the analytics ecosystem. Support for all mainstream programming languages.
- SQL access to external data sources and systems (such as Hadoop) via high-performance connectors.
- Data-type extensions for big data.
- Analytic-function extensions and development kit for business analytics. Support for third-party and open source analytic function libraries.
- Query optimizer that understands SQL extensibility for big data (data types, analytic functions and external tables) and the underlying physical data management and storage architecture.
- Workload manager for handling complex, mixed and interactive workloads.
- Customized storage managers for handling certain types of complex data.
- Parallel-processing architecture with high-speed interconnect between nodes.

None of the short-term or long-term options outlined in this paper are mutually exclusive. Organizations will need to understand the benefits and costs of each of the options and choose the appropriate solutions that meet their business needs while at the same time supporting the workloads involved with high performance.

About BI Research

BI Research is a research and consulting company whose goal is to help organizations understand and exploit new developments in business intelligence, data integration, and data management.


More information

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload

More information

Turning Big Data into Big Insights

Turning Big Data into Big Insights mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

What Is In-Memory Computing and What Does It Mean to U.S. Leaders? EXECUTIVE WHITE PAPER

What Is In-Memory Computing and What Does It Mean to U.S. Leaders? EXECUTIVE WHITE PAPER What Is In-Memory Computing and What Does It Mean to U.S. Leaders? EXECUTIVE WHITE PAPER A NEW PARADIGM IN INFORMATION TECHNOLOGY There is a revolution happening in information technology, and it s not

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Cisco Data Preparation

Cisco Data Preparation Data Sheet Cisco Data Preparation Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and

More information

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

More information

Microsoft Analytics Platform System. Solution Brief

Microsoft Analytics Platform System. Solution Brief Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

IBM Netezza High Capacity Appliance

IBM Netezza High Capacity Appliance IBM Netezza High Capacity Appliance Petascale Data Archival, Analysis and Disaster Recovery Solutions IBM Netezza High Capacity Appliance Highlights: Allows querying and analysis of deep archival data

More information

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at spoozhikala@stratapps.com.

More information

Traditional BI vs. Business Data Lake A comparison

Traditional BI vs. Business Data Lake A comparison Traditional BI vs. Business Data Lake A comparison The need for new thinking around data storage and analysis Traditional Business Intelligence (BI) systems provide various levels and kinds of analyses

More information

From Spark to Ignition:

From Spark to Ignition: From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

The Future of Business Analytics is Now! 2013 IBM Corporation

The Future of Business Analytics is Now! 2013 IBM Corporation The Future of Business Analytics is Now! 1 The pressures on organizations are at a point where analytics has evolved from a business initiative to a BUSINESS IMPERATIVE More organization are using analytics

More information

Navigating the Big Data infrastructure layer Helena Schwenk

Navigating the Big Data infrastructure layer Helena Schwenk mwd a d v i s o r s Navigating the Big Data infrastructure layer Helena Schwenk A special report prepared for Actuate May 2013 This report is the second in a series of four and focuses principally on explaining

More information

SQL Server 2012 Gives You More Advanced Features (Out-Of-The-Box)

SQL Server 2012 Gives You More Advanced Features (Out-Of-The-Box) SQL Server 2012 Gives You More Advanced Features (Out-Of-The-Box) SQL Server White Paper Published: January 2012 Applies to: SQL Server 2012 Summary: This paper explains the different ways in which databases

More information

Architectures for Big Data Analytics A database perspective

Architectures for Big Data Analytics A database perspective Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

White Paper. Unified Data Integration Across Big Data Platforms

White Paper. Unified Data Integration Across Big Data Platforms White Paper Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using

More information

Unified Data Integration Across Big Data Platforms

Unified Data Integration Across Big Data Platforms Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using ELT... 6 Diyotta

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

Harnessing the power of advanced analytics with IBM Netezza

Harnessing the power of advanced analytics with IBM Netezza IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition 12c delivers high-performance data movement and transformation among enterprise platforms with its open and integrated

More information

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business

More information

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved. EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

WHITE PAPER. Building Big Data Analytical Applications at Scale Using Existing ETL Skillsets INTELLIGENT BUSINESS STRATEGIES

WHITE PAPER. Building Big Data Analytical Applications at Scale Using Existing ETL Skillsets INTELLIGENT BUSINESS STRATEGIES INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Building Big Data Analytical Applications at Scale Using Existing ETL Skillsets By Mike Ferguson Intelligent Business Strategies June 2015 Prepared for: Table

More information

Big Data Executive Survey

Big Data Executive Survey Big Data Executive Full Questionnaire Big Date Executive Full Questionnaire Appendix B Questionnaire Welcome The survey has been designed to provide a benchmark for enterprises seeking to understand the

More information

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise

An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise An Oracle White Paper June 2013 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure

More information

Using Master Data in Business Intelligence

Using Master Data in Business Intelligence helping build the smart business Using Master Data in Business Intelligence Colin White BI Research March 2007 Sponsored by SAP TABLE OF CONTENTS THE IMPORTANCE OF MASTER DATA MANAGEMENT 1 What is Master

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

REAL-TIME OPERATIONAL INTELLIGENCE. Competitive advantage from unstructured, high-velocity log and machine Big Data

REAL-TIME OPERATIONAL INTELLIGENCE. Competitive advantage from unstructured, high-velocity log and machine Big Data REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log

More information

I N T E R S Y S T E M S W H I T E P A P E R INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES. David Kaaret InterSystems Corporation

I N T E R S Y S T E M S W H I T E P A P E R INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES. David Kaaret InterSystems Corporation INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES David Kaaret InterSystems Corporation INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES Introduction To overcome the performance limitations

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise

An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Next Generation Business Performance Management Solution

Next Generation Business Performance Management Solution Next Generation Business Performance Management Solution Why Existing Business Intelligence (BI) Products are Inadequate Changing Business Environment In the face of increased competition, complex customer

More information

W H I T E P A P E R B u s i n e s s I n t e l l i g e n c e S o lutions from the Microsoft and Teradata Partnership

W H I T E P A P E R B u s i n e s s I n t e l l i g e n c e S o lutions from the Microsoft and Teradata Partnership W H I T E P A P E R B u s i n e s s I n t e l l i g e n c e S o lutions from the Microsoft and Teradata Partnership Sponsored by: Microsoft and Teradata Dan Vesset October 2008 Brian McDonough Global Headquarters:

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

9.4 Intelligence. SAS Platform. Overview Second Edition. SAS Documentation

9.4 Intelligence. SAS Platform. Overview Second Edition. SAS Documentation SAS Platform Overview Second Edition 9.4 Intelligence SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016. SAS 9.4 Intelligence Platform: Overview,

More information

Parallel Data Warehouse

Parallel Data Warehouse MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability

More information

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard Hadoop and Relational base The Best of Both Worlds for Analytics Greg Battas Hewlett Packard The Evolution of Analytics Mainframe EDW Proprietary MPP Unix SMP MPP Appliance Hadoop? Questions Is Hadoop

More information

Integrating Cloudera and SAP HANA

Integrating Cloudera and SAP HANA Integrating Cloudera and SAP HANA Version: 103 Table of Contents Introduction/Executive Summary 4 Overview of Cloudera Enterprise 4 Data Access 5 Apache Hive 5 Data Processing 5 Data Integration 5 Partner

More information

IBM Big Data Platform

IBM Big Data Platform IBM Big Data Platform Turning big data into smarter decisions Stefan Söderlund. IBM kundarkitekt, Försvarsmakten Sesam vår-seminarie Big Data, Bigga byte kräver Pigga Hertz! May 16, 2013 By 2015, 80% of

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

How Companies are! Using Spark

How Companies are! Using Spark How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION EXECUTIVE SUMMARY Oracle business intelligence solutions are complete, open, and integrated. Key components of Oracle business intelligence

More information

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches.

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches. Detecting Anomalous Behavior with the Business Data Lake Reference Architecture and Enterprise Approaches. 2 Detecting Anomalous Behavior with the Business Data Lake Pivotal the way we see it Reference

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach

www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach Nic Caine NoSQL Matters, April 2013 Overview The Problem Current Big Data Analytics Relationship Analytics Leveraging

More information

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics

More information

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse

IBM Analytics. Just the facts: Four critical concepts for planning the logical data warehouse IBM Analytics Just the facts: Four critical concepts for planning the logical data warehouse 1 2 3 4 5 6 Introduction Complexity Speed is businessfriendly Cost reduction is crucial Analytics: The key to

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

WHAT S NEW IN SAS 9.4

WHAT S NEW IN SAS 9.4 WHAT S NEW IN SAS 9.4 PLATFORM, HPA & SAS GRID COMPUTING MICHAEL GODDARD CHIEF ARCHITECT SAS INSTITUTE, NEW ZEALAND SAS 9.4 WHAT S NEW IN THE PLATFORM Platform update SAS Grid Computing update Hadoop support

More information

In-Memory Analytics for Big Data

In-Memory Analytics for Big Data In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

Tap into Big Data at the Speed of Business

Tap into Big Data at the Speed of Business SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics

More information