Embracing Next-Generation Big Data & Analytics


Guidelines for CIOs & IT Leaders
A collaboration by Birst and Altiscale

CONTENTS
Executive Summary
The New Big Data Reality
Why Hadoop Should Be Part Of The Modern Data Architecture
The Next-Generation Data Architecture
Hurdles To Maximizing The Value Of Big Data In The Enterprise
Clearing the Hurdles: Key Aspects of the Next-Gen Big Data Architecture
What Happens to Existing Data and Infrastructure Investments?
How to Move Forward
Conclusion

EXECUTIVE SUMMARY

In this age of Big Data, enterprises are creating and acquiring more data than ever before. To handle the volume, variety, and velocity requirements associated with Big Data, Apache Hadoop and its thriving ecosystem of engines and tools have created a platform for the next generation of data management, operating at a scale that traditional data warehouses cannot match. While the use of Hadoop to store and analyze Big Data generates tremendous business opportunity, it can prove a complex task for CIOs, CTOs, and the IT organizations responsible for Big Data success. To obtain business value from Big Data quickly and consistently, enterprise IT teams need to overcome well-known obstacles in making data in Hadoop available to broad communities of users and in managing a non-traditional, scale-out Big Data system.

The next-generation data architecture solves these challenges by providing an interactive business analytics layer for end users on top of an agile, production-ready Hadoop foundation. It allows high-speed analysis, access to Big Data via familiar tools and interfaces, and a multi-tenant approach where different groups can operate on separate copies of the data. A fully managed Hadoop service underlies the architecture, offloading Hadoop infrastructure and operations from IT teams and ultimately providing fast time to value.

For organizations transitioning to Big Data and the next-generation data architecture, the most practical approach is to opt for an analytic platform that works with traditional environments as well as emerging Big Data systems. This allows organizations to benefit from their existing investment while minimizing disruption. To maximize the likelihood of success when making Hadoop part of their data strategy, organizations are advised to partner with a provider of reliable Big Data infrastructure and operations to limit risk and costs and to future-proof their investment.

THE NEW BIG DATA REALITY

Enterprises across all industries are approaching the Big Data age and asking how they can best position themselves for Big Data success. A decade ago, a terabyte was considered a massive amount of data and might have represented the data store of an entire organization. Today, large, data-rich organizations, like major Internet-based businesses, are grappling with hundreds of petabytes. While these companies may seem like outliers today, we'll soon see that they are simply the leading wave of enterprises having to deal with petabyte-scale data volumes.

Big Data is poised to transform industries, including those not often thought of as high tech, such as manufacturing and utilities. For manufacturing companies, Big Data represents the opportunity to optimize the production cycle and streamline the supply chain. For utility and energy companies, there is the chance to reduce energy loss, better promote energy efficiency, and proactively repair system issues. For financial services companies, there is the opportunity to rapidly identify fraud across millions of transactions or better manage risk across thousands of trading positions. For IT organizations, however, the advent of Big Data raises fundamental questions about how they can transition from their current architecture to a new one that embraces the opportunity in Big Data, while also preserving their flexibility for future data developments.

WHY HADOOP SHOULD BE PART OF THE MODERN DATA ARCHITECTURE

Enterprises require an adaptable, affordable solution that can manage the demands of Big Data. The dominant Big Data technology in commercial use today is Apache Hadoop, along with other technologies that are part of the greater Hadoop ecosystem, such as the Apache Spark in-memory processing engine, the Apache Hive data warehouse infrastructure, and the Apache HBase NoSQL storage system.

Hadoop is a software platform for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation running against direct-attached storage. Since Hadoop is open-source software and leverages commodity hardware, it has significant cost advantages over using proprietary software and specialized supercomputing environments.

With Hadoop's low-cost storage, organizations do not need to decide which data to save and which to delete. All the organization's raw, unprocessed data can be stored and made available in Hadoop, so no valuable data signals are lost. The sheer power of Hadoop's processing engine allows organizations to unlock greater insights by using machine learning and highly sophisticated computational algorithms. In many large-volume data situations it is the only way to get jobs to complete successfully. The platform is particularly suited to processing large volumes of unstructured data, such as Facebook comments, tweets, instant messages, and security and application logs.
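To make the programming model concrete, here is a minimal, hypothetical sketch that uses PySpark (one of the ecosystem engines mentioned above) to scan raw application logs stored in HDFS and count error events per service. The HDFS path and the log layout are illustrative assumptions, not a prescribed implementation.

```python
# Hedged sketch: distributed processing of raw application logs with PySpark.
# The HDFS path and the space-delimited log layout are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("log-error-counts").getOrCreate()

# Read raw, unprocessed log lines straight from the data lake.
logs = spark.read.text("hdfs:///data/raw/app_logs/*")

# Assume each line starts with "<timestamp> <service> <level> ...".
parsed = logs.select(
    F.regexp_extract("value", r"^(\S+) (\S+) (\S+)", 2).alias("service"),
    F.regexp_extract("value", r"^(\S+) (\S+) (\S+)", 3).alias("level"),
)

# Distributed aggregation across the cluster: error events per service.
error_counts = (
    parsed.filter(F.col("level") == "ERROR")
          .groupBy("service")
          .count()
          .orderBy(F.desc("count"))
)

error_counts.show(20, truncate=False)
spark.stop()
```

The same job runs unchanged whether the cluster has four nodes or four hundred; the data stays where it landed, and the computation is shipped to it.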

Hadoop was incubated and developed into an enterprise-class solution at Yahoo!, where it drives search results and advertising optimization. At Yahoo! it is used by thousands of data scientists and analysts and is critical to the generation of billions of dollars in annual revenue. Today, Hadoop is available to any enterprise and is experiencing rapid adoption by financial services, manufacturing, healthcare, marketing, retail, services, and Internet-based businesses that have realized that existing data warehouse technologies will not scale. Any organization seeking to incorporate new sources of unstructured data, combine heterogeneous data, or simply process large volumes of data should have Hadoop as an integral part of its architecture.

THE NEXT-GENERATION DATA ARCHITECTURE

Incorporating Big Data into enterprise analytic and data architectures

Smart organizations are using a set of new standards to include Big Data in their core enterprise data architecture. This next-generation data architecture moves Hadoop from experimental, data science projects to mainstream applications and forms the foundation of an information architecture that delivers business value for the broader organization. The major elements of the architecture are explained at a high level here and in more detail in the following sections.

Enterprise Data Tier
This data tier represents private and public sources of information, spread across the enterprise. While some of this data still exists in data warehouses and operational applications, Hadoop is the major new entrant at this level and the fastest-growing data source, containing both structured and unstructured data. In this tier, power users use sophisticated tools or write custom code to process and perform advanced analytics on the data.

User Data Tier
The user data tier creates an interactive data layer that sits on top of Hadoop to quickly respond to multiple user queries and concurrent workloads with the rapid response times business users expect. It also provides a semantic layer to represent data in familiar business terms, thereby making data consumable by a broad set of users across the enterprise. Additionally, this is where organizations can create a multi-tenant approach to Big Data, to allow different departments, business groups, and users to operate on separate, yet centrally unified data.

User Experience Tier
This layer represents the user interfaces to data and a broad range of data exploration, analytics, and visualization tools. These options include business intelligence solutions, data visualizations, data discovery, or even spreadsheets and local data sets.

HURDLES TO MAXIMIZING THE VALUE OF BIG DATA IN THE ENTERPRISE

Hadoop is attractive to many organizations because it can store and process large volumes of heterogeneous data in a cost-effective manner, but several hurdles, if not addressed, can limit the data agility and overall business value Hadoop can provide.

Challenge: Making Big Data Available to Broader Business Groups
While the ability to process raw data is powerful, we do not expect large numbers of business users to query Hadoop directly. This is due to several reasons:

Need for a Business User Interface
Most business users require data discovery and visualization tools that offer a point-and-click visual interface to data. Some prefer interactive dashboards; others might want scheduled, highly formatted reports. However, Hadoop, like an enterprise data warehouse, does not offer a business user interface to data. To date, most Hadoop users have been data scientists and developers, but business groups also see great value in Big Data and are looking to tap into this information asset.

Need for Rapid Response Times and High Concurrency
Queries in Hadoop often run as brute-force scans. While this is powerful in many ways, going directly against Hadoop is not always ideal for analytical scenarios that require rapid response times. Additionally, although Hadoop can process massive volumes of data and perform complex computations, it is not designed for high-concurrency workloads. In order to offer agility and speed of analysis for hundreds of users, some adaptations need to be made.

Dependence on IT as a Data Intermediary
The SQL-on-Hadoop reporting options available today are generally designed for data analysts. As a result, data in Hadoop cannot be incorporated into daily analytics for broader business groups unless IT acts as an intermediary, providing query results and reporting for business use.

This creates a bottleneck, and maintaining the data pipeline from Hadoop to data visualization tools takes significant effort. As new requirements arise, business users rely on IT to create new aggregates or change the ones that were created before. These requests go into IT's queue, adding considerable latency to analysis. Over time, the ETL code that transforms data in Hadoop into user-ready aggregates gets more and more complex. As the information needs of end users shift, and data volumes increase, maintaining this code becomes very costly and creates a source of friction for IT.

Challenge: Hadoop Infrastructure Considerations
Establishing and maintaining the hardware infrastructure for Hadoop requires addressing deployment and scalability issues.

Lack of Experience with Hadoop Infrastructure Requirements
Although very cost effective, Hadoop's horizontal scalability requires data center space, network capacity, and power densities that are unfamiliar to many IT organizations, along with other infrastructure demands. In addition, Hadoop is typically implemented on servers with large numbers of local disk drives, which are different from the high-density compute blades and shared storage that are standard in many data centers. IT's lack of familiarity with the hardware, networking, and maintenance demands of Hadoop can limit an organization's initial and ongoing success with Hadoop.

Overprovisioning to Handle Spikes and Growth
Hadoop clusters tend to exhibit strong, long-term growth in demand for storage and compute. They also experience intense spikes in demand at times. Planning for and accommodating both the spikes and the growth is a challenge for on-premises clusters. These scalability and elasticity challenges are often tackled in an expensive, brute-force way. This approach can quickly diminish the cost benefits Hadoop is meant to achieve.

Challenge: Ongoing Hadoop Operations and Tuning
Beyond the initial setup, Hadoop can often require extensive ongoing effort to maintain. The operational challenge of ongoing Hadoop management is one of the key reasons a leading analyst firm estimates that 70% of Hadoop implementations will fail to meet their revenue and cost objectives over the next few years.

Operational Complexity at Scale
The intense operational requirements of Hadoop are driven by a few issues. First is the fact that the volume and complexity of data in Hadoop inevitably increase over time. As the data becomes larger and more complex, and as the organization demands more and more from its data, significant effort is required to rectify resource contention, job failures, and conflicts, and to manage data pipelines. Second, as data gets larger and distributed over an increasing number of machines, Hadoop jobs become fragile and more prone to failure. Operational teams must stay on top of job signals and intervene quickly or proactively in order to ensure these processes complete.

Lack of Hadoop Operational Expertise
Most organizations are ill equipped to handle the operational demands of Hadoop. As a result, users such as data scientists often spend significant amounts of time managing Hadoop operations rather than pursuing actual analytics. This redirection of time to operational demands, as well as the overall lack of operational expertise, means organizations will struggle to obtain full value from data in Hadoop.

Keeping Up with the Rapid Pace of Big Data Evolution
The core Hadoop platform, along with its supporting software projects, such as Apache Spark, is rapidly evolving, with large advances in capability occurring in a relatively short amount of time. In addition, the number of software applications that run on Hadoop, both open-source and third-party solutions, is growing quickly.

While this rapid evolution improves the power and breadth of the overall Big Data platform, it places a significant burden on IT organizations to evaluate and integrate a broad set of new capabilities.

CLEARING THE HURDLES: KEY ASPECTS OF THE NEXT-GENERATION BIG DATA ARCHITECTURE

Many organizations have been experimenting with Hadoop, with the first wave of major Hadoop deployments beginning in the past few years. Many of these experiments are standalone projects, away from the core enterprise data architecture that supports the business. In order for organizations to operationalize the value of Big Data beyond experimental pilot projects, two considerations are critical:
1. The end-user delivery of Big Data
2. The deployment and operations model for Hadoop

1. The End-User Delivery of Big Data
As discussed previously, one of the great strengths of Hadoop is its ability to store and process a wide variety of data. However, data is stored in a raw state and is not ready for analysis. To help a broader group of business users gain value from Big Data assets, a next generation of data architecture must be layered on top. This new architecture comprises four main components:
1. An analytics-ready data store that accommodates analysis at high speed
2. A semantic layer that facilitates interacting with data in familiar business terms
3. A multi-tenant data architecture that provides local data and semantics for different business groups
4. A set of user experiences that provide different styles of reporting and analysis

Creating a High-Performance, Analytic-Ready Data Store on Top of Hadoop
A best practice for building an analysis-friendly Big Data environment is to create an analytic data store that loads the most commonly used data sets from the Hadoop data lake and structures them into dimensional models. These models are easy for business users to understand, and they facilitate the exploration of how business contexts change over time. This analytic data store is part of the user data tier introduced earlier. It typically uses a columnar MPP database to implement an interactive data layer that can handle many concurrent queries with speed and agility. The analytic data store must support not only reporting for known use cases but also exploratory analysis for unplanned scenarios.

In addition, the user data tier should allow data exploration and access directly against Hadoop. The process should be seamless to the user, eliminating the need to know whether to query the analytic data store or Hadoop directly. This is only possible if an abstraction layer is provided that intelligently knows whether a query can be satisfied by the analytic data store or needs to run in the Hadoop data lake. This abstraction layer is called a semantic layer, discussed in the next section.

Providing a Business Interface to Data via a Semantic Layer
To hide the complexities in raw data and to expose data to business users in easily understood business terms, a semantic overlay is required. This semantic layer is a logical representation of data, where business rules can be applied. For example, a semantic layer can define a high-value customer as one who has been a customer for more than three years and makes new or renewal purchases on a regular basis. The data for high-value customer might have been sourced from different tables and gone through several levels of calculation and transformation before arriving at the semantic layer, all invisible to the business user who queries for high-value customer (see the sketch below). The alternative is (a) to have business users query Hadoop directly, which is impractical, or (b) to have them request this information from IT, which means waiting in a queue of reporting requests.
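As a hypothetical illustration of the semantic layer described above, the sketch below uses Spark SQL to publish the high-value customer rule as a named view; analysts query the business term, never the underlying joins. The table names, columns, and thresholds are assumptions made for the example, not any vendor's actual schema.

```python
# Hedged sketch: a business definition ("high-value customer") expressed once
# as a semantic-layer view over raw lake tables. Table names, columns, and
# thresholds are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("semantic-layer-sketch").getOrCreate()

# Raw tables as they might land in the Hadoop data lake.
spark.read.parquet("hdfs:///lake/customers").createOrReplaceTempView("customers")
spark.read.parquet("hdfs:///lake/orders").createOrReplaceTempView("orders")

# The business rule lives in one place; users never see the joins.
spark.sql("""
    CREATE OR REPLACE TEMPORARY VIEW high_value_customer AS
    SELECT c.customer_id,
           c.customer_name
    FROM customers c
    JOIN orders o
      ON o.customer_id = c.customer_id
    WHERE c.first_purchase_date <= add_months(current_date(), -36)  -- customer > 3 years
      AND o.order_type IN ('NEW', 'RENEWAL')
      AND o.order_date >= add_months(current_date(), -12)
    GROUP BY c.customer_id, c.customer_name
    HAVING count(*) >= 4                                            -- "regular" purchases
""")

# A business user (or BI tool) queries the business term directly.
spark.sql("SELECT * FROM high_value_customer").show(10)
```

If the definition of "regular purchases" changes, only the view changes; every report and dashboard built on the term picks up the new rule automatically.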

A semantic layer enables business users to analyze and explore data using familiar business terms without the need to wait for IT to prioritize their requests. It also allows for reuse of data, reports, and analysis across different users, maintaining alignment and consistency and saving IT the effort of responding to every individual request on a case-by-case basis.

Operationalizing a Multi-Tenant Big Data Environment
Some organizations require only a purely centralized approach, in which IT owns, prepares, and makes data available for all teams by creating a common semantic layer and analytic data store for shared analytics, reporting, and visualizations. However, most organizations need to embrace a hybrid centralized and decentralized approach to data, where different teams need to incorporate local data sets and semantic definitions while also accessing the enterprise data resources IT creates.

This hybrid approach can be achieved with a multi-tenant data architecture. In this architecture, IT collects and cleanses data into a shared Hadoop data lake and prepares a central semantic layer and analytic data store from that data. IT then creates virtual (i.e., logical, not physical) copies of the centralized data environment for different business groups, such as finance, sales, marketing, and customer support. Since these are virtual copies, access to them does not impact the centralized data store, which retains its integrity and governance standards. However, this multi-tenant architecture allows business groups to add their own definitions to the centralized semantics (e.g., their version of high-value customer) and to blend the centralized Hadoop data, which IT has prepared and made analytic-ready, with their own local data sets (see the sketch below). This way, IT keeps the authority in data governance and semantic rules, while business groups and departments can see the impact of their daily business activities against historical or corporate data stored in Hadoop.

Providing a Set of User Experiences to Data in Hadoop
A final consideration for the end-user delivery of Big Data is the form in which data will be represented. These data interfaces should meet the individual needs of all users. This requirement includes providing highly interactive and responsive dashboards for business users, intuitive visual discovery for analysts, and pixel-perfect, scheduled reports for information consumers.
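Picking up the multi-tenant idea from the section above, the hypothetical sketch below extends the earlier Spark SQL example with a "virtual copy" for one department: marketing layers its own local data and its own variant of high-value customer over the shared, IT-governed view without copying or altering it. All names remain illustrative assumptions.

```python
# Hedged continuation of the earlier sketch: a per-tenant virtual view for
# marketing, layered over the central semantic view. Names are hypothetical.

# Marketing's local data set, loaded alongside the centrally prepared lake data.
campaigns = spark.read.csv(
    "hdfs:///tenants/marketing/campaign_responses.csv",
    header=True, inferSchema=True,
)
campaigns.createOrReplaceTempView("mkt_campaign_responses")

# Marketing's own variant of the rule: the central definition, narrowed to
# customers who also responded to a recent campaign.
spark.sql("""
    CREATE OR REPLACE TEMPORARY VIEW mkt_high_value_customer AS
    SELECT h.customer_id, h.customer_name, r.campaign_id
    FROM high_value_customer h
    JOIN mkt_campaign_responses r
      ON r.customer_id = h.customer_id
""")

# The central, governed view is untouched; marketing queries its blended view.
spark.sql("SELECT count(*) AS mkt_hvc FROM mkt_high_value_customer").show()
```

Because the department's view is purely logical, governance of the underlying data stays with IT while the tenant controls only its own overlay.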

While each style is unique, the best practice is to ensure that each interface is not a separate, disconnected tool, so that creating, collaborating on, and publishing information is done with consistency and accuracy. This is only achievable through a semantic layer that ensures data values remain consistent, even as data presentations differ from one user interface to another.

2. The Deployment and Operations Model for Hadoop
Hadoop places unique demands on an organization for infrastructure deployment and ongoing operations. These particular challenges must be taken into account when selecting the best deployment model for incorporating Hadoop and Big Data into the enterprise data environment, and there are three common ways of doing so:

a. On-premises, do-it-yourself deployment and operations. This approach requires the procurement and provisioning of a scale-out cluster for Hadoop. It then involves installing and configuring Hadoop and other ecosystem components. IT, and often the data science team, is heavily involved in deployment, upgrades, security implementation, and ongoing operations.

b. Infrastructure-as-a-service, with do-it-yourself operations. This includes getting cloud servers from a provider such as Amazon Web Services or Microsoft Azure. IT is responsible for configuring the Hadoop cluster and providing the operational team required to run the solution, as well as providing resources to implement and maintain supporting software. Some infrastructure-as-a-service providers also offer services, such as Amazon EMR and Microsoft HDInsight, that perform the initial Hadoop setup for users (see the sketch after this list). However, ongoing operations remain the responsibility of the IT team, again often with heavy involvement from the user community.

c. Fully managed Hadoop-as-a-service. Such a service includes computing infrastructure optimized for Hadoop, a complete Hadoop-based platform, and the operational support required to minimize job failure, scale the solution, ensure the solution is updated, resolve resource conflicts, and perform ongoing tuning. The vendor also provides security measures.
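As an illustration of option (b), the hedged sketch below uses the boto3 AWS SDK to launch a small Amazon EMR cluster with Hadoop and Spark and later resize it. The release label, instance types, IAM roles, and counts are illustrative assumptions, and, as noted above, ongoing operations would still fall to the IT team.

```python
# Hedged sketch of option (b): provisioning Hadoop on infrastructure-as-a-service
# with Amazon EMR via boto3. Names, sizes, and the release label are illustrative
# assumptions; this is not a production configuration.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# Launch a small cluster running Hadoop and Spark.
cluster = emr.run_job_flow(
    Name="analytics-poc",
    ReleaseLabel="emr-5.36.0",                  # assumed release label
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",          # assumed pre-existing IAM roles
    ServiceRole="EMR_DefaultRole",
)
print("Cluster id:", cluster["JobFlowId"])

# Elasticity: grow the core group ahead of a heavy month-end job, then shrink it
# again afterward, rather than overprovisioning for the peak.
groups = emr.list_instance_groups(ClusterId=cluster["JobFlowId"])["InstanceGroups"]
core_id = next(g["Id"] for g in groups if g["InstanceGroupType"] == "CORE")
emr.modify_instance_groups(InstanceGroups=[
    {"InstanceGroupId": core_id, "InstanceCount": 12}
])
```

The provisioning itself is quick; the tuning, upgrades, job triage, and security hardening discussed below are what remain with the organization under this model.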

For the next-generation Big Data architecture to be effective, agility and user readiness have to be mirrored in both the deployment and end-user delivery layers. The ideal solution will be capable of the following:
1. Minimizing time to value of the organization's Big Data initiatives
2. Providing optimized performance
3. Scaling elastically based on actual compute and storage demands
4. Reducing the organization's operational burden

Speeding Time to Value
As Hadoop moves into the mainstream, users' focus will be less on technical capabilities and more on the business value generated by Big Data. And the best way to demonstrate this value is to get quick wins, where analytics on Hadoop can be tied to positive results for the business. This is no easy task, given the difficulties of standing up a Hadoop cluster, a process that can take months and sometimes longer than half a year. Instead, organizations should take the fastest route from Big Data to analytical output to maximize their chances of Big Data success. With managed Hadoop in the cloud, there is no hardware to buy, software to implement, or team to hire, so solutions can typically be available in a matter of days after the initial data transfer.

Selecting Optimal Infrastructure for Hadoop
For optimal performance, organizations need to select infrastructure specifically geared toward running Hadoop. This includes getting the ratio of CPU cores to memory to disks correct, optimizing the network for inter-server throughput, ensuring the master nodes run on reliable hardware, and more. Since Hadoop-as-a-service vendors are dedicated to Hadoop workload performance, the hardware and networking have been specially selected, implemented, and tuned for the rapid and successful completion of Big Data processing jobs. As a result, jobs complete more quickly and with lower failure rates than when using a generalized infrastructure-as-a-service provider or, in many cases, than when running infrastructure on-premises.

Providing Compute and Storage Elasticity
To process Big Data successfully over time, as workloads and data volumes become more intense, the cluster needs to be able to scale immediately and on demand so that the most demanding jobs, such as a complex monthly billing job for millions of customers or a monthly longitudinal analysis, are able to complete smoothly, successfully, and within the required time window. Organizations should eschew implementations where infrastructure needs to be built out to meet peak requirements while sitting idle the remainder of the time. Hadoop-as-a-service allows customers to reduce costs by renting additional capacity only when needed. Also, with a Hadoop-as-a-service provider, unexpected rapid expansion of the overall data volume can be managed simply by requesting more data storage, rather than having to procure and deploy additional hardware on-premises.

Minimizing the Operational Complexity of Big Data
There are several aspects of operational complexity that must be addressed to be successful with Big Data.

Ensuring High Performance: Ongoing Operations and Support
Users simply want reliable access to the data and processing power in Hadoop so they can focus on getting their analytical jobs done without having to worry about the underlying operational complexity.

With a fully managed Hadoop solution, IT teams and data scientists no longer have to worry about issues like the following:
- constant cluster tuning to maintain high performance
- investigating and resolving job failures
- managing escalating job contention due to scaling
- navigating memory usage conflicts between MapReduce, Spark, and other engines
- managing the scheduling of large jobs and the resulting political conflicts between teams of users due to limited infrastructure

Instead, users have access to powerful, well-maintained solutions that get their data jobs done, along with supporting experts and tools that keep their Hadoop clusters running reliably with none of the administrative overhead.

Security
Security is sometimes not given serious thought when organizations start experimenting with Hadoop; however, it should be carefully considered, as it is extremely difficult to retrofit security onto an existing cluster. Hadoop can be fortified with Kerberos authentication, for example, but this security measure is non-trivial to establish. Going beyond Hadoop's built-in security capabilities, more comprehensive security (network security, data encryption, audit logs, physical security) is required for enterprise-grade Hadoop operation. Organizations should ensure their Big Data platform is one that has security fully managed and available in the platform from day one.

Keeping Pace with the Latest Technology Developments
In a rapidly expanding and evolving ecosystem, the burden of ensuring the production readiness of new capabilities and implementing the latest upgrades can be overwhelming. Many early on-premises Hadoop customers find themselves hampered by the effort required to upgrade the elements of their platform to the latest versions; they find it easier not to upgrade, continuing to struggle with the limitations of outmoded capabilities.

Alternatively, when using a Hadoop-as-a-service vendor, it is the responsibility of the vendor to keep pace with infrastructure upgrades and ensure the full Hadoop stack is up to date with the latest production-ready features. The enterprise customer simply has access to the latest capabilities as part of the subscription, with no extra effort required.

WHAT HAPPENS TO EXISTING DATA AND INFRASTRUCTURE INVESTMENTS?

Many organizations are grappling with the difficult challenge of incorporating Big Data into their existing architecture. Organizations have made significant investments in traditional data warehousing and business intelligence systems to manage valuable structured data that has been carefully identified, cleaned, and normalized over the years. These solutions have been in place for years, if not decades, with broad user sets within the organization that are familiar with their use and outcomes. These systems and their analytical results produce significant value and will continue to do so in the future. They are fundamental to the everyday operations of many enterprises, large and small.

While maintaining the value of these existing systems, IT organizations must also adjust to the new reality of working with massive volumes of heterogeneous data; that is, working not only with structured data like sales figures, but also with the valuable real-time and unstructured data that come from sensors, social media, web clickstreams, application logs, email, video, and other sources. With the new shifts and advancements in information management technologies, IT organizations ask: How can I intelligently add Big Data to my existing architecture? What do I do with the rest of my enterprise data (e.g., my sales records)? Do I place all my data in Hadoop? Or do I leave it where it exists today? While there is recognition that new, distributed storage and processing solutions like Hadoop are likely the future, how should a smart organization incorporate these solutions into its existing environment and position itself for the fastest and easiest evolution to an architecture that is completely based on distributed computing?

HOW TO MOVE FORWARD

Given the need to embrace Hadoop while continuing to leverage existing investments in data infrastructure, the following recommendations offer a practical approach to evolving into the next-generation architecture.

Select an Analytic Platform that Works with Existing, Yet Evolving Systems
Organizations managing complex, multi-layered data environments that involve many data sources and a variety of data stores, warehouses, and processes should look at transitioning to architectures that allow them to incorporate Big Data into their current ecosystems, while still leveraging existing investments in environment, processes, and people. In these situations, selecting an analytic platform that can leverage an organization's data warehouses as well as Big Data sources, such as Hadoop, is critical. If an organization has already built an analytic-ready data structure in the data warehouse, analytics and reporting layers should be able to connect to it and leverage the hierarchies, data dimensions, measures, and calculations already defined. In addition, the platform should let users connect to the data pipelines developed in Hadoop.

An analytic platform that provides a semantic layer allows an organization to place an existing data warehouse side by side with Hadoop yet keep this separation seamless to the end users. Users query the data in business terms, and the semantic layer routes each query appropriately to Hadoop or the data warehouse, as the sketch below illustrates. With this approach, an organization is able to gradually introduce Hadoop into its data architecture without interrupting business processes, data access, user data flows, and other things familiar to end users.
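The routing behavior of such a semantic layer can be pictured with the purely illustrative Python sketch below: a small catalog maps each business term to whichever backend holds it, and queries are dispatched accordingly. The catalog entries, connection objects, and table names are assumptions for the example, not any vendor's actual API.

```python
# Illustrative-only sketch of semantic-layer query routing between an existing
# data warehouse and Hadoop. Catalog entries and table names are hypothetical.

SEMANTIC_CATALOG = {
    # business term          (backend,     physical object)
    "revenue_by_region":     ("warehouse", "dw.fact_sales_summary"),
    "high_value_customer":   ("hadoop",    "high_value_customer"),
    "clickstream_events":    ("hadoop",    "lake.raw_clickstream"),
}

def run_business_query(term, warehouse_conn, spark):
    """Resolve a business term and route the query to the right backend."""
    backend, table = SEMANTIC_CATALOG[term]
    sql = f"SELECT * FROM {table}"
    if backend == "warehouse":
        # e.g., a DB-API connection to the existing data warehouse
        cur = warehouse_conn.cursor()
        cur.execute(sql)
        return cur.fetchall()
    # Otherwise, push the query down to Hadoop via Spark SQL.
    return spark.sql(sql).collect()
```

The point of the sketch is the indirection: as data sets migrate from the warehouse into Hadoop over time, only the catalog changes, and end users keep querying the same business terms.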

Partner with a Trusted Provider for Hadoop Infrastructure and Operations
The intense operational requirements of Hadoop are often not well understood until an organization is deep into an implementation. Only then are the true breadth and depth of the operational requirements fully revealed. To experience ongoing success with Hadoop, and to limit the capital and operational costs of scaling it, organizations should partner with a Hadoop-as-a-service provider that offers a fully managed service, including operational support, from the start. This means there is never any infrastructure to manage or complicated software to operate, so organizations can focus on their business.

Keep Big Data Options Open
IT leaders are enthusiastic about the opportunities Big Data can bring to their business, and they are rightfully concerned about ensuring that the strategic and tactical architectural decisions they make today will position them well for future developments in data management and analytics. While traditional data environments are fairly well established and understood, the emerging ecosystem for Big Data is not yet mature. IT leaders are challenged to evaluate these rapidly evolving options as well as make the appropriate infrastructure and hiring decisions to support these solutions.

Organizations will want to establish that the choices they make today will not lock them into a proprietary system or cause them to incur significant switching costs should their requirements or the ecosystem evolve. When evaluating Hadoop offerings, IT teams should ascertain how difficult it would be to upgrade software or integrate new tools into the platform. Organizations should also examine how flexible any investment in hardware and support staff would be. The enterprise will need to evaluate the full platform vendors provide, as well as their roadmap for future developments, to ensure the platform meets and will continue to meet their requirements.

CONCLUSION

As Big Data becomes increasingly vital to the enterprise, it is transitioning into a fundamental part of the enterprise data architecture. Navigating this transition successfully, in a manner that positions the enterprise well for future data growth and an evolving Big Data ecosystem, requires careful consideration.

Maximizing the Impact of Data
Smart enterprises are adapting their architectures to ensure that Big Data can be accessed broadly by data scientists and business analysts alike. Achieving this requires adapting the data environment to create a semantic layer, the critical juncture that allows data from all enterprise sources to be easily accessed by the broadest range of solutions.

Addressing Big Data Infrastructure Challenges
Building an infrastructure that can meet current and future needs is exceptionally challenging as well as capital and resource intensive. Organizations that deploy on-premises are burdened with ensuring capacity can meet escalating demands in order to have a robust data platform and data science workbench available to their teams. By finding a Hadoop-as-a-service partner, companies can rapidly achieve a reliable, high-performance Big Data platform without having to overinvest in infrastructure.

Addressing Big Data Operational Challenges
Organizations are increasingly limited by their capacity to operate Hadoop successfully over time. Since Hadoop operational demands increase with scale, and since operational resources are scarce, a Hadoop-as-a-service partner that provides operational support as part of its services can remove the operational constraints that may be limiting the growth of Big Data analytics in the enterprise.

ABOUT BIRST
Birst is the global leader in Cloud BI and Analytics for the Enterprise. Birst's patented 2-tier BI and analytics platform enables enterprises to create trusted data while empowering business users to manipulate that information in a fast and easily accessible manner. Thousands of the most demanding businesses trust Birst to make metric-driven business execution a reality. Every day we help companies make smarter decisions based on data they can trust.

ABOUT ALTISCALE
Altiscale provides Big Data as a Service, helping businesses to maximize the value of their data without the challenge and expense of managing complex technologies on their own. Altiscale operations experts provide advisory services and support on top of an integrated Big Data platform that comprises leading technologies such as Hadoop and Spark. The combination of a secure, scalable Big Data platform, dedicated operational services, and a passion for results means that Altiscale customers experience performance that is up to 10x faster than alternatives. Altiscale customers include leading companies across financial services, media, marketing services, AdTech, and gaming.


More information

Hadoop for Enterprises:

Hadoop for Enterprises: Hadoop for Enterprises: Overcoming the Major Challenges Introduction to Big Data Big Data are information assets that are high volume, velocity, and variety. Big Data demands cost-effective, innovative

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Necto on Azure The Ultimate Cloud Solution for BI

Necto on Azure The Ultimate Cloud Solution for BI Necto on Azure The Ultimate Cloud Solution for BI TECHNICAL WHITEPAPER Introduction Organizations of all sizes and sectors need Business Intelligence (BI) to scale operations, improve performance and remain

More information

Harnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service

Harnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service Harnessing the Power of Big Data for Real-Time IT: Sumo Logic Log Management and Analytics Service A Sumo Logic White Paper Introduction Managing and analyzing today s huge volume of machine data has never

More information

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1 Powerful analytics and enterprise security in a single platform microstrategy.com 1 Make faster, better business decisions with easy, powerful, and secure tools to explore data and share insights. Enterprise-grade

More information

Big Data - Infrastructure Considerations

Big Data - Infrastructure Considerations April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright

More information

Microsoft Big Data Solutions. Anar Taghiyev P-TSP E-mail: b-anarta@microsoft.com;

Microsoft Big Data Solutions. Anar Taghiyev P-TSP E-mail: b-anarta@microsoft.com; Microsoft Big Data Solutions Anar Taghiyev P-TSP E-mail: b-anarta@microsoft.com; Why/What is Big Data and Why Microsoft? Options of storage and big data processing in Microsoft Azure. Real Impact of Big

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

TBR. IBM Cloud Services Balancing compute options: How IBM Smart Business Cloud can be a catalyst for IT transformation

TBR. IBM Cloud Services Balancing compute options: How IBM Smart Business Cloud can be a catalyst for IT transformation T EC H N O LO G Y B U S I N ES S R ES EAR C H, I N C. IBM Cloud Services Balancing compute options: How IBM Smart Business Cloud can be a catalyst for IT transformation Author: Stuart Williams Director,

More information

The IBM Cognos Platform

The IBM Cognos Platform The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

More information

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP Pythian White Paper TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP ABSTRACT As companies increasingly rely on big data to steer decisions, they also find themselves looking for ways to simplify

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

Getting Started Practical Input For Your Roadmap

Getting Started Practical Input For Your Roadmap Getting Started Practical Input For Your Roadmap Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson

More information

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Introduction For companies that want to quickly gain insights into or opportunities from big data - the dramatic volume growth in corporate

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

Big Data on the Open Cloud

Big Data on the Open Cloud Big Data on the Open Cloud Rackspace Private Cloud, Powered by OpenStack, Helps Reduce Costs and Improve Operational Efficiency Written by Niki Acosta, Cloud Evangelist, Rackspace Big Data on the Open

More information

Why Cloud BI? of Software-as-a-Service Business Intelligence. Executive Summary. This white paper explores the 10 substantial

Why Cloud BI? of Software-as-a-Service Business Intelligence. Executive Summary. This white paper explores the 10 substantial of Software-as-a-Service Business Intelligence Executive Summary Smart businesses are pursuing every available opportunity to maximize performance and minimize costs. Business Intelligence tools used to

More information

Why DBMSs Matter More than Ever in the Big Data Era

Why DBMSs Matter More than Ever in the Big Data Era E-PAPER FEBRUARY 2014 Why DBMSs Matter More than Ever in the Big Data Era Having the right database infrastructure can make or break big data analytics projects. TW_1401138 Big data has become big news

More information

Unleash your intuition

Unleash your intuition Introducing Qlik Sense Unleash your intuition Qlik Sense is a next-generation self-service data visualization application that empowers everyone to easily create a range of flexible, interactive visualizations

More information

WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution

WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution WHITEPAPER A Technical Perspective on the Talena Data Availability Management Solution BIG DATA TECHNOLOGY LANDSCAPE Over the past decade, the emergence of social media, mobile, and cloud technologies

More information

See what cloud can do for you.

See what cloud can do for you. See what cloud can do for you. Uncomplicating cloud business Table of contents Introduction 3 Why cloud is relevant for your business? 4 What is changing? 4 Why organizations are moving to cloud 5 What

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Direct Scale-out Flash Storage: Data Path Evolution for the Flash Storage Era

Direct Scale-out Flash Storage: Data Path Evolution for the Flash Storage Era Enterprise Strategy Group Getting to the bigger truth. White Paper Direct Scale-out Flash Storage: Data Path Evolution for the Flash Storage Era Apeiron introduces NVMe-based storage innovation designed

More information

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following

More information

Cloudera Enterprise Data Hub in Telecom:

Cloudera Enterprise Data Hub in Telecom: Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

Virtualizing Apache Hadoop. June, 2012

Virtualizing Apache Hadoop. June, 2012 June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

NetApp Big Content Solutions: Agile Infrastructure for Big Data

NetApp Big Content Solutions: Agile Infrastructure for Big Data White Paper NetApp Big Content Solutions: Agile Infrastructure for Big Data Ingo Fuchs, NetApp April 2012 WP-7161 Executive Summary Enterprises are entering a new era of scale, in which the amount of data

More information