Cloudera Search and the Enterprise Hub


Version: Q

Table of Contents

Introduction
The Cost of Data Silos
The Enterprise Data Hub and Search
Multi-Workload Search and the EDH
Inside Cloudera Search
Examples of Cloudera Search and an EDH
Conclusion
About Cloudera

Introduction

Success in today's competitive market demands that enterprises embrace information at every opportunity and become information-driven in decisions and thought. The enterprise data hub (EDH) has emerged as the transformative data management architecture to help enterprises discover the full potential of their data. An EDH provides a secure and cost-efficient place to store and analyze unlimited business data, with workloads ranging from batch processing and interactive SQL to machine learning and stream processing. This architecture empowers enterprises to derive new insights and correlations while extending the value of existing investments.

With an EDH, enterprises can economically leverage a large and unpredictable volume and variety of data through scalable, flexible storage and in-place processing. An EDH provides visibility and technical acumen for a variety of business audiences, without incurring the costly data movement, remodeling, or duplication found in conventional data management approaches.

Cloudera Search plays an important part in realizing an EDH architecture and strategy and serves a multitude of use cases across the organization. Its native integration with Apache Hadoop and its ecosystem is key to future business innovation and insight. Search opens doors to powerful, yet approachable, aggregation and correlation of structured and unstructured data. Many large enterprises already employ Cloudera Search for their information-driven objectives and applications.

The Cost of Data Silos

Separate silos of data emerge for various business and operational reasons, from corporate acquisitions to distinct retention policies. These silos suffer from the high cost of data duplication and management and have the limited data visibility common to single-purpose, standalone systems.

For most enterprises, data management deployments have grown over time into siloed architectures of separate computing capabilities and storage functions. This separation has emerged for various reasons. Different teams with a variety of operational and retention policies manage the data, and enterprise IT teams often need to serve data to different audiences with varying degrees of interpretation, format, and access. For example, corporate acquisitions and restructuring can result in heterogeneous infrastructure and multiple data sharing needs.

Siloed environments unfortunately come with the cost of maintaining many copies and forms of data among the various computing capabilities. These costs include transporting a growing volume of data between infrastructures in order to serve both the existing and the new use cases encountered in today's business. Siloed environments also limit analytic access and visibility to the subset of data within the silo, which can stymie the opportunity to gain more valuable insight by correlating disparate data sets. For example, a product management team might not recognize which products or services are generating more happy customers because the team is unable to link the unstructured data of its text-based customer surveys with its structured account data.

Standalone search solutions are no exception. They constitute one of those silos and represent yet another piece of an enterprise IT team's data synchronization puzzle. Such solutions also demand duplicate cluster management and place additional operational demands on IT operators and information security teams. Ultimately, these solutions limit business insight because they represent only part of the data, and users must turn to other computing capabilities for the full view.

The Enterprise Data Hub and Search

One of the core principles of an enterprise data hub is to dissolve the barriers of data silos and grant visibility and use across a unified data environment, where multiple, disparate business workloads execute directly on all data in one centralized location. As mentioned, an EDH brings scalable batch processing, interactive query and exploration, stream processing, and advanced analytics side by side within a single, maintainable infrastructure. Many of these computing capabilities serve critical business needs and user communities within and outside the enterprise. Yet these capabilities often either require significant expertise, like advanced analytics or structured query language (SQL) reporting, or address important but narrowly scoped functions, like end-of-day transaction aggregation, which limits the audiences served.

Search is a critical capability for the enterprise because of its easy and intuitive approach to exploring and correlating all types of data. Search as part of an enterprise data hub represents an important gateway to all the data, regardless of shape, size, and volume, for a broad set of business communities within an organization, and it has several common business drivers.

Broad and Rapid Insight

Search lets business users get insight from enterprise data that otherwise might not be possible. An EDH, as a common storage substrate for all enterprise data in its original, full fidelity, can handle all types of data, from structured formats like relational data coming from a transactional system to unstructured forms like information found in research notes, analyst reports, and voice recording translations. Search, as part of the EDH, has access to the full scope of this data, and users can employ familiar tools like faceted navigation and full-text queries in their own language to aggregate and correlate across these data sets. Search also permits business analysts and operators to filter relevant data sets through rapid, iterative queries and refinements, which allows these users to solve their mission-critical tasks at the speed of thought.

Streamlined Workflows

Enterprise users can run search and other workloads, like SQL reporting and batch processing, on shared data sets within an EDH, which avoids costly data transfer or duplication to purpose-built, siloed computing systems. Similarly, IT operators can readily index and serve data in the same infrastructure as other computing capabilities, which enables more advanced data workflows. Using search within an EDH lets administrators tap into distributed, fault-tolerant, and high-throughput data processing frameworks, both to simplify the scale-out of manual or bulk data preparation and indexing and to serve the results immediately.

Expanded Audiences

A well-managed enterprise data hub with integrated search opens data to a wider business audience within the organization. For most enterprises, only a limited set of developers and analysts have the skills to write custom computing jobs with frameworks like MapReduce, let alone the understanding of how to use SQL and business intelligence (BI) tools to evaluate and investigate data and construct views and reports. The majority of the business communities within an organization are effectively shut out from the wealth of information represented by big data and an EDH because they do not have the skills or tools to access the information effectively. With search, all audiences gain a familiar and interactive method for exploring data and finding answers (everyone knows how to search), and thus an enterprise can easily bring big data and its value and insights to everyone within the organization.

[Figure 1: Streamlined Workflows and Distributed Computing in Apache Hadoop. Diagram labels: Distributed Computing (Search, Indexing, Batch Processing, Interactive SQL) over Shared Data Sets.]

COMMON USES OF SEARCH

Data discovery and metadata search
Enterprise search
Web crawling and web search
Document search
Log and event search
BI search
Ecommerce catalogs
Social and graph search
New media (video, audio, etc.) search
Email search
Threat detection analysis
Data accuracy/quality assurance

Data Exploration and Discovery

The introduction of big data into an enterprise data management environment can quickly overwhelm conventional (i.e., relational) approaches and methods for data exploration; the vast majority of big data is unstructured in form and is a poor fit for the schemas of relational technologies. Search lets business users explore and discover data interactively without knowledge of schemas by relying on the intrinsic relationships and intersections of key elements and data points, both within a given data set and in combination with other data sets. Business teams, from analysts and data scientists to line-of-business leaders and mission-critical staff, can use search to quickly and easily investigate and understand new data sets, discover correlations in data, and speed up data modeling and data discovery processes.

Enterprise teams can offer this capability with entity extraction that is driven by the language and concepts of the business audiences, which results in domain-specific and intuitive search experiences. Teams can also employ simple, yet powerful, fuzzy matching functions to discover more accurate and comprehensive data sets. These capabilities extend beyond historical and static data to real-time data sets, so business users can employ the same processes and the same navigation experience across multiple timescales of data.

Archival Access

Organizations realize that their historical data sets hold significant value and opportunities for their business (in trend analysis or anomaly detection, for example), but they are often thwarted by the cumbersome and time-consuming processes required to restore these data sets from near-line and off-line archives for analysis and reporting. Moreover, conventional systems might not have the additional capacity to accommodate this data. With a Hadoop-based EDH, enterprise IT teams can establish a highly cost-efficient system for capturing and storing data that not only meets enterprise archival or retention requirements, but also presents the business with opportunities to find value in this data without incurring the significant costs of retrieval, movement, and separate storage for analysis. Search within the EDH offers enterprise teams methods for on-demand processing over arbitrary data volumes at no incremental cost or complexity to the system, which both opens archival data to ad hoc processing and accelerates midterm validation of results during development. In this capacity, search is an excellent tool for uncovering the unknown or potential value within archival data sets.

Multi-Workload Search and the EDH

Search as a technical capability has broad application, and thus can be divided into several categories and solutions. Examples of search within an enterprise include log and event search, document management search, and web search. IT operators traditionally have realized many of these solutions as distinct, purpose-built systems.

Cloudera Search is this broad computing capability realized in a Hadoop-based EDH, yet Cloudera Search is decidedly not a standalone solution. It is an integrated, flexible, and robust search solution that works directly with the other workloads provided by an EDH, like batch processing, interactive SQL, and advanced analytics. Given the depth and breadth of this integration, enterprise IT teams can easily employ Cloudera Search in tandem with these other capabilities to provide comprehensive solutions for a much wider range of search applications and use cases, all from within an existing data hub.

In particular, administrators and developers find Cloudera Search to be a great match for multiple search applications that have overlapping or shared sets of data, like an engineering document repository and the associated product support call center dashboard. Cloudera Search is also well suited to situations where business requirements demand multiple, simultaneous computing capabilities in addition to search to render value and uncover insight from data, for example fraud investigation, advanced threat detection, and anomaly detection and data mining efforts.

For the many organizations where these multiple-workload scenarios are commonplace, Cloudera Search is an ideal solution, because it shares not only the same data sets, but also the same physical infrastructure and the same system, resource, and security management frameworks. These features eliminate the typical data duplication and synchronization issues, while streamlining IT operations and maximizing overall data management investments.

Inside Cloudera Search

Cloudera Search is not strictly a single product; it is a fully open source search solution, built with the feature-rich and extensible Apache Solr project. Apache Solr, in turn, includes other open source projects like Apache Lucene and Apache Tika. The reasons for basing Cloudera Search on Solr are clear: Solr has a mature code base with a very active developer community, and it is broadly adopted throughout many industries and used in many applications, making it the open standard for big data search.

One example that demonstrates the strength and depth of innovation within the Solr project is the recent development of SolrCloud, which enables distributed, scalable indexing. SolrCloud was a key contribution that resulted in Cloudera's decision to include Solr as the search engine for its EDH. Solr also provides developers with open APIs that promote extension and innovation, and its flexible architecture offers many options for integration with a broad ecosystem of existing enterprise tools, systems, and data formats.

As mentioned, Cloudera Search is a solution that uses and integrates many services and capabilities of the Hadoop ecosystem to enable an EDH. These integrations provide IT teams with several critical capabilities required to build and operate a comprehensive search engine capable of handling the challenges of big data.

Real-time and Scalable Indexing

Cloudera Search has various options for indexing data and thus has the flexibility and agility to handle a multitude of use cases. Cloudera Search works with Apache Flume to offer developers a straightforward and extensible method for streaming data, like log events and social media content, into search for real-time indexing. This approach simultaneously provides immediate data serving via search while capturing and storing the data within Hadoop for further analysis, processing, and archiving.

These real-time capabilities also extend to Hadoop's NoSQL solution, Apache HBase. Through the use of HBase replication, IT operators can send writes and updates in HBase directly to Cloudera Search for immediate indexing and serving with minimal overhead to the HBase service. Enterprise developers building real-time applications with HBase, from ad serving engines to operational data stores, can now easily incorporate interactive search and facet-driven dashboards into their services with minimal impact on performance.

While real-time data indexing greatly expands the general data visibility for many applications, Cloudera Search also provides the enterprise developer with traditional indexing tools built on the ubiquitous Hadoop batch processing framework, MapReduce. IT teams can employ this proven, high-throughput framework to enable linearly scalable indexing of any data within an EDH. This arrangement also lets end users submit on-demand, ad hoc indexing jobs to explore data sets not currently included in existing search indices. Because these options use Hadoop's common processing model, Cloudera Search's indexing activities share the consolidated infrastructure, resource, security, and system management with the other services within an EDH, which streamlines operations for administrators and the execution model for end users.

Simplified Data Storage

Cloudera Search stores its data and indices in the common data substrate of Hadoop (i.e., the Hadoop Distributed File System, or HDFS). This critical integration lets Cloudera Search benefit from the same unified management, control, and robustness that come out of the box with Hadoop's native file storage, such as disk-to-cluster fault tolerance and data block replication, as well as perpetual, full-fidelity data. This integration also offers significant cost efficiencies for index storage: indices are just data within HDFS, which avoids the per-document and per-volume pricing models common to standalone search solutions.

In addition, Cloudera Search takes advantage of the collocation of its data and indices with its query and indexing services. Collocation eliminates the error-prone and expensive data duplication and data pipelines required to move data to and from separately managed storage systems for indexing and serving. IT administrators can also optimize and streamline data workflows involving shared services, including search, because the data is collocated with the multiple computing services; within a Hadoop-based EDH, Cloudera Search is one of many services using the same data and resources within the cluster.

Familiar Search Experience

Cloudera Hue, an out-of-the-box user experience for Hadoop, includes a Cloudera Search GUI and GUI builder. The GUI is built on the standard Solr APIs and enables users to search data interactively, view result files, and perform faceted exploration. The availability of the GUI means that business users and developers can perform searches alongside all the other workloads and capabilities in Hadoop through the same, centralized application. The GUI builder allows non-IT developers to construct meaningful and informative views of searchable data sets, from configuring the individual result fields and facet values to constructing the guided and visual navigation features.
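Because the GUI sits on the standard Solr APIs, the same kind of faceted, full-text request can also be issued directly. The following is a minimal sketch of how such a request URL might be assembled. The host, collection name (support_tickets), and field names (body, product, region, status) are hypothetical and chosen only for illustration; the request parameters (q, fq, rows, wt, facet, facet.field) and the term~1 fuzzy syntax are standard Solr query features.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name, used only for illustration.
SOLR_BASE = "http://search.example.com:8983/solr/support_tickets"


def build_select_url(query, facet_fields, filters=(), rows=10):
    """Build a URL for Solr's /select handler with faceting enabled."""
    params = [
        ("q", query),        # main full-text query
        ("rows", str(rows)), # number of documents to return
        ("wt", "json"),      # ask for a JSON response
        ("facet", "true"),   # turn on faceting
    ]
    # One facet.field parameter per field to count values for.
    params += [("facet.field", field) for field in facet_fields]
    # One fq parameter per filter query that narrows the result set.
    params += [("fq", flt) for flt in filters]
    return f"{SOLR_BASE}/select?{urlencode(params)}"


# "refund~1" is a fuzzy term: it also matches words within one edit of "refund".
url = build_select_url(
    "body:refund~1",
    facet_fields=["product", "region"],
    filters=["status:open"],
)
print(url)
```

Each facet.field entry asks Solr to return value counts for that field alongside the matching documents, which is the kind of data a faceted navigation panel displays; adding a further fq entry corresponds to a user clicking a facet value to narrow the results.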

[Figure 2: Hue Search GUI]

Production Management and Visibility

Cloudera Search is fully integrated with Cloudera Manager, so IT operators can simplify the deployment, configuration, and monitoring of all EDH services, including search, from a single entry point for administration. They can gain deep insight into metrics like service utilization, monitor system health, examine performance and system trends, and facilitate search index management and resource control while balancing the needs of different workloads, like batch processing and interactive SQL, on the cluster.

Access Control

Enterprise IT teams can extend their existing authentication systems and authorization policies to Cloudera Search. The authentication framework of Cloudera Search employs Kerberos for both intra-service and client-to-service communication, so IT operators have comprehensive and centrally managed authentication for all channels throughout their entire data management environment. Cloudera Search also integrates with Apache Sentry (incubating) for granular document-level authorization to individual indices. By consolidating Search authorization within Sentry, IT operators have a centralized system for managing authorization throughout the EDH data access workloads, like batch (i.e., Hive) and interactive SQL (i.e., Impala), in addition to search.

Multi-Format Support

One of the strengths of Apache Solr is its support for a wide and growing range of data formats. Cloudera Search, with its Solr integration, supports many standard file formats for indexing (e.g., XML, JSON, HTML, Microsoft Word, PDF, JPG, email, audio, video, and others) by using supporting libraries like Apache Tika, which is part of the Solr project family. These features greatly ease data ingestion and open Cloudera Search to many existing enterprise search workloads. In addition, Cloudera Search has native support for Hadoop-optimized file formats and compression codecs, such as Snappy, Sequence Files, and Avro. By employing these open formats, IT developers can use the same data sets across the various computing capabilities within an EDH, and data stewards can ensure easy reuse across these same capabilities, including search.

Simplified Data Processing

Cloudera Search includes a collection of integrated tools and frameworks, one of which is Cloudera Morphlines, a Kite SDK library that provides common command-based data transformations. With Morphlines, search developers can simplify index configuration and replace custom Java programming with simple command calls to prepare and transform data, for example parsing, configuring, and indexing MBox files for search. These commands are both extensible and inclusive and can be used outside of search with other computing workloads in an EDH, thus promoting shared development and operations.

Examples of Cloudera Search and an EDH

As illustrated in the previous sections, Cloudera Search is a key capability of an EDH and addresses many business and technical challenges facing organizations across industries. The following examples highlight some of the uses where Cloudera Search accelerates an organization's time-to-action and time-to-discovery with its data and expedites an enterprise data hub strategy.

Real-time Event Search and Advanced Pattern Extraction

Many organizations recognize the strategic and tactical advantages of combining search and other workloads, like interactive SQL, simultaneously on the same data sets. For instance, IT operations and information security teams typically need quick drill-down and real-time correlation capabilities for varied, multi-type log data to support rapid inspection and tactical exploration of current alerts and anomalies. This same log data, however, can also serve the advanced analytics efforts of these teams, tackling subjects like capacity planning, long-term reporting, and historical trending.

[Figure 3: Search for Baseline and Real-time Analytics. Diagram labels: SLA-Bound Search, Nightly Batch Processing, Real-time Data, Calculated Baselines.]

To meet these challenges, IT teams rely on an EDH as the foundation for advanced deployments like automatic anomaly detection. In these cases, developers match incoming data against nightly batch-calculated distributions built from the same data pool. While an enterprise IT team could optimize a standalone search solution to provide real-time data for its mission-critical staff, the team faces significant cost inefficiencies when scaling to larger volumes of data. Moreover, this conventional approach mandates that an IT team move or duplicate data to expose the same information to the other processing frameworks. An EDH answers multiple technical needs for the same data cost-efficiently and at scale, and Cloudera Search serves as the primary delivery engine that brings this complex mix of baseline and real-time analytics to SLA-bound audiences.
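The idea of matching incoming data against a nightly, batch-calculated distribution can be illustrated with a small sketch. It assumes the baseline is summarized as a mean and standard deviation of per-interval event counts and that a simple z-score threshold decides what counts as anomalous; the function names, the sample counts, and the threshold of 3.0 are all hypothetical choices for illustration, not the method of any particular deployment.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize historical per-interval event counts as (mean, stdev).

    In an EDH this step would correspond to the nightly batch job.
    """
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical event counts observed in earlier intervals.
nightly_counts = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = build_baseline(nightly_counts)

# Hypothetical counts arriving in real time.
incoming = [101, 96, 250]
flags = [is_anomalous(v, baseline) for v in incoming]
print(flags)  # [False, False, True]
```

Only the last value is flagged, since it lies far outside the spread of the historical counts. In the scenario described above, the flagged events would be the ones surfaced to operators through search for drill-down, while the baseline itself is recalculated by batch processing over the same stored data.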

360 Degree View

Another common challenge for the information-driven organization is how to gain a better view and understanding of a particular subject or topic, like a patient, a client, or an order. While conceptually simple and straightforward, the details are often much less so, as a business typically needs to correlate numerous and diverse types of data that have limited or unclear intersection: linking market surveys, support tickets, and emails with sales trends, transactions, and usage logs. Yet successful correlation of these data points grants enterprises more insight into what makes a happy customer or an efficient supply chain.

For example, customer support teams use extraction and correlation across data sets within an EDH to discover how a purchased product and its associated resource and support interactions correlate with customer satisfaction and support ticket load. This mix of activity creates more accurate, actionable, and repeatable recipes for customer success. Cloudera Search is the primary application for this effort, both for interactive exploration of the various data sets during the design of the processing pipeline and for in-process introspection of the intermediate results. The same search service and resources still serve the traditional applications and audiences (the marketing, support, and sales staff) with the original data sets, without needing additional search systems.

Data Exploration

Many forward-thinking organizations turn to social media for insights and answers to get a head start over the competition. For marketing firms, investment concerns, and retailers, the analysis of Twitter, Facebook, and other external data sources can yield new visibility into market segments. Cloudera Search aids the exploration of these new forms of information by allowing richer and faster modeling of the original, full-fidelity data with clustering, faceted search, and fuzzy matching. These approaches give analysts and end users the freedom to explore the information as needed, with limited IT intervention or reliance on complex SQL queries.

An EDH readily and efficiently stores all this variable, unstructured, yet highly valuable data because of the underlying scale-out and schema-on-read characteristics of HDFS. This data is correlated with other data within the EDH; an EDH centralizes and accelerates not only the usual processing, categorization, and reporting, but also other advanced analytics, such as sentiment analysis, in this shared environment and across all of its data, unstructured or otherwise. This approach is most pervasive in trend analysis and impact analysis, as business users typically uncover better, more comprehensive results when analyzing multiple sources together and at significant depth and history. More innovative uses of this broad and deep analysis include risk and disaster prevention.

Process Validation

Organizations across many industries are turning to non-text information, like video, audio, and images, as another source of valuable data. In these cases, business teams need to improve the information received through or related to this new media, and IT teams typically seek ways to streamline both the metadata processing and decorating of these files and the serving of the combined data-metadata results. Some examples of valuable metadata include the geographical or weather conditions at the time an image or video was captured, the diagnostics reports and logs of the equipment generating a sensor reading, or downstream, post-processing details about the source content, like advanced image or voice processing.

All these scenarios benefit from an EDH, which combines a highly scalable processing platform well suited to metadata treatments with parallelized advanced processing. Cloudera Search, as one of the shared computing capabilities of an EDH, provides a critical tool for IT developers to explore in-flight and intermediate processing results. With search, developers and users can validate procedure or algorithm results or identify errors and flaws at early stages, thus accelerating overall time-to-value. Cloudera Search also lets IT teams serve the final results of the processing to multiple end audiences with free-text and faceted navigation, with no incremental cost or additional infrastructure for this function.

Similarity Discovery

Business analysts face a common problem: often they have an important data element, yet need more data points to continue analysis, for example additional documents that share topics and themes with the source document, or other genomes that are 80% similar to the subject's sequence. Finding these related data points can be complex and time consuming. Business users can accelerate their time-to-discovery by using an EDH as a very large data storage solution and executing fuzzy or similarity searches across that entire data set with Cloudera Search. This approach significantly reduces the effort spent in manual introspection, particularly across unstructured, yet important, data sets. With the relevant data sets identified, analysts can then immediately process the information further with other workloads, without moving, copying, or regenerating data. The shared data and resources of a Hadoop-based EDH, coupled with the native and rich integration of Cloudera Search, give these users analytical depth and flexibility with familiar and cost-effective tools.

Conclusion

Cloudera Search plays an important role in an organization's EDH strategy, serving a multitude of individual and multi-workload use cases. Its rich native integration with the foundations of Hadoop and its ecosystem presents IT administrators and business users alike with opportunities for more streamlined workloads across larger and more diverse data sets. Cloudera Search and an EDH empower enterprise teams to break out of separate, disconnected data silos and discover bigger insights across all types of data, structured and unstructured. With Cloudera Search, enterprises enjoy the power, flexibility, and ease of use of a proven search solution for their enterprise data hub.

About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Cloudera's open source Big Data platform is the most widely adopted in the world, and Cloudera is the most prolific contributor to the open source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 22,000 individuals worldwide. Over 1,200 partners and a seasoned professional services team help deliver faster time to value. Finally, only Cloudera provides proactive and predictive support to run an enterprise data hub with confidence. Leading organizations in every industry, plus top public sector organizations globally, run Cloudera in production.

cloudera.com
Cloudera, Inc., Page Mill Road, Palo Alto, CA 94304, USA

2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.


More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Hadoop in the Hybrid Cloud

Hadoop in the Hybrid Cloud Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

locuz.com Big Data Services

locuz.com Big Data Services locuz.com Big Data Services Big Data At Locuz, we help the enterprise move from being a data-limited to a data-driven one, thereby enabling smarter, faster decisions that result in better business outcome.

More information

The Enterprise Data Hub and The Modern Information Architecture

The Enterprise Data Hub and The Modern Information Architecture The Enterprise Data Hub and The Modern Information Architecture Dr. Amr Awadallah CTO & Co-Founder, Cloudera Twitter: @awadallah 1 2013 Cloudera, Inc. All rights reserved. Cloudera Overview The Leader

More information

CA Service Desk Manager

CA Service Desk Manager PRODUCT BRIEF: CA SERVICE DESK MANAGER CA Service Desk Manager CA SERVICE DESK MANAGER IS A VERSATILE, COMPREHENSIVE IT SUPPORT SOLUTION THAT HELPS YOU BUILD SUPERIOR INCIDENT AND PROBLEM MANAGEMENT PROCESSES

More information

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

CDH AND BUSINESS CONTINUITY:

CDH AND BUSINESS CONTINUITY: WHITE PAPER CDH AND BUSINESS CONTINUITY: An overview of the availability, data protection and disaster recovery features in Hadoop Abstract Using the sophisticated built-in capabilities of CDH for tunable

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

WHITE PAPER LOWER COSTS, INCREASE PRODUCTIVITY, AND ACCELERATE VALUE, WITH ENTERPRISE- READY HADOOP

WHITE PAPER LOWER COSTS, INCREASE PRODUCTIVITY, AND ACCELERATE VALUE, WITH ENTERPRISE- READY HADOOP WHITE PAPER LOWER COSTS, INCREASE PRODUCTIVITY, AND ACCELERATE VALUE, WITH ENTERPRISE- READY HADOOP CLOUDERA WHITE PAPER 2 Table of Contents Introduction 3 Hadoop's Role in the Big Data Challenge 3 Cloudera:

More information

Accelerate your Big Data Strategy. Execute faster with Capgemini and Cloudera s Enterprise Data Hub Accelerator

Accelerate your Big Data Strategy. Execute faster with Capgemini and Cloudera s Enterprise Data Hub Accelerator Accelerate your Big Data Strategy Execute faster with Capgemini and Cloudera s Enterprise Data Hub Accelerator Enterprise Data Hub Accelerator enables you to get started rapidly and cost-effectively with

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

An Enterprise Data Hub, the Next Gen Operational Data Store

An Enterprise Data Hub, the Next Gen Operational Data Store An Enterprise Data Hub, the Next Gen Operational Data Store Version: 101 Table of Contents Summary 3 The ODS in Practice 4 Drawbacks of the ODS Today 5 The Case for ODS on an EDH 5 Conclusion 6 About the

More information

Are You Big Data Ready?

Are You Big Data Ready? ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain

More information

Your Data, Any Place, Any Time.

Your Data, Any Place, Any Time. Your Data, Any Place, Any Time. Microsoft SQL Server 2008 provides a trusted, productive, and intelligent data platform that enables you to: Run your most demanding mission-critical applications. Reduce

More information

Executive Summary WHO SHOULD READ THIS PAPER?

Executive Summary WHO SHOULD READ THIS PAPER? The Business Value of Business Intelligence in SharePoint 2010 Executive Summary SharePoint 2010 is The Business Collaboration Platform for the Enterprise & the Web that enables you to connect & empower

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful

More information

Finding the Needle in a Big Data Haystack. Wolfgang Hoschek (@whoschek) JAX 2014

Finding the Needle in a Big Data Haystack. Wolfgang Hoschek (@whoschek) JAX 2014 Finding the Needle in a Big Data Haystack Wolfgang Hoschek (@whoschek) JAX 2014 1 About Wolfgang Software Engineer @ Cloudera Search Platform Team Previously CERN, Lawrence Berkeley National Laboratory,

More information

Your Data, Any Place, Any Time. Microsoft SQL Server 2008 provides a trusted, productive, and intelligent data platform that enables you to:

Your Data, Any Place, Any Time. Microsoft SQL Server 2008 provides a trusted, productive, and intelligent data platform that enables you to: Your Data, Any Place, Any Time. Microsoft SQL Server 2008 provides a trusted, productive, and intelligent data platform that enables you to: Run your most demanding mission-critical applications. Reduce

More information

Empowering the Masses with Analytics

Empowering the Masses with Analytics Empowering the Masses with Analytics THE GAP FOR BUSINESS USERS For a discussion of bridging the gap from the perspective of a business user, read Three Ways to Use Data Science. Ask the average business

More information

White Paper: Enhancing Functionality and Security of Enterprise Data Holdings

White Paper: Enhancing Functionality and Security of Enterprise Data Holdings White Paper: Enhancing Functionality and Security of Enterprise Data Holdings Examining New Mission- Enabling Design Patterns Made Possible by the Cloudera- Intel Partnership Inside: Improving Return on

More information

The Business Analyst s Guide to Hadoop

The Business Analyst s Guide to Hadoop White Paper The Business Analyst s Guide to Hadoop Get Ready, Get Set, and Go: A Three-Step Guide to Implementing Hadoop-based Analytics By Alteryx and Hortonworks (T)here is considerable evidence that

More information

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Apache Hadoop in the Enterprise Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache

More information

www.ducenit.com Analance Data Integration Technical Whitepaper

www.ducenit.com Analance Data Integration Technical Whitepaper Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches.

Detecting Anomalous Behavior with the Business Data Lake. Reference Architecture and Enterprise Approaches. Detecting Anomalous Behavior with the Business Data Lake Reference Architecture and Enterprise Approaches. 2 Detecting Anomalous Behavior with the Business Data Lake Pivotal the way we see it Reference

More information

Data virtualization: Delivering on-demand access to information throughout the enterprise

Data virtualization: Delivering on-demand access to information throughout the enterprise IBM Software Thought Leadership White Paper April 2013 Data virtualization: Delivering on-demand access to information throughout the enterprise 2 Data virtualization: Delivering on-demand access to information

More information

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases DATAMEER WHITE PAPER Beyond BI Big Data Analytic Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

Three Open Blueprints For Big Data Success

Three Open Blueprints For Big Data Success White Paper: Three Open Blueprints For Big Data Success Featuring Pentaho s Open Data Integration Platform Inside: Leverage open framework and open source Kickstart your efforts with repeatable blueprints

More information

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera

More information

BANKING ON CUSTOMER BEHAVIOR

BANKING ON CUSTOMER BEHAVIOR BANKING ON CUSTOMER BEHAVIOR How customer data analytics are helping banks grow revenue, improve products, and reduce risk In the face of changing economies and regulatory pressures, retail banks are looking

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

WHITE PAPER. Hadoop and HDFS: Storage for Next Generation Data Management. Version: Q414-102

WHITE PAPER. Hadoop and HDFS: Storage for Next Generation Data Management. Version: Q414-102 Storage for Next Generation Data Management Version: Q414-102 Table of Content Storage for the Modern Enterprise 3 The Challenges of Big Data 5 Data at the Center of the Enterprise 6 The Internals of HDFS

More information

AtScale Intelligence Platform

AtScale Intelligence Platform AtScale Intelligence Platform PUT THE POWER OF HADOOP IN THE HANDS OF BUSINESS USERS. Connect your BI tools directly to Hadoop without compromising scale, performance, or control. TURN HADOOP INTO A HIGH-PERFORMANCE

More information

How To Make Data Streaming A Real Time Intelligence

How To Make Data Streaming A Real Time Intelligence REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log

More information

www.sryas.com Analance Data Integration Technical Whitepaper

www.sryas.com Analance Data Integration Technical Whitepaper Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring

More information

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities

More information

Making Sense of Big Data in Insurance

Making Sense of Big Data in Insurance Making Sense of Big Data in Insurance Amir Halfon, CTO, Financial Services, MarkLogic Corporation BIG DATA?.. SLIDE: 2 The Evolution of Data Management For your application data! Application- and hardware-specific

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

The big data revolution

The big data revolution The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

IBM Cognos Enterprise: Powerful and scalable business intelligence and performance management

IBM Cognos Enterprise: Powerful and scalable business intelligence and performance management : Powerful and scalable business intelligence and performance management Highlights Arm every user with the analytics they need to act Support the way that users want to work with their analytics Meet

More information

The Future of Business Analytics is Now! 2013 IBM Corporation

The Future of Business Analytics is Now! 2013 IBM Corporation The Future of Business Analytics is Now! 1 The pressures on organizations are at a point where analytics has evolved from a business initiative to a BUSINESS IMPERATIVE More organization are using analytics

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following

More information

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management Big Data and New Paradigms in Information Management Vladimir Videnovic Institute for Information Management 2 "I am certainly not an advocate for frequent and untried changes laws and institutions must

More information

Operational Analytics

Operational Analytics Operational Analytics Version: 101 Table of Contents Operational Analytics 3 From the Enterprise Data Hub to the Enterprise Application Hub 3 Operational Intelligence in Action: Some Examples 4 Requirements

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Dell In-Memory Appliance for Cloudera Enterprise

Dell In-Memory Appliance for Cloudera Enterprise Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/

More information

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

More information

Streamlining the Process of Business Intelligence with JReport

Streamlining the Process of Business Intelligence with JReport Streamlining the Process of Business Intelligence with JReport An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) Product Summary from 2014 EMA Radar for Business Intelligence Platforms for Mid-Sized Organizations

More information

IBM InfoSphere BigInsights Enterprise Edition

IBM InfoSphere BigInsights Enterprise Edition IBM InfoSphere BigInsights Enterprise Edition Efficiently manage and mine big data for valuable insights Highlights Advanced analytics for structured, semi-structured and unstructured data Professional-grade

More information

WHITE PAPER. Five Steps to Better Application Monitoring and Troubleshooting

WHITE PAPER. Five Steps to Better Application Monitoring and Troubleshooting WHITE PAPER Five Steps to Better Application Monitoring and Troubleshooting There is no doubt that application monitoring and troubleshooting will evolve with the shift to modern applications. The only

More information

Search and Real-Time Analytics on Big Data

Search and Real-Time Analytics on Big Data Search and Real-Time Analytics on Big Data Sewook Wee, Ryan Tabora, Jason Rutherglen Accenture & Think Big Analytics Strata New York October, 2012 Big Data: data becomes your core asset. It realizes its

More information

The IBM Cognos Platform

The IBM Cognos Platform The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent

More information

Big Data Comes of Age: Shifting to a Real-time Data Platform

Big Data Comes of Age: Shifting to a Real-time Data Platform An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for SAP April 2013 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents Introduction... 1 Drivers of Change...

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

White Paper. Unified Data Integration Across Big Data Platforms

White Paper. Unified Data Integration Across Big Data Platforms White Paper Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using

More information

Unified Data Integration Across Big Data Platforms

Unified Data Integration Across Big Data Platforms Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using ELT... 6 Diyotta

More information

Optimizing the Data Center for Today s State & Local Government

Optimizing the Data Center for Today s State & Local Government WHITE PAPER: OPTIMIZING THE DATA CENTER FOR TODAY S STATE...... &.. LOCAL...... GOVERNMENT.......................... Optimizing the Data Center for Today s State & Local Government Who should read this

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Multichannel Customer Listening and Social Media Analytics

Multichannel Customer Listening and Social Media Analytics ( Multichannel Customer Listening and Social Media Analytics KANA Experience Analytics Lite is a multichannel customer listening and social media analytics solution that delivers sentiment, meaning and

More information

OpenText Output Transformation Server

OpenText Output Transformation Server OpenText Output Transformation Server Seamlessly manage and process content flow across the organization OpenText Output Transformation Server processes, extracts, transforms, repurposes, personalizes,

More information

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All

More information

Databricks. A Primer

Databricks. A Primer Databricks A Primer Who is Databricks? Databricks was founded by the team behind Apache Spark, the most active open source project in the big data ecosystem today. Our mission at Databricks is to dramatically

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Master big data to optimize the oil and gas lifecycle

Master big data to optimize the oil and gas lifecycle Viewpoint paper Master big data to optimize the oil and gas lifecycle Information management and analytics (IM&A) helps move decisions from reactive to predictive Table of contents 4 Getting a handle on

More information

WHITE PAPER SPLUNK SOFTWARE AS A SIEM

WHITE PAPER SPLUNK SOFTWARE AS A SIEM SPLUNK SOFTWARE AS A SIEM Improve your security posture by using Splunk as your SIEM HIGHLIGHTS Splunk software can be used to operate security operations centers (SOC) of any size (large, med, small)

More information

BEYOND BI: Big Data Analytic Use Cases

BEYOND BI: Big Data Analytic Use Cases BEYOND BI: Big Data Analytic Use Cases Big Data Analytics Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

The Clear Path to Business Intelligence

The Clear Path to Business Intelligence SAP Solution in Detail SAP Solutions for Small Businesses and Midsize Companies SAP Crystal Solutions The Clear Path to Business Intelligence Table of Contents 3 Quick Facts 4 Optimize Decisions with SAP

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable

More information

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK 5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK CUSTOMER JOURNEY Technology is radically transforming the customer journey. Today s customers are more empowered and connected

More information

Protecting Big Data Data Protection Solutions for the Business Data Lake

Protecting Big Data Data Protection Solutions for the Business Data Lake White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With

More information

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

Data Governance in the Hadoop Data Lake. Michael Lang May 2015 Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales

More information