Building Governance into Big Data
A metadata-based approach for ensuring visibility and control for your Hadoop data architecture

A Hortonworks White Paper, September 2015
© 2015 Hortonworks
Contents

Overview
Why data governance matters
Four essential elements of Hadoop data governance
Why metadata and taxonomy hold the key to comprehensive data governance
The Data Governance Initiative (DGI): building cross-industry metadata services in Hadoop
Addressing cross-industry use cases
DGI becomes Apache Atlas
Supporting data governance across industries through a flexible type system
Key characteristics and capabilities of Atlas
Competitive analysis
Summary
Overview

As organizations pursue Hadoop initiatives in order to capture new opportunities for data-driven insight, data governance requirements can pose a key challenge. The management of information to identify its value and enable effective control, security and compliance for customer and enterprise data is a core requirement for both traditional and Modern Data Architectures. However, it hasn't been clear how to address these requirements easily using Hadoop. Traditional data governance tools either treat Hadoop as a black box, with no visibility or access into internal data manipulation (ETL and the like), or impose significant restrictions in order to meet these requirements, such as requiring every job to be authored within a single tool, undermining the value of the breadth of tooling across the Hadoop Modern Data Architecture.

While Hadoop produces a large amount of operational and application-related data that can be used for auditing purposes, attempting to discern meaning from this information with a forensic, rear-view-mirror approach can yield inconsistent and inaccurate results. As a result of these challenges, a Data Lake can easily become a data swamp as users lose track of what data it contains, where it came from and the processes used to shape it.

Hortonworks, committed to innovation at the core, has been a leader in industry efforts to weave data governance into the fabric of the Modern Data Architecture. Recognizing that Hadoop isn't an island of data, our approach has been to ensure that everything we build is open and can integrate within the context of the Modern Data Architecture. This approach provides our customers with a comprehensive view of data as it moves between systems and is transformed and accessed. The realization of this approach has revolved around a common set of metadata services: helpful information that describes and provides context about other data.
Through an open, collaborative initiative with a small number of industry thought leaders, Hortonworks has helped develop capabilities and frameworks that can be applied across industries to ensure effective management and governance of Big Data environments. Working with these partners, Hortonworks launched Apache Atlas to apply consistent metadata and taxonomy across the data ecosystem. Hortonworks empowers data managers to ensure the transparency, reproducibility, auditability and consistency of the Data Lake and the assets it contains. Hadoop-centric information can be leveraged in this broader context using third-party products to form a comprehensive view. In this way, Apache Atlas sits at the core of data governance for Hadoop and makes it possible for enterprises to capitalize on the power of Big Data to drive growth, differentiation and competitive advantage while maintaining full control and oversight.
Why data governance matters

Data governance is a matter of critical importance for every organization that relies on data to drive business value, which is to say virtually every organization today. Businesses in highly regulated industries such as finance and healthcare must maintain effective control and visibility over data to ensure auditability and compliance. For other companies, data governance is crucial for securing sensitive information and protecting customer privacy while helping employees leverage the full value of information to drive growth and differentiation. And every company, as it grows and expands its Data Lake beyond the first few use cases and applications, needs an easy way to explore the data sets that exist within the lake.

At the same time, data governance needs to be built in and automated as much as possible. The approach should support the process of bringing data into Hadoop and be applied consistently across every subsequent access point to the data itself. What enterprises need is an approach to data governance for Big Data that creates value by:

- Enabling rapid discovery of datasets already contained within Hadoop, eliminating requests for duplicate data to be curated or ingested
- Addressing compliance reporting requirements for Hadoop related to data access and lineage, to reduce both cost and regulatory risk
- Supporting comprehensive data governance initiatives that span Hadoop and traditional data systems

As Hadoop enables enterprises to grow the volume, velocity and variety of data that can be leveraged for insight, the importance of governance grows in tandem with the scale of the Data Lake. By building effective data governance into the architecture that powers Big Data, businesses can realize the full value of their information assets while ensuring effective risk management.
Four essential elements of Hadoop data governance

Critics of the Data Lake approach have characterized it as "throw all the data into the cluster now, and worry about cleansing, reconciliation and enrichment later." Hadoop's schema-on-read functionality allows users to forgo the definition and organization of data as it enters the system, while its distributed architecture facilitates the persistence of data. As a result, organizations have unchecked permission to store virtually any type of data while delegating data management and governance to application layers operating on top of the platform. This approach is all too likely to transform an organization's Data Lake into a data swamp while fostering additional governance risks.

To realize the full value of Hadoop, enterprises must reconcile data management realities when they bring existing and new data from disparate sources into the Hadoop platform. Metadata and its use in the context of data governance are vital parts of any enterprise-ready Data Lake, and must be built into the ecosystem from the outset to prevent increasingly complex data management challenges moving forward.
The Hortonworks philosophy for data governance in the enterprise revolves around four tenets:

- Auditability: All relevant events and assets must be traceable with appropriate lineage
- Transparency: Governance standards and protocols must be clearly defined, consistently applied and available to all
- Reproducibility: Relevant data landscapes should be reproducible at any given point in time
- Consistency: Compliance programs must be policy-driven

Why metadata and taxonomy hold the key to comprehensive data governance

The success of data governance fundamentally revolves around capturing metadata and defining meaningful taxonomies for data. A definition of these concepts can provide a useful context for understanding their value within Hadoop and the broader data ecosystem.

Metadata is information that describes and provides information about other data. This may include data models, schemas and administrative information, in addition to attributes such as title, author, subject, tags, date created and description. Once defined and documented, these attributes can be used to search, link, aggregate and grant access to the associated dataset. Metadata falls into three broad categories:

- Technical metadata: database name, table name, column name, data type
- Business metadata: business names, business definitions, business classifications, sensitivity tags
- Operational metadata: who (security access), what (job information), when (logs/audit trails), where (location)

Taxonomy refers to any structure that is used to organize and classify information.
Taxonomies are used as part of metadata fields to support consistent and accurate indexing of data structures, and to define the relationships among them. A taxonomy may include a standardized list of terms (a vocabulary) that can be used to consistently order data classification structures and/or hierarchies into parent-child relationships.

One can think of metadata as a framework or filing cabinet for data, and taxonomy as a mechanism for organizing it into folders. This approach makes it possible to organize even vast amounts of information consistently, just as a similar hierarchical approach is used to categorize the millions of different life forms on earth into a rational and manageable structure of families, genera and species. This can be contrasted with the simple name-value pairs used elsewhere, which are really free-form labels with no hierarchical structure and a vulnerability to error and duplication. It should also be noted that taxonomies can and do change with time, so accounting for changes that can occur within taxonomies is critical to the success of any system that leverages them.
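The contrast between a hierarchical taxonomy and flat name-value labels can be sketched in a few lines of Python. This is an illustrative model only, not Atlas code; the class and term names are invented for the example.

```python
# Illustrative sketch (not Atlas internals): a taxonomy term whose
# children inherit every tag applied to an ancestor, versus a flat
# name-value label, which carries no such structure.

class TaxonomyTerm:
    def __init__(self, name, parent=None, tags=None):
        self.name = name
        self.parent = parent            # parent-child relationship
        self.own_tags = set(tags or [])

    def effective_tags(self):
        """Tags applied here plus tags inherited from all ancestors."""
        inherited = self.parent.effective_tags() if self.parent else set()
        return inherited | self.own_tags

# A tag applied once at the parent level...
finance = TaxonomyTerm("Finance", tags={"Sensitive"})
# ...is inherited automatically by every child term.
invoices = TaxonomyTerm("Invoices", parent=finance, tags={"PII"})

print(invoices.effective_tags())   # contains both "Sensitive" and "PII"
```

A flat name-value label would have to be re-applied, correctly spelled, on every individual asset; the hierarchy makes one parent-level assignment sufficient.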
Combining technical and business taxonomical metadata is the key to consistent data governance within Hadoop and the broader data ecosystem. A common metadata and classification framework ensures that all applications operating on top of Hadoop infrastructure will relate to and treat data in the same way.

DATA + METADATA + BUSINESS TAXONOMY = AUDIT & GOVERNANCE

- Data: HDFS files, HCatalog definitions, Falcon pipelines, Ranger sets of users
- Metadata: title, description, author, subject, date created, date modified, data sensitivity
- Business taxonomy: organizational hierarchy, customer/industry vocabulary, industry compliance standards
- Audit & governance: who did what, where, when and how

The Data Governance Initiative (DGI): building cross-industry metadata services in Hadoop

The application of data governance best practices for Hadoop is complicated by its current lack of a comprehensive approach to deliver visibility and control into workflows that require audit, lineage and security. While a number of available vendor solutions seek to fill this gap, they are not integrated into the broader Hadoop ecosystem and require a siloed, monolithic workflow. Governance vendors' support for multi-tenancy and concurrency is less than ideal, as current offerings do not have visibility into activity outside their own narrow focus.

Hortonworks has been a leader in industry efforts to address these challenges for Open Enterprise Hadoop. As part of our promise to drive enterprise readiness for Hadoop, Hortonworks established the Data Governance Initiative (DGI) in collaboration with Aetna, Merck, Target and SAS. The charter of this initiative was to introduce a common, metadata-powered approach to data governance into the open source community, and to establish a framework with the flexibility to be applied across industries. Since its inception, this co-development effort has grown to include Schlumberger and a global financial institution.
DGI members set forth two guiding principles:

- The Hadoop data governance framework must integrate seamlessly with existing frameworks and exchange metadata with them
- The framework must also address governance across all the components or data engines that operate on top of the Hadoop platform
Figure 1: The Data Governance Initiative (DGI) laid the foundation for a common, metadata-powered approach to data governance.

DGI members worked on this shared framework to determine how users access data within Hadoop while interoperating with, and extending its capability to, existing third-party data governance and management tools.

Addressing cross-industry use cases

By bringing together leading companies with deep expertise across a range of industries, DGI made it possible to develop a truly cross-industry, extensible framework. DGI members actively worked to materialize real industry data governance solutions through the open source community at an unprecedented rate. The expertise brought to the DGI by its members manifested itself in addressing the following use cases across financial services, healthcare, pharmaceuticals and telecommunications.

Chain of custody (compliance): The financial services sector operates under strict regulations that require detailed audit tracking of every event's origin, access and transformation in order to comply with customer and governmental inquiries. This involves tracking every copy, backup and derivation of each dataset, in addition to actions regarding data access or denial. Financial services companies must be able to recreate the narrative for every dataset, from its creation through its disposition, at any given time.

Healthcare ad hoc reporting (30-day measures): Reimbursements by the Centers for Medicare & Medicaid Services (CMS) represent a significant portion of healthcare provider revenues. A healthcare institution's bottom line can be adversely affected if it is penalized by the CMS for poor patient readmission rates, making it essential to be able to assess and track patient outcomes over their entire history. This involves analyzing a wide set of sensitive patient data from disparate data sources on an ad hoc basis for timely remediation.
The work that was done as part of the DGI can be used to discover, catalog and score patient data rapidly and accurately and present it in the relevant context.
Licensing of research data (data masking): To optimize return on investment for product development cycles that can stretch over years, pharmaceutical companies often license research data to other companies or partners. Each licensing agreement has specific requirements, often requiring data to be shared in its entirety with licensing customers or partners. To complicate matters, this data may contain sensitive personally identifiable information (PII), protected health information (PHI) or both. To prevent regulatory violations, the licensing company must mask this sensitive information while still making the entire dataset available to users based on their roles or data attributes. All these factors must be managed and coordinated in an efficient way. Energy companies often rely on similar licensing deals to monetize their own research data; while the regulatory environment differs, some of the same challenges come into play.

Log analysis (customer experience): Data from telephony, networked devices, set-top boxes and websites holds vast quantities of information about the experience of individual telecommunications customers. This information is highly valuable to telecom companies, as inconsistent customer service can easily increase customer attrition and lower service margins. However, current data technologies make it extremely difficult to correlate customer events spread across a number of years and petabytes of data, making insights harder to expose. Opt-in customer data is specific to device, subscribed product, time and geography, and the lineage of all these attributes must be tracked to enable effective analysis. To mitigate subpar customer experience, providers must perform both real-time and predictive analysis of live streaming data, complemented by deep historical analysis. This analysis must be performed using compliant methods that are grounded in established data governance practices.
DGI laid the foundation to provide true visibility into Hadoop business processes such as these and other key use cases across industries.

DGI becomes Apache Atlas

Building on the success of DGI, Hortonworks, Aetna, Merck, SAS, Schlumberger, Target and others carried their groundbreaking co-development efforts into a new Apache project. In April 2015, they submitted a proposal for a new incubator project called Apache Atlas to the Apache Software Foundation. The founding members of the project include all the members of the DGI along with others from the Hadoop community.

Apache Atlas was proposed to provide governance capabilities in Hadoop. At its core, Atlas is designed to exchange metadata both within and outside of the Hadoop stack. By reconciling both logical data models and forensic events, enriched by business taxonomy metadata, Atlas enables a scalable set of core governance services. These services enable enterprises to effectively and efficiently address their compliance requirements by providing:

- Search and lineage for datasets
- Metadata-driven data access control
- Indexed and searchable centralized audit for operational events
- Comprehensive data lifecycle management, from ingestion to disposition
- Metadata interchange with other metadata tools
In this way, Atlas allows organizations to establish reliable and safe information products and better utilize information assets to generate revenue. By helping to eliminate duplicate data along with its associated cost, Atlas makes it easier for IT to support data exploration and compliance. As Hadoop enables enterprises to grow the volume, velocity and variety of data an enterprise can leverage for insight, the importance of governance grows with the Data Lake.

A common metadata store provides the foundation for addressing these requirements and delivering a broad range of data governance capabilities for Hadoop. It also provides a focal point for interoperability for any metadata consumer within the ecosystem and within the Modern Data Architecture, rather than requiring each project or component within the Hadoop stack to provide its own unique interface. This further reduces cost and complexity for IT while enabling a holistic approach to data governance across the Data Lake. Rather than requiring each third-party product (ETL tools, broader data governance tools, etc.) to understand which projects and components are within the Hadoop ecosystem, Atlas provides a focal point for interoperability and information exchange.

Of course, this isn't delivered in a big-bang approach, but rather as a sustained open source effort. The community has decided to take a gradual approach to delivering comprehensive interoperability capabilities and has come together to define and build the core of Apache Atlas. The community has also outlined a clear roadmap to integrate a number of Hadoop ecosystem components with the common metadata store. Hive was chosen as the starting point due to its maturity, its existing footprint among current Hadoop users and the fact that it is similar in concept to existing enterprise data warehouse technologies that are subject to these same data governance challenges.
Figure 2: Atlas delivers out-of-the-box integration with Apache Hive as its starting point, with plans to expand from there.
Supporting data governance across industries through a flexible type system

The Apache community built Atlas with the realization that when it comes to data governance, one size doesn't fit all. It would be impractical for the community to attempt to create a "super" data model that would satisfy the unique requirements of all the diverse industries and business processes. This approach would also result in duplicate data models, given that enterprises across industries have already invested significant resources in building and refining the data models that reflect the unique ways in which they do business.

A much more effective and efficient approach is to provide enterprises with the ability to import and export metadata as it currently exists in non-Hadoop systems such as ETL tools, ERP systems or data warehouses. The Atlas adaptive model streamlines compliance efforts by allowing companies to import existing metadata structures from other sources via REST-based APIs to leverage legacy investments, or to pre-load a taxonomy-rule combination for a specific industry or line of business. This approach is especially relevant for companies in the payment card industry (PCI), where a consistent metadata vocabulary ensures that downstream audit and compliance processes will match perfectly with metadata tags and access rules.

With Atlas, data stewards also have the ability to define, annotate and automate the capture of relationships between data sets and underlying elements, including source, target and derivation processes. Atlas ensures downstream metadata consistency across the ecosystem by enabling enterprises to easily export metadata to third-party systems. The advantages of the flexible type system can be seen in its day-to-day use: Atlas empowers IT to model business organizations as well as technical metadata about enterprise data.
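As a rough sketch of what importing a metadata structure over a REST-based API might look like, the snippet below builds a minimal JSON type definition. The host, endpoint path and payload shape are assumptions made for illustration, not the verified Atlas API contract; consult the Atlas REST documentation for your release before using anything like this.

```python
import json

# Hypothetical endpoint; the real path and port depend on the Atlas
# release and deployment.
ATLAS_TYPES_URL = "http://atlas-host:21000/api/atlas/types"

def build_typedef(type_name, string_attrs):
    """Build a minimal type-definition payload for a business entity.

    The payload shape here is an illustrative assumption, not the
    exact schema Atlas expects.
    """
    return {
        "typeName": type_name,
        "attributeDefs": [{"name": a, "typeName": "string"}
                          for a in string_attrs],
    }

payload = build_typedef("CustomerAccount", ["accountNumber", "region"])
body = json.dumps(payload)
# An HTTP client would POST `body` to ATLAS_TYPES_URL with suitable
# authentication; a matching export call would round-trip the same
# metadata back out to a third-party tool.
print(body)
```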
Administrators can create ad hoc or bulk structures that allow users to assign a business tag (taxonomy) to physical data structures, including databases, tables or columns. For example, a data steward can assign a PII (personally identifiable information) tag to a column in a Hive table that contains employees' Social Security numbers. Whenever that column is used as part of a business workflow or is queried for analysis purposes, it carries the PII tag with it, and the user is notified of its appropriate use. Since Atlas is aware of how and when a tagged data structure was accessed, copied or modified, it can construct the structure's lineage at any given time based on actual data events. This approach provides enterprises with confidence that their data governance processes are comprehensive enough to pass independent audit.

This approach is also applicable to logical data structures (business taxonomy) such as hierarchies of departments or products. A data administrator can tag a data structure once at the parent level, and all the associated child elements automatically inherit that tag. For example, a human resources data asset group can be tagged "sensitive" or "PII", and all child groups inside that parent group, such as Drivers or Timesheets, will inherit this attribute.

Figure 3: Apache Atlas enables business tags applied to the parent entity to be automatically inherited by child entities.
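The way a PII tag follows data through downstream workflows can be sketched as a simple propagation rule over lineage edges. The function and dataset names below are hypothetical; they model the behavior described above, not Atlas's implementation.

```python
# Illustrative sketch: a derived dataset inherits the classification
# tags of every source it was computed from, so a PII tag applied to
# one Hive column follows the data into downstream datasets.

def derive(name, sources):
    """Create a downstream dataset that inherits its sources' tags."""
    tags = set()
    for src in sources:
        tags |= src["tags"]
    return {"name": name,
            "tags": tags,
            "sources": [s["name"] for s in sources]}

ssn_column = {"name": "employees.ssn", "tags": {"PII"}, "sources": []}
dept_table = {"name": "departments", "tags": set(), "sources": []}

report = derive("payroll_report", [ssn_column, dept_table])
print(report["tags"])   # the PII tag has propagated downstream
```

Because each derived dataset also records its sources, walking the `sources` links backward reconstructs lineage from actual derivation events rather than from log correlation alone.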
Key characteristics and capabilities of Atlas

As a result of the collaborative approach to its development, Atlas provides a robust and comprehensive framework for addressing governance for Big Data. The following attributes contribute to its unique effectiveness in this regard.

Prescriptive lineage

Lineage typically refers to the steps a dataset took to arrive at its current state, as well as any copies that may have been created. However, simply looking at audit or log correlations alone is not enough to determine whether the lineage is flawed, as it is not possible to determine with certainty whether the route a data workflow took was correct or in compliance. Data governance approaches based on time-based algorithms are especially problematic, as this inaccurate process can lead to misplaced confidence in a method that would never pass serious compliance scrutiny. Without a more comprehensive understanding, it is impossible to take any action that might be warranted. The correct approach is to combine logical models of workflow with log events for validation and completeness, an approach called prescriptive lineage. This is the path that Atlas takes.

Dynamic, metadata-based access policies for real-time policy enforcement

Governance control cannot be passive or simply forensic; reports on who did what, when, are not enough. Apache Ranger is an open source project that provides authorization and authentication for the Hadoop ecosystem. By integrating with Ranger, Atlas empowers enterprises to rationalize compliance policy at runtime based on Atlas's data classification schemes, leveraging Ranger to enforce flexible attribute-based policies that prevent violations from occurring. Ranger's centralized platform empowers data administrators to define security policy once, based on Atlas metadata tags or attributes defined by a data steward or administrator, and apply this policy in real time to an entire hierarchy of assets.
Data stewards can focus on discovery and tagging while another group manages compliance policy. This decoupling of explicit policy offers two important benefits:

- Dynamic policy enforcement: data analysis-driven tags can be enforced immediately
- Reusability: one policy can be applied to many assets, simplifying management

Apache Ranger enforces both role-based (RBAC) and attribute-based (ABAC) access control to create a flexible security profile that meets the needs of data-driven enterprises. The initial set of policies being constructed within the community is defined as:

1. Attribute-based access controls: For example, a column in a particular Hive table is marked with the metadata tag PII. This tag is then used to assign multiple entitlements to a group. This is an evolution from role-based entitlements, which require discrete and static one-to-one mappings.

2. Prohibition against dataset combinations: It's possible for two datasets, for example one consisting of account numbers and the other of customer names, to be in compliance individually but pose a violation if combined. Administrators can apply a metadata tag to both sets to prevent them from being combined, helping avoid such a violation.

3. Time-based access policies: Administrators can use metadata to define access according to time windows in order to enforce compliance with regulations such as SOX 90-day reporting rules.

4. Location-specific access policies: Similar to time-based access policies, administrators can define entitlements differently by geography. For example, a U.S.-based user might be granted access to data while still in a domestic office, and then travel to Switzerland. Although the same user may be trying to access the same data, the different geographical context would apply, triggering a different set of privacy rules to be evaluated.
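A toy version of how such rules might combine into a single access decision is sketched below. The rule shapes and field names are invented for illustration; actual enforcement is performed by Apache Ranger's policy engine, not by code like this.

```python
from datetime import datetime, timezone

def allow_access(resource_tags, user, when, policies):
    """Evaluate tag-, time- and location-based rules for one request."""
    for rule in policies:
        if rule["tag"] not in resource_tags:
            continue                                   # rule does not apply
        if user["location"] not in rule["allowed_locations"]:
            return False                               # location-specific rule
        if not (rule["from_hour"] <= when.hour < rule["to_hour"]):
            return False                               # time-window rule
    return True

# One hypothetical rule: PII data is reachable only from the U.S.,
# and only during business hours.
policies = [{"tag": "PII",
             "allowed_locations": {"US"},
             "from_hour": 8, "to_hour": 18}]

noon = datetime(2015, 9, 1, 12, 0, tzinfo=timezone.utc)
print(allow_access({"PII"}, {"location": "US"}, noon, policies))  # True
print(allow_access({"PII"}, {"location": "CH"}, noon, policies))  # False
```

The same user requesting the same data from Switzerland is denied by the location rule, mirroring the travel scenario in point 4 above.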
These policies can be used in combination to create a very sophisticated security access policy for each user at a given point in time and location. Moreover, the reach that Apache Ranger provides in terms of authorization for an ever-growing number of Hadoop ecosystem components (eight at the time of this writing) allows organizations to consistently define and apply metadata-based data access policies regardless of the route by which a user or application attempts to access the data itself.

Audit and reporting

Atlas leverages a common metadata store and policy rules, and the community plans to extend this further with centralized log data for advanced reporting and analysis. Customers can recreate the data landscape at any given time by capturing security access information for every application, process and interaction with data, thereby providing insight into operational information for completed tasks as well as intermediate steps and activities. In the future, by combining the capabilities of HDP log search with a cross-component globally unique identifier (GUID), Atlas will strive to provide greater visibility into the entire HDP stack.

RESTful APIs

Atlas facilitates exploration of audit information by providing pre-defined navigation paths to data classification and audit information. Text-based search features in Atlas locate relevant data and audit events across the Data Lake quickly and accurately. Data stewards have the power to visualize a dataset's lineage and then drill down into operational, security and provenance-related details.

Native connector for Hive integration

HDP 2.3 saw the initial release of Atlas. Included is a native connector that automatically captures all SQL activity on HiveServer2. All activity through HiveServer2 is tracked, providing lineage of both the data and the schema. This is then combined with business taxonomy to provide an enriched search and discovery capability.
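To make the text-based search capability concrete, the snippet below assembles a query URL for an HTTP GET against a search endpoint. The host, endpoint path and exact query grammar are assumptions for illustration; the SQL-like query shown mimics the DSL style this paper describes and should be checked against the Atlas search API documentation for your release.

```python
from urllib.parse import urlencode

# Hypothetical base URL; real host, port and path vary by deployment.
ATLAS_BASE = "http://atlas-host:21000"

def dsl_search_url(base, query):
    """Build the URL for a DSL (SQL-like) metadata search."""
    return base + "/api/atlas/discovery/search/dsl?" + urlencode({"query": query})

# Find Hive tables by name using the SQL-like DSL; an HTTP client
# would GET this URL with suitable authentication.
url = dsl_search_url(ATLAS_BASE, 'hive_table where name = "employees"')
print(url)
```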
Governance-ready certification

Atlas strives to foster a vibrant ecosystem to address Hadoop application integration requirements based on a centralized metadata store. A certification program aims to create a curated group of partners that contribute a rich set of data management features encompassing data preparation, integration, cleansing, tagging, ETL visualization and collaboration. Certified partners will define a set of metadata standards for exchanging data and contribute conforming data integration features to the metadata store. Customers can then subscribe to the features they want to deploy, with low switching costs and faster ramp-up times. Smaller firms can differentiate themselves by contributing innovative features to the program and benefit from other features to devise end-to-end workflow processing.
Competitive analysis

As a result of the collaborative development of Atlas following the principles of Open Enterprise Hadoop, HDP offers key advantages over solutions developed through a proprietary approach to Hadoop.

METADATA SERVICES

- Hortonworks Data Platform: Metadata built around a core flexible type system that can model any organizational and data structure, with support for hierarchies and inheritance of attributes (parent-to-child elements).
  Proprietary Hadoop: Flat modeling using name-value pairs; coarse and inelegant data modeling with no hierarchy or inheritance support.

- Hortonworks Data Platform: Open, platform-wide metadata integration to provide cross-component lineage and dependencies.
  Proprietary Hadoop: Lineage support for HCatalog, Hive and HDFS only; no support for Kafka or Storm.

- Hortonworks Data Platform: Open metadata services coordinate and support the entire platform, including complete SQL lineage, tag-based real-time policy protection and a common taxonomy for data pipelines; custom connections supported through a rich REST API set.
  Proprietary Hadoop: Limited proprietary point integrations for certain components only (HCatalog, Hive and HDFS).

PRESCRIPTIVE LINEAGE

- Hortonworks Data Platform: Business and operational lineage, combining logical models of workflow and log events for validation and completeness.
  Proprietary Hadoop: Operational event data lineage assembled by algorithm; backward-looking only, with no validation for missing elements.

- Hortonworks Data Platform: Taxonomy support, with lineage searchable by both hierarchical business taxonomy (classification) and tags (traits) such as PII, as well as by data type (Hive table, column, etc.).
  Proprietary Hadoop: Search only on operational data and flat labels; no validation against taxonomy for duplications or typos.

- Hortonworks Data Platform: Advanced search via a domain-specific language (a SQL-like DSL) that supports keyword and full-text search.
  Proprietary Hadoop: Full-text search only.
DATA LIFE CYCLE

- Hortonworks Data Platform: A reusable logical model to create reusable and repeatable workflows.
  Proprietary Hadoop: Manually create each job and schedule.

- Hortonworks Data Platform: Built-in data management policies, including late data handling, replication (both HDFS and Hive) and eviction (disposition).
  Proprietary Hadoop: Manually create each job and schedule.

THIRD-PARTY SUPPORT

- Hortonworks Data Platform: Governance-ready certification that partners are being good citizens: a common metadata store, no proprietary formats, required use of open APIs and an SLA for lineage commits.
  Proprietary Hadoop: Not available.

- Hortonworks Data Platform: Low cost and no vendor lock-in; a common metadata store allows HDP users to change vendors with minimal switching cost, and customers retain metadata control and ownership.
  Proprietary Hadoop: Not available; vendor lock-in with a typical cycle of configuration and migration.

- Hortonworks Data Platform: Agility and rapid customization; common metadata allows rapid deployment of new vendors or features with minimal downtime and risk, and data management tools are available a la carte instead of only in rigid suites.
  Proprietary Hadoop: Vendor-specific proprietary point solutions; no shared metadata; not open.
15 Summary

The transformative value of Big Data has driven the rapid adoption of Hadoop across businesses and industries of all kinds, but for Hadoop to be a truly enterprise-ready technology, its implications for data governance must be recognized and addressed. To manage risk, organizations need a comprehensive and effective way to ensure full visibility, control and compliance for the corporate and customer information in the Data Lake.

Recognizing data governance as an essential element of Open Enterprise Hadoop, Hortonworks has collaborated with industry partners to create a flexible, open framework based on metadata and taxonomy to ensure the auditability, transparency, reproducibility and consistency of the Data Lake and the information it contains. This metadata-based approach is embodied in Apache Atlas, a project developed collaboratively by Hortonworks and a diverse group of large enterprises. Atlas will provide a single gateway to interface with all the diverse components in the HDP stack and will harmonize them with the rest of the enterprise data ecosystem. A core flexible type system allows modeling of any organizational or data structure, with built-in support for hierarchies and inheritance of attributes or tags (parent-to-child elements). Administrators also benefit from rich capabilities to define and enforce policies flexibly, supporting a wide range of industry use cases, and to take action quickly when data governance policies are violated.

The power and versatility of Hadoop are the direct result of its open and collaborative development. By continuing this approach to address key enterprise requirements for data governance, Hortonworks helps companies leverage the strength of the open source community to manage risk without compromising productivity or data accessibility. In this way, customers can be confident that their Big Data strategy is built on a foundation of visibility, control and compliance.
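The tag-based policy enforcement summarized above can be sketched as a simple access decision. In a real deployment this decision is made by a policy engine consuming the metadata tags, not by application code; the tag name, group names and grant table below are purely illustrative:

```python
# Illustrative tag-based access decision: an entity carrying a tag such
# as "PII" may only be read by users in groups granted that tag. The
# grant table and names here are hypothetical, for illustration only.

TAG_GRANTS = {"PII": {"compliance", "data_stewards"}}

def can_read(entity_tags, user_groups):
    """Allow access only if every tag on the entity is granted to the user."""
    for tag in entity_tags:
        allowed_groups = TAG_GRANTS.get(tag, set())
        if not allowed_groups & user_groups:
            return False
    return True

print(can_read({"PII"}, {"marketing"}))      # → False
print(can_read({"PII"}, {"data_stewards"}))  # → True
```

Because the check keys on tags rather than on individual tables or files, newly tagged data is protected immediately, which is what makes metadata-driven enforcement "real-time" rather than a per-dataset configuration exercise.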
About Hortonworks

Hortonworks develops, distributes and supports the only 100% open source Apache Hadoop data platform. Our team comprises the largest contingent of builders and architects within the Hadoop ecosystem, representing and leading the broader enterprise requirements within these communities. Hortonworks Data Platform deeply integrates with existing IT investments, giving enterprises a foundation on which to build and deploy Hadoop-based applications. Hortonworks has deep relationships with the key strategic data center partners that enable our customers to unlock the broadest opportunities from Hadoop. For more information, visit
More information