A White Paper by Grant Ingersoll, Chief Scientist for LucidWorks, and Ted Dunning, Chief Application Architect for MapR

White Paper: Crowd Sourcing Reflected Intelligence Using Search and Big Data

How LucidWorks and MapR can reflect crowd-sourced intelligence by leveraging Lucene/Solr

LucidWorks & MapR: Crowd-Sourced Intelligence

Abstract

This white paper explores how search has evolved in recent years beyond keyword search into a more broadly applicable information discovery tool by using principles of reflected intelligence. The paper then demonstrates how several organizations combine big data, search, and reflected intelligence to improve search results and decision-making. It concludes with a discussion of how LucidWorks and MapR work together to make this possible and how organizations can get started using reflected intelligence in their search applications.

The Evolution of Search

Search has become a mainstream and integral part of our daily lives. It is helpful to remember, however, that it wasn't always this way. In the early days of the Internet, tools like Archie, Veronica, and Jughead emerged to search for particular file names stored on FTP servers and Gopher listings. Once the World Wide Web was established with the release of the first browser and server code from CERN in 1992, search engines like WebCrawler, Lycos, Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista emerged to help unlock the information stored across what were, at the time, thousands of Web servers with perhaps hundreds of thousands of pages of information. As the Internet continued to grow exponentially, Google's innovative PageRank algorithm allowed the company to break away from the pack and eventually become synonymous with search on the Web.

While this was all developing on the public Web, companies began to realize that they also had vast stores of information, both structured (enterprise applications, databases, and spreadsheets) and semi-structured (emails, documents, presentations, multimedia, etc.), and that search technology can provide an effective means to uncover insights and correlations about things like customers, products, and markets.
Enter Lucene/Solr and LucidWorks

Lucene was accepted into the Apache Software Foundation in 2001 and later became its own top-level project. By 2007, companies such as Netflix, eHarmony, and Cisco, among others, began adopting Apache Lucene as an open source alternative to proprietary search engines for both internal and customer-facing applications. Apache Solr began as an internal project at CNET to provide a better search experience for its web visitors. Using Lucene's search algorithms as its internals, Solr added a search server and significant additional capabilities. Solr was donated to the Apache foundation in 2006 and over the last several years has become tightly coupled with Lucene. Lucene/Solr has become the de facto standard in open source search, in use at tens of thousands of companies and supported by a thriving developer community that numbers in the thousands.

Along Comes Hadoop and MapR

In 2006, the Lucene project spawned a sub-project by the name of Hadoop. By early 2008, it had become a top-level Apache project, and it is now a de facto standard for big-data analysis. What began as a way to make the Nutch web crawler scale to handle larger and larger crawling jobs has since morphed into a general-purpose distributed file system and computation framework used in a wide variety of large-scale applications such as log processing and data warehousing.

Improving the Search Experience: the Next Frontier

Products such as Apache Lucene/Solr and LucidWorks Search have commoditized enterprise search, making it easy to deploy and easy to obtain highly relevant search results. As a result of LucidWorks' substantial investment, it no longer requires a search engine expert or developer to stand up an enterprise search server and tune it for optimal performance. Search professionals no longer need to focus on the basics of search, like how to index and search content.
Instead, implementers can focus on the real challenge of improving the user's application experience by exploiting the intersection of content, content relationships, user interactions, and access. In most cases, exploiting this information requires a big data solution because of the large amounts of data and user interactions seen in many applications. Even if there isn't a large amount of content, distributed computation is often useful for speeding up computationally intensive tasks like natural language processing.

Search Abuse: An Evolutionary Step Forward

As search has evolved beyond simple text retrieval, it has emerged as a building block for addressing tougher challenges like fuzziness, relevance ranking, and probabilities, all across data stores that include structured and semi-structured information. Search abuse is the notion that software intended for one thing (text retrieval) is re-purposed as a building block for another thing (non-textual analytics). Specifically, all kinds of data, including structured data or records of user behavior, can be analyzed by a search engine as a component of a larger system.

These kinds of solutions are possible because the underlying algorithms and data structures used to power a search engine can be effectively seen as a sparse matrix multiplication, and sparse matrix multiplication is often what is needed to power many of these next-generation, data-driven applications. There are many places to use non-textual information in search applications: to perform NoSQL types of data retrieval, or to do large-scale machine learning, recommendations, and more.

The Intersection of Search and Big Data

Things begin to get really interesting when a search engine is used to analyze the behavior of users who are themselves using a search engine. Search experts compare keyword search to dehydrated food: the basic nutrition is there, but it is not easily accessible until water is added. In this case, the water is behavioral information. Armed with other data sources, such as clicks, mouse tracks, ratings, and reviews, keyword search can be augmented and can lead to information discovery and ultimately to better decision-making. This is where tools like MapR's distribution for Hadoop enter into the search equation.
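As a rough illustration of the sparse matrix view of search described under Search Abuse, the sketch below stores a term-by-document matrix as nested dictionaries and scores a query by multiplying that matrix with a sparse query vector. It is a toy under stated assumptions, not how Lucene/Solr is implemented: the function names `build_index` and `score`, the raw-count weighting, and the sample data are all hypothetical. The second call reuses the same machinery on behavioral records (user histories treated as "documents"), which is the re-purposing the text calls search abuse.

```python
from collections import defaultdict


def build_index(docs):
    """Build a sparse term-by-document matrix: term -> {doc_id: count}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def score(index, query_weights):
    """Sparse matrix-vector product: scores[d] = sum over t of M[t][d] * q[t]."""
    scores = defaultdict(float)
    for term, weight in query_weights.items():
        # Only the non-zero entries of the row for this term are visited.
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += weight * count
    return dict(scores)


# Textual use: hypothetical documents and a two-term query.
docs = {
    "d1": "brake failure report",
    "d2": "brake pad recall",
    "d3": "engine recall notice",
}
text_scores = score(build_index(docs), {"brake": 1.0, "recall": 1.0})

# Non-textual use: hypothetical user histories indexed as if they were text.
histories = {"u1": "item_a item_b", "u2": "item_a item_c", "u3": "item_b"}
similar_users = score(build_index(histories), {"item_a": 1.0})
```

Here the query vector plays the role of one column of a second matrix; scoring many queries at once would be a full sparse matrix multiplication.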
By including all of this meta information about user behavior, search engines can find interesting patterns and correlations that feed directly back into search results. The end result is that the search system can reflect the behavior of subject matter experts back at other users who lack some of that training or experience. The system appears to users to act intelligently because it is reflecting intelligent action back at them.

Big data plays a role in large-scale analysis as well, by producing clusters, identifying trends and topics, and finding statistically interesting phrases, similar documents, and many other things that require an aggregate view of the data. These large-scale discovery components can encourage system users to experiment with the data and can lead to a virtuous cycle: more people do search and discovery, and their behavior contributes to improving search results and the insights that can be derived. Similar to the Agile method of software engineering, where organizations are always in development cycles, search can always be refining results based on user behavior. And, with potentially millions of users hitting a system, some subset will behave in clever ways that can be reflected back to general users and make them more productive. This results in the system appearing to be intelligent, although it is simply reflecting back the intelligence of others.

Customer Examples and Use Cases

Reflected intelligence has utility in a wide variety of situations, as shown by examples from industries such as telecom, advertising, banking and insurance, education, government, and entertainment. Most of these use cases are not typical search applications; there are no users entering search terms into text boxes. Instead, these cases largely use search components as pivotal elements in adding value to content that already exists.
Social Media in Telecom

Social media has evolved to become a key component of marketing for many types of companies and organizations. The first use of search as it relates to social media has been to find mentions of the company across social media sources such as Twitter or Instagram. The true power of search is revealed in cases where a company can make operational decisions based on insights derived from search. An example is a major telecom provider that mines social media and correlates it with cell tower data to predict additional capacity demands for sporting events, music festivals, emergencies, etc.

Social Media Analysis for Advertising

Typical television advertising is done using a scattershot approach where targeting is based on demographic data that paints a broad picture of an audience, e.g., affluent women in a given age range. As a result, ad placement pricing is based on reaching a portion of this very broad segment. It is estimated that up to a 5x multiple could be derived if the ads were backed up by good analysis of who was actually watching, when they were watching, and what they thought about the advertising and brand. By combining insights from social media, advertisers can get as much as 80% of the total value of the ad from this analysis, as compared to the ad itself.

Insurance Claims Processing and Analysis

Insurance companies always want a better understanding of the claims they are processing, whether to detect fraud or to identify new trends or patterns that emerge from the pool of claims they see. Typical auto insurance claims include both tabular, attributed data, such as make, model, year, and price, and semi-structured data such as police reports, eyewitness reports, and victim reports. In the traditional data warehouse approach, analysts could ask questions about the attributed data, but had no means to combine, rank, or facet on the complete picture.

In this example, a large insurance company took both the structured and semi-structured data into its search application and then enriched it with behavioral data. Specifically, the company looked at what the analysts were working on and performed text analysis at a low level to identify trends and patterns. It could identify trends such as seeing that, in a particular make and model of vehicle, people reported that their brakes failed just before a crash. This data could be fed back to the NTSB and to manufacturers, as well as to the company's own claims adjusters.
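The kind of trend described in the insurance example, a phrase that shows up unusually often in reports for one make and model, can be sketched by faceting a text match on the structured fields. This is only a minimal illustration under assumptions: the record layout, the function name `phrase_rate_by_vehicle`, the 0.5 threshold, and the sample claims are hypothetical, and the insurer's actual analysis is not described in this paper at that level of detail.

```python
from collections import Counter


def phrase_rate_by_vehicle(claims, phrase):
    """Return, per (make, model), the fraction of claims whose report mentions phrase."""
    totals = Counter()
    hits = Counter()
    for claim in claims:
        key = (claim["make"], claim["model"])
        totals[key] += 1
        # Semi-structured side: a plain substring match on the free-text report.
        if phrase in claim["report"].lower():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical claims combining structured fields with free text.
claims = [
    {"make": "MakeA", "model": "X1", "report": "Driver said the brakes failed before impact."},
    {"make": "MakeA", "model": "X1", "report": "Brakes failed on the off-ramp."},
    {"make": "MakeA", "model": "X1", "report": "Rear-ended at a stop light."},
    {"make": "MakeB", "model": "Y2", "report": "Slid on ice into a guardrail."},
]

rates = phrase_rate_by_vehicle(claims, "brakes failed")
# Flag vehicles where at least half of the reports mention the phrase (assumed threshold).
flagged = sorted(key for key, rate in rates.items() if rate >= 0.5)
```

In a search engine, the same question would typically be a phrase query faceted on the make and model fields rather than a scan over every record.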
Virginia Tech: Help the World in Crisis

Virginia Tech's Crisis Tragedy Recovery Network (CTRN) serves as a resource to victims and their relatives as well as to first responders and policy makers. Anytime there is a large national or global crisis, natural or man-made, the CTRN harvests content from the web, social media, news outlets, etc., and makes it immediately searchable as well as archived for future access. Over time, the CTRN employs large-scale natural language processing to identify trends, topics, themes, and relationships both inside an event and across multiple events, to help policy makers and first responders develop systems and processes that improve response.

Bright Planet: Catch the Bad Guys

Bright Planet is in the business of harvesting intelligence from the web beyond the reach of traditional search engines for use by governments, businesses, and organizations. Bright Planet's client in this case is a large pharmaceutical manufacturer that was looking for evidence of the sale of counterfeit drugs. While search can provide some answers, more analysis is needed, since counterfeiters often carefully disguise their wares. Bright Planet looks for certain types of language and other indicators that it feeds into its search algorithms, along with enrichment data drawn from how analysts are performing their analysis and what questions they are asking of the data. This surfaces new patterns and continuously refines and improves the analysis.

Veoh: Cross Recommendations

Veoh is a video content network that allows subscribers to watch, follow, share, and comment on aggregated video content from around the web. Its recommendation engine leverages user behavior (videos searched, watched, and recommended, items clicked, words typed in, mouse tracks, etc.) to influence recommendations and search results.
Veoh uses behavior across the entire subscriber population to influence an individual's search results, and coalesces all of these various signals into a single query system that delivers what appear to the user to be magical results.
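One common way to turn population-wide behavior into individual recommendations is item co-occurrence: count how often two items appear in the same user's history, then recommend the items that co-occur most with what a user has already seen. The sketch below is a simplified illustration of that idea, not Veoh's actual engine; the function names, the raw-count scoring, and the sample histories are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(histories):
    """Count, for each pair of items, how many users interacted with both."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, seen, top_n=3):
    """Rank unseen items by total co-occurrence with the items already seen."""
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts.get(item, {}).items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical viewing histories for three subscribers.
histories = {
    "u1": ["v1", "v2", "v3"],
    "u2": ["v1", "v2"],
    "u3": ["v2", "v3", "v4"],
}
counts = cooccurrence(histories)
recs = recommend(counts, seen={"v1"})
```

Raw counts favor popular items; production recommenders usually replace them with a statistical test of whether the co-occurrence is surprising, which this sketch omits.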

Getting Started with Reflected Intelligence

There are several critical components needed to get started building applications that leverage reflected intelligence:

- Fast, efficient, scalable search. Lucene/Solr powers some of the world's largest websites and search applications with sub-second response against billions of records, so it makes a good choice for this fundamental component.
- Bulk and near-real-time indexing
- A distributed computing platform for performance and scale
- Storage capacity to store and work with raw data, and to transform it to address the kinds of questions that will be asked
- NLP and machine learning tools that scale, to address semi-structured and unstructured data

The natural language processing and machine learning tools are what will power the discovery and analysis. They provide the ability to crunch through all of the feedback and user behavior data to understand what people are clicking. To make this work at scale, the feedback must flow seamlessly inside the system, with appropriate workflows in place to eliminate the need for administrators to chase down log files from disparate systems.

Reference Architecture for Reflected Intelligence

This reference architecture handles a wide variety of data types, both textual and behavioral. It can also handle an array of enrichment systems that elaborate and annotate documents for useful actions across a broad spectrum of business purposes. The enrichment systems can be batch-oriented, large-scale offline, or near-real-time. Discovery and enrichment can be done as a rough cut at the time of content acquisition, and documents can be re-clustered at a later date when more is known.

The heart of this architecture is the document store, represented by the grey cylinder in the middle of the diagram. Inside this store are multiple shards that make up the document store and retrieval index.
It contains text and semi-structured information, as well as structured information processed by ETL systems.
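To make the idea of a document store built from multiple shards concrete, here is a minimal in-memory sketch: each document is routed to one shard by hashing its ID, and a query is sent to every shard and the partial results are merged. The class name `ShardedStore`, the CRC32 routing, and the term-count ranking are assumptions for illustration only; they are not taken from the reference architecture or from Solr.

```python
import zlib


class ShardedStore:
    """Toy document store that spreads documents across a fixed number of shards."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, doc_id):
        # Deterministic routing: the same ID always lands on the same shard.
        return zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)

    def add(self, doc_id, text):
        self.shards[self._shard_for(doc_id)][doc_id] = text

    def search(self, term, top_n=10):
        """Scatter the query to every shard, then gather and rank the hits."""
        hits = []
        for shard in self.shards:
            for doc_id, text in shard.items():
                count = text.lower().split().count(term.lower())
                if count:
                    hits.append((count, doc_id))
        hits.sort(key=lambda hit: (-hit[0], hit[1]))
        return [doc_id for _, doc_id in hits[:top_n]]


store = ShardedStore(num_shards=3)
store.add("c1", "claim claim report")
store.add("c2", "claim form")
store.add("c3", "police report")
results = store.search("claim")
```

In a real deployment each shard would hold its own inverted index and run on a separate node, so the per-shard loop here stands in for parallel requests.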

Discovery and enrichment processes run against recently added documents and look for patterns and enrichment opportunities that can improve search results. Enrichment can include classifiers and recommenders that create special tags and indicators on documents to improve correlations. Analytic services are accessed via the general APIs that can query the system, and may be explicit, or implicit where they are derived from behavior or formed from other data sources. Query processes don't necessarily have to return results to a user. Instead, they may be used to structure a website or to notify an analyst when particular conditions are met.

MapR Extends Hadoop for Reflected Intelligence

MapR provides a technology-leading, complete distribution for Hadoop with enhancements that make Hadoop easy, dependable, and fast. The MapR distribution includes Apache projects from the Hadoop ecosystem such as Hive, Pig, HBase, and Mahout, running over a platform that provides enterprise-grade features such as direct access NFS, snapshots, mirroring, and instant node recovery.

Easy: MapR allows users to access the Hadoop cluster through industry-standard APIs. The standards that are built in and supported on MapR include full POSIX compliance, Network File System (NFS), ODBC, Linux PAM, and REST. Beyond the standards, MapR also provides multi-tenancy, data placement control, and hardware-level monitoring of the cluster.

Dependable: MapR provides some of the best features for running mission-critical applications. These include self-healing of the critical services that maintain the nodes and the jobs, snapshots that allow point-in-time recovery of data, mirroring that allows inter-cluster replication over a WAN, and rolling upgrades that prevent service disruptions.

Fast: MapR is twice as fast as any other distribution.
It leverages an optimized shuffle algorithm, direct access to disk, built-in compression, and code written in advanced C++ to provide superior performance over Hadoop.

MapR is particularly well suited for reflected intelligence applications. It provides an integrated data platform that can store file-like objects accessible through HDFS or NFS, and table objects that expose the HBase API. MapR supports real-time ingestion and processing for objects that store user behavior and are changing in real time.

MapR's snapshot and mirroring capabilities are critical for reflected intelligence applications, as they support the evolution of large data objects over time. With these tools, new data can be layered on old data in what-if scenarios to assess the impact on an application. As search experts will attest, tuning a result set in one area can have unanticipated consequences in other areas, and this sort of impact analysis is crucial to good search hygiene. These snapshots support the "always testing" model of enrichment, where the search application continues to improve simply through more people using it over time. In addition, snapshots allow search professionals to play back what might have happened over a particular period of time and recreate situations for further troubleshooting. These capabilities go beyond ordinary Hadoop and make reflected intelligence applications possible.

LucidWorks Extends Lucene/Solr for Reflected Intelligence

LucidWorks is the leading provider of packaging, support, training, and knowledge about Apache Lucene/Solr. LucidWorks employs about a third of the committers to the open source project and was founded by a group of the committers to promote the adoption of Lucene/Solr. The company continues to contribute a considerable body of work back to the open source project each year. In the past year, the LucidWorks team worked to ensure Lucene/Solr can scale to handle Hadoop workloads.

LucidWorks offers LucidWorks Search, which adds a user interface for management and operations to Lucene/Solr, along with a connector framework for integrating with tools like MapR and common enterprise repositories such as SharePoint and file systems. It also adds integration with organizations' security access control lists.

LucidWorks Big Data offers big data as a service. It is constructed very similarly to the reference architecture described earlier in this document. It incorporates LucidWorks Search and adds Hadoop and machine learning, along with pre-built workflows that eliminate the pain of moving the data around to be processed.

The LucidWorks Big Data Marketecture

The Big Data Operating System at the heart of this diagram is the reference architecture discussed earlier, where LucidWorks Search is combined with Hadoop, HBase, etc., and ensures that the data is in the right place at the right time. On top of this substrate, Search, Discovery, and Analytics applications are built that leverage machine learning tools, natural language processing, and the tools needed to scale with pre-defined workflows. This is all accessible through a set of REST APIs, so a non-expert can interact with the services using common web technologies like REST and JSON. The right side of the diagram is the system management layer, with glue such as ZooKeeper and provisioning tools. To get content into the system, LucidWorks provides a variety of connectors to a range of enterprise data sources, databases, and S3 buckets, and the system also supports pushed data.

The LucidWorks/MapR Advantage

The goal of the partnership between LucidWorks and MapR is to enable a rapid path to the next generation of search by using reflected intelligence, along with other methods, to unlock correlations and insights from large data sets and ultimately drive better decisions for individuals and organizations. By using LucidWorks and MapR, organizations can quickly build reflected intelligence search applications where:

- Data can be ingested into MapR by a variety of methods: through Hadoop ecosystem components, or by storing data directly and transparently via NFS (for legacy components)
- Search indices can be stored in MapR and fed into MapReduce tools like Pig and Mahout, or can be deployed using mirrors or NFS
- MapR snapshots make backups very simple
- Snapshots also allow scenarios to be replayed and support experiment management: correlate scoring factors, config files, log analysis, etc., to see what users saw at the time
- LucidWorks connects transparently with MapR
- No unnatural acts are required: logs are in NFS or file systems that MapR presents, and MapReduce jobs can run over them without concern for where they reside

Learn More and Get Started Today

To learn more about using crowd-sourced reflected intelligence for search and big data, please visit the LucidWorks and MapR websites. A webinar with Grant Ingersoll, Chief Scientist for LucidWorks, and Ted Dunning, Chief Application Architect for MapR, can be found on either site.

MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease of use, and world-record speed to Hadoop, NoSQL, database, and streaming applications in one unified Big Data platform.
MapR is used across financial services, retail, media, healthcare, manufacturing, telecommunications, and government organizations, as well as by leading Fortune 100 and Web 2.0 companies. Amazon, Cisco, EMC, and Google are part of MapR's broad partner ecosystem. Investors include Lightspeed Venture Partners, Mayfield Fund, NEA, and Redpoint Ventures.

MapR Technologies. All rights reserved. Apache Hadoop and Hadoop are trademarks of the Apache Software Foundation and not affiliated with MapR Technologies.


How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

More information

Cloudera Enterprise Data Hub in Telecom:

Cloudera Enterprise Data Hub in Telecom: Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Apache Hadoop: Past, Present, and Future

Apache Hadoop: Past, Present, and Future The 4 th China Cloud Computing Conference May 25 th, 2012. Apache Hadoop: Past, Present, and Future Dr. Amr Awadallah Founder, Chief Technical Officer aaa@cloudera.com, twitter: @awadallah Hadoop Past

More information

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015

Mastering Big Data. Steve Hoskin, VP and Chief Architect INFORMATICA MDM. October 2015 Mastering Big Data Steve Hoskin, VP and Chief Architect INFORMATICA MDM October 2015 Agenda About Big Data MDM and Big Data The Importance of Relationships Big Data Use Cases About Big Data Big Data is

More information

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data SOLUTION BRIEF Understanding Your Customer Journey by Extending Adobe Analytics with Big Data Business Challenge Today s digital marketing teams are overwhelmed by the volume and variety of customer interaction

More information

Saving Millions through Data Warehouse Offloading to Hadoop. Jack Norris, CMO MapR Technologies. MapR Technologies. All rights reserved.

Saving Millions through Data Warehouse Offloading to Hadoop. Jack Norris, CMO MapR Technologies. MapR Technologies. All rights reserved. Saving Millions through Data Warehouse Offloading to Hadoop Jack Norris, CMO MapR Technologies MapR Technologies. All rights reserved. MapR Technologies Overview Open, enterprise-grade distribution for

More information

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015

Leveraging the Power of SOLR with SPARK. Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015 Leveraging the Power of SOLR with SPARK Johannes Weigend QAware GmbH Germany pache Big Data Europe September 2015 Welcome Johannes Weigend - CTO QAware GmbH - Software architect / developer - 25 years

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

The Enterprise Data Hub and The Modern Information Architecture

The Enterprise Data Hub and The Modern Information Architecture The Enterprise Data Hub and The Modern Information Architecture Dr. Amr Awadallah CTO & Co-Founder, Cloudera Twitter: @awadallah 1 2013 Cloudera, Inc. All rights reserved. Cloudera Overview The Leader

More information

Big Data: Beyond the Hype

Big Data: Beyond the Hype Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER Big Data: Beyond the Hype Why Big Data Matters to You By DataStax Corporation October 2011 Table of Contents Introduction...4 Big Data

More information

Big Data Zurich, November 23. September 2011

Big Data Zurich, November 23. September 2011 Institute of Technology Management Big Data Projektskizze «Competence Center Automotive Intelligence» Zurich, November 11th 23. September 2011 Felix Wortmann Assistant Professor Technology Management,

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

CSE-E5430 Scalable Cloud Computing Lecture 2

CSE-E5430 Scalable Cloud Computing Lecture 2 CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing

More information

Self-service BI for big data applications using Apache Drill

Self-service BI for big data applications using Apache Drill Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Management - MCS MapR Data Platform for Hadoop and NoSQL APACHE HADOOP AND OSS ECOSYSTEM Batch

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

The Big Data Paradigm Shift. Insight Through Automation

The Big Data Paradigm Shift. Insight Through Automation The Big Data Paradigm Shift Insight Through Automation Agenda The Problem Emcien s Solution: Algorithms solve data related business problems How Does the Technology Work? Case Studies 2013 Emcien, Inc.

More information

HadoopTM Analytics DDN

HadoopTM Analytics DDN DDN Solution Brief Accelerate> HadoopTM Analytics with the SFA Big Data Platform Organizations that need to extract value from all data can leverage the award winning SFA platform to really accelerate

More information

FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase

FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase FINANCIAL SERVICES: FRAUD MANAGEMENT A solution showcase TECHNOLOGY OVERVIEW FRAUD MANAGE- MENT REFERENCE ARCHITECTURE This technology overview describes a complete infrastructure and application re-architecture

More information

Time-Series Databases and Machine Learning

Time-Series Databases and Machine Learning Time-Series Databases and Machine Learning Jimmy Bates November 2017 1 Top-Ranked Hadoop 1 3 5 7 Read Write File System World Record Performance High Availability Enterprise-grade Security Distribution

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

Choosing a Provider from the Hadoop Ecosystem

Choosing a Provider from the Hadoop Ecosystem CITO Research Advancing the craft of technology leadership Choosing a Provider from the Hadoop Ecosystem Sponsored by MapR Technologies Contents Introduction: The Hadoop Opportunity 1 What Is Hadoop? 2

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Hadoop Big Data for Processing Data and Performing Workload

Hadoop Big Data for Processing Data and Performing Workload Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer

More information

Self-service BI for big data applications using Apache Drill

Self-service BI for big data applications using Apache Drill Self-service BI for big data applications using Apache Drill 2015 MapR Technologies 2015 MapR Technologies 1 Data Is Doubling Every Two Years Unstructured data will account for more than 80% of the data

More information

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012 Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster Nov 7, 2012 Who I Am Robert Lancaster Solutions Architect, Hotel Supply Team rlancaster@orbitz.com @rob1lancaster Organizer of Chicago

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS)

Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS) Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS) White Paper BY DATASTAX CORPORATION August 2013 1 Table of Contents Abstract 3 Introduction 3 Overview of HDFS 4

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

Information Builders Mission & Value Proposition

Information Builders Mission & Value Proposition Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns

More information

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.

More information

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763

International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763 International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing

More information

Understanding the Value of In-Memory in the IT Landscape

Understanding the Value of In-Memory in the IT Landscape February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

CDH AND BUSINESS CONTINUITY:

CDH AND BUSINESS CONTINUITY: WHITE PAPER CDH AND BUSINESS CONTINUITY: An overview of the availability, data protection and disaster recovery features in Hadoop Abstract Using the sophisticated built-in capabilities of CDH for tunable

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

This Symposium brought to you by www.ttcus.com

This Symposium brought to you by www.ttcus.com This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload

More information

Traditional BI vs. Business Data Lake A comparison

Traditional BI vs. Business Data Lake A comparison Traditional BI vs. Business Data Lake A comparison The need for new thinking around data storage and analysis Traditional Business Intelligence (BI) systems provide various levels and kinds of analyses

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Big Data: Beyond the Hype

Big Data: Beyond the Hype Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER By DataStax Corporation March 2012 Contents Introduction... 3 Big Data and You... 5 Big Data Is More Prevalent Than You Think... 5 Big

More information

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014

Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014 Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4

More information

Making Sense of Big Data in Insurance

Making Sense of Big Data in Insurance Making Sense of Big Data in Insurance Amir Halfon, CTO, Financial Services, MarkLogic Corporation BIG DATA?.. SLIDE: 2 The Evolution of Data Management For your application data! Application- and hardware-specific

More information

Big Data - Infrastructure Considerations

Big Data - Infrastructure Considerations April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright

More information

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR 1 Agenda Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback 2 A World of Connected Devices Need a new data management architecture for Internet of Things 21% the % of

More information

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK 5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK CUSTOMER JOURNEY Technology is radically transforming the customer journey. Today s customers are more empowered and connected

More information

VIEWPOINT. High Performance Analytics. Industry Context and Trends

VIEWPOINT. High Performance Analytics. Industry Context and Trends VIEWPOINT High Performance Analytics Industry Context and Trends In the digital age of social media and connected devices, enterprises have a plethora of data that they can mine, to discover hidden correlations

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information