GLOBAL FINANCIAL SERVICES COMPLIANCE & RISK MANAGEMENT with Bombuz
A Big Data & Semantic Web Solution
Insigma Hengtian Software Ltd.
Bayshore Management Consultants, LLC

Table of Contents

OVERVIEW
KEY CHALLENGES IN THE RISK AND COMPLIANCE DOMAIN
  The Regulatory Environment
  Today's Approach to Solving the Challenge
  The Big Data Solution
RISK & COMPLIANCE BUSINESS CASE
  Exposure Risk Assessment
  Client Organizational Description
BOMBUZ: A UNIQUE BIG DATA SOLUTION
  Data Synchronization
  Semantic Mapping
  Performance Testing
  Dashboard
CONCLUSION AND FUTURE WORK

OVERVIEW

The globalization of the financial industry has resulted in firms that operate 24/7 in multi-country, multi-currency and multicultural environments. Complex organizational models composed of centralized, regional and local operations must somehow function seamlessly day in and day out. Overlapping responsibilities, matrix reporting, independent outposts and outsourced operations must be monitored not only for business performance but also to ensure compliance with a vast system of rules and regulations.

In the aftermath of the financial crisis, risk management and compliance responsibilities have exploded. There are literally hundreds of new rules and regulations, from Basel III and Solvency II, to individual country rules, to those coming from the alphabet soup of U.S. regulatory agencies. In fact, the House Financial Services Committee estimates that it will take private industry 24 million man-hours annually to comply with the first 185 new rules emanating from Dodd-Frank. At this rate, it will cost the industry somewhere around 50 million man-hours to comply with all 400 proposed rules. This translates to approximately 25,000 additional personnel and $2.5B in annual expenses.

This is coming at a time when firms in our industry are still engaged in cost-cutting and productivity-enhancing initiatives. With IT budgets slashed, there are more projects competing for fewer resources. And with most of the data needed to monitor risk and demonstrate compliance with these new rules resident in silos, whether in functional business units or in specific applications, it still takes a significant amount of human intervention to answer a simple request.

To manage risk and demonstrate compliance in the future, firms will need new combinations of systems, technologies and delivery mechanisms, from mobile and web-based applications to legacy systems, spreadsheets and stand-alone applications. Furthermore, they will need immediate and targeted access to vast amounts of data in a variety of forms, from hard-copy documents to digitally stored images, to calculated values, to tweets and blogs, to audio and video clips, and to any number of other data feeds, in order to satisfy the ever-expanding measures of risk.

This paper discusses the business challenges of finding a cost-effective way to rapidly develop and deploy management tools that can adapt to the changes in the regulatory environment over the next several years, demonstrate compliance, and mitigate risks. We will look at the compliance function across the supply chain, from the front office, to the middle office, to back-office operations. We will identify the challenges of enhancing current processes at each step, from the changes required to improve the quality of the data, to the analytics needed to test a rule, to the delivery of the final results. Although this is a complex domain, we feel that the challenges to core processes are similar enough that a new approach can be applied across the supply chain to demonstrate compliance.

KEY CHALLENGES IN THE RISK AND COMPLIANCE DOMAIN

The Regulatory Environment

There are a number of challenges in developing new tools to address the changing requirements in the areas of risk and compliance, the first being regulatory clarity. This is an issue no matter what regulatory jurisdiction you're working in. Rules are proposed, then issued in draft, and then opened for a comment period. This entire process can eat up a lot of time, and businesses can't just sit around and wait. They need to plan and budget, they need to prioritize initiatives, and they need to know where and how to deploy their resources.

Let's take as an example the recently implemented IRS regulation on Cost Basis Reporting. It took several years for the IRS to issue final regulations; in fact, it wasn't until November of 2011 that they were finalized, just one month prior to implementation for equity securities in January of [...]. During this time, industry user groups were formed to sort through the draft regulations, comment on certain provisions, wait for responses from the IRS, and attempt to develop best practices so that firms could begin developing the new functionality required to address the mandate.

It's clear that large portions of IT budgets over the next few years will be devoted to meeting new risk, regulatory and compliance requirements. IDC Financial Insights estimates that growth in IT spending on risk management will top 15% of the total IT spending in financial services in [...]. And these estimates are only for the known rules. That is why we believe a new approach is needed: one that allows for a quick and efficient response to change and doesn't burn through the IT budget for the year.

The Systems, Data and Delivery

In today's world, managing and monitoring risk requires a complex systems and data environment that may or may not include Enterprise Risk Management (ERM) applications, Compliance Systems, Security Masters, Data Warehouses and functional applications for Managing Orders, Trade Routing and Execution, Accounting, Reporting, etc. Most risk measures are driven off compliance rules that are either integrated into a specific application or maintained in a compliance rule engine. But regardless of what tools are used, they all require data in order to function.

Much of the data needed for testing rules or running a risk monitor is spread across multiple sources, from the actual application, to the Security Master File, to massive data warehouses. Security information can be sourced from vendors, offering documents (in the case of certain asset types), direct market feeds, etc. And depending on whether the system is in-house or at a vendor, the same data may come with different identifiers and in different formats.

With over 400 rules emanating from Dodd-Frank alone, just think of the additional data that will be required. The Gartner Group has estimated that data will grow 800% over the next five years and that 80% of that data will be unstructured. They also estimate that 85% of currently deployed data warehouses will, in some respect, fail to address new issues around extreme data management by [...]. How will firms that are squeezed for resources be able to keep pace with these new demands?

Today's Approach to Solving the Challenge

Any solution trying to address today's risk and compliance challenges has to deal with these issues:

- Distributed Data
- Unstructured Data
- Large Data Volume
- Lower Operating Expenses
- Ever-Changing Rules

Typical solutions include a central data mart or warehouse model. Data are stored with a pre-defined schema in a structured format. With the help of ETL tools, data are extracted, transformed and loaded into the data repository to be queried by some type of business intelligence (BI) software. In most scenarios, this central repository model is a vertically scaling platform, which means that more powerful machines have to be used as the volume of data grows. This, in turn, requires upgrades to the hardware and software in the data center, which limits flexibility and increases the cost of meeting new demands.

Another major challenge is the ability to mine unstructured data in a distributed environment while at the same time keeping pace with all the new rule requirements. And whenever a quick solution is demanded for a complex problem, it usually comes with a high cost attached.

The Big Data Solution

Horizontal scaling solutions, in which commodity hardware runs distributed and parallel processes, have become very popular. Big Data, which refers to the technologies that handle large volumes of data, is a general term covering open source tools like Hadoop, MapReduce and NoSQL databases. These evolving technologies provide a lower-cost and more flexible alternative to a central data repository model.

The solution described here is built on the widely adopted distributed computing framework Hadoop, which leverages the MapReduce programming model to achieve highly scalable, distributed processing capacity. Instead of collecting specific data into a central data warehouse through an ETL process, Big Data systems extract the raw data from operational systems into a NoSQL database, HBase, avoiding a repeated ETL process whenever regulatory rules change. HBase is built on HDFS (Hadoop Distributed File System), which can store very large files with streaming data access and runs on clusters of commodity hardware.

Furthermore, a Big Data system can include a semantic data mapping layer that builds a virtual RDF graph linking the data stored in HBase. This RDF graph helps find the records in HBase that correspond to the subject of a query and feeds them as input to the MapReduce task. Changes in data semantics can then be reflected in the system automatically by updating the mapping files that define how the virtual RDF graph is generated.

A Big Data system addresses many of these challenges and is also an economical solution for large-scale data analysis. For one thing, it can run either on clusters of commodity machines or on an elastic cloud like Amazon EC2. Another benefit is that the technologies are built on open source tools and applications, providing the opportunity for customization with relatively low development and maintenance overhead.
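The MapReduce programming model mentioned above can be illustrated with a minimal, single-process sketch. It is not Hadoop code: the record layout, the `RAW_HOLDINGS` sample and the function names are assumptions made for illustration, and a real cluster would run the map and reduce phases in parallel across nodes.

```python
from collections import defaultdict

# Hypothetical raw holding records, as they might be read from a raw store.
RAW_HOLDINGS = [
    {"issuer": "XYZ", "account": "FUND-1", "shares": 1000},
    {"issuer": "XYZ", "account": "SMA-7", "shares": 250},
    {"issuer": "ABC", "account": "FUND-1", "shares": 400},
]


def map_phase(record):
    """Emit one (issuer, shares) key-value pair per input record."""
    yield record["issuer"], record["shares"]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence in a single process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


totals = run_job(RAW_HOLDINGS)
print(totals)
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread across machines, which is what makes the model scale horizontally.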

RISK & COMPLIANCE BUSINESS CASE

To illustrate the challenges that an investment manager faces in responding to a relatively straightforward compliance or risk management query, we will use a case study. We developed the business case based on real-life experience within a complex diversified manager responding to an event. It is illustrative of the many pain points encountered in responding to a typical request.

Exposure Risk Assessment

Client Organizational Description: American Alliance (AA) is a large diversified global financial services company providing both asset management and insurance products to individuals and institutions. AA's product line-up includes mutual funds, separate accounts and collective trusts, along with annuity products, life insurance and alternative investments. AA has grown significantly over the past ten years, mostly through acquisitions. It has done significant work to integrate its operations but still maintains several separate legal entities. Up until the recent financial crisis, AA had been working on a number of data warehouse and consolidated reporting initiatives to provide management with better dashboard tools to monitor and measure key indicators of risk and compliance across its businesses.

On the investment side, AA has four Investment Advisors, over 30 Custodians and three Trading Desks.

On the operations side, AA has a hybrid operating model: Mutual Fund Accounting, Custody and Transfer Agent operations are outsourced, while Institutional Portfolios (SMAs), CTFs, Commingled Pools, Fixed Annuities and the Insurance General Fund are accounted for in-house, as is Annuity Administration. Hedge Funds are supported by a Prime Broker.

On the technology side, AA uses both internally developed and vendor-supplied applications. It has different trading platforms for Equity and Fixed Income, as well as different applications for Pre- and Post-Trade Compliance. It has a home-grown portfolio accounting system and a vendor-supplied annuity administration system. AA also has a Data Warehouse that takes nightly feeds of holdings at the individual security level from the internal portfolio accounting system and the Mutual Fund accounting system. At the current time, holdings data for the General Fund, Fixed Annuities, Sub-Advised Portfolios and Alternative Products are not available in the warehouse.

Challenges

Because AA has a hybrid operating model, with some functions performed in-house and some outsourced, there isn't a straightforward solution to the relatively simple request to run a report showing the exposure to XYZ Corporation across the complex. Multiple accounting platforms, back-end compliance applications, data downloads and data warehouses must be queried in order to retrieve the data required for the report. Each application stores data in a different format, and some level of translation is required to satisfy the report request.

The primary challenge for an enterprise solution in this scenario is handling a large volume of historical data as well as distributed operational data stores. A common practice is to build a centralized data warehouse, with legacy system code handling the ETL (extract, transform, load) jobs in the back office. Compliance reports are then generated at the end of the day, week or month, after all the data have been transformed and stored in the data warehouse. Such solutions require a huge commitment in infrastructure and system development expenditure. More important is the lengthy project duration for planning and deploying the data warehouse, which can easily become obsolete prior to production simply because a regulatory requirement changes.

The Business Problem: There has been a significant event related to XYZ Corporation, a leading supplier of power generation equipment. Coming on the heels of poorer-than-expected earnings, the wire services are reporting that the US Military has nixed its plans to use XYZ as its primary supplier of generators. A bankruptcy filing could be coming. The stock price is declining rapidly as traders rush to dump shares.

The AA investment committee has called a meeting to determine the overall exposure to XYZ in order to develop an action plan, one that includes how to provide updates and answer inquiries from institutional clients. The key questions that need to be answered are:

- How much do we own?
- Where are the holdings: which Funds, SMAs, Trust/Commingled Pools, etc.?
- How many clients are impacted?
- Can we sell out of the position if needed, or do we have any issues (settlements, shares out on loan, etc.)?

What Needs to Be Done?

AA needs to determine its overall exposure to XYZ Corporation across the entire complex. Because of the complex operating and technical environment at AA, requests for information must go out to multiple constituents simultaneously. The information will have to be pieced together from source systems, data warehouses, emails, spreadsheets and manual reports. This takes time, and time is of the essence.

A secondary requirement will be to determine the holdings of XYZ at the individual client level. The SMAs, CTFs and Commingled Pools are all accounted for on the Portfolio Accounting system, which has aggregate reporting functionality.

In addition to the holdings data, we will need information on pending trades, settlement issues and shares on loan for XYZ Corporation. This information helps determine a trading strategy that will minimize potential issues should the committee decide to authorize selling out of all positions in XYZ.
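Once positions from the various sources have been brought into a common shape, the committee's key questions reduce to a few aggregations. The following is a minimal sketch under that assumption; the `Position` fields, the sample data and the `exposure_report` function are hypothetical and are not part of the AA systems described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    issuer: str
    product: str      # e.g. "Mutual Fund", "SMA", "CTF"
    client_id: str
    shares: int
    on_loan: int = 0  # shares currently out on loan


def exposure_report(positions, issuer):
    """Answer: how much is owned, where, by how many clients, and what is on loan."""
    held = [p for p in positions if p.issuer == issuer]
    by_product = {}
    for p in held:
        by_product[p.product] = by_product.get(p.product, 0) + p.shares
    total = sum(p.shares for p in held)
    on_loan = sum(p.on_loan for p in held)
    return {
        "total_shares": total,
        "by_product": by_product,
        "clients_impacted": len({p.client_id for p in held}),
        "shares_on_loan": on_loan,
        "freely_sellable": total - on_loan,
    }


# Hypothetical sample positions, already normalized to one format.
positions = [
    Position("XYZ", "Mutual Fund", "C1", 1000, on_loan=200),
    Position("XYZ", "SMA", "C2", 250),
    Position("XYZ", "SMA", "C1", 50),
    Position("ABC", "CTF", "C3", 400),
]

report = exposure_report(positions, "XYZ")
print(report)
```

The hard part in practice is not this aggregation but getting every source into the common shape in the first place, which is the problem the rest of the paper addresses.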

BOMBUZ: A UNIQUE BIG DATA SOLUTION

To solve the business scenario described in this paper, a proof of concept utilizing the Bombuz framework is being developed. The sample financial company AA acts as a custodian for vendors that provide various financial products, such as mutual funds, SMAs, CTFs, etc., to individuals as well as institutions. The transaction data are distributed across different systems, such as the data warehouse, the portfolio accounting system or email, depending on the asset class of the transaction. To explore the exposure across all the product lines, all the transaction data are first extracted into HBase in raw format. Queries can then be made at the issuer level and drilled down further to the customer level. In addition, the semantic connections among customers can be customized in the user interface, and the resulting dataset presented in the dashboard changes accordingly.

Data Synchronization

The Thrift service solution is designed to synchronize data from the ODS to HBase. We do not use Sqoop, an open source bulk-loading tool for Hadoop, because it requires the HBase server to pull data from the source systems, imposing a great burden on the server to manage and coordinate with various clients. The Thrift service solution, instead, lets the clients push data to the server whenever they are available. The source data can take various forms (structured data from MySQL and Oracle, semi-structured data as found in spreadsheet tables, or unstructured data as found in email) and can be synchronized into HBase easily.
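The push model described above can be sketched with an in-memory stand-in for the raw table. This is only an illustration of the direction of data flow: `RawStore`, its `push` method, the row-key scheme and the sample payloads are assumptions, and none of this is the actual Thrift or HBase API.

```python
import json


class RawStore:
    """In-memory stand-in for the raw table; not the HBase or Thrift API."""

    def __init__(self):
        self.rows = {}

    def push(self, source, record_id, payload):
        """Called by a client when it has data; stores the payload unchanged."""
        row_key = f"{source}:{record_id}"  # hypothetical row-key scheme
        self.rows[row_key] = payload
        return row_key


store = RawStore()

# Structured row, e.g. exported from a relational source and serialized as-is.
store.push("portfolio_accounting", "1001",
           json.dumps({"cusip": "000000XYZ", "shares": 1000}))

# Semi-structured spreadsheet row, kept as the original delimited text.
store.push("spreadsheet", "row-7", "XYZ Corp,250,SMA")

# Unstructured text, e.g. the body of an email.
store.push("email", "msg-42",
           "Please confirm 50 shares of XYZ Corporation are on loan.")

print(sorted(store.rows))
```

The server only exposes a write entry point and never needs to know how many source systems exist or when each one has new data, which is the coordination burden the pull model would place on it.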

Semantic Mapping

As mentioned previously, Bombuz synchronizes raw data from the source operational data stores to HBase without the ETL process used in traditional data warehousing. Thus, the same data may appear with different identifiers and in different formats. Semantic web technology is leveraged to establish the logical links among related data. Generally, there are two approaches that can serve this purpose.

The first method is to map all the data in HBase, together with the domain ontology that maintains the relationships among them, to a consolidated virtual RDF (Resource Description Framework) graph, as shown in the following chart. This unified data view can be queried through a standard RDF query language such as SPARQL. Currently, there is no out-of-the-box tool geared towards mapping a non-relational database to RDF. However, there are tools for mapping an RDB (Relational Database) to RDF, such as D2RQ, an open source tool capable of mapping a SQL-92 compatible database to RDF. Combined with Hive, a SQL-like query engine, D2RQ can successfully execute most RDF queries against the data in HBase.

The second method is to maintain only the data relationships as a domain ontology and leave the raw data in HBase intact. As illustrated in the chart below, a client query is broken down into a SPARQL query and a MapReduce program, which interact with the RDF graph and the HBase tables respectively.
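The second method can be sketched in plain Python, with a set of triples standing in for the RDF graph and a list of dictionaries standing in for the raw HBase rows. The identifiers, the `identifies` predicate and both function names are hypothetical; a real implementation would issue a SPARQL query against a triple store and run the scan as a MapReduce job.

```python
# Domain ontology as (subject, predicate, object) triples. Hypothetical data.
TRIPLES = {
    ("cusip:000000XYZ", "identifies", "issuer:XYZ"),
    ("ticker:XYZ", "identifies", "issuer:XYZ"),
    ("name:XYZ Corp", "identifies", "issuer:XYZ"),
    ("ticker:ABC", "identifies", "issuer:ABC"),
}

# Raw rows left untouched: each source uses its own identifier for a security.
RAW_ROWS = [
    {"id": "cusip:000000XYZ", "shares": 1000},
    {"id": "ticker:XYZ", "shares": 250},
    {"id": "name:XYZ Corp", "shares": 50},
    {"id": "ticker:ABC", "shares": 400},
]


def identifiers_for(issuer, triples):
    """Graph step: find every identifier the ontology links to the issuer."""
    return {s for (s, p, o) in triples if p == "identifies" and o == issuer}


def total_shares(rows, ids):
    """Data step: scan the raw rows and sum those whose id is in the set."""
    return sum(row["shares"] for row in rows if row["id"] in ids)


ids = identifiers_for("issuer:XYZ", TRIPLES)
xyz_total = total_shares(RAW_ROWS, ids)
print(xyz_total)

# Removing a link changes the answer without touching the raw rows.
fewer_links = TRIPLES - {("name:XYZ Corp", "identifies", "issuer:XYZ")}
xyz_total_fewer = total_shares(RAW_ROWS, identifiers_for("issuer:XYZ", fewer_links))
print(xyz_total_fewer)
```

Keeping the relationships separate from the data is what lets a semantic change take effect by editing links only, with no reload of the underlying records.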

Both methods were implemented and executed on a small Hadoop cluster to compare their performance. The results chart below shows that the unified RDF method (the first method) takes much more time to process the query than the second method. This is because the first method translates SPARQL to SQL with D2RQ and then translates SQL to MapReduce with Hive, and the process clearly suffers from these two translations. Another disadvantage of the first method is that Hive does not support all the features of SQL-92. Hive would also require major upgrades before all the features of SPARQL could be supported through D2RQ. Therefore, the second method was adopted as more suitable for Bombuz.

The architecture diagram below shows how the system is implemented under this philosophy. The query service always retrieves result sets that are pre-processed and stored in Hadoop. The semantic logic is maintained in a triple store through the rule service, which also triggers a re-calculation of the affected queries whenever a data link changes.

Performance Testing

To measure the potential performance of the solution, and especially how many nodes the cluster needs in order to achieve acceptable response latency, we conducted a series of in-house tests on several clusters with different numbers of slave nodes. All the cluster nodes were commodity machines, each equipped with one Intel G [...] GHz 2-core processor, 2 GB of RAM and a 1 TB SATA hard disk. The size of the effective data processed in the test case is 1 TB (the physical data stored is 3 TB, given the default setup of three replicas in HBase). The chart below shows the average processing time required to compute the total amount of holdings and pending transactions at the vendor level on two clusters with different numbers of nodes.

According to the results above, the predicted response time resembles the curve below (assuming the processing time drops linearly as the number of nodes increases). This gives us a ballpark estimate of a 10-minute response time if the number of nodes reaches 500.

Dashboard

Sample dashboards for Pending Transactions and Holding Reports were built to run on an iPad as HTML5.
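As a back-of-the-envelope companion to the response-time estimate in the Performance Testing section, the extrapolation can be written as a small model. It assumes that total work is constant, so processing time is inversely proportional to the number of nodes; this is one reading of the linear-scaling assumption stated there, not the authors' fitted curve. The only figure taken from the paper is the 10-minute estimate at 500 nodes, and the function names are hypothetical.

```python
import math


def predicted_minutes(nodes, ref_nodes=500, ref_minutes=10.0):
    """Predict response time, assuming time is inversely proportional to nodes."""
    return ref_minutes * ref_nodes / nodes


def nodes_needed(target_minutes, ref_nodes=500, ref_minutes=10.0):
    """Smallest node count that meets the target under the same assumption."""
    return math.ceil(ref_minutes * ref_nodes / target_minutes)


print(predicted_minutes(100))
print(predicted_minutes(250))
print(nodes_needed(5))
```

Real clusters rarely scale this cleanly, because coordination and data-shuffling overhead grow with node count, so figures from a model like this are best treated as optimistic bounds.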

CONCLUSION AND FUTURE WORK

This POC demonstrates that Bombuz, as a Big Data and Semantic Web framework, is a feasible solution for processing a huge volume of data scattered across various data stores. Built upon Hadoop / MapReduce and Semantic Web technology, the system scales horizontally as commodity machines are added. And compared with the potential downtime (e.g. several days) that a traditional data warehouse solution incurs for repeated ETL when semantic rules change, our solution can achieve a response time of several hours or even several minutes, as long as enough commodity machines are configured in the cluster.

Still, there is more work to be done. Firstly, performance testing of data extraction and synchronization from the ODS to HBase will be conducted and analyzed to determine a practical mechanism that sustains a high payload. Secondly, a scalable solution for large triple stores may need to be explored, since the volume of RDF triples could be too large to manage in current triple stores such as Jena TDB and SDB. A potential solution is to use HBase to store a large volume of triples and then develop methods to execute triple queries in parallel.


More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Ubuntu and Hadoop: the perfect match

Ubuntu and Hadoop: the perfect match WHITE PAPER Ubuntu and Hadoop: the perfect match February 2012 Copyright Canonical 2012 www.canonical.com Executive introduction In many fields of IT, there are always stand-out technologies. This is definitely

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Next-Generation Cloud Analytics with Amazon Redshift

Next-Generation Cloud Analytics with Amazon Redshift Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional

More information

Big Data Analytics - Accelerated. stream-horizon.com

Big Data Analytics - Accelerated. stream-horizon.com Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based

More information

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP Pythian White Paper TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP ABSTRACT As companies increasingly rely on big data to steer decisions, they also find themselves looking for ways to simplify

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Agile Business Intelligence Data Lake Architecture

Agile Business Intelligence Data Lake Architecture Agile Business Intelligence Data Lake Architecture TABLE OF CONTENTS Introduction... 2 Data Lake Architecture... 2 Step 1 Extract From Source Data... 5 Step 2 Register And Catalogue Data Sets... 5 Step

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Executive Summary The explosion of internet data, driven in large part by the growth of more and more powerful mobile devices, has created

More information

bigdata Managing Scale in Ontological Systems

bigdata Managing Scale in Ontological Systems Managing Scale in Ontological Systems 1 This presentation offers a brief look scale in ontological (semantic) systems, tradeoffs in expressivity and data scale, and both information and systems architectural

More information

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper Offload Enterprise Data Warehouse (EDW) to Big Data Lake Oracle Exadata, Teradata, Netezza and SQL Server Ample White Paper EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP

BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP BIG DATA AND THE ENTERPRISE DATA WAREHOUSE WORKSHOP Business Analytics for All Amsterdam - 2015 Value of Big Data is Being Recognized Executives beginning to see the path from data insights to revenue

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

Buyer s Guide to Big Data Integration

Buyer s Guide to Big Data Integration SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

Big Data & the Cloud: The Sum Is Greater Than the Parts

Big Data & the Cloud: The Sum Is Greater Than the Parts E-PAPER March 2014 Big Data & the Cloud: The Sum Is Greater Than the Parts Learn how to accelerate your move to the cloud and use big data to discover new hidden value for your business and your users.

More information

Oracle s Big Data solutions. Roger Wullschleger.

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Wrangling Actionable Insights from Organizational Data

Wrangling Actionable Insights from Organizational Data Wrangling Actionable Insights from Organizational Data Koverse Eases Big Data Analytics for Those with Strong Security Requirements The amount of data created and stored by organizations around the world

More information

CIO Guide How to Use Hadoop with Your SAP Software Landscape

CIO Guide How to Use Hadoop with Your SAP Software Landscape SAP Solutions CIO Guide How to Use with Your SAP Software Landscape February 2013 Table of Contents 3 Executive Summary 4 Introduction and Scope 6 Big Data: A Definition A Conventional Disk-Based RDBMs

More information

Data Mining in the Swamp

Data Mining in the Swamp WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management. Is Big Data The New Normal? The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Testing 3Vs (Volume, Variety and Velocity) of Big Data Testing 3Vs (Volume, Variety and Velocity) of Big Data 1 A lot happens in the Digital World in 60 seconds 2 What is Big Data Big Data refers to data sets whose size is beyond the ability of commonly used

More information

Planning the Installation and Installing SQL Server

Planning the Installation and Installing SQL Server Chapter 2 Planning the Installation and Installing SQL Server In This Chapter c SQL Server Editions c Planning Phase c Installing SQL Server 22 Microsoft SQL Server 2012: A Beginner s Guide This chapter

More information

Big Data and Natural Language: Extracting Insight From Text

Big Data and Natural Language: Extracting Insight From Text An Oracle White Paper October 2012 Big Data and Natural Language: Extracting Insight From Text Table of Contents Executive Overview... 3 Introduction... 3 Oracle Big Data Appliance... 4 Synthesys... 5

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment firms turn big data into actionable research

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE

BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

Firebird meets NoSQL (Apache HBase) Case Study

Firebird meets NoSQL (Apache HBase) Case Study Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

HadoopRDF : A Scalable RDF Data Analysis System

HadoopRDF : A Scalable RDF Data Analysis System HadoopRDF : A Scalable RDF Data Analysis System Yuan Tian 1, Jinhang DU 1, Haofen Wang 1, Yuan Ni 2, and Yong Yu 1 1 Shanghai Jiao Tong University, Shanghai, China {tian,dujh,whfcarter}@apex.sjtu.edu.cn

More information

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

More information

Coho Data s DataStream Clustered NAS System

Coho Data s DataStream Clustered NAS System Technology Insight Paper Coho Data s DataStream Clustered NAS System Bridging a Gap Between Webscale and Enterprise IT Storage By John Webster November, 2014 Enabling you to make the best technology decisions

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at spoozhikala@stratapps.com.

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

Delivering Real-World Total Cost of Ownership and Operational Benefits

Delivering Real-World Total Cost of Ownership and Operational Benefits Delivering Real-World Total Cost of Ownership and Operational Benefits Treasure Data - Delivering Real-World Total Cost of Ownership and Operational Benefits 1 Background Big Data is traditionally thought

More information

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:

More information

PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP

PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP Your business is swimming in data, and your business analysts want to use it to answer the questions of today and tomorrow. YOU LOOK TO

More information

A B S T R A C T. Index Terms: Hadoop, Clickstream, I. INTRODUCTION

A B S T R A C T. Index Terms: Hadoop, Clickstream, I. INTRODUCTION Big Data Analytics with Hadoop on Cloud for Masses Rupali Sathe,Srijita Bhattacharjee Department of Computer Engineering Pillai HOC College of Engineering and Technology, Rasayani A B S T R A C T Businesses

More information

How Cisco IT Built Big Data Platform to Transform Data Management

How Cisco IT Built Big Data Platform to Transform Data Management Cisco IT Case Study August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE Unlock the business value of large data sets, including

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...

More information

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,

More information

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

More information

Case Study. ElegantJ BI Business Intelligence. ElegantJ BI Business Intelligence Implementation for a Financial Services Group in India

Case Study. ElegantJ BI Business Intelligence. ElegantJ BI Business Intelligence Implementation for a Financial Services Group in India ISO 9001:2008 www.elegantjbi.com Get competitive with ElegantJ BI,today.. To learn more about leveraging ElegantJ BI Solutions for your business, please visit our website. Client The client is one of the

More information

Big Data for the Rest of Us Technical White Paper

Big Data for the Rest of Us Technical White Paper Big Data for the Rest of Us Technical White Paper Treasure Data - Big Data for the Rest of Us 1 Introduction The importance of data warehousing and analytics has increased as companies seek to gain competitive

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information