White Paper: Evaluating Big Data Analytical Capabilities For Government Use
CTOlabs.com
March 2012
A White Paper providing context and guidance you can use

Inside:
The Big Data Tool Landscape
Big Data Tool Evaluation Criteria
More Resources
Evaluating Big Data Analytical Capabilities for Government

This paper, produced by the analysts and researchers of CTOlabs.com, proposes ten criteria for evaluating analytical tools, focused on capabilities in the emerging Big Data space. The methods and models here can help you select the best capability for your mission needs.

Executive Summary

The need for sensemaking across large and growing data stores has given rise to new approaches to data infrastructure, including the use of capabilities like Apache Hadoop. Hadoop overcomes traditional limitations of storage and compute by delivering capabilities that run on commodity hardware and can leverage any data type. Hadoop scales to the largest of data sets in a very cost-effective way, making it the infrastructure of choice for organizations seeking to make sense of their growing data stores. Its ability to store data without a data model means information can be leveraged without prior knowledge of what questions will be asked of the data, making this a system with far more agility than legacy databases.

The core capability of Hadoop has now grown into a full framework of tools that includes a data warehouse infrastructure (Hive), parallel computation capabilities (Pig), scalable distributed databases able to store large tables (HBase), a scalable distributed file system (HDFS) and tools for rapidly importing and managing data and coordinating the infrastructure (like Sqoop, Flume, Oozie and ZooKeeper). The use of this framework of Hadoop tools has given rise to a new wave of innovation in sensemaking over large quantities of data and has laid the foundation for dramatic growth in new analytical tools that operate over these Big Data infrastructures.

Over the last several years, organizations that wanted to leverage this Hadoop framework wrote their own analytical capabilities to ride on top of the infrastructure. Now a new trend has emerged.
Organizations can turn to commercial vendors who offer analytical packages that ride on top of the Hadoop framework. This positive trend makes it easier to deliver advanced Big Data solutions to end users. The right tool can enable more agile use of your organization's data stores and can do so quickly. The right tool can also make Big Data analytics so easy that end users can form their own queries and generate their own responses. This new development is particularly exciting to knowledge-based government organizations seeking to empower their workforce with up-to-date insights.
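The executive summary's point that Hadoop can store data without a data model, so that the questions need not be known in advance, is often called schema-on-read. The following is a minimal sketch of that idea in plain Python, not Hadoop code. The record fields, the sample data and the helper names are hypothetical, and an in-memory list of JSON lines stands in for files that would sit in HDFS.

```python
import json
from collections import Counter

# Raw records are kept as-is, one JSON document per line, with no fixed schema.
# Hypothetical sample data; in a Hadoop deployment these lines would be files on HDFS.
RAW_LINES = [
    '{"type": "web_hit", "page": "/benefits", "user": "u1"}',
    '{"type": "web_hit", "page": "/forms", "user": "u2"}',
    '{"type": "help_call", "topic": "benefits", "minutes": 12}',
    '{"type": "web_hit", "page": "/benefits", "user": "u3"}',
]


def read_records(lines):
    """Parse each stored line only at the moment it is read."""
    for line in lines:
        yield json.loads(line)


def count_by(lines, field):
    """A question chosen after the data was stored: how often does each value of `field` occur?"""
    return Counter(rec[field] for rec in read_records(lines) if field in rec)


def total_of(lines, field):
    """A second, unrelated question over the same raw data: sum a numeric field where present."""
    return sum(rec[field] for rec in read_records(lines) if field in rec)


if __name__ == "__main__":
    print(count_by(RAW_LINES, "page"))
    print(total_of(RAW_LINES, "minutes"))
```

Neither question required the records to share a layout or to be reloaded into a new structure, which is the agility the summary contrasts with legacy databases.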
This paper provides a framework meant to help in your evaluation of Big Data analytical tools. We review ten factors we believe should be paramount in your evaluations of Big Data analytical software packages. We present these factors in a way you should find easy to tailor to your organizational needs.

Ten Evaluation Factors

The ten factors we believe should be at the forefront of your decision are:

Mission Functionality/Capability
Ease of Use/Interface
Architecture Approach
Data Architecture
Analytical Models
Licensing
Security and Governance
Partner and Legacy Ecosystem
Deployment Models
Health of the Firm

We expand on these factors below.

Mission Functionality/Capability: This may be the most important factor in deciding which Big Data analytical tools to leverage in your infrastructure. If the package you are selecting does not have the capability you expect and need, then no other factor matters. Because this factor carries so much weight, you should have a well thought out vision of your desired capability that you can articulate. For example, do you need a system that can analyze all types of unstructured and structured data? Do you need a solution that enables collaboration between analysts? Or one that focuses on extracting knowledge from existing data stores? Do you want a system that just works in the back office of an IT shop, or one that supports missions by empowering end users?

Ease of Use/Interface: One of the first questions you should ask when choosing Big Data analytical tools is who you intend to have use them. Do you want to increase the capabilities of your data scientists to dig deeper into new questions? Do you want to increase the power of your analysts? Or are you hoping to push analytical capabilities out to your entire workforce? Giving more of your enterprise access to Big Data solutions leads to a more informed and agile workforce and reduces the IT bottleneck. The same tools that help intelligence analysts map networks can help web developers evaluate guest activity on a website, help the citizen-facing parts of your organization understand citizen requirements and trends, and help HR keep track of workflows and workloads. But these capabilities only help if your workforce is willing and able to use them. A powerful tool that takes several specialized degrees, or requires specific expertise such as SQL, will obviously have both limited impact and limited usage. More broadly, many non-IT professionals demand walk-up usability from their information management software. Often entire departments use only a small fraction of the capabilities that powerful analytics provide because those capabilities are intimidating or hard to access. Interface matters as much for specialists as it does for your less tech-savvy employees. Your analytics should be able to pose and answer questions across all data quickly and organically so that they become an extension of the analyst's thought process. A natural interface can be more important than any individual functionality. With smooth and efficient interactions between tools and users, analysts and decision makers make more and better decisions faster, which is the ultimate goal of analytics.

Architecture Approach: Some solutions require you to establish entire architectures just to support them. This is not a good approach. Other solutions are stand-alone islands that expect you to get all data into their closed system before they can do analysis. This might be acceptable for some missions, but in most cases you will want systems that work with your existing enterprise architecture and can securely move data in and out of the analytical tool. Your architecture should also help drive the interface into the capability. In most cases, every user in your organization will already have a browser on their device.
Shouldn't that be the interface into all your new analytical capabilities as well? Bottom line here: The solution you choose should work with your architecture and should not force you to re-engineer. Expect the new solution to integrate well with what you already have.

Data Architecture: Common standards for data are already key foundational components of most organizations' IT strategies. But integration of new tools can be complicated, requiring extensive set-up and configuration to extract, transform and load (ETL) data from multiple sources. Tools that require large teams of programmers to build ETL access into existing data stores will not have the agility required to take advantage of new data sources or to accommodate shifting mission needs or new business plans. Look for Big Data analytical tools that do not require complex data mappings and schema development, which are time consuming and lock your architecture into a fixed way of working. Look for tools that are designed to work with any type of data (they should be data source agnostic). Systems that force data to be collected again and imported into their local store, in set formats and indices designed only for that system's use, are sub-optimal and will limit your ability to perform your mission with the flexibility you want. Seek a capability designed so that new data can be added fast, without a need for engineers to design and activate the new data feed. Demand integration without limits.

Analytical Models: Analytical systems designed to help with complex issues use ontologies. These are ways of reflecting associations and meanings. Ontologies are sometimes called world views of an organization, since they reflect concepts in the environment that the group is dealing with. Simple, basic systems can be found that use a single ontology. These are acceptable as long as the problem you analyze will never change. Multi-ontology systems enable you to see different perspectives and manage policy by namespace. Multi-ontology systems also better enable discovery of new conclusions. The ability to have multiple models allows multiple issues to be worked, and multiple organizations can make use of the same tool. This lowers overall cost and speeds return on investment. Bottom line here: Do not select a tool that forces you to lock in on a particular analytical model.

Licensing: User organizations should, to the greatest extent possible, push for licensing that is economical, flexible and predictable. For many analytical tools a license based on number of users is a common approach.
Some tools license based on the number of processors, servers or cores, so you can be stuck with a high cost even if you have no one using the tool. You want systems from companies that are motivated to serve users, so licenses that reflect actual analytics used, regardless of processors or users, are the most flexible and are generally the best for this type of tool. Under per-user licensing, for example, if the mission team needs to be drastically expanded in a short period of time, the project may slow down while more licenses are acquired. User licenses are also, in most cases, acquired for longer periods than missions last, so when you compare options, usage-based licensing can cost significantly less to start and to maintain. You should also be careful about other licenses that are hidden when you buy a Big Data tool. For example, are you also required to buy an Oracle or Sybase license?

Security and Governance: Enterprises require authentication, authorization, auditing and other governance of tools for effective oversight of mission support and for ensured reliability. Expect the capability you choose to have options for LDAP/Active Directory integration, role-based access with delegation, integrated encryption methods and strong audit capabilities. Tools working with Hadoop clusters should be able to run in the secure areas of your network that hold the Hadoop master and slave nodes.

Partner and Legacy Ecosystem: Your legacy IT infrastructure comes from a wide range of firms. Any organization of size will have software that operates over datastores from companies like Oracle, Microsoft, Sybase, MySQL, IBM, Cloudera and countless others. Analytical tools from a wide range of vendors are also in your ecosystem. This means any Big Data capability you pick should have great flexibility in working with others in the ecosystem. Your Big Data solution must be able to work with anyone, so the capabilities you pick should be designed to enable customization and extension. This includes an ability to change ontologies, change interfaces, change data sources and change the other tools that it interfaces with.

Deployment Models: The capabilities you acquire should be able to run without a large contractor staff. Specialists are frequently required to install a capability, and some level of services and support to your team can be expected, but if you must pay for a large number of engineers to keep the Big Data tools running then you really have not bought a solution. You have bought the capability plus engineers, and the cost of that will eat you alive. If you are told that engineers are required, it should send up other alarms. Will there always have to be a wizard behind the curtain?

Health of the Firm: Who are you buying your capability from? Are they a user-focused organization that cares and will be with you long term? This can be hard to evaluate but it is worth some homework. What if the firm you are dealing with has the great reputation of a pre-crash Enron? How would you know as a potential user if the firm has the ethics and abilities you require? Is this firm having trouble staying afloat? If you are relying on the company for support, you may lose your investment if it closes its doors.
This is why the Federal Acquisition Regulation mandates market research. Never skip that step! Research the capability itself and the firm you are doing business with.

Concluding Thoughts

There are many other criteria you may want to consider when evaluating Big Data analytical tools, but the ten above are key for ensuring long-term mission success. We also believe it is important to speak with others who have used the tools you are evaluating so that you benefit from their lessons learned. This is especially important in the current budget environment.
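One way to apply the ten factors when comparing candidates is a weighted scoring sheet. The sketch below is only an illustration, not a method prescribed by this paper. The weights, the 0 to 5 scale, the threshold, the tool names and the scores are all hypothetical placeholders to be replaced with your own. Disqualifying a tool that scores poorly on Mission Functionality/Capability is one possible reading of the statement that, without the needed capability, no other factor matters.

```python
# Hypothetical weighted scoring sheet for the ten evaluation factors.
# Weights, the 0-5 scale and the gating threshold are illustrative assumptions.
FACTOR_WEIGHTS = {
    "Mission Functionality/Capability": 3.0,
    "Ease of Use/Interface": 2.0,
    "Architecture Approach": 1.5,
    "Data Architecture": 1.5,
    "Analytical Models": 1.0,
    "Licensing": 1.0,
    "Security and Governance": 2.0,
    "Partner and Legacy Ecosystem": 1.0,
    "Deployment Models": 1.0,
    "Health of the Firm": 1.0,
}

GATING_FACTOR = "Mission Functionality/Capability"
MIN_GATING_SCORE = 3  # assumed cut-off: below this, the tool is dropped outright


def weighted_score(scores):
    """Return the weighted average (0-5) of a tool's scores, or None if it is disqualified."""
    missing = set(FACTOR_WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    if scores[GATING_FACTOR] < MIN_GATING_SCORE:
        return None
    total = sum(weight * scores[factor] for factor, weight in FACTOR_WEIGHTS.items())
    return total / sum(FACTOR_WEIGHTS.values())


def rank_tools(candidates):
    """Return (name, score) pairs for the qualified tools, best first."""
    scored = [(name, weighted_score(scores)) for name, scores in candidates.items()]
    qualified = [(name, score) for name, score in scored if score is not None]
    return sorted(qualified, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Two made-up candidates: "Tool B" scores well overall but fails the gating factor.
    tool_a = dict.fromkeys(FACTOR_WEIGHTS, 4)
    tool_b = dict.fromkeys(FACTOR_WEIGHTS, 5)
    tool_b[GATING_FACTOR] = 2
    for name, score in rank_tools({"Tool A": tool_a, "Tool B": tool_b}):
        print(f"{name}: {score:.2f}")
```

Keeping the weights in one table makes the tailoring explicit: an organization that cares most about licensing or security changes a number rather than the procedure.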
More Reading

For more on federal IaaS technology and policy issues, visit:

CTOvision.com - A blog for enterprise technologists with a special focus on Big Data.
CTOlabs.com - A reference for research and reporting on all IT issues.
Carahsoft.com - Offering Big Data solutions for Government.

About the Authors

Ryan Kamauff is the lead technology research analyst at Crucial Point LLC, focusing on disruptive technologies of interest to enterprise technologists. He is also a writer at CTOvision.com. Contact Ryan at Ryan@crucialpointllc.com

Bob Gourley is CTO and founder of Crucial Point LLC and editor in chief of CTOvision.com. He is a former federal CTO. Contact Bob at bob@crucialpointllc.com
For More Information

If you have questions or would like to discuss this report, please contact me. As an advocate for better IT in government, I am committed to keeping the dialogue open on technologies, processes and best practices that will keep us moving forward.

Contact: Bob Gourley, bob@crucialpointllc.com

All information/data © 2011 CTOLabs.com.
Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe
More informationApache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
More informationConverged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities
Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling
More informationJune 2011. Production Hadoop systems in the enterprise
June 2011 Production Hadoop systems in the enterprise 1 What Hadoop changes about data 2 The system past and present 3 Living with it your present and future 4 Q&A 2 2011 Cloudera, Inc. All Rights Reserved.
More informationOracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
More informationInternational Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763
International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing
More informationSQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse
SQL Server 2012 PDW Ryan Simpson Technical Solution Professional PDW Microsoft Microsoft SQL Server 2012 Parallel Data Warehouse Massively Parallel Processing Platform Delivers Big Data HDFS Delivers Scale
More informationHadoop Introduction. Olivier Renault Solution Engineer - Hortonworks
Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013
More informationBIG DATA SOLUTION DATA SHEET
BIG DATA SOLUTION DATA SHEET Highlight. DATA SHEET HGrid247 BIG DATA SOLUTION Exploring your BIG DATA, get some deeper insight. It is possible! Another approach to access your BIG DATA with the latest
More informationCertified Big Data and Apache Hadoop Developer VS-1221
Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification
More informationChukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
More informationApache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com
Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture
More informationTHE JOURNEY TO A DATA LAKE
THE JOURNEY TO A DATA LAKE 1 THE JOURNEY TO A DATA LAKE 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA ACCORDING TO IDC, AS MUCH AS 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA,
More informationDell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert Armando_Acosta@Dell.com/
More informationPeers Techno log ies Pv t. L td. HADOOP
Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and
More informationMike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.
Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,
More informationBig Data Too Big To Ignore
Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction
More informationAtScale Intelligence Platform
AtScale Intelligence Platform PUT THE POWER OF HADOOP IN THE HANDS OF BUSINESS USERS. Connect your BI tools directly to Hadoop without compromising scale, performance, or control. TURN HADOOP INTO A HIGH-PERFORMANCE
More informationData Governance in the Hadoop Data Lake. Michael Lang May 2015
Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales
More informationIntroduction to Big Data! with Apache Spark" UC#BERKELEY#
Introduction to Big Data! with Apache Spark" UC#BERKELEY# So What is Data Science?" Doing Data Science" Data Preparation" Roles" This Lecture" What is Data Science?" Data Science aims to derive knowledge!
More informationInteractive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
More informationSolving the Big Data Intention-Deployment Gap
Whitepaper Solving the Big Data Intention-Deployment Gap Big Data is on virtually every enterprise s to-do list these days. Recognizing both its potential and competitive advantage, companies are aligning
More informationHDP Enabling the Modern Data Architecture
HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,
More informationA Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
More informationBig Data Processing: Past, Present and Future
Big Data Processing: Past, Present and Future Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM
More informationBig Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools
More informationMySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
More informationAdvanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
More informationConstructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
More informationThe Inside Scoop on Hadoop
The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM The Inside Scoop
More informationThere s no way around it: learning about Big Data means
In This Chapter Chapter 1 Introducing Big Data Beginning with Big Data Meeting MapReduce Saying hello to Hadoop Making connections between Big Data, MapReduce, and Hadoop There s no way around it: learning
More information