THE PROCESS OF BIG DATA SOLUTION ADOPTION


An exploratory study within the Dutch telecom and energy utility sector

Master Management of Technology
Faculty Technology, Policy and Management
Delft University of Technology
August 2013

Name: Bas Verheij
Student number:
1st supervisor: Dr. Laurens Rook
2nd supervisor: Prof.dr.ir. Jan van den Berg
Chair: Prof.dr. Cees van Beers
External company: Accenture
External supervisor: Drs. Paul van der Linden MSc


ACKNOWLEDGEMENT

This thesis covers the development of a conceptual big data solution adoption process model, which has been designed using empirical findings from interviews with firms adopting big data solutions within the Dutch telecommunications and energy utilities sectors. This document presents the results of my graduation research to finalize the master program Management of Technology (MOT) at Delft University of Technology.

Firstly, I would like to thank Paul van der Linden from Accenture for giving me the opportunity to conduct my research at Accenture and for helping me shape the research through the many meetings we had. Accenture provided a very fruitful environment for developing knowledge on the topic of big data, and I am thankful to Paul van der Linden and the DD&A department for giving me the opportunity to conduct my research among large and interesting companies within the Netherlands.

I would also like to thank my first supervisor Laurens Rook for supervising my thesis, and mainly for the support and discussions during the process of my research. I would also like to thank Jan van den Berg for his critical and accurate feedback, which has greatly helped my learning experience in performing scientific research.

Thirdly, I would like to thank my friends who helped me during my time in Delft and my various study activities. I would especially like to thank the friends who helped me finish my Bachelor in Mechanical Engineering, which ultimately made this graduation possible.

Finally, I am very grateful to my family for supporting me through my life and my study period, and especially for encouraging me to persevere in my studies.

LIST OF ABBREVIATIONS

Abbreviation: Term
ACID: Atomicity, Consistency, Isolation, Durability
AWS: Amazon Web Services
EC2: Amazon Web Services EC2
AWS EMR: Amazon Web Services Elastic MapReduce
API: Application Programming Interface
BASE: Basically Available, Soft state, Eventual consistency
BI: Business Intelligence
BI&A: Business Intelligence & Analytics
CAP theorem: Consistency, Availability, Partition tolerance
DM: Data mining
DW: Data Warehouse
ETL: Extraction-Transformation-Loading
HDFS: Hadoop Distributed File System
MVCC: Multi-version Concurrency Control
OLAP: Online Analytical Processing
RDBMS: Relational Database Management System
DSS: Decision Support System
DDD: Data-driven Decision Making
IS: Information System


TABLE OF CONTENTS

1 INTRODUCTION
RESEARCH OBJECTIVE
RESEARCH QUESTIONS
RESEARCH FRAMEWORK
STRUCTURE OF THIS REPORT
BACKGROUND
INTRODUCTION
BUSINESS INTELLIGENCE & ANALYTICS AND BIG DATA
DATABASE TECHNOLOGY
BIG DATA SOLUTIONS
BIG DATA AND ORGANIZATIONS
THEORY: IT INNOVATION ADOPTION
CONCLUSION: BI GENERATIONS AND ADOPTION
METHODOLOGY
CASE-STUDY RESEARCH APPROACH JUSTIFICATION
CASE-STUDY RESEARCH STRATEGY
PHASE OF PREPARATION
PHASE OF DATA COLLECTION
PHASE OF DATA ANALYSIS
DESIGN OF CONCEPTUAL MODEL
CASE-STUDY VALIDITY CONCERNS
BIG PRACTICES IN TWO INDUSTRIES
ENERGY SECTOR OVERVIEW
TELECOM SECTOR OVERVIEW
OTHER SECTORS OVERVIEW
RESULTS OF THEMATIC ANALYSIS
CASE STUDY RESULTS
BUSINESS CASE DEVELOPMENT
SOLUTION CHOICE
ORGANIZATIONAL CHANGE
INFORMATION PRIVACY
IMPLEMENTATION AND FINE-TUNING PROBLEMS IN BIG DATA TECHNOLOGY
CONCEPTUAL MODEL
CONSIDERATIONS: REQUIREMENTS FOR BIG DATA ADOPTION
BIG DATA ADOPTION PROCESS PHASES
ISSUES PERCEIVED IN THE ADOPTION PROCESS
ADOPTION PROCESS MODEL
ANALYSIS
IT INNOVATION ADOPTION (BUSINESS CASE DEVELOPMENT)
RADICAL INNOVATION (ORGANIZATIONAL CHANGE)
CHANGES IN IS ARCHITECTURES (SOLUTION CHOICE)
7.4 INFORMATION PRIVACY AS A MAIN DRIVER FOR IS CHANGE (PRIVACY)
REFLECTION
REFLECTION ON RESEARCH RESULTS
QUALITY OF THE RESULTS
REFLECTION ON RESEARCH PROCESS
CONCLUSION
MAIN FINDINGS
RESEARCH IMPLICATIONS
MANAGERIAL IMPLICATIONS
RESEARCH LIMITATIONS
FUTURE RESEARCH
CLOSURE
REFERENCES
APPENDIX A: PLANNING
APPENDIX B.1: INTERVIEW PROTOCOL
APPENDIX B.2: INTERVIEW CASE STUDY DESCRIPTION QUESTIONS
APPENDIX B.3: INTERVIEW PERCEPTIONS QUESTIONS
APPENDIX D: SAMPLE DETAILS
D.1 GENERAL FIRM CHARACTERISTICS
D.2 CHARACTERISTICS FIRM CASES
D.3 GROUPING OF ADOPTION PROCESSES
APPENDIX F: MOST FREQUENTLY USED PREDICTORS OF ORG. IT ADOPTION (JEYARAJ ET AL., 2006)
APPENDIX G: LITERATURE RESEARCH
APPENDIX H: DATABASE LANDSCAPE (AS OF JUNE 2013)
APPENDIX I: NOSQL FUNDAMENTAL TECHNICAL CONCEPTS
I.1 FUNDAMENTAL CONCEPTS
I.2 TYPES OF DATA MODELS
I.3 PERFORMANCE AND ELASTICITY

EXECUTIVE SUMMARY

Research Objective
The main purpose of this research is to design a conceptual big data solution adoption model by exploring the process of big data solution adoption within organizations. This multiple-case study gives a thorough description of the adoption process of big data solutions and the main issues organizations experience within this process.

Phenomenon
The term big data is primarily an umbrella term used within the industry; clear definitions in scientific literature had not been found at the time of research. Gartner was the first firm within the industry to name the big data phenomenon, defining it as challenges and opportunities in data growth. Gartner defined these challenges and opportunities as having three facets: increasing volume, velocity and variety ("The three Vs"; Pettey & Goasduff, 2011). The McKinsey Global Institute published an industry report in 2011, which uses an intentionally subjective definition to capture the essence of the difference between data and big data. This definition will be used as a principle in this paper (Manyika, Chui, Brown, & Bughin, 2011; p1): "Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze." The big data phenomenon is accompanied by a new generation of databases, so-called NoSQL databases, which are capable of distributed analysis of large datasets of unstructured data. These databases are expected to be adopted within firms as the need to analyze data of increasing volume, velocity and variety demands new database technologies. Although big data is widely and wildly cited in industry papers, analyses of actual activities within firms are scarce. This is surprising, considering the large potential impact big data has on society.
This is also reflected in the upcoming MIS Quarterly Executive Dec special edition, titled "How to succeed in a world of big data" 1, which calls specifically for papers on innovative business uses of data and on organizational approaches to developing and sustaining big data skills and capabilities; these topics are also described in this research.

Methodology
This research was designed as a multiple-case study (Yin, 2009) in which people involved in big data practices in Dutch telecommunications and energy utility firms were interviewed. The case studies were summarized, the interviews were open-coded (Strauss & Corbin, 1998), and main concepts were extracted from the codes using Eisenhardt's (1989) process of inducing theory from case studies.

Findings
Using eight cases within two Dutch sectors, five phases within the big data solution adoption process were found: a strategy development phase, a knowledge development phase, a pilot/test-case phase, a platform implementation phase and a fine-tuning phase. Issues described within the process of big data solution adoption were categorized into four main concepts: business case development, technical, organizational, and information privacy related. These findings are depicted in the conceptual model below.

1 See Retrieved 4/7/

Figure: Conceptual model of issues in the big data solution adoption process.
Phases: Strategy; Knowledge development; Pilot/test-cases; Platform implementation; Fine-tuning.
Issue categories: Business case development; Technical issues; Organizational issues; Information privacy.
Big data adoption process issues shown in the model:
- Business case barriers (5)
- Business case drivers (4)
- Business case co-development
- Pilot isolated from normal business processes
- Hard to get data from other party
- IT management support
- Internal stakeholder support
- No standards in big data technology yet
- Top management support
- Instability and bugs in big data systems
- Define scale of implementation
- Solution choice related (combined in pilot/test-cases)
- Choices on data granularity
- Solution choice related (15)
- Unstable platforms
- Ecosystem complexity
- Data availability
- Data access (silos)
- Integrate platform in traditional processes
- Ecosystem management
- Platform rigidity (ETL schemas)
- Data sourcing problems
- Algorithm large scale dataset problems
- Data streaming problems
- When to develop analytical skills
- When to start saving data
- Team setup & structure
- Align internal stakeholders
- Role needed to translate analytics to business processes
- Unknown consumer attitudes on privacy
- Develop analytical capabilities
- External communication
- Get support from organization
- Align external parties
- Infrastructure capabilities needed
- Specific case capabilities needed (e.g. reliability engineers)
- Unknown consumer attitudes on privacy
- Restraints internal law department
- Costs of integration privacy tooling (existing systems)
- Changing laws & regulations

After analysis, these concepts led to twelve main issues in the process of big data solution adoption:

Business case development (strategy, knowledge development, and pilot/test-case phases)
1. A paradigm change is needed in order for a firm to recognize the value of big data business cases.
2. Business case development can be driven by either IT management support or top management support (in the case of new business development). It is hampered for three main reasons:
   a. The business case value and the business case return being unclear
   b. High financial and time investment before the value of the business case becomes clear
   c. High baseline pressure (especially in small firms)
3. Information privacy concerns significantly impact business case development, as will be discussed in (6).

Solution choice (pilot/test-case phase)
4. The choice for big data solutions (NoSQL databases) is mainly driven by the need for scalability; this is also the main limiting factor of the IS currently in use. Unstructured data was not reported as a main determinant for big data solution choice.
5. Parallel databases are currently seen as a viable alternative to big data (NoSQL) solutions by firms which had already adopted parallel databases for other firm activities.
6. Ecosystem complexity and management problems are reported in both experimentation and implementation phases; these lead to significant delays in the solution adoption process in the implementation and fine-tuning phases.

Organizational change (knowledge development, pilot/test-case, and implementation phases)
7. An analytics department was perceived as needed during the process of big data solution adoption. In all cases where such a department was not yet present, an analytics department was set up during the process, which caused significant delays.
8. Business process experts (or business consultants) who can translate business case needs into information systems requirements, taking into account legal and technical constraints, are perceived as a crucial role in the process of big data solution adoption.
9. Specific capabilities for setting up, managing and maintaining big data solution infrastructure were perceived as needed in the process of big data solution adoption.

Information privacy (knowledge development, pilot/test-case, and implementation phases)
10. Information privacy significantly influences the big data solution adoption process in three main ways:
   a. Uncertainty regarding consumer attitudes, in combination with fear of a negative brand image concerning new big data related products and services, hampers business case development and experimentation, and is seen as a main barrier to new business case development in part of the cases where personal data is used.
   b. The risk of changes in laws and regulations hampers business case development.
   c. Business cases becoming unviable due to the high costs of privacy tooling in the implementation phase (connecting opt-in/out APIs to current IS) hampers both implementation and new business case development.

Implementation and fine-tuning of big data technology (implementation and fine-tuning phases)
11. Issues concerning data quality and accuracy were related to two main problems:
   a. Added complexity layers in big data platforms prevent traditional drill-down in case of data problems.
   b. Scaling effects due to large datasets lead to algorithms having to be retested and recalibrated.
12. Performance (fine-)tuning of big data solutions is perceived as complicated due to the high complexity of big data ecosystems and a lack of documentation.

1 Introduction

The Google search engine is well known and used all over the world, but it is less well known that Google also uses its search queries for purposes other than optimizing the search engine itself. In 2008 Google launched the project Google Flu Trends, a web service to predict flu activity 2. With Flu Trends Google could accurately estimate the current level of weekly influenza activity in each region of the United States (Ginsberg et al., 2008; p1012). How did Google accomplish this? The analysis was based only on correlations of specific search queries with the spreading of influenza in time and location; no predefined causal links were investigated. Due to the scaling effects of the large dataset and tremendous analysis capacity, an accurate model could be developed. In this case the use of big data led to interesting and usable new insights.

Two main developments led to the emergence of the big data concept. Firstly, the amount of data generated has increased tremendously in the past few years; internet traffic is calculated to reach one exabyte per day 3. The first companies facing this explosion were Google and Facebook, which were also a primary force in the development of new technology to deal with extremely large datasets. A second development is the increasing capacity (Moore's law; Schaller, 1997) and decreasing price of hardware following from the commoditization of IT. New open source platforms to analyze large datasets, together with affordable hardware, made larger-scale adoption of so-called big data solutions possible. After Google's (Dean & Ghemawat, 2008) and Facebook's (Lakshman & Malik, 2010) publication of new paradigms to process and analyze data, the open source framework Hadoop (among others) emerged. A second generation of companies started to adopt these new technologies, including Ebay, Amazon and Walmart. Gradually more firms became interested and started to test use cases.
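The processing paradigm published by Dean & Ghemawat (2008) is MapReduce, the model that Hadoop implements. As a rough illustration only, the following single-process Python sketch walks through its map, shuffle and reduce steps. The search-query log and all function names are hypothetical and chosen for this example; a real framework would distribute the same steps over many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values of one key into a single count."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence on an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical search-query log, used only for illustration.
queries = ["flu symptoms", "flu remedy", "cold symptoms"]
print(map_reduce(queries))
```

Because each record is mapped independently and each key is reduced independently, both steps can in principle run in parallel, which is the property that makes the paradigm suitable for very large datasets.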
Professional services firms were also aware of the new possibilities unlocked by big data technology; after a widely cited report by McKinsey was published (Manyika et al., 2011), a hype started around the phenomenon of big data. Thousands of articles on the value of and need to adopt big data were published in the past two years. Within the Netherlands, too, big data solutions are gradually being adopted; the use of big data is mentioned at e.g. Bol.com 4 and the Rabobank 5. The question then arises: are use cases seen within other companies in the Netherlands, and do all companies in a sector perceive an urgency to invest in big data solutions? And what problems do firms experience when actually adopting big data technologies? This research answers these questions by investigating the adoption of big data solutions in the Dutch telecommunications and energy utility sectors.

2 Google Flu Trends,
3 Cisco: The Zettabyte Era - Trends and Analysis. WP.html. Retrieved 5/8/
4 Vijf vragen aan Arjen de Ruiter, Bol.com. Retrieved 5/8/
5 Presentation "Big Data bij de Rabobank"; NGI, 17 January. https://www.ngi.nl/regios/utrecht/verslagen/big-data/bijlagen/rabobank-voor-ngi.pdf. Retrieved 4/7/

This research produces a conceptual big data adoption process model which companies can use to identify key issues in the process of big data adoption. Furthermore, four categories of issues are described in detail, and theoretical implications and research directions for these concepts are identified.

1.1 RESEARCH OBJECTIVE

The purpose of this research is to generate knowledge on the process of big data solution adoption within companies. To achieve this purpose, an investigation is needed into how companies perceive big data, and what activities they employ to integrate big data solutions within their company. Therefore the following research objective is formulated:

Research objective: the main purpose of this research is to design a conceptual big data solution adoption process model.

This research objective is reached through four main steps:
A Examine the adoption of various big data solutions by firms within two sectors in the Netherlands.
B Investigate key problems, choices, drivers and barriers in the adoption process of big data solutions.
C Design a conceptual model which describes the process of big data solution adoption.
D Match key categories of issues from the conceptual model with literature to find similarities, differences and knowledge gaps.

A number of research questions guide the research process; these are elaborated on in the next section.

1.2 RESEARCH QUESTIONS

Following the main research objective, a set of six sub-questions is proposed to further guide the research. These sub-questions break down the process into a set of inquiries, and will together answer the main research question. Firstly, the theoretical body of knowledge concerning the term big data and accompanying technologies provides scope and relevance for this research. The term big data is widely and wildly cited but a clear definition is lacking; therefore, the first sub-question is stated as:

RQs 1. Which facets of big data have been described within literature and industry papers?

This question will be answered by literature research on the big data phenomenon within the scientific domain and within the industry. In addition, a thorough understanding of the actual experience within companies is needed.
A body of knowledge is needed on firms' current activities, as big data practices have not yet often been described in literature. Therefore, the second sub-question relates to the current rate of adoption of big data solutions within Dutch firms. This question involves the gathering of empirical data, which is used to map and describe the process of big data solution adoption.

RQs 2. How can the current activities employed within information-intensive firms (in the Netherlands) be described with regard to big data solution adoption?

To describe the main themes related to the adoption of big data solutions, the perceptions of actors involved in the process will be investigated. These perceptions can be seen as a pointer to the concepts involved in this process. This is guided by the following research sub-question:

RQs 3. How do actors within information-intensive firms (in the Netherlands) perceive choices, problems and changes related to the process of adopting big data solutions within their company?

These empirical findings are then mapped onto phases of the adoption process to provide a conceptual model of the big data solution adoption process. This is guided by the following research sub-question:

RQs 4. How do main categories of perceived choices, problems and changes concerning the process of adopting big data solutions within a company relate to phases in the adoption process?

The process model and the categories of issues are then discussed in relation to literature to provide scientific grounding for the phenomena described. Because the big data phenomenon is current and fast-moving, it is quite possible that scientific models lag behind the big data solutions that are being tested in practice. Therefore, the next sub-question guides the process of identifying similarities and differences between scientific models and the phenomena described, and ultimately should identify knowledge gaps within existing literature.

RQs 5. How does the conceptual model of the big data solution process and described categories of issues relate to current scientific literature?

Finally, the practical relevance of this research will be explored through the last sub-question. More precisely, a set of managerial recommendations will be given using the body of knowledge developed in this research.

RQs 6. How can companies within information-intensive firms (in the Netherlands) best apply the body of knowledge created by this empirical research?
These research sub-questions will be answered through a multiple-case study; the details and design of this multiple-case study are described in the methodology chapter of this research.

1.3 RESEARCH FRAMEWORK

The research framework depicted below describes the process of generating knowledge in this thesis. The thesis is centered on the description of the process of big data adoption in case studies. The relation between research questions, research activities employed and deliverables is visualized below.

Figure 1.1. Research framework.

1.4 STRUCTURE OF THIS REPORT

This report is arranged in accordance with the general structure of a scientific article. Each chapter describes a part of the research process and answers a research sub-question as specified in chapter two. Chapter 2 provides a theoretical background for this research and thereby answers the first research question. Chapter 3 outlines the methodology chosen to fulfill the research objective and answer the research questions. Chapter 4 contains case descriptions of all case studies involving the adoption of new technology. Chapter 5 contains the results of the case studies, chapter 6 contains the conceptual model, and chapter 7 contains an analysis of the main concepts. Chapter 8 contains a reflection on the research results and process. Finally, chapter 9 contains a discussion of the findings, their relevance in both scientific and managerial terms, and the research limitations. This chapter also contains an answer to the main research question as specified above. This structure is depicted in figure 1.2 below.

Figure 1.2. Structure of this research report.


2 Background

This chapter gives an overview of the emergence of the term big data and a review of the literature currently available. A number of facets of big data are discussed in subsequent sections. An overview of the main concept of big data is given in section 2.1. Practices in the business intelligence domain are discussed in section 2.2. New technology related to big data and big data solutions are discussed in sections 2.3 and 2.4 respectively. Big data and organizational adoption are discussed in section 2.5. The topic of innovation adoption is discussed in section 2.6. Finally, different business intelligence generations and adoption are described in section 2.7. This chapter provides an answer to the research sub-question:

RQs 1. Which facets of big data have been described within literature and industry papers?

2.1 INTRODUCTION

The first firm within the industry to name the big data phenomenon was Gartner, defining it as challenges and opportunities in data growth. Gartner defined these challenges and opportunities as having three facets: increasing volume, velocity and variety (Pettey & Goasduff, 2011). In 2012 Gartner updated their definition of big data, stating that "Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization" (Beyer & Laney, 2012). Although this definition is still widely used within the industry, it had no clear boundaries and only indicated challenges in data growth which needed to be solved.

In 2011 the McKinsey Global Institute (MGI) published an influential industry report which gave an in-depth analysis of the transformative potential of big data for firms. This report used an intentionally subjective definition to capture the essence of the difference between data and big data (Manyika, Chui, Brown, & Bughin, 2011; p1): "Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze." Furthermore, the report consisted of seven main statements, which are listed below:
1. Data have swept into every industry and business function and are now an important factor of production.
2. Big data creates value in several ways: by creating transparency; enabling experimentation to discover needs, expose variability, and improve performance; segmenting populations to customize actions; replacing/supporting human decision making with automated algorithms; and innovating new business models, products, and services.
3. Use of big data will become a key basis of competition and growth for individual firms through data-driven decision making (Brynjolfsson, Hitt, & Kim, 2011).
4. The use of big data will underpin new waves of productivity growth and consumer surplus.
5. While the use of big data will matter across sectors, some sectors are poised for greater gains.
6. There will be a shortage of talent necessary for organizations to take advantage of big data.
7. Several issues will have to be addressed to capture the full potential of big data: data policies; technology and techniques; organizational change and talent; access to data; and industry structure.

These statements will be used as guidance for themes and issues related to big data solution adoption in this research.

Within scientific literature, the term big data is used scarcely and is often presented with definitions derived from Gartner and the McKinsey Global Institute. E.g. Snijders et al. (2012; p. 1) define big data as "a loosely defined term used to describe data sets so large and complex that they become awkward to work with using standard statistical software"; and Chen, Chiang, & Storey (2012; p. 1166) use the terms big data and big data analytics "to describe the data sets and analytical techniques in applications that are so large (from terabytes to exabytes) and complex (from sensor to social media data) that they require advanced and unique data storage, management, analysis, and visualization technologies." Most technical developments which from a business perspective can be put under the term big data (e.g. the development of specific new databases) are discussed without any reference to the term big data. Some authors do discuss the term briefly, e.g. Cuzzocrea, Song, & Davis (2011) see big data as unstructured data having four characteristics in common:
1. The data being large-scale and often distributed
2. Involving scalability issues
3. Supporting advanced Extraction-Transformation-Loading (ETL) processes to structure information
4. Designing and developing easy and interpretable analytics in order to extract meaningful knowledge.
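The third characteristic above refers to Extraction-Transformation-Loading (ETL) processes that give structure to information. As a minimal sketch of that idea, and not a description of any system used by the firms studied, the Python example below parses loosely structured log lines into typed rows and loads them into an in-memory SQLite table. The telecom-style log format, the field names and the table layout are all assumptions made for this illustration.

```python
import sqlite3

# Hypothetical semi-structured event log; the format is assumed for illustration.
raw_lines = [
    "2013-06-01 10:02:11 customer=17 event=call duration=120",
    "2013-06-01 10:05:40 customer=17 event=sms",
    "malformed line without fields",
    "2013-06-02 08:15:00 customer=42 event=call duration=45",
]


def extract(lines):
    """Extraction: read raw records from the source (here an in-memory list)."""
    yield from lines


def transform(lines):
    """Transformation: parse key=value fields into typed rows, skipping bad records."""
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        fields = dict(p.split("=", 1) for p in parts[2:] if "=" in p)
        if "customer" not in fields or "event" not in fields:
            continue
        yield (
            parts[0],                         # day
            int(fields["customer"]),          # customer id
            fields["event"],                  # event type
            int(fields.get("duration", 0)),   # duration in seconds, 0 if absent
        )


def load(rows, conn):
    """Loading: write the structured rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(day TEXT, customer INTEGER, event TEXT, duration_s INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(raw_lines)), conn)

# Once loaded, the data can be queried analytically, e.g. total call time per customer.
totals = conn.execute(
    "SELECT customer, SUM(duration_s) FROM events GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The fixed table schema in `load` is what makes the data easy to query, but it also means every new field in the source requires a change to the transformation and the table.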

For a more technical approach to big data, which defines the key differences between big data solutions and classical solutions, knowledge of the Business Intelligence (BI) domain is indispensable. This is discussed in the next section.

2.2 BUSINESS INTELLIGENCE & ANALYTICS AND BIG DATA

As data stored within firms is increasing in volume, velocity and variety, new technical solutions and analytical tools are introduced within the BI&A domain. This section describes the evolution of the business intelligence domain (section 2.2.1), the general process within BI systems (section 2.2.2), and the subdomain of data analytics (section 2.2.3).

2.2.1 EVOLUTION OF CLASSICAL BUSINESS INTELLIGENCE

The term Business Intelligence (BI) was first introduced in the IBM Journal in 1958, which stated that a BI system has "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal" (Luhn, 1958; p314). Luhn described a BI system as: (i) an automatic system being developed to disseminate information to sections of an organization, (ii) using data-processing to create profiles for each of the action points of an organization, and (iii) using incoming and externally generated data (Luhn, 1958). With the development of Decision Support Systems (DSS) starting in the 1960s and the development of computers in the 1980s, Business Intelligence changed to an umbrella term describing systems for the support and improvement of business decision making (Power, 2007; Khan & Quadri, 2012).

The development of the BI domain can be seen from a perspective involving generations. The evolution of BI started with the above-mentioned description by Luhn (1958). As computers were first produced in the 1960s, it then became possible to build Management Information Systems (MIS), providing managers with structured periodic reports. These MIS were a precursor of Decision Support Systems (DSS) (Power, 2007).
Research, mainly at the Carnegie Institute of Technology and at Massachusetts Institute of Technology in the 1960s and 1970s, led to a broadening of the purposes DSS were used for, and to the development of executive information systems (EIS) or executive support systems (ESS). Starting in the 1990s, Inmon (1992) and Kimball (1996) were highly influential in incorporating relational database technologies into DSS, focusing on data-driven DSS. This led to the term Data Warehouse (DW or DWH) or Enterprise Data Warehouse (EDW) becoming a central term in BI. The data warehouse (DW) is a central database used for BI purposes, for reporting and data analysis (Inmon, 1992). More specifically, a DW is a copy of transaction data specifically structured for query and analysis, and is informational, analysis and decision support oriented, not operational or transaction processing oriented (Kimball, 1996; as cited from Khan & Quadri, 2012).

2.2.2 BUSINESS INTELLIGENCE SYSTEMS


Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D. Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

Navigating the Big Data infrastructure layer Helena Schwenk

Navigating the Big Data infrastructure layer Helena Schwenk mwd a d v i s o r s Navigating the Big Data infrastructure layer Helena Schwenk A special report prepared for Actuate May 2013 This report is the second in a series of four and focuses principally on explaining

More information

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Introduction to Big Data! with Apache Spark UC#BERKELEY# Introduction to Big Data! with Apache Spark" UC#BERKELEY# So What is Data Science?" Doing Data Science" Data Preparation" Roles" This Lecture" What is Data Science?" Data Science aims to derive knowledge!

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics

More information

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2 Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue

More information

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

This Symposium brought to you by www.ttcus.com

This Symposium brought to you by www.ttcus.com This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010 System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1- Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout

More information

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,

More information

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem: Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

Chapter 7. Using Hadoop Cluster and MapReduce

Chapter 7. Using Hadoop Cluster and MapReduce Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in

More information

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing Evaluating NoSQL for Enterprise Applications Dirk Bartels VP Strategy & Marketing Agenda The Real Time Enterprise The Data Gold Rush Managing The Data Tsunami Analytics and Data Case Studies Where to go

More information

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop

Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12

Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12 Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using

More information

So What s the Big Deal?

So What s the Big Deal? So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

More information

CIO Guide How to Use Hadoop with Your SAP Software Landscape

CIO Guide How to Use Hadoop with Your SAP Software Landscape SAP Solutions CIO Guide How to Use with Your SAP Software Landscape February 2013 Table of Contents 3 Executive Summary 4 Introduction and Scope 6 Big Data: A Definition A Conventional Disk-Based RDBMs

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Big Data and the Cloud Trends, Applications, and Training

Big Data and the Cloud Trends, Applications, and Training Big Data and the Cloud Trends, Applications, and Training Stavros Christodoulakis MUSIC/TUC Lab School of Electronic and Computer Engineering Technical University of Crete stavros@ced.tuc.gr Data Explosion

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS 9 8 TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS Assist. Prof. Latinka Todoranova Econ Lit C 810 Information technology is a highly dynamic field of research. As part of it, business intelligence

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Can the Elephants Handle the NoSQL Onslaught?

Can the Elephants Handle the NoSQL Onslaught? Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented

More information

The emergence of big data technology and analytics

The emergence of big data technology and analytics ABSTRACT The emergence of big data technology and analytics Bernice Purcell Holy Family University The Internet has made new sources of vast amount of data available to business executives. Big data is

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores Composite Software October 2010 TABLE OF CONTENTS INTRODUCTION... 3 BUSINESS AND IT DRIVERS... 4 NOSQL DATA STORES LANDSCAPE...

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com

Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

Big Data Analytics - Accelerated. stream-horizon.com

Big Data Analytics - Accelerated. stream-horizon.com Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

Cloud Scale Distributed Data Storage. Jürmo Mehine

Cloud Scale Distributed Data Storage. Jürmo Mehine Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented

More information

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

More information

NoSQL Data Base Basics

NoSQL Data Base Basics NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA

ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

Data Services Advisory

Data Services Advisory Data Services Advisory Modern Datastores An Introduction Created by: Strategy and Transformation Services Modified Date: 8/27/2014 Classification: DRAFT SAFE HARBOR STATEMENT This presentation contains

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

Big Data Management and Security

Big Data Management and Security Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

More information

An Approach to Implement Map Reduce with NoSQL Databases

An Approach to Implement Map Reduce with NoSQL Databases www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh

More information

Big Data and Apache Hadoop Adoption:

Big Data and Apache Hadoop Adoption: Expert Reference Series of White Papers Big Data and Apache Hadoop Adoption: Key Challenges and Rewards 1-800-COURSES www.globalknowledge.com Big Data and Apache Hadoop Adoption: Key Challenges and Rewards

More information

Oracle s Big Data solutions. Roger Wullschleger.

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

Finding your Big Data Way

Finding your Big Data Way Finding your Big Data Way Finding your Big Data Way A multiple case study on the implementation of Big Data Date: July 2015 Author: Amani Michael Introduction New Information Technologies and new possibilities

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

A survey of big data architectures for handling massive data

A survey of big data architectures for handling massive data CSIT 6910 Independent Project A survey of big data architectures for handling massive data Jordy Domingos - jordydomingos@gmail.com Supervisor : Dr David Rossiter Content Table 1 - Introduction a - Context

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

WHITE PAPER. Four Key Pillars To A Big Data Management Solution

WHITE PAPER. Four Key Pillars To A Big Data Management Solution WHITE PAPER Four Key Pillars To A Big Data Management Solution EXECUTIVE SUMMARY... 4 1. Big Data: a Big Term... 4 EVOLVING BIG DATA USE CASES... 7 Recommendation Engines... 7 Marketing Campaign Analysis...

More information

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

More information

Proact whitepaper on Big Data

Proact whitepaper on Big Data Proact whitepaper on Big Data Summary Big Data is not a definite term. Even if it sounds like just another buzz word, it manifests some interesting opportunities for organisations with the skill, resources

More information

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP

TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP Pythian White Paper TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP ABSTRACT As companies increasingly rely on big data to steer decisions, they also find themselves looking for ways to simplify

More information

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置

More information

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper Offload Enterprise Data Warehouse (EDW) to Big Data Lake Oracle Exadata, Teradata, Netezza and SQL Server Ample White Paper EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

WA2192 Introduction to Big Data and NoSQL EVALUATION ONLY

WA2192 Introduction to Big Data and NoSQL EVALUATION ONLY WA2192 Introduction to Big Data and NoSQL Web Age Solutions Inc. USA: 1-877-517-6540 Canada: 1-866-206-4644 Web: http://www.webagesolutions.com The following terms are trademarks of other companies: Java

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

DATA MINING AND WAREHOUSING CONCEPTS

DATA MINING AND WAREHOUSING CONCEPTS CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation

More information

ROME, 17-10-2013 BIG DATA ANALYTICS

ROME, 17-10-2013 BIG DATA ANALYTICS ROME, 17-10-2013 BIG DATA ANALYTICS BIG DATA FOUNDATIONS Big Data is #1 on the 2012 and the 2013 list of most ambiguous terms - Global language monitor 2 BIG DATA FOUNDATIONS Big Data refers to data sets

More information