BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS
Megha Joshi
Assistant Professor, ASM's Institute of Computer Studies, Pune, India

Abstract: Industry is struggling to handle voluminous, complex, unstructured data sets. Testing big data is one of the major challenges that test engineers will soon encounter. Much data today is in unstructured format; for example, tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search. Transforming such content into a structured format for later analysis is a major challenge. This paper details the challenges that software testing faces in dealing with Big Test Data. The focus is on two specialized testing techniques, which are used to explain the concepts and the trends.

Keywords: Big data, big test data, big test management, data warehouse test, testing, voluminous.

INTRODUCTION

Big Data has generated a lot of buzz in recent times. Enormous, voluminous, vast, complex and heterogeneous are some of the common terms that come to mind when Big Data is thought of. Big Data is the continuous explosion of large volumes of data that are generated, processed, stored and accessed by applications that handle several concurrent data transactions instantaneously. The transition from structured relational data to voluminous, unstructured, non-semantic, but essential and highly complex data remains a great challenge for the data managers, data workers and data analyzers who must hold and organize such Big Data. Social networking sites, patenting websites, geographical and spatial data processing applications, remote sensing and meteorological systems now collect data in fractions of a second, and for all of them the veracity of the data has to be considered. Though system architects and designers are researching better ways to master Big Data, test architects and test engineers are also not far from facing Big Test Data.

Whether static or dynamic, Big Data has four dimensions: volume, velocity, variety and veracity.
Fig. 1. Dimensions of Big Data

Volume is the tremendous amount of data. Enterprises are flooded with ever-growing data of all types and easily accumulate terabytes, even petabytes, of information.

Variety is the heterogeneity of data. Big data covers any type of data, whether structured, semi-structured or unstructured, such as text, sensor data, audio, video, click streams, log files and more. New insights are found when these data types are analyzed together.

Velocity is the rate at which data comes in, flows within and goes out of the system. Time is an important factor to be considered here.

Veracity is the correctness or quality of the data or information. Establishing trust in big data presents a huge challenge as the variety and number of sources grow.

1. OVERVIEW OF BIG DATA ARCHITECTURE

As has always been the case, big is complex. No generalized architecture or framework could be designed for Big Data, as the challenges lying ahead are even bigger. Tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search; transforming such content into a structured format for later analysis is a major challenge. In addition to these, smileys, icons, long URLs and junk historic data are too complex to process in one framework. Irrespective of the nature of the data, the underlying storage structure is usually preferred to be a file storage system, over which Hadoop's distributions, mappings and MapReduce configurations are programmed to access the BIG DATA. Data implementations such as the ones based on Apache Hadoop have no such limitations, as they are capable of storing the data in multiple clusters. The storage is provided by HDFS (Hadoop Distributed File System), a reliable shared storage system whose contents can be analyzed using MapReduce. Programming languages play an important role in the extraction and cleansing of the acquired data, and in putting it into a representable form.
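The MapReduce style of analysis mentioned above can be illustrated with a minimal sketch. It runs entirely in memory in plain Python and does not use Hadoop or HDFS; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample records are hypothetical and chosen only for illustration. It shows how weakly structured text records might be cleansed and turned into structured key/value counts.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (token, 1) pairs for one weakly structured text record.

    Cleansing assumption: tokens that are not purely alphabetic
    (URLs, smileys, numbers, words with attached punctuation) are dropped.
    """
    return [(token.lower(), 1) for token in record.split() if token.isalpha()]


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce sequentially over an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    sample = ["Big data is big", "testing big data"]
    print(run_job(sample))
```

In a real cluster the map and reduce steps would run in parallel on different nodes over blocks stored in HDFS; the sketch only mirrors the data flow, which is often enough to unit-test the map and reduce logic on small samples before running it at scale.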
NoSQL queries are customized to the type of data required to be analyzed and fetched. A big picture of how BIG DATA functions is given below:

Fig. 2. Functional Model of Big Data

Looking at this big picture of BIG DATA, the following aspects in testing should be given due consideration:

- Gathering test requirements
- Collecting big test data
- Availability of test data for the environment
- Veracity of the patterns of usage in case of usability testing
- Security of these data
- Stress caused on the system due to the workload and volume of data
- Rate of scaling up of the data storage media
- Performance issues when an application receives such an unanticipated volume of data from a variety of sources

The following sections throw light on some of these challenges.

2. SPECIALIZED TESTING WITH BIG TEST DATA

Test architects and test engineers are not far from handling Big Test Data (BTD). Though at present we handle clean, structured data with frozen design and code and successfully complete our test cycles, it is not going to be the same in the near future. Here are two specialized types of testing where BTD will be a real challenge for us too.

2.1 Big Test in Data Warehouse Test

A data warehouse is by itself a heterogeneous collection of relational tables, and is considered complex in most cases. A data warehouse with 27 billion rows is really big and is tested for its completeness, quality, scalability, integration, acceptance, etc. [2]. All these traits are tested in a controlled environment in order to ensure that data mining and analysis of data in the DW happen properly. But how is this possible with an uncontrollable environment and uncontrollable data? Here are a few differences.

Fig. 3. Comparison of Data Warehouse Test and Big Data Test

Data Warehouse Test | Big Data Test
Clean data | Unclean data
Simplified, structured, semantic data | Complex, unstructured, non-semantic data
Structured database schema | Customized instant schema generated
Data from relational database, SQL queried data | Data from non-relational flat file storage, different storage formats, NoSQL queried data
Specific business rules, transformation rules and design rules are applied | No specific business rules are applied
Change in code and data is known and defined | Changes are unanticipated and occur with high velocity

Data warehouse analytics and testing are BI processes that will require higher-level testing strategies, processes and tools when Big Data comes into the picture. How big the data is remains a comparative perception. Big data has been there since the late 1980s, and every time it got bigger with the explosion of technological advancements. So the perception of how test strategies are viewed must change, and processes must be redefined. There are three things to be considered:

a) Make Big things Simple: Instead of discussing big solutions for big test data, it is better to organize the big warehouse into simpler units that are easily testable. The tests for completeness and quality begin here. Once the Big data warehouse is logically or conceptually compartmentalized, the testing power increases [3]. As BIG DATA normally prefers distributed and parallel computing, this strategy of divide and test will improve our testing processes.

b) Normalize Design & Tests: Though NoSQL operates well with non-relational schemas, normalization of the customized schema structures is mandatory for successful DWT with BTD. This begins with Hadoop's programming platforms, in which the BIG DATA are tailored to the business functionality and requirements. When the dynamic schemas are normalized at design level, it paves a better way for generating normalized Big Test Data.

c) Measure the 4-Vs: The last and most important aspect of BIG DATA warehouse testing is to measure and monitor the 4-Vs. Veracity is ensured by the cleaning and normalization of the data; Variety and Volume of the BIG DATA are exercised through scalability testing; and Velocity is the measure of the rate of change in the BIG DATA warehouse. The data warehouse test environments are designed to handle these four Vs with utmost priority.

2.2 Big Test in Performance Testing

Performance testing has been an essential and integral part of system testing, which deals with volumes, workload (users and transactions), real-time scenarios and navigation/behavioural patterns of end-users. The performance of a system depends on various factors and figures, such as web servers, database servers, hosting servers, network, hardware, number of peak loads and prolonged workloads. How can these be addressed in Big Test Data from the perspective of a system's performance?

Fig. 4. Big Test Data for Performance Testing (figure labels: Virtual Users in the Cloud; Parallel Big Data Execution, Group1 to Group6; Distributed Testers' Machines with Controllers & VuGen; Web/Application Servers; Big Data Storage Layer)

Three things need consideration:

1. Distributed and parallel workload distribution testing should be conducted in parallel in a distributed environment. This being the basic concept of BIG DATA storage and computation, testing it in BIG terms should follow the same strategy. Since users' transaction pattern sets are also large and voluminous, and proportional to the velocity of incoming data, the recorded scripts are also distributed among the controllers to simulate a real-time environment.

2. The performance test strategies for load test, stress test and endurance test depend highly on the scenario set for the controller. The spreadsheets and the backend databases that hold our test data are comparatively less competent to hold unstructured BIG DATA. To execute these scenarios, the controller should have an interface to integrate with the already distributed BTD. Hence the challenge.
3. The distributed VuGen, controllers and monitors execute the test scenarios, interacting with the Big Test Data in the storage layer and the virtual users distributed in the cloud. The test executions by the virtual users run in parallel, which offers a better way to handle the Big test execution. When the execution of these Big scenarios with Big Test Data is successful, the test results in terms of reports, charts and graphs are again too big. The real challenge lies in interpreting the results and identifying the bottlenecks and the areas that require performance tuning.

Whether the performance testing tools that we have now can meet the challenges of Big Test Data for performance testing is quite uncertain. As the tool vendors and the Big Data non-functional test teams are in the infant stages, a lot more R&D needs to be done in these areas. However, a practical rollout model of a Big Test environment for non-functional testing is not too far off.

3. BIG TEST DATA MANAGEMENT

With these two types of tests, DWT and PT, for Big Data, one can understand how complicated it is to manage the Big Test Data (BTD) in a dynamic, data-flooded environment. At this infant stage, having real-time BTD at hand for DWT or PT is impossible due to the significant sensitivity of BIG DATA. But how can BTD be obtained and managed during the automated testing processes? How can acquiring and managing BTD be foreseen at the initial phases and during the execution phases of the test life cycle? A few thoughts on these:

a) Planning & Design: When dealing with BTD, planning and designing a test environment and strategy has to be prioritized. Automated tests conventionally involve recording and playback. However, refining and customizing the recorded scripts requires technical expertise, and the biggest bottleneck in using scripts is that they cannot be scaled up to test big data [4]. Scaling up Big Test Data sets without proper planning and design will lead to delayed response times, which might result in timed-out test execution. In order to resolve the scaling-up issue with BTD, action based testing (ABT) is proposed [4]. In ABT, tests are treated as actions in a test module. Each action is identified by a keyword, along with the parameters required for executing the test. Ensure that the test modules are unambiguous and unique, so that the actions are well managed and non-redundant. This approach is at an infant level, and needs POCs to be done in a BTD environment.

b) Infrastructure Setup: This is unique to projects and companies. However, a generalized, tailor-made infrastructure framework is what is needed. Since test automation consumes large resources in generating workloads, dedicated servers and machines are allotted for individual test cycles. Virtual parallelism supports parallel execution of test scenarios for each test cycle on different virtual machines [4]. In this way, generation of higher workload in the case of performance testing for BIG DATA can be handled effectively. However, investments in such servers are quite costly for not-so-big companies that deal with big data. One universal solution is to rent infrastructure as IaaS offered through the Cloud. Requesting an allocation of big test data and running large test executions in the Cloud is an optimum way of making BIG DATA testing effective and efficient.

c) Manage: Effective test automation and efficient test execution are two equally significant facets of Big Test Data Management. Due to the extreme dynamism and heterogeneity of Big data, BTD setup is a challenging task in the aspects of test coverage, accuracy and the types of big test data. The roles of test architects and leads will be crucial in setting up an environment for effective test automation. Tool selection, monitor installation, metrics collection and report generation are factors that govern test execution. With the infrastructure setup discussed above, carrying out the test execution with an appropriate tool has to be the focus of the Big Data test strategy. Metrics collected during execution are then reported in the form of charts and voluminous test results. Interpreting these results is again going to be a challenge for the testing team. Managing the entire Big Data test life cycle is more challenging and involves more research into still unexplored areas.

4. NOT A DISTANT CHIMERA - CONCLUSION

Big Data has the potential to revolutionize not just research, but also implementation practice and learning. Testing teams are in no way excluded from handling big data. Though there are frameworks like Hadoop, NoSQL and new programming platforms to handle Big data in development, testers are having a hard time finding optimized solutions, tools and frameworks to test the BIG DATA. The testing processes, customized test frameworks and testing tools used in various specialized types of testing will need a major revision when dealing with BIG DATA, and that is no distant hallucination. The challenges and discussions presented in this paper are limited to current literature studies. Even wider potential risks and challenges will arise as we start working with Big Test Data. What was once called Garbage Data is today termed Big Data.
Nothing is wasted, nothing is deleted or removed. Everything is important for the business, for decision making and for the future of the organization. The future is not far; it is tomorrow.

References

1. Anne & Peter et al., Understanding System and Architecture of Big Data, IBM Research Technical Report, April
2. SDSS-III: Massive Spectroscopic Surveys of the Distant Universe, the Milky Way Galaxy, and Extra-Solar Planetary Systems. Jan Available at
3. Ravishankar Krishnan, QA Strategy for Large Data Warehouse, Report, Intellisys Technology
Testing Big data is one of the biggest
Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing
More informationW H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
More informationDATAOPT SOLUTIONS. What Is Big Data?
DATAOPT SOLUTIONS What Is Big Data? WHAT IS BIG DATA? It s more than just large amounts of data, though that s definitely one component. The more interesting dimension is about the types of data. So Big
More informationTrustworthiness of Big Data
Trustworthiness of Big Data International Journal of Computer Applications (0975 8887) Akhil Mittal Technical Test Lead Infosys Limited ABSTRACT Big data refers to large datasets that are challenging to
More informationKeywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics
More informationINTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
More informationBig Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
More informationBIG DATA-AS-A-SERVICE
White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers
More informationData Refinery with Big Data Aspects
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data
More informationAlexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
More informationManaging Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges
Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and
More informationSpatio-Temporal Networks:
Spatio-Temporal Networks: Analyzing Change Across Time and Place WHITE PAPER By: Jeremy Peters, Principal Consultant, Digital Commerce Professional Services, Pitney Bowes ABSTRACT ORGANIZATIONS ARE GENERATING
More informationThe 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
More informationInternational Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: simmibagga12@gmail.com
More informationRecommendations for Performance Benchmarking
Recommendations for Performance Benchmarking Shikhar Puri Abstract Performance benchmarking of applications is increasingly becoming essential before deployment. This paper covers recommendations and best
More informationUSING BIG DATA FOR INTELLIGENT BUSINESSES
HENRI COANDA AIR FORCE ACADEMY ROMANIA INTERNATIONAL CONFERENCE of SCIENTIFIC PAPER AFASES 2015 Brasov, 28-30 May 2015 GENERAL M.R. STEFANIK ARMED FORCES ACADEMY SLOVAK REPUBLIC USING BIG DATA FOR INTELLIGENT
More informationChapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:
Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationInternational Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology
International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 BIG DATA: A New Technology Farah DeebaHasan Student, M.Tech.(IT) Anshul Kumar Sharma Student, M.Tech.(IT)
More informationIndian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved
Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved Perspective Big Data Framework for Healthcare using Hadoop
More informationTesting 3Vs (Volume, Variety and Velocity) of Big Data
Testing 3Vs (Volume, Variety and Velocity) of Big Data 1 A lot happens in the Digital World in 60 seconds 2 What is Big Data Big Data refers to data sets whose size is beyond the ability of commonly used
More informationImproving Data Processing Speed in Big Data Analytics Using. HDFS Method
Improving Data Processing Speed in Big Data Analytics Using HDFS Method M.R.Sundarakumar Assistant Professor, Department Of Computer Science and Engineering, R.V College of Engineering, Bangalore, India
More informationBeyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.
Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has
More informationAssociate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
More informationQLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM
QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment
More informationBig Data on Microsoft Platform
Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4
More informationThe Next Wave of Data Management. Is Big Data The New Normal?
The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationOnX Big Data Reference Architecture
OnX Big Data Reference Architecture Knowledge is Power when it comes to Business Strategy The business landscape of decision-making is converging during a period in which: > Data is considered by most
More informationFoundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources
More informationPOLAR IT SERVICES. Business Intelligence Project Methodology
POLAR IT SERVICES Business Intelligence Project Methodology Table of Contents 1. Overview... 2 2. Visualize... 3 3. Planning and Architecture... 4 3.1 Define Requirements... 4 3.1.1 Define Attributes...
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationPerformance Testing of Big Data Applications
Paper submitted for STC 2013 Performance Testing of Big Data Applications Author: Mustafa Batterywala: Performance Architect Impetus Technologies mbatterywala@impetus.co.in Shirish Bhale: Director of Engineering
More informationBig Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools
More informationBig Data - Infrastructure Considerations
April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright
More informationChapter 6. Foundations of Business Intelligence: Databases and Information Management
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationA Study on Big-Data Approach to Data Analytics
A Study on Big-Data Approach to Data Analytics Ishwinder Kaur Sandhu #1, Richa Chabbra 2 1 M.Tech Student, Department of Computer Science and Technology, NCU University, Gurgaon, Haryana, India 2 Assistant
More informationBIG DATA CHALLENGES AND PERSPECTIVES
BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,
More informationBig Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases
Big Data and Transactional Databases Exploding Data Volume is Creating New Stresses on Traditional Transactional Databases Introduction The world is awash in data and turning that data into actionable
More informationThere s no way around it: learning about Big Data means
In This Chapter Chapter 1 Introducing Big Data Beginning with Big Data Meeting MapReduce Saying hello to Hadoop Making connections between Big Data, MapReduce, and Hadoop There s no way around it: learning
More informationSoftware as a Service (SaaS) Testing Challenges- An Indepth
www.ijcsi.org 506 Software as a Service (SaaS) Testing Challenges- An Indepth Analysis Prakash.V Ravikumar Ramadoss Gopalakrishnan.S Assistant Professor Department of Computer Applications, SASTRA University,
More informationAnalytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
More informationTAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP
Pythian White Paper TAMING THE BIG CHALLENGE OF BIG DATA MICROSOFT HADOOP ABSTRACT As companies increasingly rely on big data to steer decisions, they also find themselves looking for ways to simplify
More informationArchitecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics
More informationNoSQL and Hadoop Technologies On Oracle Cloud
NoSQL and Hadoop Technologies On Oracle Cloud Vatika Sharma 1, Meenu Dave 2 1 M.Tech. Scholar, Department of CSE, Jagan Nath University, Jaipur, India 2 Assistant Professor, Department of CSE, Jagan Nath
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationManifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationLuncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
More informationData Warehouse and Business Intelligence Testing: Challenges, Best Practices & the Solution
Warehouse and Business Intelligence : Challenges, Best Practices & the Solution Prepared by datagaps http://www.datagaps.com http://www.youtube.com/datagaps http://www.twitter.com/datagaps Contact contact@datagaps.com
More informationAgile Business Intelligence Data Lake Architecture
Agile Business Intelligence Data Lake Architecture TABLE OF CONTENTS Introduction... 2 Data Lake Architecture... 2 Step 1 Extract From Source Data... 5 Step 2 Register And Catalogue Data Sets... 5 Step