An Idea of testing BIG... Challenges and approaches for testing Big Data
Testing Big Data. Capgemini India Private Limited, Sep 2013. Prepared by: Renuka Kale
Table of Contents
1. Abstract
2. What is Big Data
3. From Big Data to Big Testing
4. 4Vs of Big Data
5. Big testing approaches
6. Big Data Landscape
7. Big Hypothesis
8. Big Data limitations
9. Conclusion
Author's Biography
1. Abstract
Nowadays social media sites are creating a great buzz. Handling such a large volume of social media exchange requires equally strong technologies, and this is where Big Data comes into the picture. Big Data is the talk of the town these days because of its varied uses, ranging from social media sites to large banking firms, telecom, healthcare and so on. Earlier computer systems were large, but with the mobile revolution, mobile apps make e-interaction very easy and an enormous amount of data gets pumped in. As per one report, as part of this digital world we generate more than 200 exabytes of information every year. According to Intel, each internet minute sees 100,000 tweets, 277,000 Facebook logins, 204 million email exchanges, and 2 million search queries. Website visits, touch points, ad impressions, video views, online community discussions and the like also create an enormous amount of data. This data tells us about customer behaviours, intents, and preferences. We can combine that raw digital data with data from other sources such as call centre logs, transaction histories, and in-person interactions. It can also be mixed with publicly available data on demographics, weather forecasts and the economy.
Big data is a huge amount of information gathered from non-traditional sources such as blogs, social media, emails, video footage, photos, etc. It is typically scattered, unstructured and voluminous, yet it can be greatly useful in business intelligence: in analyzing ongoing business trends, shopping trends, sentiment, and people's likes and dislikes. With the help of the results drawn from this big data, companies benefit immensely in formulating their upcoming plans, devising their strategies, changing their approaches, and capturing promising business areas. Companies can better understand their customers and thus tailor their offers to customer needs.
So when we talk about Big Data, it is imperative to talk about its testing aspect and the vital phases involved in testing it. Would it suffice to follow conventional testing measures while testing Big Data? What challenges does it involve? What would be the best approaches? What are the limitations and opportunities? These questions clamour for attention as soon as we start thinking about testing big data. This paper tries to elaborate and surface these aspects, which call for further discussion. Read on.
2. What is Big Data
Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions." Big data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers". What is considered "big data" varies depending on the capabilities of the organization managing the set, and on the capabilities of the applications that are traditionally used to process and analyze the data set in its domain. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
In a digital world, we can easily try many cost-effective versions of ads, landing pages, web applications, messages, and other digital touch points, running A/B tests between our original control versions and new challenger versions that try out hypotheses. In marketing strategy, small steps can be taken to experiment on a few customers on a pilot basis to validate a hypothesis. If it fails, there is only a little risk involved, and if it succeeds, it becomes the new mantra. Either way, it is a win-win situation.
Because it's so easy to run tests like this in the digital medium, companies such as Google and Amazon, businesses that grew up natively in the digital world, have deeply embraced experimentation as part of their culture. The relatively low success rate of individual experiments may have held back many companies from harnessing the power of marketing experimentation. In pre-digital days, such experimentation was much more costly, and big bets were either spectacular successes or disastrous failures. Nowadays such experiments have become a trend, and innovative strategies are being devised to derive important information from users.
3. From Big Data to Big Testing
As Greg Linden, who led a set of experiments at Amazon, stated in an article on big data in The Atlantic: "To find high impact experiments, you need to try a lot of things. Genius is born from a thousand failures. In each failed test, you learn something that helps you find something that will work. Constant, continuous, ubiquitous experimentation is the most important thing." Such ubiquitous experimentation in search of big wins can be labeled "big testing", the natural complement to big data.
The following things make Big testing a Big deal:
1. Experimentation with new, innovative ideas to try something that may fail. Risk is mitigated by testing such ideas on a small scale. Small to moderate risk taking is necessary, setting doubts aside.
2. Experimentation must have a vision and should be properly managed. Experiments are bound to fail at times, but they should not be identified as failures; rather, they should be recognized for their continued and aggressive efforts. While experimenting, a proper goal must be set and its progress properly monitored.
3. More and more people are involved in the experimentation. If experiments are first run for a small group of people and then slowly scaled up so that more and more people are engaged, the results might be extraordinary.
4. Inferences are drawn from experimentation. An experiment should be called complete only when proper conclusions are drawn from it. If it failed, the lessons learnt, the changes required in the next experiment and other related factors can be decided.
5. Creating base data from experiments done by other bodies. Experiments done by other organizations must also be studied and referred to, to get a baseline that is readily available.
Big data is an amazing source of new hypotheses for marketing. However, it will have its value in the true sense only if it is properly tested.
Like big data, big testing is a native approach to marketing in a digital world. Big testing can surely bring in more productive results and help draw decisive findings for the marketing world.
4. 4Vs of Big Data
Four important factors are involved in Big Data testing: Volume, Velocity, Variety and Value.
Volume: Big data involves large amounts of data to be analyzed, coming from various sources. Data volume is increasing day by day, ranging from a few dozen terabytes to many petabytes of data in a single data set. One of the fundamental defining characteristics of Big Data environments is that they involve extremely large data volumes. Big Data environments based on technologies such as the Hadoop Distributed File System (HDFS) sometimes scale out to petabytes of data running across thousands of distributed processors. Internet companies such as Google, Yahoo and Facebook have been pioneers in the use of Big Data technologies and routinely store hundreds of terabytes and even petabytes of data on their systems. Facebook's Hadoop Big Data cluster, for instance, scales out to a staggering 30 petabytes of data, making it one of the biggest Big Data implementations on the planet. Pharmaceutical companies and financial services companies also routinely collect, process, and analyze terabytes of data in their Big Data environments.
Challenges in Volume testing:
- Big data involves a huge amount of data to deal with. Thus, 100% coverage is not possible, and very good data sampling techniques are needed.
- Data files are stored in various locations, so consolidation of data is a challenge.
- Performance of the data processing is an important factor. When we query against such huge data, performance is bound to degrade. The time required for data processing and the server response time are vital factors.
- Data files are stored on HDFS.
Approach for Volume testing:
- Requirement analysis: understanding the business requirement and accordingly identifying the areas where data sampling is required.
- Use of data sampling techniques and data extraction tools. Categorization of the data to be tested.
- Use of traditional techniques such as Boundary Value Analysis and Equivalence Partitioning.
- Converting raw data into useful test data to compare with the actual data.
Velocity: Traditionally, data updates used to happen weekly, bi-weekly, monthly, quarterly and the like, based on the user requirements, but the periodicity was fixed and consistent. In Big Data, however, data is continuously updating, almost in real time. Thus, gathering and storing this data is also important. One of the key elements of Big Data is data velocity, or the speed at which new data is processed and analyzed by an organization. The Internet, e-commerce, mobile devices and social media technologies are allowing organizations to collect more real-time information on customers and transactions than ever before. Online retailers and financial services companies, for instance, can these days compile extremely detailed customer profiles and behavioural patterns by tracking and monitoring the online transactions and other interactions of their customers. In order to derive benefit from such data, businesses are increasingly looking for technologies that allow them to tap and analyze fast-moving data streams in as near real time as possible. This kind of complex event processing is a crucial component of Big Data environments. Generally, the greater the velocity with which data can be analyzed, the bigger the near-term benefit for the company.
Challenges in Velocity testing:
- A complex scale-up strategy is required.
- Rapidly incrementing data needs to be handled by maintaining benchmarks.
- Simulating a production-like environment.
Approach for Velocity testing:
- Incremental performance testing
- Benchmark testing
Variety: Data coming from various sources may have different forms, structures, and conventions. The same thing might also be represented in different ways. Data may vary from tables and other structured data to free text (tweets). Formatting such data and bringing it together into a streamlined, coherent form is a vital aspect. Unlike RDBMS systems, Big Data environments tend to involve a lot of data collected from a myriad of sources, often in raw form. It's not unusual for Big Data environments to contain data that is repetitive, incomplete, unverifiable or just not useful for any purpose. Data collected from Twitter feeds or Facebook posts, for instance, may offer clues about customer sentiment, but the reliability of such data is often very suspect. The sheer variety and the velocity at which data is collected also pose a major challenge to data veracity in a Big Data environment. In order for the data to be really useful, it has to be clean and reliable. Organizations can sometimes spend well over half their time and effort simply cleaning up Big Data and staging it for future use.
Challenges in Variety testing:
- Scrutinizing unstructured data is the most challenging thing. Since the data is available in varied forms, it is very difficult to format, categorize and tag it.
- Sampling the data out of voluminous data, based on requirements, is challenging.
- It requires a lot of manual work to draw meaningful data out of a heap of data.
Approach for Variety testing:
- Identifying the source of the data and devising a strategy based on it.
- Bringing / converting data into a structured form and running scripts for sample data comparisons.
- Localization and Globalization testing.
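The second point in the Variety approach, converting data into a structured form and running scripts for sample data comparisons, could be sketched roughly as follows. This is only an illustration: the field names (`user`, `username`, `text`, `message`), the two input formats and the sample records are assumptions made for the example, not something taken from a real project.

```python
import csv
import io
import json


def normalize(record):
    """Map a source-specific record onto one hypothetical common schema."""
    return {
        "user": str(record.get("user") or record.get("username") or "").strip().lower(),
        "text": str(record.get("text") or record.get("message") or "").strip(),
    }


def from_json_lines(raw):
    """Parse newline-delimited JSON (for example, a feed export)."""
    return [normalize(json.loads(line)) for line in raw.splitlines() if line.strip()]


def from_csv(raw):
    """Parse CSV text with a header row (for example, a ticket export)."""
    return [normalize(row) for row in csv.DictReader(io.StringIO(raw))]


def compare(expected, actual):
    """Return the expected records that are missing from the actual output."""
    seen = {(r["user"], r["text"]) for r in actual}
    return [r for r in expected if (r["user"], r["text"]) not in seen]


# Illustrative sample data in two different source formats.
tweets = (
    '{"username": "Asha", "message": "Great service!"}\n'
    '{"username": "Ben", "message": "Too slow"}'
)
tickets = "user,text\nasha,Great service!\ncarl,Refund please\n"

structured = from_json_lines(tweets) + from_csv(tickets)
missing = compare(
    expected=[{"user": "ben", "text": "Too slow"}, {"user": "dana", "text": "Hello"}],
    actual=structured,
)
print(missing)
```

Once every source is reduced to the same shape, the comparison itself becomes an ordinary set lookup, which is what makes sample-based checks on varied data practical.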
Value: The large amount of data being churned out has to be converted into useful information. Thus, transforming raw data into usable information, for internal use or for business purposes, is also a challenge.
Challenges in Value testing:
- Data which is unauthorized or unreliable may still prove decisive.
- Data may be in incomplete form.
- Data which is of no use for the current business requirement needs to be filtered out. Doing so from unformatted data is difficult.
Approach for Value testing:
- Filtering out the required data using data extraction tools
- Targeting the business requirements and focusing on sampling out data based on them
- Verification and validation testing
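Sampling comes up repeatedly in the approaches above, together with Boundary Value Analysis and Equivalence Partitioning. As a minimal sketch, assuming the records arrive as a stream too large to hold in memory, reservoir sampling keeps a uniform random sample of fixed size. The `partition` function and its thresholds are hypothetical equivalence classes chosen only for illustration.

```python
import random


def reservoir_sample(records, k, seed=0):
    """Keep a uniform random sample of k items from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, record in enumerate(records):
        if i < k:
            sample.append(record)
        else:
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


def partition(amount):
    """Hypothetical equivalence classes for a transaction amount."""
    if amount < 0:
        return "invalid"
    if amount == 0:
        return "zero"
    return "small" if amount < 1000 else "large"


# Synthetic stream standing in for a large data set (made-up values).
records = ({"id": i, "amount": (i * 37) % 2500 - 100} for i in range(100_000))
sample = reservoir_sample(records, k=500)
classes = {partition(r["amount"]) for r in sample}
print(len(sample), sorted(classes))

# Boundary values for the partitions above, checked explicitly.
boundaries = {-1: "invalid", 0: "zero", 1: "small", 999: "small", 1000: "large"}
for value, expected in boundaries.items():
    assert partition(value) == expected
```

The sample gives broad coverage cheaply, while the explicit boundary checks cover the edges that a random sample may well miss.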
5. Big testing approaches
Cloud resourcing: It is impractical to handle big data testing alone; availing cloud resources for this task would be helpful.
Predictive models based testing: Predictive modeling is the process by which a model is created or chosen to try to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to try to guess the probability of an outcome given a set amount of input data, for example, given an email, determining how likely it is to be spam.
Incremental testing approach: The incremental build model is a method of software development where the model is designed, implemented and tested incrementally (a little more is added each time) until the product is finished. It involves both development and maintenance. The product is defined as finished when it satisfies all of its requirements. This model combines the elements of the waterfall model with the iterative philosophy of prototyping. Traditional testing types such as DWH testing, Performance testing, Security testing etc. can be adapted to big data.
A/B Testing: A/B testing is a methodology in advertising of using randomized experiments with two variants, A and B, which are the control and treatment in the controlled experiment. A/B testing allows you to generate your own data. That is, organizations can be proactive with regard to data management. This is in stark contrast to the practices of far too many companies that rely almost exclusively on much more reactive data management. A/B testing is hardly a panacea. A site that gets 50,000 unique hits per day can reasonably split its audience into two and, in the end, feel confident that any results are genuine. Consider the famous quote by Steve Jobs: "It's really hard to design products by focus groups. A lot of times, people don't know what they want until you show it to them." (Business Week, May)
Failover Testing: Failover testing is an important area in Big Data implementation, with the objective of validating the recovery process and ensuring that data processing continues seamlessly when switched to other data nodes.
Machine Learning: Machine learning, in short, refers to computers learning to predict from data. Machine learning has empowered many smart applications. For example, Apple's Siri learns from data to predict the meaning of human speech and the desired answers or actions to be performed. Facebook's photo album learns from data to predict (or recognize) faces to be tagged in photos. LinkedIn learns from data to predict who you want to connect with. Google's driverless car learns from data to predict the appropriate driving actions.
Artificial Intelligence and Machine Learning: In a move that signals a significant step towards automation in the IT services outsourcing business, Infosys has struck a partnership with IPsoft, the New York-based company founded by Indian American Chetan Dube that provides tools that free engineers from mundane, repetitive tasks. The most fascinating and influential aspect of IPsoft's technology is that it includes the element of machine learning (or artificial intelligence, as some call it), so that companies don't have to employ an army of people to write the complicated scripts that traditional automation tools require. The system learns from doing, thus making the process of automation itself automated.
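For the A/B testing approach in this section, one common way to judge whether a difference between control and treatment is genuine is a two-proportion z-test. The paper does not prescribe any particular statistical test, so the sketch below is an assumption on that point, and the conversion counts are made up to match a site that splits 50,000 daily visitors into two equal groups.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: A is the control, B is the challenger.
z, p_value = two_proportion_z_test(conv_a=1000, n_a=25_000, conv_b=1100, n_b=25_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the challenger's lift is unlikely to be chance alone. With much smaller audiences the same relative lift would often not reach significance, which is one reason traffic volume matters for this approach.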
6. Big Data Landscape
Hadoop has the capability to process extremely large volumes of data, much faster and at a fraction of the cost of traditional data systems. Hadoop is an open source data management framework with scale-out storage and distributed processing.
Storage (HDFS):
- Distributed across nodes
- Natively redundant
- Name node tracks locations
Processing (MapReduce):
- Splits a task across processors near the data and assembles the results
- Self healing, high bandwidth clustered storage
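The MapReduce idea of splitting a task and assembling the results can be illustrated with a word count. The sketch below runs in a single Python process and does not use the Hadoop API; the function names and the input splits are assumptions chosen for illustration. In a real cluster each split would be processed on a node close to where its data is stored.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    for line in split:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values for each key into a final count."""
    return {key: sum(values) for key, values in groups.items()}


# Two input splits standing in for blocks stored on different nodes.
splits = [
    ["big data needs big testing"],
    ["testing big data"],
]
mapped = [pair for split in splits for pair in map_phase(split)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

For testing purposes, a small in-process model like this can serve as an independent oracle: run it over a sampled input and compare its output with what the cluster job produced for the same sample.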
7. Big Hypothesis
Big data can definitely give lots of input for sentiment analysis, trend patterns, behavioral patterns, future prospects, customers' inclinations etc. In other words, most of these insights are the seeds of hypotheses. But there is no guarantee that the correlations discovered in big data can directly influence customer behavior. Many factors are involved. For one, although big data by definition means large data, it is unending and so, in a way, always incomplete. Also, when it comes to capturing unformatted data, we cannot apply a specific rule to the kind of data that is being exchanged among people. Thus, putting this raw data into a formatted one definitely has limitations. This is the same data that holds the potential to turn all the equations upside down and start following an absolutely different trend. We surely need to find the best possible ways of streamlining this monstrous data into readable, manageable, properly catalogued and testable formats. Big data is a powerhouse which generates both good data and bad data. The need of the hour is to massage the data, clean it and use it.
8. Big Data limitations
- 100% coverage is not possible; a best-fit plan has to be chosen depending upon the total time, resources, and tools available for testing.
- Big data testing may run into unidentified areas. In other words, unstructured data may get misinterpreted in the course of testing, which may leave testing incomplete / uncovered for that specific area.
- Giraffe Effect: Giraffes are portions of data which dominate the rest of the data and hide important insights. Sometimes they even lead to wrong conclusions. In a simple example of the giraffe effect, when people look at a set of data which includes some very large, dominant members, important differences among the other data in the set often disappear from view.
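The giraffe effect described above can be made concrete with a few numbers. The figures below are invented purely for illustration: two dominant accounts and three small ones. On a linear scale relative to the largest value, the small accounts all look like roughly zero. On a logarithmic scale their differences reappear.

```python
import math

# Hypothetical yearly sales per account; A and B are the "giraffes".
sales = {"A": 9_500_000, "B": 8_700_000, "C": 12_000, "D": 48_000, "E": 3_000}

largest = max(sales.values())
linear_share = {name: value / largest for name, value in sales.items()}
log_scale = {name: math.log10(value) for name, value in sales.items()}

for name in sales:
    print(f"{name}: linear {linear_share[name]:.4f}, log10 {log_scale[name]:.2f}")

# D sells 16 times as much as E, yet both sit below 1% of the linear scale.
ratio_d_to_e = sales["D"] / sales["E"]
print(ratio_d_to_e)
```

When checking reports or dashboards built on big data, it may therefore be worth testing the small segments separately, or on a transformed scale, rather than only alongside the dominant ones.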
9. Conclusion
Big testing, if properly harnessed and tried out with innovative ideas, can lead to big changes in marketing trends. It can also help better the business, giving it new shape and zeal. Big data is of the people, created by the people, and useful for the people. Nowadays we get features customized to our preferences, we get choices based on our selections, and some websites conduct surveys on the basis of which we get interesting deals in our mailbox. Big testing can certainly highlight uncovered areas that businesses can target to provide more usable services to customers. Since people are the creators and contributors of this big data, they can bring more ideas into testing it. As there is no limit to data, there is no limit to creativity either. People well equipped with proper tools and technologies can play a vital role in big data testing. Hal Varian, the chief economist at Google, has said that Google runs about 10,000 experiments each year. A large number of different people throughout the company are engaged in all kinds of different tests in parallel. This culture is now being followed by many companies. There is a real need to start experimenting: to put forth our hypotheses, agree or disagree with them, pass them to the next level to formulate a test structure, and integrate all modules into one to give a desirable result to big testing. As far as infrastructure is concerned, cloud resourcing is the best possible option, wherein different tests can be simulated. Who owns the data? What are the challenges involved in big data? Is this data useful for us? The list of questions is endless. And the answer is Big Testing!
Author's Biography
Renuka Kale is working as a consultant for RTQA, Morgan Stanley account, in Capgemini India Ltd since 4th Aug. Renuka has around 10 years of experience spanning the Govt sector and IT, of which around 6 years are in software testing. While working with the Govt of Maharashtra, she presented a paper, "Women's empowerment in water sector", at an international water conference, Water Asia.
DATAOPT SOLUTIONS What Is Big Data? WHAT IS BIG DATA? It s more than just large amounts of data, though that s definitely one component. The more interesting dimension is about the types of data. So Big
More informationTutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data
More informationANALYTICS BUILT FOR INTERNET OF THINGS
ANALYTICS BUILT FOR INTERNET OF THINGS Big Data Reporting is Out, Actionable Insights are In In recent years, it has become clear that data in itself has little relevance, it is the analysis of it that
More informationBig Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
More informationUnlocking The Value of the Deep Web. Harvesting Big Data that Google Doesn t Reach
Unlocking The Value of the Deep Web Harvesting Big Data that Google Doesn t Reach Introduction Every day, untold millions search the web with Google, Bing and other search engines. The volumes truly are
More informationGrabbing Value from Big Data: The New Game Changer for Financial Services
Financial Services Grabbing Value from Big Data: The New Game Changer for Financial Services How financial services companies can harness the innovative power of big data 2 Grabbing Value from Big Data:
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More informationTransforming Big Data Into Smart Advertising Insights. Lessons Learned from Performance Marketing about Tracking Digital Spend
Transforming Big Data Into Smart Advertising Insights Lessons Learned from Performance Marketing about Tracking Digital Spend Transforming Big Data Into Smart Advertising Insights Lessons Learned from
More informationKeywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop
Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning
More informationOPTIMIZING PERFORMANCE IN AMAZON EC2 INTRODUCTION: LEVERAGING THE PUBLIC CLOUD OPPORTUNITY WITH AMAZON EC2. www.boundary.com
OPTIMIZING PERFORMANCE IN AMAZON EC2 While the business decision to migrate to Amazon public cloud services can be an easy one, tracking and managing performance in these environments isn t so clear cut.
More informationCSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait
CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems
More informationAt a recent industry conference, global
Harnessing Big Data to Improve Customer Service By Marty Tibbitts The goal is to apply analytics methods that move beyond customer satisfaction to nurturing customer loyalty by more deeply understanding
More informationDelivering new insights and value to consumer products companies through big data
IBM Software White Paper Consumer Products Delivering new insights and value to consumer products companies through big data 2 Delivering new insights and value to consumer products companies through big
More informationReaping the Rewards of Big Data
Reaping the Rewards of Big Data TABLE OF CONTENTS INTRODUCTION: 2 TABLE OF CONTENTS FINDING #1: BIG DATA PLATFORMS ARE ESSENTIAL FOR A MAJORITY OF ORGANIZATIONS TO MANAGE FUTURE BIG DATA CHALLENGES. 4
More informationProfitable vs. Profit-Draining Local Business Websites
By: Peter Slegg (01206) 433886 07919 921263 www.besmartmedia.com peter@besmartmedia.com Phone: 01206 433886 www.besmartmedia.com Page 1 What is the Difference Between a Profitable and a Profit-Draining
More informationWe are Big Data A Sonian Whitepaper
EXECUTIVE SUMMARY Big Data is not an uncommon term in the technology industry anymore. It s of big interest to many leading IT providers and archiving companies. But what is Big Data? While many have formed
More informationSocial Media and the Data Management Platform. Understanding Data-Driven Social Media Marketing
Social Media and the Data Management Platform Understanding Data-Driven Social Media Marketing 1 Discover the Benefits of Powering Your Social Media Marketing Efforts with Data In 2013 it became clear
More informationBIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS
BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS Megha Joshi Assistant Professor, ASM s Institute of Computer Studies, Pune, India Abstract: Industry is struggling to handle voluminous, complex, unstructured
More informationDigital Marketing Capabilities
Digital Marketing Capabilities Version : 1.0 Date : 17-Apr-2015 Company Framework Focus on ROI 2 Introduction SPACECOS is a leading IT services and marketing solutions provider. We provide the winning
More informationHOW THE DATA LAKE WORKS
HOW THE DATA LAKE WORKS by Mark Jacobsohn Senior Vice President Booz Allen Hamilton Michael Delurey, EngD Principal Booz Allen Hamilton As organizations rush to take advantage of large and diverse data
More informationBig Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014
White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page
More informationBIG DATA: IT MAY BE BIG BUT IS IT SMART?
BIG DATA: IT MAY BE BIG BUT IS IT SMART? Turning Big Data into winning strategies A GfK Point-of-view 1 Big Data is complex Typical Big Data characteristics?#! %& Variety (data in many forms) Data in different
More informationBIG DATA: BIG BOOST TO BIG TECH
BIG DATA: BIG BOOST TO BIG TECH Ms. Tosha Joshi Department of Computer Applications, Christ College, Rajkot, Gujarat (India) ABSTRACT Data formation is occurring at a record rate. A staggering 2.9 billion
More informationBanking On A Customer-Centric Approach To Data
Banking On A Customer-Centric Approach To Data Putting Content into Context to Enhance Customer Lifetime Value No matter which company they interact with, consumers today have far greater expectations
More informationBIG DATA CHALLENGES AND PERSPECTIVES
BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,
More informationHOW TO ACCURATELY TRACK YOUR SOCIAL MEDIA BUZZ
TIP SHEET HOW TO ACCURATELY TRACK YOUR SOCIAL MEDIA BUZZ Ten years ago, marketers had to rely primarily on customer surveys and mainstream media coverage to track the buzz created by a new product launch
More informationTable of Contents. Copyright 2011 Synchronous Technologies Inc / GreenRope, All Rights Reserved
Table of Contents Introduction: Gathering Website Intelligence 1 Customize Your System for Your Organization s Needs 2 CRM, Website Analytics and Email Integration 3 Action Checklist: Increase the Effectiveness
More informationBig Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014
Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions
More informationData Aggregation and Cloud Computing
Data Intensive Scalable Computing Harnessing the Power of Cloud Computing Randal E. Bryant February, 2009 Our world is awash in data. Millions of devices generate digital data, an estimated one zettabyte
More informationDigital Analytics Checkup:
Digital Analytics Checkup: How to evaluate the impact of your web analytics data A Digital Marketing Depot White Paper Executive Summary Marketing organizations are being inundated with a greater volume,
More informationUbuntu: helping drive business insight from Big Data
WHITE PAPER Ubuntu: helping drive business insight from Big Data February 2012 Copyright Canonical 2012 www.canonical.com Executive introduction For years, web giants such as Facebook, Google and ebay
More informationA financial software company
A financial software company Projecting USD10 million revenue lift with the IBM Netezza data warehouse appliance Overview The need A financial software company sought to analyze customer engagements to
More informationUbuntu and Hadoop: the perfect match
WHITE PAPER Ubuntu and Hadoop: the perfect match February 2012 Copyright Canonical 2012 www.canonical.com Executive introduction In many fields of IT, there are always stand-out technologies. This is definitely
More informationBig Data & Tourism. Rajendra Akerkar
Big Data & Tourism Rajendra Akerkar Technomathematics Research Foundation TMRF Report 11 2012 Big Data & Tourism To promote innovation and increase efficiency in the Tourism sector TMRF-report-11-2012
More information!!!!! BIG DATA IN A DAY!
BIG DATA IN A DAY December 2, 2013 Underwritten by Copyright 2013 The Big Data Group, LLC. All Rights Reserved. All trademarks and registered trademarks are the property of their respective holders. EXECUTIVE
More informationDriving growth through transformation driven by data Role of IT driven analytics in enterprises
Driving growth through transformation driven by data Role of IT driven analytics in enterprises In the highly competitive global economy, consistent and continuous value creation and value realization
More informationBig Data Are You Ready? Thomas Kyte http://asktom.oracle.com
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationMeasure Social Media like a Pro: Social Media Analytics Uncovered SOCIAL MEDIA LIKE SHARE. Powered by
1 Measure Social Media like a Pro: Social Media Analytics Uncovered # SOCIAL MEDIA LIKE # SHARE Powered by 2 Social media analytics were a big deal in 2013, but this year they are set to be even more crucial.
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationBig Data Introduction, Importance and Current Perspective of Challenges
International Journal of Advances in Engineering Science and Technology 221 Available online at www.ijaestonline.com ISSN: 2319-1120 Big Data Introduction, Importance and Current Perspective of Challenges
More informationHow To Use Big Data Effectively
Why is BIG Data Important? March 2012 1 Why is BIG Data Important? A Navint Partners White Paper May 2012 Why is BIG Data Important? March 2012 2 What is Big Data? Big data is a term that refers to data
More informationMachine Data Analytics with Sumo Logic
Machine Data Analytics with Sumo Logic A Sumo Logic White Paper Introduction Today, organizations generate more data in ten minutes than they did during the entire year in 2003. This exponential growth
More informationIndian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved
Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved Perspective Big Data Framework for Healthcare using Hadoop
More informationDoing Multidisciplinary Research in Data Science
Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May
More informationHow To Understand The Benefits Of Big Data
Findings from the research collaboration of IBM Institute for Business Value and Saïd Business School, University of Oxford Analytics: The real-world use of big data How innovative enterprises extract
More informationIt is clear the postal mail is still very relevant in today's marketing environment.
Email and Mobile Digital channels have many strengths, but they also have weaknesses. For example, many companies routinely send out emails as a part of their marketing campaigns. But people receive hundreds
More informationPredicting & Preventing Banking Customer Churn by Unlocking Big Data
Predicting & Preventing Banking Customer Churn by Unlocking Big Data Making Sense of Big Data http://www.ngdata.com Predicting & Preventing Banking Customer Churn by Unlocking Big Data 1 Predicting & Preventing
More informationKeywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics
More informationIJRCS - International Journal of Research in Computer Science ISSN: 2349-3828
ISSN: 2349-3828 Implementing Big Data for Intelligent Business Decisions Dr. V. B. Aggarwal Deepshikha Aggarwal 1(Jagan Institute of Management Studies, Delhi, India, vbaggarwal@jimsindia.org) 2(Jagan
More informationBIG Data Analytics Move to Competitive Advantage
BIG Data Analytics Move to Competitive Advantage where is technology heading today Standardization Open Source Automation Scalability Cloud Computing Mobility Smartphones/ tablets Internet of Things Wireless
More informationMoreketing. With great ease you can end up wasting a lot of time and money with online marketing. Causing
! Moreketing Automated Cloud Marketing Service With great ease you can end up wasting a lot of time and money with online marketing. Causing frustrating delay and avoidable expense right at the moment
More informationTransforming Data into Intelligence UNDERSTANDING CUSTOMERS AND REDUCING CHURN IN TELECOM S BIG DATA ERA A SCALABLE SYSTEMS WHITEPAPER ON TELECOM
Transforming Data into Intelligence UNDERSTANDING CUSTOMERS AND REDUCING CHURN IN TELECOM S BIG DATA ERA A SCALABLE SYSTEMS WHITEPAPER ON TELECOM EXECUTIVE SUMMARY The rapid expansion of device, application
More informationA Hurwitz white paper. Inventing the Future. Judith Hurwitz President and CEO. Sponsored by Hitachi
Judith Hurwitz President and CEO Sponsored by Hitachi Introduction Only a few years ago, the greatest concern for businesses was being able to link traditional IT with the requirements of business units.
More informationSAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise
Frequently Asked Questions SAP HANA Vora SAP HANA Vora : Gain Contextual Awareness for a Smarter Digital Enterprise SAP HANA Vora software enables digital businesses to innovate and compete through in-the-moment
More informationThe Cloud for Insights
The Cloud for Insights A Guide for Small and Medium Business As the volume of data grows, businesses are using the power of the cloud to gather, analyze, and visualize data from internal and external sources
More informationBIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
More information