Big Data Introduction, Importance and Current Perspective of Challenges



International Journal of Advances in Engineering Science and Technology
Available online at www.ijaestonline.com, ISSN: 2319-1120

Vinod Sharma (1), Prof. N. K. Joshi (2), Manisha (3)
Department of Computer Science
(1) Research Scholar, Career Point University, Kota, Rajasthan, India
(2) Director, Modi Institute of Management & Technology, Dadabari, Kota, Rajasthan, India
(3) Asst. Professor, Arya Girls College, Ambala Cantt, Haryana, India
vndsrswt@gmail.com (1), drnkjoshi@gmail.com (2), manisharawat615@gmail.com (3)

Abstract- Man and machine are rapidly generating data. Writing an email, calling, texting, tweeting, streaming audio and video, playing games, shopping online and making reservations all generate data about users' needs and preferences, as well as the quality of their experience, across millions of machines and processes at every moment. Even when we are not using our devices, the network generates time, location, offline records, pending messages and other data that keep services running and ready for the next use. Organizations are starting to appreciate the importance of using more data to support their decisions and strategies. Studies and practical implementation cases suggest that more data usually beats better structure and management. A massive data set gives better output and a wider range of correlations, but working with it can become a challenge because of processing limitations. Organizations have therefore begun to evaluate whether to invest more in processing massive data sets rather than in costly algorithms and techniques. The vision of big data is that organizations will be able to reap and harness every byte of relevant and interrelated data and use it to make the best decisions.
Big data technologies offer not only a solution for massive data sets but, more importantly, the ability to understand them and take advantage of their full value. Handling big data brings many new challenges: real-time communication and decisions create further challenges and issues in analysis, management and processing. This article defines the concept of Big Data and stresses its importance. It also throws some light on other challenges and issues.

Keywords- Massive Data Sets, Big Data, Analytics, Correlations, Scalability.

I. INTRODUCTION

Big data is a broad term for any collection of data sets so massive and complex that they become difficult to process using classical data processing applications. The challenges cover analysis, capture, curation, search, sharing, storage, transfer, visualization, privacy violation, and secure infrastructure for data that is too diverse, fast-changing or massive for conventional technologies, skills and infrastructure to address efficiently. Every second, minute and hour we create millions of bytes of data, far more than in past decades. This data comes from everywhere: social media sites and responses, digital records such as pictures, movies and fingerprint data, climate information from sensors, GPS and GIS data, financial data, transportation, health and hospitality, to name a few. This is big data.

Big data is now a growing torrent. Some examples: 1,639 million mobile users worldwide in 2014 [2], all producing data; 600 TB of data per day processed by Facebook [3]; 10,000 PB of data stored in the NSA data center [4]; 235 terabytes of data collected by the US Library of Congress by April 2011; 10 gigabytes of data that may be transferred every second from the servers of the CERN LHC project; over $80,000 in online sales generated by Amazon; and 2.7 ZB of data in the digital universe today. These are only a few.
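To put the figures quoted above on a common scale, the following minimal sketch converts them to bytes and derives a few comparable rates. It assumes decimal (SI) prefixes, which the text does not specify, and the helper name to_bytes is purely illustrative.

```python
# Decimal (SI) byte prefixes. This is an assumption: the text does not say
# whether decimal or binary units are meant.
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value: float, unit: str) -> int:
    """Convert a quantity such as (600, "TB") to a number of bytes."""
    return int(value * UNITS[unit])


# Figures quoted in the introduction.
facebook_per_day = to_bytes(600, "TB")
nsa_stored = to_bytes(10_000, "PB")
digital_universe = to_bytes(2.7, "ZB")
cern_per_second = to_bytes(10, "GB")

SECONDS_PER_DAY = 24 * 60 * 60

# Express the daily Facebook volume as a per-second rate, and the CERN
# per-second rate as a daily volume, so the two can be compared.
facebook_gb_per_second = facebook_per_day / SECONDS_PER_DAY / UNITS["GB"]
cern_tb_per_day = cern_per_second * SECONDS_PER_DAY / UNITS["TB"]

# Share of the quoted "digital universe" held in the NSA data center.
nsa_share = nsa_stored / digital_universe

print(f"Facebook: about {facebook_gb_per_second:.1f} GB per second")
print(f"CERN: about {cern_tb_per_day:.0f} TB per day if the quoted rate were sustained")
print(f"NSA store as a share of the digital universe: {nsa_share:.2%}")
```

The comparison only rescales the quoted numbers; it says nothing about how the original sources measured them.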
As data expands, big data is capturing its value. Put another way, we live in a voluminous informational environment and are moving towards a knowledge-based environment; a voluminous informational environment supports better decisions once it has been analysed. The data generated by companies, governments and every other entity represents a big space to which a great amount of information is added all the time. Google, Facebook, YouTube, CERN, Twitter, Apple, LinkedIn, Netflix, Expedia, and national and local political campaigns know that the age of Big Data is here, and it's here to stay. The swelling ranks of organizations that increasingly depend on big-data technologies include dozens of familiar names and a growing number you've never heard of.

IJAEST, Volume 4, Number 3, Vinod Sharma et al.

II. DEFINITION OF BIG DATA

There are several definitions of big data. In IBM's view, Big Data has four aspects: volume, velocity, variety and veracity.

Volume: refers to the quantity of data gathered by a company. This data must be used further to obtain important knowledge.
Velocity: refers to the time in which Big Data can be processed. Some activities are very important and need immediate responses, which is why fast processing maximizes efficiency.
Variety: refers to the types of data that Big Data can comprise. This data can be structured as well as unstructured.
Veracity: refers to the degree to which a leader trusts the information used in order to take a decision. Getting the right correlations in Big Data is therefore very important for the future of the business. [4]

[Figure: the four aspects of Big Data]
VOLUME (data at rest): offload or cold data in TB, PB; redundancy elimination; transactions, retention; records, tables, files.
VELOCITY (data in motion): in stream, in batch; analysis to enable decisions within fractions of a second.
VARIETY (data in many forms): structured, unstructured, semi-structured; data access middleware and ETLM.
VERACITY (data in doubt): data uncertainty; managing the reliability and predictability of inherently imprecise data types.

"Big data is the term increasingly used to describe the process of applying serious computing power, the latest in machine learning and artificial intelligence, to seriously massive and often highly complex sets of information." [5]

Big Data: when the size and performance requirements for data management become significant design and decision factors for implementing a data management and analysis system.
For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration. [6]

"Big data (also spelled Big Data) is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates, data that would take too much time and cost too much money to load into a relational database for analysis. Although big data doesn't refer to any specific quantity, the term is often used when speaking about petabytes and exabytes of data." [7]

The definition of Big Data is very fluid, as it is a moving target (what can be easily manipulated with common tools) and specific to the organization (what can be managed and stewarded by any one institution in its infrastructure). One researcher's or organization's concept of a large data set is small to another. [8]

We define Big Data as a cultural, technological, and scholarly phenomenon that rests on the interplay of:
(1) Technology: maximizing computation power and algorithmic accuracy to gather, analyze, link, and compare large data sets.
(2) Analysis: drawing on large data sets to identify patterns in order to make economic, social, technical, and legal claims.
(3) Mythology: the widespread belief that large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy. [9]
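The organization-relative definitions quoted above [6], [8] can be illustrated with a small sketch: whether a data set counts as "big" depends on what a given organization's tools handle comfortably, not on an absolute size. The Organization class, the manageable_bytes field and the example thresholds are hypothetical, chosen only to mirror the "hundreds of gigabytes" versus "tens or hundreds of terabytes" contrast in the text.

```python
from dataclasses import dataclass

GB, TB = 10**9, 10**12  # decimal units, assumed for illustration


@dataclass
class Organization:
    name: str
    # Hypothetical figure: the largest data set the organization's current
    # tools and infrastructure manage without special design effort.
    manageable_bytes: int


def is_big_data(dataset_bytes: int, org: Organization) -> bool:
    """Return True when the data set exceeds what this organization manages comfortably."""
    return dataset_bytes > org.manageable_bytes


small_firm = Organization("small firm", manageable_bytes=100 * GB)
large_lab = Organization("large lab", manageable_bytes=100 * TB)

dataset = 500 * GB
for org in (small_firm, large_lab):
    label = "big data" if is_big_data(dataset, org) else "ordinary data"
    print(f"{org.name}: a 500 GB data set is {label}")
```

The same data set is classified differently for the two organizations, which is the sense in which the definition is a moving target.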

Roger Magoulas of O'Reilly Media first introduced the term Big Data to the computing world in 2005, to describe massive amounts of data that traditional data management techniques cannot manage and process because of their complexity and size. A study on the evolution of Big Data as a research and scientific topic shows that the term was present in research from the 1970s but entered publications in 2008. Nowadays the Big Data concept is treated from different points of view, covering its implications in many fields.

According to MIKE2.0, the open source standard for Information Management, Big Data is defined by its size, comprising a large, complex and independent collection of small data sets, each with the potential to interact. In addition, an important aspect of Big Data is that it cannot be handled with standard data management techniques because of the inconsistency and unpredictability of the possible combinations.

The Oxford English Dictionary (OED) defines big data as "extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions: much IT investment is going towards managing and maintaining big data".

A 2011 big data study by McKinsey highlighted this definitional challenge. Defining big data as "datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze", the McKinsey researchers acknowledged that "this definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data".
As a result, all the quantitative insights of the study, including the updating of the UC Berkeley numbers by estimating how much new data is stored by enterprises and consumers annually, relate to digital data rather than just big data; for example, no attempt was made to estimate how much of the data (or datasets) that enterprises store is big data.

In addition, Gartner's IT Glossary defines Big Data as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

In a simpler definition, we consider Big Data to be an expression covering data sets that are very large, highly complex and unstructured, and that are organized, stored and processed using specific methods and techniques for business processes. Many definitions of Big Data circulate around the world, but we consider the most important one to be the one that each leader gives to their own company's data. The way Big Data is defined has implications for the strategy of a business. Each leader has to define the concept in a way that brings competitive advantage to the company.

III. IMPORTANCE OF BIG DATA

Big data matters chiefly because more data yields more accurate results. Its potential lies in improving efficiency when a large volume of data, of different types and various dimensions, is put to use. If big data is defined properly and used efficiently throughout, an organization gets a better view of its business, which leads to efficiency in areas such as sales, product improvement, health reporting, data selection, statistical sampling and so forth. Big data can be used effectively in the following contexts:

1. Big Data can create significant value by making information transparent and processable. A massive amount of information is still not in digital form, e.g. traditional and classical records, or is sensor-based or non-textual data that is not easily accessible and searchable through networks, which limits efficiency.
2. As organizations create, store and process more transactional data in digital form, they can collect more accurate and detailed efficiency and performance information on everything they record, and so expose variability and boost functionality. Organizations can in fact store, analyse, process and control big data in controlled experiments to make accurate decisions.
3. Big Data allows finer segmentation of records and therefore much more precisely tailored products or services.

4. More data allows more analysis and more accurate results, which improve decision-making, minimize risk, and expose valuable facts from insights that would otherwise remain hidden.
5. Big Data can be used to develop the next generation of products and services.
6. Big Data acts as a bridge across the gap between IT and business.
7. Big Data can create new business models based on legacy and classical data.
8. In getting the right information for your business.
9. In information technology, to improve security and troubleshooting by analyzing the patterns in existing logs and events.
10. In customer service, by using information from call centers to identify customer patterns and thus enhance customer satisfaction by customizing services.
11. In improving services and products through the use of social media content. By knowing potential customers' preferences, a company can modify its product to address a larger group of people.
12. In the detection of fraud in online transactions for any industry.
13. In risk assessment, by analyzing information from transactions on the financial market.
14. Big Data can provide better results and transformative potential in health care, public sector administration, retail, manufacturing and personal location, proving the value of IT to the business.

Big data can do much more than is listed above, but it brings inconsistencies and issues that follow from its broad characteristics, such as Volume, Velocity, Variety, Veracity, Value and many others.

IV. BIG DATA CHALLENGES AND DILEMMAS

Big Data is growing rapidly day by day because of the massive data sets that come from various sources. It also reflects a trend in science and technology, one that opens the door to understanding the data world and its hidden parts for decision making. Volume is not the whole issue of big data; the issue is also the processing, analysis and control of traditional, structured and unstructured data.
Structured data depends on the process and on the type of storage, but data-driven processes need to be distinguished from process-driven data. Today's big data handling requirements are completely different from those of a traditional data platform. Some of the challenges faced today are:

1. Understanding the structure of big data is a major concern before storing, processing and analysing it. According to IBM, volume, velocity, variety and veracity are the important factors, but handling big data requires more (e.g. value). Time is an important factor in transactional analysis, because some analyses must run very frequently in order to capture any change in the environment quickly.
2. Because Big Data is new to organizations, they need to learn how to use each new technology soon after it appears in order to stay competitive.
3. IT specialists must be prepared for handling big data. Only the right talent can handle new technology and interpret the data into meaningful information for the business. According to McKinsey's study "Big data: The next frontier for innovation, competition, and productivity", there is a need for up to 190,000 more workers with analytical expertise and 1.5 million more data-literate managers in the United States alone. These statistics show that a company taking a Big Data initiative has to either hire experts or train existing employees in the new field. [10]
4. Privacy and security are major challenges for big data. Because Big Data consists of a massive amount of complexly structured data, it is very tedious for a company to define security, privacy and authentication levels for each user and process. A big data platform built to serve global business needs must provide security capabilities across that whole range.

5. It is difficult to identify the right data for the right process or user and to determine its best use, and this is very different from the traditional platform.
6. Accessibility and connectivity of big data from third-party data processing points are at an initial phase of development. A majority of data points are not yet connected today, and organizations often do not provide the right platforms for accessing and processing data across the enterprise. Some organizations offer utilities that integrate operational technologies, such as real-time grid management and cloud, with information technologies such as smart metering.
7. Big Data changes rapidly day by day and the data world is evolving fast. IT architecture requires a strong, innovative technology partner that can help it adapt to rapid changes in big data.
8. Major organizations need continuous availability and high scalability from a Big Data platform, so that data never becomes unavailable and can be served from different data centers.
9. The design must be flexible enough for diverse workloads, because the dimensions of big data vary widely.
10. The cost of the platform doesn't have to break the bank.

V. CONCLUSION

Big data is now entering our lives, bringing better processing and better results. It draws on various kinds of technology and provides value through the discovery of knowledge. In this paper we discussed three basic issues: the definition, the importance and the challenges of big data. We discussed which definition suits the current state of big data, then why big data is used, and finally the challenges of acquiring it. Many platforms and challenging situations lie ahead that call for new technologies for both processing and storage. Extensions of traditional database technologies are also discussed as a way to deal with big data.
More technologies can therefore be developed for big data processing and storage to make the data more concise and meaningful.

REFERENCES

[1] IBM, "IBM Big Data." [Online]. Available: http://www-01.ibm.com/software/in/data/bigdata/.
[2] eMarketer, "2 Billion Consumers Worldwide to Get Smart(phones)," eMarketer, 2014. [Online]. Available: http://www.emarketer.com/article/2-billion-consumers-worldwide-smartphones-by-2016/1011694.
[3] P. Vagata and K. Wilfong, "Scaling the Facebook data warehouse to 300 PB," Engineering Blog, Facebook Code, 2014. [Online]. Available: https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/.
[4] R. Jennings, "NetAppVoice: NSA's Huge Utah Datacenter: How Much Of Your Data Will It Store? Experts Disagree," Forbes, 2013. [Online]. Available: http://www.forbes.com/sites/netapp/2013/07/26/nsa-utah-datacenter/.
[5] Microsoft Research, "The Big Bang: How the Big Data Explosion Is Changing the World," 2013. [Online]. Available: http://www.microsoft.com/en-us/news/features/2013/feb13/02-11bigdata.aspx.
[6] J. Guterman, "Release 2.0: Issue 11, Big Data," 2009. [Online]. Available: http://www.oreilly.com/data/free/release-2-issue-11.csp.
[7] R. More, "Big Data Definition," 2014. [Online]. Available: http://searchcloudcomputing.techtarget.com/definition/big-data-big-data.
[8] L. Johnston, "Data is the New Black," 2011. [Online]. Available: http://blogs.loc.gov/digitalpreservation/2011/10/data-is-the-new-black/.
[9] D. Boyd and K. Crawford, "Critical Questions for Big Data," Information, Communication & Society, vol. 15, no. 5, pp. 662-679, 2012.
[10] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A. H. Byers, "Big data: The next frontier for innovation, competition, and productivity," McKinsey Global Institute, 2011. [Online]. Available: http://scholar.google.com/scholar.bib?q=info:kkctazs1q6wj:scholar.google.com/&output=citation&hl=en&as_sdt=0,47&ct=citation&cd=0