"BIG DATA A PROLIFIC USE OF INFORMATION"




Ojulari Moshood
Cameron University - IT4444 Capstone 2013

Abstract: The idea of big data is to make better use of the information generated by individuals in order to remake and improve our businesses, security, health care, and economy. Some people consider big data a meme and a marketing term, but it also opens the door to a new approach to understanding the world and making decisions [7]. Big data presents vast new opportunities. Giving individuals rights to their data turns data into an asset that people own, which in turn gives them the ability to trade it for services [3]. This creates an environment in which people treat their data the way they treat their money. It also enables the next generation of interactive data analysis with real-time answers [6]. The goal of this paper is to inform readers about big data and its benefits.

Keywords: Big Data, Privacy, Security, Big Data Analytics, Data Mining, Data Warehousing, Hadoop.

Introduction: Big data is data that is too big, too fast, or too complex for existing tools to capture, manage, store, and process. People often refer to big data as data from social media or search engines, but that is not the real big data; the real big data comes from sources such as credit cards, photographs, mobile phones, GPS logs, web-browsing trails, network data, sensors, and email [1]. These are sources we tend to neglect, yet they show trends in people's behavior, which is one of the key values of big data. The promise of big data is to help re-engineer the systems our society already has so that they work more efficiently, using information obtained from data analysis. That analysis brings non-negotiable facts into the mix, enabling managers to base vital decisions on solid information accumulated from a rich variety of sources and delivered in real time [3].
Big data faces challenges because current technology cannot handle the velocity, volume, and variety of the data, and because algorithms for analyzing such massive amounts of data are still lacking. There are also many privacy concerns about who will be using the information. Section 2 discusses big data analytics and warehousing and gives examples of how data analytics can help a business maximize profit. Section 3 discusses big data mining and its importance, Section 4 discusses big data security, and Section 5 discusses big data and the privacy issues that arise when individuals' information is shared by multiple sources around the world. Section 6 gives insight into the future of big data and how it will help engineer a more information-driven society.

Big Data Analytics & Warehousing: Big data analytics is a fast-growing and influential practice [5]. It analyzes various data types (e.g., video, GPS tracking, web-browser trails, sensors, email, social media) using advanced analytic techniques to process, clean, and transform structured and unstructured data in order to unearth patterns, hidden correlations, and other useful information. Such information can be used to support decision-making and to improve business performance, security, and so on. Big data analytics can be done using Apache Hadoop, an open-source software framework that enables the distributed processing of large data sets across clusters of commodity servers [9]. It is designed to scale up from a single server to thousands of machines with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software's ability to detect and handle failures at the application layer [8]. Hadoop provides two basic services: MapReduce and the Hadoop Distributed File System (HDFS). MapReduce is a programming model used to simplify data processing across large data sets, while HDFS is a distributed file system that is highly fault-tolerant, is designed to be deployed on low-cost hardware, and provides high-throughput access to application data for large data sets [9]. Retail companies such as Wal-Mart and Kohl's analyze sales, pricing, demographic, and weather data to tailor product selections at particular stores and to determine the timing of price markdowns [7].

A data warehouse is a database used for data storage, reporting, and analysis [11]. It stores current as well as historical data, which are used to create trend reports for decision-making or for prediction. It is usually a central repository of data, built by combining data from one or more sources and organized to facilitate management decision-making. Data warehouses are constructed through data cleaning, data integration, data transformation, data loading, and periodic data refreshing [10].
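To make the MapReduce programming model mentioned in this section more concrete, the following is a minimal single-process sketch in Python. It does not use Hadoop or any Hadoop API; it only imitates the map, shuffle, and reduce steps in memory. The sales records, the store and product names, and the function names (map_phase, shuffle_phase, reduce_phase, run_job) are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical sales records: (store, product, units_sold).
# Invented for illustration only; not data from any real retailer.
SALES = [
    ("store_a", "umbrella", 3),
    ("store_b", "umbrella", 1),
    ("store_a", "sunscreen", 2),
    ("store_a", "umbrella", 4),
    ("store_b", "sunscreen", 5),
]


def map_phase(records):
    """Map step: emit one (key, value) pair per input record."""
    for store, product, units in records:
        yield (store, product), units


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(records):
    """Run the three steps in sequence, in one process."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    totals = run_job(SALES)
    for (store, product), units in sorted(totals.items()):
        print(store, product, units)
```

In a real cluster the map and reduce steps would run in parallel on many machines and the framework would handle the grouping and any failures; this sketch only shows the shape of the computation.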

Image source: http://whatsthebigdata.com/2012/12/06/the-future-of-big-data-infographic/

Big Data Mining: Big data mining is a sophisticated technique that analyzes large volumes and varieties of data to determine patterns and relationships, using advanced statistical analysis and modeling techniques. The main objective is to find relationships and patterns that can be leveraged to improve the business [11]. It helps unveil puzzling but useful associations and helps us better understand known associations [12]. For example, a retailer discovered that almost half of the customers who bought cigarettes on Friday also bought cologne, an association that led the retailer to display the two products side by side. Another example is the observation that when it rains, people tend to sign in to various social networks. Findings like the first help retailers determine the most effective sales floor layout [11].

Big Data Security: Big data security analytics is simply a collection of security data sets so large and complex that it becomes difficult (or impossible) to process using on-hand database management tools or traditional security data processing applications [13]. It can help lower cybersecurity risks by analyzing large amounts of behavioral data to distinguish legitimate from malicious activity. In-depth analysis of forensic and network traffic data, fraud detection reports and logs, and customer data helps identify known and unknown threats, which can be used to create new security systems or enhance current ones. Big data analytics can also help detect DDoS attacks by means of a MapReduce-based detection algorithm in Hadoop. Such an algorithm can simply count the total number of web page requests from a client, or it can calculate the time spent and the byte count for each URL request and compare the access sequence and time spent with those of other clients accessing the same server, to detect whether a client is infected [14]. For example, analysis might reveal that a single user, James, used six IP addresses, five user IDs, and 14 different accounts. With big data security analytics techniques, security experts will be able to make more accurate security decisions based on information extracted from real-time analysis of various data sets, or to create an entirely new set of security capabilities.

Big Data & Privacy: Protecting data privacy becomes harder as information is shared widely among different parties around the world. Protecting data that is stored in distributed systems or shared between parties is very important, because there can be serious consequences if such data is released without the data owner's knowledge [16]. As more information about individuals' health, finances, location, and online activity spreads, concerns arise about profiling, tracking, discrimination, exclusion, government surveillance, and loss of control over such information. Big data challenges some of the most fundamental concepts of privacy law, including the definition of personally identifiable information, the role of individual control, and the principles of data minimization and purpose limitation [15].

Future of Big Data: Big data has the potential to re-engineer the marketplace, security, schools, businesses, public health, and society at large through its power of prediction, which helps reveal obscured trends in structured and unstructured data sets in real time. This helps businesses match customers' online activity with the right promotions and marketing campaigns, track sales transactions, and predict system failures beforehand. It can also help governments in areas such as security, economic advancement, public schools, the census, and much more, fostering a better and more information-driven society.
For more information, see http://datadrivendetroit.org/projects/. The challenges of big data will also help create new career fields and technologies, such as data scientists, data analysts, new analytic platforms, products, systems, and more. Putting big data to work, whether to drive innovation or to reform current innovation processes, will not be trouble-free, but if appropriate investment is made, I believe that big data will usher in a new surge of technological advancement that will help change the world.

Conclusion: As the world becomes more information-driven, big data will continue to grow at a fast pace and will help engineer a new universe driven by data. Big data holds the potential to bring enormous advances in many scientific disciplines and to expand the profitability and success of many small and large enterprises [6]. Indeed, some companies are already using some of the power of big data analytics in crucial decision-making to gain an advantage over their competitors. The challenges of big data are not just about data scale; they also include a diverse range of other issues, such as lack of structure, algorithms, privacy, provenance, timeliness, and visualization. Big data will help us uncover knowledge that no one has discovered before. I encourage everyone to participate in this great journey to a world where data means vivacity.

References

1. Liyakasa, Kelly. "Big Data Analytics Can Help Improve Information Security." CRM Magazine 16.11 (2012): 11. Academic Search Premier. Web. 1 Feb. 2013.
2. Johnson, Jeanne E. "Big Data + Big Analytics = Big Opportunity." Financial Executive 28.6 (2012): 50-53. Business Source Premier. Web. 1 Feb. 2013.
3. Huwe, Terence K. "Big Data, Big Future." Computers in Libraries 32.5 (2012): 20-22. Academic Search Premier. Web. 1 Feb. 2013.
4. Ritchey, Diane. "Big Data, Big Security." Security: Solutions for Enterprise Security Leaders 49.7 (2012): 28-30. Business Source Premier. Web. 1 Feb. 2013.
5. Russom, Philip. "Big Data Analytics." Tdwi.org. TDWI Research, 2011. Web. 15 Feb. 2013.
6. "Challenges and Opportunities with Big Data." http://www.cra.org. N.p., n.d. Web. 9 Feb. 2013. <http://www.cra.org/ccc/docs/init/bigdatawhitepaper.pdf>.
7. Lohr, Steve. "The Age of Big Data." New York Times, 11 Feb. 2012. <http://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html>.
8. "What Is Hadoop?" IBM. N.p., n.d. Web. 25 Mar. 2013. <http://www-01.ibm.com/software/data/infosphere/Hadoop/>.
9. Apache Hadoop. <http://hadoop.apache.org>.
10. Han, Jiawei, and Micheline Kamber. Data Mining: Concepts and Techniques. Boston, Mass: Elsevier, 2006. Web. 1 Apr. 2013.
11. Khan, Arshad. Data Warehousing 101: Concepts and Implementation. San Jose, Calif: Khan Consulting and Publishing, 2003. Web. 1 Apr. 2013.
12. Kamath, Chandrika. "Large Scale Data Mining and Pattern Recognition: Overview." Lawrence Livermore National Laboratory, 30 Aug. 2000. Web. 1 Apr. 2013. <https://computation.llnl.gov/casc/sapphire/overview/overview.html>.
13. Oltsik, Jon. "Defining Big Data Security Analytics." Network World. Network World, Inc., 1 Apr. 2013. Web. 2 Apr. 2013. <http://www.networkworld.com/community/node/82758>.
14. Lee, Y., and Y. Lee. "Detecting DDoS Attacks with Hadoop." ACM CoNEXT Student Workshop, Dec. 2011.
15. Tene, Omer. "Big Data for All: Privacy and User Control in the Age of Analytics." Center for Internet and Society. N.p., 20 Sept. 2012. Web. 16 Mar. 2013. <http://cyberlaw.stanford.edu/blog/2012/09/big-data-all-privacy-and-user-control-age-analytics>.
16. Wong, R. C. "Big Data Privacy." J Inform Tech Softw Eng 2 (2012): e114.