A SURVEY ON BIG DATA AND ITS RESEARCH CHALLENGES
|
|
|
- Samson Bridges
- 9 years ago
- Views:
Transcription
1 A SURVEY ON BIG DATA AND ITS RESEARCH CHALLENGES S. Justin Samuel 1, Koundinya RVP 2, Kotha Sashidhar 3 and C.R. Bharathi 4 1 Department of IT, Faculty of Computing, Sathyabama University, Chennai, India 2,3 B.Tech, Faculty of Computing, Sathyabama University, Chennai, India 4 Department of ECE, Vel Tech University, Chennai, India [email protected] ABSTRACT There has been an ever-increasing interest in big data due to its rapid growth and since it covers diverse areas of applications. Hence, there seems to be a need for an analytical review of recent developments in the big data technology. This paper aims to provide a comprehensive review of the big data state of the art, conceptual explorations, major benefits, and research challenging aspects. In addition to that, several future directions for big data research are highlighted. Keywords: big data, research challenges, big data architecture, big data open problems. 1. INTRODUCTION Big data is a term encompassing different types of complicated and large datasets that is hard to process with the conventional data processing systems. Numerous challenges are in place with big data like storage, transition, visualization, searching, analysis, security and privacy violations and sharing. The exponential growth of data in all fields demands the revolutionary measures required for managing and accessing such data. In [1], the authors have highlighted the need for the research in big data, in order to manage the online bio-logical data avenue. They have foreseen the importance of big data in the biological and biomedical research. It has exploded in such a way that it has marginalized a regulatory schema for personally identifiable information [2]. This is possible by analyzing the meta data and by using the predictive, aggregated findings thereby combining the previous discrete data sets. The significance of big data analytics comes when enterprises choose a technical stack, which dictates the type of data to store and to process. Relational Data Base management Systems are doing fine with structured data and continue to be the choice for many requirements. But for the exponential growth of unstructured data in terabytes or even peta bytes, derived from social networks, sensor networks and other federated data with replications, big data is the answer for handling such data. In general, cloud based big data seems cost effective, speedy to build and scalable. The adverse effect for the administrators lies in riding the data. If a system administrator has a private cloud, the unstructured data look like conventional structured data stack where IaaS (Infrastructure as a Service) is in the bottom, middle layer is the database and applications on the top. But in public cloud services, the concern about data security and managing privacy are on the rise. Storing NoSQL data in systems like MangoDB, Amazon SimpleDb, Windows Azure Tables, started the era of Big data science. This paper ensembles a detailed study on big data technology with characteristics, behavior and classified categories. The main focus is on throwing light into the research issues, opportunities and open problems. 2. FIVE V S OF BIG DATA There are many properties associated with big data. The prominent aspects are Volume, Variety, Velocity, Variability and Value. Volume Value Variety BIG DATA Variability Figure-1. Five V s of Big Data. Velocity There are many properties associated with big data. The prominent aspects are Volume, Variety, Velocity, Variability and Value. Volume: The volume of big data is exploding exponentially day to day. The data accumulated through social websites and sensor networks going to cross from petabytes to Zetabytes. Variety: Data produced are from different categories, consists of unstructured, standard, semistructured and raw data which are very difficult to be handled by traditional systems. Velocity: This is a concept which indicates the speed at which the data generated and become historical. Big data is able to handle the incoming and outgoing data rapidly. Variability: It describes the amount of variance used in summaries kept within the data bank and refers how they are spread out or closely clustered within the data set. Value: All enterprises and e-commerce systems are keen in improving the customer relationship by 3343
2 providing value added services. For that, study on customer attitudes and trends in the market are to be analyzed. Moreover, users can also query the data store to find business trends and accordingly they can change their strategies. By making big data open to all, it creates transparency on functional analysis. Supporting real time decisions and experimental analysis in different locations datasets can do wonderful things for enterprises. 3. BIG DATA CLASSES Categorization of big data falls with major aspects, since this technology involves with multiple diversified fields and ir-related types of information handling. Some of the classes can be framed like storing, sourcing, formatting, staging and processing [11]. Each class instantiates many entities through which the actions carried out. This is depicted as a technical view in Figure- 2. B I G D A T A Sourcing Storing Formatting Organizing Processing Querying Internet Communications Sensor Networks GIS Transactions Images Graphs Documents Key Values Store Unstructured Partially-Structured Structured Extract Clean Normalize Transform Load Predictive Analytics Data Mining Text Analytics On Demand Ad Hoc Figure-2. Technical view of Big Data. Sourcing: Data sources identified are Internet Web Pages, Discussion Forums, Chats and message shared in and among social networks, Remote Sensing Networks, All kinds of day to day transactions done through internet based applications. Formats: Unstructured, partially structured, and structured. Storing: Image based, Graph based, documents, Key Value Stores (Key Values Store is a way of storing application's data with null schema. It doesn't require a static data model. Unique keys are used to represent values stored in it.) Organize: Extract, Clean, normalize, transform, Load Process: Online, Offline Query: On demand, ad-hoc 4. RESEARCH AREAS AND THE CHALLENGES There are six major research areas identified as shown in Figure-3. They are, a) Applied ontology b) Security c) Storage and Transport d) Accessibility e) Inconsistencies f) Mobility Accessibility Inconsistencies BIG DATA CHALENGES Applied Ontology Security Mobility Storage and Transport Figure-3. Major research areas in Big Data. 4.1 Applied Ontology Applied Ontology is the way of applying the ontological resources to the domains like Geography as well as Bio-medicine. These works can be done within the semantic web framework. Applied ontology involves in looking the relationship between a person s world and his actions. The 9 th Ontology Summit held on Jan under the theme Big Data and Semantic Web Meet Applied Ontology [4] has concluded the following; To build bridges between Semantic web, big Data, Linked Data and Applied Ontology. Performance issues, Challenges in scalability while combining big data and Semantic web. Technically, automated reasoning tools to be developed to make use of ontologies. Large common reusable ontologies and ontological analytical techniques to overcome the engineering bottlenecks are the need of the hour. 4.2 Security Big data involved with many use cases like Staging, Pre-processing, Processing, Meta data storage and to store short term as well as long term fact data. For serving each use case multi-facets of infrastructure required. Safe and private transactions are the two major concerns of IT. But the safety and privacy becomes a question mark as the data volume of big data fast grows. When we consider the safety aspect, the existing cryptography standards can not meet the demands of big 3344
3 data [5]. Hence; effective mechanisms to handle structured, semi-structured, unstructured data have to be investigated and to be developed. Data acquisitions of users like habits, personal interests and the like through websites may happen with the permissions of the users or maybe when the users are not aware of. But the same may be leaked while storing, transmitting or handling. According to report [6], researcher Ron acquired 2.8 GB of Facebook user s data and made it available to download on the internet. Hence, Privacy protection is another challenging problem in big data. 4.3 Storage and transport Big data stores and handles data in different way from traditional data warehouses. Big data comprises massive sensor data, raw and semi-structured log data of IT industries and the exploded quantity of data from social media. As per the examples given in [7], current disk technologies are limited to store 4 TB per disk. For storing 1 exa byte, it requires 25, 000 disks which will overwhelm the existing communication technologies. That means such phenomenon demands for a revolution in storage and communication technologies. 4.4 Accessibility The rapid growth of data on the internet, challenges the research community to move towards innovating efficient algorithms and processing technologies. The access technique for big data has two manifestations. First, process the data in the source side and transmit results. That is, scripting technologies on the browser side should be improved to bring the necessary code from the server. Second, transmit only the critical data after performing through valid filters. For enabling this scenario, the following recommendations are made. a) High performance convexed clustering algorithms to be developed to improve the performance in the distributed and parallel architectures. b) Efficient data analytic techniques required to handle large amount of data and to filter the data by applying proper constraints. c) The Natural Language Processing techniques should become applicable over all kinds of enterprise data, politics, bio-medics and all other domains. d) Conventional tools to handle Big data are outdated and become inefficient. So a lot of research required in developing tools for Big Data and platforms for deployment. e) Improved algorithm for crawling data from multiple platforms is needed. Also, powerful algorithms to visualize random data from multi-disciplinary segments are to be generated to get accurate results as input for crucial applications. 4.5 Inconsistencies The Big data research area is embedded with multi-dimensional technical and scientific spaces. The objectives of big data analytics differ with the stakeholders. As this big data analysis is the next frontier for innovation and advancement of technology, one should not underestimate the impact of it on the society. Soon the big data cloud is going to cover all domains and sectors like manufacturing, spacial data science, life and physical sciences, communication, finance and banking etc., As big data comprises all domains, definitely inconsistencies arise. Inconsistencies in data level, information level and knowledge level have to be addressed. There are four types of inconsistencies like temporal, textual, spatial and functional dependency. These inconsistencies are well addressed by Zhang in [8]. 4.6 Mobility Enterprises are poised to extend more investment in applications which support mobile devices. Very great potential to increase productivity is on the way for businesses when they combine enterprise process automation and mobile computing technologies. Increasing location based datasets, influx of data from mobile applications, their size and variety exceeds the capacity of spatial and mobile computing technologies. Mobile users contribute a lot to big data analytics through their online activities. The convergence of traditional routing services (including GPS and spatial data) into the big data paradigm has to face major challenges. First, it increases the computational cost because it magnifies the impact of routing queries to mobile devices. Second, it uses geographical reasoning in remote sensing and inference over time and space. The built-in motion detectors in mobile phones derive a huge amount of data from every user s life. How efficiently utilize these data and how to carry and share through limited bandwidth mobile stations, are the other challenges. 5. TECHNICAL CHALLENGES IN BIG DATA Whenever new technologies evolve, they meet with new challenges in all the aspects. Once the functional challenges are in place, the next kin is the technical challenges. Big data faces many technical challenges which are on the roadway of the research. a) Failure handling Devising 100% reliable systems on the go is not an easy task. Systems can be devised in such a way that the probability of failure must fall within the permitted threshold. Fault tolerance is a technical challenge in big data. When a process started it may involve with numerous network nodes and the whole computation process becomes cumbersome. Retaining check points and fixing the threshold level for process restart in case of failure, are greater concerns. b) Data heterogeneity Big data deals with unstructured, semi-structured and structured data. Linking unstructured data with structured data, converting data from one form into another required form needs a lot of research. 3345
4 c) Data quality Huge amount of data pertaining to a problem is undoubtedly a big asset for both Business as well as IT leaders [9]. For predictive analysis or for better decision making amount of relevant data helps a lot. But the quality of such data is based on the source through which they are derived. Though big data stores large relevant data, the accuracy of data is completely dependent on the source domains. Hence, there is a question of how far the data can be trusted and it definitely requires appropriate trust agent filters. 6. OPEN PROBLEMS a) Data management in a single perfect manner is remaining an open problem for both cloud and big data. b) Big data handles unstructured data also. Hence, the data structures differ from conventional SQL databases. The data structures for NoSQl databases are Graph, Documents and Key value stores [10]. Though there are technologies to manage graphics and documents, Key value store needs a lot of research. The functionalities of handling key value stores to be improved to support on demand queries and also extended key value stores are required to manage diverse data oriented rich internet applications. c) Elasticity for effective usage of unstructured as well as structured resources with consistent semantics, operating with those resources at a minimum cost and enabling autonomy in multi-tenant data systems are some of the open problems. 7. PROGRESS IN BIG DATA ANALYTICS An insight about the ongoing research in Big data analytics has been presented in this section. Table-1. Recent evolvements in Big Data research. Ref. Used technology Findings [12] Hadoop and map reduce Solution for Optimizing Job Scheduling, Organization of indexes and Layouts rendered. [13] Hive, Hadoop and Have built a Random Forest based Decision Tree Mahout model to detect botnet in peer to peer network. [14] Rapidminder and Proposed an architecture called Radoop to scale the Hadoop data and network size. [15] Hadoop and IBM smart Introduced new architecture to support the analytic system analytical system [16] Hadoop Proposed a self-tuning system called Starfish. 8. CONCLUSIONS The data size in all areas is exploding day to day. The velocity and variety of data growth is increasing due to the proliferation of sensor and mobile devices with internet connection. Data generated by this way, is the greatest asset for enterprises in developing and defining business strategies. Cloud services were used to process and analyze huge amount of data and it has turned into the new Big Data model to meet the on-demand services. In this paper, we have done an elaborated study on Big Data and its research challenges. We have highlighted the existing problems and have presented the research opportunities. We propose the technical view of big data comprising the various classes. Also, the key research areas have been identified and the insights into those areas are discussed. Furthermore, key issues to be addressed by both industry and academia, are brought into light. Even though a conclusion may review the main results or the contributions of the paper, do not duplicate the abstract or the introduction. For a conclusion, you might elaborate on the importance of the work or suggest the potential applications and extensions. REFERENCES [1] Howe AD, Costanzo M, Fey P, et al Big data: The future of biocuration, Nature. 455(7209): Doi: /455047a. [2] Crawford Kate and Jason Schultz Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms, Boston College Law Review. 55(93): [3] [4] ontology-summit-big-data-and-semantic-web-meetapplied-ontology/. [5] Min Chen, Shiwen Mao, Yunhao Liu Big Data: A Survey, Mobile Networks and Applications. 19(2): [6] Tasevski P Password attacks and generation strategies, Tartu University: Faculty of Mathematics and Computer Sciences. [7] Stephen Kaisler, Frank Armour, J. Alberto Espinosa, William Money Big Data: Issues and 3346
5 Challenges Moving Forward, Proceedings of 46th Hawaii International Conference on System Sciences, IEEE, (pp Year of Publication: ISBN: ). [8] Du Zhang, Inconsistencies in big data, Proceedings of the 12th IEEE International Conference on Cognitive Informatics, New York City, NY (Page: Year of Publication: 2013 ISBN: ). [9] A. Katal, M. Wazid, and R. H. Goudar, Big data: Issues, challenges, tools and Good practices, Proceedings of Sixth International Conference on Contemporary Computing (IC3), (Page: Year of Publication: 2013 ISBN: ). [10] [11] Ibrahim Abaker Targio Hashema, Ibrar Yaqooba, Nor Badrul Anuara, Salimah Mokhtara, Abdullah Gania, Samee Ullah Khanb The rise of big data on cloud computing: Review and open research issues, Information Systems, Elsevier. 47: [12] Dittrich Jens and Jorge-Arnulfo Quiané-Ruiz Efficient big data processing in Hadoop MapReduce, Proceedings of the VLDB Endowment. 5(12): pdf. [13] Singh Kamaldeep, et al Big Data Analytics framework for Peer-to-Peer Botnet detection using Random Forests. Information Sciences, Elsevier. 278: [14] Prekopcsák Zoltán, et al Radoop: Analyzing big data with rapidminer and hadoop. Proceedings of the 2 nd RapidMiner Community Meeting and Conference. [15] Ferguson Mike Architecting a Big Data Platform for Analytics, a Whitepaper Prepared for IBM. 33usen/IML14333USEN.PDF. [16] Herodotou Herodotos, et al Starfish: A Selftuning System for Big Data Analytics, CIDR. aper36.pdf. 3347
Big Data: Tools and Technologies in Big Data
Big Data: Tools and Technologies in Big Data Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can
Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges
Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
Big Data on Microsoft Platform
Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4
Big Data: Study in Structured and Unstructured Data
Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 [email protected], [email protected] Abstract With the overlay of digital world, Information is available
Data Refinery with Big Data Aspects
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data
Big Data on Cloud Computing- Security Issues
Big Data on Cloud Computing- Security Issues K Subashini, K Srivaishnavi UG Student, Department of CSE, University College of Engineering, Kanchipuram, Tamilnadu, India ABSTRACT: Cloud computing is now
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved
Indian Journal of Science The International Journal for Science ISSN 2319 7730 EISSN 2319 7749 2016 Discovery Publication. All Rights Reserved Perspective Big Data Framework for Healthcare using Hadoop
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.
Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics
International Journal of Innovative Research in Computer and Communication Engineering
FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,
Manifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS
BIG DATA: BIG CHALLENGE FOR SOFTWARE TESTERS Megha Joshi Assistant Professor, ASM s Institute of Computer Studies, Pune, India Abstract: Industry is struggling to handle voluminous, complex, unstructured
International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology
International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 BIG DATA: A New Technology Farah DeebaHasan Student, M.Tech.(IT) Anshul Kumar Sharma Student, M.Tech.(IT)
Keywords: Big Data, Hadoop, cluster, heterogeneous, HDFS, MapReduce
Volume 5, Issue 9, September 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Study of
Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum
Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
Chapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.
Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
The 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
A STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS
A STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS Dr. Ananthi Sheshasayee 1, J V N Lakshmi 2 1 Head Department of Computer Science & Research, Quaid-E-Millath Govt College for Women, Chennai, (India)
BIG DATA CHALLENGES AND PERSPECTIVES
BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,
A Cloud-based System Framework for Storage and Analysis on Big Data of Massive BIMs
A Cloud-based System Framework for Storage and Analysis on Big Data of Massive BIMs Hung-Ming Chen a, and Kai-Chuan Chang a a Department of Civil and Construction Engineering, National Taiwan University
UNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE
UNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE Mr. Swapnil A. Kale 1, Prof. Sangram S.Dandge 2 1 ME (CSE), First Year, Department of CSE, Prof. Ram Meghe Institute
A Novel Cloud Based Elastic Framework for Big Data Preprocessing
School of Systems Engineering A Novel Cloud Based Elastic Framework for Big Data Preprocessing Omer Dawelbeit and Rachel McCrindle October 21, 2014 University of Reading 2008 www.reading.ac.uk Overview
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
ISSN: 2321-7782 (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
Testing Big data is one of the biggest
Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology Email: [email protected] November-2015 Volume 2, Issue-6 www.ijermt.org Modeling Big Data Characteristics for Discovering
International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]
BIG DATA IN SUPPLY CHAIN MANAGEMENT: AN EXPLORATORY STUDY
Gheorghe MILITARU Politehnica University of Bucharest, Romania Massimo POLLIFRONI University of Turin, Italy Alexandra IOANID Politehnica University of Bucharest, Romania BIG DATA IN SUPPLY CHAIN MANAGEMENT:
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08
Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN
Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK REVIEW ON BIG DATA SECURITY IN CLOUD COMPUTING MISS. ANKITA S. AMBADKAR 1, PROF.
The Next Wave of Data Management. Is Big Data The New Normal?
The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management
So What s the Big Deal?
So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data
Big Data: Issues, Challenges, Tools and Good Practices
Big Data: Issues, Challenges, Tools and Good Practices Avita Katal Mohammad Wazid R H Goudar Department of CSE Department of CSE Department of CSE Graphic Era University Graphic Era University Graphic
Trustworthiness of Big Data
Trustworthiness of Big Data International Journal of Computer Applications (0975 8887) Akhil Mittal Technical Test Lead Infosys Limited ABSTRACT Big data refers to large datasets that are challenging to
Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
Customized Report- Big Data
GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.
BIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization
Tracking System for GPS Devices and Mining of Spatial Data
Tracking System for GPS Devices and Mining of Spatial Data AIDA ALISPAHIC, DZENANA DONKO Department for Computer Science and Informatics Faculty of Electrical Engineering, University of Sarajevo Zmaja
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,
A Brief Outline on Bigdata Hadoop
A Brief Outline on Bigdata Hadoop Twinkle Gupta 1, Shruti Dixit 2 RGPV, Department of Computer Science and Engineering, Acropolis Institute of Technology and Research, Indore, India Abstract- Bigdata is
Big Data Storage Architecture Design in Cloud Computing
Big Data Storage Architecture Design in Cloud Computing Xuebin Chen 1, Shi Wang 1( ), Yanyan Dong 1, and Xu Wang 2 1 College of Science, North China University of Science and Technology, Tangshan, Hebei,
Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop
Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning
Hadoop. http://hadoop.apache.org/ Sunday, November 25, 12
Hadoop http://hadoop.apache.org/ What Is Apache Hadoop? The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using
Big Data - Infrastructure Considerations
April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright
Big Data: An Introduction, Challenges & Analysis using Splunk
pp. 464-468 Krishi Sanskriti Publications http://www.krishisanskriti.org/acsit.html Big : An Introduction, Challenges & Analysis using Splunk Satyam Gupta 1 and Rinkle Rani 2 1,2 Department of Computer
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image
AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
Big Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
Journal of Chemical and Pharmaceutical Research, 2015, 7(3):1388-1392. Research Article. E-commerce recommendation system on cloud computing
Available online www.jocpr.com Journal of Chemical and Pharmaceutical Research, 2015, 7(3):1388-1392 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 E-commerce recommendation system on cloud computing
BIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE Anjali P P 1 and Binu A 2 1 Department of Information Technology, Rajagiri School of Engineering and Technology, Kochi. M G University, Kerala
Problems to store, transfer and process the Big Data 6/2/2016 GIANG TRAN - [email protected] 1
Problems to store, transfer and process the Big Data COURSE: COMPUTING CLUSTERS, GRIDS, AND CLOUDS LECTURER: ANDREY SHEVEL ITMO UNIVERSITY SAINT PETERSBURG 6/2/2016 GIANG TRAN - [email protected]
BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
W H I T E P A P E R. Building your Big Data analytics strategy: Block-by-Block! Abstract
W H I T E P A P E R Building your Big Data analytics strategy: Block-by-Block! Abstract In this white paper, Impetus discusses how you can handle Big Data problems. It talks about how analytics on Big
Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data
Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems
Outline. What is Big data and where they come from? How we deal with Big data?
What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,
How To Make Sense Of Data With Altilia
HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to
Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
Transforming the Telecoms Business using Big Data and Analytics
Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe
The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn
The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress
COMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics
Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
NextGen Infrastructure for Big DATA Analytics.
NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures
Big Data Explained. An introduction to Big Data Science.
Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of
OnX Big Data Reference Architecture
OnX Big Data Reference Architecture Knowledge is Power when it comes to Business Strategy The business landscape of decision-making is converging during a period in which: > Data is considered by most
Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics
Finding Insights & Hadoop Cluster Performance Analysis over Census Dataset Using Big-Data Analytics Dharmendra Agawane 1, Rohit Pawar 2, Pavankumar Purohit 3, Gangadhar Agre 4 Guide: Prof. P B Jawade 2
Sunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON HIGH PERFORMANCE DATA STORAGE ARCHITECTURE OF BIGDATA USING HDFS MS.
Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
How to Enhance Traditional BI Architecture to Leverage Big Data
B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...
Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel
Big Data and Analytics: Getting Started with ArcGIS Mike Park Erik Hoel Agenda Overview of big data Distributed computation User experience Data management Big data What is it? Big Data is a loosely defined
1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India
1 st Symposium on Colossal Data and Networking (CDAN-2016) March 18-19, 2016 Medicaps Group of Institutions, Indore, India Call for Papers Colossal Data Analysis and Networking has emerged as a de facto
Real Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
Why Big Data in the Cloud?
Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.
Cloud Computing Services and its Application
Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 4, Number 1 (2014), pp. 107-112 Research India Publications http://www.ripublication.com/aeee.htm Cloud Computing Services and its
Axibase Time Series Database
Axibase Time Series Database Axibase Time Series Database Axibase Time-Series Database (ATSD) is a clustered non-relational database for the storage of various information coming out of the IT infrastructure.
Oracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:
Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
How To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI
A Study on Security and Privacy in Big Data Processing
A Study on Security and Privacy in Big Data Processing C.Yosepu P Srinivasulu Bathala Subbarayudu Assistant Professor, Dept of CSE, St.Martin's Engineering College, Hyderabad, India Assistant Professor,
Impact of Big Data in Oil & Gas Industry. Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India.
Impact of Big Data in Oil & Gas Industry Pranaya Sangvai Reliance Industries Limited 04 Feb 15, DEJ, Mumbai, India. New Age Information 2.92 billions Internet Users in 2014 Twitter processes 7 terabytes
Big Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
