ANALYSIS OF SMART METER DATA USING HADOOP
Balaji K. Bodkhe (1), Dr. Sanjay P. Sood (2)
MESCOE Pune, CDAC Mohali
(1) [email protected], (2) [email protected]

Abstract
Government agencies and large multinational companies across the world focus on energy conservation and the efficient use of energy. Using energy efficiently is a particular need of developing countries such as India and China. The emergence of smart grid meters has given us access to huge amounts of energy consumption data. The data provided by smart meters can be used to gain insights into energy conservation measures and initiatives. Various energy distribution companies harness this data and obtain otherwise unpredictable results about customers' usage patterns; after analysis, they predict the demand and consumption of users. This analysis helps them decide the tariff at different points in time. The companies are trying to overcome the bottleneck of the capital investment cost of data. Further, processing Big Data for chart generation and analytics is slow, and not fast enough to support real-time decision making. Our paper showcases a Business Intelligence tool that uses Apache Hadoop to handle these problems efficiently. By taking advantage of this tool, energy distribution companies can reduce their investment by using commodity hardware that runs Hadoop. The use of distributed computing tools also reduces processing time significantly, enabling real-time monitoring and decision making. The tool will also reduce the carbon footprint and other related problems in energy distribution, including losses and theft. In future, the same analysis can be done on other utility resources such as gas and water.

I. INTRODUCTION
Energy distributors are trying to analyze energy consumption data to gain insights into customer usage patterns for several target applications, such as time-of-use tariffs, demand response management and billing accuracy. A smart meter collects data every minute, which generates a large amount of data, whereas an old mechanical meter is read hourly or monthly. The required data storage capacity and the complexity of the data processing vary significantly between applications. The traditional RDBMS of utility companies is a bottleneck in this approach. As a result, for the industry to truly benefit from the smart grid investment, it is critical that the massive amount of data made available by smart meters be handled efficiently and in an organized manner that helps grid operators make timely decisions to operate the grid safely, economically and reliably.

Apache Hadoop is a solution to the above problems that runs on commodity machines. It is a distributed computing tool with large storage as well as processing capability. We use the Apache Hadoop framework, which allows for the distributed processing of large data sets across clusters of computers. Hadoop MapReduce is a system for parallel processing of large data sets. A traditional RDBMS and other applications are much slower and less efficient than the Hadoop framework in handling the big data generated by smart meters. So, for the industry and customers to gain benefits such as billing accuracy, energy theft detection, analysis of customer usage patterns and demand response management, it is profitable to use Hadoop, which runs on cheap commodity hardware.

With the evolution of smart meters for smart distribution and efficient use of electricity, the generated power should be utilized properly, with fair economic gains for distributors and consumers. With this focus on energy distribution and consumption, which will result in a reduced carbon footprint, the data received from smart meters should be analyzed. Analytics of this size needs large-scale computation, which can be done with the distributed processing framework Hadoop. Using the framework provides several beneficial outputs, including billing accuracy and time-of-use tariff plans. Thus smart meter data analytics is implemented with a view to future use.

II. HADOOP
Smart meters send energy consumption data to the server at regular intervals, which generates a large amount of data that existing tools were not capable of handling. Apache Hadoop is an open source framework for developing distributed applications that can process very large amounts of data. It is a platform that provides both distributed storage and computational capabilities. Hadoop has two main layers:
1. Computation layer: The computation tier uses a framework called MapReduce.
2. Distributed storage layer: A distributed file system called HDFS provides storage.

Hadoop Advantages: Hadoop is an open source, versatile tool that provides the power of distributed computing. By using distributed storage and transferring code instead of data, Hadoop largely avoids the costly transmission step when working with large data sets. Redundancy: Hadoop can recover when a single node fails.
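To make the computation layer more concrete, the following is a minimal single-process sketch of the MapReduce pattern applied to smart meter readings. It does not use Hadoop itself and is not the authors' implementation; the record layout (meter id, timestamp, kWh), the sample values and all function names are assumptions chosen for illustration. The map step emits one key-value pair per reading, the shuffle groups values by key, and the reduce step totals consumption per meter and hour.

```python
from collections import defaultdict

# Hypothetical input: one reading per line, "meter_id,ISO timestamp,kWh".
SAMPLE_READINGS = [
    "M001,2015-02-01T10:00,0.02",
    "M001,2015-02-01T10:01,0.03",
    "M002,2015-02-01T10:00,0.05",
    "M001,2015-02-01T11:00,0.01",
]

def map_reading(line):
    """Map step: emit ((meter_id, hour), kWh) for one raw reading."""
    meter_id, timestamp, kwh = line.split(",")
    hour = timestamp[:13]  # keep "YYYY-MM-DDTHH", drop the minutes
    yield (meter_id, hour), float(kwh)

def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_usage(key, values):
    """Reduce step: total consumption for one meter and one hour."""
    return key, round(sum(values), 6)

def run_job(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    pairs = (pair for line in lines for pair in map_reading(line))
    return dict(reduce_usage(k, v) for k, v in shuffle(pairs).items())

hourly = run_job(SAMPLE_READINGS)
```

Here `hourly` maps each (meter, hour) pair to its total kWh. On a real cluster the map and reduce functions would run on different nodes over data stored in HDFS, and the grouping done by `shuffle` would be handled by the framework.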
Ease of creating programs: Because Hadoop uses the MapReduce framework, you do not have to worry about partitioning the data, determining which nodes will perform which tasks, or handling communication between nodes. Hadoop does all of this for you, leaving you free to focus on your data and what you want to do with it.

Hadoop Key Features: Distributed computing is a vast field, but the following key features make Hadoop distinctive and attractive.
A. Accessible: Hadoop runs on large clusters of commodity machines or on cloud computing services such as Amazon's Elastic Compute Cloud (EC2).
B. Robust: Because Hadoop is intended to run on commodity hardware, it is architected with the assumption of frequent hardware malfunctions. It can gracefully handle most such failures.
C. Scalable: Hadoop scales linearly to handle larger data by adding more nodes to the cluster.

III. SYSTEM ARCHITECTURE

IV. SMART METER DATA ANALYSIS
Electricity generation in India depends mainly on non-renewable sources [1]. Although India is rich in these resources, they are being depleted at an alarming rate and will be exhausted very soon, whereas renewable resources are not utilized to their capacity. India therefore needs to concentrate more on renewable sources of energy, such as wind energy harvesting in southern and western India, where wind velocity is high. Poor metering, power theft, lack of proper planning and overloaded resources are a few of the reasons for the present poor grid conditions in India. Transmission and distribution costs are so high that the government loses a lot of money on every unit of electricity sold. Keeping these factors in mind, the Government of India has taken many steps to improve the electricity grid. Smart metering, variable tariffs and the Electricity Act 2003 are some of the initiatives taken.

The smart grid concept presented in this paper is a step-by-step process specially tailored to Indian conditions which, when followed, will lead to a very effective and well managed smart electricity grid. This will include well established production and transmission devices, smart metering, transparency in the working of the management, and visual analytics for both the provider and the consumer. Various hurdles will have to be overcome to reach this stage. The biggest is corruption; once that is overcome, the rest of the problems can be solved collectively by heuristic methods. Secondly, the government should try to involve IT companies so that their experts can make the necessary amendments. Top-ranked colleges can also be involved, as their students might provide valuable insights.

To save energy, reduce cost and increase reliability, the U.S. government and private industries are investing billions of dollars to build the smart grid infrastructure [2]. The wide deployment of modern information technology in power grid control and communication networks makes higher-resolution measurements available to more equipment over wider areas. Consider Fig. 2 for example.
According to this figure, a smart meter collects data every minute, whereas an old mechanical meter is read hourly or monthly. A phasor measurement unit (PMU) collects data points at a per-second rate that is much faster than the sampling rate of the traditional supervisory control and data acquisition (SCADA) system, which is 1 data point per 1-2 seconds.

V. IMPLEMENTATION AND RESULTS
The system that we implemented is useful for both electricity providers and customers in terms of cost efficiency and analysis. Smart meters collect and send electricity consumption data at fixed intervals, which generates a large amount of data. In our view, Hadoop is the best-suited software for handling such big data. Hadoop is infrastructure developed by Apache and a leading distributed computing platform. Hadoop was derived from Google's MapReduce and Google File System (GFS). Hadoop is an open-source platform and works in a master-slave configuration. Hadoop runs on commodity hardware. In our implementation, HDFS (Hadoop Distributed File System) acts as our database, where we store all the data, including electricity consumption data and user account information.

GUI (Graphical User Interface): Our main module is the GUI, which presents the whole project that we have implemented. It describes the different functionalities provided to users of the system. Two types of user can access the system: the analyst and the consumer. Only authenticated users are allowed to use the system. Both types of user can view their electricity consumption data based on the different parameters (month, etc.) available in the system. Depending on the parameters they select, a graph is generated so that they can analyze electricity consumption and pursue the objectives mentioned above. The main aim of the GUI is to provide a flexible and efficient system to users.

ANALYSIS MODEL 1:
Module: Energy Consumption Prediction
Functionality: Following functionality is
General Workflow: Following steps are included in general workflow of module
1. Login
2. Select Parameters
3. Select Graph
Dependency: Following module on which
Table I Analysis Model 1

ANALYSIS MODEL 2:
Module: Drill Down
Functionality: Following functionality is
General Workflow: Following steps are included in general workflow of module
1. Login
2. Select Parameters
3. Select Graph
Dependency: Following module on which
Table II Analysis Model 2

CONSUMER MODEL:
Module: Customer
Functionality: Following functionality is
General Workflow: Following steps are included in general workflow of module
1. Login
2. Select Parameters
3. Select Graph
Dependency: Following module on which
Table III Customer Model

Fig. 4 shows the comparison of energy consumption in the month of February, and Fig. 5 shows the Hadoop job running on slave 3.

VI. CONCLUSION
Effective use of smart metering technology is crucial for reaching the 2020 energy efficiency and renewable energy targets and for future smart grids. Because energy costs and environmental issues are gaining in importance, rational energy use is a must for a growing group of companies, municipalities and public organizations. They therefore need proper information about their consumption and its distribution between different activities. Smart meter data analytics can give them a complete picture of their energy use, costs and potential for savings, enabling effective energy management. Smart meters send energy consumption data at short intervals, which generates big data. Time and storage are two important factors in building any application. Our solution for handling such big data is Hadoop.

VII. FUTURE SCOPE
Techniques for analyzing smart meter energy consumption data will help in understanding how customer, environmental and structural features affect usage. Customer behavior can be analyzed easily from this data, which will help to reduce energy consumption. An Outage Management System (OMS) allows a utility to better manage power outages and restoration events and to reduce outage duration and costs. Demand response reduces CO2 emissions by avoiding the use of polluting power plants; it also reduces peak prices by avoiding expensive peak-load production. Feedback and load management increase customers' awareness of their energy consumption, which in turn reduces consumption. Smart meter data can also be analyzed in relation to weather. In India, the main challenge is installing smart meters across the country, as internet access is still not available in most regions.

REFERENCES
[1] Amir-Hamed Mohsenian-Rad and Alberto Leon-Garcia, "Optimal Residential Load Control With Price Prediction in Real-Time Electricity Pricing Environments," IEEE Transactions on Smart Grid, vol. 1, no. 2, September 2010.
[2] Ning Lu, Pengwei Du, Xinxin Guo, and Frank L. Greitzer, "Smart Meter Data Analysis."
[3] Dr. Sanjay Gupta, Ashish Mohan Tiwari, Cognizant, "How to Generate Greater Value from Smart Meter Data."
[4] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, Google, "The Google File System."
[5] Amir-Hamed Mohsenian-Rad and Alberto Leon-Garcia, "Optimal Residential Load Control With Price Prediction in Real-Time Electricity Pricing Environments," IEEE Transactions on Smart Grid, vol. 1, no. 2, September 201
[6] Li Ping Qian, Ying Jun (Angela) Zhang, Jianwei Huang, and Yuan Wu, "Demand Response Management via Real-Time Electricity Price Control in Smart Grids," IEEE Journal on Selected Areas in Communications, vol. 31, no. 7, July 2013.
[7] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels, Amazon.com, "Dynamo: Amazon's Highly Available Key-value Store."
[8] Fahad Javed, Naveed Arshad, Fredrik Wallin, Iana Vassileva, Erik Dahlquist, "Forecasting for demand response in smart grids: An analysis on use of anthropologic and structural data and short term multiple loads forecasting."
[9] Shannon Warner, Samrat Sen, Cognizant, "How Predictive Analytics Elevate Airlines Customer Centricity and Competitive Advantage."
Cloud SQL Security. Swati Srivastava 1 and Meenu 2. Engineering College., Gorakhpur, U.P. Gorakhpur, U.P. Abstract
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 4, Number 5 (2014), pp. 479-484 International Research Publications House http://www. irphouse.com /ijict.htm Cloud
Advances in Natural and Applied Sciences
AENSI Journals Advances in Natural and Applied Sciences ISSN:1995-0772 EISSN: 1998-1090 Journal home page: www.aensiweb.com/anas Clustering Algorithm Based On Hadoop for Big Data 1 Jayalatchumy D. and
CLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES
CLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES 1 MYOUNGJIN KIM, 2 CUI YUN, 3 SEUNGHO HAN, 4 HANKU LEE 1,2,3,4 Department of Internet & Multimedia Engineering,
Design of Electric Energy Acquisition System on Hadoop
, pp.47-54 http://dx.doi.org/10.14257/ijgdc.2015.8.5.04 Design of Electric Energy Acquisition System on Hadoop Yi Wu 1 and Jianjun Zhou 2 1 School of Information Science and Technology, Heilongjiang University
Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
Empowering intelligent utility networks with visibility and control
IBM Software Energy and Utilities Thought Leadership White Paper Empowering intelligent utility networks with visibility and control IBM Intelligent Metering Network Management software solution 2 Empowering
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Big Data Analysis using Hadoop components like Flume, MapReduce, Pig and Hive
Big Data Analysis using Hadoop components like Flume, MapReduce, Pig and Hive E. Laxmi Lydia 1,Dr. M.Ben Swarup 2 1 Associate Professor, Department of Computer Science and Engineering, Vignan's Institute
Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities
Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014
Highly available, scalable and secure data with Cassandra and DataStax Enterprise GOTO Berlin 27 th February 2014 About Us Steve van den Berg Johnny Miller Solutions Architect Regional Director Western
International Journal of Innovative Research in Computer and Communication Engineering
FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,
Big Data Collection and Utilization for Operational Support of Smarter Social Infrastructure
Hitachi Review Vol. 63 (2014), No. 1 18 Big Data Collection and Utilization for Operational Support of Smarter Social Infrastructure Kazuaki Iwamura Hideki Tonooka Yoshihiro Mizuno Yuichi Mashita OVERVIEW:
I. TODAY S UTILITY INFRASTRUCTURE vs. FUTURE USE CASES...1 II. MARKET & PLATFORM REQUIREMENTS...2
www.vitria.com TABLE OF CONTENTS I. TODAY S UTILITY INFRASTRUCTURE vs. FUTURE USE CASES...1 II. MARKET & PLATFORM REQUIREMENTS...2 III. COMPLEMENTING UTILITY IT ARCHITECTURES WITH THE VITRIA PLATFORM FOR
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
White Paper. Big Data and Hadoop. Abhishek S, Java COE. Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP
White Paper Big Data and Hadoop Abhishek S, Java COE www.marlabs.com Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP Table of contents Abstract.. 1 Introduction. 2 What is Big
AMI and DA Convergence: Enabling Energy Savings through Voltage Conservation
AMI and DA Convergence: Enabling Energy Savings through Voltage Conservation September 2010 Prepared for: By Sierra Energy Group The Research & Analysis Division of Energy Central Table of Contents Executive
Cloud computing - Architecting in the cloud
Cloud computing - Architecting in the cloud [email protected] 1 Outline Cloud computing What is? Levels of cloud computing: IaaS, PaaS, SaaS Moving to the cloud? Architecting in the cloud Best practices
Distributed File Systems
Distributed File Systems Mauro Fruet University of Trento - Italy 2011/12/19 Mauro Fruet (UniTN) Distributed File Systems 2011/12/19 1 / 39 Outline 1 Distributed File Systems 2 The Google File System (GFS)
International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]
