Network Traffic Analysis using HADOOP Architecture. Zeng Shan ISGC2013, Taibei
|
|
- Beverley Gardner
- 8 years ago
- Views:
Transcription
1 Network Traffic Analysis using HADOOP Architecture Zeng Shan ISGC2013, Taibei
2 Flow VS Packet what are netflows? Outlines Flow tools used in the system nprobe nfdump Introduction to Hadoop HDFS Map/Reduce Architecture of our system Function Display
3 Flow VS Packet
4 What is a flow? Network flow is a sequence of packets From a source computer to a destination, which may be another host, a multicast group, or a broadcast domain. A network flow measures sequences of IP packets sharing a common feature as they pass between devices. Flow format: NetFlow(Cisco) J-Flow(Juniper) Sflow(HP).
5 Netflows Netflows concept developed by Cisco Defines a protocol and packet content for collecting network traffic data Several versions of the protocol, v5 most widespread Netflow v5 A flow is a unique tuple of (router port,source IP+port,dest IP+port,TCP protocol) A flow also contains TCP flags and numbers of packets and bytes transferred Flows are cached on the router and sent regularly to a recording node If the load on a router becomes high, flows can get lost as the cache may be flushed Typically more than 2000 Flows/s can be recorded
6 Flow VS Packet Flows contain relevant information to characterize and analyze the traffic Flows contain less information than packets, so recording the flows will not lead to the high network loads Storing flows, comparing with storing packets, requires much less disk space 1 day traffic in IHEP is roughly 1.5G which is stored in flows Instead of recording all flows, sampling could also be used Flows can be delivered by the router or created from packets by nprobe nprobe is an alternative if the router doesn t output netflows
7 Flow Tools used in the System
8 What is nprobe? nprobe is an open source tools Capture packets flowing on a Ethernet segment, computes flows and export them to the specified collectors. Features: Ability to keep up with Gbit speeds on Ethernet networks handling thousand of packets per second without packet sampling on commodity hardware. Support for major OS including Unix, Windows and Mac OS X it is designed for environments with limited resources
9 nfdump fulfils two tasks: data storage and data filtering/resolution nfcapd: does receive flows on a configurable port and stores it every n minutes nfdump: reads dump files containing flow data, filters it and resolve it to readable files nfdump is IPv6 enabled Current stable version is nfdump usage is very similar to tcpdump, especially for filtering rules Example: Looking for any activity of a host with IP on Mar from 11:00 to 11:59am nfdump R nfcapd :nfcapd ip
10
11 Information contains in nfdump output Tag Description Tag Description %ts Start Time - first seen %in Input Interface num %te End Time - last seen %out Output Interface num %td Duration %pkt Packets counts in this flow %pr Protocol %byt Bytes count in this flow %sa Source Address %fl Number of flows. %da Destination Address %flg TCP Flags %sap Source Address: Port %tos Type of Service %dap Destination Address:Port %bps bits per second %sp Source Port %pps packets per second %dp Destination Port %bpp Bytes per package %sas Source AS %das Destination AS
12
13 Introduction to Hadoop
14 What can Hadoop do? Hadoop is an open-source software framework. It was originally developed to support distribution for the Nutch search engine project. Licensed under the Apache v2 license. Supports data-intensive distributed applications. It enables applications to work with thousands of computation-independent computers and petabytes of data Storage: HDFS Compute: Map/Reduce
15 HDFS HDFS is short for Hadoop Distributed File System Differences from other distributed file systems: highly fault-tolerant designed to be deployed on low-cost hardware. Portability across heterogeneous hardware and software platforms Applications run on HDFS need streaming access to their data sets provides high throughput access to application data suitable for applications that have large data sets Moving computation is cheaper than moving data
16 HDFS master/slave architecture NameNode Manages name space of the file system Regulates access to files by clients Determine the mapping of blocks to DataNodes DataNodes Responsible for serving read and write requests from the clients Perform block creation, deletion and replication upon the instructions from NameNode
17 Data Replication in HDFS To ensure the fault tolerance in HDFS, the blocks of a file are replicated, the replicas of a block can be specified by the application
18 Map/Reduce MapReduce is a programming model for processing large data sets MapReduce is typically used to do distributed computing on clusters of computers MapReduce can take advantage of locality of data, processing data on or near the storage assets to decrease transmission of data. The model is inspired by the map and reduce functions "Map" step: The master node takes the input, divides it into smaller sub-problems, and distributes them to slave nodes. The slave node processes the smaller problem, and passes the answer back to its master node. "Reduce" step: The master node then collects the answers of all the sub-problems and combines them in some way to form the final output
19 Architecture of our System
20 Architecture
21 Drawing tools RRDtool acronym for round-robin database tool The data are stored in a round-robin database(circular buffer) It also includes tools to extract RRD data in a graphical format drawing flow trend graph in three dimensionality: Flow count, Packet count, Traffic count Highstock Highstock lets you create stock or general timeline charts in pure JavaScript including sophisticated navigation options like a small navigator series, preset date ranges, date picker, scrolling and panning. We just need to write API between HDFS and Highstock
22 Function Display
23
24
25
26
27 Summary Network trend chart for IHEP every 5 minutes in three dimension: Flow count/packet count/traffic count Detailed traffic information chart select a single timeslot and get the detailed traffic information select a time window and get the detailed traffic information on hovering the chart, a tooltip text with traffic information on each point and series can be displayed. the tooltip follows as the user moves the mouse over the graph Traffic information related to the HEP experiments Once the IP address of the machines related to the data transferring of the HEP experiment is specified we already have DYB/YBJ/CMS/ALTAS Intrusion Detection is also possible Writing rules, generating alerts
28 Thank You Questions?
Network Traffic Analysis using HADOOP Architecture. Shan Zeng HEPiX, Beijing 17 Oct 2012
Network Traffic Analysis using HADOOP Architecture Shan Zeng HEPiX, Beijing 17 Oct 2012 Outline Introduction to Hadoop Traffic Information Capture Traffic Information Resolution Traffic Information Storage
More informationHadoop Architecture. Part 1
Hadoop Architecture Part 1 Node, Rack and Cluster: A node is simply a computer, typically non-enterprise, commodity hardware for nodes that contain data. Consider we have Node 1.Then we can add more nodes,
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationHadoop Distributed File System. Jordan Prosch, Matt Kipps
Hadoop Distributed File System Jordan Prosch, Matt Kipps Outline - Background - Architecture - Comments & Suggestions Background What is HDFS? Part of Apache Hadoop - distributed storage What is Hadoop?
More informationHDFS Architecture Guide
by Dhruba Borthakur Table of contents 1 Introduction... 3 2 Assumptions and Goals... 3 2.1 Hardware Failure... 3 2.2 Streaming Data Access...3 2.3 Large Data Sets... 3 2.4 Simple Coherency Model...3 2.5
More informationIntroduction to Netflow
Introduction to Netflow Mike Jager Network Startup Resource Center mike.jager@synack.co.nz These materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (http://creativecommons.org/licenses/by-nc/4.0/)
More informationNetwork Monitoring and Management NetFlow Overview
Network Monitoring and Management NetFlow Overview These materials are licensed under the Creative Commons Attribution-Noncommercial 3.0 Unported license (http://creativecommons.org/licenses/by-nc/3.0/)
More informationNfSen Plugin Supporting The Virtual Network Monitoring
NfSen Plugin Supporting The Virtual Network Monitoring Vojtěch Krmíček krmicek@liberouter.org Pavel Čeleda celeda@ics.muni.cz Jiří Novotný novotny@cesnet.cz Part I Monitoring of Virtual Network Environments
More informationHadoop IST 734 SS CHUNG
Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to
More informationTake An Internal Look at Hadoop. Hairong Kuang Grid Team, Yahoo! Inc hairong@yahoo-inc.com
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc hairong@yahoo-inc.com What s Hadoop Framework for running applications on large clusters of commodity hardware Scale: petabytes of data
More informationFault Tolerance in Hadoop for Work Migration
1 Fault Tolerance in Hadoop for Work Migration Shivaraman Janakiraman Indiana University Bloomington ABSTRACT Hadoop is a framework that runs applications on large clusters which are built on numerous
More informationnfdump and NfSen 18 th Annual FIRST Conference June 25-30, 2006 Baltimore Peter Haag 2006 SWITCH
18 th Annual FIRST Conference June 25-30, 2006 Baltimore Peter Haag 2006 SWITCH Some operational questions, popping up now and then: Do you see this peek on port 445 as well? What caused this peek on your
More informationAn overview of traffic analysis using NetFlow
The LOBSTER project An overview of traffic analysis using NetFlow Arne Øslebø UNINETT Arne.Oslebo@uninett.no 1 Outline What is Netflow? Available tools Collecting Processing Detailed analysis security
More informationHadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
More informationEXPERIMENTATION. HARRISON CARRANZA School of Computer Science and Mathematics
BIG DATA WITH HADOOP EXPERIMENTATION HARRISON CARRANZA Marist College APARICIO CARRANZA NYC College of Technology CUNY ECC Conference 2016 Poughkeepsie, NY, June 12-14, 2016 Marist College AGENDA Contents
More informationNetwork forensics 101 Network monitoring with Netflow, nfsen + nfdump
Network forensics 101 Network monitoring with Netflow, nfsen + nfdump www.enisa.europa.eu Agenda Intro to netflow Metrics Toolbox (Nfsen + Nfdump) Demo www.enisa.europa.eu 2 What is Netflow Netflow = Netflow
More informationDistributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
More informationDistributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
More informationCase Study : 3 different hadoop cluster deployments
Case Study : 3 different hadoop cluster deployments Lee moon soo moon@nflabs.com HDFS as a Storage Last 4 years, our HDFS clusters, stored Customer 1500 TB+ data safely served 375,000 TB+ data to customer
More informationTHE HADOOP DISTRIBUTED FILE SYSTEM
THE HADOOP DISTRIBUTED FILE SYSTEM Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Presented by Alexander Pokluda October 7, 2013 Outline Motivation and Overview of Hadoop Architecture,
More informationThe ntop Project: Open Source Network Monitoring
The ntop Project: Open Source Network Monitoring Luca Deri 1 Agenda 1. What can ntop do for me? 2. ntop and network security 3. Integration with commercial protocols 4. Embedding ntop 5. Work in
More informationApache Hadoop new way for the company to store and analyze big data
Apache Hadoop new way for the company to store and analyze big data Reyna Ulaque Software Engineer Agenda What is Big Data? What is Hadoop? Who uses Hadoop? Hadoop Architecture Hadoop Distributed File
More informationHadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela
Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance
More informationLarge scale processing using Hadoop. Ján Vaňo
Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine
More informationWho is Generating all This Traffic?
Who is Generating all This Traffic? Network Monitoring in Practice Luca Deri Who s ntop.org? Started in 1998 as open-source monitoring project for developing an easy to use passive monitoring
More informationPrepared By : Manoj Kumar Joshi & Vikas Sawhney
Prepared By : Manoj Kumar Joshi & Vikas Sawhney General Agenda Introduction to Hadoop Architecture Acknowledgement Thanks to all the authors who left their selfexplanatory images on the internet. Thanks
More informationOpen Source in Network Administration: the ntop Project
Open Source in Network Administration: the ntop Project Luca Deri 1 Project History Started in 1997 as monitoring application for the Univ. of Pisa 1998: First public release v 0.4 (GPL2) 1999-2002:
More informationApache Hadoop. Alexandru Costan
1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open
More informationNetflow Collection with AlienVault Alienvault 2013
Netflow Collection with AlienVault Alienvault 2013 CONFIGURE Configuring NetFlow Capture of TCP/IP Traffic from an AlienVault Sensor or Remote Hardware Level: Beginner to Intermediate Netflow Collection
More informationvsphere Networking vsphere 6.0 ESXi 6.0 vcenter Server 6.0 EN-001391-01
vsphere 6.0 ESXi 6.0 vcenter Server 6.0 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more
More informationSector vs. Hadoop. A Brief Comparison Between the Two Systems
Sector vs. Hadoop A Brief Comparison Between the Two Systems Background Sector is a relatively new system that is broadly comparable to Hadoop, and people want to know what are the differences. Is Sector
More information[Optional] Network Visibility with NetFlow
[Optional] Network Visibility with NetFlow TELE301 Laboratory Manual Contents 1 NetFlow Architecture........................... 1 2 NetFlow Versions.............................. 2 3 Requirements Analysis...........................
More informationLarge-Scale TCP Packet Flow Analysis for Common Protocols Using Apache Hadoop
Large-Scale TCP Packet Flow Analysis for Common Protocols Using Apache Hadoop R. David Idol Department of Computer Science University of North Carolina at Chapel Hill david.idol@unc.edu http://www.cs.unc.edu/~mxrider
More informationWatch your Flows with NfSen and NFDUMP 50th RIPE Meeting May 3, 2005 Stockholm Peter Haag
Watch your Flows with NfSen and NFDUMP 50th RIPE Meeting May 3, 2005 Stockholm Peter Haag 2005 SWITCH What I am going to present: The Motivation. What are NfSen and nfdump? The Tools in Action. Outlook
More informationHadoop Technology for Flow Analysis of the Internet Traffic
Hadoop Technology for Flow Analysis of the Internet Traffic Rakshitha Kiran P PG Scholar, Dept. of C.S, Shree Devi Institute of Technology, Mangalore, Karnataka, India ABSTRACT: Flow analysis of the internet
More informationIntroduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
More informationBuilding & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp
Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp Introduction to Hadoop Comes from Internet companies Emerging big data storage and analytics platform HDFS and MapReduce
More informationViete, čo robia Vaši užívatelia na sieti? Roman Tuchyňa, CSA
Viete, čo robia Vaši užívatelia na sieti? Roman Tuchyňa, CSA What is ReporterAnalyzer? ReporterAnalyzer gives network professionals insight into how application traffic is impacting network performance.
More informationConfiguring Flexible NetFlow
CHAPTER 62 Note Flexible NetFlow is only supported on Supervisor Engine 7-E, Supervisor Engine 7L-E, and Catalyst 4500X. Flow is defined as a unique set of key fields attributes, which might include fields
More informationCisco NetFlow TM Briefing Paper. Release 2.2 Monday, 02 August 2004
Cisco NetFlow TM Briefing Paper Release 2.2 Monday, 02 August 2004 Contents EXECUTIVE SUMMARY...3 THE PROBLEM...3 THE TRADITIONAL SOLUTIONS...4 COMPARISON WITH OTHER TECHNIQUES...6 CISCO NETFLOW OVERVIEW...7
More informationProcessing of Hadoop using Highly Available NameNode
Processing of Hadoop using Highly Available NameNode 1 Akash Deshpande, 2 Shrikant Badwaik, 3 Sailee Nalawade, 4 Anjali Bote, 5 Prof. S. P. Kosbatwar Department of computer Engineering Smt. Kashibai Navale
More informationNETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE Anjali P P 1 and Binu A 2 1 Department of Information Technology, Rajagiri School of Engineering and Technology, Kochi. M G University, Kerala
More informationLab VI Capturing and monitoring the network traffic
Lab VI Capturing and monitoring the network traffic 1. Goals To gain general knowledge about the network analyzers and to understand their utility To learn how to use network traffic analyzer tools (Wireshark)
More informationScalable Extraction, Aggregation, and Response to Network Intelligence
Scalable Extraction, Aggregation, and Response to Network Intelligence Agenda Explain the two major limitations of using Netflow for Network Monitoring Scalability and Visibility How to resolve these issues
More informationWelcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components
Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components of Hadoop. We will see what types of nodes can exist in a Hadoop
More informationGraySort and MinuteSort at Yahoo on Hadoop 0.23
GraySort and at Yahoo on Hadoop.23 Thomas Graves Yahoo! May, 213 The Apache Hadoop[1] software library is an open source framework that allows for the distributed processing of large data sets across clusters
More informationTutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data
More informationIPV6 流 量 分 析 探 讨 北 京 大 学 计 算 中 心 周 昌 令
IPV6 流 量 分 析 探 讨 北 京 大 学 计 算 中 心 周 昌 令 1 内 容 流 量 分 析 简 介 IPv6 下 的 新 问 题 和 挑 战 协 议 格 式 变 更 用 户 行 为 特 征 变 更 安 全 问 题 演 化 流 量 导 出 手 段 变 化 设 备 参 考 配 置 流 量 工 具 总 结 2 流 量 分 析 简 介 流 量 分 析 目 标 who, what, where,
More informationNetFlow Analysis with MapReduce
NetFlow Analysis with MapReduce Wonchul Kang, Yeonhee Lee, Youngseok Lee Chungnam National University {teshi85, yhlee06, lee}@cnu.ac.kr 2010.04.24(Sat) based on "An Internet Traffic Analysis Method with
More informationNetwork Management & Monitoring
Network Management & Monitoring NetFlow Overview These materials are licensed under the Creative Commons Attribution-Noncommercial 3.0 Unported license (http://creativecommons.org/licenses/by-nc/3.0/)
More informationBig Data Testbed for Research and Education Networks Analysis. SomkiatDontongdang, PanjaiTantatsanawong, andajchariyasaeung
Big Data Testbed for Research and Education Networks Analysis SomkiatDontongdang, PanjaiTantatsanawong, andajchariyasaeung Research and Education Networks ThaiREN is a specialized Internet Service Provider
More informationNetflow Overview. PacNOG 6 Nadi, Fiji
Netflow Overview PacNOG 6 Nadi, Fiji Agenda Netflow What it is and how it works Uses and Applications Vendor Configurations/ Implementation Cisco and Juniper Flow-tools Architectural issues Software, tools
More informationInternational Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 8, August 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationNetwork Monitoring and Traffic CSTNET, CNIC
Network Monitoring and Traffic Analysis in CSTNET Chunjing Han Aug. 2013 CSTNET, CNIC Topics 1. The background of network monitoring 2. Network monitoring protocols and related tools 3. Network monitoring
More informationNetFlow Performance Analysis
NetFlow Performance Analysis Last Updated: May, 2007 The Cisco IOS NetFlow feature set allows for the tracking of individual IP flows as they are received at a Cisco router or switching device. Network
More informationMapReduce Job Processing
April 17, 2012 Background: Hadoop Distributed File System (HDFS) Hadoop requires a Distributed File System (DFS), we utilize the Hadoop Distributed File System (HDFS). Background: Hadoop Distributed File
More informationHadoop. Apache Hadoop is an open-source software framework for storage and large scale processing of data-sets on clusters of commodity hardware.
Hadoop Source Alessandro Rezzani, Big Data - Architettura, tecnologie e metodi per l utilizzo di grandi basi di dati, Apogeo Education, ottobre 2013 wikipedia Hadoop Apache Hadoop is an open-source software
More informationCisco IOS Flexible NetFlow Technology
Cisco IOS Flexible NetFlow Technology Last Updated: December 2008 The Challenge: The ability to characterize IP traffic and understand the origin, the traffic destination, the time of day, the application
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More informationLecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop
Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social
More informationNetFlow-Lite offers network administrators and engineers the following capabilities:
Solution Overview Cisco NetFlow-Lite Introduction As networks become more complex and organizations enable more applications, traffic patterns become more diverse and unpredictable. Organizations require
More informationNTT DOCOMO Technical Journal. Large-Scale Data Processing Infrastructure for Mobile Spatial Statistics
Large-scale Distributed Data Processing Big Data Service Platform Mobile Spatial Statistics Supporting Development of Society and Industry Population Estimation Using Mobile Network Statistical Data and
More informationLog Management with Open-Source Tools. Risto Vaarandi SEB Estonia
Log Management with Open-Source Tools Risto Vaarandi SEB Estonia Outline Why use open source tools for log management? Widely used logging protocols and recently introduced new standards Open-source syslog
More informationLimitations of Packet Measurement
Limitations of Packet Measurement Collect and process less information: Only collect packet headers, not payload Ignore single packets (aggregate) Ignore some packets (sampling) Make collection and processing
More informationHadoop at Yahoo! Owen O Malley Yahoo!, Grid Team owen@yahoo-inc.com
Hadoop at Yahoo! Owen O Malley Yahoo!, Grid Team owen@yahoo-inc.com Who Am I? Yahoo! Architect on Hadoop Map/Reduce Design, review, and implement features in Hadoop Working on Hadoop full time since Feb
More informationCSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop)
CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2016 MapReduce MapReduce is a programming model
More informationNoSQL and Hadoop Technologies On Oracle Cloud
NoSQL and Hadoop Technologies On Oracle Cloud Vatika Sharma 1, Meenu Dave 2 1 M.Tech. Scholar, Department of CSE, Jagan Nath University, Jaipur, India 2 Assistant Professor, Department of CSE, Jagan Nath
More informationvsphere Networking vsphere 5.5 ESXi 5.5 vcenter Server 5.5 EN-001074-02
vsphere 5.5 ESXi 5.5 vcenter Server 5.5 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more
More informationHow To Analyze Network Traffic With Mapreduce On A Microsoft Server On A Linux Computer (Ahem) On A Network (Netflow) On An Ubuntu Server On An Ipad Or Ipad (Netflower) On Your Computer
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pig and Typical Mapreduce Anjali P P and Binu A Department of Information Technology, Rajagiri School of Engineering and Technology,
More informationUser Documentation nfdump & NfSen
User Documentation nfdump & NfSen 1 NFDUMP This is the combined documentation of nfdump & NfSen. Both tools are distributed under the BSD license and can be downloaded at nfdump http://sourceforge.net/projects/nfdump/
More informationHow To Monitor And Test An Ethernet Network On A Computer Or Network Card
3. MONITORING AND TESTING THE ETHERNET NETWORK 3.1 Introduction The following parameters are covered by the Ethernet performance metrics: Latency (delay) the amount of time required for a frame to travel
More informationOpen source Google-style large scale data analysis with Hadoop
Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical
More informationBigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic
BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop
More informationNoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
More informationJournal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
Journal of science e ISSN 2277-3290 Print ISSN 2277-3282 Information Technology www.journalofscience.net STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) S. Chandra
More informationAccelerating and Simplifying Apache
Accelerating and Simplifying Apache Hadoop with Panasas ActiveStor White paper NOvember 2012 1.888.PANASAS www.panasas.com Executive Overview The technology requirements for big data vary significantly
More informationDetection of Distributed Denial of Service Attack with Hadoop on Live Network
Detection of Distributed Denial of Service Attack with Hadoop on Live Network Suchita Korad 1, Shubhada Kadam 2, Prajakta Deore 3, Madhuri Jadhav 4, Prof.Rahul Patil 5 Students, Dept. of Computer, PCCOE,
More informationBig Data Storage Options for Hadoop Sam Fineberg, HP Storage
Sam Fineberg, HP Storage SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations
More informationGeneric Log Analyzer Using Hadoop Mapreduce Framework
Generic Log Analyzer Using Hadoop Mapreduce Framework Milind Bhandare 1, Prof. Kuntal Barua 2, Vikas Nagare 3, Dynaneshwar Ekhande 4, Rahul Pawar 5 1 M.Tech(Appeare), 2 Asst. Prof., LNCT, Indore 3 ME,
More informationApplications of Passive Message Logging and TCP Stream Reconstruction to Provide Application-Level Fault Tolerance. Sunny Gleason COM S 717
Applications of Passive Message Logging and TCP Stream Reconstruction to Provide Application-Level Fault Tolerance Sunny Gleason COM S 717 December 17, 2001 0.1 Introduction The proliferation of large-scale
More informationTP1: Getting Started with Hadoop
TP1: Getting Started with Hadoop Alexandru Costan MapReduce has emerged as a leading programming model for data-intensive computing. It was originally proposed by Google to simplify development of web
More informationLecture 2 (08/31, 09/02, 09/09): Hadoop. Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015
Lecture 2 (08/31, 09/02, 09/09): Hadoop Decisions, Operations & Information Technologies Robert H. Smith School of Business Fall, 2015 K. Zhang BUDT 758 What we ll cover Overview Architecture o Hadoop
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationDistributed File Systems
Distributed File Systems Mauro Fruet University of Trento - Italy 2011/12/19 Mauro Fruet (UniTN) Distributed File Systems 2011/12/19 1 / 39 Outline 1 Distributed File Systems 2 The Google File System (GFS)
More informationFlow Based Traffic Analysis
Flow based Traffic Analysis Muraleedharan N C-DAC Bangalore Electronics City murali@ncb.ernet.in Challenges in Packet level traffic Analysis Network traffic grows in volume and complexity Capture and decode
More informationOverview of Network Traffic Analysis
Overview of Network Traffic Analysis Network Traffic Analysis identifies which users or applications are generating traffic on your network and how much network bandwidth they are consuming. For example,
More informationIPv6 Traffic Analysis and Storage
Report from HEPiX 2012: Network, Security and Storage david.gutierrez@cern.ch Geneva, November 16th Network and Security Network traffic analysis Updates on DC Networks IPv6 Ciber-security updates Federated
More informationMonitoring high-speed networks using ntop. Luca Deri <deri@ntop.org>
Monitoring high-speed networks using ntop Luca Deri 1 Project History Started in 1997 as monitoring application for the Univ. of Pisa 1998: First public release v 0.4 (GPL2) 1999-2002:
More informationTransport and Network Layer
Transport and Network Layer 1 Introduction Responsible for moving messages from end-to-end in a network Closely tied together TCP/IP: most commonly used protocol o Used in Internet o Compatible with a
More informationLog Management with Open-Source Tools. Risto Vaarandi rvaarandi 4T Y4H00 D0T C0M
Log Management with Open-Source Tools Risto Vaarandi rvaarandi 4T Y4H00 D0T C0M Outline Why do we need log collection and management? Why use open source tools? Widely used logging protocols and recently
More informationHadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science
A Seminar report On Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science SUBMITTED TO: www.studymafia.org SUBMITTED BY: www.studymafia.org
More informationIPTV Traffic Monitoring System with IPFIX/PSAMP
IPTV Traffic Monitoring System with IPFIX/PSAMP Shingo Kashima NTT Information Sharing Platform Laboratories 3rd NMRG Workshop 2010 NTT Information Sharing Platform Laboratories Outline Introduction Motivation
More informationplixer Scrutinizer Competitor Worksheet Visualization of Network Health Unauthorized application deployments Detect DNS communication tunnels
Scrutinizer Competitor Worksheet Scrutinizer Malware Incident Response Scrutinizer is a massively scalable, distributed flow collection system that provides a single interface for all traffic related to
More informationHadoop implementation of MapReduce computational model. Ján Vaňo
Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed
More informationMASSIVE DATA PROCESSING (THE GOOGLE WAY ) 27/04/2015. Fundamentals of Distributed Systems. Inside Google circa 2015
7/04/05 Fundamentals of Distributed Systems CC5- PROCESAMIENTO MASIVO DE DATOS OTOÑO 05 Lecture 4: DFS & MapReduce I Aidan Hogan aidhog@gmail.com Inside Google circa 997/98 MASSIVE DATA PROCESSING (THE
More informationMachine Learning Algorithm for Pro-active Fault Detection in Hadoop Cluster
Malaviya National Institute of Technology Department of Computer Science & Engineering Machine Learning Algorithm for Pro-active Fault Detection in Hadoop Cluster Winter Internship of: Agrima Seth Advisor:
More informationHands on Workshop. Network Performance Monitoring and Multicast Routing. Yasuichi Kitamura NICT Jin Tanaka KDDI/NICT APAN-JP NOC
Hands on Workshop Network Performance Monitoring and Multicast Routing Yasuichi Kitamura NICT Jin Tanaka KDDI/NICT APAN-JP NOC July 18th TEIN2 Site Coordination Workshop Network Performance Monitoring
More informationVisuSniff: A Tool For The Visualization Of Network Traffic
VisuSniff: A Tool For The Visualization Of Network Traffic Rainer Oechsle University of Applied Sciences, Trier Postbox 1826 D-54208 Trier +49/651/8103-508 oechsle@informatik.fh-trier.de Oliver Gronz University
More informationMapReduce, Hadoop and Amazon AWS
MapReduce, Hadoop and Amazon AWS Yasser Ganjisaffar http://www.ics.uci.edu/~yganjisa February 2011 What is Hadoop? A software framework that supports data-intensive distributed applications. It enables
More informationIP address format: Dotted decimal notation: 10000000 00001011 00000011 00011111 128.11.3.31
IP address format: 7 24 Class A 0 Network ID Host ID 14 16 Class B 1 0 Network ID Host ID 21 8 Class C 1 1 0 Network ID Host ID 28 Class D 1 1 1 0 Multicast Address Dotted decimal notation: 10000000 00001011
More information