Requirement Specification Document Of NSC Open Source Project. Hadoop A Molecular Docking Simulation System based on Hadoop Platform
|
|
- Quentin Williamson
- 8 years ago
- Views:
Transcription
1 Requirement Specification Document Of NSC Open Source Project Hadoop A Molecular Docking Simulation System based on Hadoop Platform E Department of Medical Informatics National Science Council, Taiwan 2010/11/23
2 (Contents) 1 (Introduction) 1.1 System) (Purpose) (Identification) (Overview) (Controlling Documents) (Document) (Purpose) (Acceptance Criteria) (Notation Description) (Priority Definition) Hadoop 2.1 (System Description) (Interface Requirements) (Internal Interface Requirements) (External Interface Requirements) (User Interfaces Requirements) (Function Requirements) (Performance Requirements) (Test Requirements) (System Test Requirement) (Acceptance Criteria) (Other Requirements) (Reliability Requirement) (Delivery Requirement) (Installation Requirement) (Environment Requirement) (Operational Concept) (Scenario 1) (Design and Implementation Constrains) (Technological Limitations) (End User Issues) (Risk Management)...11
3 1 (Introduction) 1.1 (System) (Purpose) Hadoop Hadoop MaReduce HDFS(Hadoop File System) Autodock[1] (Genetic Algorithm GA)[2] Hadoop MapReduce 1
4 1.1.2 (Identification) Hadoop (A Molecular Docking Simulation System based on Hadoop Platform, MDSH) (Overview) Hadoop Hadoop MapReduce HDFS Hadoop (Controlling Documents) MDSH Capability Maturity Model-Integrated v1.2 (CMMI v1.2; ) 2
5 1.2 (Document) (Purpose) MDSH (Acceptance Criteria) (Clearly and properly stated) (Completely) (Consistently) (Uniquely Identified) (Appropriately implement) (Verifiably) (Notation Description) Notation Description MDSH The MDSH system will be labeled with the number MDSH-F-xx MDSH-N-xx MDSH (Functional Requirements) MDSH (Non-Functional Requirements) (Priority Definition) No Name Description 1 Critical 2 Important 3
6 3 Desirable 4 Unnecessary 4
7 2 Hadoop (MDSH 1.0.0) 2.1 (System Description) (Molecular Docking) (ligand) (receptor) Fisher E.[3] complementarity pre-organization (EX: ) 1958 Koshland[4] induced fit ( ) (
8 3 UCSF Kuntz DOCK (anchor and grow) Autodock (Genetic Algorithm GA) Hadoop Hadoop MapReduce HDFS HDFS (NameNode) (DataNode) MapReduce JobTracker TaskTracker JobTracker TaskTracker TaskTracker TaskTracker JobTracker JobTracker MapReduce map reduce key/value Hadoop MapReduce Hadoop 1. Autodock pdbqt 2. (1) pdbqt 6
9 (2) GA (3) (4) (X Y Z) (5) (2) (4) 3. GA Map Reduce MapReuce Hadoop 2009 [5] MapReducing SGAs(MapReducing Compact Genetic Algorithms) GA Hadoop MapReduce GA Hadoop 2009 [6] [5] GA Hadoop [5] HDFS I/O [6] map GA Map GA Map Map GA 2 4 Reduce Reduce Map map key 7
10 [5][6] GA (Interface Requirements) (Internal Interface Requirements) MDSH-N HDFS MDSH-N Hadoop MapReduce MDSH-N Hadoop MapReduce MDSH-N Autodock pdbqt (External Interface Requirements) MDSH-N MDSH-N
11 2.2.3 (User Interfaces Requirements) MDSH-N (Function Requirements) MDSH-F MDSH-F MDSH-F Map Reduce Function MDSH-F GA 2.4 (Performance Requirements) MDSH-N MDSH-N (Test Requirements) (System Test Requirement) MDSH-N MDSH-N MDSH-N MDSH-N (Acceptance Criteria) MDSH-N MDSH-N (Other Requirements) (Reliability Requirement) MDSH-N MDSH-N
12 2.6.2 (Delivery Requirement) MDSH-N Hadoop MDSH-N MDSH-N /06/ (Installation Requirement) MDSH-N Linux Hadoop MDSH-N java (Environment Requirement) MDSH-N ASUS RS-100 ubuntu MDSH-N CISCO Gigabit 2.7 (Operational Concept) (Scenario 1) 1. Autodock pdbqt 2. (.,,, ) 3. hadoop 4. 10
13 2.8 (Design and Implementation Constrains) MDSH-N java 7 MDSH-N clinet Server 2.9 (Technological Limitations) MDSH-N docking 2.10 (End User Issues) MDSH-N MDSH-N (Risk Management) MDSH-N Subversion MDSH-N
14 Reference 1. Autodock 2. J. H. Holland, Adaptation in natural and artificial systems : an introductory analysis with applications to biology, control, and artificial intelligence. Ann Arbor: University of Michigan Press, E. Fischer, "Einfluss der Configuration auf die Wirkung der Enzyme," Berichte der deutschen chemischen Gesellschaft, vol. 27, pp , D. E. Koshland, "Application of a Theory of Enzyme Specificity to Protein Synthesis," Proceedings of the National Academy of Sciences of the United States of America, vol. 44, pp , Feb D. Keco and A. Subasi, "Parallelization of genetic algorithms using Hadoop Map/Reduce," Southeast Europe Journal of Soft Computing, A. Verma, X. Llor, D. E. Goldberg, and R. H. Campbell, "Scaling Genetic Algorithms Using MapReduce," presented at the Proceedings of the 2009 Ninth International Conference on Intelligent Systems Design and Applications,
MapReduce Job Processing
April 17, 2012 Background: Hadoop Distributed File System (HDFS) Hadoop requires a Distributed File System (DFS), we utilize the Hadoop Distributed File System (HDFS). Background: Hadoop Distributed File
More informationHadoop Architecture. Part 1
Hadoop Architecture Part 1 Node, Rack and Cluster: A node is simply a computer, typically non-enterprise, commodity hardware for nodes that contain data. Consider we have Node 1.Then we can add more nodes,
More informationChapter 7. Using Hadoop Cluster and MapReduce
Chapter 7 Using Hadoop Cluster and MapReduce Modeling and Prototyping of RMS for QoS Oriented Grid Page 152 7. Using Hadoop Cluster and MapReduce for Big Data Problems The size of the databases used in
More informationWelcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components
Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components of Hadoop. We will see what types of nodes can exist in a Hadoop
More informationPerformance and Energy Efficiency of. Hadoop deployment models
Performance and Energy Efficiency of Hadoop deployment models Contents Review: What is MapReduce Review: What is Hadoop Hadoop Deployment Models Metrics Experiment Results Summary MapReduce Introduced
More informationA Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud
A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud Thuy D. Nguyen, Cynthia E. Irvine, Jean Khosalim Department of Computer Science Ground System Architectures Workshop
More informationIMPLEMENTING PREDICTIVE ANALYTICS USING HADOOP FOR DOCUMENT CLASSIFICATION ON CRM SYSTEM
IMPLEMENTING PREDICTIVE ANALYTICS USING HADOOP FOR DOCUMENT CLASSIFICATION ON CRM SYSTEM Sugandha Agarwal 1, Pragya Jain 2 1,2 Department of Computer Science & Engineering ASET, Amity University, Noida,
More informationWeekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay
Weekly Report Hadoop Introduction submitted By Anurag Sharma Department of Computer Science and Engineering Indian Institute of Technology Bombay Chapter 1 What is Hadoop? Apache Hadoop (High-availability
More informationmarlabs driving digital agility WHITEPAPER Big Data and Hadoop
marlabs driving digital agility WHITEPAPER Big Data and Hadoop Abstract This paper explains the significance of Hadoop, an emerging yet rapidly growing technology. The prime goal of this paper is to unveil
More informationLecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop
Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social
More informationVolume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies
Volume 3, Issue 6, June 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com Image
More informationCSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop)
CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2016 MapReduce MapReduce is a programming model
More informationThe Performance Characteristics of MapReduce Applications on Scalable Clusters
The Performance Characteristics of MapReduce Applications on Scalable Clusters Kenneth Wottrich Denison University Granville, OH 43023 wottri_k1@denison.edu ABSTRACT Many cluster owners and operators have
More informationMapReduce. Tushar B. Kute, http://tusharkute.com
MapReduce Tushar B. Kute, http://tusharkute.com What is MapReduce? MapReduce is a framework using which we can write applications to process huge amounts of data, in parallel, on large clusters of commodity
More informationHadoop Parallel Data Processing
MapReduce and Implementation Hadoop Parallel Data Processing Kai Shen A programming interface (two stage Map and Reduce) and system support such that: the interface is easy to program, and suitable for
More informationCopyright 2012, Oracle and/or its affiliates. All rights reserved.
1 Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions
More informationHow To Install Hadoop 1.2.1.1 From Apa Hadoop 1.3.2 To 1.4.2 (Hadoop)
Contents Download and install Java JDK... 1 Download the Hadoop tar ball... 1 Update $HOME/.bashrc... 3 Configuration of Hadoop in Pseudo Distributed Mode... 4 Format the newly created cluster to create
More informationTutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data
More informationThe Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform
The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, Ko-Chin Chang, and Wei-Tsong Lee Abstract Virtualization platform solutions
More informationOpen source Google-style large scale data analysis with Hadoop
Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical
More informationApache Hadoop new way for the company to store and analyze big data
Apache Hadoop new way for the company to store and analyze big data Reyna Ulaque Software Engineer Agenda What is Big Data? What is Hadoop? Who uses Hadoop? Hadoop Architecture Hadoop Distributed File
More informationTowards a Resource Aware Scheduler in Hadoop
Towards a Resource Aware Scheduler in Hadoop Mark Yong, Nitin Garegrat, Shiwali Mohan Computer Science and Engineering, University of Michigan, Ann Arbor December 21, 2009 Abstract Hadoop-MapReduce is
More informationDesign of Electric Energy Acquisition System on Hadoop
, pp.47-54 http://dx.doi.org/10.14257/ijgdc.2015.8.5.04 Design of Electric Energy Acquisition System on Hadoop Yi Wu 1 and Jianjun Zhou 2 1 School of Information Science and Technology, Heilongjiang University
More informationWhite Paper. Big Data and Hadoop. Abhishek S, Java COE. Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP
White Paper Big Data and Hadoop Abhishek S, Java COE www.marlabs.com Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP Table of contents Abstract.. 1 Introduction. 2 What is Big
More informationHadoop Scheduler w i t h Deadline Constraint
Hadoop Scheduler w i t h Deadline Constraint Geetha J 1, N UdayBhaskar 2, P ChennaReddy 3,Neha Sniha 4 1,4 Department of Computer Science and Engineering, M S Ramaiah Institute of Technology, Bangalore,
More informationWhat We Can Do in the Cloud (2) -Tutorial for Cloud Computing Course- Mikael Fernandus Simalango WISE Research Lab Ajou University, South Korea
What We Can Do in the Cloud (2) -Tutorial for Cloud Computing Course- Mikael Fernandus Simalango WISE Research Lab Ajou University, South Korea Overview Riding Google App Engine Taming Hadoop Summary Riding
More informationNTT DOCOMO Technical Journal. Large-Scale Data Processing Infrastructure for Mobile Spatial Statistics
Large-scale Distributed Data Processing Big Data Service Platform Mobile Spatial Statistics Supporting Development of Society and Industry Population Estimation Using Mobile Network Statistical Data and
More informationOpen source large scale distributed data management with Google s MapReduce and Bigtable
Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory
More informationSetup Hadoop On Ubuntu Linux. ---Multi-Node Cluster
Setup Hadoop On Ubuntu Linux ---Multi-Node Cluster We have installed the JDK and Hadoop for you. The JAVA_HOME is /usr/lib/jvm/java/jdk1.6.0_22 The Hadoop home is /home/user/hadoop-0.20.2 1. Network Edit
More informationHow to properly misuse Hadoop. Marcel Huntemann NERSC tutorial session 2/12/13
How to properly misuse Hadoop Marcel Huntemann NERSC tutorial session 2/12/13 History Created by Doug Cutting (also creator of Apache Lucene). 2002 Origin in Apache Nutch (open source web search engine).
More informationIntroduction to Hadoop. New York Oracle User Group Vikas Sawhney
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
More informationImproving Current Hadoop MapReduce Workflow and Performance
Improving Current Hadoop MapReduce Workflow and Performance Hamoud Alshammari Department of Computer Science, CT, USA Jeongkyu Lee Department of Computer Science, CT, USA Hassan Bajwa Department of Electrical
More informationR.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5
Distributed data processing in heterogeneous cloud environments R.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5 1 uskenbaevar@gmail.com, 2 abu.kuandykov@gmail.com,
More informationA Cost-Evaluation of MapReduce Applications in the Cloud
1/23 A Cost-Evaluation of MapReduce Applications in the Cloud Diana Moise, Alexandra Carpen-Amarie Gabriel Antoniu, Luc Bougé KerData team 2/23 1 MapReduce applications - case study 2 3 4 5 3/23 MapReduce
More informationPaRFR : Parallel Random Forest Regression on Hadoop for Multivariate Quantitative Trait Loci Mapping. Version 1.0, Oct 2012
PaRFR : Parallel Random Forest Regression on Hadoop for Multivariate Quantitative Trait Loci Mapping Version 1.0, Oct 2012 This document describes PaRFR, a Java package that implements a parallel random
More informationApache Hadoop 2.0 Installation and Single Node Cluster Configuration on Ubuntu A guide to install and setup Single-Node Apache Hadoop 2.
EDUREKA Apache Hadoop 2.0 Installation and Single Node Cluster Configuration on Ubuntu A guide to install and setup Single-Node Apache Hadoop 2.0 Cluster edureka! 11/12/2013 A guide to Install and Configure
More informationResearch Article Hadoop-Based Distributed Sensor Node Management System
Distributed Networks, Article ID 61868, 7 pages http://dx.doi.org/1.1155/214/61868 Research Article Hadoop-Based Distributed Node Management System In-Yong Jung, Ki-Hyun Kim, Byong-John Han, and Chang-Sung
More informationOverview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics
Overview Big Data in Apache Hadoop - HDFS - MapReduce in Hadoop - YARN https://hadoop.apache.org 138 Apache Hadoop - Historical Background - 2003: Google publishes its cluster architecture & DFS (GFS)
More informationCURSO: ADMINISTRADOR PARA APACHE HADOOP
CURSO: ADMINISTRADOR PARA APACHE HADOOP TEST DE EJEMPLO DEL EXÁMEN DE CERTIFICACIÓN www.formacionhadoop.com 1 Question: 1 A developer has submitted a long running MapReduce job with wrong data sets. You
More informationBig Data - Infrastructure Considerations
April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright
More informationINTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A COMPREHENSIVE VIEW OF HADOOP ER. AMRINDER KAUR Assistant Professor, Department
More informationResearch Article Cloud Computing for Protein-Ligand Binding Site Comparison
BioMed Research International Volume 213, Article ID 17356, 7 pages http://dx.doi.org/1.1155/213/17356 Research Article Cloud Computing for Protein-Ligand Binding Site Comparison Che-Lun Hung 1 and Guan-Jie
More informationConsensus Scoring to Improve the Predictive Power of in-silico Screening for Drug Design
Consensus Scoring to Improve the Predictive Power of in-silico Screening for Drug Design Masato Okada Faculty of Science and Technology, Masato Tsukamoto Faculty of Pharmaceutical Sciences, Hayato Ohwada
More informationOpen source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
More informationTelecom Data processing and analysis based on Hadoop
COMPUTER MODELLING & NEW TECHNOLOGIES 214 18(12B) 658-664 Abstract Telecom Data processing and analysis based on Hadoop Guofan Lu, Qingnian Zhang *, Zhao Chen Wuhan University of Technology, Wuhan 4363,China
More informationMETHOD OF A MULTIMEDIA TRANSCODING FOR MULTIPLE MAPREDUCE JOBS IN CLOUD COMPUTING ENVIRONMENT
METHOD OF A MULTIMEDIA TRANSCODING FOR MULTIPLE MAPREDUCE JOBS IN CLOUD COMPUTING ENVIRONMENT 1 SEUNGHO HAN, 2 MYOUNGJIN KIM, 3 YUN CUI, 4 SEUNGHYUN SEO, 5 SEUNGBUM SEO, 6 HANKU LEE 1,2,3,4,5 Department
More informationIntroduction to Cloud Computing
Introduction to Cloud Computing Qloud Demonstration 15 319, spring 2010 3 rd Lecture, Jan 19 th Suhail Rehman Time to check out the Qloud! Enough Talk! Time for some Action! Finally you can have your own
More informationEnergy-Saving Cloud Computing Platform Based On Micro-Embedded System
Energy-Saving Cloud Computing Platform Based On Micro-Embedded System Wen-Hsu HSIEH *, San-Peng KAO **, Kuang-Hung TAN **, Jiann-Liang CHEN ** * Department of Computer and Communication, De Lin Institute
More informationHadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
More information!"#$%&' ( )%#*'+,'-#.//"0( !"#$"%&'()*$+()',!-+.'/', 4(5,67,!-+!"89,:*$;'0+$.<.,&0$'09,&)"/=+,!()<>'0, 3, Processing LARGE data sets
!"#$%&' ( Processing LARGE data sets )%#*'+,'-#.//"0( Framework for o! reliable o! scalable o! distributed computation of large data sets 4(5,67,!-+!"89,:*$;'0+$.
More informationHadoopRDF : A Scalable RDF Data Analysis System
HadoopRDF : A Scalable RDF Data Analysis System Yuan Tian 1, Jinhang DU 1, Haofen Wang 1, Yuan Ni 2, and Yong Yu 1 1 Shanghai Jiao Tong University, Shanghai, China {tian,dujh,whfcarter}@apex.sjtu.edu.cn
More informationDATA MINING WITH HADOOP AND HIVE Introduction to Architecture
DATA MINING WITH HADOOP AND HIVE Introduction to Architecture Dr. Wlodek Zadrozny (Most slides come from Prof. Akella s class in 2014) 2015-2025. Reproduction or usage prohibited without permission of
More informationCloudera Manager Health Checks
Cloudera, Inc. 1001 Page Mill Road Palo Alto, CA 94304-1008 info@cloudera.com US: 1-888-789-1488 Intl: 1-650-362-0488 www.cloudera.com Cloudera Manager Health Checks Important Notice 2010-2013 Cloudera,
More informationFundamentals Curriculum HAWQ
Fundamentals Curriculum Pivotal Hadoop 2.1 HAWQ Education Services zdata Inc. 660 4th St. Ste. 176 San Francisco, CA 94107 t. 415.890.5764 zdatainc.com Pivotal Hadoop & HAWQ Fundamentals Course Description
More informationCloudera Manager Health Checks
Cloudera, Inc. 220 Portage Avenue Palo Alto, CA 94306 info@cloudera.com US: 1-888-789-1488 Intl: 1-650-362-0488 www.cloudera.com Cloudera Manager Health Checks Important Notice 2010-2013 Cloudera, Inc.
More informationHadoop on OpenStack Cloud. Dmitry Mescheryakov Software Engineer, @MirantisIT
Hadoop on OpenStack Cloud Dmitry Mescheryakov Software Engineer, @MirantisIT Agenda OpenStack Sahara Demo Hadoop Performance on Cloud Conclusion OpenStack Open source cloud computing platform 17,209 commits
More informationHow To Analyze Network Traffic With Mapreduce On A Microsoft Server On A Linux Computer (Ahem) On A Network (Netflow) On An Ubuntu Server On An Ipad Or Ipad (Netflower) On Your Computer
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pig and Typical Mapreduce Anjali P P and Binu A Department of Information Technology, Rajagiri School of Engineering and Technology,
More informationBright Cluster Manager
Bright Cluster Manager A Unified Management Solution for HPC and Hadoop Martijn de Vries CTO Introduction Architecture Bright Cluster CMDaemon Cluster Management GUI Cluster Management Shell SOAP/ JSONAPI
More informationAnalysing Large Web Log Files in a Hadoop Distributed Cluster Environment
Analysing Large Files in a Hadoop Distributed Cluster Environment S Saravanan, B Uma Maheswari Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham,
More informationA very short Intro to Hadoop
4 Overview A very short Intro to Hadoop photo by: exfordy, flickr 5 How to Crunch a Petabyte? Lots of disks, spinning all the time Redundancy, since disks die Lots of CPU cores, working all the time Retry,
More informationA CLOUD-BASED FRAMEWORK FOR ONLINE MANAGEMENT OF MASSIVE BIMS USING HADOOP AND WEBGL
A CLOUD-BASED FRAMEWORK FOR ONLINE MANAGEMENT OF MASSIVE BIMS USING HADOOP AND WEBGL *Hung-Ming Chen, Chuan-Chien Hou, and Tsung-Hsi Lin Department of Construction Engineering National Taiwan University
More informationSector vs. Hadoop. A Brief Comparison Between the Two Systems
Sector vs. Hadoop A Brief Comparison Between the Two Systems Background Sector is a relatively new system that is broadly comparable to Hadoop, and people want to know what are the differences. Is Sector
More informationQsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
More informationUnstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012
Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 1 Market Trends Big Data Growing technology deployments are creating an exponential increase in the volume
More informationMobile Storage and Search Engine of Information Oriented to Food Cloud
Advance Journal of Food Science and Technology 5(10): 1331-1336, 2013 ISSN: 2042-4868; e-issn: 2042-4876 Maxwell Scientific Organization, 2013 Submitted: May 29, 2013 Accepted: July 04, 2013 Published:
More informationIntroduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.
Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in
More informationReference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray VMware
Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray ware 2 Agenda The Hadoop Journey Why Virtualize Hadoop? Elasticity and Scalability Performance Tests Storage Reference
More informationMobile Cloud Computing for Data-Intensive Applications
Mobile Cloud Computing for Data-Intensive Applications Senior Thesis Final Report Vincent Teo, vct@andrew.cmu.edu Advisor: Professor Priya Narasimhan, priya@cs.cmu.edu Abstract The computational and storage
More informationLocal Alignment Tool Based on Hadoop Framework and GPU Architecture
Local Alignment Tool Based on Hadoop Framework and GPU Architecture Che-Lun Hung * Department of Computer Science and Communication Engineering Providence University Taichung, Taiwan clhung@pu.edu.tw *
More informationBenchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
More informationPerformance Analysis of Book Recommendation System on Hadoop Platform
Performance Analysis of Book Recommendation System on Hadoop Platform Sugandha Bhatia #1, Surbhi Sehgal #2, Seema Sharma #3 Department of Computer Science & Engineering, Amity School of Engineering & Technology,
More informationRole of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop
Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technoloy, Solapur, Maharashtra,
More informationApache Hadoop. Alexandru Costan
1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open
More informationL1: Introduction to Hadoop
L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General
More informationNETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE Anjali P P 1 and Binu A 2 1 Department of Information Technology, Rajagiri School of Engineering and Technology, Kochi. M G University, Kerala
More informationMapReduce, Hadoop and Amazon AWS
MapReduce, Hadoop and Amazon AWS Yasser Ganjisaffar http://www.ics.uci.edu/~yganjisa February 2011 What is Hadoop? A software framework that supports data-intensive distributed applications. It enables
More informationTHE HADOOP DISTRIBUTED FILE SYSTEM
THE HADOOP DISTRIBUTED FILE SYSTEM Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Presented by Alexander Pokluda October 7, 2013 Outline Motivation and Overview of Hadoop Architecture,
More informationLecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl
Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind
More informationParallel Data Mining and Assurance Service Model Using Hadoop in Cloud
Parallel Data Mining and Assurance Service Model Using Hadoop in Cloud Aditya Jadhav, Mahesh Kukreja E-mail: aditya.jadhav27@gmail.com & mr_mahesh_in@yahoo.co.in Abstract : In the information industry,
More informationHadoop. Apache Hadoop is an open-source software framework for storage and large scale processing of data-sets on clusters of commodity hardware.
Hadoop Source Alessandro Rezzani, Big Data - Architettura, tecnologie e metodi per l utilizzo di grandi basi di dati, Apogeo Education, ottobre 2013 wikipedia Hadoop Apache Hadoop is an open-source software
More informationMining of Web Server Logs in a Distributed Cluster Using Big Data Technologies
Mining of Web Server Logs in a Distributed Cluster Using Big Data Technologies Savitha K Dept. of Computer Science, Research Scholar PSGR Krishnammal College for Women Coimbatore, India. Vijaya MS Dept.
More informationH2O on Hadoop. September 30, 2014. www.0xdata.com
H2O on Hadoop September 30, 2014 www.0xdata.com H2O on Hadoop Introduction H2O is the open source math & machine learning engine for big data that brings distribution and parallelism to powerful algorithms
More informationBuilding & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp
Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp Introduction to Hadoop Comes from Internet companies Emerging big data storage and analytics platform HDFS and MapReduce
More informationData-Intensive Computing with Map-Reduce and Hadoop
Data-Intensive Computing with Map-Reduce and Hadoop Shamil Humbetov Department of Computer Engineering Qafqaz University Baku, Azerbaijan humbetov@gmail.com Abstract Every day, we create 2.5 quintillion
More informationHigh Performance Computing with Hadoop WV HPC Summer Institute 2014
High Performance Computing with Hadoop WV HPC Summer Institute 2014 E. James Harner Director of Data Science Department of Statistics West Virginia University June 18, 2014 Outline Introduction Hadoop
More informationHadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science
A Seminar report On Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science SUBMITTED TO: www.studymafia.org SUBMITTED BY: www.studymafia.org
More informationKeywords: Big Data, HDFS, Map Reduce, Hadoop
Volume 5, Issue 7, July 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Configuration Tuning
More informationHadoop Installation. Sandeep Prasad
Hadoop Installation Sandeep Prasad 1 Introduction Hadoop is a system to manage large quantity of data. For this report hadoop- 1.0.3 (Released, May 2012) is used and tested on Ubuntu-12.04. The system
More informationhttp://www.paper.edu.cn
5 10 15 20 25 30 35 A platform for massive railway information data storage # SHAN Xu 1, WANG Genying 1, LIU Lin 2** (1. Key Laboratory of Communication and Information Systems, Beijing Municipal Commission
More informationHadoop Setup. 1 Cluster
In order to use HadoopUnit (described in Sect. 3.3.3), a Hadoop cluster needs to be setup. This cluster can be setup manually with physical machines in a local environment, or in the cloud. Creating a
More informationOptimize the execution of local physics analysis workflows using Hadoop
Optimize the execution of local physics analysis workflows using Hadoop INFN CCR - GARR Workshop 14-17 May Napoli Hassen Riahi Giacinto Donvito Livio Fano Massimiliano Fasi Andrea Valentini INFN-PERUGIA
More informationThe Improved Job Scheduling Algorithm of Hadoop Platform
The Improved Job Scheduling Algorithm of Hadoop Platform Yingjie Guo a, Linzhi Wu b, Wei Yu c, Bin Wu d, Xiaotian Wang e a,b,c,d,e University of Chinese Academy of Sciences 100408, China b Email: wulinzhi1001@163.com
More informationmap/reduce connected components
1, map/reduce connected components find connected components with analogous algorithm: map edges randomly to partitions (k subgraphs of n nodes) for each partition remove edges, so that only tree remains
More informationHow To Run Apa Hadoop 1.0 On Vsphere Tmt On A Hyperconverged Network On A Virtualized Cluster On A Vspplace Tmter (Vmware) Vspheon Tm (
Apache Hadoop 1.0 High Availability Solution on VMware vsphere TM Reference Architecture TECHNICAL WHITE PAPER v 1.0 June 2012 Table of Contents Executive Summary... 3 Introduction... 3 Terminology...
More informationLecture 10 - Functional programming: Hadoop and MapReduce
Lecture 10 - Functional programming: Hadoop and MapReduce Sohan Dharmaraja Sohan Dharmaraja Lecture 10 - Functional programming: Hadoop and MapReduce 1 / 41 For today Big Data and Text analytics Functional
More informationGeneral Terms. Keywords 1. INTRODUCTION 2. RELATED WORKS
Design of a Hybrid Intrusion Detection System using Snort and Hadoop Prathibha.P.G P G Scholar Government Engineering College Thrissur, Kerala, India Dileesh.E.D Assistant Professor Government Engineering
More informationUniversity of Maryland. Tuesday, February 2, 2010
Data-Intensive Information Processing Applications Session #2 Hadoop: Nuts and Bolts Jimmy Lin University of Maryland Tuesday, February 2, 2010 This work is licensed under a Creative Commons Attribution-Noncommercial-Share
More informationMr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo
Sensor Network Messaging Service Hive/Hadoop Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo Contents 1 Introduction 2 What & Why Sensor Network
More informationHadoop Training Hands On Exercise
Hadoop Training Hands On Exercise 1. Getting started: Step 1: Download and Install the Vmware player - Download the VMware- player- 5.0.1-894247.zip and unzip it on your windows machine - Click the exe
More informationEfficient Data Replication Scheme based on Hadoop Distributed File System
, pp. 177-186 http://dx.doi.org/10.14257/ijseia.2015.9.12.16 Efficient Data Replication Scheme based on Hadoop Distributed File System Jungha Lee 1, Jaehwa Chung 2 and Daewon Lee 3* 1 Division of Supercomputing,
More information