GridFTP GUI: An Easy and Efficient Way to Transfer Data in Grid
1 GridFTP GUI: An Easy and Efficient Way to Transfer Data in Grid
  Wantao Liu 1,2, Raj Kettimuthu 2,3, Brian Tieman 3, Ravi Madduri 2,3, Bo Li 1, and Ian Foster 2,3
  1 Beihang University, Beijing, China
  2 The University of Chicago, Chicago, USA
  3 Argonne National Laboratory, Argonne, USA
2 Outline
- GridFTP overview
- GridFTP challenges
- Commonly used GridFTP clients
- Zero-configuration GUI client
- Experimental results
3 GridFTP
- A secure, robust, fast, efficient, standards-based, widely accepted data transfer protocol
- We also supply a reference implementation:
  - Server
  - Client tools (globus-url-copy)
  - Development libraries
- Multiple independent implementations can interoperate
  - University of Virginia and Fermilab have home-grown servers that work with ours.
  - Many people have developed clients independent of the Globus Project.
4 GridFTP
- Two-channel protocol, like FTP
- Control channel
  - Communication link (TCP) over which commands and responses flow
  - Low bandwidth; encrypted and integrity protected by default
- Data channel
  - Communication link(s) over which the actual data of interest flows
  - High bandwidth; authenticated by default; encryption and integrity protection optional
5 Striping GridFTP offers a powerful feature called striped transfers (cluster-to-cluster transfers)
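To make the idea of a striped transfer concrete, the sketch below cuts one file into fixed-size blocks and deals them out round-robin to several data nodes, so that each node of a cluster moves only part of the file. The function name `stripe_ranges`, the block size, and the round-robin layout are assumptions made for illustration; the slides do not say which layout a GridFTP server actually uses.

```python
def stripe_ranges(file_size, block_size, nodes):
    """Assign (offset, length) byte ranges of one file to `nodes` data nodes.

    Hypothetical helper: blocks are handed out round-robin, which is only
    one possible striping layout.
    """
    if file_size < 0 or block_size <= 0 or nodes <= 0:
        raise ValueError("file_size must be >= 0; block_size and nodes must be > 0")
    plan = [[] for _ in range(nodes)]
    for index, offset in enumerate(range(0, file_size, block_size)):
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        plan[index % nodes].append((offset, length))
    return plan


if __name__ == "__main__":
    # A 10-byte file, 4-byte blocks, 2 data nodes.
    for node, ranges in enumerate(stripe_ranges(10, 4, 2)):
        print(f"node {node}: {ranges}")
```

Each node can then send its own ranges over its own data channel, which is what lets a cluster-to-cluster transfer use more than one machine's network link.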
6 GridFTP Servers Around the World
- Created by Lydia Prieto, G. Zarrate, and Anda Imanitchi (Florida State University) using MaxMind's GeoIP technology
7 GridFTP in production
- Many scientific communities rely on GridFTP
  - High Energy Physics: tiered data movement infrastructure for the LHC Computing Grid
  - LIGO routinely uses GridFTP to move 1 TB a day
  - Southern California Earthquake Center (SCEC), Earth System Grid (ESG), Relativistic Heavy Ion Collider (RHIC), European Space Agency, and BBC use GridFTP for data movement
- GridFTP facilitates an average of more than 5 million data transfers every day
8 Challenges
- Past success
  - Standard: big selling point for adoption
  - Throughput: GridFTP was sold on speed
  - Robustness: has to work all the time
- Current and future
  - Ease of use: zero-configuration clients
  - Firewall
  - Scalable
  - Extensible
9 Globus-url-copy
- Commonly used, scriptable command-line client
- globus-url-copy [options] srcurl dsturl
- URL format: protocol://[user:pass@][host]/path
- Users can do client/server and third-party transfers using globus-url-copy
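The URL format and the two transfer modes can be illustrated with a short Python sketch. It splits a URL of the form protocol://[user:pass@][host]/path into its parts, decides whether a source/destination pair is a client/server or a third-party transfer, and assembles the `globus-url-copy [options] srcurl dsturl` argument list without running it. The helper names, the host names, and the rule "both ends remote means third-party" are assumptions for illustration, not part of the tool.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split protocol://[user:pass@][host]/path into named parts."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
    }


def transfer_kind(src_url, dst_url):
    """Assumed rule: a transfer is third-party when neither end is a local file."""
    remote = [urlsplit(url).scheme != "file" for url in (src_url, dst_url)]
    return "third-party" if all(remote) else "client/server"


def build_command(src_url, dst_url, options=()):
    """Return the argument list for: globus-url-copy [options] srcurl dsturl."""
    return ["globus-url-copy", *options, src_url, dst_url]


if __name__ == "__main__":
    # Hypothetical hosts and paths.
    src = "gsiftp://gridftp-a.example.org/data/run1.dat"
    dst = "gsiftp://gridftp-b.example.org/archive/run1.dat"
    print(describe_url(src))
    print(transfer_kind(src, dst))
    print(build_command(src, dst))
```

Building the command as a list rather than a single string keeps URLs with unusual characters from being reinterpreted by a shell if the list is later passed to `subprocess.run`.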
10 Other clients
- UberFTP
- Reliable File Transfer (RFT) service
- Custom clients using Globus C and Java client libraries
- All these clients require non-trivial configuration
  - Security setup
- None of these clients provides a graphical user interface
11 GridFTP GUI
- Drag and drop
- Zero configuration
- Integrated with MyProxy
- Automatically trusts the CAs that are part of the IGTF distribution
- Fault tolerant
- Transfer status monitoring
- Optimized for performance
12 Snapshot of the GUI
13 Fault tolerant
- Better fault tolerance than other GridFTP clients
- Like other clients, the GUI can recover from transient server and network failures
- globus-url-copy cannot recover from its own failures
- The GUI can recover from its own failures
- Unlike RFT, it stores information on the local file system
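One way a client can recover from its own failures using only the local file system is to keep a journal of completed files and skip them after a restart. The sketch below shows that idea; the journal format (one JSON object per line), the file names, and the function names are assumptions, and the slides do not describe how the GUI actually stores its state.

```python
import json
import os


def load_done(journal_path):
    """Return the set of file names already recorded as transferred."""
    done = set()
    if not os.path.exists(journal_path):
        return done
    with open(journal_path, encoding="utf-8") as journal:
        for line in journal:
            try:
                done.add(json.loads(line)["file"])
            except (ValueError, KeyError):
                # Ignore a record left incomplete by a crash mid-write.
                continue
    return done


def run_transfers(files, journal_path, transfer):
    """Transfer each file once, recording progress so a restart can resume."""
    done = load_done(journal_path)
    with open(journal_path, "a", encoding="utf-8") as journal:
        for name in files:
            if name in done:
                continue  # finished before the last crash
            transfer(name)  # hypothetical callable that moves one file
            journal.write(json.dumps({"file": name}) + "\n")
            journal.flush()
            os.fsync(journal.fileno())  # make the record survive a crash
            done.add(name)
    return done
```

Calling `run_transfers` again with the same journal after a crash repeats only the files that have no record, so at most the file that was in flight is moved twice.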
14 Lots of small files
- Scientific experiments produce huge volumes of data
  - The individual file size is modest, on the order of kilobytes or megabytes
  - Hundreds of thousands of files to transfer every day
  - The size of the entire dataset is tremendous, from hundreds of gigabytes to hundreds of terabytes
15 Advanced Photon Source
- Advanced Photon Source at Argonne
  - Dozens of samples may be acquired for one experiment every day
  - Each sample generates about 2,000 raw data files
  - After processing, each sample produces an additional 2,000 reconstructed files
  - Each file is 8 to 16 MB in size
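The Advanced Photon Source figures can be turned into a rough daily load. The sketch below uses the per-sample numbers from the slide; the sample count of 24 (standing in for "dozens") and the decimal conversion 1 GB = 1,000 MB are assumptions.

```python
RAW_FILES_PER_SAMPLE = 2_000
RECONSTRUCTED_FILES_PER_SAMPLE = 2_000
MIN_FILE_MB, MAX_FILE_MB = 8, 16
MB_PER_GB = 1_000  # assumption: decimal units


def daily_load(samples):
    """Return (file count, low GB, high GB) for one day of acquisition."""
    files = samples * (RAW_FILES_PER_SAMPLE + RECONSTRUCTED_FILES_PER_SAMPLE)
    low_gb = files * MIN_FILE_MB / MB_PER_GB
    high_gb = files * MAX_FILE_MB / MB_PER_GB
    return files, low_gb, high_gb


if __name__ == "__main__":
    print(daily_load(1))   # one sample: 4,000 files, 32 to 64 GB
    print(daily_load(24))  # assumed 24 samples: 96,000 files, 768 to 1,536 GB
```

Under these assumptions a single experiment day already reaches tens of thousands of files and hundreds of gigabytes, even though no individual file exceeds 16 MB.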
16 Lots of small files
- Transfer thread pool
  - Moves multiple files concurrently
  - Maximizes the utilization of network bandwidth
  - Improves transfer performance
- Two windows for status information
  - Directory window lists all directories and their transfer status
  - File window lists all files under the active directory
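A transfer thread pool of this kind can be sketched with Python's standard `concurrent.futures` module. Several worker threads move files concurrently, and a per-file status is collected, much as a status window might display it. The local `shutil.copyfile` call is only a stand-in for a real GridFTP transfer, and the function names and the default of four workers are assumptions.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor, as_completed


def copy_one(src, dst):
    """Stand-in for one file transfer: a plain local copy."""
    shutil.copyfile(src, dst)
    return dst


def transfer_all(pairs, workers=4, transfer=copy_one):
    """Move (src, dst) pairs concurrently and return a status per source."""
    status = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(transfer, src, dst): src for src, dst in pairs}
        for future in as_completed(futures):
            src = futures[future]
            try:
                future.result()
                status[src] = "done"
            except Exception as exc:  # one failed file must not stop the rest
                status[src] = f"failed: {exc}"
    return status
```

With many small files, keeping several transfers in flight means the network is not idle while one file is being opened or closed, which is the motivation the slide gives for the pool.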
17 Experiment Setup
- We conducted all of our experiments using TeraGrid NCSA nodes and University of Chicago nodes
- GridFTP GUI is compared with scp and globus-url-copy
- TCP is configured as the underlying data transport protocol
18 Experiment Results
[Figure: transfer time (seconds) versus file size (MB) for scp, globus-url-copy, globus-url-copy (p=4), GridFTP GUI, and GridFTP GUI (p=4)]
19 Experiment Results (cont.)
20 Questions
