Notes on Transferring 100 TB of Data Using Globus. William E. Mihalo; Anton Verlygo; Ryan K. Sisk Northwestern University



Similar documents
SYSTEM SETUP FOR SPE PLATFORMS

Server Installation Procedure - Load Balanced Environment

Globus and the Centralized Research Data Infrastructure at CU Boulder

Distributed File System Performance. Milind Saraph / Rich Sudlow Office of Information Technologies University of Notre Dame

Ignify ecommerce. Item Requirements Notes

Microsoft Hyper-V chose a Primary Server Virtualization Platform

Imaging Computing Server User Guide

The safer, easier way to help you pass any IT exams. Exam : E Backup Recovery - Avamar Expert Exam for Implementation Engineers.

ITCertMaster. Safe, simple and fast. 100% Pass guarantee! IT Certification Guaranteed, The Easy Way!

EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION

Cornell University Center for Advanced Computing

4cast Server Specification and Installation

Before deploying SiteAudit it is recommended to review the information below. This will ensure efficient installation and operation of SiteAudit.

inforouter V8.0 Server & Client Requirements

Configuration Maximums

Checkmate 5.5 Self Hosted Quick Start Guide

Cluster Implementation and Management; Scheduling

Tableau Server 7.0 scalability

Upgrading Client Security and Policy Manager in 4 easy steps

for Networks Installation Guide for the application on the server August 2014 (GUIDE 2) Lucid Exact Version 1.7-N and later

Table of Contents. Online backup Manager User s Guide

F5 BIG-IP V9 Local Traffic Management EE Demo Version. ITCertKeys.com

insync Installation Guide

ClockWork Enterprise 5

Backup / migration of a Coffalyser.Net database

Parallels Plesk Automation

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez

ClearPass Policy Manager 6.3

Installation Instruction STATISTICA Enterprise Small Business

STATISTICA VERSION 12 STATISTICA ENTERPRISE SMALL BUSINESS INSTALLATION INSTRUCTIONS

Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers

Table of Contents Introduction and System Requirements 9 Installing VMware Server 35

ADAM 5.5. System Requirements

Acronis Backup & Recovery 11

Merge CADstream. For IT Professionals. Merge CADstream,

LaCie 5big Backup Server

Evaluation of Dell PowerEdge VRTX Shared PERC8 in Failover Scenario

Protect SQL Server 2012 AlwaysOn Availability Group with Hitachi Application Protector

I.T. System Requirements 2015

Safexpert Installation instructions MS SQL Server 2008 R2

Solution for private cloud computing

HA Certification Document Armari BrontaStor 822R 07/03/2013. Open-E High Availability Certification report for Armari BrontaStor 822R

Aqua Connect Load Balancer User Manual (Mac)

Hardware Configuration Guide

Practice Management Installation Guide. Requirements/Prerequisites: Workstation Requirements. Page 1 of 5

VMWare Workstation 11 Installation MICROSOFT WINDOWS SERVER 2008 R2 STANDARD ENTERPRISE ED.

CMS Tier-3 cluster at NISER. Dr. Tania Moulik

Hardware and Software Requirements. Release 7.5.x PowerSchool Student Information System

Solution for private cloud computing

Stratusphere Solutions

CAMAvision v18.5.x System Specification Guide 7/23/2014

Dragon Medical Enterprise Network Edition Technical Note: Requirements for DMENE Networks with virtual servers

Brocade Network Advisor High Availability Using Microsoft Cluster Service

Storage Systems: 2014 and beyond. Jason Hick! Storage Systems Group!! NERSC User Group Meeting! February 6, 2014

How To Set Up A Macintosh With A Cds And Cds On A Pc Or Macbook With A Domain Name On A Macbook (For A Pc) For A Domain Account (For An Ipad) For Free

System Requirements for Netmail Archive

Virtual Web Appliance Setup Guide

Data Movement and Storage. Drew Dolgert and previous contributors

Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations

Catalogic DPX TM 4.3. DPX Open Storage Best Practices Guide

Reference Design: Scalable Object Storage with Seagate Kinetic, Supermicro, and SwiftStack

Hadoop on the Gordon Data Intensive Cluster

for Networks Installation Guide for the application on the server July 2014 (GUIDE 2) Lucid Rapid Version 6.05-N and later

Hardware & Software Specification i2itracks/popiq

QNAP in vsphere Environment

Hardware Requirements

CTERA Cloud Onramp for IBM Tivoli Storage Manager

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago

Cluster Computing at HRI

Pearl Echo Installation Checklist


Support Document: Microsoft SQL Server - LiveVault 7.6X

ASTROW HR. Installation & Operation & Programming MANUAL

OPAS Prerequisites. Prepared By: This document contains the prerequisites and requirements for setting up OPAS.

An Oracle White Paper December Oracle Virtual Desktop Infrastructure: A Design Proposal for Hosted Virtual Desktops

Disaster Recovery Cookbook Guide Using VMWARE VI3, StoreVault and Sun. (Or how to do Disaster Recovery / Site Replication for under $50,000)

NEXTGEN v5.8 HARDWARE VERIFICATION GUIDE CLIENT HOSTED OR THIRD PARTY SERVERS

Logically a Linux cluster looks something like the following: Compute Nodes. user Head node. network

John D. Bonam Disaster Recovery Architecture Session # 2841

Active Directory Integration Manual

Manual for using Super Computing Resources

I. General Database Server Performance Information. Knowledge Base Article. Database Server Performance Best Practices Guide

MTA Course: Windows Operating System Fundamentals Topic: Understand backup and recovery methods File name: 10753_WindowsOS_SA_6.

DNS must be up and running. Both the Collax server and the clients to be backed up must be able to resolve the FQDN of the Collax server correctly.

NTP Software File Reporter Analysis Server

Purpose Computer Hardware Configurations... 6 Single Computer Configuration... 6 Multiple Server Configurations Data Encryption...

PORTA ONE. o r a c u l a r i u s. Concepts Maintenance Release 19 POWERED BY.

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage

Veeam Cloud Connect. Version 8.0. Administrator Guide

Implementing Enterprise Disk Arrays Using Open Source Software. Marc Smith Mott Community College - Flint, MI Merit Member Conference 2012

1.. Know the capabilities of the network system you are going to be adding cameras and/or DVR s to. Meaning, know if the present LAN has the

Improve Backup Performance by Upgrading to Symantec Backup Exec 12.5 with Service Pack 2

Barracuda Backup Vx. Virtual Appliance Deployment. White Paper

In order to upload a VM you need to have a VM image in one of the following formats:

An Oracle White Paper September Oracle Exadata Database Machine - Backup & Recovery Sizing: Tape Backups

Section A Notes to the Application

Automated Accounts Payable User Guide

Oracle Maximum Availability Architecture with Exadata Database Machine. Morana Kobal Butković Principal Sales Consultant Oracle Hrvatska

Transferring AIS to a different computer

CTO STATISTICS AND INFORMATION MANAGEMENT SYSTEM

Transcription:

Notes on Transferring 100 TB of Data Using Globus William E. Mihalo; Anton Verlygo; Ryan K. Sisk Northwestern University

1. Background and description of the data 2. Procedure for transferring the data 3. Problems encountered 4. Recommendations

Workstation and disk drives 1. 2011 imac (USB 2.0 connectors) with 8 GB of RAM. 2. 3 TB Fantom or Lacie external drives attached via USB 3. Network connection was via 1 GigE ethernet 4. 48 external disk drives were used for the transfer.

End point configuration 1. A total of 2 servers provide endpoint support. 2. One machine runs globus connect (server). 3. A second server provides login authentication into the quest cluster and also runs globus connect. 4. Each globus server has two Xeon X5650 CPUs with 24 GB of RAM. 5. Our globus servers are located at the Evanston Data Center. 6. Access to our storage volumes from the globus servers is via Infiniband/GPFS for our primary storage and a 10 Gb/s NFS connection for our supplemental storage.

Procedure 1. Transfers started in late October and finished in late February. 2. Disk drives were shipped to Northwestern s Chicago Campus. 3. Transfers were done on a lab computer. Transfers started in the late afternoon and ran over night or over the course of a weekend.

Procedure (cont d) 1. Disk drives were attached to an imac computer either singly or in pairs. 2. Typical transfer speed with a single disk drive attached was about 219 megabits/second 3. Transfer speed with two drives attached throttled down to 194 megabits/second

Description of Data 1. Average file size was about 747 MB. 2. 147,311 files were transferred 3. Transfers took 81,597 minutes or about 1,360 hours.

Description of the transfer 1. Data were transferred from Northwestern s Chicago Campus to our Evanston Campus. 2. Globus connect server nodes are located at the Evanston campus. 3. Between Chicago and Evanston campuses, we have dark fiber for the network communication

Problems 1. Transfers were done when the workstation was not busy. This meant transfers took place in the evening and over the weekend. 2. Transfers conflicted with backups that take place between Chicago and Evanston campuses and this had a substantial impact on our transfer speed. We had a high network load when we started the transfers. 3. USB 2 was a bottleneck.

Problems (cont d) 1. Condition of the disk drive will affect the speed of the transfers. 2. Drives that were damaged during shipping had their transfer speeds cut in half.

Recommendations 1. Do some test transfers before embarking on a massive set of transfers. 2. Run your test transfers at different times of the day. 3. Use smartmon tools to check the condition of a disk drive before starting a transfer. If you are doing a lot of transfers, you might want to defer transferring data from a failing drive until the end. If it is possible, you can also see about having the data re-shipped on a different disk drive (if it is still available).

Recommendations (cont d) 1. In our case, data stayed on the storage system of the sending institution for 30 days and then were automatically deleted. With serveral dozen disk drives arriving, it was difficult to check the condition of each drive until just before transferring data.

Recommendations (cont d) 1. Use smartmon tools to check the condition of a disk drive before starting a transfer. If you are doing a lot of transfers, you might want to defer transferring data from a failing drive until the end. If it is possible, you can also see about having the data re-shipped on a different disk drive (if it is still available).

Why did we use a modified sneaker net? 1. Many institutions use default port speeds of 100 MB/s. 2. We have run non-globus transfers from other sites where they could not ship a disk drive to us. We found these transfers took an extremely long time and could easily fail, or have corrupted data. 3. We have proselytized the use of globus technology for transfer purposes. Globus needs to make this technology more visible to potential end-users and institutions.

Recommendation for globus An online tool within globus connect that showed network statistics and the current transfer rate would be extremely helpful.