LARGE DATA TRANSPORT FOR THE SCIENCE DMZ
Overcoming the Challenges in Support of Scientific Collaboration
Michelle Munson, CEO & Co-Founder, Aspera, Inc., an IBM Company
OUTLINE
Fundamental Challenges of Large Data Collaboration in the Science DMZ: WAN; Huge Data; Distributed Object Storage (Cloud)
Key Problems and Aspera Technology Solutions: Bulk Transfer over WAN; Bulk Ingest and Distribution from Cloud; Synchronization over WAN
Quick Product Overview: Ingest and distribution; Site-to-site transfer; Person-to-person sharing; Sync to/from infrastructure
Use Cases in Science and HPC
ASPERA'S MISSION
Creating next-generation transport technologies that move the world's digital assets at maximum speed, regardless of file size, transfer distance and network conditions.
TRENDS
Big Data Explosion: 90% of data today is file-based or unstructured; a mix of file sizes, but larger and larger files are the norm.
Diversity of IP Networks (Media, Bandwidth Rates, and Conditions): variable bandwidth rates (slow to super-fast); bandwidth rates increasing, costs decreasing; network media remain diverse (terrestrial, satellite, wireless); conditions vary, and all networks are prone to degradation over distance.
Global Workflows Moving Big Data over WANs: teams are geographically dispersed; over distance, network conditions degrade; contemporary TCP acceleration solutions are not designed for big data transfer and replication.
Cloud Computing Grows Up: more choices (SoftLayer, AWS, Microsoft Azure, OpenStack, HP Cloud); no longer a niche: Netflix (transcoding), MTV (global video distribution), BGI (genomic sequencing).
BULK TRANSFER OVER WAN
Problems of traditional transport and breakthrough solutions
CHALLENGES WITH TCP AND ALTERNATIVE TECHNOLOGIES
Distance degrades conditions on all networks: latency (round-trip time) increases; packet loss increases; fast networks are just as prone to degradation.
TCP performance degrades with distance: the throughput bottleneck becomes more severe with increased latency and packet loss.
TCP does not scale with bandwidth: TCP was designed for low bandwidth; adding more bandwidth does not improve throughput.
Alternative Technologies:
TCP-based: network latency and packet loss must be low.
Modified TCP: improves TCP performance but is insufficient for fast networks.
UDP traffic blasters: inefficient and waste bandwidth.
Data caching: inappropriate for many large file transfer workflows.
Data compression: time consuming and impractical for certain file types.
CDNs & co-lo build outs: high overhead and expensive to scale.
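The claim that TCP throughput is limited by latency and packet loss rather than by link bandwidth can be illustrated with the widely used Mathis et al. approximation, throughput <= (MSS / RTT) * (C / sqrt(p)). The slides do not cite this model; it is brought in here as an assumption, and the MSS of 1460 bytes and the constant C = sqrt(3/2) are illustrative choices. Note that link bandwidth does not appear in the formula at all.

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Mathis approximation: rough upper bound on steady-state TCP throughput (bits/s).

    mss_bytes, rtt_s and loss_rate are illustrative inputs, not measured values.
    """
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

# A few hypothetical WAN conditions: (RTT in ms, packet loss rate).
for rtt_ms, loss in [(10, 0.001), (100, 0.01), (300, 0.03)]:
    rate = tcp_throughput_bps(1460, rtt_ms / 1000, loss)
    print(f"RTT {rtt_ms} ms, loss {loss:.1%}: about {rate / 1e6:.2f} Mbps")
```

Under these assumptions, a path with 100 ms RTT and 1% loss is bounded at roughly 1.4 Mbps per TCP stream, no matter how fast the underlying link is.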
FASP HIGH-PERFORMANCE DATA TRANSPORT
Maximum line-rate WAN transfer speed: transfer performance scales with bandwidth, independent of transfer distance and resilient to packet loss; optimal end-to-end throughput efficiency.
Congestion Avoidance and Policy Control: automatic, full utilization of available bandwidth; on-the-fly prioritization and bandwidth allocation.
Uncompromising security and reliability: secure user/endpoint authentication; AES-128 cryptography in transit and at rest.
Scalable management, monitoring and control: real-time progress, performance and bandwidth utilization; detailed transfer history, logging, and manifest.
Enterprise-Class File Delivery: transfers up to thousands of times faster than FTP/HTTP(S); precise and predictable transfer times; extreme scalability (concurrency and throughput).
ASPERA PRODUCT PORTFOLIO
FASP
FILE SIZE INDEPENDENT: Maximum transfer speed, optimal bandwidth utilization and maximum I/O throughput on any storage platform, regardless of number or size of files.
DISTANCE INDEPENDENT
ALL PARADIGMS & DEPLOYMENT MODELS: Transfer, replicate or synchronize your data in all types of scenarios: one-to-one, hub and spoke, unidirectional and multi-directional.
INFRASTRUCTURE AGNOSTIC: Locate your big data anywhere (cloud, hybrid, on premise) and move it seamlessly with high-speed, secure transfers.
FASP PERFORMANCE BREAKTHROUGH
                                Across US                US-Europe                US-Asia
Protocol     Bandwidth          10 GB      100 GB        10 GB      100 GB        10 GB        100 GB
FTP          10 Mbps - 10 Gbps  10-20 Hrs  Impractical   15-20 Hrs  Impractical   Impractical  Impractical
Aspera FASP  10 Mbps            140 Min    23.3 Hrs      140 Min    23.3 Hrs      140 Min      23.3 Hrs
Aspera FASP  100 Mbps           14 Min     2.3 Hrs       14 Min     2.3 Hrs       14 Min       2.3 Hrs
Aspera FASP  1 Gbps             1.4 Min    14 Min        1.4 Min    14 Min        1.4 Min      14 Min
Aspera FASP  10 Gbps            8.4 Sec    1.4 Min       8.4 Sec    1.4 Min       8.4 Sec      1.4 Min
TCP transfer times are limited by packet loss and delay (network distance), NOT BANDWIDTH.
Aspera transfer times shorten linearly with bandwidth, independent of packet loss and delay (network distance).
Cross US: add 1% to 5%. Intercontinental: add 1% to 10%. Satellite: add 1% to 10%.
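The FASP rows in the table follow from simple arithmetic: data size divided by bandwidth, plus a small overhead. The sketch below reproduces them assuming decimal gigabytes and a 5% overhead; the 5% figure is inferred from the table values (for example 140 minutes for 10 GB at 10 Mbps) and is not stated in the slides, and the function name is hypothetical.

```python
def transfer_seconds(size_gb, rate_mbps, overhead=0.05):
    """Time to move size_gb (decimal GB) at rate_mbps, padded by a fractional overhead.

    The default overhead of 5% is an assumption inferred from the slide's table.
    """
    size_bits = size_gb * 1e9 * 8
    return size_bits / (rate_mbps * 1e6) * (1 + overhead)

# Print the grid of sizes and bandwidths used in the table.
for rate_mbps in (10, 100, 1000, 10000):
    for size_gb in (10, 100):
        secs = transfer_seconds(size_gb, rate_mbps)
        print(f"{size_gb:>4} GB at {rate_mbps:>6} Mbps: {secs / 60:8.1f} min")
```

Because the time depends only on size and bandwidth, the same numbers appear in every route column of the table.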
FASP MANAGEMENT WITH ADAPTIVE RATE CONTROL
Extraordinary bandwidth control: automatic, full utilization of available bandwidth; protection of other network traffic; on-the-fly, per flow, user and job prioritization; highly concurrent transfer stacking.
System-wide monitoring and reporting: real-time progress and performance analysis; real-time bandwidth utilization; detailed transfer history, logging and manifest.
Centralized command and control: per transfer, user, group and node; manage and create global transfer policies; remotely initiate, schedule and automate transfer jobs.
FASP™ SECURITY & RELIABILITY
Complete Built-in Security: secure endpoint authentication, data encryption on-the-fly and at rest, and per-packet integrity verification; FIPS 140-2 compliant, built on the OpenSSL libraries.
Secure User/Endpoint Authentication: authentication via secure SSH mechanisms (interactive password or public key); LDAP and Active Directory user authentication; native file system access control support across all operating systems.
AES-128 Cryptography: on-the-fly data encryption; data encryption in transit and (optionally) at rest (secured storage of transferred content).
Data Integrity Verification: each transmitted data block is verified with a cryptographic hash function; protects against man-in-the-middle, replay, and UDP denial-of-service attacks.
100% Reliable Data Transmission: session semantics guarantee a 100% bit-for-bit identical data copy at the destination; automatic resume of partial or failed transfers; automatic HTTP fallback in highly restrictive networks.
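The idea of verifying each transmitted block with a cryptographic hash can be sketched as follows. This is not the FASP wire format, which the slides do not describe; it is a generic illustration that assumes a shared session key and uses HMAC-SHA256 over a sequence number plus the block, so that a tampered or replayed block fails verification. All names are hypothetical.

```python
import hashlib
import hmac

def tag_block(key, seq, block):
    """Compute an authentication tag over the block's sequence number and payload."""
    msg = seq.to_bytes(8, "big") + block
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_block(key, seq, block, tag):
    """Return True only if the tag matches this exact block at this exact position."""
    return hmac.compare_digest(tag_block(key, seq, block), tag)

key = b"hypothetical-session-key"
tag = tag_block(key, 7, b"payload bytes")
print(verify_block(key, 7, b"payload bytes", tag))   # intact block
print(verify_block(key, 8, b"payload bytes", tag))   # same bytes replayed at another position
```

Binding the sequence number into the tag is one common way to reject replayed or reordered blocks; a receiver would drop any block whose tag does not verify and request retransmission.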
DEMONSTRATION ASPERA OVER GLOBAL WAN
Demo 1: ascp single stream over global WAN (200 ms / 2% packet loss in each direction). scp rate: < 500 Kilobits/second (0.5 Megabits/second). ascp rate: 3 Gigabits/second (3,000 Megabits/second). Local disk I/O and a single CPU core are the limiting factors.
Demo 2: ascp over multiple CPU cores, global WAN (200 ms / 2% packet loss in each direction). ascp rate: 9 Gigabits/second (9,000 Megabits/second).
Host computers: Intel Xeon processors E5-2600 family (dual 2.70GHz CPU, 2x16 cores with Intel HyperThreading); NUMA and Intel DDIO enabled by default (*standard packet sizes*, no jumbo frames used); Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection; CentOS 6.4.
Network emulation provided by Netropy.
DEMONSTRATION 10GBPS ASPERA OVER GLOBAL WAN
DEMONSTRATION 40GBPS ASPERA USING DPDK, ETC
ACHIEVING 10GBPS+ TRANSFERS OVER GLOBAL WAN
Joint research project between Aspera and Intel Corporation: investigate high-speed (10 Gbps and beyond) transport over WAN; prove out Intel's optimizations to eliminate delay in I/O and packet forwarding (Intel Xeon processor E5-2600 product systems) using standard packet sizes (no jumbo frames); leverage Aspera's fasp transport.
Hardware Platform: Intel Data Direct I/O Technology (Intel DDIO), which allows Intel Ethernet controllers to route I/O traffic directly to the processor cache; built-in support for Single-Root I/O Virtualization (SR-IOV), which allows virtual machine platforms to bypass the hypervisor in order to directly access resources on the physical network interface.
Results: 300 percent throughput improvement versus a baseline system without support for Intel DDIO and SR-IOV, showing the clear advantages of Intel's Xeon processor E5-2600 product family with standard packet sizes; similar results across both LAN and WAN transfers, confirming that Aspera fasp transfer performance is independent of network latency and robust to packet loss on the network; approximately the same throughput for both physical and virtualized computing environments.
Hardware: Intel Xeon processor E5-2600 family (dual 2.70GHz CPU, 2x16 cores with Intel HyperThreading Technology); NUMA and Intel DDIO enabled by default; Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection.
[Figure: Aspera Transfer Throughput, Single Session. Y axis: Throughput (Gbps), 0 to 10. Bars: Baseline; DDIO (LAN); DDIO + SR-IOV (LAN); DDIO + SR-IOV (50 ms, 0.1% packet loss); DDIO + SR-IOV (100 ms, 1% packet loss); DDIO + SR-IOV (500 ms, 5% packet loss).]
[Figure: Aspera Transfer Throughput, Multiple Sessions. Y axis: Throughput (Gbps), 0 to 10. X axis: Number of Sessions (2, 3, 4). Series: DDIO (LAN); DDIO + SR-IOV (LAN); DDIO + SR-IOV (50 ms, 0.1% packet loss); DDIO + SR-IOV (100 ms, 1% packet loss); DDIO + SR-IOV (500 ms, 5% packet loss).]
Results available: asperasoft.com
DASHNET OVER AARNET WITH ASPERA
High-speed network service built over AARNet4. Connects RDSI-funded Nodes to each other and to researchers around Australia.
Supports up to 100 gigabits per second, significantly increasing data transfer rates across the country.
Moved data over a 1700 km connection and recorded speeds peaking at 75 gigabits per second (Gbps) between the Brisbane Convention Centre and a site in Carlton, Victoria.
The Aspera toolkit can reliably run at over 95% of the available bandwidth across DaShNet, while still sharing bandwidth fairly between applications.
Elements of the DaShNet infrastructure include:
Physically diverse access fibre from all Nodes to AARNet Network Access Points (NAPs) in their respective cities;
New active network equipment at AARNet NAPs supporting up to 100 Gb/s, and the virtual networking capability supported by the AARNet 4 architecture;
Redundant high-performance Customer Premise Equipment (CPE) at all RDSI Nodes, configured using the ESNet ScienceDMZ;
Aspera software at all nodes: Connect Server, Shares, browser and command line, and SDK clients.
ASPERA PRODUCT PORTFOLIO
TRANSFER CLIENTS: High-speed transfers for web, desktop and mobile
WEB APPLICATIONS: File sharing, collaboration and exchange applications
MANAGEMENT & AUTOMATION: Transfer management, monitoring and automation
SYNCHRONIZATION: Scalable, multi-directional, multi-node synchronization
TRANSFER SERVERS: High-speed file transfer servers for on premise, private, public, and hybrid cloud deployments
FASP TRANSPORT: Innovative, patented, highly efficient bulk data transport technology, unique and core to all Aspera products
FASP SOFTWARE ENVIRONMENT
ASPERA DEVELOPER NETWORK
A complete set of SDKs with guides, reference information, and sample code to assist developers with integrating Aspera technology into their own applications. Aspera fasp technology can be used in desktop, network-based, and web applications in place of FTP, HTTP, or custom TCP-based copy protocols.
ASPERA TRANSFER APIs
Aspera Web Services: A SOAP-based web service API that allows initiation, monitoring and control of fasp-based file transfers.
Aspera Web: JavaScript API exposed by the Aspera Connect client. It allows integration of fasp-based file transfers into web applications.
Connect 3.0 New, Enhanced JavaScript API: Introducing the new Connect 2.8 developer preview! Integrate the functionality of Aspera Connect 2.8, a fasp-based file transfer client, into your own web applications, while customizing it to your unique brand.
fasp Manager: A class library that allows initiation, monitoring and control of fasp-based file transfers.
Aspera Multicast SDK: A Java class library that allows initiation and management of IP multicast based data transmissions using Aspera fasp-mc.
ASPERA MOBILE APIs
Android SDK: The Aspera Android SDK provides a Java API to transfer files using fasp-air.
iPhone SDK: The Aspera iPhone SDK provides an Objective-C API to transfer files using fasp-air.
ASPERA APPLICATION APIs
faspex Web API: The Aspera faspex Web API provides a set of services that enables users to create and receive digital deliveries via a Web interface, while taking advantage of fasp high-speed transfer technology.
Console API: The Aspera Console API provides full programmatic management of transfer sessions including initiation, queuing, management and control through a RESTful API.
Shares API: The Aspera Shares API provides full programmatic control over browsing Shares, transfer authorization, and upload / download
ASPERA CONNECT SERVER
APPLICATION PLATFORM WITH WEB INTERFACE
Key Features and Benefits: built on top of the Enterprise Server application platform with Web interface; simple web portal exposes a root directory for each user; transfers via the Aspera Connect web browser plug-in; powers fasp transfers for Shares and faspex web applications; rich APIs enable integration into any custom web application.
Applications and Use Cases: global browser-based file exchange; field and third-party content or data gathering; integration with custom web applications.
Licensing and deployment: multi-tiered licensing based on bandwidth capacity; usage-based licensing with Aspera On Demand; comprehensive platform support; high-availability configuration (active/active or active/passive); supports connections with the Aspera Connect browser plug-in client; automatic installation of the Aspera Connect browser plug-in on the client side.
ASPERA CONNECT BROWSER PLUG-IN
Key Features:
High-performance fasp transport.
Support for all major browsers, including Chrome, Internet Explorer, Firefox and Safari, on Windows, Mac, and Linux (native 64 and 32 bit).
Upload/download client transparently launched by the browser.
File, directory, and multi-item transfers with queuing.
Automatic retry and resume of partial or failed transfers.
Built-in transfer monitor for visual, on-the-fly transfer rate control and monitoring.
Comprehensive transport and content security, integrated with at-rest encryption and easy one-click decryption of files and directories.
Automatic inline installation from the web browser with no administrative rights required.
Rich JavaScript API for a fully integrated branded experience, persistence across host reboots, and in-browser control.
HTTP fallback mode for highly restrictive networks, with support for authenticated proxies.
Localized for Japanese, Chinese, French and Spanish.
Application examples: upload/download client for Connect Server and Aspera faspex Server; browser-based, self-installable plug-in enables third-party access to content.
BULK DATA IN/OUT/BETWEEN CLOUDS
Problems of HTTP multipart at distance and Aspera solution
CHALLENGES FOR STORING BIG FILES IN DISTRIBUTED OBJECT CLOUD STORAGE
Storage APIs require that large files be divided into chunks (typically 64 MB to 128 MB); multiple replicas are distributed across the storage for durability. A 1 TB file requires 10,000 chunks at 100 MB per chunk!
Optimized for high-throughput access, BUT only for *chunk* sizes in parallel, i.e. no high-throughput whole-file read/write.
I/O protocol is HTTP only! HTTP PUT or GET by chunk.
SLOW over the WAN (due to the TCP throughput bottleneck over WAN), e.g. <1 to 10s of Megabits/s depending on distance.
Even local I/O is slow unless a parallel HTTP stream write/read is used, e.g. local file system drivers such as S3 fuse are notoriously slow (8-10 Megabytes/s).
HTTP security and access control is only as good as the application.
Simply no tools for inter-cloud data transfer: lock-in!
Most use cases need high-speed transfer, virtually unlimited size, and robust performance and security.
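The chunk arithmetic on this slide can be sketched in a few lines. The helper below is hypothetical and not part of any cloud SDK; it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and simply enumerates the byte ranges a chunked upload would have to cover.

```python
def chunk_ranges(file_size, chunk_size):
    """Yield (offset, length) for each chunk needed to cover file_size bytes."""
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)  # last chunk may be short
        yield offset, length
        offset += length

MB = 10**6
TB = 10**12

# Chunk counts for a 1 TB object at the chunk sizes mentioned on the slide.
for chunk_mb in (64, 100, 128):
    count = sum(1 for _ in chunk_ranges(1 * TB, chunk_mb * MB))
    print(f"{chunk_mb} MB chunks: {count}")
```

Each of those ranges becomes a separate HTTP PUT or GET, which is why per-request latency over the WAN dominates whole-file transfer time.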
STANDARD SOLUTION MULTI-PART UPLOAD
HIGH SPEED TRANSFER TO CLOUD STORAGE WITH DIRECT-TO-CLOUD
ASPERA ON DEMAND ADVANTAGES
Maximum speed: enables large data set transfers over any network at maximum speed, regardless of network conditions or distance; transfers large data sets of small files with the same efficiency as large single files; very lightweight, requiring no specialized or powerful hardware to maintain high speeds.
Adaptive rate control: provides precise rate control (pre-set and on-the-fly) for guaranteed transfer times; uses adaptive rate control to fully utilize available bandwidth while remaining fair to other traffic; supports on-the-fly configurable bandwidth sharing policies. Users may pre-set and change individual transfer rates and finish times as needed.
Complete security: includes complete, built-in security using open standard cryptography for user authentication, data encryption and data integrity verification.
Robust, software-only solution: uses standard, unmodified IP networking and is implemented in software as an application protocol; automatically resumes partial transfers and retries failed transfers.
Flexible open architecture: supports interoperable file and directory transfers between all major operating systems and provides a complete, modern software API to build upon.
TRANSFERRING LARGE FILES WITH THE CLOUD
SINGLE STREAM MULTIPART HTTP VS FASP: TRANSFER DATA TO CLOUD OVER WAN, EFFECTIVE THROUGHPUT
Typical internet conditions: 50-250 ms latency and 0.1-3% packet loss.
15 parallel HTTP streams: <10 to 100 Mbps depending on distance.
Aspera fasp transfer over WAN to cloud: up to 1 Gbps (per EC2 Extra Large Instance), independent of distance! 10 TB transferred per 24 hours.
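The daily volume figure can be checked with a short calculation. The sketch assumes decimal units (1 TB = 10^12 bytes) and a sustained rate over the full 24 hours; the `utilization` parameter is a hypothetical knob for modelling less than full use of the link.

```python
def daily_volume_tb(rate_gbps, utilization=1.0, hours=24):
    """Terabytes moved in the given number of hours at a sustained rate."""
    bits = rate_gbps * 1e9 * utilization * hours * 3600
    return bits / 8 / 1e12

print(f"{daily_volume_tb(1.0):.1f} TB per day at a sustained 1 Gbps")
```

A fully sustained 1 Gbps moves 10.8 TB in 24 hours under these assumptions, so the 10 TB per day quoted on the slide corresponds to a little over 90% average utilization.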
AWS ENHANCED UPLOADER VS ASPERA CONNECT
Effective throughput and transfer time for 2 GB, by location and available bandwidth:
Montreal to AWS East (100 Mbps shared Internet connection): AWS Enhanced Uploader 30 minutes (7-10 Mbps); Aspera Connect 3.7 minutes (80 Mbps); 9X speed up.
Rackspace in Dallas to AWS East (600 Mbps shared Internet connection): AWS Enhanced Uploader 7.5 minutes (38 Mbps); Aspera Connect 0.5 minutes (600 Mbps); 15X speed up.
Other pains: Enhanced Bucket Uploader requires a Java applet; very large transfers time out; no good resume for interrupted transfers; no downloads.
GSUTILS VS ASPERA ASCP
Effective throughput and transfer time for 2 GB, by location and available bandwidth:
Montreal to Google Storage (West US) (100 Mbps shared Internet connection): Google Uploader 30 minutes (10 Mbps); Aspera Connect 3.7 minutes (80 Mbps); 8X speed up.
AWS EMEA to Google Storage (West US) (400 Mbps shared Internet connection): Google gsutils 100 minutes (3 Mbps); Aspera ascp 1 minute (300 Mbps); 100X speed up.
Other pains: very large transfers time out; no good resume for interrupted transfers; no reliable cancel; no bandwidth control; etc.
AWS MULTIPART COPY VS ASPERA ASCP
Effective throughput and transfer time for 4.4 GB / 15,691 files, by location and available bandwidth:
New York to AWS East Coast (1 Gbps shared connection): AWS Multipart Copy 334 seconds (113 Mbps); Aspera ascp 107 seconds (353 Mbps); 3.3X speed up.
New York to AWS West Coast (1 Gbps shared connection): AWS Multipart Copy 8.7 GB in 1032 seconds (36 Mbps); Aspera ascp 8.7 GB in 110 seconds (353 Mbps); 9.4X speed up.
Effective throughput and transfer time for 8.7 GB / 18,995 files, by location and available bandwidth:
New York to AWS East Coast (1 Gbps shared connection): AWS Multipart Copy 477 seconds (156 Mbps); Aspera ascp 178 seconds (420 Mbps); 2.7X speed up.
New York to AWS West Coast (1 Gbps shared connection): AWS Multipart Copy 967 seconds (77 Mbps); Aspera ascp 177 seconds (420 Mbps); 5.4X speed up.
ASPERA SOFTWARE ON DEMAND
AVAILABLE ON AWS AND WINDOWS AZURE
Transfer Servers: on-demand high-speed fasp transport to, from, and across cloud infrastructure, with unlimited scale out through additional transfer server instances; base transfer server, or Web-enabled application platform with support for the Connect browser plug-in for browser-based transfers; direct integration with Amazon S3 and Windows Azure BLOB object storage for line-speed ingest.
Web Applications (AWS only): Aspera Shares: multi-node file ingest and sharing, provides access to content across any location with a powerful security and access model. Aspera faspex: global person-to-person file delivery, collaboration, and exchange with ad hoc content ingest and contribution.
Optional Clients: optional add-ons to extend content ingest and contribution to mobile devices and email clients; available for Apple iOS devices, Google Android, and the Microsoft Outlook email client, as well as the Cargo automatic downloader.
Licensing and Availability: tiered usage (up to 10 PB a month) based on GB transferred; no bandwidth caps, no user counts, no limit on the number of servers (or instances) one can run; month-to-month pay as you go, or annual and multi-year terms.
Available now: http://cloud.asperasoft.com
USE CASE 1: HIGH SPEED DATA INGEST WITH DIRECT-TO-CLOUD
USE CASE 2: INTRA-CLOUD TRANSFERS ACROSS SAME OR DIFFERENT CLOUD INFRASTRUCTURE
THE SOLUTION
Data migration from one region to another or from one provider to another.
Transfer database or application logs from one region to another for DR or business continuity.
[Diagram labels: fasp; Node; US West; US East]
USE CASE 3: SHARING AND COLLABORATION, HYBRID ACROSS PUBLIC & PRIVATE CLOUDS
THE SOLUTION
Shares Web app transparently communicates with Aspera server Nodes and displays content in a single user interface.
User browses authorized content across multiple shares.
Independent high-speed data transfers to/from Datacenter, AWS S3, and Windows Azure BLOB, transparent to the user.
[Diagram labels: fasp; Shares; Client, NY, NY; DMZ; Node; Datacenter, Emeryville, CA]
USE CASE 4: PERSON-TO-PERSON DELIVERY WITH DIRECT-TO-S3
THE SOLUTION
Faspex upload to S3 and notification to recipient.
Faspex download direct from S3.
[Diagram labels: fasp; faspex Person-to-Person; Connect Browser Plug-in; HTTP multipart; Los Angeles; Herndon, VA; New York]
SAAS, TRANSFER AND STORAGE SERVICE
- SaaS solution includes app and transfer costs
- Customer pays for object storage
- Aspera offers transfer service
1. User authenticates via SaaS web application
2. SaaS application communicates with Aspera via Node API
3. Download or upload from object storage via Aspera On Demand
4. SaaS application also has access to content in object storage
[Diagram labels: Aspera On Demand; HTTP multipart; Cloud Platform; fasp; Aspera Transfer Service; Storage Service; Client, NY, NY; HTTP; SaaS Service; SaaS Web application]
FILE SYNCHRONIZATION OVER WAN
Aspera Sync
PROBLEMS OF TRADITIONAL SYNCHRONIZATION
Typical situation: large numbers of files (1 million++) to be synced over the WAN, with time-critical deadlines.
Traditional rsync: finding changes is highly costly. Each file is compared over the WAN with costly, chatty messaging: 10 million files x 100 milliseconds RTT means comparison time is more than 24 hours!
Traditional rsync: uses TCP to transport new data, which is very slow over the WAN. On global WANs (100 millisecond RTT / 1% packet loss and worse) standard TCP rates are <<10 Mbps. On difficult global WANs (300 millisecond RTT / 3% packet loss and worse) standard TCP rates are <<1 Mbps. Transferring each gigabyte of data takes hours! Opening concurrent rsync jobs yields diminishing improvement and is hard to manage.
Traditional rsync: moving or renaming directories results in data copies over the WAN, e.g. a new build directory with the very same files gets copied over the network again. If the directory is large, this can take many hours.
Traditional rsync: no tracking, visibility, or bandwidth management. Progress is reported, at best, to command line standard output on the host where syncing occurs.
Proprietary synchronization and replication solutions have one or more of the above problems, and may only work between identical proprietary systems, e.g. Vendor A and Vendor B synchronization don't interoperate.
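The comparison-time estimate on this slide is a straightforward product of file count and round-trip time. The sketch below uses the slide's own simplification of one sequential round trip per file; the function name and the `round_trips_per_file` parameter are illustrative assumptions.

```python
def sequential_compare_seconds(num_files, rtt_ms, round_trips_per_file=1):
    """Time spent waiting on the network if every file costs whole round trips."""
    return num_files * round_trips_per_file * rtt_ms / 1000

secs = sequential_compare_seconds(10_000_000, 100)
print(f"{secs:,.0f} s = {secs / 3600:.0f} hours = {secs / 86400:.1f} days")
```

Under that assumption, 10 million files at 100 ms RTT cost about 1,000,000 seconds of pure waiting, well beyond the 24 hours quoted, before any file data is transferred.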
A FUNDAMENTAL SOLUTION
Aspera Sync solves these problems:
Finding changes requires no remote communication: changes are computed against a local snapshot with the fastest possible local comparison.
Uses Aspera FASP for line-speed transfers over the WAN with a single transport stream.
Moving or renaming directories is replicated as-is at the target.
Files with the same checksums are de-duplicated, saving network and storage bandwidth.
Sync progress, status, and throughput are globally visible and controllable through Aspera Console.
Fully interoperable between operating systems and storage systems.
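Computing changes against a local snapshot can be sketched as follows. This is a generic illustration of the idea, not Aspera Sync's implementation: it assumes a snapshot is a mapping from relative path to (size, modification time), and that a change in either value marks a file as modified. Function names are hypothetical.

```python
import os

def take_snapshot(root):
    """Record (size, mtime in ns) for every file under root, keyed by relative path."""
    snap = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            snap[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return snap

def diff_snapshots(old, new):
    """Compare two snapshots locally; no network round trips are involved."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed

# Example with hand-written snapshots instead of a real directory tree.
previous = {"a.dat": (100, 1), "b.dat": (200, 2)}
current = {"a.dat": (100, 1), "b.dat": (250, 9), "c.dat": (50, 5)}
print(diff_snapshots(previous, current))
```

Only the paths returned by the diff need to be sent, so the cost of finding changes depends on local file system speed rather than on WAN round-trip time.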
INTRODUCING ASPERA SYNC
HIGH-SPEED SYNCHRONIZATION AND REPLICATION
Cross-platform software application designed for high-performance synchronization and replication of large file stores over the WAN.
Applications and use cases: offsite synchronization and replication for storage migration, disaster recovery and business continuity; bi-directional system mirroring for alternate access to digital content; hub and spoke sync for high-speed content collection or distribution.
Features and Benefits:
High-speed, multi-directional synchronization of remote files and directories.
Highly scalable: designed for today's extremely large data sets.
Highly efficient: designed for long distance WANs.
Secure: matches security standards set by government, FIPS 140-2 compliant.
Platform agnostic: runs on industry standard Linux and Windows.
Storage agnostic: compatible with any standard file or block storage system.
Familiar rsync-like interface shrinks the learning curve for IT professionals.
One-to-one, one-to-many, and full mesh synchronization.
Unidirectional or bi-directional; one time, scheduled, or continuous.
ASPERA SYNC NEW FEATURES AVAILABLE IN 1.5
Integrated Reporting and Management in Console: full integration with Aspera Console 2.0 for global visibility into sync sessions and activity; reports synchronization progress, per file status, throughput; enables pause, resume and on-the-fly bandwidth priority and speed control for Sync sessions.
Advanced Features: rich include and exclude filters, full support for soft and hard links, configurable user and group ID; preserves access time, user and group IDs; file-level de-duplication eliminates unnecessary transfers, reduces sync times and reduces storage use; secure token authorization and document roots, as well as an API for embedding Sync.
Expanded platform support: OS X, Solaris and BSD in addition to Windows and Linux.
Introduction of distributed file system watcher: fastest possible capture of file system changes on NAS (NFS, CIFS) deployments, clusters of ingest servers and very large numbers of files; clusterable and scalable.
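File-level de-duplication by checksum can be illustrated with a short sketch. It is not the Aspera Sync algorithm, whose details the slides do not give; it assumes SHA-256 as the checksum and plans one transfer per distinct content, with every other copy recreated at the target from the file already sent. Function names are hypothetical.

```python
import hashlib
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def plan_transfers(paths):
    """Return (files to send, {duplicate path: path of the copy being sent})."""
    by_digest = defaultdict(list)
    for p in paths:
        by_digest[file_digest(p)].append(p)
    to_send = [group[0] for group in by_digest.values()]
    duplicates = {p: group[0] for group in by_digest.values() for p in group[1:]}
    return to_send, duplicates
```

For a copied build directory, every file would land in `duplicates`, so nothing beyond metadata would need to cross the WAN.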
SYNC USE CASES
Content Distribution / System Mirroring: automatically replicate digital content from one system to another; content is always available, offering high availability of the service.
Disaster Recovery / Business Continuity: replicate mission critical data from a primary site to one or more alternate sites; systems remain available after a system outage or site loss, and critical data is preserved.
Hub and Spoke for Collection or Distribution: daily collection or distribution of data across a large network of end points; continuous update of online content, software and media.
Data Migration for Diverse Storage Environments: replicate the entire file system, or portions of the file system; preserve file attributes such as permissions, access times, ownership, etc.
[Diagram labels: Web Application; Primary Site; Alternate Site]
ASPERA SYNC PERFORMANCE BENCHMARKS
First run, small files: synchronizing many small files (average size 100 KB) over a WAN of 100 ms / 1%
Tool    Number of files   Data set size   Sync time                  Throughput
Async   978,944           93.3 GB         9,968 sec (2.8 hours)      80.4 Mbps
rsync   978,944           93.3 GB         814,500 sec (9.4 days)     0.99 Mbps
Speed up difference: 81x
First run, large files: synchronizing many large files (average size 100 MB) over a WAN of 100 ms / 1%
Tool    Number of files   Data set size   Sync time                  Throughput
Async   5,194             500.1 GB        4,664 sec (1.3 hours)      921 Mbps
rsync   5,194             500.1 GB        4,320,000 sec (50 days)    0.98 Mbps
Speed up difference: 940x
Second run, small files: synchronization time after adding 31,056 files to 1 million small files (100 KB each) over a WAN of 100 ms / 1%
Tool    Existing files   Files added   Total size   Sync time                Throughput
Async   978,944          31,056        2.97 GB      947 sec (16 min)         26.9 Mbps
rsync   978,944          31,056        2.97 GB      37,076 sec (10.3 hrs)    0.68 Mbps
Speed up difference: 39x
Second run, large files: synchronization time after adding new files to a set of large files (100 MB) over a WAN of 100 ms / 1%
Tool    Existing files   Files added   Total size   Sync time                Throughput
Async   5,194            54            5.49 GB      54 sec                   871 Mbps
rsync   5,194            54            5.49 GB      54,573 sec (15 hrs)      0.86 Mbps
Speed up difference: 1000x
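The throughput and speed-up columns can be recomputed from the sizes and times. The sketch assumes that "GB" in the benchmark means 2^30 bytes and "Mbps" means 10^6 bits per second; that interpretation is inferred because it reproduces the Async figures, and it is not stated in the slides.

```python
def throughput_mbps(size_gib, seconds):
    """Average throughput in Mbps, treating the reported size as GiB (2**30 bytes)."""
    return size_gib * 2**30 * 8 / seconds / 1e6

def speed_up(slow_seconds, fast_seconds):
    """Ratio of sync times between the slower and the faster tool."""
    return slow_seconds / fast_seconds

print(f"Async, small files: {throughput_mbps(93.3, 9968):.1f} Mbps")
print(f"Async, large files: {throughput_mbps(500.1, 4664):.0f} Mbps")
print(f"Small-file speed up: {speed_up(814500, 9968):.1f}x")
```

The small-file time ratio works out to a little under 82, in line with the 81x on the slide; the large-file 940x matches the ratio of the two reported throughputs (921 / 0.98) rather than the ratio of sync times.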
ASPERA DRIVE
End User High Speed Desktop Browse, Transfer and Sync
PROBLEMS WITH CURRENT FILE SHARING SOLUTIONS
Performance: today's Drive, File Sync and File Sharing services are impractical for Big Data. On global WANs (100 millisecond RTT / 1% packet loss+) standard TCP rates are <<10 Mbps. On difficult global WANs (300 millisecond RTT / 3% packet loss+) standard TCP rates are <<1 Mbps. Transferring or synchronizing each gigabyte of data takes hours!
Data Size: large data sets (large files or large collections of files) cannot move; transfers and sync sessions are extremely slow and most often do not complete at all.
Security: security for hosted solutions is dependent on the security of the provider; access control is limited and integrity of file ownership is fragile; huge concerns over privacy and contractual obligations.
Limited deployment options: typical SaaS offerings don't support on premise or hybrid infrastructure; often a single infrastructure solution.
Limited sharing paradigm: users are typically locked into a single delivery option (Sync, Share, or Person-to-Person delivery).
NEW PRODUCT: ASPERA DRIVE
TOP NEW FEATURES AND CAPABILITIES
A new file sharing experience: brings together the best of Aspera faspex, Shares, and Sync technologies into a single unified platform for file-based collaboration; includes full access control, privacy and security of the Aspera FASP technology; single file access and control sharing platform across distributed sharing points, which may be on premise or in cloud.
Key Features:
Integrated secure desktop browsing of content on remote Aspera Servers with Aspera Shares authentication.
Drag and drop to initiate high-speed transfer to and from remote shares (on premise and in cloud).
Right click to send a faspex package and subscribe to automatically download faspex packages to the desktop.
Built on Aspera Sync technology for maximum speed: synchronize any network file system or local directory with other Drive users or remote servers, or mirror content from remote sites.
Email notification on new package availability or new content uploaded to a Share.
ASPERA IN SCIENCES
SAMPLE HPC CUSTOMERS IN THE LIFE SCIENCES
Pharmaceutical Companies; Transcoding; Public Data Sources/Repositories; Public and Private Research Institutions
THANK YOU
For more information, please see www.asperasoft.com