Transforming cloud infrastructure to support Big Data
Ying Xu, Aspera, Inc.
Presenters and Agenda
PRESENTER
  Ying Xu, Principal Engineer, Aspera R&D
  ying@asperasoft.com
AGENDA
  - Challenges in Moving Big Data
  - Aspera Cloud Technology
  - Use Cases
  - Q & A
Aspera's mission
Creating next-generation transport technologies that move the world's digital assets at maximum speed, regardless of file size, transfer distance and network conditions.
Trends in Technology
Big Data Explosion
  - 90% of data today is file-based or unstructured
  - Mix of file sizes, but larger and larger files are the norm
Diversity of IP Networks: Media, Bandwidth Rates, and Conditions
  - Variable bandwidth rates (slow to super-fast)
  - Bandwidth rates increasing, costs decreasing
  - Network media remains diverse (terrestrial, satellite, wireless)
  - Conditions vary: all networks are prone to degradation over distance
Global Workflows: moving Big Data over WANs
  - Teams are geographically dispersed
  - Over distance, network conditions degrade
  - Contemporary TCP acceleration solutions are not designed for big data transfer and replication
Cloud Computing Grows Up
  - Amazon Web Services (AWS) S3 cloud storage: 262 billion objects in 2010, 1.3 trillion objects in 2012
  - More choices: Microsoft Azure, OpenStack, HP Cloud
  - No longer a niche: Netflix (transcoding), MTV (global video distribution), BGI (genomic sequencing)
Underlying technology bottlenecks
Transport protocols
  - TCP and TCP variants
  - Strongest constraint: often under 10 Mbps on commercial WAN
Storage
  - Not designed for single high-speed readers and writers
  - Often constrains transfers to 100s of Mbps
  - Cloud storage: slow I/O as a file system
Computer architecture
  - Commodity hardware has limitations in processing I/O interrupts
  - Often constrains transfers to 1-2 Gbps
[Figure: bottlenecks numbered 1 to 3 along the transfer path, labeled "WAN" and "Last Foot Bottleneck"]
Challenges with TCP and alternatives
Distance degrades conditions on all networks
  - Latency (round-trip time) increases
  - Packet loss increases
  - Fast networks are just as prone to degradation
TCP performance degrades with distance
  - Throughput bottleneck becomes more severe with increased latency and packet loss
TCP does not scale with bandwidth
  - TCP was designed for low bandwidth
  - Adding more bandwidth does not improve throughput
Alternative technologies
  - TCP-based: network latency and packet loss must be low
  - Modified TCP: improves TCP performance but is insufficient for fast networks
  - UDP traffic blasters: inefficient and waste bandwidth
  - Data caching: inappropriate for many large file transfer workflows
  - Data compression: time consuming and impractical for certain file types
  - CDNs & co-lo build-outs: high overhead and expensive to scale
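The claim that TCP throughput is bounded by latency and loss rather than by link bandwidth can be illustrated with the widely used Mathis approximation for steady-state TCP throughput, rate ≈ (MSS / RTT) × C / sqrt(p). The slides do not cite this formula; it is used here only as a rough sketch, and the 1460-byte MSS and the constant C = sqrt(3/2) are assumptions.

```python
import math


def tcp_throughput_mbps(rtt_ms: float, loss_rate: float, mss_bytes: int = 1460) -> float:
    """Approximate ceiling on steady-state TCP throughput (Mathis model), in Mbps."""
    if rtt_ms <= 0 or not 0 < loss_rate < 1:
        raise ValueError("rtt_ms must be positive and loss_rate must be in (0, 1)")
    c = math.sqrt(3 / 2)              # model constant (assumption: no delayed ACKs)
    rtt_s = rtt_ms / 1000
    bits_per_second = (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)
    return bits_per_second / 1e6


# Link bandwidth does not appear anywhere in the formula.
for rtt_ms, loss in [(10, 0.0001), (50, 0.001), (100, 0.001), (250, 0.03)]:
    print(f"RTT {rtt_ms:>3} ms, loss {loss:.2%}: {tcp_throughput_mbps(rtt_ms, loss):8.2f} Mbps")
```

Under this model, the "typical internet conditions" quoted later in the deck (50 to 250 ms latency, 0.1 to 3% loss) give a single stream less than 10 Mbps no matter how fast the link is.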
Solving the transfer bottleneck
fasp™: a reliable bulk data transport protocol that completely separates reliability and rate control
  - Uses standard UDP in the transport layer
  - Uses a theoretically optimized approach that retransmits precisely the real packet loss on the channel
  - Uses separate control systems for reliability (retransmission of dropped data) and rate control, both optimized to achieve the highest possible effective rate without cost
Rate-based congestion control with packet delay (not loss) as congestion feedback
  - Detects incipient congestion faster
  - Instead of a sliding window, packets are scheduled for sending based upon a calculated rate, governed by a rate controller
Decoupling reliability and rate control
  - Unlike TCP, new packets need not slow down for the retransmission of lost data
  - Lost data is retransmitted at the available bandwidth inside the end-to-end path, with nearly zero duplicate retransmission
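The idea of retransmitting "precisely the real packet loss" can be sketched with a toy model in which the receiver reports the blocks it is missing and the sender resends only those. This is not the fasp implementation: the round-based loop, the random loss model and all names are assumptions chosen for illustration, and a real sender would interleave retransmissions with new data at the controlled rate.

```python
import random


def transfer(num_blocks: int, loss_rate: float, seed: int = 1) -> dict:
    """Toy transfer: resend only the blocks the receiver reports as missing."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    rng = random.Random(seed)         # seeded so runs are repeatable
    received = set()
    pending = list(range(num_blocks)) # blocks to send in the current round
    sent = lost = rounds = 0
    while pending:
        rounds += 1
        missing = []
        for block in pending:
            sent += 1
            if rng.random() < loss_rate:
                lost += 1
                missing.append(block) # receiver will ask for this block again
            else:
                received.add(block)
        pending = missing
    return {"sent": sent, "lost": lost, "rounds": rounds, "delivered": len(received)}


stats = transfer(num_blocks=10_000, loss_rate=0.30)
duplicates = stats["sent"] - stats["lost"] - stats["delivered"]
print(stats, "duplicate blocks received:", duplicates)
```

Every transmission beyond the first copy of each block corresponds to an actual loss, so the receiver never gets a block twice, even at 30% loss.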
Rate-based congestion control
  - The network is a shared resource, and each flow imposes a price on others by using it
  - Assumes each link in the network has a fixed bandwidth (BW)
  - Based on the congestion and queue size, generates a price for the link (queuing delay at a router)
  - The rate controller then (mathematically) drives each flow to a rate that maximizes the sum of the utilities (rate / congestion price) of all flows, without exceeding the BW capacity
  - When there is NO congestion, the rate controller drives the transfer rate towards BW as fast as possible
[Figure: queue size q over time t, with Qmax and times t0 and t1 marked]
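A minimal sketch of delay-based rate control on a single bottleneck link follows. It is a generic textbook-style controller, not the fasp rate controller: the logarithmic utility, the gain, the time step and the meaning of `weights` (the amount of data, in megabits, each flow aims to keep queued) are all assumptions made for illustration.

```python
def simulate(capacity_mbps: float, weights: list[float],
             gain: float = 2500.0, dt: float = 0.001, steps: int = 5000):
    """Toy delay-based rate control; returns (rates in Mbps, queuing delay in s)."""
    rates = [1.0] * len(weights)      # every flow starts slowly
    delay = 0.0                       # queuing delay acts as the congestion price
    for _ in range(steps):
        total = sum(rates)
        # The queue grows while offered load exceeds capacity and drains otherwise.
        delay = max(0.0, delay + dt * (total - capacity_mbps) / capacity_mbps)
        # Each flow speeds up while delay is low and backs off as delay rises.
        rates = [r + dt * gain * (w - r * delay) for r, w in zip(rates, weights)]
    return rates, delay


rates, delay = simulate(capacity_mbps=100.0, weights=[1.0, 1.0, 1.0])
print([round(r, 2) for r in rates], f"queuing delay {delay * 1000:.1f} ms")
```

With no queue the rates ramp up quickly; once a queue forms, the growing delay pushes the flows back until their sum matches the link capacity and each holds an equal share.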
fasp: high-performance transport
Maximum transfer speed
  - Optimal end-to-end throughput efficiency
  - Transfer performance scales with bandwidth, independent of transfer distance and resilient to packet loss
  - Automatic, full utilization of available bandwidth
Optimized reliability
  - Less than 0.1% retransmission overhead on 30% packet loss
  - High performance with large files or large sets of small files
Uncompromising security
  - Secure user/endpoint authentication
  - AES-128 cryptography in transit and at rest
Resulting in
  - Transfers up to thousands of times faster than FTP
  - Precise and predictable transfer times
  - Extreme scalability (concurrency and throughput)
fasp™ performance breakthrough

FTP transfer times
  File size   Across US     US - EU       US - Asia     Satellite
  1 GB        1-2 hrs       2-4 hrs       4-20 hrs      8-20 hrs
  10 GB       15-20 hrs     20-40 hrs     Impractical   Impractical
  100 GB      Impractical   Impractical   Impractical   Impractical
TCP transfer times are limited by packet loss and delay (network distance), NOT BANDWIDTH

fasp transfer times
  File size   2 Mbps     10 Mbps    45 Mbps    100 Mbps   200 Mbps   1 Gbps
  1 GB        70 min     14 min     3.2 min    1.4 min    42 sec     8.4 sec
  10 GB       11.7 hrs   140 min    32 min     14 min     7 min      1.4 min
  100 GB                 23.3 hrs   5.3 hrs    2.3 hrs    1.2 hrs    14 min
Aspera transfer times shorten linearly with bandwidth, independent of packet loss and delay (network distance)
Overhead by path: cross US add 1% to 5%; intercontinental add 1% to 10%; satellite add 1% to 10%
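The fasp figures can be checked with simple arithmetic: transfer time is file size divided by bandwidth, plus a small overhead. The sketch below assumes decimal gigabytes and a 5% allowance; the 5% is inferred from the numbers and sits at the top of the "cross US" range, it is not stated as the basis of the table.

```python
def transfer_seconds(size_gb: float, rate_mbps: float, overhead: float = 0.0) -> float:
    """Seconds to move size_gb (decimal gigabytes) at rate_mbps, plus fractional overhead."""
    if size_gb < 0 or rate_mbps <= 0 or overhead < 0:
        raise ValueError("size and overhead must be non-negative, rate must be positive")
    bits = size_gb * 8e9
    return bits / (rate_mbps * 1e6) * (1 + overhead)


for size_gb in (1, 10, 100):
    row = [transfer_seconds(size_gb, rate, overhead=0.05) for rate in (10, 100, 1000)]
    print(f"{size_gb:>3} GB:", ", ".join(f"{seconds / 60:.1f} min" for seconds in row))
```

For example, 1 GB at 1 Gbps comes to 8 seconds of raw transfer, or 8.4 seconds with the 5% allowance, and every tenfold increase in bandwidth cuts the time by a factor of ten.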
Advanced bandwidth management
Virtual bandwidth cap
  - Virtual link: distributed rate control that allows flows to share a portion of link bandwidth while leaving the rest for other traffic
  - QoS-style control without support from routers or ANY hardware
  - Preserves bandwidth for other applications (VoIP, video)
Decentralized architecture
  - Each flow infers the aggregate traffic from intermittent multicast
  - The aggregate traffic is compared with the preset virtual link capacity to infer a virtual queuing delay
  - The virtual queuing delay is fed back as the congestion price, which eventually governs the transfer rate
Flexible control
  - BW cap capacity can be changed according to a schedule
Through distributed multicast, flows calculate the aggregate sending rate and compare it to the virtual bandwidth target value to infer a virtual queuing delay
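The virtual queue can be sketched as a number that each sender keeps locally: it grows while the flows' combined rate exceeds the virtual cap and drains otherwise. The update rule, the sampling interval and the function name below are assumptions for illustration, not the Aspera algorithm.

```python
def virtual_queuing_delays(aggregate_mbps, cap_mbps, interval_s=1.0):
    """Virtual queuing delay (seconds) after each aggregate-rate sample."""
    if cap_mbps <= 0 or interval_s <= 0:
        raise ValueError("cap_mbps and interval_s must be positive")
    backlog_mb = 0.0                  # megabits "queued" on the virtual link
    delays = []
    for rate in aggregate_mbps:
        # No real packets are queued; this is bookkeeping only.
        backlog_mb = max(0.0, backlog_mb + (rate - cap_mbps) * interval_s)
        delays.append(backlog_mb / cap_mbps)
    return delays


# Aggregate rate of all flows, as learned from the multicast reports.
print(virtual_queuing_delays([120, 120, 80, 60], cap_mbps=100))
# prints [0.2, 0.4, 0.2, 0.0]
```

Because the delay is computed rather than measured at a router, the cap can be changed at any time by giving the senders a new `cap_mbps`.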
Solving the Transport Bottleneck: Advanced Bandwidth Sharing
Flexible bandwidth sharing
  - Allows applications to build intentional control into their transport service
  - Built-in response to network congestion provides a virtual handle to offer differentiated BW sharing
  - User-specified high/low priority
  - Example: flows 1 and 2 use utility function U1(x); flow 3 uses U3(x) = 2 U1(x)
Advanced management functionality
  - Priority changes on the fly
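To see what U3(x) = 2 U1(x) means for the resulting shares, assume a logarithmic utility U1(x) = log(x); the slide does not say which utility function is used. Maximizing the sum of utilities on one link then gives each flow a share of the capacity proportional to its weight.

```python
def weighted_shares(capacity_mbps: float, weights: list[float]) -> list[float]:
    """Rates that maximize sum(w * log(x)) subject to sum(x) <= capacity."""
    if capacity_mbps <= 0 or not weights or any(w <= 0 for w in weights):
        raise ValueError("capacity and all weights must be positive")
    total_weight = sum(weights)
    return [capacity_mbps * w / total_weight for w in weights]


# Flows 1 and 2 have weight 1; flow 3 has twice the utility, so weight 2.
print(weighted_shares(100.0, [1.0, 1.0, 2.0]))
# prints [25.0, 25.0, 50.0]
```

Under this assumption, changing a flow's priority on the fly amounts to changing its weight; the other flows adjust because they all see the same congestion price.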
Challenges for Storing Big Files in Cloud Storage
  - Large files are divided into chunks, typically 64 MB - 128 MB, with multiple replicas distributed across the storage
  - A 1 TB file requires 10,000 chunks at 100 MB per chunk!
  - I/O protocol is HTTP only: HTTP PUT or GET by chunk
  - SLOW over the WAN due to the TCP throughput bottleneck
  - Even local I/O is slow unless parallel HTTP streams are used for writes and reads; local file system drivers such as S3 fuse are notably slow, e.g. 8-10 megabytes/s
  - Security and access control are only as good as the application
  - Simply no tools for inter-cloud data transfer: lock-in!
  - Media use cases need high-speed transfer, virtually unlimited size, robust performance & security
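The chunk count follows directly from the sizes. The helper below is an illustrative sketch (the function name and the use of decimal units are assumptions) that lists the byte range each HTTP PUT or GET would cover.

```python
def chunk_ranges(file_size: int, chunk_size: int):
    """Yield (offset, length) for each chunk; the last chunk may be shorter."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be non-negative and chunk_size positive")
    for offset in range(0, file_size, chunk_size):
        yield offset, min(chunk_size, file_size - offset)


MB = 10**6
TB = 10**12

chunks = list(chunk_ranges(1 * TB, 100 * MB))
print(len(chunks))   # 10000
print(chunks[-1])    # (999900000000, 100000000)
```

Each of these 10,000 ranges becomes a separate HTTP request, which is why per-request latency and single-stream TCP throughput dominate the total transfer time.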
Big data cloud storage challenge
fasp 3: New generation high-speed transport
Maximized efficiency for cloud object store
  - Small file metadata transmission moved from TCP to FASP, so maximal efficiency is achieved independent of file size
Parallel I/O architecture optimized for multi-core CPUs
  - Number of I/O workers is configurable to fit native storage system characteristics
  - Multi-stage I/O processing opens the door for in-line encryption and LZO compression
Parallel HTTP forwarding overcomes the last foot storage bottleneck
  - Incoming FASP traffic is forwarded to internal storage via a parallel HTTP API
  - Number of HTTP streams is tuned to optimize throughput
  - Gbps transfer speed achieved for WAN upload/download to the cloud
Infrastructure agnostic
  - Virtual file system adapter supports cloud and on-premise storage
  - Hybrid cloud and on-premise architecture
[Figure: deployment options labeled CLOUD, HYBRID, ON PREMISE]
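The parallel forwarding step can be sketched as a pool of workers that each write one part of the incoming data to the object store. `FakeObjectStore` and `forward` are hypothetical names: the store is an in-memory stand-in for a multipart HTTP API, so this shows only the structure, not the Aspera implementation or any real cloud API.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeObjectStore:
    """In-memory stand-in for an object store's multipart HTTP API (hypothetical)."""

    def __init__(self):
        self.parts = {}

    def put_part(self, key: str, part_number: int, data: bytes) -> None:
        # A real implementation would issue one HTTP PUT per part here.
        self.parts[(key, part_number)] = data

    def assemble(self, key: str) -> bytes:
        numbers = sorted(n for k, n in self.parts if k == key)
        return b"".join(self.parts[(key, n)] for n in numbers)


def forward(store, key: str, payload: bytes, part_size: int, streams: int) -> int:
    """Split payload into parts and store them using `streams` concurrent workers."""
    parts = [payload[i:i + part_size] for i in range(0, len(payload), part_size)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(store.put_part, key, n, part)
                   for n, part in enumerate(parts)]
        for future in futures:
            future.result()           # re-raise any error from a worker
    return len(parts)


store = FakeObjectStore()
payload = bytes(range(256)) * 1000    # 256,000 bytes of sample data
count = forward(store, "video.mxf", payload, part_size=64_000, streams=8)
print(count, store.assemble("video.mxf") == payload)
```

The `streams` argument plays the role of the tunable number of HTTP streams: raising it hides per-request latency until the storage back end or the CPU becomes the limit.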
Overcoming both bottlenecks
#1 TRANSFER DATA TO EC2 OVER WAN
Typical internet conditions: 50-250 ms latency & 0.1-3% packet loss
  Method                                            Effective throughput
  HTTP transfer over WAN (single stream)            <10 Mbps
  15 parallel HTTP streams                          <10 to 100 Mbps
  Aspera fasp transfer over WAN to EC2              up to 1 Gbps (per EC2 Extra Large Instance)
#2 TRANSFER DATA FROM EC2 TO S3
  Method                                            Effective throughput
  Standard single stream HTTP                       10 to 100 Mbps
  Aspera S3 Proxy with parallel I/O HTTP streams    up to 1 Gbps (per EC2 Extra Large Instance)
10 TB transferred per 24 hours
Aspera On Demand: High Speed Ingest with Direct-to-Cloud
Intra-cloud Transfers
ACROSS SAME OR DIFFERENT CLOUD INFRASTRUCTURE
THE SOLUTION
  - Data migration from one region to another or from one provider to another
  - Transfer database or application logs from one region to another for DR or Business Continuity
[Figure: fasp transfers between Nodes in US West and US East]
Aspera product portfolio
  - TRANSFER CLIENTS: High-speed transfers for web, desktop and mobile
  - WEB APPLICATIONS: File sharing, collaboration and exchange applications
  - MANAGEMENT & AUTOMATION: Transfer management, monitoring and automation
  - SYNCHRONIZATION: Scalable, multi-directional, multi-node synchronization
  - TRANSFER SERVERS: High-speed file transfer servers for on premise, private, public, and hybrid cloud deployments
  - FASP TRANSPORT: Innovative, patented, highly efficient bulk data transport technology, unique and core to all Aspera products
Aspera Developer Network
ASPERA MOBILE APIs
  - Android SDK: Aspera Android SDK provides a Java API to transfer files using fasp-air.
  - iPhone SDK: Aspera iPhone SDK with an Objective-C API to transfer files using fasp-air.
ASPERA BROWSER APIs
  - Connect JavaScript API: JavaScript API exposed by Aspera Connect for integration of fasp-based file transfers into web applications for a complete in-browser experience.
ASPERA APPLICATION APIs
  - Shares API: Full programmatic control over browsing Shares, transfer authorization, and upload / download.
  - faspex Web API: A set of services that enables users to create and receive digital deliveries via a Web interface, while taking advantage of fasp high-speed transfer technology.
  - Console API: Full programmatic management of transfer sessions including initiation, queuing, management and control through a RESTful API.
  - Aspera Web Services: A SOAP-based web service API that allows initiation, monitoring and control of fasp-based file transfers.
ASPERA TRANSFER APIs
  - fasp Manager: A class library that allows initiation, monitoring and control of fasp-based file transfers.
  - Aspera Multicast SDK: A Java class library for initiation and management of IP multicast based data transmissions using Aspera fasp-mc.
Ingest and Sharing
HYBRID ACROSS PUBLIC & PRIVATE CLOUDS
THE SOLUTION
  - Shares web app transparently communicates with Aspera server Nodes and displays content in a single user interface
  - User browses authorized content across multiple shares
  - Independent high-speed data transfers to/from the datacenter, AWS S3, and Windows Azure BLOB, transparent to the user
[Figure: client (NY, NY), Shares, DMZ, Nodes, and datacenter (Emeryville, CA), connected by fasp]
Automated End-to-end Workflow
HYBRID ACROSS PUBLIC & PRIVATE CLOUDS
THE SOLUTION: AUTOMATE with Aspera Console and Aspera Orchestrator
  1. Content is transferred and ingested to an on-premise Aspera server
  2. Aspera high-speed transfer Direct-to-S3 with Aspera On Demand
  3. Multiple parallel transcoding jobs, with output stored back to S3
  4. Faspex packages are created, sent and downloaded by the customer
  5. Media files are archived to Azure BLOB or AWS S3 / Glacier
[Figure: workflow stages INGEST, TRANSFORM, DISTRIBUTE, ARCHIVE, showing the media company, DMZ, Aspera Nodes 1 and 2, S3, transcoding service, Faspex On Demand, media customer, Azure BLOB and Glacier]
THANK YOU FOR JOINING!
Ying Xu, Principal Engineer, Aspera R&D
Questions?