Netflix: Building Up and Scaling Out on Open Source




Netflix: Building Up and Scaling Out on Open Source Black Duck 2013

Presenters

Adrian Cockcroft is the director of architecture for the Cloud Systems team at Netflix. He is focused on availability, resilience, performance, and measurement of the Netflix cloud platform, and has presented at many conferences, including QCon San Francisco, Beijing and Tokyo. Adrian is also well known as the author of several books written while he was a Distinguished Engineer at Sun Microsystems: Sun Performance and Tuning; Resource Management; and Capacity Planning for Web Services. From 2004 to 2007 he was a founding member of eBay Research Labs. He graduated with a BSc in Applied Physics from The City University, London.

Andrew Aitken is the founder and GM of Olliance Consulting, the leading open source business and strategy consultancy and a division of Black Duck. With 15+ years of industry experience, Andrew is a recognized expert on strategies for FOSS commercialization and a leader in the open source community. He is the founder of the industry's only think tank on the future of commercial open source, a bi-annual event held in Napa, CA and Paris, France, and regularly attended by leading CEOs and visionaries. He has served as an expert witness on open source issues and has been an invited guest lecturer at Stanford's Entrepreneur program. Andrew has chaired and spoken internationally at multiple industry conferences, sits on the Board of Advisors of SugarCRM, DotNetNuke, and Funambol, and has personally worked with companies such as IBM, Microsoft, Intel and the U.S. Navy.

Olliance Consulting, a division of Black Duck
Open Source Strategy: Our Experience, Your Success

The world's leading organizations turn to Olliance Consulting to create and implement open source strategies to achieve business success. With more than a decade of experience and hundreds of engagements assisting companies ranging from start-ups to the world's largest corporations, Olliance creates innovative strategies to leverage the strategic, financial and technological advantages of open source software and methods.

Profile
  Open Source Software Industry's leading business consultancy
  Over 700 engagements to date
  Trusted Advisor to leading Fortune 2000 companies

Open Source Think Tank

The Open Source Think Tank is an invitation-only conference for 140 CEOs, CIOs, CTOs, legal experts, investors and other senior executives engaged in open source software. It is an annual event held in Napa, CA, and regularly attended by the industry's leading CEOs and visionaries. Visit osthinktank.com

Software is Eating the World (Marc Andreessen, 2011)

Cloud Native Open Source at Netflix
June 2013
Adrian Cockcroft
@adrianco #netflixcloud @NetflixOSS
http://www.linkedin.com/in/adriancockcroft

Cloud Native
NetflixOSS: Cloud Native On-Ramp
Netflix Open Source Cloud Prize

We are Engineers
  We solve hard problems
  We build amazing and complex things
  We fix things when they break

We strive for perfection
  Perfect code
  Perfect hardware
  Perfectly operated

But perfection takes too long
  So we compromise
  Time to market vs. Quality
  Utopia remains out of reach

Where time to market wins big
  Web services
  Agile infrastructure - cloud
  Continuous deployment

How Soon?
  Code features in days instead of months
  Hardware in minutes instead of weeks
  Incident response in seconds instead of hours

Tipping the Balance
  Utopia
  Dystopia

A new engineering challenge
  Construct a highly agile and highly available service from ephemeral and often broken components

Inspiration

Netflix Streaming: A Cloud Native Application

Netflix Member Web Site Home Page
  Personalization Driven
  How Does It Work?

How Netflix Streaming Works
[Diagram with three areas: Consumer Electronics, AWS Cloud Services, CDN Edge Locations. Labels: Customer Device (PC, PS3, TV); Web Site or Discovery API; Streaming API; User Data; Personalization; DRM; QoS Logging; OpenConnect CDN Boxes; CDN Management and Steering; Content Encoding]

Content Delivery Service
  Open Source Hardware Design + FreeBSD, bird, nginx

[Charts: Streaming Bandwidth, Nov 2012; Mean Bandwidth, March 2013. Labels: 18x; +39% 6mo; 25x; Amazon Video 1.31%]

Real Web Server Dependencies Flow
(Netflix Home page business transaction as seen by AppDynamics)
Each icon is three to a few hundred instances across three AWS zones
[Diagram labels: Start Here; Cassandra; memcached; Web service; S3 bucket; Personalization movie group choosers (for US, Canada and Latam)]

New Anti-Fragile Patterns
  Micro-services and Chaos engines
  Highly available systems composed from ephemeral components
  Open Source is the default
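The idea of a chaos engine acting on ephemeral components can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Netflix's Chaos Monkey code: the `Instance` record, the `pick_victims` function, and the per-group selection probability are all hypothetical names introduced here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical record for one running instance."""
    instance_id: str
    group: str  # the group of interchangeable instances it belongs to


def pick_victims(instances, probability, rng):
    """Pick at most one instance per group to terminate.

    Each group is selected with the given probability; within a selected
    group a single instance is chosen at random. The service is expected
    to survive because other instances in the group remain.
    """
    by_group = {}
    for inst in instances:
        by_group.setdefault(inst.group, []).append(inst)

    victims = []
    for group in sorted(by_group):
        members = by_group[group]
        # Assumption of this sketch: never remove a group's last instance.
        if len(members) > 1 and rng.random() < probability:
            victims.append(rng.choice(members))
    return victims


# Six instances split across two hypothetical groups, "web" and "api".
fleet = [Instance(f"i-{n:04d}", "api" if n % 2 else "web") for n in range(6)]
victims = pick_victims(fleet, probability=1.0, rng=random.Random(42))
```

With `probability=1.0` every group with more than one member loses exactly one instance, which is the situation a highly available service has to tolerate routinely.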

Cloud Native
  Master copies of data are cloud resident
  Everything is dynamically provisioned
  All services are ephemeral

How to get to Cloud Native
  Freedom and Responsibility for Developers
  Decentralize and Automate Ops Activities
  Integrate DevOps into the Business Organization

Netflix BusDevOps Organization
[Organization chart]
  Chief Product Officer
  VP Product Management; VP UI Engineering; VP Discovery Engineering; VP Platform
  Directors Product; Directors Development; Directors Development; Directors Platform
  Code, independently updated continuous delivery: Developers + DevOps; Developers + DevOps; Developers + DevOps
  Denormalized, independently updated and scaled data: UI Data Sources; Discovery Data Sources; Platform Data Sources
  Cloud, independently updated and scaled infrastructure: AWS; AWS; AWS

Four Transitions
  Management: Integrated Roles in a Single Organization
    Business, Development, Operations -> BusDevOps
  Developers: Denormalized Data NoSQL
    Decentralized, scalable, available, polyglot
  Responsibility from Ops to Dev: Continuous Delivery
    Decentralized small daily production updates
  Responsibility from Ops to Dev: Agile Infrastructure - Cloud
    Hardware in minutes, provisioned directly by developers

Cost reduction
  Slow down developers
  Less competitive
  Less revenue
  Lower margins
Process reduction
  Speed up developers
  More competitive
  More revenue
  Higher margins

What's Different?
  Get out of the way of innovation
  Best of breed, provisioned by the hour
  Choices based on features and scale
  Almost everything is Open Source

Decentralized Deployment

Asgard http://techblog.netflix.com/2012/06/asgard-web-based-cloud-management-and.html

Ephemeral Instances
  Largest services are autoscaled
  Average lifetime of an instance is 36 hours
[Chart labels: Push; Autoscale Up; Autoscale Down]
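An autoscaling decision of the kind implied by "Autoscale Up" and "Autoscale Down" might look like the sketch below. The function name, the CPU thresholds, the step sizes, and the group size limits are assumptions chosen for illustration; they are not taken from the slides or from the AWS Auto Scaling API.

```python
def desired_capacity(current, cpu_percent, low=30.0, high=60.0,
                     min_size=3, max_size=100):
    """Return the new instance count for an autoscaled group.

    Scale up by about 10% (at least one instance) when average CPU is above
    `high`, scale down by one instance when it is below `low`, and hold
    otherwise. The result is clamped to [min_size, max_size].
    """
    if cpu_percent > high:
        target = current + max(1, current // 10)
    elif cpu_percent < low:
        target = current - 1
    else:
        target = current
    return max(min_size, min(max_size, target))


# A group of 20 instances under load grows; an idle one shrinks slowly.
busy = desired_capacity(20, cpu_percent=75.0)
idle = desired_capacity(20, cpu_percent=10.0)
```

Scaling up faster than scaling down is a common, conservative choice: adding capacity late costs availability, while removing it late only costs money.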

Global Deployment

Cross Region Use Cases
  Geographic Isolation
    US to Europe replication of subscriber data
    Read intensive, low update rate
    Production use since late 2011
  Redundancy for regional failover
    US East to US West replication of everything
    Includes write intensive data, high update rate
    Testing now

Managing Multi-Region Availability
[Diagram: AWS Route53, UltraDNS and DynECT DNS, managed through Denominator, sit in front of two sets of Regional Load Balancers; each region has Zone A, Zone B and Zone C, each holding Cassandra Replicas]
Denominator: manage traffic via multiple DNS providers
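Managing traffic through several DNS providers at once can be sketched as one change applied through a common interface. Everything here is hypothetical: `InMemoryDnsProvider`, `steer_traffic`, the provider names, and the `example.com` hostnames are stand-ins for illustration and do not reflect Denominator's actual API.

```python
class InMemoryDnsProvider:
    """Hypothetical stand-in for one DNS vendor's API client."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # hostname -> list of regional endpoints

    def set_record(self, hostname, endpoints):
        self.records[hostname] = list(endpoints)


def steer_traffic(providers, hostname, regional_endpoints, healthy_regions):
    """Point `hostname` at the endpoints of healthy regions on every provider.

    Applying the same change to several providers through one interface means
    no single DNS vendor becomes a point of failure.
    """
    targets = [endpoint
               for region, endpoint in sorted(regional_endpoints.items())
               if region in healthy_regions]
    if not targets:
        raise ValueError("no healthy region to route to")
    for provider in providers:
        provider.set_record(hostname, targets)
    return targets


providers = [InMemoryDnsProvider(n) for n in ("provider-a", "provider-b")]
endpoints = {
    "us-east-1": "elb-east.example.com",
    "us-west-2": "elb-west.example.com",
}
# Simulate a regional failover: only us-west-2 is considered healthy.
active = steer_traffic(providers, "api.example.com", endpoints, {"us-west-2"})
```

After the call, both providers answer for `api.example.com` with the surviving region's load balancer only.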

Benchmarking Global Cassandra
  Write intensive test of cross region capacity
  16 x hi1.4xlarge SSD nodes per zone = 96 total
[Diagram: US-West-2 Region (Oregon) and US-East-1 Region (Virginia), each with Zone A, Zone B and Zone C holding Cassandra Replicas. Labels: Test Load; Validation Load; 1 Million writes CL.ONE; 1 Million reads CL.ONE with no data loss; Inter-Zone Traffic; Inter-Region Traffic, up to 9 Gbits/s, 83 ms; 18TB S3]
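The node count on this slide follows from the topology in the diagram (two regions with three zones each). The short calculation below checks it and also estimates how much data is in flight on the inter-region link at the quoted peak rate. Treating 83 ms as the delay to multiply by is an assumption, since the slide does not say whether the figure is one-way or round trip.

```python
NODES_PER_ZONE = 16      # from the slide
ZONES_PER_REGION = 3     # Zone A, B, C in the diagram
REGIONS = 2              # US-West-2 and US-East-1

total_nodes = NODES_PER_ZONE * ZONES_PER_REGION * REGIONS

# Rough bandwidth-delay product for the inter-region link.
LINK_GBITS_PER_S = 9     # "up to 9 Gbits/s"
LATENCY_MS = 83          # assumed to be the relevant delay (see lead-in)

bits_in_flight = LINK_GBITS_PER_S * 10**9 * LATENCY_MS // 1000
megabytes_in_flight = bits_in_flight / 8 / 10**6

print(total_nodes)           # 96
print(megabytes_in_flight)   # 93.375
```

Under that assumption, roughly 93 MB of replication traffic can be on the wire between regions at any moment at peak rate.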

Cloud Native Big Data

Netflix Dataoven
[Diagram labels: From cloud Services, ~100 Billion Events/day; Ursula; RDS Metadata; From C*, Terabytes of Dimension data; Aegisthus; Data Pipelines; Data Warehouse, Over 2 Petabytes; Gateways; Hadoop Clusters, AWS EMR; Tools; 1300 nodes; 800 nodes; Multiple 150 nodes; Nightly]
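To put the "~100 Billion Events/day" figure in perspective, it can be converted to an average per-second rate. This is simple arithmetic on the slide's rounded number, so the result is only an order-of-magnitude estimate and ignores daily peaks.

```python
EVENTS_PER_DAY = 100 * 10**9      # "~100 Billion Events/day" from the slide
SECONDS_PER_DAY = 24 * 60 * 60

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
print(round(events_per_second))   # 1157407
```

That is on the order of a million events per second, averaged over the day.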

A Cloud Native Open Source Platform

Beware of Geeks Bearing Gifts: Strategies for an Increasingly Open Economy
Simon Wardley, Researcher at the Leading Edge Forum

How did Netflix get ahead?

Netflix BusDevOps Org
  Doing it since 2009
  SaaS Applications
  PaaS for agility
  Public IaaS for AWS features
  Big data in the cloud
  Integrating many APIs
  FOSS from github
  Renting hardware for 1hr
  Coding in Java/Groovy/Scala

Traditional IT Operations
  Taking their time
  Pilot private cloud projects
  Beta quality installations
  Small scale
  Integrating several vendors
  Paying big $ for software
  Paying big $ for consulting
  Buying hardware for 3yrs
  Hacking at scripts

Netflix Platform Evolution
  2009-2010: Bleeding Edge Innovation
  2011-2012: Common Pattern
  2013-2014: Shared Pattern
Netflix ended up several years ahead of the industry, but it's becoming commoditized now

Making it easy to follow
  Exploring the wild west each time vs. laying down a shared route

Goals
  Establish our solutions as Best Practices / Standards
  Hire, Retain and Engage Top Engineers
  Build up Netflix Technology Brand
  Benefit from a shared ecosystem

How does it all fit together?

Example Application: RSS Reader

Zuul: Traffic Processing and Routing

Zuul Architecture http://techblog.netflix.com/2013/06/announcing-zuul-edge-service-in-cloud.html

Zuul Components
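Traffic processing and routing at the edge is often organized as a chain of small filters run in stages. The sketch below assumes three stages named `pre`, `route` and `post`; the filter functions, the context dictionary and the backend names are hypothetical. It is written in Python for illustration only and does not use Zuul's actual JVM-based API.

```python
def add_request_id(ctx):
    # A fixed value keeps the example deterministic.
    ctx["headers"]["X-Request-Id"] = "req-1"


def choose_origin(ctx):
    # Route API calls to one hypothetical backend, everything else to another.
    if ctx["path"].startswith("/api/"):
        ctx["origin"] = "api-backend"
    else:
        ctx["origin"] = "web-backend"


def record_route(ctx):
    ctx["log"].append((ctx["path"], ctx["origin"]))


FILTERS = {
    "pre": [add_request_id],    # runs before routing
    "route": [choose_origin],   # decides where the request goes
    "post": [record_route],     # runs after routing
}


def handle(path, filters=FILTERS):
    """Run one request through every filter of every stage, in order."""
    ctx = {"path": path, "headers": {}, "origin": None, "log": []}
    for stage in ("pre", "route", "post"):
        for run_filter in filters[stage]:
            run_filter(ctx)
    return ctx


result = handle("/api/feeds")
```

Keeping each concern in its own filter means routing rules can be changed without touching the rest of the edge service.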

What's Coming Next?
  Better portability
  More Features
  Higher availability
  Easier to deploy
  Contributions from end users
  Contributions from vendors
  More Use Cases

Vendor Driven Portability
Interest in using NetflixOSS for Enterprise Private Clouds
It's done when it runs Asgard
  Functionally complete; Demonstrated March; Released June in V3.3
  Some vendor interest; Needs AWS compatible Autoscaler
  Growing vendor interest; Openstack Heat getting there
Another very large vendor planning to demo NetflixOSS at July 17th Meetup

AWS 2009 Baseline features needed to support NetflixOSS Eucalyptus 3.3

Boosting the @NetflixOSS Ecosystem

Judges
  Aino Corry, Program Chair for Qcon/GOTO
  Simon Wardley, Strategist
  Martin Fowler, Chief Scientist, Thoughtworks
  Werner Vogels, CTO, Amazon
  Joe Weinman, SVP Telx, Author of Cloudonomics
  Yury Izrailevsky, VP Cloud, Netflix

[Diagram labels]
  Github: Registration Opened March 13
  Github: Apache Licensed Contributions
  Github: Close Entries September 15
  AWS Re:Invent: Award Ceremony Dinner November
  Six Judges
  Winners
  $10K cash
  $5K AWS
  Ten Prize Categories
  Netflix Engineering
  Nominations
  Categories
  Trophy
  AWS Re:Invent Tickets
  Entrants
  Conforms to Rules
  Working Code
  Community Traction

Functionality and scale now, portability coming
Moving from parts to a platform in 2013
Netflix is fostering a cloud native ecosystem
Rapid Evolution - Low MTBIAMSH (Mean Time Between Idea And Making Stuff Happen)

Slideshare NetflixOSS Details
  Lightning Talks Feb S1E1
    http://www.slideshare.net/ruslanmeshenberg/netflixoss-open-house-lightning-talks
  Asgard In Depth Feb S1E1
    http://www.slideshare.net/joesondow/asgard-overview-from-netflix-oss-open-house
  Lightning Talks March S1E2
    http://www.slideshare.net/ruslanmeshenberg/netflixoss-meetup-lightning-talks-androadmap
  Security Architecture
    http://www.slideshare.net/jason_chan/
  Cost Aware Cloud Architectures with Jinesh Varia of AWS
    http://www.slideshare.net/amazonwebservices/building-costaware-architectures-jineshvaria-aws-and-adrian-cockroft-netflix

Takeaway
  NetflixOSS makes it easier for everyone to become Cloud Native
  Open Source is not just the default, it's a strategic weapon
@adrianco #netflixcloud @NetflixOSS

Q&A

Amazon Cloud Terminology Reference
See http://aws.amazon.com/
This is not a full list of Amazon Web Services features

AWS: Amazon Web Services (common name for Amazon cloud)
AMI: Amazon Machine Image (archived boot disk, Linux, Windows etc. plus application code)
EC2: Elastic Compute Cloud
  Range of virtual machine types m1, m2, c1, cc, cg. Varying memory, CPU and disk configurations.
  Instance: a running computer system. Ephemeral, when it is de-allocated nothing is kept.
  Reserved Instances: pre-paid to reduce cost for long term usage
  Availability Zone: datacenter with own power and cooling hosting cloud instances
  Region: group of Availability Zones (US-East, US-West, EU-Eire, Asia-Singapore, Asia-Japan, SA-Brazil, US-Gov)
ASG: Auto Scaling Group (instances booting from the same AMI)
S3: Simple Storage Service (http access)
EBS: Elastic Block Store (network disk; filesystem can be mounted on an instance)
RDS: Relational Database Service (managed MySQL master and slaves)
DynamoDB/SDB: Simple Data Base (hosted http based NoSQL datastore, DynamoDB replaces SDB)
SQS: Simple Queue Service (http based message queue)
SNS: Simple Notification Service (http and email based topics and messages)
EMR: Elastic Map Reduce (automatically managed Hadoop cluster)
ELB: Elastic Load Balancer
EIP: Elastic IP (stable IP address mapping assigned to instance or ELB)
VPC: Virtual Private Cloud (single tenant, more flexible network and security constructs)
DirectConnect: secure pipe from AWS VPC to external datacenter
IAM: Identity and Access Management (fine grain role based security keys)