eccharts: Behind the scenes




The Web people at ECMWF <eccharts-support@lists.ecmwf.int>
ECMWF
October 1, 2015

eccharts: What users see (1)

eccharts: What users see (2)

eccharts: What users do not see (when we get it right!)

eccharts: Servers and operating systems
~500 CPUs
~1.5 TB memory
~200 TB disk
All running Linux (SuSE SLES 11 Service Pack 3)
Python 2.7 everywhere

eccharts: The front-end layer
Two hardware load balancers for distributing HTTP/HTTPS requests to:
  several Varnish HTTP caches (HTTP traffic)
    Very robust, used mainly for load-balancing to lower layers, helps with caching
  and several Apache instances (HTTPS traffic)
Several Nginx HTTP servers for streaming large static files (i.e. plots)

eccharts: The web applications layer
Several Django instances, taking care of:
  Access control
  Parsing user requests from the JavaScript
  Dispatching requests to the services layer below (see next slides)
Our usage of Django is very lightweight
  Most UI work is done in the JavaScript code above, and most data processing is done in the services layer below.
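The three responsibilities listed above can be illustrated with a small sketch. It is plain Python rather than Django, and everything in it is an assumption made for illustration: the names `ALLOWED_USERS`, `Forbidden`, `handle_request` and `fake_dispatch` are hypothetical, as is the JSON shape of the request.

```python
import json

# Hypothetical access list; the slides do not describe how access is decided.
ALLOWED_USERS = {"forecaster"}


class Forbidden(Exception):
    """Raised when the user may not call the requested service."""


def handle_request(user, body, dispatch):
    """Check access, parse the request body, and hand it to the services layer.

    `dispatch` is any callable taking a service name and its arguments;
    here it stands in for the services layer.
    """
    if user not in ALLOWED_USERS:
        raise Forbidden(user)
    request = json.loads(body)  # request sent by the JavaScript client
    return dispatch(request["service"], *request.get("args", []))


# Stand-in for the services layer: record the call and return a job id.
calls = []


def fake_dispatch(service, *args):
    calls.append((service, args))
    return "job-%d" % len(calls)


job_id = handle_request("forecaster", '{"service": "echo", "args": ["hi"]}', fake_dispatch)
print(job_id)  # job-1
```

The web layer in this sketch does no data processing of its own, which is the sense in which such a layer stays lightweight.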

eccharts: The (micro)services layer
A microservices architecture, based on Celery.
Using RabbitMQ, an AMQP request broker, for dispatching jobs and Redis, a key-value store, for storing job results.
Services written in Python.
Lots of caching everywhere, using Memcached.
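The split between a broker that dispatches jobs and a store that keeps their results can be illustrated without any of the real infrastructure. The following is a minimal sketch that assumes nothing about the actual eccharts code: an in-process `queue.Queue` stands in for RabbitMQ, a plain dict stands in for Redis, and the names `submit`, `worker` and `SERVICES` are hypothetical.

```python
import queue
import threading
import uuid

# Stand-ins (assumptions): a Queue plays the role of the broker,
# a dict plays the role of the result store.
jobs = queue.Queue()
results = {}
results_lock = threading.Lock()

# Hypothetical service registry: service name -> callable.
SERVICES = {
    "echo": lambda *args: " ".join(args),
    "sum": lambda *args: sum(float(a) for a in args),
}


def submit(name, *args):
    """Queue a job and return the id under which its result will be stored."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, name, args))
    return job_id


def worker():
    """Consume jobs until a None sentinel arrives, storing each outcome."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, name, args = job
        try:
            outcome = {"status": "ok", "value": SERVICES[name](*args)}
        except Exception as exc:  # report the failure instead of killing the worker
            outcome = {"status": "error", "value": str(exc)}
        with results_lock:
            results[job_id] = outcome
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()
echo_id = submit("echo", "hello", "world")
sum_id = submit("sum", "1", "2", "3.5")
bad_id = submit("sum", "one")
jobs.put(None)  # tell the worker to stop
jobs.join()     # wait until every queued item has been processed
thread.join()
```

The caller only ever holds a job id, so the worker could run in another process or on another host once the queue and the dict are replaced by a real broker and a real key-value store.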

eccharts: The (micro)services layer (2)

    # The inevitable `echo` and `sum` services
    from servicelib import errors, start_services


    def echo_service(context, *args):
        """Return the arguments joined by single spaces."""
        context.log.debug("executing echo() request from: %s", context.user)
        return " ".join(args)


    def sum_service(context, *args):
        """Return the sum of the arguments, rejecting non-numeric input."""
        try:
            # The conversion call is missing from the transcript; float() is assumed.
            args = [float(a) for a in args]
        except (TypeError, ValueError):
            raise errors.BadRequest("invalid args: %s" % (args,))
        return sum(args)


    if __name__ == "__main__":
        start_services({"name": "sum", "execute": "sum_service"},
                       {"name": "echo", "execute": "echo_service"})

eccharts: The (micro)services layer (3)

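eccharts: The (micro)services layer (4)

    from metview.macro import retrieve, sqrt


    def wind_speed(r):
        """Compute wind speed from the U and V wind components of request `r`."""
        if r['levtype'] == 'sfc':
            # Surface fields: 10 metre U and V wind components
            u = '165.128'
            v = '166.128'
        else:
            # Other level types: U and V components of wind
            u = '131.128'
            v = '132.128'
        r['param'] = u
        u = retrieve(r)
        r['param'] = v
        v = retrieve(r)
        return sqrt(u * u + v * v)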
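The computation above is the usual vector magnitude, speed = sqrt(u*u + v*v), applied point by point to the two wind components. A minimal sketch in plain Python follows; the helper name `wind_speed_values` is hypothetical and the sample values are made up, since `metview.macro` operates on whole fields rather than lists.

```python
import math


def wind_speed_values(u_values, v_values):
    """Return sqrt(u*u + v*v) for each pair of wind components (m/s)."""
    if len(u_values) != len(v_values):
        raise ValueError("u and v must have the same number of points")
    return [math.hypot(u, v) for u, v in zip(u_values, v_values)]


# Illustrative values only, not real forecast data.
speeds = wind_speed_values([3.0, -6.0, 0.0], [4.0, 8.0, 0.0])
print(speeds)  # [5.0, 10.0, 0.0]
```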

eccharts: The data layer
A distributed file storage service written in Python
Content is addressable as HTTP URLs by all services
New content pushed with HTTP PUT requests
Content is indexed in a MongoDB cluster

    > db.fields.findOne()
    {
        "param" : "msl",
        "base_time": ISODate("2013-04-08T00:00:00Z"),
        ..
        "locations" : [
            {"url": "http://host42.ecmwf.int/data0000.grib", "offset": 0, "length": 4158}
        ]
    }
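Each entry in `locations` carries a URL, a byte offset and a length, which is enough to pick a single field out of a larger GRIB file. The following is a minimal sketch, assuming the storage service honours HTTP `Range` requests (the slides do not say so); `range_header` and `fetch_field` are hypothetical names.

```python
import urllib.request


def range_header(offset, length):
    """Build an HTTP Range header value covering `length` bytes from `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Range end positions are inclusive, hence the -1.
    return "bytes=%d-%d" % (offset, offset + length - 1)


def fetch_field(location, timeout=30):
    """Fetch the bytes of one field described by a `locations` entry.

    Assumes the server supports byte ranges; not taken from the eccharts code.
    """
    request = urllib.request.Request(
        location["url"],
        headers={"Range": range_header(location["offset"], location["length"])},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = response.read()
    if len(data) != location["length"]:
        raise IOError("expected %d bytes, got %d" % (location["length"], len(data)))
    return data


location = {"url": "http://host42.ecmwf.int/data0000.grib", "offset": 0, "length": 4158}
print(range_header(location["offset"], location["length"]))  # bytes=0-4157
```

Only `range_header` is exercised here; `fetch_field` needs network access to a server that actually holds the file.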

eccharts: The data layer (2)
Twice a day:
  We push ~250 GB of GRIB data
  We insert ~250,000 records in the MongoDB indexes
  We remove ~250 GB of old GRIB data
  We remove ~250,000 old records from the MongoDB indexes
Life gets interesting at times
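A quick back-of-the-envelope check of these figures, using only the approximate numbers quoted above and decimal units (1 GB = 1000 MB):

```python
GB_PER_PUSH = 250          # ~250 GB of GRIB data per push
RECORDS_PER_PUSH = 250000  # ~250,000 index records per push
PUSHES_PER_DAY = 2         # "twice a day"

# Average size of one indexed field, in MB (decimal units).
mb_per_record = GB_PER_PUSH * 1000 / RECORDS_PER_PUSH

# Daily churn: data written, and index operations (inserts plus removals).
gb_written_per_day = GB_PER_PUSH * PUSHES_PER_DAY
index_ops_per_day = 2 * RECORDS_PER_PUSH * PUSHES_PER_DAY

print(mb_per_record)       # 1.0
print(gb_written_per_day)  # 500
print(index_ops_per_day)   # 1000000
```

So an average indexed field is about 1 MB, and the storage turns over roughly 500 GB a day in each direction.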

eccharts: The release process
We do releases every two weeks or so. The more frequent, the better!
We have two identical clusters of servers: production and pre-production
Release candidate tested in the pre-production cluster
On release day, production traffic is redirected to the pre-production cluster, and vice versa.
Usual downtime: 15 minutes
Should be transparent!
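The switch described above amounts to swapping the roles of two otherwise identical clusters. A minimal sketch of that bookkeeping follows, with hypothetical cluster names; how the traffic is actually redirected is not covered by the slides.

```python
def swap_roles(roles):
    """Return a new mapping with the production and pre-production clusters exchanged."""
    return {
        "production": roles["pre-production"],
        "pre-production": roles["production"],
    }


# Hypothetical cluster names, for illustration only.
before = {"production": "cluster-a", "pre-production": "cluster-b"}
after = swap_roles(before)
print(after)  # {'production': 'cluster-b', 'pre-production': 'cluster-a'}
```

Applying the swap twice returns the original assignment, so rolling back a bad release is the same operation as releasing.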

eccharts: DevOps
Two different teams: Development and Operations
We talk to each other

eccharts: Behind the scenes
Questions?