Scaling Apache 2.x > 20,000 concurrent downloads

Similar documents
Davor Guttierrez 3 Gen d.o.o. Optimizing Linux Servers

Scalability and Performance with Apache 2.0

What's new in httpd 2.2?

Benchmarking FreeBSD. Ivan Voras

Wednesday, October 10, 12. Running a High Performance LAMP stack on a $20 Virtual Server

Apache and Tomcat Clustering Configuration Table of Contents

Web Server Deathmatch

NetIQ Access Manager 4.1

Secure Web. Hardware Sizing Guide

TODAY web servers become more and more

CLOUDSPECS PERFORMANCE REPORT LUNACLOUD, AMAZON EC2, RACKSPACE CLOUD AUTHOR: KENNY LI NOVEMBER 2012

Web Hosting for Fame and Fortune. A Guide to using Apache as your web-server solution

PARALLELS CLOUD STORAGE

The current version installed on your server is el6.x86_64 and it's the latest available.

Magento & Zend Benchmarks Version 1.2, 1.3 (with & without Flat Catalogs)

An Esri White Paper January 2010 Performance and Throughput Tips for ArcGIS Server Cached Map Services and the Apache HTTP Server

Achieving High Throughput. Fernando Castano Sun Microsystems

Hyper-V vs ESX at the datacenter

Apache Performance Tuning

Maximizing VMware ESX Performance Through Defragmentation of Guest Systems. Presented by

iscsi Performance Factors

Apache2 Configuration under Debian GNU/Linux. Apache2 Configuration under Debian GNU/Linux

HPC performance applications on Virtual Clusters

How To Configure Apa Web Server For High Performance

Apache Tomcat & Reverse Proxies

Filesystems Performance in GNU/Linux Multi-Disk Data Storage

MAGENTO HOSTING Progressive Server Performance Improvements

Cloud Computing Workload Benchmark Report

Sawmill Log Analyzer Best Practices!! Page 1 of 6. Sawmill Log Analyzer Best Practices

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

Recommended hardware system configurations for ANSYS users

Microsoft Windows Server 2003 with Internet Information Services (IIS) 6.0 vs. Linux Competitive Web Server Performance Comparison

Investigation of storage options for scientific computing on Grid and Cloud facilities

DualFS: A New Journaling File System for Linux

An Architectural study of Cluster-Based Multi-Tier Data-Centers

Tech Tip: Understanding Server Memory Counters

CS197U: A Hands on Introduction to Unix

Configuring Apache Derby for Performance and Durability Olav Sandstå

Business white paper. HP Process Automation. Version 7.0. Server performance

NAS Storage needs to be purchased; Will not be offered IAAS - Utility SMTP Per SMTP account Per server

Acronis Backup & Recovery 10 Server for Linux. Installation Guide

Of Penguins and Wildebeest. Anthony Rodgers VA7IRL

Vizrt Community Expansion Performance Guide

PHP web serving study Performance report

owncloud Enterprise Edition on IBM Infrastructure

XTM Web 2.0 Enterprise Architecture Hardware Implementation Guidelines. A.Zydroń 18 April Page 1 of 12

Performance and scalability of a large OLTP workload

CIA Lab Assignment: Web Servers

Efficient and Large-Scale Infrastructure Monitoring with Tracing

Benchmarking Cassandra on Violin

Installing Apache Software

Created by : Ashish Shah, J.M. PATEL COLLEGE UNIT-5 CHAP-1 CONFIGURING WEB SERVER

AppDynamics Lite Performance Benchmark. For KonaKart E-commerce Server (Tomcat/JSP/Struts)

Enabling Technologies for Distributed Computing

Apache Tomcat. Load-balancing and Clustering. Mark Thomas, 20 November Pivotal Software, Inc. All rights reserved.

Optimizing TYPO3 performance

Using VMware Player. VMware Player. What Is VMware Player?

Parallels Virtuozzo Containers 4.7 for Linux Readme

Virtuoso and Database Scalability

Use of Hadoop File System for Nuclear Physics Analyses in STAR

Benchmarking Hadoop & HBase on Violin

Client-aware Cloud Storage

Tomcat Tuning. Mark Thomas April 2009

LSN 10 Linux Overview

How To Test For Speed On Postgres (Postgres) On A Microsoft Powerbook On A 2.2 Computer (For Microsoft) On An 8Gb Hard Drive (For

POSIX and Object Distributed Storage Systems

Igor Seletskiy. CEO, CloudLinux

AIX NFS Client Performance Improvements for Databases on NAS

Perforce with Network Appliance Storage

The course will be run on a Linux platform, but it is suitable for all UNIX based deployments.

Violin: A Framework for Extensible Block-level Storage

Deployment Guide. How to prepare your environment for an OnApp Cloud deployment.

Parallels Cloud Server 6.0

Watch your Flows with NfSen and NFDUMP 50th RIPE Meeting May 3, 2005 Stockholm Peter Haag

IT Business Management System Requirements Guide

Table of Contents Introduction and System Requirements 9 Installing VMware Server 35

Promise Pegasus R6 Thunderbolt RAID Performance

JBoss Seam Performance and Scalability on Dell PowerEdge 1855 Blade Servers

Web Server Software Architectures

The Linux Virtual Filesystem

nanohub.org An Overview of Virtualization Techniques

Vizrt Community Expansion Performance Guide

CentOS Linux 5.2 and Apache 2.2 vs. Microsoft Windows Web Server 2008 and IIS 7.0 when Serving Static and PHP Content

Cloud Computing Performance. Benchmark Testing Report. Comparing ProfitBricks vs. Amazon EC2

On Benchmarking Popular File Systems

Summer Student Project Report

Low-cost BYO Mass Storage Project. James Cizek Unix Systems Manager Academic Computing and Networking Services

Optimization of Cluster Web Server Scheduling from Site Access Statistics

Server Monitoring. AppDynamics Pro Documentation. Version Page 1

Painless Web Proxying with Apache mod_proxy

Outline: Operating Systems

Enabling Technologies for Distributed and Cloud Computing

Transcription:

Scaling Apache 2.x > 20,000 concurrent downloads 1 / 43 http://www.stdlib.net/~colmmacc/apachecon-eu2005/

Material 2 / 43 Introduction Benchmarking Tuning Apache Tuning an Operating System Design of ftp.heanet.ie Future directions

A warning 3 / 43 1000 httpd processes per CPU is close to the limit - Sander Temme, 12:01 yesterday, this stage

4 / 43 ftp.heanet.ie

ftp.heanet.ie 5 / 43 National Mirror Server for Ireland http://ftp.heanet.ie/about/ http://ftp.heanet.ie/status/ Also used for Network/Systems development IPv6 Apache 2.0/2.1/2.2 Give back to OpenSource community And get free T-Shirts Relatively small budget (50k Euro Vs 400k Euro)

ftp.heanet.ie 6 / 43 Mirror for; Apache, Sourceforge, Debian, FreeBSD, RedHat, Fedora, Slackware, Ubuntu, NASA Worldwinds, Mandrake, SuSe, Gentoo, Linux, OpenBSD, NetBSD... and so on

The Numbers 7 / 43 > 27,000 concurrent downloads from 1 webserver, in production 984Mbit/sec, in production. 4Gbit/sec in testing. Roughly 80% of all Sourceforge downloads from April 03 to April 04 Usually 4 times busier than ftp.kernel.org 7 Free T-Shirts (RedHat and Sourceforge)

The Numbers: a day 8 / 43 10,753,084 files stored 4.53 Terabytes 3,011,067 downloads 3.4TB shipped

9 / 43 The Numbers

Resources 10 / 43 http://www.kegel.com/c10k.html http://httpd.apache.org/ http://www.csn.ul.ie/~mel/projects/vm/ Kernel sources Tuning/NFS/high-availability HOWTO s

Method 11 / 43 Research the principles behind the software involved Configure, test, benchmark, repeat Configure, test, benchmark, repeat

Benchmarking 12 / 43 Webserver benchmarking: apachebench, httperf, autobench most important benchmark and can be used for measuring any system changes. Use common files for benchmarking /pub/heanet/100.txt /pub/heanet/1000.txt /pub/heanet/10000.txt

Benchmarking 13 / 43 ab gives a good quick overview of current server performance. httperf + autobench stress-tests the webserver to determine maximum response rate, detect any errors and so on. Produces useful graphs.

14 / 43 Benchmarking

Benchmarking Filesystems 15 / 43 IOzone, postmark, bonnie++ Postmark aimed at simulating mail-server load. May be suitable for some webservers, but unlikely. IOzone is extensive and thorough bonnie++ is simple to understand and sufficient for most needs

VM and scheduler benchmarking 16 / 43 No generic tools for benchmarking schedulers and memory managers Benchmarks usually consist of compiling a kernel, benchmarking a webserver, etc To judge the effect of the VM and scheduler on I/O, we use dder.sh

Scheduler and VM 17 / 43 #!/bin/sh STARTNUM="1" ENDNUM="102400" # create a 100 MB file dd bs=1024 count=102400 if=/dev/zero of=local.tmp # Clear the record rm -f record # Find the most efficient size for size in `seq $STARTNUM $ENDNUM`; do dd bs=$size if=local.tmp of=/dev/null done 2>> record # get rid of junk grep "transf" record awk '{ print $7 }' cut -b 2- cat -n \ while read number result ; do echo -n $(( $number + $STARTNUM - 1 )) echo " " $result done > record.sane

18 / 43 Scheduler and VM

19 / 43 Scheduler and VM

20 / 43 Scheduler and VM

21 / 43 Scheduler and VM

Tuning Apache 2.x 22 / 43 Choosing an MPM Run with various different ones, measure with benchmark utilities. For our load, the prefork MPM came out on top by a margin of 20% Static Vs DSO Very small difference (0.2%) in favour of compiled-in static modules.

Tuning Apache 2.x 23 / 43 <IfModule prefork.c> StartServers 100 MinSpareServers 10 MaxSpareServers 10 ServerLimit 50000 MaxClients 50000 MaxRequestsPerChild 2000 </IfModule>

Tuning Apache 2.x 24 / 43 <Directory "/ftp/"> Options Indexes FollowSymLinks AllowOverride None Order allow,deny Allow from all IndexOptions NameWidth=* +FancyIndexing \ +SuppressHTMLPreamble +XHTML </Directory>

Tuning Apache 2.x 25 / 43 Sendfile Enabled if found, however broken on Linux with IPv6 (checksum offloading bug). Mmap Next best thing, allows Apache to treat files as contiguous memory, kernel handle's reading.

Tuning Apache 2.x 26 / 43 mod_cache Experimental in 2.0, occasional bugs in 2.1, not for everyone, but very useful nevertheless. Not just for proxies, allows webserver to cache files as they are sent for the first time. Thus many reads from a slow filesystem can be avoided

Tuning Apache 2.x 27 / 43 mod_mem_cache can use memory to cache file content, however on Linux the VM caches aggressively anyway mod_disk_cache Can use filesystem directory as cache. By using 4x 36Gb 15K RPM SCSI disks in a RAID0 configuration we can speed up read() speed very much.

Tuning Apache 2.x 28 / 43 <IfModule mod_cache.c> <IfModule mod_disk_cache.c> CacheRoot /usr/local/apache2/cache/ CacheEnable disk / CacheDirLevels 5 CacheDirLength 3 </IfModule> </IfModule>

Tuning Apache 2.x 29 / 43 Cache cleaning doesn't work in 2.0. Brutal combination of find, xargs and rm is one option. Use htcacheclean from 2.1 htcacheclean runs periodically and prunes down to a target size. Important to ensure there is grow room

30 / 43 Tuning Apache 2.x

htcacheclean 31 / 43 Deletes files somewhat arbitrarily noatime is a valueable mount option Hack: Second filesystem (ramfs) with atime mod_disk_cache hack to create 0-byte files there also find xargs ls -u sort -rn head rm

Tuning the Operating System 32 / 43 Choose a kernel: 2.6 Vs 2.6-mm? Vs 2.4 2.6 kernel is MUCH better, allows > 20,000 processess in production 2.4 limits at about 11,000 2.6-mm was needed for a while, but most patches in now. 2.6.11 found be most stable yet.

Tuning the Operating System 33 / 43 Tuning a filesystem Always mount with noatime, can double read speed. XFS: use logbufs=8, ihashsize=65567 mount options Ext3: set blocksize to 4096, use dir_index build option

Tuning the Operating System 34 / 43 Tuning NFS Use jumboframes if possible, increase rsize and wsize accordingly. Increase the number of NFS threads available on the server side. Use nolock mount option on the clients if they will not be doing any writing.

Tuning the Operating System 35 / 43 Tuning the VM Linux VM uses similar approach to mod_disk_cache for freeing space. Allocate memory to processes genourously and prune back to target level periodically The VM also caches file data aggresively. If a lot of files are being served quickly, easy to fill memory and generate OOM

Tuning the Operating System 36 / 43 Tuning the VM vm/min_free_kbytes = 204800 vm/lower_zone_protection = 1024 vm/page-cluster = 20 vm/swappiness = 200 vm/vm_vfs_scan_ratio = 2 Other sysctls: fs/file-max=5049800

Tuning the Operating System 37 / 43 Tuning the Networking stack net/ipv4/tcp_rfc1337=1 net/ipv4/tcp_syncookies=1 net/ipv4/tcp_keepalive_time = 300 net/ipv4/tcp_max_orphans=1000 sys/net/core/rmem_default=262144 sys/net/core/rmem_max=262144

System Design 38 / 43 RAM intensive, buy lots Bounce buffering and PAE means CPU hit, buy lots Fast (15k RPM) system disks for intermediary caching

System Design 39 / 43 Machine Model CPU Memory Storage Network Avoca Alphaserver 200Mhz 256Mb 20Gb 10Mbit Athene Canyonero Alphaserver DS20E Dell 2650 667Mhz 2x 1.8Ghz 1.5Gb 4Gb 30Gb 2.2Tb 100Mbit 1Gb Cassandra Dell 2650 2x 2.4Ghz 12Gb 5.6Tb 2Gb Coroebus Dell 7250 2x 1.5Ghz 32Gb 14.2Tb 4Gb

40 / 43 System Design

Future directions 41 / 43 Multicast services Jumboframes mod_ftp(d) and reverse proxies Itanium platform mod_bittorrent?

Other talks 42 / 43 TH14: What's new in HTTPD 2.2 TH17: Caching, Tips for Improving Performance FR09: Clustering and Load-balancing using mod_proxy

Questions? 43 / 43?