Accelerating Zope applications with Squid and ESI
Simon Eisenmann
7 June 2004, Göteborg, EuroPython 2004
2004 struktur AG

Squid in front of Zope - Why?
  Massive speedup.
  Only sanitized HTTP requests are proxied to ZServer.
  Fast and powerful access control.
  HTTP load balancing.
  Zero downtime on software upgrades.
  Serve multiple sites with one IP (virtual hosting).
  Extremely flexible redirection (Redirector).

Why Squid and not Apache?
  Pro Squid:
    Squid is faster than Apache.
    Custom redirector more flexible than mod_rewrite rules.
    All sorts of different caching strategies and options.
    Squid3 supports Edge Side Includes (ESI).
  Contra Squid:
    Not a web server (no PHP, CGI, static pages, ...).
    Complex setup.
    More hardware resources required.

Squid <> Zope communication

ZEO as Squid parent cache
  Squid can use ZEO clients as parent caches.
  Only requires the Zope HTTP and ICP ports to be configured.
  ZEO clients are configured as cache peers in squid.conf:

    cache_peer 127.0.0.1 parent 8081 3131 no-digest no-netdb-exchange round-robin

ZEO client configuration
  Enabling the HTTP and ICP ports in zope.conf:

    <http-server>
      address 8081
    </http-server>
    <icp-server>
      # valid key is "address"
      address 127.0.0.1:3131
    </icp-server>

What's the ICP port thing?
  ICP is the Internet Cache Protocol.
  It is used for querying neighbor caches about objects.
  Squid uses ICP to detect dead or unreliable peers.
  Squid can use ICP to choose the fastest peer.
  ICP support is included as standard in Zope 2.6 and later.
  http://www.zope.org/members/htrd/icp/intro
  http://www.linofee.org/~jel/proxy/squid/icp-id.html

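To make the query step concrete, here is a minimal sketch of what an ICP query looks like on the wire. It assumes the ICP version 2 layout from RFC 2186 (a 20-byte header followed by a 4-byte requester address and a NUL-terminated URL). The function names and the example URL are illustrative and not taken from Squid or Zope.

```python
import struct

ICP_OP_QUERY = 1   # opcode for a query (RFC 2186)
ICP_VERSION = 2

# opcode, version, message length, request number,
# options, option data, sender host address (network byte order)
ICP_HEADER = struct.Struct("!BBHIII4s")


def build_icp_query(url, request_number=1):
    """Build an ICP_OP_QUERY packet asking a peer about `url`."""
    # Query payload: 4-byte requester address (left zeroed here) + URL + NUL.
    payload = b"\x00" * 4 + url.encode("ascii") + b"\x00"
    length = ICP_HEADER.size + len(payload)
    header = ICP_HEADER.pack(
        ICP_OP_QUERY, ICP_VERSION, length, request_number, 0, 0, b"\x00" * 4
    )
    return header + payload


def parse_icp_header(packet):
    """Return (opcode, version, length, request_number) from a packet."""
    opcode, version, length, request_number, _opts, _optdata, _sender = (
        ICP_HEADER.unpack_from(packet)
    )
    return opcode, version, length, request_number
```

Such a packet would be sent over UDP to the peer's ICP port (3131 in the configuration shown earlier). The reply uses the same header, with an opcode such as hit or miss.
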
Squid2 or Squid3
  Squid2:
    + stable
    + supports all features required for accelerating
    + binaries for almost any platform (including Windows)
    - requires patching to log in Apache's combined log format
  Squid3-PRE3:
    - unstable
    - contains bugs
    + supports ESI
    + supports Apache's combined log format
    + simpler virtual hosting configuration
    + supports multithreaded redirectors

squid.conf - the Squid configuration
  squid.conf is a huge beast (3600+ lines).
  Thankfully, most of these lines are comments.

squid.conf changes required

    http_port 80 vhost
    #hierarchy_stoplist cgi-bin ?
    logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h"
    access_log /usr/local/squid3/logs/access.log combined
    redirect_program /usr/local/squid3/iredir/iredirector.py
    redirect_children 20
    redirect_rewrites_host_header off
    acl in_backendpool dstdomain backendpool
    cache_peer 127.0.0.1 parent 8081 3131 no-digest no-netdb-exchange round-robin
    cache_peer_access 127.0.0.1 allow in_backendpool
    cache_peer_access 127.0.0.1 deny all
    http_access allow all
    never_direct allow all
    ie_refresh on

The Redirector
  The redirector is an external program that Squid starts as a helper process. It reads one request per line on standard input (the URL comes first) and writes the rewritten URL to standard output.
  Redirectors can be used to do redirection based on numerous conditions.
  Usually, redirectors make use of regular expressions.
  There are redirectors written in Perl, Bash, Python, C, ...

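The redirector loop can be sketched in a few lines of Python. This is a minimal illustration, not the iredir code: it assumes the Squid 2 style helper interface (input lines of the form "URL client/fqdn ident method", an empty output line meaning "leave the URL unchanged"), and the prefix rules passed in are hypothetical.

```python
import sys


def rewrite(line, rules):
    """Return the rewritten URL for one input line, or '' for no change."""
    fields = line.split()
    if not fields:
        return ""
    url = fields[0]  # the URL is the first whitespace-separated field
    for prefix, replacement in rules:
        if url.startswith(prefix):
            return replacement + url[len(prefix):]
    return ""


def main(rules, stdin=sys.stdin, stdout=sys.stdout):
    """Answer every request line with exactly one output line."""
    for line in stdin:
        stdout.write(rewrite(line, rules) + "\n")
        # Flush after each answer: Squid waits for the reply before
        # it continues with the request.
        stdout.flush()
```

A script used as redirect_program would end with a call such as main([("http://example.com/", "http://backendpool/")]), where both URLs are placeholders.
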
iredir - a Python redirector
  Configuration is a Python method.
  Flexible and stable.

    # define sitemap matching regex mapping
    sitemap = {
        (10, r'[\S]*plone.org'): 'backendpool/virtualhostbase/http/$netloc$:80/plone.org/virtualhostroot',
        (20, r'[\S]*mydomain.com'): '302:http://plone.org',
        (10, r'localhost'): 'backendpool/virtualhostbase/http/$netloc$:80/virtualhostroot',
    }

  http://longsleep.org/projects/iredir

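How such a sitemap might be evaluated can be sketched as follows. This is not iredir's actual implementation. It assumes that the integer in each key is a priority (lower values tried first), that the pattern is matched against the request's host name, and that $netloc$ is replaced by that host name. The "302:" prefix is Squid's own convention for making the redirector answer with an HTTP redirect. The function name resolve is hypothetical.

```python
import re
from urllib.parse import urlsplit

SITEMAP = {
    (10, r'[\S]*plone.org'): 'backendpool/virtualhostbase/http/$netloc$:80/plone.org/virtualhostroot',
    (20, r'[\S]*mydomain.com'): '302:http://plone.org',
    (10, r'localhost'): 'backendpool/virtualhostbase/http/$netloc$:80/virtualhostroot',
}


def resolve(url, sitemap=SITEMAP):
    """Map an incoming URL to a backend URL, a redirect, or '' (no match)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: entries are tried in ascending priority order.
    for (_priority, pattern), target in sorted(sitemap.items()):
        if not re.fullmatch(pattern, host):
            continue
        if target.startswith(("301:", "302:")):
            return target  # Squid turns this into an HTTP redirect
        rewritten = "http://" + target.replace("$netloc$", host) + parts.path
        if parts.query:
            rewritten += "?" + parts.query
        return rewritten
    return ""
```

With this sketch, a request for http://www.plone.org/news is rewritten to a URL on the backendpool peer, while any host matching mydomain.com is redirected to http://plone.org.
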
Edge Side Includes (ESI)
  Simple markup language.
  Meant for application servers and content management.
  Open standard.
  Developed by Akamai, ATG, BEA Systems, Circadence, Digital Island, IBM, Interwoven, Oracle, and Vignette.
  Makes it possible to cache highly dynamic pages.
  Zope Corp. sponsored ESI development in Squid3.

ESI markup tags

  Tag               Purpose
  <esi:include>     Include a separately cacheable fragment.
  <esi:choose>      Conditional execution: choose among several different alternatives based on, for example, cookie value or user agent.
  <esi:try>         Specify alternative processing when a request fails (e.g., the origin server is not accessible).
  <esi:vars>        Permit variable substitution (for environment variables).
  <esi:remove>      Specify alternative content to be stripped by ESI but displayed by the browser if ESI processing is not done.
  <!--esi ... -->   Specify content to be processed by ESI but hidden from the browser.
  <esi:inline>      Include a separately cacheable fragment whose body is included in the template.

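To illustrate what a surrogate does with three of these tags, here is a toy processor in Python. It is a regex-based sketch for explanation only, not how Squid parses ESI. It handles just the self-closing <esi:include src="..."/> form, <esi:remove> and <!--esi ... -->. The fetch callback that returns a fragment for a URL is an assumption of the example.

```python
import re

ESI_COMMENT = re.compile(r'<!--esi(.*?)-->', re.DOTALL)
ESI_REMOVE = re.compile(r'<esi:remove>.*?</esi:remove>', re.DOTALL)
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]*)"\s*/>')


def process_esi(template, fetch):
    """Assemble a page from `template`; `fetch(src)` returns a fragment."""
    # 1. Unwrap <!--esi ... -->: its content is meant for the ESI processor.
    text = ESI_COMMENT.sub(lambda m: m.group(1), template)
    # 2. Drop <esi:remove> blocks: they are fallbacks for browsers that
    #    receive the page without ESI processing.
    text = ESI_REMOVE.sub("", text)
    # 3. Replace each include with the separately cached fragment.
    return ESI_INCLUDE.sub(lambda m: fetch(m.group(1)), text)
```

For example, process_esi(page, lambda src: cache[src]) assembles a page from fragments held in a dictionary named cache, which stands in for the surrogate's object store.
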
ESI support in Squid3
  ESI in Squid works well when:
    The built-in custom parser is used.
    Only ASCII websites are served.
  Squid also supports Expat as the ESI parser, which handles encodings other than ASCII.
  With Expat, however, ESI does not work very well in the latest Squid3 release (PRE3).

A simple Zope example

The ESI header (set_esi_header)
  To enable ESI processing, the Surrogate-Control HTTP header has to be set on each response.

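The body of set_esi_header is not shown on the slide, so the following is only a sketch of what such a script could do. It assumes the header value content="ESI/1.0" defined by the ESI and Edge Architecture specifications, and a response object with a setHeader(name, value) method like Zope's HTTP response. FakeResponse is a stand-in used to keep the example self-contained.

```python
def set_esi_header(response):
    """Mark a response as containing ESI markup for the surrogate."""
    # The surrogate (Squid3) only runs its ESI processor on responses
    # that carry this header.
    response.setHeader("Surrogate-Control", 'content="ESI/1.0"')


class FakeResponse:
    """Minimal stand-in for a Zope response object (illustration only)."""

    def __init__(self):
        self.headers = {}

    def setHeader(self, name, value):
        self.headers[name] = value
```

In a Zope Script (Python), the response would typically be context.REQUEST.RESPONSE instead of the stand-in class.
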
Plone2 skin using ESI
  METAL macros need to become ESI markup:

    <!--<a metal:use-macro="here/global_logo/macros/portal_logo">
      The portal logo, linked to the portal root
    </a>-->
    <esi:global_logo metal:use-macro="here/esi_slot_global_logo/macros/main">
      The portal logo, linked to the portal root
    </esi:global_logo>

  I introduce the esi_slot and esi_view id namespaces.
  esi_slot is used to separate the real view from the ESI markup.
  esi_view contains the real HTML code, which is later inserted in place of the markup.
  esi_slot code may be replaced by some general tool which auto-generates the required markup.

ESI inserts the Plone logo

Making a Plone2 ESI skin
  Simulate main_template/macros/master and add the ESI scripts (get_esi_template and set_esi_header).
  Add the ESI header to the skin (set_esi_header).
  default_template has to use ESI markup for each part that is relevant for caching (e.g. the news slot).
  Write esi_slot markup for each portlet or site part.
  Write an esi_view for the portlet code or site part (requires a master macro).
  Test the esi_view_something template by accessing it directly through the web (TTW).
  Each esi_view_something template can define its own caching headers.

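The last step, separate caching headers per fragment, could look roughly like this. The template ids, lifetimes and function name are hypothetical and chosen only for illustration. The response object is again assumed to offer a Zope-style setHeader(name, value) method.

```python
# Hypothetical lifetimes in seconds for individual esi_view templates.
FRAGMENT_MAX_AGE = {
    "esi_view_global_logo": 86400,  # the logo rarely changes
    "esi_view_news": 60,            # the news slot changes often
}


def set_fragment_cache_headers(response, template_id, default_max_age=0):
    """Set Cache-Control for one fragment and return the value used."""
    max_age = FRAGMENT_MAX_AGE.get(template_id, default_max_age)
    if max_age > 0:
        value = "public, max-age=%d" % max_age
    else:
        value = "no-cache"  # unknown fragments are not cached
    response.setHeader("Cache-Control", value)
    return value


class FakeResponse:
    """Minimal stand-in for a Zope response object (illustration only)."""

    def __init__(self):
        self.headers = {}

    def setHeader(self, name, value):
        self.headers[name] = value
```

Because every fragment is fetched by Squid as its own object, a long-lived logo and a short-lived news slot can sit on the same page without forcing the whole page to the shortest lifetime.
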
Let's try it
  OK, let's see Plone2 with ESI in action.

Performance / Benefits
  Uncached pages are slower with ESI than without it.
  With the right caching headers, Squid can serve 500 req/s on a standard machine.
  On high-traffic sites, ESI is the only way to cache highly dynamic pages.
  Allows different caching headers for different site parts.

Future
  Squid3 will become stable.
  The ESI parser will support UTF-8.
  ESI integration into CMFSquidTool.

Questions?
  I know caching is a very complex thing... so are there any questions?

Thanks!
  More information can be found here:
    http://www.squid-cache.org
    http://www.linofee.org/~jel/proxy/squid/icp-id.html
    http://longsleep.org/projects/iredir
    http://www.esi.org
    http://www.edge-delivery.org/spec.html
    http://www.esi.org/press103002.html
    http://squid.sourceforge.net/old_projects.html#esi
    http://longsleep.org/howto/
  This talk will be available at http://www.struktur.de