WHITE PAPER

WORK PROCESS AND TECHNOLOGIES FOR MAGENTO PERFORMANCE (BASED ON FLIGHT CLUB)

June 2014

Vaimo info@vaimo.com www.vaimo.com

Project Background

Flight Club is the world's leading sneaker marketplace, specialising in storing, shipping, and selling the most coveted items in athletic footwear to enthusiasts around the world. Its two physical retail stores in New York and Los Angeles not only ship internationally, but also serve as destination shopping centres for travellers seeking rare and limited-edition sneakers. Flight Club launched its new responsive online store with the help of Vaimo on the 20th of November 2013.

Their business idea is a bit unusual from Vaimo's perspective but, from what we have learnt, quite widely used in the US. In short, collectors buy and sell their unique inventory, and Flight Club provides both a few physical marketplace stores (New York, Los Angeles) and the online e-commerce site for selling the footwear.

Early on in the project planning phase, Flight Club communicated their anticipated traffic volumes for the website. As we looked at the numbers, we realized we were about to implement a site with around 4x more page views than any site we had built in the past.

Business model

The sales flow is highly individual: Flight Club sells unique pairs of shoes. Every pair sold carries some unique data, such as the seller's ID and personally decided price level, and the condition of the pair (brand new, no box, missing laces, bruised box, etc.). Some of the shoes are autographed, and others have unique serial numbers (a limited edition, sold only at one quarter-final game, etc.).

On top of this, exactly the same pair of shoes is available for sale both in one of the retail stores and online. So we had to create a real-time reservation system that guarantees that a given pair can be reserved and bought only once. If an e-commerce customer and an offline store visitor try to buy the same pair at the same time, only one of them should succeed.

The fact that every piece of inventory is unique makes the data flow and caching more challenging. Every sale basically invalidates and changes some pages on the site. Compare that with selling something like books, where you might have 78 in stock before a sale and 77 after it, at exactly the same price. Nothing really changes on the product page then, so caching becomes more straightforward.

Planning for traffic

The traffic volume we designed the website for was in the range of 500 000 page views per hour. Flight Club has not quite reached this yet, but in general we provide capacity with some margin. We estimate that they have peaked at around 40-50% of this capacity so far.
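To put these hourly figures in per-second terms, the short calculation below converts the design target and the estimated peak. It is only a back-of-the-envelope sketch: it assumes page views are spread evenly across the hour, which real traffic is not, and the function and variable names are ours, chosen for illustration.

```python
# Back-of-the-envelope conversion of the traffic figures quoted in the text.
# Assumption (ours): page views are spread evenly across the hour.

SECONDS_PER_HOUR = 3600


def per_second(page_views_per_hour: float) -> float:
    """Convert an hourly page-view volume to an average per-second rate."""
    return page_views_per_hour / SECONDS_PER_HOUR


DESIGN_TARGET_PER_HOUR = 500_000      # design target stated in the text
OBSERVED_PEAK_SHARE = (0.40, 0.50)    # estimated peak as a share of the target

design_rate = per_second(DESIGN_TARGET_PER_HOUR)
observed_rates = tuple(
    per_second(DESIGN_TARGET_PER_HOUR * share) for share in OBSERVED_PEAK_SHARE
)

print(f"design target: {design_rate:.0f} page views/s")
print(f"estimated peak so far: {observed_rates[0]:.0f}-{observed_rates[1]:.0f} page views/s")
```

Under that even-spread assumption, the design target corresponds to roughly 139 page views per second, and the estimated peak so far to roughly 56-69 page views per second.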
We realized the challenge in combining quite dynamic content with the highest traffic volume we had encountered in any project before Flight Club. Everyone agreed pretty early on in the "Discovery phase" that work on performance and caching had to be its own ongoing process throughout the project, and we reserved part of the project budget for that.

Magento data model

The Magento data model is flexible and generic. One can add any type of custom attribute to predefined models, and these attributes can pull data from any source. This flexibility is one reason why Magento can solve so many different e-commerce scenarios. But this genericness and extensibility also comes at a price in terms of code complexity and performance. As an example, a product load can take in the range of 50-100 ms of server time. This depends a lot on which modules are processing the product load and on what logic is connected to the attributes.

A rule of thumb we use is that the turnover of an e-commerce site decreases by 10% for every additional second of page load time. With slow pages, search engines will also rate the site lower. A product load of up to 100 ms is a tenth of a second, so, realistically, each product load we add to a common page decreases e-commerce sales by up to 1%.

Pull in the team

We took the approach of involving the whole project team in the performance work quite early on in the project. One initiative that paid off was a two-hour workshop on profiling code and on how to work with performance improvements. We used the built-in PHP/Xdebug profiler for that, together with a standalone visualization and exploration tool (webgrind). With this, it is quite easy to locate bottlenecks in the code and to understand inefficiencies. Also, as the profiler traces are data files stored at a particular point in time, it is simple to do before/after comparisons: we simply store the trace files with descriptive names and a date tag.

The team picked up on this quite well and got enthusiastic about profiling, improving and removing code whose slowness we had not understood until then. During this initial week of performance focus, the site improved by a factor of two to three in terms of basic non-cached performance (which we define as normal page generation time when there is a miss in the full page cache [FPC]).

Working with this inside the team (rather than optimizing the site at the end, or applying changes to the code base in a parallel workflow) had many advantages. We got quick feedback on the various special (customer-specific) ways data was generated and rendered, and on data dependencies we could easily have missed had we worked on the site more from the outside.

Stack of performance technologies

Our approach to performance work can be seen as a "stack" where we use four techniques on top of each other.

The starting point is algorithmic and implementation efficiency. Much of this is material one can pick up in good books on programming, but it is easy to forget about.
Examples include:

- doing an expensive computation just once per request (rather than every time you need the result);
- pulling expensive computations out of loops;
- simplifying a sequence of object loads to perhaps one DB query;
- using indexes/keys skilfully to avoid looping over data collections to find "your thing".

The Magento framework also sometimes invites you to write expensive code where it is easy to lose track of the complexity and the cost it brings along. Model loading and database roundtrips are often implicit and hidden; they happen somewhere down the stack. With functions and objects like this, one often does not really know the cost until finally profiling it. By understanding well the data we need, one can often find a simpler and more direct way to extract it, cutting down on excessive processing, database roundtrips and preparation of data that in reality will never be used.

The second level in this stack is "data caching": instead of computing the same expensive values for each visitor, we compute them for the first visitor and then store them in the Magento cache. This level is quite important, as this type of caching helps all pages on the site render faster; it is not just storing the HTML result of a unique page request. In a language like PHP, where each request starts without any in-memory state from previous requests, it is a vital part. Examples include caching attribute and option information in a ready-to-use format, or product URLs, which sound simple to create but end up requiring a bit of model loading in standard Magento.

The third and fourth levels are block caching and then full page caching (FPC), which we use in combination on top of this. This stores ready-to-reuse HTML that matches a very exact set of parameters (product ID, customer group ID, ...). This type of caching is very fast, since we essentially get the final HTML data with a simple cache request. The downside is that the more parameters it depends on, and the larger our set of data (products, categories, attributes, ...), the lower the cache hit rate we get. Also, the dynamic aspect of the site (order placement + integrations) means that this cache build-up is disturbed and invalidated much of the time, because the underlying data is continuously changing.

Fast hole punching

One part we have started to work with more during recent projects (including Flight Club) is "fast hole punching". Hole punching is the means by which we are able to serve pages from FPC that contain a number of dynamic blocks (header cart, my wish list, ...). These holes can be filled in either the "fast way" or the "slow way". In many blogs and examples of hole punching on the Internet, holes are filled the slow way via "applyInApp". One should look out here, since that means that the whole application configuration/routing machinery in Magento is launched. This in itself increases page generation time by a factor of around 10. Then the missing blocks are rendered with the ordinary Magento framework and models. Fast holes, on the other hand, are filled in without initializing Magento, by a separate block cache in the context of the FPC.
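The difference between the two ways of filling a hole can be sketched in a few lines. The example below is a toy model in Python, not Magento code: the class, the hole marker syntax, the `cart_id` cookie and the bootstrap counter are all hypothetical names introduced for illustration. It shows a cached page whose hole is filled from a separate block cache, falling back to the expensive application bootstrap only on a block cache miss.

```python
import re


class HolePunchedPages:
    """Toy full page cache with holes filled from a separate block cache.

    Illustrative stand-in only; none of these names are Magento APIs.
    """

    HOLE = re.compile(r"<!--HOLE:(\w+)-->")

    def __init__(self):
        self.page_cache = {}    # FPC: path -> HTML containing hole markers
        self.block_cache = {}   # block cache: hole key -> rendered block HTML
        self.app_boots = 0      # how often the expensive bootstrap ran

    @staticmethod
    def hole_key(block_name, request):
        """Build the block cache key from the raw request alone.

        Assumption for illustration: a cart identifier cookie captures
        everything that is unique about the block for this visitor.
        """
        return f"{block_name}:{request['cookies'].get('cart_id', 'anonymous')}"

    def render_block_in_app(self, block_name, request):
        """Slow way: boot the whole application, then render the block."""
        self.app_boots += 1  # stands in for config loading and routing
        cart_id = request["cookies"].get("cart_id", "anonymous")
        return f"<div class='{block_name}'>cart {cart_id}</div>"

    def serve(self, request):
        """Serve a cached page, filling each hole the fast way when possible."""
        def fill(match):
            name = match.group(1)
            key = self.hole_key(name, request)
            if key not in self.block_cache:  # miss: fall back to the slow way
                self.block_cache[key] = self.render_block_in_app(name, request)
            return self.block_cache[key]     # hit: no application bootstrap

        return self.HOLE.sub(fill, self.page_cache[request["path"]])


site = HolePunchedPages()
site.page_cache["/sneakers"] = "<html><!--HOLE:header_cart--><h1>Sneakers</h1></html>"
request = {"path": "/sneakers", "cookies": {"cart_id": "c42"}}

first = site.serve(request)    # block cache miss: application booted once
second = site.serve(request)   # block cache hit: served without a bootstrap
print(first == second, site.app_boots)
```

Both requests return the same HTML, but only the first one pays for the bootstrap. A real implementation would also have to invalidate the block cache entry whenever the underlying data (here, the cart) changes, which this sketch leaves out.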
The challenge here is to know your cache keys (what is unique about the dynamic block in this request) and to be able to generate these keys without having the initialized Magento framework available.

The results of Flight Club website performance

- After 4 months in production, the system has held up against visitor load very well. We have had only one performance-related issue, which came from a sub-optimal indexing query that is part of checkout order placement. This DB query was part of standard Magento, and we patched it; since then we have not had any other performance issues.
- The live system is currently set up with 5 logical servers residing on 3 hardware nodes. On top of that we have an external CDN service. There are 3 load-balanced web nodes, each serving roughly a third of the traffic. Each physical server has 32 cores (with HT).
- The average page generation time (across FPC and non-FPC hits) is around 0.4 sec. The category view, with its filter navigation and fairly long product list, is difficult to cache well: people apply different filters, and products change continuously. We see a potential to lower this overall page generation time by another 30-50%.
- As we monitor database activity during peak time, we see very few queries that last longer than 20-100 ms. Many lookups that would normally be fairly complex database queries are now served by faster cache requests.
- In terms of server load, we see that overall CPU utilization stays around 10-15% even at peak load. The server load value also stays around 2.0 at peak, which is quite low for a 32-core system. These facts indicate that the system has significantly higher capacity than what we had provisioned for.
- During the load testing phase, we developed scripts that simulate real visitor behaviour based on statistics from Flight Club (conversion rates, page views per customer). The load tests generate real checkout activity and orders. Running them at the system's peak rate (just before performance starts to drop because of overload) indicates that our system can handle around 400 page requests per second, including the checkout/order activity. (Activity in admin and integrations was not simulated.)

Conclusion

By keeping a continuous performance focus in the team throughout the project, and by working with multiple performance technologies, we were able to produce a site that holds up well, even under maximum visitor, order and integration pressure. We reused optimization technology and code from a number of previous projects, and added a few new techniques to our optimization knowledge during the Flight Club project.

When it comes to generating dynamic visitor pages, we cannot really say we know of any limit, as our experience indicates that Magento scales really well when adding web nodes. The database is the place where it is not easy to grow by adding nodes. So far we have not really touched that limit, but if we do, we feel confident that it is possible to analyze that data flow and work with those limits as well.