From a crazy hackathon idea to an empty queue
How can you make sure images don't slow down your page load performance? All it took us was some unconventional thinking: seti@bol.com.
Images
We use a lot of images at bol.com, as they're very important in our visitors' decision-making and buying process. Unfortunately, these images slow down our page performance. In fact, they account for most of the downloaded bytes on a page.
It goes without saying that fast-loading pages have a positive effect on user experience and, in the end, conversion. The fewer bytes the browser has to download, the less competition for the client's bandwidth and the faster the browser can download and render content on the screen. So if we can find a way to optimize images without loss of quality, we can easily speed up our page performance.
But first things first: let me briefly explain our (simplified) image flow, from suppliers to the webshop. There are a couple of services involved, which I'll explain in detail below:
The image flow architecture
Our regular asset flow
We can distinguish:
- An orchestration service, which processes all incoming images in their original sizes/dpi/colorspaces/formats/etc. This includes downloading the images, sanitizing them, gathering metadata, matching the image against our product database, scoring the image, etc.
- A render service, whose sole purpose is to create a web-optimized rendition based on the original. These are the images that we serve to our customers.
- The webshop itself.
Another successful hackathon idea
It all started at one of bol.com’s hackathons. We had the idea to incorporate Google's Guetzli algorithm in the image flow. In short, this algorithm improves the online user experience by producing smaller image file sizes without sacrificing quality. Unfortunately, it takes on average 50 seconds to optimize one image (and we were planning to render hundreds of millions of images). Since image rendering is a synchronous call in our landscape, we came up with the following solution:
Asset flow with Guetzli
The synchronous flow stays the same, so our orchestration service can still ask the renderer to create a rendition. But after creating the rendition (and saving it to the webshop), the renderer adds it to an internal queue. The renderer itself then processes each rendition on this queue asynchronously. The result is a file that has been processed by Guetzli and is therefore much smaller than the previous (web-optimized) version. So, to summarize:
- The orchestration service can still make synchronous calls to the renderer, resulting in on-the-fly renditions on the webshop.
- But, in parallel, the renderer fills and processes its own queue which is used to apply the Guetzli algorithm.
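The two-track flow summarized above can be sketched roughly as follows. This is a minimal illustration and not our actual renderer: the class and function names are hypothetical, the webshop is a plain dictionary, and both image steps are stand-ins that only tag the bytes.

```python
import queue
import threading


def render_web_optimized(original: bytes) -> bytes:
    """Stand-in for the fast, synchronous rendition step (hypothetical)."""
    return b"web:" + original


def guetzli_optimize(rendition: bytes) -> bytes:
    """Stand-in for the slow Guetzli pass (hypothetical; it only tags the bytes)."""
    return b"guetzli:" + rendition


class Renderer:
    def __init__(self) -> None:
        self.webshop = {}             # image id -> bytes currently served
        self.backlog = queue.Queue()  # internal queue of renditions to re-optimize
        # One background worker drains the queue; a real setup would run several.
        threading.Thread(target=self._drain, daemon=True).start()

    def render(self, image_id: str, original: bytes) -> bytes:
        """Synchronous call, as used by the orchestration service."""
        rendition = render_web_optimized(original)
        self.webshop[image_id] = rendition       # served right away
        self.backlog.put((image_id, rendition))  # optimized later
        return rendition

    def _drain(self) -> None:
        while True:
            image_id, rendition = self.backlog.get()
            # Replace the served rendition with the Guetzli-processed one.
            self.webshop[image_id] = guetzli_optimize(rendition)
            self.backlog.task_done()


renderer = Renderer()
renderer.render("product-1", b"raw-bytes")  # returns immediately
renderer.backlog.join()                     # wait until the backlog is empty
```

The caller never waits for the slow step: the webshop first gets the regular rendition, which is swapped for the smaller one whenever the queue gets around to it.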
Nice, but soon we ended up with this:
Guetzli queue
It turned out that images were coming in faster than the workers processing the Guetzli queue could handle them, as it takes up to one minute to "Guetzli" one file.
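A quick back-of-the-envelope calculation shows why such a queue keeps growing. The sketch below is only an illustration: the arrival rate of 10,000 images per hour is an assumed number, not a measurement from our systems, and each worker is assumed to handle one image at a time.

```python
import math


def workers_needed(images_per_hour: float, seconds_per_image: float) -> int:
    """Minimum number of parallel workers so the queue stops growing."""
    return math.ceil(images_per_hour * seconds_per_image / 3600)


def hours_to_drain(backlog: int, workers: int,
                   images_per_hour: float, seconds_per_image: float) -> float:
    """Hours until the backlog is empty; infinite if the workers cannot keep up."""
    capacity = workers * 3600 / seconds_per_image  # images processed per hour
    surplus = capacity - images_per_hour           # what is left for the backlog
    return backlog / surplus if surplus > 0 else math.inf


# Assumed numbers: 10,000 new images per hour, 50 seconds per image.
print(workers_needed(10_000, 50))  # 139 workers just to break even
```

With fewer workers than that, `hours_to_drain` returns infinity: the backlog never shrinks, no matter how long you wait.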
seti@bol
Then we had an idea: the work on the Guetzli queue isn't very complex and all our colleagues have a laptop with lots of idle CPU cycles ... So why don't we make use of that? Eventually, we ended up doing this:
- We added one endpoint on the render service: /work.
- This "work"-endpoint can be used to fetch a "work-package" and to put a "work-done-package".
- We also created a small command-line application which was pushed to our colleagues' laptops (opt-in).
This small app (we called it Miracle) does exactly the same as the renderer: it runs on a laptop, receives a web-optimized version from the renderer, locally applies the Guetzli algorithm and returns a "work-done-package".
Asset flow with work endpoint
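The loop such a laptop worker runs could look roughly like the sketch below. It is not the actual Miracle code: the shape of a work-package, the `fetch` and `submit` callables (which would wrap the HTTP calls to /work) and the temporary file names are all assumptions made for illustration. The only external piece is the open-source `guetzli` command-line tool, invoked as `guetzli input output` and assumed to be on the PATH.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple

# Hypothetical work-package shape: (image id, web-optimized image bytes).
WorkPackage = Tuple[str, bytes]


def guetzli_cli(image: bytes) -> bytes:
    """Run the guetzli binary on one image; assumes `guetzli` is on the PATH."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "in.jpg", Path(tmp) / "out.jpg"
        src.write_bytes(image)
        subprocess.run(["guetzli", str(src), str(dst)], check=True)
        return dst.read_bytes()


def run_worker(fetch: Callable[[], Optional[WorkPackage]],
               submit: Callable[[str, bytes], None],
               optimize: Callable[[bytes], bytes] = guetzli_cli) -> int:
    """Fetch work-packages until none are left; return how many were processed."""
    done = 0
    while True:
        package = fetch()                  # would be a fetch from /work
        if package is None:                # nothing left to do
            return done
        image_id, image = package
        submit(image_id, optimize(image))  # would be a put of the work-done-package
        done += 1


# In-memory stand-ins instead of real HTTP calls and the real Guetzli binary:
pending = [("a", b"jpeg-1"), ("b", b"jpeg-2")]
results = {}
processed = run_worker(
    fetch=lambda: pending.pop() if pending else None,
    submit=results.__setitem__,
    optimize=bytes.upper,                  # fake "optimizer" for the demo
)
```

Keeping the transport and the optimizer injectable means the same loop can be tried out without a network connection or an installed Guetzli binary.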
Gamification
Since we really wanted this experiment to work (and we really needed lots of CPU cycles), we added the extra element of gamification. The three people who processed the most images in a week could win some nice prizes. Top 10 dashboards were created, an email was sent out to all colleagues and then: BAM! People really liked it, got involved, offered their help, sent us pull requests for better dashboards, became creative in finding spare hardware, etc. Within two days, the whole queue was processed! We never expected the queue to be emptied this fast!
An empty resize queue
What's left?
An empty queue, smaller images and an extremely nice project to remember ...
Thanks for reading. If you want more information about this project, please don't hesitate to contact me.