We use a lot of images at bol.com, as they’re very important in our visitors’ decision-making and buying process. Unfortunately, these images slow down our page performance. In fact, they account for most of the downloaded bytes on a page.
But first things first: let me briefly explain our (simplified) image flow, from suppliers to the webshop. There are a couple of services involved, which I’ll explain in detail below:
The image flow architecture
- An orchestration service, which processes all incoming images in their original sizes/dpi/colorspaces/formats/etc. This includes downloading the images, sanitizing them, gathering metadata, matching the image against our product database, scoring the image, etc.
- A render service, whose sole purpose is to create a web-optimised rendition based on the original. These are the images that we serve to our customers.
- The webshop itself.
Another successful hackathon idea
- The orchestration service can still make synchronous calls to the renderer, resulting in on-the-fly renditions on the webshop.
- But, in parallel, the renderer fills and processes its own queue which is used to apply the Guetzli algorithm.
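To make this two-track setup a bit more concrete, here is a minimal sketch in Python. It is not our actual renderer: the `Renderer` class, its method names and the two injected functions (`fast_render` for the quick web rendition, `slow_optimise` standing in for the CPU-heavy Guetzli step) are all assumptions made for illustration.

```python
import queue
import threading
from typing import Callable, Dict


class Renderer:
    """Hypothetical render service: a fast synchronous path plus a slow background queue."""

    def __init__(self,
                 fast_render: Callable[[bytes], bytes],
                 slow_optimise: Callable[[bytes], bytes]) -> None:
        self._fast_render = fast_render        # quick rendition, served on the fly
        self._slow_optimise = slow_optimise    # stand-in for the Guetzli step (assumption)
        self._renditions: Dict[str, bytes] = {}
        self._lock = threading.Lock()
        self._backlog: queue.Queue = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def render(self, image_id: str, original: bytes) -> bytes:
        """Synchronous call: return a rendition right away and queue the slow pass."""
        rendition = self._fast_render(original)
        with self._lock:
            self._renditions[image_id] = rendition
        self._backlog.put(image_id)
        return rendition

    def current(self, image_id: str) -> bytes:
        """Return the best rendition stored so far for this image."""
        with self._lock:
            return self._renditions[image_id]

    def wait_until_idle(self) -> None:
        """Block until the background queue has been fully processed."""
        self._backlog.join()

    def _drain(self) -> None:
        """Background worker: replace fast renditions with optimised ones."""
        while True:
            image_id = self._backlog.get()
            try:
                with self._lock:
                    rendition = self._renditions[image_id]
                optimised = self._slow_optimise(rendition)
                with self._lock:
                    self._renditions[image_id] = optimised
            except Exception:
                pass  # on failure, keep serving the fast rendition
            finally:
                self._backlog.task_done()
```

The point of the split is that the webshop never has to wait for the expensive step: a caller gets the fast rendition immediately, and the stored image is swapped for the smaller one whenever the queue gets around to it.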
Nice, but soon we ended up with this:
- We added one endpoint on the render service: /work.
- This “work”-endpoint can be used to fetch a “work-package” and to put a “work-done-package”.
- We also created a small command-line application which was pushed to our colleagues’ laptops (opt-in).
This small app (we called it Miracle) does exactly the same as the renderer: it runs on a laptop, receives a web-optimized version from the renderer, locally applies the Guetzli algorithm and returns a “work-done-package”.
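The fetch/process/return loop that a client like Miracle runs could look roughly like the sketch below. Everything in it is hypothetical: the package fields, the `WorkEndpoint` class (an in-memory stand-in for the real `/work` endpoint, so the example runs without a network) and the `optimise` callback that takes the place of the local Guetzli run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class WorkPackage:
    image_id: str
    rendition: bytes   # web-optimised image handed out by the renderer


@dataclass
class WorkDonePackage:
    image_id: str
    optimised: bytes   # result of the local optimisation step


class WorkEndpoint:
    """In-memory stand-in for the renderer's /work endpoint (hypothetical)."""

    def __init__(self, pending: List[WorkPackage]) -> None:
        self._pending = list(pending)
        self.done: Dict[str, bytes] = {}

    def fetch(self) -> Optional[WorkPackage]:
        """Hand out the next work-package, or None when the queue is empty."""
        return self._pending.pop(0) if self._pending else None

    def put(self, result: WorkDonePackage) -> None:
        """Accept a work-done-package from a client."""
        self.done[result.image_id] = result.optimised


def run_worker(endpoint: WorkEndpoint,
               optimise: Callable[[bytes], bytes]) -> int:
    """Process work-packages until none are left; return how many were handled."""
    processed = 0
    while (package := endpoint.fetch()) is not None:
        optimised = optimise(package.rendition)   # the CPU-heavy part runs locally
        endpoint.put(WorkDonePackage(package.image_id, optimised))
        processed += 1
    return processed
```

Because each client pulls work instead of having it pushed, a laptop can join or leave at any moment without the renderer needing to know about it. A real version would also have to deal with packages that are fetched but never returned, which this sketch leaves out.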
Since we really wanted this experiment to work (and we really needed lots of CPU cycles), we added an extra element of gamification. The three people who processed the most images in a week could win some nice prizes. Top 10 dashboards were created, an email was sent out to all colleagues and then: BAM! People really liked it, got involved, offered their help, sent us pull requests for better dashboards, became creative in finding spare hardware, etc. Within two days, the whole queue was processed! We never expected the queue to be emptied this fast!
An empty queue, smaller images and an extremely nice project to remember …
Thanks for reading. If you want more information about this project, please don’t hesitate to contact me.