Let's consider the following picture and see how we enable teams at bol.com to be in the sweet spot using our tech vision:
At bol.com we like our teams to be autonomous when it comes to implementing solutions to help our customers. However, if people are completely free to do whatever they want to reach a team goal, we might end up in the bottom right corner of the picture due to a lack of direction. We like to be in the upper right corner, where there’s both freedom and alignment. The other parts of the diagram are of course less desirable for our software engineers and data scientists.
Part of the alignment is to have an overall tech vision that helps teams and domains to establish direction. The vision covers quite a few topics, so I'll summarize each one and show how it enables and benefits our developers:
- Technical Platform becoming developer-centric
- Engineering reliable products
- Availability of data and the Data Platform
- Further raising security while enabling innovation
- Paving a smooth road to production
- Maximize autonomy within a framework
- Lower barriers to implement Data Science
Technical platform becoming developer-centric
We are moving from a datacenter environment towards self-service cloud environments. While this self-service gives developers more flexibility and speed in delivering solutions, it also gives them the extra responsibility of running these applications in production. This and other demands, like cost consciousness, add a lot to the cognitive load of developers. We want to make this easier.
How do we do that? By adding analyst and UX capabilities to the teams that deliver the internal developer platform, so that they understand their customers and the product teams better. By creating a second generation of our self-service cloud platform that uses more open-source components rather than bol-specific solutions. With this in place we can use Backstage to give actionable insights that you can simply apply, instead of having to reverse engineer them from all available documentation and tools. That should free up some space in your head and some time to code!
Engineering reliable products
In order to make the daily life of customers and partners easier, we need to balance rapid innovation with world-class reliability. Our platform also follows specific load patterns related to our market and seasonal influences: for example, high load during the holiday season followed by a quieter period in January.
How do we do that? With Google Cloud we can dynamically scale infra up and down on demand. Since teams are responsible for running their own applications in the cloud, we have started an SRE discipline that helps teams to run reliably. This entails workshops to set up SLIs and SLOs with your business counterparts, or supporting and enabling the ‘engineer on duty’ pool for the night shift. But it can also be as much fun as promoting the use of Renovate to update your dependencies with a song.
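To make the SLI and SLO idea a bit more concrete, here is a minimal sketch of how an availability SLO and its error budget could be computed. The class name, the 99.9% target and the request counts are assumptions made up for illustration, not the tooling or targets we actually use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilitySlo:
    """Hypothetical availability SLO: fraction of requests that must succeed."""

    target: float  # e.g. 0.999 means 99.9% of requests should succeed

    def sli(self, good_requests: int, total_requests: int) -> float:
        """SLI: the measured fraction of successful requests."""
        if total_requests == 0:
            return 1.0  # no traffic means nothing failed
        return good_requests / total_requests

    def error_budget_remaining(self, good_requests: int, total_requests: int) -> float:
        """Fraction of the error budget left (1.0 = untouched, below 0 = overspent)."""
        allowed_failures = (1.0 - self.target) * total_requests
        actual_failures = total_requests - good_requests
        if allowed_failures == 0:
            # Either no traffic or a 100% target: any failure overspends the budget.
            return 1.0 if actual_failures == 0 else float("-inf")
        return 1.0 - actual_failures / allowed_failures


# Illustrative numbers only: 500 failed requests out of one million.
slo = AvailabilitySlo(target=0.999)
measured = slo.sli(good_requests=999_500, total_requests=1_000_000)
remaining = slo.error_budget_remaining(good_requests=999_500, total_requests=1_000_000)
```

With these numbers roughly half of the error budget is still available, which is the kind of figure a team and its business counterparts can use to decide between shipping features and working on reliability.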
Availability of data and the Data Platform
At bol.com we know that there’s tons of value in our data, but we need to get it to the people who can put it in the right context. These are the people making business decisions and the data scientists who help create actionable insights. This requires clear definitions of data, data ownership, quality and security controls, and prevention of unnecessary data duplication.
How do we do that? To this end we have democratized the use of data with a piece of middleware that creates standard, high-quality datasets with built-in consistency checks from every service that has data to offer. Our self-service platform helps with the ownership and access control. You can check our journey on this with Google in the interview here. Of course there’s also a department, ‘Data&’, that helps people to get the most out of these datasets.
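As an illustration of what a built-in consistency check on a dataset could look like, here is a small sketch that validates one row against a declared schema. The schema, the column names and the `check_row` helper are hypothetical and only meant to show the idea; they are not our middleware's actual interface.

```python
from typing import Any

# Hypothetical schema: column name -> (expected type, whether None is allowed).
ORDER_SCHEMA: dict[str, tuple[type, bool]] = {
    "order_id": (str, False),
    "quantity": (int, False),
    "discount_code": (str, True),
}


def check_row(row: dict[str, Any], schema: dict[str, tuple[type, bool]]) -> list[str]:
    """Return a list of human-readable problems found in one row."""
    problems = []
    # First pass: every declared column must be present and well-typed.
    for column, (expected_type, nullable) in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        value = row[column]
        if value is None:
            if not nullable:
                problems.append(f"null not allowed: {column}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type for {column}: {type(value).__name__}")
    # Second pass: columns that are not part of the schema are flagged too.
    for column in row:
        if column not in schema:
            problems.append(f"unexpected column: {column}")
    return problems


good = check_row({"order_id": "A1", "quantity": 2, "discount_code": None}, ORDER_SCHEMA)
bad = check_row({"order_id": "A1", "quantity": "2"}, ORDER_SCHEMA)
```

Running checks like this at the point where a service publishes its data means consumers can rely on the shape of a dataset instead of each rebuilding the same validation.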
Further raise security while enabling innovation
Bol.com is big: we have millions of customers, thousands of partners and billions of sales. We need to protect and maintain the trust of our customers, protect their data and run our business without interruption.
How do we do that? Security is part mindset and education, but in our case also a lot of automation. Building blocks on our self-service platform are secure by default for several classes of data (e.g. personal information, financial statements, shop images). Awareness and being safe by default enable a safe shop that is scalable. Automated container scanning is in place, and we’re optimizing the compliance process so that the right person gets notified when something happens that requires an explanation or sign-off. That way people don’t have to ‘periodically check everything’.
Paving a smooth road to production
We need to improve our way of testing to keep going to production in a smooth fashion. With the ever-increasing number of teams and services it’s becoming harder and harder to create a stable test environment. This is amplified by the fact that all test data has to be artificial, and by the number of parameters that need to be ‘just right’ to get a realistic load test.
How do we do that? We want to make better use of test strategies like contract testing, canary releases and other test methods that allow us to deploy to production without relying on a fully functional staging environment.
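To give a feel for how a canary release can replace part of what a staging environment used to do, here is a minimal sketch of a promotion decision based on error rates. The function name, the thresholds and the three outcomes are assumptions for illustration; a real setup would typically look at more signals, such as latency.

```python
def canary_decision(
    baseline_errors: int,
    baseline_total: int,
    canary_errors: int,
    canary_total: int,
    max_extra_error_rate: float = 0.01,  # hypothetical tolerance: 1 percentage point
    min_requests: int = 100,  # hypothetical minimum traffic before judging
) -> str:
    """Compare the canary's error rate with the baseline and pick an action."""
    if canary_total < min_requests:
        return "wait"  # not enough canary traffic to draw a conclusion yet
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    if canary_rate - baseline_rate > max_extra_error_rate:
        return "rollback"  # canary is clearly worse than the current version
    return "promote"


too_early = canary_decision(10, 10_000, 0, 50)
healthy = canary_decision(10, 10_000, 1, 1_000)
unhealthy = canary_decision(10, 10_000, 50, 1_000)
```

The point of the design is that the new version is judged against real production traffic on a small share of requests, so a bad release is caught and rolled back before most customers ever see it.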
Maximize autonomy within a framework
Bol.com is constantly evolving as a company. However, we want to keep the core cultural traits that define our identity, such as the autonomy teams have to build, run and love their products. Nevertheless, as we grow we also need more mechanisms for alignment and a certain level of consistency, so that we stay flexible as an organization and make onboarding and switching teams easier.
How do we do that? We’re working on explicitly defining what the boundaries of the framework are so that everybody can know them without asking around or acquiring a lot of experience first. At our current scale we need to write down our culture as well as live it so that everybody can participate. The framework is more than just a thick rulebook of things we do and don’t do. It entails organizational elements like a tech lead community, a tech radar and architectural principles. It also consists of the tooling we provide from the platform teams, making the developer workflow as easy as possible. And it includes our culture, which is very important for staying adaptable to inside and outside influences.
Lower barriers to implement Data Science
We aim to grow and expand the use of data science, which in turn requires us to provide tailored support for its unique needs. We don’t want people to reinvent the wheel all over the place.
How do we do that? By acknowledging that the way of working for data scientists is different from that of software engineers. Having done that, we create a golden path specifically for data scientists, consisting of Python tooling and AI cloud resources.