Cloud as Enabler to become more data driven. The adjective data driven means that progress in an activity is compelled by data, rather than by intuition, personal experience or a gut feeling. Googling data driven gives you over 637 million results, so it is time to add one more with this episode. But we wanted to make it more specific. We moved our big data into the cloud and gained some nice insights that we want to share. We asked our two guests of this episode to do…
In this episode, we talk with Daniel and Emiel, software engineer and product owner in the customer support domain. In this domain, the focus is on helping our customers in the best way possible. But what if we could prevent the customer from feeling the need to contact bol.com in the first place, they asked themselves. They realized this might be possible by analyzing the various customer interactions we have via the Chatbot “Billie”, live chat, phone and email. For these analyses, they introduced techniques from the Data…
For as long as retail has existed, people have tried to predict the future. An accurate forecast makes it much easier to buy the correct amount of products from suppliers, to know what you need to keep in stock, and even to know how sales will respond to specific promotions. Over the last couple of years, this domain has changed dramatically because of the introduction of Data Science and Artificial Intelligence. In this episode, we chat about this changing playing field to share our experiences with you. Guests: Harmen Prins, Erick Webbe. Hosts: Peter Brouwers…
In many systems at bol.com, response speed is very important. This blog is about the data structures and algorithms we used to make a specific analysis step a lot faster: finding the longest matching string prefix.
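To make the idea of longest-prefix matching concrete, here is a minimal sketch using a trie (prefix tree). This is one common way to approach the problem and is not necessarily the data structure the blog post ends up using; the class name `PrefixTrie` and the sample prefixes are hypothetical.

```python
# Sentinel key marking "a stored prefix ends at this node".
# It is an object, so it can never collide with a character key.
_END = object()


class PrefixTrie:
    """Stores a set of prefixes and finds the longest one matching a text."""

    def __init__(self):
        self._root = {}

    def insert(self, prefix):
        """Add one prefix to the trie, one character per level."""
        node = self._root
        for ch in prefix:
            node = node.setdefault(ch, {})
        node[_END] = True

    def longest_prefix(self, text):
        """Return the longest stored prefix of `text`, or None if none match."""
        node = self._root
        best = "" if _END in node else None
        for i, ch in enumerate(text, start=1):
            node = node.get(ch)
            if node is None:
                break  # no stored prefix continues with this character
            if _END in node:
                best = text[:i]  # remember the longest match seen so far
        return best


# Hypothetical sample data: three nested prefixes.
trie = PrefixTrie()
for p in ("31", "316", "3161"):
    trie.insert(p)

print(trie.longest_prefix("31612345"))
print(trie.longest_prefix("3170"))
print(trie.longest_prefix("44"))
```

The lookup walks the text once and stops at the first character that no stored prefix continues with, so its cost depends on the length of the text rather than on the number of stored prefixes.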
Data Science and Machine Learning are becoming more integrated into current businesses. Especially in e-commerce there is huge potential for predictive modeling. It is therefore no surprise that bol.com is putting extra focus on significantly expanding its Data Science efforts in the coming year. That’s not to say that there aren’t already some interesting Data Science projects running. In this blog post we will take a look at one of the projects I am currently working on with fellow data scientist Joep Janssen: the chunk project.
Ever since I started working for a WebAnalytics company in 2005, I’ve been working on problems related to making sense of web data. One of the most difficult elements in this type of analysis is making sense of the user agent.
Very often the raw web data I work with is stored in Apache HTTPD access log files that have been compressed using gzip.
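As an illustration of working with such files, the sketch below reads a gzip-compressed access log line by line and extracts a few fields, including the user agent. It assumes the logs use the Apache "combined" log format; the function name `read_access_log` and the regular expression are illustrative, not taken from the original post.

```python
import gzip
import re

# Assumed layout: Apache "combined" log format. The referer and user agent
# are optional so that plain "common" format lines also match.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)")?'
)


def read_access_log(path):
    """Yield one dict per parseable line of a gzip-compressed access log."""
    # "rt" decompresses on the fly and decodes to text, so the file
    # never has to be unpacked on disk first.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LOG_PATTERN.match(line)
            if match:  # silently skip lines that do not fit the pattern
                yield match.groupdict()
```

Because the function is a generator, it can stream through large log files without loading them into memory. Note that this simple pattern does not handle escaped quotes inside the request or user agent fields.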