In many systems at bol.com, response speed is very important. This blog is about the data structures and algorithms we used to make a specific analysis step a lot faster: finding the longest matching string prefix.
Ever since I started working for a web analytics company in 2005, I have been working on problems related to making sense of web data. One of the most difficult elements in this type of analysis is interpreting the user agent.
Very often the raw web data I work with is stored in Apache HTTPD access log files that have been compressed using gzip.
At bol.com we offer millions of products to millions of customers, resulting in billions of pages each year. We want to create the most effective service we possibly can for our visitors. So we spend a lot of time preparing the content, monitoring the systems, and analyzing how we can better support our customers in finding what they really want.
At the NextBuild conference I will host a talk about Designing scalable data intensive applications. At bol.com the data volumes we need to process exceed the capacity of any single computer. As a consequence, we are forced to design our data intensive applications in such a way that they can run on a cluster of systems and grow to handle much more than the volumes of today. In this talk I will show the main design concepts that bol.com has been using for more than 8 years that are needed to…