Too big to fail - a Beam Pattern for enriching a Stream using State and Timers

Posted on Tue 01 August 2023 in Apache Beam

The recording of my talk at the Beam Summit 2023, discussing a pattern for enriching streaming data using state and timers, is now available:

Playing the Long Game: Transforming Ricardo's Data Infrastructure with Apache Beam

Posted on Wed 03 August 2022 in Apache Beam

The recording of my talk at the Beam Summit 2022 about building real-time data pipelines, and our move from running our own on-premises Apache Flink cluster to the fully managed GCP Dataflow service, is now available:

Python script to control BI Engine Reservations

Posted on Sat 07 May 2022 in Google BigQuery

I wrote a small Python script to create and delete BI Engine reservations for Google Cloud BigQuery. I used this script in a cron job to disable BI Engine on weekends and overnight to save some costs and see how well the system is with and …

Apache Beam Case Study

Posted on Fri 03 December 2021 in Apache Beam

In a case study for Apache Beam, I described how the framework enables Ricardo to evolve into a smarter second-hand marketplace. You can find it on the Apache Beam website.

You belong together - detecting linked accounts at Ricardo

Posted on Fri 06 August 2021 in Apache Beam

The recording of my talk at the Beam Summit 2021 about one of our first production pipelines created with the Python SDK is now available on YouTube:

GCP Zurich Meetup #8

Posted on Tue 16 March 2021 in Talks

At the GCP Meetup #8 in Zurich, I talked about how the Google BigQuery data warehouse can benefit from the NoSQL database Cloud Bigtable, by using Bigtable as a fast key-based lookup store for point queries:

8 Years of Event Streaming with Apache Kafka

Posted on Tue 02 February 2021 in Apache Kafka

A post for the Confluent blog about my personal history with Apache Kafka is available here.

Why we chose Bigtable to complement BigQuery

Posted on Thu 14 January 2021 in Google Cloud

I contributed a blog post to the GCP Blog about our use of Bigtable at Ricardo. You can read it here.

Four Apache Technologies Combined for Fun and Profit

Posted on Wed 26 August 2020 in Apache Beam

The recording of my talk at the Online Beam Summit 2020 is now available on YouTube:

From database dumps to streaming -'s Beam journey

Posted on Tue 18 June 2019 in Apache Beam

My talk at the Beam Summit Europe 2019 is now available on YouTube: