The recording of my talk at the Beam Summit 2022 about building real-time data pipelines, from running our own Apache Flink cluster on-premises to using the fully managed GCP Dataflow service, is now available:
I wrote a small Python script to create and delete BI Engine reservations for Google Cloud BigQuery. I used this script in a cronjob to disable BI Engine on weekends and during the night to save costs and see how well the system performs with and …
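A minimal sketch of such a cronjob script, assuming the `google-cloud-bigquery-reservation` client library; the project and location values are placeholders, and "deleting" the reservation is done by setting its size to zero:

```python
GIB = 2 ** 30  # BI Engine reservation sizes are expressed in bytes


def gib_to_bytes(gib: int) -> int:
    """Convert a reservation size in GiB to the bytes the API expects."""
    return gib * GIB


def set_bi_engine_size(project: str, location: str, size_gib: int) -> None:
    """Set the BI Engine reservation size; 0 effectively disables BI Engine."""
    # Imported lazily so the pure helper above works without the library.
    from google.cloud import bigquery_reservation_v1
    from google.protobuf import field_mask_pb2

    client = bigquery_reservation_v1.ReservationServiceClient()
    # BI Engine has a single reservation per project and location,
    # addressed by this fixed resource name.
    reservation = bigquery_reservation_v1.BiReservation(
        name=f"projects/{project}/locations/{location}/biReservation",
        size=gib_to_bytes(size_gib),
    )
    client.update_bi_reservation(
        bi_reservation=reservation,
        update_mask=field_mask_pb2.FieldMask(paths=["size"]),
    )


# Example cron usage (hypothetical project name):
#   morning job: set_bi_engine_size("my-project", "EU", 50)
#   evening job: set_bi_engine_size("my-project", "EU", 0)
```

Two cron entries, one scaling the reservation up in the morning and one setting it to zero at night, are enough to compare query performance and cost with and without BI Engine.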
In a case study for Apache Beam, I described how the framework enables Ricardo to evolve into a smarter second-hand marketplace. You can find it on the Apache Beam website.
The recording of my talk at the Beam Summit 2021 about one of our first production pipelines created with the Python SDK is now available on YouTube:
At the GCP Meetup #8 in Zurich, I talked about how the Google BigQuery data warehouse can benefit from the NoSQL database Cloud Bigtable, by using Bigtable as a fast key-based lookup store for point queries:
A post for the Confluent blog about my personal history with Apache Kafka is available here.
I've contributed a blog post for the GCP Blog about our use of Bigtable at Ricardo. You can read it here.
The recording of my talk at the Online Beam Summit 2020 is now available on YouTube:
My talk at the Beam Summit Europe 2019 is now available on YouTube:
I gave a talk at the Tamedia TX conference about how we approached the modernization of the Data Intelligence architecture at Ricardo: