Our billing pipeline was suddenly slow
Cloudflare's billing pipeline slowed down because of a hidden bottleneck in ClickHouse. The issue arose after a migration and delayed the billing jobs that usage revenue depends on. The company implemented three patches to address the performance issues and improve its data retention policies.
- Cloudflare relies heavily on ClickHouse for billing and analytics, processing millions of calls daily.
- A hidden bottleneck was discovered in ClickHouse's internals, causing delays in daily aggregation jobs.
- The company adopted a new partitioning scheme to allow per-namespace retention, improving data management.
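
To make the last point concrete, here is a minimal Python sketch of why putting the namespace into the partition key allows per-namespace retention: each partition then belongs to exactly one namespace, so it can be expired on that namespace's own schedule. This is not Cloudflare's implementation. The namespace names, the retention windows, and the `expired_partitions` helper are all hypothetical, and the sketch only decides which partitions are due for removal.

```python
from datetime import date, timedelta

# Hypothetical retention policy: days to keep per namespace (illustrative values).
RETENTION_DAYS = {"billing": 400, "analytics": 90}
DEFAULT_RETENTION_DAYS = 30  # assumed fallback for namespaces without a policy


def expired_partitions(partitions, today):
    """Return the (namespace, day) partitions older than their namespace's window."""
    expired = []
    for namespace, day in partitions:
        keep_days = RETENTION_DAYS.get(namespace, DEFAULT_RETENTION_DAYS)
        cutoff = today - timedelta(days=keep_days)
        if day < cutoff:
            expired.append((namespace, day))
    return expired


# Each partition is identified by (namespace, day), mirroring a partition key
# that includes the namespace rather than the date alone.
partitions = [
    ("billing", date(2026, 1, 1)),
    ("analytics", date(2026, 1, 1)),
    ("analytics", date(2026, 5, 1)),
]
print(expired_partitions(partitions, today=date(2026, 5, 14)))
```

With a date-only partition key, the two partitions for 2026-01-01 would be a single unit, and the shorter analytics window could not be applied without also touching billing data.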
Opening excerpt (first ~120 words)
Our billing pipeline was suddenly slow. The culprit was a hidden bottleneck in ClickHouse
2026-05-14
James Morrison, Christian Endres

At Cloudflare, we are heavy users of ClickHouse, an open source online analytical processing (OLAP) database. Every day, we make millions of calls to ClickHouse to determine how much users should be billed for their usage of Cloudflare products. If we don't finish those jobs in a timely fashion, the invoices become very difficult to reconcile.

This pipeline powers hundreds of millions of dollars in usage revenue, fraud systems, and more, so being delayed has major downstream implications.

Which is why it was a big problem when the daily aggregation jobs in ClickHouse – responsible for ensuring Cloudflare’s bills…
Excerpt limited to ~120 words for fair-use compliance. The full article is at The Cloudflare Blog.