We ditched Prometheus for autoscaling, and we don't miss it. More ↓

🚨 The Problem: Traditional CPU/memory autoscaling fails for real-time analytics workloads. By the time CPU spikes, your queues are already backed up, and Prometheus adds scraping delays exactly when you need scaling most.

⚡ Our Challenge: Our Kafka service ingests terabytes daily, and we can see 10x traffic spikes during customer launches or other notable events. CPU-based scaling was too slow, and memory-based scaling was too vague a signal. We needed real workload signals, not lagging resource metrics.

✅ The Solution: We kept KEDA but swapped Prometheus out for a single Tinybird API. Instead of the traditional multi-hop delay (Prometheus scraping → aggregation → federation → KEDA query), we go direct: KEDA polls live metrics computed fresh from streaming data, directly in Tinybird. Dog food never tasted so good.

🔧 Technical Implementation: We built a Tinybird pipe that exposes Prometheus-format endpoints with metrics like Kafka consumer lag. KEDA pulls these directly, with zero scraping lag. Metrics are computed on demand when KEDA requests them, so it's always fresh data driving scaling decisions. (Rough sketches below 👇)

💡 The Key Insight: Traditional approaches add delay at every hop. Our approach eliminates the middle layers entirely: the metrics are computed from the same streaming data we already trust for analytics.

📊 Results: Faster scaling, simpler infrastructure, and autoscaling based on real workload signals. No separate monitoring stack to run and maintain.

This is what happens when you use your own platform to solve your own scaling challenges. 💪
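To make the idea concrete, here's a minimal stand-in in plain Python (stdlib only) for the kind of endpoint described above: a metric computed fresh on every request and served in Prometheus text exposition format, with no scrape-and-store step in between. This is a sketch of the shape, not our actual implementation (the real thing is a Tinybird pipe, i.e., a SQL query over streaming data); compute_kafka_lag(), the consumer group names, and the port are all made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def compute_kafka_lag() -> dict[str, int]:
    # Hypothetical placeholder: in the real setup this is a live query
    # over the ingestion stream, not a cached or pre-scraped value.
    return {"ingest-consumers": 12_500, "compaction-consumers": 480}

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # Computed now, per request: whoever polls this endpoint always
        # sees the current lag, never a stale scrape.
        lines = ["# TYPE kafka_consumer_lag gauge"]
        for group, lag in compute_kafka_lag().items():
            lines.append(f'kafka_consumer_lag{{consumer_group="{group}"}} {lag}')
        body = ("\n".join(lines) + "\n").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 9090), MetricsHandler).serve_forever()
```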
Read the full technical implementation in our blog post. You'll find it in the comments.

Blog post: tbrd.co/keda
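P.S. For the curious, the math downstream of that fresh lag number is the easy part. Roughly mirroring how the HPA treats an external metric with an AverageValue target, replicas track ceil(total lag / per-replica target). A toy sketch; the target and bounds here are illustrative, not our production values:

```python
import math

def desired_replicas(total_lag: int, lag_target_per_replica: int,
                     floor: int = 1, ceiling: int = 50) -> int:
    # Enough replicas that each one's share of the lag stays at or
    # under the target, clamped to the configured min/max.
    want = math.ceil(total_lag / lag_target_per_replica)
    return max(floor, min(ceiling, want))

print(desired_replicas(12_500, 1_000))  # 13 -> a 10x spike scales out immediately
print(desired_replicas(300, 1_000))     # 1  -> quiet hours scale back down
```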