Battle of the Pods: Kubernetes Autoscaling Showdown - KEDA vs. vanilla Kubernetes

1. Introduction: The Importance of Autoscaling

In today's cloud-native ecosystem, fluctuating workloads and dynamic traffic patterns are the norm. Accommodating such unpredictable behavior requires systems that can adjust in real time. Autoscaling has become a necessity, keeping resource allocation matched to actual demand and curbing excessive costs.

Autoscaling isn't just about costs. It plays a pivotal role in maintaining application performance and throughput. By avoiding both under-provisioning (leading to poor user experience) and over-provisioning (resulting in unnecessary costs), autoscaling strikes the right balance.

2. The Contenders: Understanding the Basics

Horizontal Pod Autoscaler (HPA)

HPA, as Kubernetes' native solution, scales the number of pods based on observed metrics, primarily CPU and memory. While it's straightforward and works well for uniform workloads, its limitations become evident the moment you need to scale to zero or react to anything beyond CPU and memory, neither of which it supports out of the box.

Figure: HPA changes the number of pods.
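
For reference, here is a minimal sketch of an HPA manifest; it assumes a Deployment named web already exists, and the names and numbers are illustrative:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web          # assumed existing Deployment
  minReplicas: 1       # cannot be 0 by default: HPA does not scale to zero
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add/remove pods to hold ~70% average CPU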

Vertical Pod Autoscaler (VPA)

VPA is about resizing pods rather than adding more of them. It observes actual usage and adjusts a pod's CPU and memory requests to fit the workload. But here's the catch: a beefed-up pod isn't necessarily better. Sometimes having more workers process the data is more efficient than having one large, powerful worker.

Figure: VPA resizes a pod.
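
A comparable sketch for VPA, again targeting the assumed web Deployment; note that VPA ships as a separate add-on, not as part of core Kubernetes, and the bounds below are illustrative:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Auto"   # evict and recreate pods with updated resource requests
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi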

3. The Limitations: When Vanilla Kubernetes Autoscalers Fall Short

While built-in Kubernetes autoscalers like HPA and VPA provide basic scaling capabilities, they are inherently limited in scope. Their focus on CPU and memory metrics is a significant limitation for modern applications that need to react to diverse signals, some of which don't even originate from the application itself.

One of the compelling challenges modern applications face is the need to scale in response to events from external systems. For instance:

  • Message Queues: Applications might need to scale based on the number of messages in a queue (like RabbitMQ or Kafka). If there's a surge of unprocessed messages, it might be an indicator to scale up.
  • Database Triggers: Changes or updates in a database (like a sudden increase in rows of a particular table) might necessitate an application scale-up to process or analyze the influx of data.
  • External Webhooks: Incoming webhooks from third-party services (e.g., GitHub pushes or eCommerce transaction events) could require more resources to handle the additional load.
  • IoT Signals: For applications connected to IoT devices, a swarm of signals from these devices might be the metric that necessitates scaling.

Moreover, there are instances where scaling to zero is vital to manage resources efficiently, or scenarios where a combination of different metrics, perhaps CPU utilization coupled with database read/write rates, determines the scaling logic. These nuanced demands highlight the shortcomings of built-in Kubernetes autoscalers.

Custom Metrics Extension for HPA

Kubernetes introduced the custom metrics API to give the HPA adaptability beyond CPU and memory. Practical implementation, however, has surfaced challenges.

While powerful, the custom metrics API is not user-friendly. It demands a detailed grasp of Kubernetes internals, making setup and adjustments cumbersome.
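
To illustrate, once an adapter serves the custom metrics API, the HPA's metrics stanza can reference a per-pod custom metric. The metric name http_requests_per_second here is an assumption; nothing scales until some adapter actually publishes it:

  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second   # must be served via custom.metrics.k8s.io
        target:
          type: AverageValue
          averageValue: "100"              # aim for ~100 requests/s per pod

The manifest itself is short; the real work is standing up and maintaining the adapter that makes the metric exist.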

Intermezzo: Prometheus Adapter

Prometheus Adapter attempts to bridge this gap by implementing the custom metrics API and feeding it Prometheus' extensive metrics. But it comes with baggage of its own: a complex, non-intuitive configuration, and a hard dependency on Prometheus as the only metrics source. Writing and maintaining that configuration demands constant vigilance, since infrastructure or application changes can force reconfiguration.
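
To give a flavor of that configuration, here is a single rule in the adapter's documented format, turning a Prometheus request counter into the per-second metric used above; the series and label names are assumptions about your setup:

rules:
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}   # map Prometheus labels to K8s objects
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"                # expose http_requests_total ...
      as: "${1}_per_second"                  # ... as http_requests_per_second
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

Each rule repeats this discovery, association, naming, and querying dance, and every new metric you want to scale on adds another rule to maintain.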

4. Enter KEDA: The Hero of the Showdown

Kubernetes Event-Driven Autoscaling (KEDA) doesn't just integrate with Kubernetes' custom metrics API; it makes that API accessible. KEDA acts as a metrics server itself, translating events from external systems into metrics the HPA can act on. It's a testament to how a user-friendly interface can transform an experience, making autoscaling truly customizable and versatile.
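
For contrast with the adapter configuration above, here is a sketch of roughly the same Prometheus-backed scaling expressed as a KEDA ScaledObject; the server address, query, and threshold are placeholders for your environment:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaledobject
spec:
  scaleTargetRef:
    name: web               # assumed existing Deployment
  minReplicaCount: 0        # unlike plain HPA, KEDA can scale to zero
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{app="web"}[2m]))
        threshold: "100"    # roughly one replica per 100 req/s

KEDA creates and manages the underlying HPA and the metrics plumbing itself; no adapter rules are required.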

Benefits of KEDA

KEDA offers multiple technical advantages:

  • Event-driven autoscaling: KEDA's ability to respond to specific events, even scaling down to zero, ensures resources are used judiciously.
  • Ease of use: Its intuitive configuration makes implementation a breeze, allowing developers to focus on application logic rather than configuration semantics.
  • Broad applicability: Beyond scaling pods, KEDA can launch Kubernetes jobs in response to events, ideal for tasks that don't need to run constantly but may require significant resources intermittently (see the ScaledJob sketch after this list).
  • Versatile integrations: With support for diverse authentication providers, integrating KEDA is both simple and secure.
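
As a sketch of the jobs point above, a ScaledJob can spawn one-off worker jobs based on queue depth; the queue name, connection string, and worker image here are hypothetical:

apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: worker-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: worker
            image: example.com/worker:latest   # hypothetical worker image
        restartPolicy: Never
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        host: amqp://guest:guest@rabbitmq.default.svc:5672/  # demo only
        queueName: tasks
        mode: QueueLength
        value: "5"           # roughly one job per 5 queued messages

In production the connection string would live in a TriggerAuthentication object rather than inline, which is exactly where the authentication-provider support mentioned above comes in.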

KEDA in Practice

Figure: KEDA scaling a Kafka consumer application.

While traditional metrics such as CPU and memory provide some insights, real-world applications often demand more granular and diverse indicators for effective autoscaling. Here are some scenarios to consider:

  • Event-Driven Applications: Consider a Kafka-driven setup. While CPU usage might remain stable, a surge in incoming Kafka events is the real measure of load. In such a case, autoscaling should respond to the rate of incoming events or the message backlog (see the ScaledObject sketch after this list).
  • E-Commerce Transactions: In an e-commerce framework, special sale events might see a spike in order checkouts. The CPU might be unperturbed, but the genuine load can be the accumulating unprocessed orders in a database queue.
  • Streaming Data Pipelines: Applications processing data streams from platforms like Apache Kafka or AWS Kinesis experience variable data inflow rates. Here, the pertinent metric could be the backlog or lag in processing, not the CPU or memory consumption.
  • Selenium Test Workers: In a Continuous Integration (CI) pipeline, when new code is committed, it might trigger a suite of Selenium tests. The real metric here is the queue of pending tests. If a large number of tests are waiting, autoscaling Selenium workers based on this queue is more effective than merely observing CPU or memory metrics.
  • API Rate Limiting: For applications heavily reliant on third-party APIs with rate limits, the nearing rate limit can be a signal to scale. Instead of passive reactions to rate limit errors, proactive scaling based on API call frequency can ensure smooth operations.
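
Returning to the Kafka scenario above, here is a sketch of a ScaledObject that scales a consumer Deployment on consumer-group lag; the topic, group, and threshold are illustrative:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: kafka-consumer-scaledobject
spec:
  scaleTargetRef:
    name: kafka-consumer     # assumed consumer Deployment
  minReplicaCount: 0
  maxReplicaCount: 12        # keep at or below the topic's partition count
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.default.svc:9092
        consumerGroup: order-processors
        topic: orders
        lagThreshold: "50"   # roughly one replica per 50 messages of lag

Capping maxReplicaCount at the partition count matters because extra consumers in a Kafka consumer group beyond that number simply sit idle.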

Such varied real-world scenarios emphasize the need for a versatile autoscaling solution that can understand and react to a multitude of metrics. KEDA, with its flexibility and adaptability, addresses these challenges effectively.

5. Conclusion: The Future of Autoscaling with KEDA

While Kubernetes has native autoscaling tools like HPA and VPA, and extensions like the Prometheus Adapter, they often come with complexities. KEDA, on the other hand, provides a straightforward platform for diverse autoscaling needs. Its ability to handle event-driven scaling, including scaling to zero, is a significant advantage. Moreover, setting up KEDA is simpler, reducing the typical hurdles users face with Kubernetes' custom metrics.

KEDA's active community is a testament to its utility. Regular contributions to the project, backing from vendors like Kedify and Microsoft, and increasing adoption among businesses show its growing importance in the Kubernetes ecosystem.