How it Works

Set Replica Bounds - Define the absolute minimum and maximum pod counts.

Define Targets - Set the target CPU or memory utilization percentage (e.g., keep average CPU utilization near 80% of the requested CPU).

Target Workload - Bind the HPA to a specific Deployment or StatefulSet.

Generate YAML - Export the autoscaling/v2 manifest.
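
The four steps above map onto fields of an autoscaling/v2 manifest roughly as follows. This is a minimal sketch, not the generator's actual output: the HPA name api-hpa is taken from the example on this page, while the Deployment name api, the replica bounds, and the 80% target are assumed values chosen for illustration.

```yaml
# Sketch only: names and numbers below are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  # Target Workload: the object whose replica count the HPA manages.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment        # a StatefulSet can be targeted the same way
    name: api               # assumed Deployment name
  # Set Replica Bounds: the HPA never scales outside this range.
  minReplicas: 2
  maxReplicas: 10
  # Define Targets: average CPU utilization as a percentage of requested CPU.
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```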

Best Practices

Elastic scaling requires precise CPU utilization thresholds and defined replica boundaries.

API Version - Use autoscaling/v2 rather than autoscaling/v1.

Metrics - Use mixed CPU and custom metrics rather than CPU only.

Bounds - Set a strict maxReplicas constraint rather than leaving the maximum unlimited.
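
To illustrate the mixed CPU and custom metrics recommendation, the sketch below combines a CPU target with a per-pod custom metric. The metric name http_requests_per_second, the Deployment name api, and all numbers are hypothetical. A custom metric like this is only available if a metrics adapter serving the custom metrics API is installed in the cluster.

```yaml
# Sketch only: metric name, workload name, and values are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api                             # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10                         # strict upper bound
  metrics:
    # Resource metric: average CPU utilization across pods.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    # Custom per-pod metric, averaged across pods (hypothetical name).
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

When several metrics are listed, the autoscaler computes a desired replica count for each one and uses the largest.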

Example Output

Here is the opening of a generated manifest that follows the best practices above (only the header fields are shown):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa

Advanced Configuration Logic

Scaling by hand cannot keep up with traffic spikes. The Horizontal Pod Autoscaler solves this, but its API (autoscaling/v2) is complex. Utilization targets are calculated as a percentage of each container's resource requests, so scaling purely on CPU without sensible requests can lead to thrashing, where pods are rapidly created and destroyed. Our tool enforces proper API structure, ensuring the scaleTargetRef correctly points to your Deployment and that utilization metrics are properly formatted, so the autoscaler can read them and compute a replica count.
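
One way to damp the thrashing described above is the optional behavior field of autoscaling/v2, sketched here. The window and policy values are assumptions chosen for illustration, not recommendations from this tool, and api is a hypothetical Deployment name.

```yaml
# Sketch only: workload name and all tuning values are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api                             # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      # Act on the highest recommendation seen in the last 5 minutes,
      # so a brief dip in load does not remove pods immediately.
      stabilizationWindowSeconds: 300
      policies:
        # Remove at most 50% of current replicas per 60-second period.
        - type: Percent
          value: 50
          periodSeconds: 60
    scaleUp:
      # React to rising load without a stabilization delay.
      stabilizationWindowSeconds: 0
      policies:
        # Add at most 4 pods per 60-second period.
        - type: Pods
          value: 4
          periodSeconds: 60
```

Because CPU utilization is measured against each container's resource requests, the target Deployment's containers also need resources.requests.cpu set for a Utilization target like this to work.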

Ready to automate your infrastructure?

Scroll back up to the generator and export your production-ready configuration in seconds.


