DevOps 9: Implementing Auto-Scaling in Amazon EKS
Efficiently managing application scalability in response to user demand is a critical aspect of cloud-native application development. Amazon Elastic Kubernetes Service (EKS) supports this through manual scaling and auto-scaling capabilities. In this ninth installment of our series, we delve into manually scaling services in EKS and setting up auto-scaling using Kubernetes Horizontal Pod Autoscaler (HPA) and the Cluster Autoscaler. These mechanisms ensure that our microservices can dynamically scale in and out based on actual load, optimizing resource utilization and maintaining application performance.
Manual Scaling in EKS
Step 1: Manually Scaling a Deployment
Before diving into auto-scaling, it’s useful to understand how to manually scale a deployment. This can be done with the kubectl scale command:
kubectl scale deployment <DEPLOYMENT_NAME> --replicas=<NUMBER_OF_REPLICAS>
Replace <DEPLOYMENT_NAME> with the name of your deployment and <NUMBER_OF_REPLICAS> with the number of pod instances you want.
While manual scaling is straightforward, it lacks the ability to respond automatically to changes in load, which is where auto-scaling comes into play.
Implementing Auto-Scaling
Kubernetes Horizontal Pod Autoscaler (HPA)
The Horizontal Pod Autoscaler automatically scales the number of pod replicas in a deployment, replication controller, replica set, or stateful set based on observed CPU utilization (or, with custom metrics support, on other application-provided metrics).
Step 1: Deploy Metrics Server
HPA requires metrics (like CPU and memory usage) from the Metrics Server:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Step 2: Create an HPA Resource
Create an HPA resource targeting your deployment. This example scales based on CPU usage:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
This HPA will scale the deployment my-app to maintain an average CPU utilization of 50% across all pods, adjusting the replica count between 1 and 10 based on demand.
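The scaling rule HPA applies can be approximated by the formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds. Here is a minimal sketch of that calculation in Python; the function name and inputs are illustrative, not part of any Kubernetes API:

```python
import math

def desired_replicas(current_replicas: int,
                     current_cpu_pct: float,
                     target_cpu_pct: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Approximate the HPA rule:
    desired = ceil(current * currentMetric / targetMetric),
    clamped to [min_replicas, max_replicas]."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

# With the example HPA above (target 50% CPU, bounds 1-10):
print(desired_replicas(4, 100.0, 50.0))  # observed CPU doubled -> 8 replicas
print(desired_replicas(4, 20.0, 50.0))   # load dropped -> 2 replicas
```

Note that the real controller also applies tolerances and stabilization windows to avoid flapping, so small metric fluctuations around the target do not trigger scaling.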
Cluster Autoscaler
While HPA adjusts the pod count, the Cluster Autoscaler adjusts the number of nodes in your cluster. It increases the number of nodes when there are insufficient resources for all pods and decreases the count when some nodes are underutilized.
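To build intuition for the scale-up side of that behavior, here is a deliberately simplified model in Python. The real Cluster Autoscaler simulates the scheduler against actual pod resource requests rather than using a flat pods-per-node capacity; the function name and the pods_per_node parameter are illustrative assumptions:

```python
import math

def nodes_needed(pending_pods: int, pods_per_node: int,
                 current_nodes: int, max_nodes: int) -> int:
    """Simplified scale-up rule: add just enough nodes to schedule the
    pending (unschedulable) pods, never exceeding the node group's max."""
    extra_nodes = math.ceil(pending_pods / pods_per_node)
    return min(current_nodes + extra_nodes, max_nodes)

# With a node group bounded at 1:10 (as in the --nodes flag used later):
print(nodes_needed(pending_pods=12, pods_per_node=8,
                   current_nodes=3, max_nodes=10))  # -> 5
```

The max_nodes clamp is the key safety property: no matter how many pods pile up, the autoscaler never grows the group past the bound you configure.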
Step 1: Enable Cluster Autoscaler in EKS
To enable the Cluster Autoscaler in EKS, you need to edit your node group’s configuration to allow auto-scaling and then deploy the Cluster Autoscaler with the appropriate IAM roles and permissions.
For managed node groups, you can enable auto-scaling through the AWS Management Console or AWS CLI. Ensure your IAM role has the necessary permissions as outlined in the EKS documentation.
Step 2: Deploy Cluster Autoscaler
Deploy the Cluster Autoscaler by applying a configuration similar to this:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_FOR_CA>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v<VERSION>
          name: cluster-autoscaler
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --nodes=1:10:<NODEGROUP_NAME>
          env:
            - name: AWS_REGION
              value: <AWS_REGION>
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs/ca-certificates.crt
Replace <ACCOUNT_ID>, <IAM_ROLE_FOR_CA>, <VERSION>, <NODEGROUP_NAME>, and <AWS_REGION> with your specific values.
Conclusion
Implementing manual and auto-scaling strategies in Amazon EKS ensures that your applications can efficiently handle varying loads, providing cost savings and an improved user experience. By utilizing Kubernetes HPA for pod scaling and the Cluster Autoscaler for node scaling, you create a responsive and resource-efficient environment that can adapt to the needs of your applications in real time.
Gotchas and Tips
- Monitor Scaling Events: Keep an eye on scaling events and metrics to adjust thresholds and limits as needed.
- Understand Cost Implications: Auto-scaling can lead to cost increases if not monitored and managed carefully. Set appropriate limits based on your budget and performance requirements.
- Test Scaling: Regularly test your scaling configurations under different loads to ensure they behave as expected and meet your application’s needs.