Efficiently managing application scalability in response to user demand is a critical aspect of cloud-native application development. Amazon Elastic Kubernetes Service (EKS) supports this through manual scaling and auto-scaling capabilities. In this ninth installment of our series, we delve into manually scaling services in EKS and setting up auto-scaling using Kubernetes Horizontal Pod Autoscaler (HPA) and the Cluster Autoscaler. These mechanisms ensure that our microservices can dynamically scale in and out based on actual load, optimizing resource utilization and maintaining application performance.

Manual Scaling in EKS

Step 1: Manually Scaling a Deployment

Before diving into auto-scaling, it’s useful to understand how to manually scale a deployment. This can be done using the kubectl scale command:

kubectl scale deployment <DEPLOYMENT_NAME> --replicas=<NUMBER_OF_REPLICAS>

Replace <DEPLOYMENT_NAME> with the name of your deployment and <NUMBER_OF_REPLICAS> with the number of pod instances you want.
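For example, using a hypothetical deployment named my-app, scaling to three replicas looks like this:

```shell
# Scale the (hypothetical) "my-app" deployment to 3 replicas
kubectl scale deployment my-app --replicas=3

# Confirm the new replica count and rollout status
kubectl get deployment my-app
```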

While manual scaling is straightforward, it lacks the ability to respond automatically to changes in load, which is where auto-scaling comes into play.

Implementing Auto-Scaling

Kubernetes Horizontal Pod Autoscaler (HPA)

The Horizontal Pod Autoscaler automatically scales the number of pod replicas in a deployment, replication controller, replica set, or stateful set based on observed CPU utilization (or, with custom metrics support, on other application-provided metrics).

Step 1: Deploy Metrics Server

HPA requires resource metrics (CPU and memory usage), which are provided by the Metrics Server. EKS does not install the Metrics Server by default, so deploy it first:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
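It can take a minute or two for metrics to become available. You can verify the Metrics Server is running and serving data like this:

```shell
# Confirm the Metrics Server deployment is available
kubectl get deployment metrics-server -n kube-system

# Once metrics are being collected, these commands return usage data
kubectl top nodes
kubectl top pods
```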

Step 2: Create an HPA Resource

Create an HPA resource targeting your deployment. This example uses the stable autoscaling/v2 API (which superseded autoscaling/v1 and its targetCPUUtilizationPercentage field) and scales based on CPU usage:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50

This HPA will scale the deployment my-app to maintain an average CPU utilization across all pods of 50%. It will adjust the replicas between 1 and 10 based on demand.
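Save the manifest (the filename below is a placeholder) and apply it; kubectl also offers an equivalent imperative shortcut via kubectl autoscale:

```shell
# Apply the HPA manifest
kubectl apply -f my-app-hpa.yaml

# Equivalent one-liner, without writing a manifest file
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10

# Watch current vs. target utilization and the replica count
kubectl get hpa my-app-hpa --watch
```

Note that the HPA computes utilization as a percentage of the CPU requests declared in the pod spec, so the target deployment must set resources.requests.cpu for scaling to work.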

Cluster Autoscaler

While HPA adjusts the pod count, the Cluster Autoscaler adjusts the number of nodes in your cluster. It increases the number of nodes when there are insufficient resources for all pods and decreases the count when some nodes are underutilized.

Step 1: Enable Cluster Autoscaler in EKS

To enable the Cluster Autoscaler in EKS, configure minimum and maximum sizes on your node group (these bound the underlying Auto Scaling group) and then deploy the Cluster Autoscaler with an IAM role that allows it to inspect and resize those Auto Scaling groups, typically granted via IAM Roles for Service Accounts (IRSA).

For managed node groups, you can enable auto-scaling through the AWS Management Console or AWS CLI. Ensure your IAM role has the necessary permissions as outlined in the EKS documentation.
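With the AWS CLI, the scaling range on a managed node group can be set using eks update-nodegroup-config; the cluster and node group names below are placeholders:

```shell
# Set the scaling range on a managed node group (names are placeholders)
aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --scaling-config minSize=1,maxSize=10,desiredSize=2
```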

Step 2: Deploy Cluster Autoscaler

Deploy the Cluster Autoscaler by applying a configuration similar to this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_FOR_CA>
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v<VERSION>
          name: cluster-autoscaler
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --nodes=1:10:<NODEGROUP_NAME>
          env:
            - name: AWS_REGION
              value: <AWS_REGION>
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs/ca-certificates.crt

Replace <ACCOUNT_ID>, <IAM_ROLE_FOR_CA>, <VERSION>, <NODEGROUP_NAME>, and <AWS_REGION> with your specific values.
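Once deployed, you can confirm the Cluster Autoscaler has discovered your node group and observe scaling decisions in its logs:

```shell
# Follow the Cluster Autoscaler logs to confirm it is watching the node group
kubectl -n kube-system logs -f deployment/cluster-autoscaler

# Node count changes appear here as nodes join or leave the cluster
kubectl get nodes --watch
```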

Conclusion

Implementing manual and auto-scaling strategies in Amazon EKS ensures that your applications can efficiently handle varying loads, providing cost savings and an improved user experience. By utilizing Kubernetes HPA for pod scaling and the Cluster Autoscaler for node scaling, you create a responsive and resource-efficient environment that can adapt to the needs of your applications in real time.

Gotchas and Tips

  • Monitor Scaling Events: Keep an eye on scaling events and metrics to adjust thresholds and limits as needed.
  • Understand Cost Implications: Auto-scaling can lead to cost increases if not monitored and managed carefully. Set appropriate limits based on your budget and performance requirements.
  • Test Scaling: Regularly test your scaling configurations under different loads to ensure they behave as expected and meet your application’s needs.
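A simple way to exercise the HPA is to run a throwaway load-generator pod against your service; this sketch assumes a Service named my-app and the HPA from earlier:

```shell
# Generate sustained load against the service (assumes a Service named "my-app")
kubectl run load-generator --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://my-app; done"

# In another terminal, watch the HPA scale the deployment out and back in
kubectl get hpa my-app-hpa --watch

# Clean up when finished
kubectl delete pod load-generator
```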