Unlocking Efficiency: A Guide for Optimising Usage and Reducing Cost of your Amazon EKS Clusters

Introduction

As businesses increasingly embrace the scalability and flexibility of containerised applications, Amazon Elastic Kubernetes Service (Amazon EKS) has emerged as a leading choice for deploying and managing Kubernetes clusters. However, with the benefits of cloud services come the challenges of cost management.

Cost optimisation is not just about saving money; it is about using your resources wisely, ensuring long-term sustainability, and maintaining a competitive edge.

In this blog, we will explore the various avenues where cost optimisation can be applied to your Amazon EKS clusters, without compromising performance or scalability.

Why Should you Optimise your Costs?

Let's delve a bit deeper into why you should optimise your Amazon EKS costs. As we will find out, there are more benefits than just reducing the monthly bill.

Here are some key benefits of optimising your costs.

  1. Cost Efficiency: By optimising your Amazon EKS costs, you can ensure that you are getting the most value out of your cloud resources. Reducing unnecessary spending allows you to allocate your budget more efficiently and invest in other critical areas of your business.
  2. Budget Management: Controlling Amazon EKS costs helps you maintain a predictable budget and prevents unexpected cost overruns. This allows you to plan and allocate resources effectively without encountering financial surprises.
  3. Resource Utilisation: Optimising Amazon EKS costs encourages better resource utilisation. It ensures that your Kubernetes cluster is using the right amount of resources, reducing wastage and making the most out of the available infrastructure. To get more insights into your Amazon EKS resource utilisation, you can use the monitoring solution that I described in my previous blog at https://nivleshc.wordpress.com/2023/07/24/monitor-amazon-elastic-kubernetes-service-clusters-using-prometheus-and-grafana/.
  4. Scalability: Proper cost optimisation allows you to scale your Amazon EKS cluster more effectively. When you can manage costs efficiently, you can confidently scale your applications to meet increasing demand without worrying about unnecessary expenses.
  5. ROI and Profitability: Maximising cost efficiency directly impacts your organisation’s return on investment (ROI) and overall profitability. By cutting unnecessary expenses, you can improve your bottom line and increase profit margins.
  6. Competitive Advantage: Cost optimisation can give you a competitive edge. If you can deliver your services at a lower cost compared to your competitors, it allows you to offer more competitive pricing and attract more customers.
  7. Sustainability: Cloud cost optimisation aligns with sustainability goals. By using resources more efficiently, you reduce the environmental impact associated with excessive resource consumption.
  8. Long-term Viability: A well-managed cost structure ensures the long-term viability of your Amazon EKS infrastructure. It helps you run your applications and services sustainably over time without incurring debt or unnecessary expenses.

As you can see from the above, there are many benefits that come from optimising your Amazon EKS costs. These can help in realising your organisation's financial, competitive and sustainability goals.

What Makes Up Your Amazon EKS Bill?

Before we dive into the various strategies for optimising the Amazon EKS costs, it is good to understand what actually contributes towards your Amazon EKS bill.

AWS charges you for Amazon EKS clusters based on the following components:

  1. Amazon EKS Control Plane: AWS charges you based on the number of hours your Amazon EKS cluster’s control plane is active. The control plane is responsible for managing and orchestrating your Kubernetes cluster. The pricing depends on the region where your cluster’s control plane is deployed (currently 0.10 USD per hour for the Sydney region).
  2. Amazon EC2 Instances: If you use Amazon EC2 instances as worker nodes in your Amazon EKS cluster, you will be charged for the Amazon EC2 instance usage based on the instance type, running hours, and data transfer. The Amazon EC2 instances run your containerised applications. You can find more information about Amazon EC2 instance pricing at https://aws.amazon.com/ec2/pricing/.
  3. AWS Fargate: If you use AWS Fargate as the compute engine for your Amazon EKS cluster, you will be charged based on the vCPU and memory resources used by your containers. More information regarding AWS Fargate pricing is available at https://aws.amazon.com/fargate/pricing/
  4. Amazon Elastic Container Registry: If you are using Amazon Elastic Container Registry (ECR) to store your container images, this will contribute towards your monthly bill. Amazon ECR pricing information is available at https://aws.amazon.com/ecr/pricing/
  5. Networking: Data transfer between your Amazon EKS cluster and other AWS services or the internet is subject to AWS Data Transfer charges.
  6. Load Balancing: If you use AWS Load Balancer services (e.g., Application Load Balancer, Network Load Balancer) to distribute traffic to your applications, or as ingress controllers, you will be charged based on the usage of these load balancers. You can find more information about the AWS Load Balancer pricing at https://aws.amazon.com/elasticloadbalancing/pricing/
  7. Storage: AWS will charge you for any storage that you use. This consists of any additional volumes that you have attached to your Amazon EC2 worker nodes, any persistent volumes that you have created for your Amazon EKS cluster and any other types of storage you are consuming.

It’s important to note that the charges are based on actual usage. For example, you are only billed for the hours your control plane is active, the running time and size of your Amazon EC2 instances or AWS Fargate tasks, and the data transfer, storage and load balancer usage you actually consume.

Strategies for Optimising your Amazon EKS Cost

Now that we know why we need to optimise our Amazon EKS costs, and have a good understanding of what makes up the Amazon EKS monthly bill, let's go through some strategies that will help us optimise those costs.

1. Right size your worker nodes

This strategy is applicable when you are using Amazon EC2 instances as your Amazon EKS worker nodes. Use a data driven approach to find out if your worker nodes are being adequately used. It would be wasteful to provision capacity that you will never be able to use. On the flip side, you don’t want undersized worker nodes either, as that could leave pods frequently stuck in the Pending state.

A monitoring solution, like the one we provisioned in the previous blog (https://nivleshc.wordpress.com/2023/07/24/monitor-amazon-elastic-kubernetes-service-clusters-using-prometheus-and-grafana/), will help you use a data driven approach to find the right Amazon EC2 instance size that can be used when provisioning your worker nodes.

Where possible, always use a managed Amazon EKS node group, as this removes the administrative burden of managing the Amazon EC2 instance fleet.
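
As a starting point, here is a minimal sketch of how you could query the Prometheus instance from that monitoring solution for average node CPU and memory utilisation. It assumes the standard node_exporter metrics are being scraped and that Prometheus is reachable at the URL you supply (for example via kubectl port-forward); the URL and queries are placeholders to adjust to your own setup.

```python
# right_size_check.py - rough sketch for gauging worker node utilisation from Prometheus.
# Assumes node_exporter metrics are being scraped; adjust the URL and queries to your setup.
import requests

PROMETHEUS_URL = "http://localhost:9090"  # placeholder, e.g. exposed via kubectl port-forward

QUERIES = {
    # Average CPU utilisation per node over the last 24 hours (0.0 - 1.0)
    "cpu": '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[24h]))',
    # Current memory utilisation per node (0.0 - 1.0)
    "memory": '1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)',
}

def run_query(promql: str) -> list:
    """Run an instant PromQL query and return the result vector."""
    response = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": promql})
    response.raise_for_status()
    return response.json()["data"]["result"]

if __name__ == "__main__":
    for name, promql in QUERIES.items():
        print(f"--- {name} utilisation per node ---")
        for sample in run_query(promql):
            instance = sample["metric"].get("instance", "unknown")
            value = float(sample["value"][1])
            print(f"{instance}: {value:.1%}")
```

If the nodes consistently sit well below, say, 40-50% utilisation, that is a good signal that a smaller instance type, or fewer nodes, may be sufficient.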

2. Use Amazon EC2 Spot Instead Of On-Demand Instances

As you might be aware, Amazon EKS managed node groups support Spot Instances. At their simplest, Spot Instances are spare Amazon EC2 capacity that AWS offers at a steep discount compared to On-Demand prices.

However, as with every good thing, there is a catch. Spot capacity is not guaranteed: the price for a particular instance type fluctuates with supply and demand, and AWS can reclaim the capacity whenever it needs it back, terminating your Spot instance with only a two-minute interruption notice.

If your workloads are fault tolerant and can survive interruptions, you can enjoy significant discounts by running them on managed node groups that use Amazon EC2 Spot instances.
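
As a rough sketch, the snippet below creates a Spot-backed managed node group using boto3. The cluster name, subnet IDs, node role ARN and instance types are placeholders for your own values; supplying several instance types is generally recommended, as it gives AWS more Spot pools to draw from.

```python
# create_spot_nodegroup.py - sketch of creating a Spot-backed managed node group.
# All names, ARNs and subnet IDs below are placeholders; replace them with your own.
import boto3

eks = boto3.client("eks", region_name="ap-southeast-2")

response = eks.create_nodegroup(
    clusterName="my-eks-cluster",                         # placeholder cluster name
    nodegroupName="spot-workers",
    capacityType="SPOT",                                  # request Spot instead of On-Demand capacity
    instanceTypes=["m5.large", "m5a.large", "m4.large"],  # multiple types improve Spot availability
    scalingConfig={"minSize": 1, "maxSize": 5, "desiredSize": 2},
    subnets=["subnet-aaaa1111", "subnet-bbbb2222"],       # placeholder subnet IDs
    nodeRole="arn:aws:iam::123456789012:role/eks-node-role",  # placeholder node IAM role ARN
    labels={"capacity": "spot"},
)
print(response["nodegroup"]["status"])
```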

3. Use A Mix Of Amazon EC2 Spot And On-Demand Instances

Sometimes it is not easy to move all your workloads onto Spot Instances. You might have a portion of your workload that cannot tolerate interruptions.

If you can re-architect your workloads so that the fault-tolerant ones are separated from those that are not, you can still drive down your costs. Run the fault-tolerant workloads on managed node groups that use Amazon EC2 Spot Instances, while those that are not fault tolerant run on managed node groups that use On-Demand Amazon EC2 instances.

The discounts realised this way might not be as large as running your entire workload on Spot Instances, however the cost will still be lower than running everything on On-Demand Amazon EC2 instances.
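
Once both node groups exist, you need a way to steer each workload to the right one. EKS managed node groups label their nodes with eks.amazonaws.com/capacityType (SPOT or ON_DEMAND), so a simple nodeSelector is usually enough. The sketch below patches two hypothetical Deployments using the Kubernetes Python client; the Deployment names and namespace are placeholders.

```python
# pin_workload_to_capacity.py - sketch of pinning Deployments to Spot or On-Demand nodes.
# Relies on the eks.amazonaws.com/capacityType label that EKS managed node groups apply to their nodes.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster
apps = client.AppsV1Api()

def pin_to_capacity(deployment: str, namespace: str, capacity: str) -> None:
    """Patch a Deployment so its pods only schedule onto nodes of the given capacity type."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "nodeSelector": {"eks.amazonaws.com/capacityType": capacity}
                }
            }
        }
    }
    apps.patch_namespaced_deployment(name=deployment, namespace=namespace, body=patch)

# Placeholders: fault-tolerant batch workers go to Spot, the stateful API stays on On-Demand.
pin_to_capacity("batch-workers", "default", "SPOT")
pin_to_capacity("orders-api", "default", "ON_DEMAND")
```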

4. Use AWS Fargate Instead Of Amazon EC2 Instances

AWS Fargate is a serverless compute engine that can run your Amazon EKS pods instead of Amazon EC2 backed node groups.

With AWS Fargate, you don’t have to manage the underlying compute instances (arguably the same can be said about managed Amazon EC2 node groups). Instead of paying for the running costs of Amazon EC2 instances (as is the case with Amazon EC2 backed managed node groups), you only pay for the vCPU and memory that your containers use.

However, AWS Fargate comes with some limitations, which you need to consider before changing your Amazon EKS cluster to use it. These are discussed in detail at https://docs.aws.amazon.com/eks/latest/userguide/fargate.html.
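
If you decide Fargate is a fit, pods are matched to it via a Fargate profile. The sketch below creates one with boto3; the cluster name, namespace, labels, pod execution role ARN and subnet IDs are placeholders for your own values (Fargate profiles only accept private subnets).

```python
# create_fargate_profile.py - sketch of creating an AWS Fargate profile for an EKS cluster.
# All names, ARNs and subnet IDs are placeholders; replace them with your own.
import boto3

eks = boto3.client("eks", region_name="ap-southeast-2")

response = eks.create_fargate_profile(
    fargateProfileName="apps-on-fargate",
    clusterName="my-eks-cluster",                    # placeholder cluster name
    podExecutionRoleArn="arn:aws:iam::123456789012:role/eks-fargate-pod-execution-role",  # placeholder
    subnets=["subnet-aaaa1111", "subnet-bbbb2222"],  # private subnets only
    selectors=[
        # Any pod created in this namespace (and matching these labels) runs on Fargate.
        {"namespace": "apps", "labels": {"compute": "fargate"}},
    ],
)
print(response["fargateProfile"]["status"])
```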

5. Consolidate Your Amazon EKS Clusters

If you have multiple Amazon EKS clusters, it might be worthwhile to review the workloads and consolidate them into fewer clusters. You can use different Kubernetes namespaces to keep the consolidated workloads separated from each other.

Consolidation is a great idea, especially in non-production environments. This could help you fully utilise your worker nodes, thereby reducing your Amazon EKS costs.
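
When consolidating, namespaces (optionally with resource quotas) stop teams or environments from stepping on each other in the shared cluster. Here is a minimal sketch using the Kubernetes Python client; the namespace names and quota values are placeholders to tune to your workloads.

```python
# consolidate_namespaces.py - sketch of carving a shared cluster into namespaces with quotas.
# Namespace names and quota values are placeholders; tune them to your workloads.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

for team in ["team-a-dev", "team-b-dev"]:  # placeholder namespaces for consolidated workloads
    core.create_namespace(
        body={"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": team}}
    )

    # Cap how much of the shared cluster each namespace can consume.
    core.create_namespaced_resource_quota(
        namespace=team,
        body={
            "apiVersion": "v1",
            "kind": "ResourceQuota",
            "metadata": {"name": "compute-quota"},
            "spec": {
                "hard": {
                    "requests.cpu": "4",
                    "requests.memory": "8Gi",
                    "limits.cpu": "8",
                    "limits.memory": "16Gi",
                }
            },
        },
    )
```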

6. Use A Cluster Autoscaler

The Kubernetes Cluster Autoscaler is an open-source tool that automatically adjusts the size of the worker node group in an Amazon EKS cluster, based on the demand and resource requirements of the running workload.

When you deploy applications in a Kubernetes cluster, they consume resources such as CPU and memory. If the resource demand increases and the existing worker nodes in the cluster do not have enough capacity to accommodate the new pods, the Cluster Autoscaler identifies the need for additional capacity, and adds more worker nodes to the node group.

Conversely, when the resource demand decreases, and there are idle worker nodes, the Cluster Autoscaler scales down the cluster by removing the under-utilised nodes, helping you save on costs by using only the required resources during low-demand periods.

The Cluster Autoscaler for EKS works in conjunction with Kubernetes’ Horizontal Pod Autoscaler (HPA). While the HPA scales the number of pods based on application demand, the Cluster Autoscaler scales the number of worker nodes to ensure the necessary resources are available to support the desired number of pods.
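
For completeness, here is a minimal sketch of an HPA created with the Kubernetes Python client, scaling a hypothetical Deployment on CPU utilisation; the Cluster Autoscaler then adds or removes worker nodes as the resulting pods do or do not fit. The Deployment name, namespace and thresholds are placeholders.

```python
# create_hpa.py - sketch of a Horizontal Pod Autoscaler for a hypothetical Deployment.
# The Deployment name, namespace and thresholds are placeholders.
from kubernetes import client, config

config.load_kube_config()
autoscaling = client.AutoscalingV1Api()

hpa = {
    "apiVersion": "autoscaling/v1",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "orders-api-hpa", "namespace": "default"},
    "spec": {
        "scaleTargetRef": {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "name": "orders-api",              # placeholder Deployment name
        },
        "minReplicas": 2,
        "maxReplicas": 10,
        "targetCPUUtilizationPercentage": 70,  # scale out when average CPU exceeds 70%
    },
}

autoscaling.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
```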

Enabling the Amazon EKS Cluster Autoscaler helps you achieve better resource utilisation, reduce over-provisioning, and efficiently handle workload fluctuations without manual intervention. It is an important tool for maintaining a scalable and cost-effective Kubernetes environment on Amazon EKS.

This article provides the necessary information for deploying the Cluster Autoscaler in your Amazon EKS cluster: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
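
One detail from that guide worth calling out: the Cluster Autoscaler discovers which Auto Scaling groups it is allowed to scale via two well-known tags. Managed node groups normally add these tags to their Auto Scaling groups for you, but if you run self-managed node groups (or just want to verify), the sketch below adds them with boto3; the Auto Scaling group and cluster names are placeholders.

```python
# tag_asg_for_autoscaler.py - sketch of tagging an Auto Scaling group for Cluster Autoscaler auto-discovery.
# The ASG name and cluster name are placeholders; managed node groups usually carry these tags already.
import boto3

autoscaling = boto3.client("autoscaling", region_name="ap-southeast-2")

ASG_NAME = "eks-spot-workers-asg"   # placeholder Auto Scaling group name
CLUSTER_NAME = "my-eks-cluster"     # placeholder EKS cluster name

autoscaling.create_or_update_tags(
    Tags=[
        {
            "ResourceId": ASG_NAME,
            "ResourceType": "auto-scaling-group",
            "Key": "k8s.io/cluster-autoscaler/enabled",
            "Value": "true",
            "PropagateAtLaunch": True,
        },
        {
            "ResourceId": ASG_NAME,
            "ResourceType": "auto-scaling-group",
            "Key": f"k8s.io/cluster-autoscaler/{CLUSTER_NAME}",
            "Value": "owned",
            "PropagateAtLaunch": True,
        },
    ]
)
```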

7. Shutdown Unused Amazon EKS Clusters

For most organisations, non-production environments do not need to run 24×7. To save costs, turn them off when they are not in use.

Let's use an example to illustrate this point. Let's assume we have a non-production environment with the following attributes:

Number of Amazon EKS Clusters: 1
Node Group Type: Managed Amazon EC2 Instances
Number of worker nodes: 2
Amazon EC2 Instance Type: r4.xlarge (On-Demand)
Region: Sydney

If we run the above Amazon EKS cluster 24×7, the total cost will be (0.10 + (2 x 0.3192)) * 730 = 539.03 USD/Month (6,468.36 USD/Year).

However, if we find out that this non-production environment is only used between 9am – 7pm on weekdays, and we shut it down outside of these hours, we can save on the Amazon EC2 costs. This brings the cost down to 211.70 USD/Month (2,540.40 USD/Year).

That is a saving of more than 60%!
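
If you want to sanity-check those figures yourself, the back-of-the-envelope calculation looks like this (the prices are the Sydney region prices quoted above and will change over time):

```python
# eks_cost_estimate.py - back-of-the-envelope cost comparison for the example above.
# Prices are the Sydney region figures quoted in this post and will change over time.
CONTROL_PLANE_PER_HOUR = 0.10    # USD per hour, Amazon EKS control plane
R4_XLARGE_PER_HOUR = 0.3192      # USD per hour, r4.xlarge On-Demand
NODES = 2
HOURS_PER_MONTH = 730

# Running 24x7: control plane plus both worker nodes, every hour of the month.
always_on = (CONTROL_PLANE_PER_HOUR + NODES * R4_XLARGE_PER_HOUR) * HOURS_PER_MONTH

# Business hours only: the control plane cannot be stopped, but the worker nodes
# only run 9am-7pm on weekdays (50 out of every 168 hours).
business_hours = HOURS_PER_MONTH * (10 * 5) / (24 * 7)
scheduled = (CONTROL_PLANE_PER_HOUR * HOURS_PER_MONTH
             + NODES * R4_XLARGE_PER_HOUR * business_hours)

print(f"24x7:           {always_on:.2f} USD/month")         # ~539.03
print(f"Business hours: {scheduled:.2f} USD/month")          # ~211.70
print(f"Saving:         {1 - scheduled / always_on:.0%}")    # ~61%
```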

Note: currently there is no functionality available to shut down the Amazon EKS control plane, which is why only the worker nodes are shut down (the control plane continues to accrue its hourly charge).
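
One simple way to implement this is to scale the managed node group's minimum and desired size to zero outside business hours, and back up again in the morning, for example from a scheduled AWS Lambda function or a cron job. A minimal sketch, with placeholder cluster and node group names:

```python
# scale_nodegroup.py - sketch of scaling a managed node group down to zero (and back up) on a schedule.
# Cluster and node group names are placeholders; run this from a scheduler of your choice.
import boto3

eks = boto3.client("eks", region_name="ap-southeast-2")

def scale_nodegroup(cluster: str, nodegroup: str, desired: int, minimum: int, maximum: int) -> None:
    """Adjust the node group's scaling configuration."""
    eks.update_nodegroup_config(
        clusterName=cluster,
        nodegroupName=nodegroup,
        scalingConfig={"minSize": minimum, "maxSize": maximum, "desiredSize": desired},
    )

# Evenings and weekends: no worker nodes (the control plane keeps running and being billed).
scale_nodegroup("my-nonprod-cluster", "workers", desired=0, minimum=0, maximum=2)

# Weekday mornings: bring the two r4.xlarge workers back.
# scale_nodegroup("my-nonprod-cluster", "workers", desired=2, minimum=2, maximum=2)
```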

8. Periodically Review your Cost Optimisation Strategies

Cost optimisation strategies are not something you can set once and then forget about. Your requirements change from time to time, and what you found optimal a few months ago might not be optimal now.

This is why you need to revisit your cost optimisation strategies regularly, to ensure that they are still applicable.

As stated above, always use a data driven approach to find out what is working and what is not. Use your monitoring solution to get insights into your resource utilisation trends, to ensure you strike a balance between cost optimisation and meeting your application’s performance and availability requirements.

As always, I hope this blog has been useful to you and helps you get more out of your Amazon EKS clusters, while keeping the costs down.

Till the next time, stay safe!