A DevOps life isn’t a piece of cake in AWS. How are you supposed to make sense of EC2 instance types when you’re looking at over 500 different ones?
Picking the right VM type for the job that doesn’t burn a hole in your pocket is a challenge. But there are a few things you can do to make your life easier (and gain points with your financial department).
Careful choice of EC2 instances is definitely worth your time because compute is the biggest part of your cloud bill. If you manage to optimize it, you’ll open the doors to dramatic reductions of your cloud costs.
What you’ll find inside:
- Before we get started: 5 basic facts about Amazon EC2 instances
- How to choose the EC2 instance types with cost optimization in mind
- Identify your application’s requirements
- Shop around for EC2 instance families
- Choose your instance size with cost savings in mind
- Weigh the pros and cons of different pricing models
- Reduce costs with CPU bursting
- Optimize your storage choice
- Use Spot Instances (even for production workloads)
- Use automation to find better-suited instances
Note: The cloud world changes rapidly, so we update this article to reflect that. Last update: 04.08.2021.
Before we get started: 5 basic facts about Amazon EC2 instances
- Amazon Elastic Compute Cloud (EC2) is a service that delivers compute capacity in the cloud to help teams benefit from easy-to-scale cloud computing.
- AWS currently offers over 500 different instances with choices across storage options, networking, and operating systems.
- Users can choose from machines located in 24 regions and 77 availability zones all over the world.
- EC2 instances run on two processor architectures: x86 (Intel Xeon and AMD EPYC) and Arm (AWS Graviton).
- To match your use case, you can choose from 5 different EC2 instance families optimized for compute, memory, storage, accelerated computing or general purpose.
How to choose the EC2 instance types with cost optimization in mind
1. Identify your application’s requirements
Some teams make the mistake of choosing EC2 instances that are too large. They want to be on the safe side in case their application’s requirements increase. But why overprovision when you can use a burstable instance or delegate the task to incredibly cost-effective spot instances when needed?
Other teams are tempted to use more affordable instances. But what if they start running memory-intensive applications and encounter performance issues?
It all starts with knowing your workload requirements well. Make a deliberate effort to get only what your application really needs.
Identify the minimum requirements of your workload and pick EC2 instance types that meet them across these dimensions:
- vCPU count
- vCPU architecture
- SSD storage
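Matching instance types against minimum requirements can be sketched in a few lines of Python. The catalog below is hypothetical: the instance names, specs, and prices are placeholders made up for illustration, not real AWS offerings or quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    architecture: str   # "x86_64" or "arm64"
    local_ssd_gb: int   # 0 means no local SSD
    hourly_usd: float   # placeholder price, not a real AWS quote


# Hypothetical catalog for illustration only.
CATALOG = [
    InstanceType("small-x86", 2, "x86_64", 0, 0.040),
    InstanceType("medium-x86", 4, "x86_64", 150, 0.095),
    InstanceType("medium-arm", 4, "arm64", 150, 0.080),
    InstanceType("large-x86", 8, "x86_64", 300, 0.190),
]


def candidates(catalog, min_vcpus, architectures, min_local_ssd_gb=0):
    """Return the instance types that meet every minimum, cheapest first."""
    matching = [
        it for it in catalog
        if it.vcpus >= min_vcpus
        and it.architecture in architectures
        and it.local_ssd_gb >= min_local_ssd_gb
    ]
    return sorted(matching, key=lambda it: it.hourly_usd)


shortlist = candidates(
    CATALOG,
    min_vcpus=4,
    architectures={"x86_64", "arm64"},
    min_local_ssd_gb=100,
)
print([it.name for it in shortlist])
```

Sorting by price after filtering keeps the two concerns separate: first rule out anything that can't run the workload, then look at cost.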
Let’s say that you’ve done your homework and come up with a set of targeted instance types.
CPU vs. GPU – which one should you pick?
If you’re looking for an instance to support a machine learning application, go for GPU instead of CPU. GPU-dense instance types train models much faster. Interestingly, the GPU wasn’t initially designed for machine learning; it was designed to display graphics.
What about running predictions? Is investing in specialized instance types worth it? AWS has introduced an instance type designed for inference, Amazon EC2 Inf1. It supposedly delivers up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances.
And what’s the hype around Arm all about? The EC2 A1 family is powered by the first-generation AWS Graviton Arm processor, and newer families such as M6g, C6g, R6g, and T4g run on Graviton2. Since Arm is less power-hungry, it’s also cheaper to run and cool. Cloud providers usually charge less for this type of processor.
But if you’d like to use it, you might have to re-architect your delivery pipeline to compile your application for Arm. On the other hand, if you’re already running an interpreted stack like Python, Ruby or NodeJS, your applications will likely run on Arm.
2. Shop around for EC2 instance types and families
|EC2 instance family|Key characteristics|Use cases|
|---|---|---|
|General purpose|Balanced ratio of vCPU to memory|General-purpose applications that use vCPU and memory in equal proportions; scale-out workloads like web servers, containerized microservices, and small to mid-sized development environments; low-latency user-interactive applications and small to medium database workloads; virtual desktop machines, code repositories, application servers|
|Compute optimized|High ratio of vCPU to memory; optimized for vCPU-intensive workloads|High performance web servers, batch processing, distributed analytics; high performance computing (HPC); highly scalable multiplayer gaming platform apps; high performance frontend fleets, backend applications, and API servers; science and engineering applications|
|Memory optimized|High ratio of memory to vCPU|High performance database clusters; distributed web-scale in-memory caches; mid-size in-memory databases and enterprise applications; applications that process unstructured big data in real time; high performance computing (HPC) and Hadoop/Spark clusters|
|Storage optimized|Designed for workloads that need high, sequential read and write access to massive data sets on local storage; can deliver thousands of low-latency, random I/O operations per second (IOPS) to applications|NoSQL databases (Cassandra, MongoDB, Redis); in-memory databases (SAP HANA, Aerospike); scale-out transactional databases and distributed file systems (HDFS and MapR-FS); Massively Parallel Processing (MPP); MapReduce and Hadoop distributed computing; Apache Kafka and big data workload clusters|
|Accelerated computing|Uses hardware accelerators (co-processors) to power functions that machine and deep learning systems require|Machine/deep learning; high performance computing (HPC); computational finance; speech recognition and conversational agents; molecular modelling and genomics; recommendation engines; 3D visualizations and rendering|
|Inference type|Promises up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances; includes up to 16 AWS Inferentia chips, second-generation Intel Xeon Scalable processors, and networking of up to 100 Gbps|Machine learning applications; search recommendation; speech recognition and natural language processing; fraud detection|
3. Choose your instance size with cost savings in mind
EC2 instance types come in one or more sizes, so scaling resources to match your workload’s requirements is easy.
But size isn’t the only factor that determines the cost.
AWS rolls out different computers to provide compute capacity. And the chips in those computers have different performance characteristics.
You might get an instance running on an older-generation processor that is slightly slower or a new-generation one that is a bit faster. The instance type you pick might come with strong performance characteristics your application doesn’t really need. And you won’t even know it.
How to verify this? Benchmarking is the best approach. It means that you drop the same workload on every machine type you want to examine and check its performance characteristics.
Here’s an example of benchmarking
To understand instance performance, we developed a metric called Endurance Coefficient. Here’s how we calculate it:
- We measure how much work an instance type can carry out in 12 hours and how variable the CPU performance is.
- A sustained base load needs stable performance. A workload that only sees occasional traffic or batch jobs can get away with lower stability.
- In our calculation, instances with stable performance score close to 100%, and ones with erratic performance edge closer to 0%.
We tested the DigitalOcean s1_1 machine and it achieved a pretty high endurance coefficient of 0.97107 (97%). The AWS t3_medium_st instance delivered a less stable result, with an endurance coefficient of 0.43152 (43%).
Source: CAST AI
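If you want to run a similar comparison yourself, here is a minimal sketch of a stability score. To be clear, this is not the Endurance Coefficient formula, which isn’t spelled out in this article. It’s a stand-in that assumes you’ve sampled throughput at regular intervals and simply uses one minus the coefficient of variation. The sample numbers are made up.

```python
from statistics import mean, pstdev


def stability_score(samples):
    """Score in [0, 1]; 1.0 means identical throughput in every interval.

    Assumption: stability = 1 - (standard deviation / mean), floored at 0.
    This is an illustrative stand-in, not the Endurance Coefficient.
    """
    if not samples:
        raise ValueError("need at least one sample")
    avg = mean(samples)
    if avg <= 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(samples) / avg)


def total_work(samples):
    """Total work done across all intervals."""
    return sum(samples)


# Made-up throughput samples (operations per interval) for two machines.
steady = [100, 101, 99, 100, 100, 100]
bursty = [160, 150, 40, 35, 30, 30]

for name, samples in (("steady", steady), ("bursty", bursty)):
    print(name, total_work(samples), round(stability_score(samples), 3))
```

Looking at total work and stability side by side matters because a machine can finish a lot of work overall and still be a poor fit for a sustained base load if its performance swings widely.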
4. Weigh the pros and cons of different pricing models
Next, you have to select an EC2 pricing model that matches your needs and budget. AWS offers the following models:
On-Demand Instances
You pay only for the resources that you actually use. No need to worry about long-term binding contracts or upfront payments. Increase or reduce your usage just-in-time. But this flexibility comes with a high price tag. Workloads with fluctuating traffic spikes benefit the most from On-Demand instances.
Reserved Instances
Buy capacity upfront in a given availability zone with a large discount off the On-Demand price. The larger your upfront payment, the larger the discount. But if you go for it, you’re also committing to a specific instance or family. And you can’t change that later if your requirements change.
Savings Plans
Get the Reserved Instances discounts but commit to a given amount of compute usage per hour, measured in dollars (not specific instance types and configurations). Anything extra will be billed at the high On-Demand rate.
But wait, didn’t you migrate to the cloud to avoid CAPEX in the first place? Reserved Instances and Savings Plans pose a risk of vendor lock-in. The resources you get today might make little sense for your company down the line. Three years is an eternity in cloud computing.
Take a look here for more insights on this: Do AWS Reserved Instances and Savings Plans really reduce costs?
Spot Instances
Bidding on spare compute is a smart move: you can save up to 90% off the On-Demand pricing. But AWS can pull the plug on your instance any time and give you just 2 minutes to prepare for it. You need to come up with a strategy to deal with that.
Learn more about spot instances here: Spot instances: How to reduce AWS, Azure, and GCP costs by 90%
Dedicated Hosts
A physical server with instance capacity that is fully dedicated to you. You can reduce costs by bringing your own licenses while keeping the resiliency and flexibility of the cloud. It’s pricey, but a good match for applications that have to achieve compliance and, for example, not share hardware with other tenants.
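To get a feel for the trade-off between paying as you go and committing upfront, here is a rough break-even sketch. The hourly rate and the 40% discount are placeholder assumptions, not AWS prices, and the model assumes a commitment is billed for every hour whether you use it or not.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def on_demand_monthly(on_demand_rate, utilization):
    """Pay only for the hours actually used (utilization between 0 and 1)."""
    return on_demand_rate * HOURS_PER_MONTH * utilization


def committed_monthly(on_demand_rate, discount):
    """Assumption: a commitment is billed for every hour, used or not."""
    return on_demand_rate * (1 - discount) * HOURS_PER_MONTH


def break_even_utilization(discount):
    """Utilization above which the commitment becomes the cheaper option."""
    return 1 - discount


rate, discount = 0.10, 0.40  # placeholder numbers, not AWS prices
committed = committed_monthly(rate, discount)
for utilization in (0.3, 0.5, 0.9):
    on_demand = on_demand_monthly(rate, utilization)
    cheaper = "On-Demand" if on_demand < committed else "commitment"
    print(f"{utilization:.0%} utilization: {cheaper} is cheaper")
```

Under these assumptions, a 40% discount only pays off if the instance is busy more than about 60% of the time, which is why the decision hinges on how predictable your workload is.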
5. Slash costs with CPU bursting
Burstable performance instances were designed to give you a baseline level of CPU performance together with the possibility of bursting to a higher level when the need arises.
Burstable instances in families T2, T3, T3a, and T4g are a good fit for low-latency interactive applications, microservices, small/medium databases, and product prototypes.
Bursting can happen if you have credits. The number of accumulated CPU credits depends on your instance type. Generally, larger instances collect more credits per hour. But note that there’s a cutoff to the number of credits that can be collected (and naturally, it’s higher for larger instances).
Stopping and starting instances leads to losing credits:
- Stopping an instance in the T2 family means that you immediately lose all the accrued credits.
- If you stop an instance in the T3, T3a, and T4g families, your credits will still be there for seven days (and then you’ll lose them).
We examined burstable instances AWS offers and discovered that if you load your instance for 4 hours or more per day (on average), you’re better off with a non-burstable instance. But if you run an e-commerce business and experience traffic spikes once in a while, a burstable instance is cost-effective.
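The credit mechanics can be sketched as a small hour-by-hour simulation. It relies on the AWS definition that one CPU credit equals one vCPU running at 100% for one minute. The numbers (2 vCPUs, 24 credits earned per hour, a 576-credit cap) are meant to resemble a t3.medium, but treat them as assumptions and check the AWS documentation for your instance type. The sketch also ignores unlimited mode, where you pay for surplus credits instead of being throttled.

```python
def simulate_credit_balance(hourly_cpu_percent, vcpus, credits_per_hour,
                            max_balance, start_balance=0.0):
    """Return a list of (balance, out_of_credits) tuples, one per hour.

    hourly_cpu_percent: average CPU utilization for each hour (0-100),
    averaged across all vCPUs.
    """
    balance = float(start_balance)
    history = []
    for cpu_percent in hourly_cpu_percent:
        # One credit = one vCPU at 100% for one minute.
        spent = cpu_percent * vcpus * 60 / 100
        balance += credits_per_hour - spent
        # A negative balance means the hour needed more credits than available.
        out_of_credits = balance < 0
        balance = min(max(balance, 0.0), float(max_balance))
        history.append((balance, out_of_credits))
    return history


# 20 quiet hours at 5% CPU, then a 6-hour spike at 80% CPU.
day = [5] * 20 + [80] * 6
history = simulate_credit_balance(day, vcpus=2, credits_per_hour=24,
                                  max_balance=576)
for hour, (balance, out_of_credits) in enumerate(history, start=1):
    print(hour, balance, "OUT OF CREDITS" if out_of_credits else "")
```

Plugging in your own utilization profile shows how long a spike can last before the balance hits zero, which is a quick way to sanity-check whether a burstable instance fits your traffic pattern.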
Side note: vCPU capacity is limited
Our tests revealed that compute capacity tends to increase linearly during the first four hours. After that, the increase is limited and the amount of available compute goes down by nearly 90% by the end of the day.
Source: CAST AI
6. Optimize storage choices for EC2 instance types
To maximize cloud cost savings, be careful about data storage:
- Make sure that the EC2 instance types you choose offer the storage throughput your application needs.
- Avoid expensive products like premium SSD unless you plan to use them to the fullest.
- Be careful about egress traffic. In a single-cloud scenario, you pay for traffic between availability zones, most often around $0.01/GB. In a multi-cloud setup, you’ll be charged more, for example $0.02/GB when using direct fiber.
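Per-gigabyte rates look tiny until you multiply them by real traffic. Here is a quick back-of-the-envelope calculation using the two example rates above; the 500 GB/day traffic volume is a made-up assumption.

```python
def monthly_transfer_cost(gb_per_day, usd_per_gb, days=30):
    """Estimated monthly data transfer cost in USD."""
    return gb_per_day * days * usd_per_gb


GB_PER_DAY = 500  # hypothetical traffic volume

cross_az = monthly_transfer_cost(GB_PER_DAY, 0.01)     # between availability zones
multi_cloud = monthly_transfer_cost(GB_PER_DAY, 0.02)  # direct fiber example rate

print(f"Cross-AZ:    ${cross_az:,.2f} per month")
print(f"Multi-cloud: ${multi_cloud:,.2f} per month")
```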
7. Use Spot Instances (even for production workloads)
Spot Instances are a great way to save on your AWS bill. By bidding on instances AWS isn’t using, you can get up to a 90% discount on the On-Demand pricing.
The first step is qualifying your workload for Spot Instances. Is it spot-ready? Answer these questions to find out:
- How much time does your workload need to finish the job?
- Is it mission- and time-critical?
- Can it tolerate interruptions gracefully?
- Is it tightly coupled between nodes?
- Do you have a strategy in place for moving your workload when AWS pulls the plug?
Once you determine that your workload is a good candidate for Spot Instances, here are a few helpful pointers:
- Consider less popular Spot Instances as your chances of getting interrupted are lower.
- Check an instance’s frequency of interruption (the rate at which AWS reclaimed capacity from this instance type during the trailing month). You can check it in the AWS Spot Instance Advisor.
- Don’t be afraid of using Spot Instances for more important workloads. AWS offers special Spot Instances that guarantee uninterrupted operation for up to 6 hours. They’re a bit more expensive but you still achieve 30-50% cost savings.
- When setting your maximum price for a Spot Instance, make it equal to the On-Demand price. Otherwise, you risk having your workload interrupted when the Spot price rises above your maximum.
- Set up groups called AWS Spot Fleets to boost your chances of snatching a Spot Instance. A fleet lets you request multiple instance types simultaneously. You can set a maximum price per hour for the entire fleet instead of for each spot pool (i.e. instances of the same type and with the same OS, availability zone, and network).
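Whatever strategy you pick, your workload needs to notice the two-minute warning. AWS publishes the interruption notice through the instance metadata service at `spot/instance-action`; the sketch below reads it using IMDSv2 and works out how much time is left. The function names and the sample payload are ours, and the fetch function only works when called from inside an EC2 instance, so the example at the bottom parses a sample notice instead of calling it.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout=1.0):
    """Return the raw interruption notice as JSON text, or None if there is none.

    Only works from inside an EC2 instance.
    """
    token_request = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    notice_request = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(notice_request, timeout=timeout) as response:
            return response.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption scheduled
            return None
        raise


def seconds_until_interruption(notice_json, now=None):
    """Parse a notice and return (action, seconds left until it happens)."""
    notice = json.loads(notice_json)
    deadline = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return notice["action"], (deadline - now).total_seconds()


# Sample payload in the documented shape; the timestamps are made up.
sample = '{"action": "terminate", "time": "2021-08-04T12:02:00Z"}'
action, seconds_left = seconds_until_interruption(
    sample, now=datetime(2021, 8, 4, 12, 0, 0, tzinfo=timezone.utc)
)
print(action, seconds_left)
```

In practice you’d poll `fetch_instance_action()` every few seconds and, once it returns a notice, start draining connections or checkpointing work.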
8. Use automation to discover better-suited instances
A specialized instance selection algorithm like the one we’ve built at CAST AI cherry-picks the most cost-effective EC2 instance types and sizes that meet your application’s requirements.
Here’s an example:
At CAST AI, we had 6 e2-standard-4 instances running in our production. We used our Cost Optimizer to check potential savings and got a recommendation to switch to 2 e2-standard-2 and 1 e2-highmem-2 instead. This choice made a lot of sense given that we use more memory, and the e2-highmem-2 instance is more cost efficient!
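To illustrate the idea behind this kind of recommendation, here is a toy brute-force search for the cheapest mix of machines that covers a workload. It is not the CAST AI algorithm. The vCPU and memory figures match the machine types named above, but the prices are placeholder units and the workload demand (6 vCPUs, 32 GiB) is a made-up assumption.

```python
from itertools import product

# name: (vCPUs, memory in GiB, hourly price in placeholder units)
SHAPES = {
    "e2-standard-2": (2, 8, 67),
    "e2-standard-4": (4, 16, 140),
    "e2-highmem-2": (2, 16, 90),
}


def cheapest_mix(need_vcpus, need_mem_gib, shapes=SHAPES, max_per_shape=6):
    """Try every combination of counts and keep the cheapest one that fits."""
    names = list(shapes)
    best_mix, best_cost = None, None
    for counts in product(range(max_per_shape + 1), repeat=len(names)):
        vcpus = sum(n * shapes[name][0] for n, name in zip(counts, names))
        mem = sum(n * shapes[name][1] for n, name in zip(counts, names))
        if vcpus < need_vcpus or mem < need_mem_gib:
            continue  # this mix can't run the workload
        cost = sum(n * shapes[name][2] for n, name in zip(counts, names))
        if best_cost is None or cost < best_cost:
            best_mix, best_cost = dict(zip(names, counts)), cost
    return best_mix, best_cost


mix, cost = cheapest_mix(need_vcpus=6, need_mem_gib=32)
print(mix, cost)
```

Brute force is fine for three machine types, but the search space grows quickly with a real catalog of hundreds of instance types, which is where a purpose-built selection algorithm earns its keep.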
You can get the same eye-opening recommendations for free:
Connect your cluster to the platform and let the read-only agent analyze it. You’ll get a report to help you identify potential cost savings – you can then optimize your cluster on your own or let the solution do that for you automatically.
Wondering how it all works? Read this: How to reduce your Amazon EKS costs by half in 15 minutes
An EC2 instance is a virtual server located in Amazon’s Elastic Compute Cloud (EC2). Teams use it for running applications on the infrastructure of Amazon Web Services (AWS), a cloud computing platform. In essence, EC2 is a service that helps businesses run applications in the AWS computing environment. It offers a practically unlimited number of virtual machines.
EC2 instances differ from virtual machines in several areas.
A virtual machine is a simulated computer system running on virtualized hardware – on a computer or server. This means that virtual machines don’t use dedicated hardware but allotted portions of other systems. They don’t have direct access to the host system’s OS, files, or hardware.
An Elastic Compute Cloud (EC2) instance is a virtual server teams can use to run applications in AWS. They can easily configure the instance’s CPU, storage, memory, and networking resources, picking from different types of instances according to their requirements.
EC2 instances and virtual machines differ in how they handle different resources such as storage, CPU, and memory.
For example, when you establish a virtual machine, you need to define how much of your host’s resources it may utilize. When your VM isn’t in use, the resources are made accessible to other VMs and your host system. When you create an EC2 instance, you also choose which resources each instance can access – but even if the instance isn’t using these resources, they’re not available to other instances.
Amazon EC2 offers a variety of instance types that are tailored to certain use cases. Instance types provide different combinations of CPU, memory, storage, and networking capabilities that allow teams to pick the best resource mix for their applications. Each instance type has one or more instance sizes, which allows scaling resources according to the changing needs of your workloads.
In general, Amazon EC2 instance types are grouped as follows:
– General Purpose
– Compute Optimized
– Memory Optimized
– Storage Optimized
– Accelerated Computing
Before starting an EC2 instance, you must choose the right type and size (and other features) that match your needs. Most of the time, you’ll be selecting an instance type based on the following criteria:
– Availability Zone or Region
Once you open the Amazon EC2 console, you can check what instances you can get in Regions that are available to you (regardless of your location).
There are a few things you need to know to make the best choice in terms of cost vs. performance – we prepared a detailed guide to help you: How to choose the best VM type for the job and save on your cloud bill.
The number of EC2 instances you need will depend on the requirements of your application. One thing to keep in mind is that your account’s EC2 instance limitations should be set to more than 20 per region. Otherwise, you may find yourself unable to create additional instances when you need them most.
Amazon EC2 lets you launch On-Demand Instances and pay as you go. This flexibility might make tracking your usage more challenging.
Fortunately, AWS offers an interactive usage report that can be accessed in the AWS Management Console: the EC2 Instance Usage Report. The report offers insights into instance usage and its patterns, giving you some helpful information to start optimizing your EC2 usage.
But the best way to track and optimize your usage is getting an intelligent cloud optimization solution that selects the best EC2 instances (including Spot Instances) and works to reduce your cloud bill by constantly looking for options matching your changing requirements.
Leave a reply
Thanks for the article. This was informative, filled in some gaps about ec2 instances
so CAST sort of automates for the most optimal cost efficiency?
Hi, Dave. Thank you for your question. CAST algorithm will strike for best price-performance ratio, so the tool will save you a bunch of costs and keep the performance at the same time. At some point in the near future there will be adjustable parameters to select better performance or more savings, at the expense of the other.
can you specify spot instances types with CAST or is it automated?
Hi, thank you for your question. At the moment it is automated to find the best price-performance ratio. The algorithm takes overhead into an account when calculating what would be the most suited instance for the workload. In the near future, this will be more customizable.
In the sea of EC2 instances, its crucial to know which is the best bang for your buck for the specific workload that you are going to have. guides like these do help to understand them better, but the selection has to be done by some sort of AI for most optimized results. I hope that CAST does that AI optimization to select it!
That endurance coefficient should be a publicly disclosed thing by the instance type description, not a thing that you actually have to test and benchmark to see its performance……..
I appreciate people that take time and effort to educate others, especially on such a broad topic, of which instance type is the right for your load, thanks Leon
I haven’t heard about CPU bursting and it seems complicated, but its a thing to learn now to save even more, thanks!
Our team just started using different and better performing instances and in the nick of time, we started seeing some better performance for our $ which obviously means saving money!
Benchmarking every instance just to see which one would be the best performing one is so time consuming…
Even though we didnt pull trough with the optimizations that cast would like to do, I appreciate the insights that we got for free from EKS optimizer. I’ll use those to save some $ without actually changing much, just a few instance types..
I think we are kind of used to the amount of the instances being 400, but I just had a thought, “why” , just why so many, do we really need to dig trough so many types to find that one that just works? Overcomplicated much..
I’m down to spend a few more $ just to find the best performing instance for our required loads, good tip right there
I just absolutely love the fact that you guys give out this instance recommendation for free after connecting my EKS cluster. Its been such a head ache to even find the right ones to run our software on and now, you just gave the best possible solution for it for free… Amazing!