Bonus content: Guide to Amazon EC2 instance types
A DevOps life isn’t a piece of cake in AWS. How are you supposed to make sense of EC2 instance types when you’re looking at almost 400 different ones?
Picking the right VM type for the job that doesn’t burn a hole in your pocket is a challenge. But there are a few things you can do to make your life easier (and gain points with your financial department).
Careful choice of EC2 instances is definitely worth your time because compute is the biggest part of your cloud bill. If you manage to optimize it, you’ll open the doors to dramatic reductions of your cloud costs.
What you’ll find inside:
- Before we get started: 5 basic facts about Amazon EC2 instances
- How to choose the EC2 instance types with cost optimization in mind
- Identify your application’s requirements
- Shop around for EC2 instance families
- Choose your instance size with cost savings in mind
- Weigh the pros and cons of different pricing models
- Reduce costs with CPU bursting
- Optimize your storage choice
- Use Spot Instances (even for production workloads)
- Use automation to find better-suited instances
Note: The cloud world changes rapidly, so we update this article to reflect that. Last update: 04.08.2021.
Before we get started: 5 basic facts about Amazon EC2 instances
- Amazon Elastic Compute Cloud (EC2) is a service that delivers compute capacity in the cloud to help teams benefit from easy-to-scale cloud computing.
- AWS currently offers nearly 400 different instances with choices across storage options, networking, and operating systems.
- Users can choose from machines located in 24 regions and 77 availability zones all over the world.
- EC2 instances run on two processor architectures: x86 (Intel Xeon and AMD EPYC) and Arm (AWS Graviton).
- To match your use case, you can choose from 5 different EC2 instance families optimized for compute, memory, storage, accelerated computing or general purpose.
How to choose the EC2 instance types with cost optimization in mind
1. Identify your application’s requirements
Some teams make the mistake of choosing EC2 instances that are too large. They want to be on the safe side in case their application’s requirements increase. But why overprovision when you can use a burstable instance or delegate the task to incredibly cost-effective spot instances when needed?
Other teams are tempted to use more affordable instances. But what if they start running memory-intensive applications and encounter performance issues?
It all starts with knowing your workload requirements well. Make a deliberate effort to get only what your application really needs.
Identify the minimum requirements of your workload and pick EC2 instance types that meet them across these dimensions:
- vCPU count
- vCPU architecture
- SSD storage
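Here's a minimal sketch of how you could turn those minimum requirements into a shortlist. The catalogue below is hypothetical: the instance names, specs, and hourly prices are placeholders for illustration, not real AWS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    arch: str           # "x86_64" or "arm64"
    local_ssd_gb: int   # 0 means no local SSD
    hourly_usd: float   # illustrative price, not a real quote


# Hypothetical catalogue: every name and number here is a placeholder.
CATALOGUE = [
    InstanceType("small-x86", 2, "x86_64", 0, 0.040),
    InstanceType("medium-x86", 4, "x86_64", 0, 0.080),
    InstanceType("medium-arm", 4, "arm64", 0, 0.065),
    InstanceType("medium-x86-ssd", 4, "x86_64", 150, 0.095),
    InstanceType("large-arm-ssd", 8, "arm64", 300, 0.160),
]


def candidates(catalogue, min_vcpus, archs, min_local_ssd_gb=0):
    """Return the types that meet every minimum, cheapest first."""
    matches = [
        t for t in catalogue
        if t.vcpus >= min_vcpus
        and t.arch in archs
        and t.local_ssd_gb >= min_local_ssd_gb
    ]
    return sorted(matches, key=lambda t: t.hourly_usd)


shortlist = candidates(CATALOGUE, min_vcpus=4, archs={"x86_64", "arm64"})
print([t.name for t in shortlist])
```

Sorting by price after filtering keeps the two concerns separate: requirements decide what is eligible, cost decides the order.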
Let’s say that you’ve done your homework and come up with a set of targeted instance types.
CPU vs. GPU – which one should you pick?
If you’re looking for an instance to support a machine learning application, go for GPU instead of CPU. GPU-dense instance types train models much faster. Interestingly, the GPU wasn’t initially designed for machine learning – it was designed to display graphics.
What about running predictions? Is investing in specialized instance types worth it? AWS has introduced an instance type designed for inference, Amazon EC2 Inf1. According to AWS, it delivers up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances.
And what’s the hype around Arm all about? The EC2 A1 family is powered by the first-generation AWS Graviton Arm processor, while newer families such as M6g, C6g, R6g, and T4g run on Graviton2. Since Arm is less power-hungry, it’s also cheaper to run and cool. Cloud providers usually charge less for this type of processor.
But if you’d like to use it, you might have to re-architect your delivery pipeline to compile your application for Arm. On the other hand, if you’re already running an interpreted stack like Python, Ruby or Node.js, your applications will likely run on Arm.
2. Shop around for EC2 instance types and families
|EC2 instance family|Key characteristics|Use cases|
|---|---|---|
|General purpose|Balanced ratio of vCPU to memory|General-purpose applications that use vCPU and memory in equal proportions; scale-out workloads like web servers, containerized microservices, and small to mid-sized development environments; low-latency user interactive applications and small to medium database workloads; virtual desktop machines, code repositories, application servers|
|Compute optimized|High ratio of vCPU to memory; optimized for vCPU-intensive workloads|High performance web servers, batch processing, distributed analytics; high performance computing (HPC); highly scalable multiplayer gaming platform apps; high performance frontend fleets, backend applications, and API servers; science and engineering applications|
|Memory optimized|High ratio of memory to vCPU|High performance database clusters; distributed web scale in-memory caches; mid-size in-memory databases and enterprise applications; applications that process unstructured big data in real time; high performance computing (HPC) and Hadoop/Spark clusters|
|Storage optimized|Designed for workloads that need high, sequential read and write access to massive data sets on local storage; can deliver thousands of low-latency, random I/O operations per second (IOPS) to applications|NoSQL databases (Cassandra, MongoDB, Redis); in-memory databases (SAP HANA, Aerospike); scale-out transactional databases and distributed file systems (HDFS and MapR-FS); Massively Parallel Processing (MPP); MapReduce and Hadoop distributed computing; Apache Kafka and big data workload clusters|
|Accelerated computing|Uses hardware accelerators (co-processors) to power functions that machine and deep learning systems require|Machine/deep learning; high performance computing (HPC); computational finance; speech recognition and conversational agents; molecular modelling and genomics; recommendation engines; 3D visualizations and rendering|
|Inference type|Promises up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances; includes up to 16 AWS Inferentia chips, second generation Intel Xeon Scalable processors, and networking of up to 100 Gbps|Machine learning applications; search recommendation; speech recognition and natural language processing; fraud detection|
3. Choose your instance size with cost savings in mind
EC2 instance types come in one or more sizes, so scaling resources to match your workload’s requirements is easy.
But size isn’t the only factor that determines the cost.
AWS runs instances on different generations of hardware. And the chips in those machines have different performance characteristics.
You might get an instance running on an older-generation processor that is slightly slower or a new-generation one that is a bit faster. The instance type you pick might come with strong performance characteristics your application doesn’t really need. And you won’t even know it.
How to verify this? Benchmarking is the best approach. It means that you drop the same workload on every machine type you want to examine and check its performance characteristics.
Here’s an example of benchmarking
To understand instance performance, we developed a metric called Endurance Coefficient. Here’s how we calculate it:
- We measure how much work an instance type can carry out in 12 hours and how variable the CPU performance is.
- A sustained base load needs stability. A workload that only sees occasional traffic or batch jobs can get away with lower stability.
- In our calculation, instances with stable performance score close to 100% and ones with random performance edge closer to 0%.
We tested the DigitalOcean s1_1 machine and it achieved a pretty high endurance coefficient of 0.97107 (97%). The AWS t3_medium_st instance delivered a less stable result, with an endurance coefficient of 0.43152 (43%).
Source: CAST AI
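The exact Endurance Coefficient formula isn't spelled out above, so the sketch below is not the CAST AI metric. It swaps in a simple stand-in, one minus the coefficient of variation, to show how you could score the stability of your own benchmark samples. The sample values are made up for illustration.

```python
from statistics import mean, pstdev


def stability_score(samples):
    """Stand-in stability score: 1 - (std dev / mean), clamped at 0.

    A perfectly steady series scores 1.0; a noisy one drifts toward 0.
    This is an assumed proxy, not the Endurance Coefficient itself.
    """
    if not samples:
        raise ValueError("need at least one sample")
    avg = mean(samples)
    if avg <= 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(samples) / avg)


def total_work(samples):
    """Total work done across all measurement intervals."""
    return sum(samples)


# Made-up throughput samples (operations per interval) from two machines.
steady = [100, 101, 99, 100, 100, 100]
bursty = [100, 100, 20, 15, 100, 25]

for label, series in (("steady", steady), ("bursty", bursty)):
    print(label, total_work(series), round(stability_score(series), 3))
```

Looking at total work and stability together matters: a machine can finish a lot of work overall and still be a poor fit for a sustained base load if its throughput swings widely.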
4. Weigh the pros and cons of different pricing models
Next, you have to select an EC2 pricing model that matches your needs and budget. AWS offers the following models:
On-Demand Instances
You pay only for the resources that you actually use. No need to worry about long-term binding contracts or upfront payments. Increase or reduce your usage just-in-time. But this flexibility comes with a high price tag. Workloads with fluctuating traffic spikes benefit the most from On-Demand instances.
Reserved Instances
Buy capacity upfront in a given availability zone with a large discount off the On-Demand price. The larger your upfront payment, the larger the discount. But if you go for it, you’re also committing to a specific instance or family. And you can’t change that later if your requirements change.
Savings Plans
Get the Reserved Instances discounts but commit to a given amount of compute usage per hour, measured in $/hour (not specific instance types and configurations). Anything extra will be billed at the high On-Demand rate.
But wait, didn’t you migrate to the cloud to avoid CAPEX in the first place? Reserved Instances and Savings Plans pose a risk of vendor lock-in. The resources you get today might make little sense for your company down the line. Three years is an eternity in cloud computing.
Take a look here for more insights on this: Do AWS Reserved Instances and Savings Plans really reduce costs?
Spot Instances
Bidding on spare compute is a smart move: you can save up to 90% off the On-Demand pricing. But AWS can pull the plug on your instance any time and give you just 2 minutes to prepare for it. You need to come up with a strategy to deal with that.
Learn more about spot instances here: Spot instances: How to reduce AWS, Azure, and GCP costs by 90%
Dedicated Hosts
A physical server with instance capacity that is fully dedicated to you. You can reduce costs by using your own licenses and still get the resiliency and flexibility of the cloud. It’s pricey, but a good match for applications that have to achieve compliance and, for example, not share hardware with other tenants.
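To make the trade-off between pay-as-you-go and commitment-based pricing concrete, here's a rough break-even sketch. The hourly rates are illustrative assumptions, not AWS prices, and the model ignores upfront payment options and partial-hour billing.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def break_even_hours(on_demand_hourly, committed_hourly, period_hours=HOURS_PER_MONTH):
    """Hours of use per period above which the commitment is cheaper."""
    committed_total = committed_hourly * period_hours  # paid whether used or not
    return committed_total / on_demand_hourly


def cheaper_option(hours_used, on_demand_hourly, committed_hourly,
                   period_hours=HOURS_PER_MONTH):
    """Return (option name, cost) for the cheaper model at this usage."""
    on_demand_cost = hours_used * on_demand_hourly
    committed_cost = committed_hourly * period_hours
    if committed_cost < on_demand_cost:
        return "commitment", committed_cost
    return "on-demand", on_demand_cost


# Illustrative rates: $0.10/h On-Demand vs. $0.06/h effective committed rate.
print(round(break_even_hours(0.10, 0.06)))
print(cheaper_option(300, 0.10, 0.06)[0])
print(cheaper_option(600, 0.10, 0.06)[0])
```

With these assumed rates, the commitment only pays off if the instance runs for more than roughly 60% of the month, which is why steady base loads suit commitments and spiky workloads suit On-Demand.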
5. Slash costs with CPU bursting
Burstable performance instances were designed to give you a baseline level of CPU performance together with the possibility of bursting to a higher level when the need arises.
Burstable instances in families T2, T3, T3a, and T4g are a good fit for low-latency interactive applications, microservices, small/medium databases, and product prototypes.
Bursting is only possible while you have CPU credits. The number of accumulated CPU credits depends on your instance type. Generally, larger instances collect more credits per hour. But note that there’s a cutoff to the number of credits that can be collected (and naturally, it’s higher for larger instances).
Stopping instances affects your credit balance:
- Stopping an instance in the T2 family means that you immediately lose all the accrued credits.
- If you stop an instance in the T3, T3a, and T4g families, your credits will still be there for seven days (and then you’ll lose them).
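The credit mechanics can be sketched as a small simulation. One CPU credit equals one vCPU running at 100% for one minute. The earn rate, cap, and utilization pattern below are illustrative assumptions, and the model works at hourly granularity, so it's a simplification rather than a replica of AWS accounting.

```python
def simulate_credits(hourly_utilization, vcpus, earn_per_hour, max_balance,
                     start_balance=0.0):
    """Track the CPU credit balance hour by hour.

    hourly_utilization: average CPU utilization per hour, as a 0..1 fraction.
    A balance of 0 means the instance can no longer burst above baseline.
    """
    balance = start_balance
    history = []
    for util in hourly_utilization:
        spent = util * vcpus * 60          # credits burned this hour
        balance = balance + earn_per_hour - spent
        balance = max(0.0, min(max_balance, balance))  # clamp to [0, cap]
        history.append(balance)
    return history


# Assumed profile: 20 quiet hours at 5% CPU, then a 4-hour spike at 90%.
day = [0.05] * 20 + [0.9] * 4
history = simulate_credits(day, vcpus=2, earn_per_hour=24, max_balance=576)
print(round(history[19]), round(history[-1]))
```

Running your own utilization profile through a model like this shows quickly whether the credits earned in quiet hours cover your spikes.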
We examined burstable instances AWS offers and discovered that if you load your instance for 4 hours or more per day (on average), you’re better off with a non-burstable instance. But if you run an e-commerce business and experience traffic spikes once in a while, a burstable instance is cost-effective.
Side note: vCPU capacity is limited
Our tests revealed that compute capacity tends to increase linearly during the first four hours. After that, the increase is limited and the amount of available compute goes down by nearly 90% by the end of the day.
Source: CAST AI
6. Optimize storage choices for EC2 instance types
To maximize cloud cost savings, be careful about data storage:
- Make sure that the EC2 instance types you choose have a storage throughput your application needs.
- Avoid expensive products like premium SSD unless you plan to use them to the fullest.
- Be careful about egress traffic. In a single-cloud scenario, you pay for traffic between availability zones, which most often costs some $0.01/GB. But in a multi-cloud setup, you’ll be charged more, for example $0.02 for using direct fiber.
7. Use Spot Instances (even for production workloads)
Spot Instances are a great way to save on your AWS bill. By bidding on instances AWS isn’t using, you can get up to a 90% discount on the On-Demand pricing.
The first step is qualifying your workload for Spot Instances. Is it spot-ready? Answer these questions to find out:
- How much time does your workload need to finish the job?
- Is it mission- and time-critical?
- Can it tolerate interruptions gracefully?
- Is it tightly coupled between nodes?
- Do you have a strategy in place for moving your workload when AWS pulls the plug?
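For the last question, a common building block is watching for the two-minute interruption notice. The sketch below polls the EC2 instance metadata path `spot/instance-action`, which returns 404 until an interruption is scheduled. It assumes IMDSv1 is reachable; instances that enforce IMDSv2 would also need a session token header, which is left out here. The function names are hypothetical.

```python
import json
from datetime import datetime, timezone
from urllib import error, request

INSTANCE_ACTION_URL = (
    "http://169.254.169.254/latest/meta-data/spot/instance-action"
)


def parse_instance_action(body, now=None):
    """Return (action, seconds_left) from the interruption notice JSON."""
    doc = json.loads(body)
    deadline = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return doc["action"], (deadline - now).total_seconds()


def fetch_instance_action(timeout=1.0):
    """Return the raw notice, or None when no interruption is scheduled."""
    try:
        with request.urlopen(INSTANCE_ACTION_URL, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except error.HTTPError as exc:
        if exc.code == 404:  # no interruption notice yet
            return None
        raise


# Offline example with a sample notice body and a fixed "now".
sample = '{"action": "terminate", "time": "2021-08-04T12:02:00Z"}'
fixed_now = datetime(2021, 8, 4, 12, 0, 0, tzinfo=timezone.utc)
action, seconds_left = parse_instance_action(sample, now=fixed_now)
print(action, seconds_left)
```

On a real Spot Instance, you would call `fetch_instance_action()` every few seconds and start draining work as soon as it returns a notice.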
Once you determine that your workload is a good candidate for Spot Instances, here are a few helpful pointers:
- Consider less popular Spot Instances as your chances of getting interrupted are lower.
- Check an instance’s frequency of interruption (the rate at which AWS reclaimed capacity from this instance type during the trailing month). You can check it in the AWS Spot Instance Advisor.
- Don’t be afraid of using Spot Instances for more important workloads. AWS offers special Spot Instances that guarantee uninterrupted operation for up to 6 hours. They’re a bit more expensive but you still achieve 30-50% cost savings.
- When setting your maximum price for a Spot Instance, make it equal to the On-Demand price. Otherwise, you risk having your workload interrupted when the Spot price increases.
- Set up groups called AWS Spot Fleets to boost your chances of snatching a Spot Instance. This is how you can request multiple instance types simultaneously. You can set the maximum price per hour for the entire fleet rather than for a specific spot pool (i.e. instances of the same type and with the same OS, availability zone, and network).
8. Use automation to discover better-suited instances
A specialized instance selection algorithm like the one we’ve built at CAST AI cherry-picks the most cost-effective EC2 instance types and sizes that meet your application’s requirements.
Here’s an example:
At CAST AI, we had 6 e2-standard-4 instances running in our production. We used our Cost Optimizer to check potential savings and got a recommendation to switch to 2 e2-standard-2 and 1 e2-highmem-2 instead. This choice made a lot of sense given that we use more memory, and the e2-highmem-2 instance is more cost efficient!
You can get the same eye-opening recommendations for free:
Connect your cluster to the platform and let the read-only agent analyze it. You’ll get a report to help you identify potential cost savings – you can then optimize your cluster on your own or let the solution do that for you automatically.
Wondering how it all works? Read this: How to reduce your Amazon EKS costs by half in 15 minutes