Testing the Waters of AWS EC2 C5D Instances

Totalcloud.io
November 9, 2018

Ever since Amazon announced AWS EC2 C5D instances, we, as AWS practitioners, have been digging into their technical details. We found some interesting things, and we’d like to share our findings for the greater good.

C5D Intro: AWS EC2 Instances with Local NVMe Storage

For those unaware, AWS EC2 C5D instances are compute-optimized instances that come with high-performance local NVMe block storage. Introduced in May 2018, they are said to be ideal for applications that need access to high-speed, low-latency local storage, which has made them a favorite in the media vertical.

Read on for the details below, or skip straight to the ‘Tests’ section of this post to view the benchmarking observations.

Highlights:

#1 | Distinctive feature: Local NVMe-based SSD block level storage physically connected to the host server.

#2 | Powered by 3.0 GHz Intel Xeon Platinum 8000-series processors, the same as the rest of the EC2 C5 family.

#3 | Combines the CPU muscle of the C5 family with the disk performance (IOPS) of the I3 family, making it ideal for database workloads that are hard on the CPU, such as block compression.

#4 | 25% to 50% improvement in price-performance over C4 instances.

#5 | While input and output files are typically stored in Amazon S3, intermediate files are expendable, which makes the local storage a good fit for them.

#6 | The local storage does not persist: its contents are lost when the instance is stopped or terminated, so it suits intermediate files rather than long-term storage.

#7 | Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.

#8 | The local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted; a quick way to identify these devices is sketched a little further down.

#9 | Cheaper (for now) than a regular C5 large, but only as a Spot instance. Spot requests for c5d.large start at $0.0324 per hour across all regions, whereas a c5.large on-demand instance costs $0.085 per hour. This is likely due to slow uptake so far; as C5D instances become more popular, the Spot prices might increase.

Here’s the detailed pricing for all C5D instance sizes in N. Virginia:
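Coming back to highlight #8 for a moment, here is a minimal Python sketch for identifying the local NVMe disk once you are logged into the instance. It assumes Amazon’s usual NVMe controller model strings (the same names used later in this post, “Amazon Elastic Block Store” and “Amazon EC2 NVMe Instance Storage”); the post doesn’t prescribe a particular method, so treat this as one convenient option.

```python
import glob
import os

# On Nitro-based instances such as C5D, both EBS volumes and the local SSD
# appear as NVMe devices, so we tell them apart by the controller's
# reported model string.
for model_path in sorted(glob.glob("/sys/class/nvme/nvme*/model")):
    with open(model_path) as f:
        model = f.read().strip()
    controller = os.path.basename(os.path.dirname(model_path))  # e.g. "nvme1"
    kind = "instance store" if "Instance Storage" in model else "EBS"
    print(f"/dev/{controller}n1  ->  {model} ({kind})")
```

The instance-store device still needs a filesystem (for example via mkfs) and a mount point before it can hold any benchmark files.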

Applications:

  • Batch and log processing
  • Apps that make heavy use of caches and scratch files
  • Image manipulation
  • Distributed and/or real-time analytics
  • High-performance computing (HPC)
  • Ad serving
  • Highly scalable multiplayer gaming
  • Video encoding and other forms of media processing that require large amounts of I/O to temporary storage

Note that batch and log processing run in a race-to-idle model, flushing volatile data to disk as fast as possible in order to make full use of compute resources. More details on availability, regions, sizes, and purchase models are available here.

Tests:

To test the performance of C5D instances, we launched a C5D.large instance in the us-east-1 region with a 9 GB Amazon Elastic Block Store volume plus the 50 GB of Amazon EC2 NVMe Instance Storage that comes with the instance.

In the same region and availability zone, we launched a C5.large instance with a 54 GB EBS volume and a T2.small instance for comparison, with no NACLs in the way and open security groups between the instances. All instances ran Amazon Linux 2 (ami-0a5e707736615003c), patched up to date as of October 2018.
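For readers who want to reproduce the setup, a comparable test instance can be launched with boto3. This is a minimal sketch rather than the exact steps we took: the AMI ID, instance type, and 9 GB EBS root volume match the setup above, while the gp2 volume type, key pair name, and security group ID are placeholder assumptions you would replace with your own.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0a5e707736615003c",   # Amazon Linux 2 AMI used in these tests
    InstanceType="c5d.large",          # ships with 50 GB of local NVMe storage
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[
        {
            # 9 GB root volume, matching the setup above (gp2 type assumed)
            "DeviceName": "/dev/xvda",
            "Ebs": {"VolumeSize": 9, "VolumeType": "gp2", "DeleteOnTermination": True},
        }
    ],
    KeyName="benchmark-key",                     # placeholder key pair
    SecurityGroupIds=["sg-0123456789abcdef0"],   # placeholder open security group
)
print("Launched", response["Instances"][0]["InstanceId"])
```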

Before we walk you through the benchmarking results, let’s have a look at the pricing comparison between C5D.Large, C5.Large and T2.Small on-demand instances:

We ran the Sysbench tool to calculate all prime numbers up to 20,000, comparing the pure compute performance of the C5D instance with the C5 instance as well as the most popular instance type, T2.

Here are the kinds of tests we ran on all three instances (representative command invocations are sketched after the list):

  • To run CPU tests, we used the Sysbench tool with a single-threaded CPU test.
  • To measure disk performance, we used the Sysbench file IO benchmark (random read/write). As these instances are capable of bursting, we ran a few tests to drain the IO balance. Before draining it, we used ioping to test latency. Then, using fio, we attempted to drain the balance and measure performance as the burst bucket was emptied and refilled.
  • To benchmark the performance of a typical file-server workload, we used the Blogbench score.
  • To run network tests and measure latency, we used the average latency of 100 ICMP packets. Additionally, we used iperf to test the bandwidth between the instances.
  • To get a slightly more “real-world” test, we used the Phoronix Test Suite to run a compiler benchmark (compiling the Linux kernel). This exercises real-world disk IO across a filesystem alongside general system and CPU performance. Disk IO is often a constraining factor in machine performance, but it very much depends on the particular workload you’re running.
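The post doesn’t record the exact flags we passed to each tool, but the Python sketch below shows representative invocations of the CPU, disk, and network benchmarks described above. The NVMe mount point and the peer instance’s private IP are placeholders; the Blogbench and Phoronix compile tests use their own front ends and are omitted here.

```python
import subprocess

NVME_DIR = "/mnt/nvme"   # placeholder mount point for the formatted local NVMe disk
PEER_IP = "10.0.0.12"    # placeholder private IP of the peer instance

def run(cmd, cwd=None):
    """Run one benchmark command, echoing it first."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

# CPU: single-threaded calculation of all primes up to 20,000 (sysbench 1.0 syntax).
run(["sysbench", "cpu", "--cpu-max-prime=20000", "--threads=1", "run"])

# Disk: random read/write file IO with sysbench, test files laid out on the NVMe volume.
run(["sysbench", "fileio", "--file-total-size=10G", "prepare"], cwd=NVME_DIR)
run(["sysbench", "fileio", "--file-total-size=10G",
     "--file-test-mode=rndrw", "--time=120", "run"], cwd=NVME_DIR)
run(["sysbench", "fileio", "--file-total-size=10G", "cleanup"], cwd=NVME_DIR)

# Disk latency before draining the IO burst balance.
run(["ioping", "-c", "20", NVME_DIR])

# Sustained random IO with fio to drain the burst bucket and watch it refill.
run(["fio", "--name=drain", "--directory=" + NVME_DIR, "--rw=randrw",
     "--bs=4k", "--size=4G", "--ioengine=libaio", "--direct=1",
     "--time_based", "--runtime=600"])

# Network: average latency over 100 ICMP packets, then bandwidth with iperf
# (an iperf server must already be running on the peer instance).
run(["ping", "-c", "100", PEER_IP])
run(["iperf", "-c", PEER_IP])
```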

Here are the results:

Here are Our Key Observations:

#1 | The CPUs all come out at the same score, as all three instances have CPUs of the same capability.

#2 | The C5D vastly outperforms the C5 and T2 in file IO thanks to the NVMe disk, with nearly 4 times the read and write capability of a C5 instance. AWS uses the same underlying disk technology for C5 and T2, hence the results for C5 and T2 are similar.

#3 | The disk latency measure (ioping) shows that the NVMe disk has half the latency of the C5’s. This is a dramatic difference and has significant implications for latency-sensitive workloads such as real-time analytics.

#4 | The network measures are essentially the same for C5D and C5 instances, as expected, but significantly improved over the T2 instance.

#5 | The compile time is interesting. The task hits a CPU bottleneck at this point; although the C5D is faster every time, it’s hard to find a repeatable real-world task that really stretches the capability of the new NVMe disk.

#6 | The test showed that the CPU maxes out at 100% while compiling Linux, and even then the C5D was a little faster. Even though compiling Linux is a fairly disk-intensive operation, the CPU was the ultimate bottleneck of the task in our test case.

To add to these tests: the C5 instance took 8.10 seconds to unpack the Linux kernel, while the C5D took 8.06 seconds.

The Wrap-Up

We conducted several tests over a week to benchmark the C5D.large instance’s performance. We hope these results give you a realistic perspective on the new C5D instances.

If you’d like to know more about these test results, write to us at tech@totalcloud.io or tweet to us at @totalcloudio. Also, do share your views on this post. We would love to hear your thoughts.

Originally published at blog.totalcloud.io on November 9, 2018.

