TvE 2100

At 2100 feet above Santa Barbara

Network Performance Within Amazon EC2 and to Amazon S3

What is the expected network performance between Amazon EC2 instances? What is the available bandwidth between Amazon EC2 and Amazon S3? How about in and out of EC2? These are questions that we get very regularly. While we more or less know the answers from our own experience over the past 15 months, we haven’t really conducted a clean experiment to put more precise numbers behind them. Since it’s about time, I’ve decided to run some ‘informal’ experiments to measure the network performance available around EC2.

Before we start, though, a few warnings (disclaimer). Like in the drug commercials… results may vary! :) The results presented here come from just a couple of EC2 instances and should therefore be interpreted only as “typical/possible” results. The only claim we make is that these are the results we got, and we expect them to be an indication of the performance available at this point. Amazon can make significant hardware and architectural changes that could greatly alter the results (hopefully only making them better ;) ).

Let’s start with some experiments to measure the performance from EC2 instance to instance.

Performance between Amazon EC2 instances

In this first experiment I boot a couple of EC2 large instances. I make one of them a ‘server’ by setting up Apache and copying some large files onto it, and use the other instance as a client by issuing HTTP requests with curl. All file transfers are served from memory-cached blocks, so there’s virtually no disk I/O involved in them.

So, this experimental setup consists of:

  • 2 Large EC2 instances:
      • Server: Apache (non-SSL) serving 1-2GB files (cached in memory)
      • Client: curl retrieving the large files from the server
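The measurements in this post were made with plain curl. For readers who want to repeat this kind of test, the client side can be sketched in Python instead; this is only an illustration, not the method used here. It swaps in the standard `urllib` module and threads to run several downloads at once and report per-stream and aggregate rates. The function names, the 1 MiB read size, and the convention 1MB = 10^6 bytes are all assumptions made for the sketch.

```python
import threading
import time
import urllib.request

BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def download(url, results, index, chunk_size=1 << 20):
    """Stream one URL, discard the body, and record (bytes, seconds)."""
    start = time.monotonic()
    received = 0
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            received += len(chunk)
    elapsed = max(time.monotonic() - start, 1e-9)  # guard against a zero interval
    results[index] = (received, elapsed)


def parallel_throughput(url, streams):
    """Run `streams` concurrent downloads of `url`.

    Returns (per_stream_rates, aggregate_rate), both in MB/s.
    """
    results = [None] * streams
    threads = [
        threading.Thread(target=download, args=(url, results, i))
        for i in range(streams)
    ]
    wall_start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    wall = max(time.monotonic() - wall_start, 1e-9)

    per_stream = [size / secs / BYTES_PER_MB for size, secs in results]
    aggregate = sum(size for size, _ in results) / wall / BYTES_PER_MB
    return per_stream, aggregate
```

Calling `parallel_throughput(url, 3)` against a server holding a large cached file would correspond to the "3 curls" case. Python threads are adequate here because the work is I/O bound, but at rates near 100MB/s the interpreter itself may become a limit, which is one reason curl is the more trustworthy tool for the real measurement.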

These two instances appear to be separated by an intermediate router, so they don’t seem to be on the same host. This is the traceroute between them:

traceroute to 10.254.39.48 (10.254.39.48), 30 hops max, 40 byte packets
 1  dom0-10-254-40-170.compute-1.internal (10.254.40.170)  0.123 ms  0.102 ms  0.075 ms
 2  10.254.40.2 (10.254.40.2)  0.380 ms  0.255 ms  0.246 ms
 3  dom0-10-254-36-166.compute-1.internal (10.254.36.166)  0.278 ms  0.257 ms  0.231 ms
 4  domU-12-31-39-00-20-C2.compute-1.internal (10.254.39.48)  0.356 ms  0.331 ms  0.319 ms

So what are the results we got? Well, using a single curl file retrieval, we were able to get around 75MB/s consistently. Adding more curls uncovered even more network bandwidth, reaching close to 100MB/s. Here are the results:

  • 1 curl -> 75MB/s (cached, i.e., no I/O on the apache server)
  • 2 curls -> 88MB/s (2x44MB/s) (cached)
  • 3 curls -> 96MB/s (33+35+28 MB/s) (cached)

I did not repeat the experiments using SSL. However, I did some additional tests transferring files with ‘scp’ between the same instances. Those tests seem to max out at around 30-40MB/s regardless of the degree of parallelism, because the CPU becomes the bottleneck.

This is really nice: basically we’re getting close to a full gigabit between the instances! Now, let’s take a look at what we get when EC2 instances talk to S3.
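To relate the MB/s figures above to a gigabit link, the unit conversion can be written out. This is a small sketch; the post does not say whether "MB" means 10^6 or 2^20 bytes, so the function takes that as a parameter and both readings are shown. The function name is made up for illustration.

```python
def mbytes_to_mbits(mb_per_s, bytes_per_mb=1_000_000):
    """Convert a rate in MB/s to Mbit/s (1 Mbit = 10^6 bits)."""
    return mb_per_s * bytes_per_mb * 8 / 1_000_000


# Instance-to-instance results reported above (MB/s).
results = [("1 curl", 75), ("2 curls", 88), ("3 curls", 96)]

for label, rate in results:
    decimal = mbytes_to_mbits(rate)            # assuming 1MB = 10^6 bytes
    binary = mbytes_to_mbits(rate, 1 << 20)    # assuming 1MB = 2^20 bytes
    print(f"{label}: {decimal:.0f} Mbit/s (decimal MB), {binary:.0f} Mbit/s (binary MB)")
```

Under the decimal reading, 96MB/s is 768 Mbit/s; under the binary reading it is about 805 Mbit/s. Either way this is payload throughput, so the rate on the wire is somewhat higher once protocol headers are counted.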

Performance between Amazon EC2 and Amazon S3

This experiment is similar to the previous one in the sense that I use curl to download files from, or upload files to, the server. The server, however, is s3.amazonaws.com. (Still using HTTP and HTTPS since S3 is a REST service.)

So, this experimental setup consists of:

  • 1 Large EC2 instance:
      • curl to retrieve or upload S3 objects to/from S3
  • Amazon S3: i.e., s3.amazonaws.com
      • serving (or storing) 1GB files

The trace to the selected s3 server looks like:

traceroute to s3.amazonaws.com (72.21.206.171), 30 hops max, 40 byte packets
 1  dom0-10-252-24-163.compute-1.internal (10.252.24.163)  0.122 ms  0.150 ms  0.209 ms
 2  10.252.24.2 (10.252.24.2)  0.458 ms  0.348 ms  0.409 ms
 3  othr-216-182-224-9.usma1.compute.amazonaws.com (216.182.224.9)  0.384 ms  0.400 ms  0.440 ms
 4  othr-216-182-224-15.usma1.compute.amazonaws.com (216.182.224.15)  0.990 ms  1.115 ms  1.070 ms
 5  othr-216-182-224-90.usma1.compute.amazonaws.com (216.182.224.90)  0.807 ms  0.928 ms  0.902 ms
 6  othr-216-182-224-94.usma1.compute.amazonaws.com (216.182.224.94)  151.979 ms  152.001 ms  152.021 ms
 7  72.21.199.32 (72.21.199.32)  2.050 ms  2.029 ms  2.087 ms
 8  72.21.205.24 (72.21.205.24)  2.654 ms  2.629 ms  2.597 ms
 9  * * *

So, although the server itself doesn’t respond to ICMP, the trace shows that there’s a significant path to be traversed.

Let’s start with downloads, more specifically with HTTPS downloads. The first thing I noticed is that the performance of a single download stream is quite good (around 12.6MB/s). It is also interesting that, while download performance doesn’t scale linearly with the number of concurrent curls, a large instance can reach higher download speeds by downloading several objects in parallel. The maximum performance seems to flatten out at around 50MB/s. At that point the large instance is running at around 22% user plus 20% system CPU, which, given the SSL encryption going on, is nice!

Here are the raw HTTPS numbers:

  • 1 curl -> 12.6MB/s
  • 2 curls -> 21.0MB/s (10.5+10.5 MB/s)
  • 3 curls -> 31.3MB/s (10.2+10.0+11.1 MB/s)
  • 4 curls -> 37.5MB/s (9.0+9.1+9.8+9.6 MB/s)
  • 6 curls -> 46.6MB/s (8.0+7.8+7.6+7.9+7.8+7.5 MB/s)
  • 8 curls -> 49.8MB/s (6.0+6.3+7.0+6.1+6.0+5.9+6.2+6.3 MB/s)

The SSL encryption uses RC4-MD5, so there is a fair amount of work for both S3 and the instance to do. The next natural question is whether there’s more to gain when talking to S3 without SSL. Unfortunately, the answer is no. While the load on the client drops significantly (from 22% to 5% user and from 20% to 14% system when using 8 curls), the available bandwidth without SSL is basically the same (i.e., the differences fall within the margin of error). This leads me to believe that in either case the instance is not the bottleneck.

Here are the same data points for non-SSL (HTTP) downloads:

  • 1 curl -> 10.2MB/s
  • 2 curls -> 20.0MB/s (10.1+9.9 MB/s)
  • 3 curls -> 29.6MB/s (10.0+9.7+9.9 MB/s)
  • 4 curls -> 37.6MB/s (9.1+9.4+9.4+9.7 MB/s)
  • 6 curls -> 46.5MB/s (7.8+7.8+7.6+7.9+7.8+7.6 MB/s)
  • 8 curls -> 51.5MB/s (6.6+6.4+6.6+6.3+6.2+6.2+6.7+6.3 MB/s)
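One way to put a number on "doesn’t scale linearly" is to compare each aggregate rate with what perfect scaling would give, i.e. the number of streams times the single-stream rate. The sketch below does that for the HTTPS and HTTP download totals listed above; the dictionary layout and the function name are just illustrative choices.

```python
# Aggregate download rates from the post, in MB/s, keyed by number of curls.
https_downloads = {1: 12.6, 2: 21.0, 3: 31.3, 4: 37.5, 6: 46.6, 8: 49.8}
http_downloads = {1: 10.2, 2: 20.0, 3: 29.6, 4: 37.6, 6: 46.5, 8: 51.5}


def scaling_efficiency(results):
    """Return aggregate / (streams * single-stream rate) for each stream count."""
    single = results[1]
    return {n: total / (n * single) for n, total in results.items()}


for name, data in [("HTTPS", https_downloads), ("HTTP", http_downloads)]:
    for n, eff in scaling_efficiency(data).items():
        print(f"{name} {n} curl(s): {data[n]:.1f} MB/s, {eff:.0%} of linear")
```

With 8 streams the HTTPS downloads reach about 49% of linear scaling and the HTTP downloads about 63%. The HTTP figure looks better mainly because its single-stream baseline is lower; the absolute totals are nearly identical.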

Interestingly enough, a single non-SSL stream seems to get less performance than an SSL one (10.2MB/s vs. 12.6MB/s). I didn’t check whether the SSL stream uses compression, which may be one reason for this.

So how about uploads? I’ll use the same setup, but with curl uploading a 1GB file through a signed S3 URL. The first interesting thing to notice from the results is that a single upload stream gets roughly half the bandwidth of a single download stream (i.e., 6.9MB/s vs. 12.6MB/s). However, the good news is that the upload bandwidth still scales when using multiple streams.

Here are the raw numbers for SSL uploads:

  • 1 curl -> 6.9MB/s
  • 2 curls -> 14.2MB/s
  • 4 curls -> 23.6MB/s
  • 6 curls -> 37.6MB/s
  • 8 curls -> 48.0MB/s
  • 12 curls -> 53.8MB/s
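To see where the upload scaling starts to level off, it helps to look at how much bandwidth each additional stream buys between consecutive measurements. This is a small sketch over the SSL upload numbers above; the function name is made up for illustration.

```python
# SSL upload results from the post: (number of curls, aggregate MB/s).
uploads = [(1, 6.9), (2, 14.2), (4, 23.6), (6, 37.6), (8, 48.0), (12, 53.8)]


def marginal_gains(points):
    """For each consecutive pair, return (from_n, to_n, MB/s gained per added stream)."""
    gains = []
    for (n0, rate0), (n1, rate1) in zip(points, points[1:]):
        gains.append((n0, n1, (rate1 - rate0) / (n1 - n0)))
    return gains


for n0, n1, gain in marginal_gains(uploads):
    print(f"{n0} -> {n1} curls: +{gain:.2f} MB/s per added stream")
```

Up to 8 streams, each added stream contributes between 4.7 and 7.3MB/s, which is close to the single-stream rate. Going from 8 to 12 streams adds only about 1.45MB/s per stream, which suggests the ceiling is somewhere in the low 50s of MB/s, similar to the download case.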

In other words: give me some data and I’ll fill up S3 in a hurry :-). So what about non-SSL uploads? Well, that turned out to be an interesting one… I’ve seen a single curl upload achieve the same performance as a download (that is, 1 curl upload with no SSL can achieve 12.6MB/s). But over quite a number of experiments I’ve seen non-SSL uploads exhibit a weird behavior where some of them mysteriously slow down and linger for a while almost idle (i.e., at a very low bandwidth). The end result is that the average bandwidth at the end of the run varies by a factor of almost 2x. I’m still investigating to see what is going on.

Summary

The bottom line from these experiments is that Amazon is providing very high throughput around EC2 and S3. Results were readily reproducible (except for the problem described with the non-SSL uploads) and definitely support high-bandwidth, high-volume operation. Clearly, if you put together a custom cluster in your own datacenter you can wire things up with more bandwidth, but for a general-purpose system this is a ton of bandwidth all around.