Update: This post is still relevant for build servers, but it turns out that at the moment EC2 does not give you a serial speed advantage over most modern laptops. Run the loop below on your laptop if you don’t believe me.
I sometimes use EC2 spot instances to run my test suite. To figure out which instance type to use, I ran a quick benchmark. I mostly care about serial performance, not the number of cores, which left me with the following contenders (prices are hourly for spot instances):
- m1.large / $0.12 / “Large”
2 x 2 EC2 Compute Units; here: 2 cores of Intel Xeon E5507 @ 2.27GHz
- m2.xlarge / $0.17 / “High-Memory Extra Large”
2 x 3.25 EC2 Compute Units; here: 2 cores of Intel Xeon X5550 @ 2.67GHz
- c1.medium / $0.06 / “High-CPU Medium” [32-bit]
2 x 2.5 EC2 Compute Units; here: 2 cores of Intel Xeon E5410 @ 2.33GHz
- c1.xlarge / $0.24 / “High-CPU Extra Large”
8 x 2.5 EC2 Compute Units; here: 8 cores of Intel Xeon E5506 @ 2.13GHz
- cc1.4xlarge / $0.56 / “Cluster Compute Quadruple Extra Large”
total 33.5 (I assume 16 x 2.1) EC2 Compute Units on two Intel Xeon X5570 @ 2.93GHz
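To make the comparison concrete, here is a small Python sketch that ranks the contenders by EC2 Compute Units per core (a rough proxy for serial speed) and works out the spot price per total Compute Unit. The prices and unit counts are the ones listed above; the 16-core split for cc1.4xlarge is the same assumption as in the list, and the function names are made up for illustration.

```python
# (name, spot price in USD/hour, cores, EC2 Compute Units per core)
INSTANCES = [
    ("m1.large", 0.12, 2, 2.0),
    ("m2.xlarge", 0.17, 2, 3.25),
    ("c1.medium", 0.06, 2, 2.5),
    ("c1.xlarge", 0.24, 8, 2.5),
    ("cc1.4xlarge", 0.56, 16, 33.5 / 16),  # assumption: 33.5 units spread over 16 cores
]


def rank_by_serial_ecu(instances):
    """Sort by Compute Units per core, highest first (ties keep list order)."""
    return sorted(instances, key=lambda row: row[3], reverse=True)


def price_per_total_ecu(price, cores, ecu_per_core):
    """Hourly price divided by the instance's total Compute Units."""
    return price / (cores * ecu_per_core)


if __name__ == "__main__":
    for name, price, cores, ecu in rank_by_serial_ecu(INSTANCES):
        cost = price_per_total_ecu(price, cores, ecu)
        print(f"{name:12s} {ecu:5.2f} ECU/core  ${cost:.4f} per ECU-hour")
```

Per-core units only predict serial speed on paper, which is why the actual benchmarks still matter.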
I ran two benchmarks. First, the following loop:
Second, my Rails app’s test suite, with Selenium on a headless X server, unparallelized.
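As a rough illustration, a CPU-bound serial loop of this kind could look like the Python sketch below. It is a hypothetical stand-in, not the loop the timings were taken with; `serial_loop` and the iteration count are invented for the example.

```python
import time


def serial_loop(iterations=10_000_000):
    """Pure integer arithmetic in a single thread, so only one core is exercised."""
    total = 0
    for i in range(iterations):
        total += i * i % 7
    return total


if __name__ == "__main__":
    start = time.perf_counter()
    serial_loop()
    elapsed = time.perf_counter() - start
    print(f"serial loop took {elapsed:.2f} s")
```

Because the loop never touches disk or network, its wall-clock time mostly reflects single-core speed.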
Here are the timings:
The result seems pretty clear: For serial computation, the high-memory instances (m2.xlarge and above) are the fastest.
(This of course is not because of the RAM, but because their CPUs happen to have the highest serial computation power, as you might expect from the EC2 Compute Unit count provided by Amazon.)
- Repeated runs varied by about +/- 1 on the loop and +/- 2 on the test suite, so at this point I am reasonably confident in the results.
- Just to make sure, I launched an m2.4xlarge instance, and as expected, it has the same CPU model and the same timings as m2.xlarge. The only difference is that it comes with 8 cores instead of 2.
- I’m not sure why the cluster-compute instance (cc1.4xlarge) is so slow on my test suite compared to the loop. I launched two different instances in case I had a “bad apple”, and both came out the same. Perhaps it’s some weird timing issue in my test. More likely, I think, the CPUs in the cluster-compute instances just happen to be very good at this particular loop. (Unlike the loop timings, the slower test suite timing is in line with what you might expect from Amazon’s EC2 Compute Unit count.)
- All benchmarks were run on the official Ubuntu 11.04 AMIs, 32-bit for c1.medium, 64-bit for the rest, with EBS for the cc1.4xlarge instance and instance storage for the rest.
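If you want to check the run-to-run spread on your own machine, something like the following Python sketch would do. It is only one possible way to measure it, not the procedure used for the numbers above; `measure` and `summarize` are hypothetical helper names.

```python
import statistics
import time


def measure(func, runs=5):
    """Call func() `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, half_spread), where half_spread is (max - min) / 2."""
    half_spread = (max(durations) - min(durations)) / 2
    return statistics.median(durations), half_spread


if __name__ == "__main__":
    # Placeholder workload; swap in the benchmark you actually care about.
    median, spread = summarize(measure(lambda: sum(range(1_000_000))))
    print(f"{median:.3f} s +/- {spread:.3f} s")
```

A spread that is small compared to the gap between two instance types is what makes a ranking trustworthy.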