EC2 reserved vs on-demand costs (and R graphs!)

I'm sure this is covered well elsewhere online, but that's never the point of these things ;)

I was helping with some capacity planning and run-rate math today at work and found that EC2 reserved instances are much cheaper than on-demand. If this is obvious to you, chill out; I have historically never really used EC2, nor have I ever been close to budgeting. ;)

I proved this conclusion with some math, but frankly I like visualizations better, so I decided to learn R. I wrote an R script that graphs on-demand vs reserved pricing for one m1.large instance (code at the end of the post).

The result is this graph:

The graph says it all, and definitely tells me that we need to be reserving all of our instances at Loggly. It also gives me a rule of thumb:

  • If we're going to use one instance unit for at least 9 months, reserve for 3 years.
  • If we're going to use one instance unit for at least 6 months, reserve for 1 year.
  • Otherwise, stick with on-demand.

The "reserved instance" pay structure is: you pay a one-time fee in exchange for access to a reduced hourly rate.
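The rule of thumb falls out of simple break-even math: divide the one-time fee by the daily savings versus on-demand. Here's a quick sketch using the m1.large prices from the script below (prices as of this writing, of course):

```shell
# Break-even day = one-time fee / daily savings at the reserved rate.
awk 'BEGIN {
  saving_per_day = (0.34 - 0.12) * 24            # = $5.28/day saved
  printf "1-year breaks even after %.0f days\n", 910  / saving_per_day
  printf "3-year breaks even after %.0f days\n", 1400 / saving_per_day
}'
```

That comes out to roughly 172 and 265 days — about six and nine months, which is where the bullets above come from.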

This also means that our random "debug something" deployments that are shut down much of the time are probably best off as reserved instances as well, at least on the 1-year plan, since we are likely to use those deployments for more than half of a year.

A 3-year on-demand m1.large is just shy of $9000, which is roughly twice the total cost of the 3-year reserved option. Capacity plan and maybe start buying reserved instances. Make your CFO happy.
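The "twice as expensive" claim checks out with the same prices (again, rates as of this writing):

```shell
# Total 3-year cost for one m1.large under each pricing model.
awk 'BEGIN {
  days = 3 * 365
  on_demand = 0.34 * 24 * days             # no upfront fee
  reserved  = 1400 + 0.12 * 24 * days      # upfront fee + reduced hourly rate
  printf "3-year on-demand: $%.2f\n3-year reserved:  $%.2f\n", on_demand, reserved
}'
```

$8935.20 vs $4553.60 — a hair under 2x.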

And in case you were going to ask, I ran the same plot with data for EC2 "quadruple extra large" instances and the savings and break-even points were the same. I bet the rest of the prices flow similarly.

The R script follows; run it with 'R --save < yourscript.r':

# On-demand and reserved prices for an m1.large ("Large") instance
on_demand_hourly = 0.34
reserve_hourly = 0.12
reserve_1year = 910
reserve_3year = 1400

# quadruple extra large instances
#on_demand_hourly = 1.60
#reserve_hourly = 0.56
#reserve_1year = 4290
#reserve_3year = 6590

on_demand_daily = on_demand_hourly * 24
reserve_daily = reserve_hourly * 24

# Cumulative on-demand cost over one year (x axis is in days)
x <- c(0, 365)
y <- on_demand_daily * x

# Calculate day of break-even point reserve vs on-demand rates
break_1year_x = reserve_1year / (on_demand_daily - reserve_daily)
break_3year_x = reserve_3year / (on_demand_daily - reserve_daily)

png(filename = "ec2_m1large_cost.png", width = 500, height=375)
plot(x,y, type="l", col='red', xlab="", ylab="cost ($USD)")
title("EC2 cost analysis for m1.large", 
      sprintf("(days)\n1-year is cheaper than on-demand after %.0f days of usage,\n 3-year is cheaper after %.0f days", break_1year_x, break_3year_x))
text(60, 0, sprintf("on-demand=$%.2f/hour", on_demand_hourly), pos=3)

abline(reserve_1year, reserve_daily, col='green')
text(60, reserve_1year, sprintf("1-year=$%.0f+$%.2f/hour", reserve_1year, reserve_hourly), pos=3)

abline(reserve_3year, reserve_daily, col='blue')
text(60, reserve_3year, sprintf("3-year=$%.0f+$%.2f/hour", reserve_3year, reserve_hourly), pos=3)

point_y = reserve_1year + reserve_daily * break_1year_x
points(break_1year_x, point_y)
text(break_1year_x, point_y, labels = sprintf("%.0f days", break_1year_x), pos=1)

point_y = reserve_3year + reserve_daily * break_3year_x
points(break_3year_x, point_y)
text(break_3year_x, point_y, labels = sprintf("%.0f days", break_3year_x), pos=1)

# Close the PNG device so the file is flushed to disk
dev.off()

Fedora 6, utmp growth, Amazon EC2

% ls -l /var/run/[wu]tmp
-rw-rw-r-- 1 root  utmp  364366464 Aug 13 22:00 utmp
-rw-rw-r-- 1 root  utmp  1743665280 Aug 13 22:10 wtmp
That's 350 megs and 1.7 gigs. Cute. Performance sucks for anything that reads utmp (w, uptime, top, etc.). The 'init' process is spending tons of time chewing through CPU; system %cpu usage says 38% and is holding there on a mostly idle machine.
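For scale, assuming glibc's 384-byte struct utmp record (an assumption about this particular build), those file sizes divide out into record counts:

```shell
# Byte counts from the ls output above, divided by the
# (assumed) 384-byte utmp record size.
echo "utmp: $((364366464 / 384)) records"
echo "wtmp: $((1743665280 / 384)) records"
```

Both sizes divide evenly by 384, which is a decent sanity check on the record size — nearly a million utmp entries and over four and a half million wtmp entries, all from mingetty churn.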

Lots of these in /var/log/messages:

Aug 13 22:10:27 domU-XX-XX-XX-XX-XX-XX /sbin/mingetty[6843]: tty3: No such file or d
Aug 13 22:10:27 domU-XX-XX-XX-XX-XX-XX /sbin/mingetty[6844]: tty4: No such file or d
Aug 13 22:10:27 domU-XX-XX-XX-XX-XX-XX /sbin/mingetty[6845]: tty5: No such file or d
Aug 13 22:10:27 domU-XX-XX-XX-XX-XX-XX /sbin/mingetty[6846]: tty6: No such file or d
Aug 13 22:10:32 domU-XX-XX-XX-XX-XX-XX /sbin/mingetty[6847]: tty2: No such file or d
I'm not sure why /dev/tty1 is the only /dev/ttyN device, but whatever. Either way, mingetty flapping like this will flood /var/run/[ubw]tmp over the span of weeks, and eventually you end up with a system that spends most of its time parsing those files and/or restarting mingetty.

I fixed this by commenting out all tty entries in /etc/inittab and running "init q":

# Run gettys in standard runlevels
#1:2345:respawn:/sbin/mingetty tty1
#2:2345:respawn:/sbin/mingetty tty2
#3:2345:respawn:/sbin/mingetty tty3
#4:2345:respawn:/sbin/mingetty tty4
#5:2345:respawn:/sbin/mingetty tty5
#6:2345:respawn:/sbin/mingetty tty6
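Commenting out the gettys stops the growth, but the bloated files stick around. Shell redirection truncates them in place without changing the inode, so processes holding them open are unaffected; utmp is safe to zero out, while wtmp holds login history, so only zero it if you don't need that. A sketch of the trick on a scratch file (on the real box, run as root, it would be `: > /var/run/utmp` and `: > /var/run/wtmp` after stopping the respawn loop):

```shell
# Demonstrate in-place truncation on a scratch stand-in for a bloated utmp.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1024 count=64 2>/dev/null   # fake 64K of stale records
: > "$f"                                               # truncate to zero bytes
wc -c < "$f" | tr -d ' '                               # prints 0
rm -f "$f"
```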