soam's home


EC2 Instance CPU Types

Amazon provides a whole variety of instance types for EC2 but lists their CPU capabilities via “EC2 Compute Units” where

One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.

That’s somewhat helpful for m1.small but what about c1.xlarge which has something like 20 EC2 compute units? How to map that to the real world? Fortunately, I found a cloud computing presentation from Constantinos Evangelinos and Chris Hill from MIT/EAPS which contained mappings of most of the common ec2 instance types. It’s from 2008 but should still be applicable. Drawing from the slides, we have:

  • m1.small => (1 virtual core with 1 EC2 Compute Unit) => half core of a 2 socket node, dual core AMD Opteron(tm) Processor 2218 HE, 2.6GHz
  • m1.large => (2 virtual cores with 2 EC2 Compute Units each) => half of a 2 socket node, dual core AMD Opteron(tm) Processor 270, 2.0GHz
  • m1.xlarge => (4 virtual cores with 2 EC2 Compute Units each) => one 2 socket node, dual core AMD Opteron(tm) Processor 270, 2.0GHz
  • c1.medium => (2 virtual cores with 2.5 EC2 Compute Units each) => half of a 2 socket node, quad core Xeon E5345, 2.33GHz
  • c1.xlarge => (8 virtual cores with 2.5 EC2 Compute Units each) => one 2 socket node, quad core Xeon E5345, 2.33GHz
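As a quick sanity check on the list above, here is a minimal Python sketch that multiplies virtual cores by ECUs per core to get each instance type's total. The table simply restates the bullets; the `ghz_range` helper is a hypothetical convenience that applies Amazon's "1.0-1.2 GHz 2007 Opteron or Xeon" definition linearly, which is only a rough rule of thumb.

```python
# (virtual cores, ECUs per core), copied from the list above
INSTANCE_TYPES = {
    "m1.small": (1, 1.0),
    "m1.large": (2, 2.0),
    "m1.xlarge": (4, 2.0),
    "c1.medium": (2, 2.5),
    "c1.xlarge": (8, 2.5),
}


def total_ecu(instance_type):
    """Total EC2 Compute Units for an instance type."""
    cores, ecu_per_core = INSTANCE_TYPES[instance_type]
    return cores * ecu_per_core


def ghz_range(instance_type, low=1.0, high=1.2):
    """Rough aggregate 2007-era GHz implied by the ECU definition.

    Assumption: ECUs add up linearly across cores, which ignores
    memory bandwidth, cache sizes and virtualization overhead.
    """
    ecu = total_ecu(instance_type)
    return (ecu * low, ecu * high)


for name in INSTANCE_TYPES:
    print(name, total_ecu(name), ghz_range(name))
```

For c1.xlarge this gives 8 x 2.5 = 20 ECUs, which is where the "something like 20" figure comes from.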

MapReduce vs MySQL

Brian Aker talks about the post Oracle MySQL world in this O’Reilly Radar interview. Good stuff. One section though caused me to raise an eyebrow:

MapReduce works as a solution when your queries are operating over a lot of data; Google sizes of data. Few companies have Google-sized datasets though. The average sites you see, they’re 10-20 gigs of data. Moving to a MapReduce solution for 20 gigs of data, or even for a terabyte or two of data, makes no sense. Using MapReduce with NoSQL solutions for small sites? This happens because people don’t understand how to pick the right tools.

Hmm. First of all, just because you have 10-20GB of data right now doesn’t mean you’ll have 10-20GB of data in the future. From my experience, once you start getting into this range of data, scaling mysql becomes painful. More likely than not, your application has absolutely no sharding/distributed processing capability built into your mysql setup, so at this point, your choices are:

  1. vertical scaling => bigger boxes, RAID/SSD disks etc.
  2. introduce sharding into mysql, retrofit your application to deal with it
  3. bite the bullet and offload your processing into some other type of setup such as MapReduce

(1) is merely kicking the can down the road.

(2) involves maintaining more mysql servers, worrying about sharding schemes, setting up a middleman to deal with partitioning, data collation etc.

In both (1) and (2), you still have to worry about many little things in mysql such as setting up replication, setting up indexes for tables, tuning queries etc. And in (2), you’ll have more servers running. While it is true mysql clustering exists, as does native partitioning support in newer mysql versions, setting that stuff up is still painful and it’s not clear whether the associated maintenance overhead is worth the performance you get.
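To make the "worrying about sharding schemes" part of (2) concrete, here is a minimal Python sketch of the naive approach: hash a key and take it modulo the number of shards. The host names and the `shard_for` and `remapped_fraction` helpers are hypothetical, not taken from any real setup. The second function estimates how many keys land on a different server when you add one more shard, which is exactly the kind of retrofit headache that makes this option painful.

```python
import hashlib

# Hypothetical shard hosts, for illustration only
SHARDS = ["db0.example.internal", "db1.example.internal", "db2.example.internal"]


def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def remapped_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard list changes."""
    keys = list(keys)
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    bigger = SHARDS + ["db3.example.internal"]
    # With plain modulo hashing, most keys move when a shard is added.
    print(remapped_fraction(range(10000), SHARDS, bigger))
```

Schemes such as consistent hashing or a lookup table reduce the reshuffling, but either way the application (or a middleman) has to know about it.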

It’s not a surprise more and more people are turning to (3). A hadoop cluster provides more power out of the box than a sharded mysql setup, and a more brain dead scalable path. Just add more machines! Yes, there are configuration issues involved in a hadoop cluster as well but I think they’re far easier to deal with than the equivalent mysql setup. The main drawback here is (3) only works if your processing requirements are batch based, not real time.

It is true that not all of the technologies in the Hadoop ecosystem outside of Hadoop itself are all that mature. BigTable-style solutions like HBase are still not that easy to set up and run, and Pig is still evolving, but Cascading is an amazing library. Additionally, if one uses Amazon’s cloud products judiciously, it may actually be possible to do (3) really cheap (as opposed to (2) which requires more and bigger machines).

How? Store persistent files in S3 (logs etc). Use Elastic MapReduce periodically so you are not running a dedicated hadoop cluster. Use SimpleDB for your db needs. SimpleDB has limitations (2500 limit on selects, restricted attributes, strings only) but more and more people (such as Netflix) are using it for high volume applications. Furthermore, all of these technologies are enabling single entrepreneurs to do things like crawl and maintain big chunks of the web so that they can build interesting new applications on top, something that would have been too cost prohibitive in the older MySQL world. I hope to write more about it soon.

Brave New World Of Oversharing

From the New York Times:

“Ten years ago, people were afraid to buy stuff online. Now they’re sharing everything they buy,” said Barry Borsboom, a student at Leiden University in the Netherlands, who this year created an intentionally provocative site called Please Rob Me. The site collected and published Foursquare updates that indicated when people were out socializing — and therefore away from their homes.

In this day and age of Too Much Information (TMI), the only real security, it would seem, would be the “security through obscurity” variety. If everyone flooded the web about the minutiae of their day to day lives, chances are it’s going to be tough to single out anyone in particular. That approach, however, puts early adopters at risk. No longer would they be just a face in the crowd. Comes with the territory, I guess.

That being said, websites making said TMI possible should probably realize there are still some boundaries best left uncrossed.

Recruiter LOL

[Image: linkedin]

The picture says it all really. For the record, the full subject line from the recruiter was “Data Analytics Architect Opportunity – NOT SPAM.”

EC2 Reserved Instance Breakeven Point 2.0

After Amazon’s reserved instance pricing announcement last year, there were quite a few folks writing about the breakeven point for your ec2 instance, i.e. the length of time you’d need to run your instance continuously before the reserved pricing turned out to be cheaper than the standard pay-as-you-go scheme. Looking around, I believe the general consensus was that it would take around 4643 hours or 6.3 months. See here, here and here, for example.

Around late October of last year, Amazon announced even cheaper pricing for their ec2 instances. However, not seeing any newer breakeven numbers computed in the wake of lower prices, I decided to post some of my own. These are for one year reserved pricing for Amazon’s US-N-Virginia data center. All data is culled from the AWS ec2 page.

As we can see, the break even numbers have dropped quite a bit – down to 4136 hours on most of the instance types, a drop of roughly 500 hours. That translates to better pricing 3 weeks earlier than before, in about 5.7 months. Interestingly enough, the high memory instances have slightly earlier break even points (by about 50 hours or so). Not quite sure why.
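The breakeven arithmetic itself is just the upfront fee divided by the hourly saving. Below is a minimal Python sketch. The dollar figures are assumptions about one-year m1.small Linux pricing before and after the price cut; they are not taken from this post and are used only because they reproduce the 4643 and 4136 hour figures quoted above.

```python
def breakeven_hours(upfront_fee, reserved_hourly, on_demand_hourly):
    """Hours of continuous use after which a reserved instance is cheaper.

    Solves: upfront_fee + reserved_hourly * h == on_demand_hourly * h
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        raise ValueError("reserved hourly rate must be below the on-demand rate")
    return upfront_fee / saving_per_hour


# Assumed prices (USD), for illustration only
old = breakeven_hours(upfront_fee=325.00, reserved_hourly=0.03, on_demand_hourly=0.10)
new = breakeven_hours(upfront_fee=227.50, reserved_hourly=0.03, on_demand_hourly=0.085)

print(round(old), round(new))      # 4643 4136
print((old - new) / 24 / 7)        # difference in weeks
```

Plugging in the numbers for any other instance type from the AWS pricing page works the same way.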

Netflix + AWS

Recently, I discovered Practical Cloud Computing, a blog run by Siddharth Anand, an architect in Netflix’s cloud infrastructure group. In a recent post, he writes:

I was recently tasked with fork-lifting ~1 billion rows from Oracle into SimpleDB. I completed this forklift in November 2009 after many attempts. To make this as efficient as possible, I worked closely with Amazon’s SimpleDB folks to troubleshoot performance problems and create new APIs.

Why would they need something like this? From another entry titled, Introducing the Oracle-SimpleDB Hybrid, Siddharth writes:

My company would like to migrate its systems to the cloud. As this will take several months, the engineering team needs to support data access in both the cloud and its data center in the interim. Also, the RDBMS system might be maintained until some functionality (e.g. Backup-Restore) is created in SimpleDB.

To this aim, for the past 9 months, I have been building an eventually-consistent, multi-master data store. This system is comprised of an Oracle replica and several SimpleDB replicas.

In other words, Netflix is planning to move many of its constituent services into the AWS cloud starting with their main data repository. This sounded like a pilot project, albeit a massive one, and understandably so given the size of Netflix. If this went smoothly the immediate upside would be Netflix not spending a fortune on Oracle licenses and maintenance. In addition, AWS would have proved itself to be able to handle Netflix’s scale requirements.

Evidently things went well as I came across a slide deck detailing Netflix’s cloud usages further:

Fascinating stuff. From the deck, it appears that in addition to using SimpleDB for data storage, Netflix is using many AWS components for its online streaming setup. Specifically:

  • ec2 for encoding
  • S3 for storing source and encoded files
  • SQS for application communication

I also saw references to EBS (Elastic Block Storage), ELB (Elastic Load Balancing) and EMR (Elastic Map Reduce).

I think for the longest time, AWS and other services of its ilk were viewed as resources used by startups (such as ourselves) in an effort to ramp up to scale quickly so as to go toe to toe with the big guys. It’s interesting to see the big guys get in on the act themselves.

Replacing my MacBook Pro Drive

Back in grad school, my thesis advisor Brian Smith, to his eternal credit, really put the systems into computer systems where our research was concerned. He also placed the same emphasis on our group and how we dealt with our own computers. We joked that much like Marine Boot Camp, our group members needed to know how to take apart and put together an entire computer in less than a minute in order to be able to graduate!

I tried carrying on this entire DIY ethic in my post graduate career. There were stumbles though when I first started dealing with laptops. In 2000, I permanently crippled my Dell Inspiron while taking it apart in order to replace the internal hard drive. It never worked quite as well post operation. It literally was held together by liberal applications of masking tape and glue. I still take pride in being able to sell it to a fly by night computer repair operator in Kolkata sometime around 2005. It probably exists in some incarnation somewhere, fueling some kid’s IIT aspirations right about now.

Given my last experience, I was somewhat nervous about replacing my MacBook Pro’s hard drive. No question it was due – two years of hard labor had squeezed the existing Fujitsu down to cacophonous joint on joint grinding. Each day, I thought, would be its last. However, it wasn’t a straightforward process. Apple provides a How-To on its site if you want to upgrade your MacBook but for your Pro, no dice.

Luckily, help exists online, in particular, here, here, here and here. The basic procedure is the same: you first buy a 2.5″ SATA drive, ideally 7200 RPM for faster performance even though it’s a bigger drain on the battery. I’ve never gone wrong with Western Digital, so I bought a WD 320G Scorpio. Next, you’ll need a 2.5″ enclosure – make sure it can do SATA drives. I learned the hard way. Finally, you’ll need a Phillips and a T-6 Torx screwdriver. I bought all except the Phillips screwdriver (I already own a set) from Fry’s. Not the cheapest but at least they’ll take returns in the first 30 days.

After fitting the WD drive to the enclosure, I hooked it up to the Pro’s USB port and used SuperDuper to completely clone my main HD into an externally bootable drive. I rebooted the Pro from the external drive to confirm (hold down the option key when rebooting your Pro and, if multiple bootable devices are available, it’ll ask you to choose).

For the actual physical work, I printed out ifixit’s guide and followed it step by step. You have to take out a lot of screws and some parts. To keep track, I placed all the pages in the guide side by side and placed each set of screws next to the pictures, once I completed the step. This was particularly useful when putting everything back together again. My Phillips screwdriver is magnetized, so it holds the screw to itself. This was invaluable as many of the screws in the Pro’s casing are tiny and placing them can be tricky.

It was quite a relief when I put everything back together again, powered up my laptop and, after a brief, yet agonizing period, the Apple logo came on. And soon after, the machine booted up quite happily and my desktop appeared. Now, I have a souped up box and quite possibly have saved my company, Delve, a fair chunk of change in terms of not having to replace my laptop. One of the keys in the keyboard is still loose but I am hoping the tape and chewing gum approach will work in holding it in place!

jconsole, ec2, ubuntu

Remote debugging your jmx enabled process in the ec2 cloud via jconsole isn’t easy for any number of reasons. Perhaps it’s the NAT setup at AWS. Perhaps it’s Ubuntu or Linux related. The most common workarounds given are to have jconsole run on the remote box and either export its display locally via X (using, for example ssh -X or ssh -Y to tunnel) or via VNC. I found the former too slow and the latter too time consuming to set up on our existing systems.

However, I discovered that nx, a set of technologies to greatly compress the X protocol, was very easy to set up on our Ubuntu boxes. It made the process of running jconsole remotely but displaying to my laptop locally very tolerable indeed. Not surprising as nx is intended to allow you to run xterm even over dialup! Here is the set of steps I followed (instructions derived from the nomachine site) to set up an nx enabled account on a remote Ubuntu Hardy Heron box. Your mileage may well vary.

On your remote box:

  1. install jconsole if you don’t have it. It’s in the Sun JDK package: apt-get install sun-java6-jdk
  2. Get the nx debian packages from nomachine:
    1. wget http://64.34.161.181/download/3.4.0/Linux/nxnode_3.4.0-6_i386.deb (64 bit: http://64.34.161.181/download/3.4.0/Linux/nxnode_3.4.0-13_x86_64.deb)
    2. wget http://64.34.161.181/download/3.4.0/Linux/nxclient_3.4.0-5_i386.deb (64 bit: http://64.34.161.181/download/3.4.0/Linux/nxclient_3.4.0-7_x86_64.deb)
    3. wget http://64.34.161.181/download/3.4.0/Linux/FE/nxserver_3.4.0-8_i386.deb (64 bit: http://64.34.161.181/download/3.4.0/Linux/FE/nxserver_3.4.0-12_x86_64.deb)
  3. Required by nxserver: apt-get install libaudiofile0
  4. Install nx client, node, server:
    1. dpkg -i nxclient_3.4.0-5_i386.deb
    2. dpkg -i nxnode_3.4.0-6_i386.deb
    3. dpkg -i nxserver_3.4.0-8_i386.deb
  5. Create an nx enabled account called “nxtest”. You supply the account password.
    1. /usr/NX/bin/nxserver --useradd nxtest --system
    2. You might have to edit ~nxtest/.profile to add the “/usr/bin” to the PATH if it’s not added already.
    3. There are other and more secure ways of doing this.
  6. Install enough X libraries to get the remote desktop going. You can do various patchwork stuff but nx appears to work best with KDE, so: sudo apt-get install kubuntu-desktop

On your local box:

  1. go to nomachine.com, download and install the free nx client for your PC type.
  2. start up the nx client
  3. give the remote hostname/ip address, provide a session name
  4. username: nxtest, password: <whatever password you used to set up the account>
  5. this should open a remote desktop with KDE running
  6. run “jconsole” as a separate command, use the remote option to connect to the java process.

That should do it!

Bing’s Engineers

Nice San Jose Mercury article on the ex-Inktomi and Yahoo-ites behind Bing’s real time search launch – I particularly enjoyed the opening paragraphs:

Microsoft engineer Chad Carson wasn’t thrilled about surrendering his solo window seat on the Alaska Airlines flight from San Jose to Seattle so he could talk shop with his boss Sean Suchter and colleague Eric Scheel.

But that innocent decision last July 22 would spark a 91-day sprint to a previously unreached Internet milestone.

By the time Flight 321 was over Oregon, the group in Row 6 had evolved from a technology klatch to a cabal of plotters who scrawled a schematic tangle of boxes on a sheet of paper to map out something no big Internet search engine had yet achieved. The three members of Microsoft’s new Silicon Valley search team would try to make their company’s Bing a window into America’s stream of consciousness, serving up the chatter on Twitter and blog posts, with the latest updates on everything from celebrity gossip to breaking news.

Knowing Sean and Chad’s talent and work ethic, it’s great to see them get this exposure, particularly after spending so much time in Google’s shadow. Congrats guys! Also, I found the mention of row 6 in Alaska Airlines particularly amusing. If you’re not MVP or Gold or any type of high falutin’ flyin’ status holder, you can still get on early before the rest of the folks in cattle class if you score a seat on row 6. It’s my own shortcut on flights to Seattle to Delve HQ.

Peak Load

The graph below shows the requests/sec on the Delve production load balancers for our playlist service system. The time frame roughly covers the past 7 days.

As you can see, we’ve had at least three major peaks over the past couple of days. Some of these have been due to some big traffic partners coming online (including Pokemon) and at least one of them (the most recent one) was because of singer/songwriter Jay Reatard’s untimely passing and the subsequent massive demand for his videos by way of pitchfork, a partner of ours. In other words, some we predicted. Others – well those just happen.

All of these hits are great for the business and for our growth but it is definitely white knuckle time for those of us responsible for keeping the system running. Fortunately, through some luck and a whole lot of planning, things have gone very smoothly thus far, fingers crossed. Some of the things we did in advance to prepare:

  • load testing the entire system to isolate the weakest links: we found apachebench and httperf to be our good friends
  • instrument components to print out response times: in particular, we did this with nginx, our load balancer of choice, as it is very easy to print out upstream and client response times
  • utilize the cloud to prepare testbeds: instead of hitting our production system, we were able to set up smaller replicas in the cloud and test there
  • monitor each machine in the chain: running something as simple as top, or something a little more sophisticated such as netstat, during load testing can provide great insights. In fact, this is not limited to load testing: simply monitoring production machines during heavy traffic can provide a lot of information.
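Once the load balancer is printing response times, a few lines of Python are enough to turn the log into percentiles. The sketch below assumes a hypothetical access log format in which each line ends with `rt=<client time> urt=<upstream time>` in seconds (for example, built from nginx's `$request_time` and `$upstream_response_time` variables). The field names and the sample lines are made up for illustration and are not our actual log format.

```python
import math
import re

# Hypothetical log suffix: "... rt=0.123 urt=0.100"
LINE_RE = re.compile(r"rt=(?P<rt>\d+(?:\.\d+)?) urt=(?P<urt>\d+(?:\.\d+)?)")


def parse_times(lines):
    """Return (client_times, upstream_times) in seconds.

    Lines that do not match are skipped, including requests that never
    reached an upstream (where the upstream field would be "-").
    """
    client, upstream = [], []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            client.append(float(match.group("rt")))
            upstream.append(float(match.group("urt")))
    return client, upstream


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    sample = [
        'GET /playlist/1 200 rt=0.120 urt=0.100',
        'GET /playlist/2 200 rt=0.450 urt=0.430',
        'GET /static/a.js 200 rt=0.004 urt=-',
        'GET /playlist/3 200 rt=0.090 urt=0.080',
    ]
    client, upstream = parse_times(sample)
    print(percentile(client, 50), percentile(upstream, 95))
```

Comparing the client and upstream numbers side by side is a quick way to tell whether the time is going into the backend or into the network and the load balancer itself.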

Our testing showed that:

  • we needed to offload serving more static files to the CDN
  • we could use the load balancer to serve some files which were generated dynamically in the backend, yet never changed. This was an enormous saving.
  • our backend slave cloud dbs needed some semblance of tuning. I don’t consider myself to be a mysql expert by any stretch of the imagination. However, I found that our dbs were small enough and there was sufficient RAM in our AWS instances such that tweaks like increasing the query cache size and raising the innodb buffer pool size ensured no disk I/O when serving requests.
  • we should alter our backend caching to evict after a longer period of time, which would reduce load on our dbs
  • we should smooth out our deployment process so we can fire up additional backend nodes and load balancers if necessary
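The caching point is easy to illustrate. Here is a minimal Python sketch of a time-based cache in front of a database lookup, assuming a simple "evict after N seconds" policy. The `TTLCache` class and the `loader` callback are hypothetical stand-ins and not our actual backend code. The longer the TTL, the fewer times the loader (and hence the db) gets hit for the same key.

```python
import time


class TTLCache:
    """Tiny in-process cache that evicts entries after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can fake time
        self._store = {}             # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)          # stand-in for the real db query
        self._store[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    db_hits = []

    def load_playlist(playlist_id):
        db_hits.append(playlist_id)
        return {"id": playlist_id}

    cache = TTLCache(ttl_seconds=300)
    for _ in range(1000):
        cache.get("playlist-1", load_playlist)
    print(len(db_hits))  # one db hit for 1000 requests within the TTL
```

The trade-off is staleness: a longer TTL means changes in the db take longer to show up, so it only works for data that can tolerate that.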

There’s much more to be done but surviving the onslaught thus far (with plenty of remaining capacity) has definitely been very heartening. It almost (but not quite) makes up for working through most of the holiday season :)
