soam's home

home mail us syndication

The Case for an Open Source As Service Platform

In his article on Steve Jobs (The Tinkerer), Malcolm Gladwell gets to the core of what made the UK dominate the industrial revolution:

They believe that Britain dominated the industrial revolution because it had a far larger population of skilled engineers and artisans than its competitors: resourceful and creative men who took the signature inventions of the industrial age and tweaked them—refined and perfected them, and made them work.

Similarly, Steve Jobs, as per Isaacson’s biography:

But Isaacson’s biography suggests that he was much more of a tweaker. He borrowed the characteristic features of the Macintosh—the mouse and the icons on the screen—from the engineers at Xerox PARC, after his famous visit there, in 1979. The first portable digital music players came out in 1996. Apple introduced the iPod, in 2001, because Jobs looked at the existing music players on the market and concluded that they “truly sucked.” Smart phones started coming out in the nineteen-nineties. Jobs introduced the iPhone in 2007, more than a decade later, because, Isaacson writes, “he had noticed something odd about the cell phones on the market: They all stank, just like portable music players used to.

And so on. This observation does give rise to a question – if Jobs could rise to such exalted heights by mere ruthless refinement, what hope is left for the rest of us mere mortals? What the article does not say and should be obvious to anyone in the tech industry is that we’ve been living through the golden age of tweaking. After all, what is open source if not tweaking unleashed? I don’t need to go through the sheer quantity and variety of tools, programs, methods and systems that open source has produced. There is an open source equivalent for pretty much every functionality you can think of. Yet, I wonder if we have already lived through its golden age and are moving on to something else.

What I mean is that in the past ten years or so, we have moved from the paradigm of software as executable to software as service. It is not enough to produce a program or system. You have to run it as well and keep it running. Hence websites, search engines, social networks and pretty much everything else in our grand world wide web. This leads to the next question: while there is an open source equivalent to pretty much any software from Microsoft or Adobe, where is the open source as service (OSaaS) equivalent to Google or Facebook? Nutch doesn’t count – it’s code that has to be installed and run. Doing so is nontrivial and illustrates some of the issues facing OSaaS:

  • cost of machines to run the service and supporting services
  • cost of bandwidth
  • storage costs
  • operations costs

I would argue Wikipedia is a good example of a OSaaS – and it is continually in fundraising mode. Furthermore, eiven Google and Facebook’s masive scale, it’s impossible to produce any kind of competing system the traditional way without massive amounts of money. So much for open source purity! While it is true SaaS and PaaS providers often make available APIs and platforms on a tiered basis with the first x or so requests being free and hence is a great way of getting started with your app, you have to pay after you exceed a certain level of usage. Again, non scalable and sure to discourage the next budding Steve Jobs toiling away in his/her garage.

I think the success of the SETI project (not in finding aliens but in getting people to contribute their spare cycles) or even what I saw at Looksmart when we acquired Grub indicates there might be another way. Grub was an open source crawler that users could download and install on their computers. It showed nifty screen savers when your computer needed to snooze and crawled URLs at the same time. We were surprised by Grub’s uptake. People wanted to make search better and were happy to download and run it on their own computers. We had people allocating farms of machines devoted to running Grub. We used it for nothing but dead link checking for our Wisenut search engine – but even that made people happy to contribute.

One possible lesson from this could be that if it is possible to develop a framework/platform for effectively partitioning the service amongst many participants, each participant would pay a fraction of the total cost. Of course, as BitTorrent shows, load balancing has to be carefully done. People that host too many files leech too much ISP resources and get sued. Grid computing and projects like BOINC are certainly promising but seem to be specialized for long running jobs of certain types like protein folding or astrophysics computations. It’s not clear whether they can provide a distributed, public, OSaaS platform. Such a platform, if carefully engineered, could pave the way to many interesting applications that could provide alternatives to the Facebooks and Googles of the world and ensure tinkering in the new millenium remains within the reach of dreamers.

 

Appetite For Self Destruction

Appetite For Self Destruction is a recounting of the fall, rise and subsequent decimation of the US music industry. The books starts with the “Disco Sucks” backlash and the subsequent precipitous fall in LP sales. The CD comes along just in time to rescue the music industry from these doldrums with Michael Jackson’s “Thriller” being one of the first killer apps of this new technology. Guess what? The industry then sees an opportunity in the improved fidelity of CDs. It uses this to jack up prices and rides the subsequent boom hard to amazing profits and profligacy.

The book has a great time recounting some of these merry stories of excess and the unsavory characters that flocked into the businss. We all know the bottom finally started to fall out with the introduction of Napster but author Steve Knopper makes the point that this occurred more due to the insistence of the RIAA and the rest of the gang in clinging to what had hitherto worked. In doing so, however, they began alienating fans and musicians alike and never recovered. Suing grandmothers is hardly a particularly good business model.

A nice graphic from a related article in Business Insider (The Real Death of the Music Industry) says it all:


Apparently, Steve Jobs was essentially forced to step in and create a digital music distribution system as he badly needed content for the then recently introduced iPod. By that time, the music industry had realized they needed a legal online presence and Apple with a scant 4-5% of the PC market hardly seemed any kind of threat. Accordingly, head honchos of labels like Warner, Sony and others ended up signing agreemens largely biased in favor of Apple, realizing only belatedly they had given away the farm.

Interspersed through the book are chapters covering in painful detail every mistake made by the record companies on their way to their current depleted state. How music can survive, new business models, apps and services – all these topics are hot areas of discussion by pundits. The contribution of this book is illustrating how we got here in the first place.

Flattering Streaming Media Coverage for Our Video Platform

Streaming Media writes that OVP (Online Video Platform) is a pretty crowded space with more than a hundred vendors. They then list the top ten platforms and yes, we’re in there. Here’s what they say:

The OVP formerly known as Delve (until Limelight bought it) stands out for its user experience and workflow. The user interface was designed to be friendly and to make it easy to accomplish tasks. When a user uploads a video, he or she can update its metadata and set the channel even before the video is fully uploaded. That’s a timesaver few can match. It also offers strong analytics and APIs.

The platform’s Video Clipper tool makes it easy to shorten videos and works well in combination with Limelight’s real-time analytics. If users see that viewers are routinely quitting a video at a certain point, they have the option of ending the video there.

The OVP was acquired by Limelight in August 2010. Rather than focusing on just the mobile experience or just analytics, the Limelight Video Platform focuses on offering the best end-to-end user experience. Since it’s now under the same roof as a top CDN, it’s able to offer tie-ins that others can’t. Users gain from functionality such as player edge scaling, which the video platform is able to offer through low-level access to the CDN. Users also get to use developer APIs that aren’t open to the public.

Given that we started relatively late in this space (the company had started to pivot away from semantic video search to the platform SaaS approach just prior to my coming on board), this recognition is particularly heartening. It has been a monumental amount of work and accumulation of many battle scars for all of us. Full credit to Alex, Edgardo and the rest of the team!

Being committed to a startup means you have to be prepared to do anything and everything. That and my own role(s) at Delve/Limelight mean involvement with pretty much all system aspects, especially the backend. I am particularly chuffed at the repeated mentions of analytics since a lot of my own blood, sweat and tears go into it.

Okay, better stop now before this starts reading like an acceptance speech :)

Reverse Migration – Moving Out of AWS Part I

It’s been a while since I updated and with good reason. We were acquired by Limelight Networks last year and are now happily ensconced within the extended family of other acquisitions like Kiptronics and Clickability. While it was great to have little perks like my pick of multiple offices (Limelight has a couple scattered around the Bay Area) after toiling at home for most of two plus years, we had other mandates to fulfill. A big one was moving our infrastructure from AWS to Limelight. The reason for this was simple enough. Limelight operates a humongous number of data centers around the world, all connected by private fiber. Business cost efficiencies aside, it really didn’t make sense for us to not leverage this.

Amazon Web Services

AWS makes it really easy to migrate to the cloud. Firing up machines and leveraging new services is a snap. All you really need is a credit card and a set of digital keys. It’s not until you try going the other way you realize how intertwined your applications are with AWS. I am reminded of the alien wrapped around the face of John Hurt!

Obviously, we’d not be where we were if we hadn’t been able to leverage AWS infrastructure – and for that I am eternally thankful. However, as we found, if you do need to move off of it for whatever reason, you’ll find the difficulty of your task to be directly proportional to the number of cloud services you use. In our case, we employed a tremendous amount – EC2, S3, SimpleDB, Elastic Map Reduce, Elastic Load Balancing, CloudFront, Cloud monitoring and, to a lesser extent, Elastic Block Storage and SQS. This list doesn’t even include the management tools that we used to handle our applications in the AWS cloud. Consequently, part of our task was to find suitable replacements for all of these and perform migration while ensuring our core platform continued to run (and scale) smoothly.

I’ll be writing more about our migration strategy, planning and execution further on down the line but for now, I thought it would be useful to summarize some of our experiences with Amazon services over the past couple of years. Here goes:

  • EC2 – one of the things I’d expected when I first moved our services to EC2 would be that instances would be a lot more unstable. You’re asked to plan for instances going down at any time and I’ve experienced instances of all types crashing but when taking out the failures due to application issues, the larger instances turned out to be definitely more stable. Very much so, actually.
  • S3 is both amazing and frustrating. It’s mind blowing how it’s engineered to store an indefinite number of files per bucket. It’s astonishing at how highly available it is. In my 3 years with AWS, I can recall S3 having serious issues only twice or thrice. It’s also annoying because it does not behave like a standard file system, in particular, in its eventual consistency policy, yet it’s very easy to write your applications to treat it like one. That’s dangerous. We kept changing our S3 related code to include many, many retries as the eventual consistency model doesn’t guarantee that just because you have successfully written a file to S3, it will actually be there.
  • Cloudfront is an expensive service and the main reason to use it is because of its easy integration with S3. Other CDNs are more competitive price-wise.
  • Elastic Map Reduce provides Hadoop as a service. Currently, there are two main branches of hadoop supported – 0.18.3 and 0.20.2 although EMR folks encourage you to use the latter, and I believe, is the default for job submission tools. The EMR team has incorporated many fixes of their own into the hadoop code base. I’ve heard of grumblings of this causing yet further splintering of the hadoop codebase (in addition to what comes out of Yahoo and Cloudera). I’ve also heard these will be merged back into the main branches but I am not sure of the status. Being one of the earliest and consistent users of this service (we run every couple of hours and at peak can have 100 plus instances running), I’ve found it to be very stable and the EMR team to be very responsive when jobs fail for whatever reason. Our map reduce applications use 0.18.3 and I am currently in the process of moving over to 0.20.2, something recommended by the EMR team. Once that occurs, I’d expect performance and stability to improve further. Two further thoughts regarding our EMR usage:
    • EMR is structured such that it’s extremely tempting to scale your job by simply increasing the number of nodes in your cluster. It’s just a number after all. In doing so, however, you run the risk of ignoring fundamental issues with your workflow until it assumes calamitous proportions, especially as the size of your unprocessed data increases. Changing node types is a temporary palliative but at some point you have to bite the bullet and get down to computational efficiency and job tuning.
    • Job run time matters! I’ve found similar jobs to run much faster in the middle of the night than during the day. I can only assume this is the dark side of multi tenancy in a shared cloud. In other words, the performance a framework like hadoop relying heavily on network interconnectivity is going to be dependent on the load imposed on the network by other customers. Perhaps this is why AWS has rolled out the new cluster instance types that can run in a more dedicated network setting.
  • SimpleDB: my experience with SimpleDB is not particularly positive. The best thing I can say about it is that it rarely goes down wholesale. However, retrieval performance is not that great and endemic with 503 type errors. Whatever I said for S3 and retries goes double for SimpleDB. In addition, there’s been no new features added to SimpleDB for quite a while and I am not sure what AWS long term goals are for it, if any.

That’s it for now. I hope to add more in a future edition.

Note: post updated thanks to Sid Anand.

Some Tips on Amazon’s EMR II

Following up from Some Tips on Amazon’s EMR

Recently, I was asked for some pointers for a log processing application intended to be implemented via Amazon’s Elastic Map Reduce. Questions included how big the log files should be, what format worked best and so on. I found it to be a good opportunity to summarize some of my own experiences. Obviously, there’s no hard and fast rule on optimal file size as input for EMR per se. Much depends on the computation being performed and the types of instances chosen. That being said, here are some considerations:

  • it’s a good idea to batch log files as much as possible. Hadoop/EMR will generate a map job for every individual file in S3 that serves as input. So, the lower the number of files, the less maps in the initial step and the faster your computation.
  • another reason to batch files: hadoop doesn’t deal with many small files very well. See http://www.cloudera.com/blog/2009/02/the-small-files-problem/ It’s possible to get better performance with a smaller number of large files.
  • however, it’s not a good idea to batch files too much – if 10 100M files are blended into a single 1GB file, then it’s not possible to take advantage of the parallelization of map reduce. One mapper will then be working on downloading one file while the rest are idle, waiting for that step to finish.
  • gzip vs other compression schemes for input splits – hadoop cannot peer inside gzipped files, so it can’t produce splits very well with that kind of input files. It does better with LZO. See https://forums.aws.amazon.com/message.jspa?messageID=184519
In short, experimenting with log file sizes and instance types is essential for good performance down the road.

MySQL High Availability Talk – My Notes

Last month, Lenz Grimmer, Community Manager at MySQL (now Oracle, I suppose), gave an overview of a number of MySQL HA techniques at the SF MySQL Meetup group. My notes from that talk:

MySQL not trying to be an Oracle replacement, rather the goal is to make it better for its specific needs and requirements.

HA Concepts and Considerations:

  • something can and will fail
  • can’t afford downtime eg. maintenance
  • adding HA to an existing system is complex (Note: from my experience, this is definitely the case with MySQL!)

HA components:

  • a heartbeat checker is definitely necessary to check
    • whether services still alive?
    • components: individual servers, services, network etc
  • HA monitoring
    • have to be able to add and remove services
    • have to allow shutdown/startup operations on services
    • have to allow manual control if necessary
  • Shared storage/replication

One of the possible failure scenarios for a distributed system is the Split Brain syndrome whereby communication failures can lead to cluster partitioning. Further problems can ensue when each partition tries to take control of the entire cluster. Have to use approaches such as fencing or moderation/arbitration to avoid.

Some notes on MySQL replication:

  • replicate statements that have changed. This is a statement or row based approach.
  • can be asynchronous so slaves can lag
  • new in mysql 5.5 – semi sync replication
  • not fully synchronous but replication is included in the transaction i.e. transcation will not proceed until master receives ok from at least one slave
  • master maintains binary log and index
  • replication on slave is single threaded i.e. parallel transactions are serialized
  • there is no automated fail-over
  • a new feature in 5.5 – replication heartbeat

The master master configuration is not suitable for write load balancing. Don’t write to both masters at the same time, use sharding/partitioning instead eg. auto increments is a PITA (audience query)

Disk replication is another HA technique. This is not mysql level replication. Instead, files are replicated to another disk at the disk level via block level replication. DRBD (Disk Replacement Block Device) is one such technology. Some features:

  • raid-1 over network
  • synchronous/async block replication
  • automatic resync
  • application agnostic since operating at the disk level
  • can mask local I/O issues

By default, DRBD operates on an active-passive configuration such that block device on 2nd system isn’t accessible. Now, DRBD has changed to allow writes on the passive device as well but it only really works if using clustered file system underneath like GFS or OCFS2. However, it remains a dangerous practice.

When using DRBD with MySQL:

  • really bad with MyISAM tables since the replication occurs at the block level. Failover could lead to an inconsistent file system, so integrity check to repair would be required. Hence, can only use journaled file system with DRBD. Also, Innodb more easily repaired than MyISAM.
  • MySQL server runs only on primary drbd node, not on secondary.

Instead of replication, another possibility is to use a storage area network to secure your data. However, in that case, the SAN can become a single point of failure. In addition, following a switchover, new MySQL instances can have cold caches – since they have had not had time to warm up yet.

MySQL Cluster technology, on the other hand, is not good with cross table JOINs. In addition, owing to their architecture, they may not be suitable for all applications.

A number of companies were mentioned in the talk that are active in the space. One such example is Galera, a Norwegian company which provides their own take on MySQL replication. Essentially, they have produced a patch for Innodb as well as an external library. This allows single or multi master setups as well as multicast based replication.

Data Trends For 2011

From Ed Dumbill at O’Reilly Radar comes some nice thoughts on key data trends for 2011. First, the emergence of a data marketplace:

Marketplaces mean two things. Firstly, it’s easier than ever to find data to power applications, which will enable new projects and startups and raise the level of expectation—for instance, integration with social data sources will become the norm, not a novelty. Secondly, it will become simpler and more economic to monetize data, especially in specialist domains.

The knock-on effect of this commoditization of data will be that good quality unique databases will be of increasing value, and be an important competitive advantage. There will also be key roles to play for trusted middlemen: if competitors can safely share data with each other they can all gain an improved view of their customers and opportunities.

There’s a number of companies emerging that crawl the general web, Facebook and Twitter to extract raw data, process/cross-reference that data and sell access. The article mentions InfoChimp and Gnip. Other practitioners include BackType, Klout, RapLeaf etc. Their success indicates a growing hunger for this type of information. I definitely seeing this need where I am currently. Limelight, by virtue of its massive CDN infrastructure and customers such as Netflix, collects massive amounts of user data. Such data could greatly increase in value when cross referenced against other databases which provide additional dimensions such as demographic information. This is something that might best be obtained from some sort of third party exchange.

Another trend that seems familiar is the rise of real time analytics:

This year’s big data poster child, Hadoop, has limitations when it comes to responding in real-time to changing inputs. Despite efforts by companies such as Facebook to pare Hadoop’s MapReduce processing time down to 30 seconds after user input, this still remains too slow for many purposes.
:
:
It’s important to note that MapReduce hasn’t gone away, but systems are now becoming hybrid, with both an instant element in addition to the MapReduce layer.

The drive to real-time, especially in analytics and advertising, will continue to expand the demand for NoSQL databases. Expect growth to continue for Cassandra and MongoDB. In the Hadoop world, HBase will be ever more important as it can facilitate a hybrid approach to real-time and batch MapReduce processing.

Having built Delve’s (near) real time analytics last year, I am familiar with the pain points of leveraging hadoop to fit into this kind of role. In addition NoSQL based solutions, I’d note that other approaches are emerging:

It’s interesting to see how a new breed of companies have evolved from treating their actual code as a valuable asset to giving away their code and tools and treating their data (and the models they extract from that data) as major assets instead. With that in mind, I would add a third trend to this list: the rise of cloud based data processing. Many of the startups in the data space use Amazon’s cloud infrastructure for storage and processing. Amazon’s ElasticMapReduce, which I’ve written about before, is a very well put together and stable system that obviates the need to maintain a continuously running Hadoop cluster. Obviously, not all applications fit this criteria but if it does, it can be very cost effective.

View From My Office

After a couple of years of working remotely, it still feels strange to have an office of my own, let alone one with a modicum of a view of the downtown SF skyline, so I am enjoying it while it lasts. We’re scheduled to move to a more central SOMA location sometime in the next month.

View From My Office

Gesture Recognition And Music

Despite Farhad Manjoo’s assertions that a week at CES is essentially a week wasted (“The Most Worthless Week in Tech”), I found this LA Times article talking about gesture recognition vendors at CES to be particularly interesting:

Competing examples on display were from PrimeSense, the Israeli designers of the microchips that power Microsoft’s popular controller-free Kinect gaming accessory, and Softkinetic, a Belgian rival that powered an interactive billboard in Hollywood last summer for “The Sorcerer’s Apprentice.” The former relies on an approach called structured light — a projector fills the area in front of the display with beams of infrared light, then a sensor detects how the beams are distorted by moving objects. The latter takes the so-called time of flight approach, which detects motion by projecting light in front of a display and measuring how long it takes to bounce back.

PrimeSense has a considerable head start in the gesture recognition field thanks to the inclusion of its technology in Kinect — Microsoft sold some 8 million units of the device in 60 days. But games are “just the tip of the iceberg,” said Uzi Breier, executive vice president of PrimeSense. “We’re in the middle of a revolution. We’re changing the interface between man and machine.”

PrimeSense is focused on living room devices, while SoftKinect is also active in display advertising and medical applications. Breier said other possible uses include automobile security and safety, robotics, home security and rehabilitation.

To this list of uses, I would add another: music. Anyone who has played air guitar, air drums and/or the theremin would agree, I think. Percussion, in particular, would be a natural fit. Perhaps, in the future, conducting itself would be the actual performance and the orchestra would not even be there!

Theremin

EC2 Instance CPU Types

Amazon provides a whole variety of instance types for EC2 but lists their CPU capabilities via “EC2 Compute Units” where

One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.

That’s somewhat helpful for m1.small but what about c1.xlarge which has something like 20 EC2 compute units? How to map that to the real world? Fortunately, I found a cloud computing presentation from Constantinos Evangelinos and Chris Hill from MIT/EAPS which contained mappings of most of the common ec2 instance types. It’s from 2008 but should still be applicable. Drawing from the slides, we have:

  • m1.small => (1 virtual core with 1 EC2 Compute Unit) => half core of a 2 socket node, dual core AMD Opteron(tm) Processor 2218 HE, 2.6GHz
  • m1.large => (2 virtual cores with 2 EC2 Compute Units each) => half of a 2 socket node, dual core AMD Opteron(tm) Processor 270, 2.0GHz
  • m1.xlarge => (4 virtual cores with 2 EC2 Compute Units each) => one 2 socket node, dual core AMD Opteron(tm) Processor 270, 2.0GHz
  • c1.medium => (2 virtual cores with 2.5 EC2 Compute Units each) => half of a 2 socket node, quad core Xeon E5345, 2.33GHz
  • c1.xlarge => (8 virtual cores with 2.5 EC2 Compute Units each) => one 2 socket node, quad core Xeon E5345, 2.33GHz
Next entries »