So, you’ve got your new iPhone 3G. You waited in line at the Apple or AT&T store for hours, and it was worth it. You probably bought the 8 Gig…but maybe you bought the 16 Gig (and, like a co-worker of mine, you bought the white version so people would know you had the 16 Gig iPhone). You’ve now got the cool new iPint application, or you’ve finally solved an age-old problem with the BigTipper application. Furthermore, you’re extremely happy that you can use MobileMe to store your contacts, calendar appointments, tasks, etc. on “The Cloud” so that you can easily keep your iPhone and MacBook synchronized. Life is good, Windows is bad, and Apple rules….but, er, wait. What’s this? Why can’t I manage my contacts, and why are the changes I made not propagated to my devices? And where are my emails?
I don’t care how good Apple products are. “The Cloud” is bad (not that it means to be). With all the hype the new iPhone 3G is getting, some unexpected issues arose on the Apple-run me.com (MobileMe) site when it was flooded with traffic (i.e. users). These issues caused downtime, loss of functionality, and even loss of mail messages for some MobileMe users.
MobileMe is a web application that lets you keep all your information on the web so that all your devices can be synced from one convenient location that, conveniently, isn’t under your control. I don’t know exactly what Apple did to prepare for the launch of this application, but it would seem that load testing was not performed (or not properly performed), since nobody appears to have known how much load the servers could handle.
A load test would have helped the IT folks at Apple understand what capacity they were currently operating at, where their likely points of failure were (and at what levels of usage those points would fail), and what infrastructure they would need to surpass those limits and prevent failure. Load testing should be done before any major release, ideally by someone with a solid background in running load tests and reviewing the results (internal or external to the company, as long as they have real load testing experience).
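To make the "at what levels of usage would those points fail" idea concrete, here is a rough Python sketch that steps through increasing load levels and reports the last level that passed and the first one that failed. Everything in it is an assumption for illustration: `simulated_error_rate` is a toy stand-in for a real measurement (it is not a model of Apple's servers), and the 1% error threshold and the load levels are made up.

```python
def simulated_error_rate(concurrent_users, capacity=5000):
    """Toy stand-in for a measured error rate (hypothetical, not real data).

    Returns 0.0 up to `capacity` users, then grows linearly to 1.0.
    """
    if concurrent_users <= capacity:
        return 0.0
    return min(1.0, (concurrent_users - capacity) / capacity)


def find_breaking_point(load_levels, max_error_rate=0.01,
                        measure=simulated_error_rate):
    """Step through load levels in ascending order.

    Returns (last_ok, first_bad): the highest level that stayed within
    the error threshold and the first level that exceeded it. Either
    value is None if no such level was seen.
    """
    last_ok = None
    for users in sorted(load_levels):
        if measure(users) > max_error_rate:
            return last_ok, users
        last_ok = users
    return last_ok, None


if __name__ == "__main__":
    levels = [1000, 2000, 4000, 8000, 16000]
    print(find_breaking_point(levels))  # prints (4000, 8000)
```

In a real test, `measure` would be replaced by something that actually drives traffic at the server and counts failed requests; the stepping logic stays the same.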
Generally, load tests are run before a major release of a web application or service. A load test consists of a number of iterations in which virtual users (the load) are applied to the server for a set duration (usually 1 hour). During each iteration, multiple types of scenarios are run. In the case of the MobileMe application, the scenarios could have been:
- Log in to MobileMe.
- Sync contacts on MobileMe.
- Update calendar event.
The first iteration of a load test would have consisted of thousands of virtual users (each one performing one of the above scenarios) accessing the MobileMe application. During the test, the virtual users would record performance metrics (availability and load time) for each scenario. For example, as the load on the server increased, we could see how performance degraded for someone trying to log in to MobileMe while thousands of other requests were being served by the MobileMe server. Once this initial iteration was complete, the Apple team would revise their application and architecture based on the results. After those changes were made, another load test iteration could be run to confirm that the changes had improved performance and not caused any unintended issues. Rinse and repeat.
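For the curious, a single iteration like the one described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `fake_request` is a hypothetical stand-in that sleeps for a few milliseconds and fails about 5% of the time instead of calling any real MobileMe endpoint, the scenario names are simply lifted from the list above, and 30 virtual users stand in for the thousands a real test would use.

```python
import random
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

# Scenario names taken from the list above (labels only, no real endpoints).
SCENARIOS = ["login", "sync_contacts", "update_calendar_event"]


def fake_request(scenario, rng):
    """Hypothetical stand-in for a real HTTP call.

    Returns (ok, load_time_seconds). A real virtual user would issue
    the request for `scenario` here instead of sleeping.
    """
    start = time.perf_counter()
    time.sleep(rng.uniform(0.001, 0.005))  # simulated server latency
    ok = rng.random() > 0.05               # simulated ~5% failure rate
    return ok, time.perf_counter() - start


def virtual_user(user_id, requests_per_user=5):
    """One virtual user repeatedly running a single scenario."""
    rng = random.Random(user_id)  # seeded per user for repeatability
    scenario = SCENARIOS[user_id % len(SCENARIOS)]
    results = [fake_request(scenario, rng) for _ in range(requests_per_user)]
    return scenario, results


def run_iteration(num_users=30):
    """Run all virtual users concurrently and summarize per scenario."""
    per_scenario = {name: [] for name in SCENARIOS}
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        for scenario, results in pool.map(virtual_user, range(num_users)):
            per_scenario[scenario].extend(results)

    report = {}
    for name, results in per_scenario.items():
        successes = sum(1 for ok, _ in results if ok)
        report[name] = {
            "requests": len(results),
            "availability_pct": 100.0 * successes / len(results),
            "avg_load_time_s": statistics.mean(t for _, t in results),
        }
    return report


if __name__ == "__main__":
    for name, stats in run_iteration().items():
        print(name, stats)
```

The report boils each scenario down to the same two numbers mentioned above, availability and average load time, which is what you would compare from one iteration to the next after making changes.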
Here’s a link to the MobileMe status blog (http://www.apple.com/mobileme/status/), which gives more specifics about the outage. This is a problem that all companies providing web-based applications and services need to consider: it’s great when you launch an application that’s popular; it’s not so great when that popularity damages your brand and prevents your service from being usable. The iPhone is popular enough that most users will probably ignore the outage (and the emails lost between July 16th and 18th). Heck, those users may even endure a couple more outages. But the majority of applications on the web do not have the brand backing of Apple/iPhone.
More on load testing offered by Webmetrics: http://www.webmetrics.com/loadtesting.html.
And here’s the performance for this blog:
- Average load time is 1.70 seconds.
- Availability (uptime) is 100%.