The PerfMon Blog

September 30, 2008

PerfMonBlog Performance Update

Filed under: Performance Monitoring — Tags: — Tyler Fullerton @ 10:27 am

I’ve noticed that the performance metrics I’ve been collecting for this blog have degraded.  The latest statistics are:

  • A daily (9/30) average load time of 2.06 seconds.
  • A daily (9/30) uptime of 99.19%.
  • A weekly uptime of 99.73%.

Nothing too alarming.  The alerts are being triggered because the load time of the page is now starting to exceed 3 seconds from some locations (Atlanta and St. Louis) due to the addition of content (text and images).
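
To put those uptime percentages in more concrete terms, here is a quick back-of-the-envelope sketch.  The helper name `downtime_minutes` is made up for illustration, and the calculation assumes the uptime figures cover the full period (24 hours for the daily number, 7 days for the weekly one).

```python
def downtime_minutes(uptime_percent: float, period_minutes: int) -> float:
    """Convert an uptime percentage over a period into minutes of downtime."""
    return (100.0 - uptime_percent) / 100.0 * period_minutes


DAY = 24 * 60    # minutes in a day
WEEK = 7 * DAY   # minutes in a week

# Uptime figures taken from the statistics listed above.
daily = downtime_minutes(99.19, DAY)
weekly = downtime_minutes(99.73, WEEK)

print(f"daily downtime:  {daily:.1f} min")
print(f"weekly downtime: {weekly:.1f} min")
```

Under those assumptions, 99.19% for the day works out to roughly 11.7 minutes of downtime and 99.73% for the week to roughly 27.2 minutes.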

The Newbie Introduction (A Recap)

I’ve been blogging on performance monitoring for a couple of months now, and it dawned on me that the information presented is probably straightforward for someone with past experience in performance monitoring (setting it up, interacting with it, etc.).  For someone new to the subject, though, it may be hard to cobble together a decent understanding from a bunch of scattered concepts posted on a blog.  My goal with this post is to provide a list of questions that are integral to establishing a functional performance monitoring solution.  These questions are:

  • What problems are being solved?
  • What base are we trying to solve problems for?
  • Who is involved in solving the problems?
  • What information is required to solve the problems?
  • What is the perspective needed to solve the problems?
  • How will problem solving techniques be integrated?
  • Can future (unexpected) problems be solved?

These questions revolve around the idea of solving a problem.  That is, you have a web based resource (website, application, web service, etc.) and something about it is keeping you up at night.  So, let’s go through these one at a time:

  • What problems are being solved? This is the most important question you can ask yourself, but you already knew that :) .  You need to know where you are before you can determine whether you’re moving in the right direction.  The process of implementing performance monitoring should be broken down into distinct milestones, and those should be in place before you even start talking to vendors.  For example, suppose your answer to this question is: “I want to know where on my site my users are going, where they enter the site, and where they leave the site!”  Talking to a performance monitoring vendor about this is only going to infuriate you as they try to sell you monitoring when all you really need is analytics (by the way, there are lots of really well established companies that can help you with analytics, such as Omniture or Coremetrics).  So know the problems that you want to solve; these are your goals.  Write them down on a piece of paper and give each one a weight (critical, nice to have, not really necessary).  Since the assumption is that you don’t know anything about monitoring, you’ll have to be vague.  Problems like “My site is slow”, “It becomes unavailable a lot and I want to know when that is”, “My customers complain about performance but I don’t see it”, “I want to track where my users are going”, “I want to be able to perform failover if something happens”, “I want to have someone else host my site/content”, and “I want a cup of coffee!” are what you need to concentrate on.  They help you know where you need to go and how to get there.  The first three problems can be solved with monitoring.  The fourth, fifth, sixth, and seventh are probably not problems that you would want to solve with monitoring.  So now you’re armed with information that will allow you to successfully navigate the process of picking a vendor.
  • What base are we trying to solve problems for? That is, what exactly do you want to monitor?  In an ideal world you could monitor and collect performance metrics on every page of your site, or every business process in your web application.  The problem is that the cost and management of that solution are prohibitive, and the data generated (alerts, logs, and reports) is probably more than any organization would be able to handle.  So it’s clear that you need to distinguish between what you need to monitor and what you don’t need to monitor.  This really depends on your business model and commitments to your customers (and other stakeholders).  If one of the main problems you’re trying to solve (from the previous question) is managing SLA values, then you need to consider monitoring only those resources that fall under the SLA.  If many resources fall under the SLA, then you could potentially re-tool your SLA so that SLA verification is based on a subset of the services you provide (this depends on how well established your SLA already is and whether your clients will allow you to amend your agreement with them).  One important attitude to have during this step is honesty with yourself.  You really need to be honest and make compromises with yourself (and your organization) as to what needs to be monitored, what would be nice to monitor (but not necessary), and finally, what doesn’t need to be monitored.  You may even want to reach out beyond your current needs and ask others (customers, executives, etc.) what they might need performance metrics on.  Those needs may not fit into your current budget or plans, but it’s always good to know what the future holds.
  • Who is involved in solving the problems? Gather the people that are going to help you make a decision.  If you’re the head of an IT department then you’re going to want to poll your customer base (whether this is an internal or external base) and find out what they’re asking for.  Are they even concerned with performance?  If they are, how apparent are performance issues to them?  Also, find others within your organization that can help you evaluate the performance monitoring services/tools that you will be looking at.  Marketing, for example, may be able to make great use of the data that is collected (by the way, marketing departments are almost always beneficiaries of performance monitoring data), or your development and QA departments may be interested in looking at the data that is collected.  This goes all the way up to your boss, your boss’ boss, etc.  Consensus among individuals in your company will help to enrich your list of requirements.  Again, you’re just trying to build consensus at this point.  You do not have to commit to any of these requirements; you’re just building an eco-system (check out some of the other blog entries for more information on this term) view of your company’s performance monitoring needs.  Also, you may be able to expand your budget by doing this (how many IT departments have the same budget as the Marketing department?).
  • What information is required to solve the problems? Monitoring is all about data collection!  Sure, there is definitely something to be said about how that data is collected (as has been expressed in numerous posts), but at the end of the day you’re left with a set of data.  That’s all you’ve got!  Yes, you have a control panel or some other tangible artifact, but the only thing that’s going to consistently show you the ROI of performance monitoring is data (alerts, reports, graphs, logs).  So it is very important to figure out ahead of time what type of data you are looking for, how you want to display it, and how easy it is to get a presentation of the data that isn’t standard.
  • What is the perspective needed to solve the problems? The accuracy of monitoring is really quite subjective.  It depends on what you are willing to consider end-user perspective.  For example, you may want to consider only the HTML load time as the performance of your application because your goal is to only improve the delivery of the HTML (and no consideration is to be given to the various other components that make up the page).  Or you may want to consider everything that happens when a person uses your application (JavaScript execution, downloading images, etc.).  The importance of this is that you need to understand what you’re buying from a vendor, and also that you will need to understand the context around the data that is collected.
  • How will problem solving techniques be integrated? No monitoring solution will be able to meet all your IT demands.  Monitoring is developed to do one job accurately and reliably: alert you to issues and report on those issues as well as on overall performance.  So it’s imperative to make sure that your monitoring solution can easily plug in to any existing tools (or future tools) your organization has.  This may be through technical solutions (API, SNMP messages, etc.) or procedural solutions (who gets alerts, how they react to them, decision trees, etc.).  Doing this will give credence to a monitoring initiative and will reduce confusion once the solution is implemented.
  • Can future (unexpected) problems be solved? Often a monitoring solution will meet the initial needs but will fail to meet future needs due to the accelerated evolution of technology.  As an example, a standard monitoring solution can easily monitor a site that relies on basic HTML but will more than likely have problems with more dynamic technologies like Flash or Ajax.
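
The perspective question in particular is easier to see with numbers.  Here is a minimal sketch of how the same page view gives two very different "load times" depending on what you decide to count.  The `Resource` class, the function names, and the timings are all made up for illustration; a real monitoring product will define these measurements in its own way.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    kind: str      # "html", "script", "image", ...
    start_ms: int  # when the request started, relative to the start of the page view
    end_ms: int    # when the response finished


def html_only_time(resources):
    """Load time if only the HTML document counts."""
    html = next(r for r in resources if r.kind == "html")
    return html.end_ms - html.start_ms


def full_page_time(resources):
    """Load time if everything the page pulls in counts."""
    return max(r.end_ms for r in resources) - min(r.start_ms for r in resources)


# Hypothetical timings for a single page view.
page = [
    Resource("html", 0, 400),
    Resource("script", 420, 900),
    Resource("image", 430, 2100),
]

print(html_only_time(page))  # 400
print(full_page_time(page))  # 2100
```

Both numbers are "correct".  They just answer different questions, which is why you need to know which one a vendor is reporting.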

September 9, 2008

Active and Passive Monitoring Solutions

Filed under: Business Considerations, Performance Monitoring — Tags: , — Tyler Fullerton @ 8:48 am

Note: My brain must be on the fritz.  Dear readers, I have updated the post as I completely misused the terminology and definitions of Active and Passive monitoring.  I apologize for any inconvenience and have updated this post as of September 22nd (3pm PST).

I’ve been asked quite a few times about the distinctions between active and passive monitoring and which is the best method to consider when implementing a monitoring methodology.  In this post I’d like to provide a basic introduction to the two types of monitoring and talk briefly about their benefits and deficiencies.

First, let’s start with definitions of these terms:

  • Passive Monitoring – Performance/Availability monitoring that uses data sets generated from actual human users of a website or web application.
  • Active Monitoring – Performance/Availability monitoring that uses data sets that are generated by a consistent and automated user of a website or web application.

We can see from these definitions that in one case (Passive Monitoring) we are relying on the real world experiences of the existing user base for the website/application, similar in fashion to how web analytic data is collected.  In the other case (Active Monitoring) we are relying on the experience of a synthetic user (a piece of software that emulates an end-user’s interaction with a website/application).  Let’s start our analysis of the two methodologies by looking at their similar properties:

  1. Both can provide the same statistics (uptime, availability, errors, throughput, and other performance metrics).  Essentially, neither is limited to the basic data sets of performance monitoring solutions.
  2. Both will reflect accurate measurements that will represent the performance of the server at the time the sample was taken.  Stated differently, if the infrastructure for the website/application is under duress then the impact will be reflected in the data that is collected by the monitoring solution.  There are fringe cases where this concept breaks down for Passive Monitoring that we will discuss below.

What about the differences in these monitoring methodologies?  Here are the basic properties of an Active Monitoring solution:

  • Monitoring is performed from an emulated user.  This can be as simple as an automated process that makes base level HTTP requests (ex: Unix wget commands) or can be a complex solution using actual browsers for performing monitoring.  In either case, we are talking about a user of the website/application that is strictly software based.
  • Monitoring is consistent throughout the day and will always attempt to monitor regardless of the state of the website/application infrastructure.
  • Monitoring is consistent in the configuration of the monitoring environment.  That is, every time you monitor, the user (automated process) is the same.
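
To make the "emulated user" idea concrete, here is a minimal sketch of the simplest kind of synthetic check: a plain HTTP request on a fixed schedule, much like the wget approach mentioned above.  The function names (`probe`, `run_checks`), the shape of the sample dictionary, and the `opener`/`sleep` parameters (there only so the behaviour can be swapped out) are my own assumptions and not how any particular product works.

```python
import time
import urllib.request


def probe(url, timeout=10.0, opener=urllib.request.urlopen, clock=time.monotonic):
    """Fetch a URL once and return one monitoring sample as a dict."""
    started = clock()
    status, error = None, None
    try:
        with opener(url, timeout=timeout) as response:
            response.read()            # download the whole body, like wget would
            status = response.status
        ok = 200 <= status < 400
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        status = getattr(exc, "code", None)  # only set for HTTP error responses
        ok, error = False, str(exc)
    return {"url": url, "ok": ok, "status": status,
            "seconds": clock() - started, "error": error}


def run_checks(url, rounds, interval_seconds, sleep=time.sleep, **probe_kwargs):
    """Probe on a fixed schedule, whether or not the site is healthy."""
    samples = []
    for i in range(rounds):
        samples.append(probe(url, **probe_kwargs))
        if i < rounds - 1:
            sleep(interval_seconds)
    return samples
```

Something like `run_checks("https://example.com/", rounds=12, interval_seconds=300)` would take a sample every five minutes for an hour.  Notice that a failed request still produces a sample, which is exactly the property that separates this from user-driven data collection.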

And for Passive Monitoring solutions, the properties are:

  • Monitoring is performed by actual (human) users.  This is done by executing JavaScript code embedded in the website/application that tracks the performance the end user sees while accessing the site.
  • Monitoring reflects the actual usage parameters of the end users (ex: browser type, configuration, platform, etc.).  This is another way of saying that the end user perspective is accurately represented.
  • Monitoring will adapt to the demographic of the users of the website/application.
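
On the collection side, the samples reported by real users can then be sliced by whatever environment details came along with them.  The sketch below assumes each sample has already arrived as a simple record; the field names (`browser`, `location`, `load_seconds`) and the values are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples reported by the embedded JavaScript.
samples = [
    {"browser": "IE 7", "location": "Atlanta", "load_seconds": 3.4},
    {"browser": "Firefox 3", "location": "Atlanta", "load_seconds": 2.1},
    {"browser": "IE 7", "location": "St. Louis", "load_seconds": 3.0},
    {"browser": "Firefox 3", "location": "Seattle", "load_seconds": 1.5},
]


def average_by(samples, key):
    """Average load time grouped by one attribute of the user's environment."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key]].append(sample["load_seconds"])
    return {name: mean(values) for name, values in groups.items()}


by_browser = average_by(samples, "browser")
by_location = average_by(samples, "location")

for name, seconds in sorted(by_browser.items()):
    print(f"{name}: {seconds:.2f}s")
```

The mix of browsers and locations in the data is whatever your visitors happen to bring, so the breakdown shifts as your audience does.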

The end goal of monitoring is really going to be the driving force in dictating which solution is the best.  Some companies may want to record the experiences of their actual users; in this case Passive Monitoring is the appropriate solution.  Passive Monitoring will allow the company to collect samples on performance that are actionable in the sense that they show what types of browsers are being used, which platforms are most important, which problems certain users are having, and how performance looks from an exact location.  This can help direct the company’s efforts when it comes to initial development and improvements of a website/application.  The Active Monitoring solution is more in tune with the task of reporting on performance to management, ensuring availability, and tracking SLAs because it is more consistent and has a reliable monitoring base that will not change over time.

Each solution has its faults as well.  For Active Monitoring the faults are:

  • Does not track experiences of actual users accessing the site.
  • Does not provide statistics on browsers and platforms used by website/application end users.
  • Does not provide last mile information.

Passive Monitoring has the following faults:

  • Monitoring requires JavaScript, which can alter the performance of the website/application being monitored and can potentially break or not work altogether (if someone has JavaScript turned off).
  • Monitoring is subject to spoofing, since information about a browser, platform, and other environment variables can be altered by a malicious end user.
  • Monitoring data will be sporadic and will only be collected when users are accessing the website/application.  No data will be collected during times when users are not on the site.  Therefore…
  • Issues with the website/application will not be detected until someone accesses the site.  The ability to detect problems before customers do is key and central to an on-going monitoring solution, and Passive Monitoring does not provide it.
  • If the site becomes unavailable then no monitoring will be performed because users will be unable to interact with the website/application and therefore will not be able to execute the JavaScript that will track their experience.
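
The last three points are easy to see in the data itself.  Below is a small sketch that looks for holes in the arrival times of user-generated samples.  The function name `find_gaps`, the timestamps, and the 10 minute threshold are all made up.

```python
def find_gaps(timestamps, max_gap_seconds):
    """Return (start, end) pairs where consecutive samples are further apart than allowed."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap_seconds]


# Hypothetical arrival times (seconds since midnight) of real-user samples.
arrivals = [0, 60, 130, 190, 2000, 2050]

print(find_gaps(arrivals, max_gap_seconds=600))  # [(190, 2000)]
```

The catch is that the gap between second 190 and second 2000 could mean nobody visited or that the site was down.  The samples alone cannot tell you which.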

The final analysis is: Passive Monitoring is great for QA and development purposes.  If your product is in Beta or not mature/critical enough for an SLA then this may be the best solution, because it provides statistics on how your end users experience your website/application and the specifics of their environments.  However, if your application is central to your business or a certain level of service (performance and availability) is expected (even agreed to) by your customers, then you need a more consistent and robust monitoring solution.  The Active Monitoring solution is far superior for these types of environments because it guarantees that monitoring happens, keeps that monitoring consistent (you don’t have to distinguish samples based on environmental factors such as IE vs. Firefox), alerts you to problems early (before your customers see those problems), and provides a basis for reporting performance and availability to others in the organization as well as tracking SLAs.
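
As a rough illustration of why a fixed schedule of checks makes SLA tracking simple, here is a sketch that turns a day of check results into an availability figure.  The function names, the 5 minute interval, and the SLA targets are assumptions for the example, not values from any real agreement.

```python
def availability_percent(results):
    """results: one boolean per scheduled check (True means the check succeeded)."""
    if not results:
        raise ValueError("no checks recorded")
    return 100.0 * sum(results) / len(results)


def meets_sla(results, target_percent):
    """Compare measured availability against an agreed target."""
    return availability_percent(results) >= target_percent


# Hypothetical day: one check every 5 minutes (288 checks), 2 of which failed.
checks = [True] * 286 + [False] * 2

print(f"{availability_percent(checks):.2f}%")
print(meets_sla(checks, target_percent=99.0))
print(meets_sla(checks, target_percent=99.9))
```

Because every scheduled check produces a result, the denominator is always known.  With user-generated samples the denominator depends on how many people happened to show up.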
