Friday, January 21, 2011

Applications Monitoring is a Waste of Time

Organizations are making efforts on monitoring their information systems. These efforts are basically wasting the organization’s time and money.

There are great enterprise solutions from CA, IBM and others but even the leader (according to Gartner) which is HP BSM and its complementary components will fail in the basic purpose of monitoring. Not because these products monitor badly or incorrectly, but because organizations don’t understand the information they provide.

Yes you have a consolidate view of the system metrics for the relevant infra of the application, yes you do have some extra metrics from the Web Server, Application Server and DB Server.

You don’t understand which of those tens or hundreds of metrics are responsible for a failure in a case it happens and it will. There is no application with zero failures. Yes these products can analyze the historical metrics and alert when they get out of line, but who said that the historical metrics are good?

The answer in my opinion is Load and Performance Testing in order to get the metrics affect on the application performance and availability. Only when setting the relevant metrics on our monitoring systems we can fully take advantage of our investment in the monitoring solution.

I will try to convey in the following posts.

Thursday, January 13, 2011

dynaTrace Browser Cache False Positive?

I am using dynaTrace Free Edition (AJAX Edition) for couple of years now as a Browser Side profiling tool and I am really pleased with it, especially with the new features found on version 2.x.

I was working on a performance optimization project for an Intranet web application couple of weeks ago and of course dynaTrace is in my toolkit. After working with the customer and understanding he's requirements I started the analysis.

The first thing I do is working with dynaTrace to get a performance overview. The overall rank was good but I got FAIL on the caching rank. Indeed no caching headers were set on the HTTP response headers on none of the cacheable resources.

Long story short, one of the recommendations in my final performance report was to use caching headers and I also declared that according to dynaTrace, setting these headers will save about 1.5MB of network traffic and about 10 seconds (on slow network connections from remote locations in the customer's so called "LAN").

This is where the first part of the story ends, with a report recommends on putting the main effort on this no caching problem.

This week I am instructing on a Performance course for programmers and I introduced them with dynaTrace as my favorite client side / browser side profiler. I love showing examples the freestyle way and asked one of the participants to donate his web application for the science. After browsing to the homepage of the web site, I closed IE and got back to the performance report. I noticed that on his web site he got FAIL on caching rank as well and he was amazed that there was no caching.

On one of the breaks the participant asked to investigate the caching problem. We've opened the report and got found that many flash (swf), gifs, css, js files were not cached. dynaTrace says that it sums up to about 500KB of transfer size and about three seconds in download time. The participant asked to reload the page again because it can't be true, the entire load time should be less the three seconds according to his experience. I reloaded the page and indeed the reload time was faster than the first request. We've re-opened IE and try again and the load time was fast again. It seems like the browser is using a cache but on dynaTrace it keep on showing the same recommendation – "Specifying Expires Headers can save up to 500KB in transfer and up to 3 seconds in download time".

We opened the Temporary Internet Files folder and found the cacheable files which were "not cached" in it. IE is saving any file on a local cache folder even if no cache headers were provided by the web server. This is true since IE 5 ( and true to other browsers as well. This means that the load time of these components is faster than what prompted by dynaTrace.

When clicking on details on the resource we get to see the relevant HTTP request header and HTTP response header. In the response header we see:

1. HTTP/1.1 200 OK – which means we got a new file from the web server.

2. Content-Length 146471 bytes – which is the size of the response.

This is can't be true because we know that IE is using the files from the Temporary Internet Files folder – we see this component really fast on the browser. How can it be that dynaTrace shows a 200 OK ?

At this point I launched network sniffing tool – WireShark to see what's really going on and while working with dynaTrace in parallel I got:
1. The actual response code is not 200 OK but 304 Not Modified.

2. The actual size of the response body is not 146471 but 0 zero(no body).

So to sum things up this is what I learned:

1. Browsers will cache any resource (default configuration) even if no caching headers were provided.

2. Browsers will ask the web server if modified since on any resources on the caching folder which has no caching header (you can see it in the screen cap, the red line starting with If-None-Match).

3. Web servers have a special response when the browser already has an up-to-date resource – 304 Not Modified.

4. dynaTrace 2.1 has a bug – showing wrong information about this kind of requests.

Last thing - dynaTrace prompt for this caching problem and this is correct even with this bug. We still waste the browser's connections on asking for validation of these resources and on high latency networks this is a waste of time. If caching headers were provided, the browser wouldn't even ask to validate those resources.