Transaction Response Time (Distribution) 
Thursday, August 21, 2008, 16:10 - Technology
I was recently helping a colleague to highlight how good some transaction response times were. However, his graphs were plagued with the odd "spike" where a response time was really bad, i.e. from a 1.4 second average up to a 90+ second spike.

This made the transaction response time graph look bad and would have caused the reader of the report to focus on the spikes instead of the averages.

Instead, a transaction response time distribution graph was used. This shows a count of how many transactions had a particular response time:



The graph is much easier to talk about since the reader's eye is immediately (and rightly so) drawn to the "big number".
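For anyone wanting to build the same kind of view from raw timings, here is a minimal sketch in Python. The sample values, the function name and the 2-second bucket width are all made up for illustration; they are not the figures or the tool behind the graph above.

```python
from collections import Counter


def response_time_distribution(times_s, bucket_width_s=2.0):
    """Count how many transactions fall into each response-time bucket.

    Keys are the lower edge of each bucket, in seconds.
    """
    counts = Counter()
    for t in times_s:
        bucket_start = int(t // bucket_width_s) * bucket_width_s
        counts[bucket_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical response times in seconds: mostly fast, a few spikes.
sample = [1.2, 1.4, 1.5, 1.3, 1.6, 47.0, 52.5, 91.0]
bucket_width = 2.0
distribution = response_time_distribution(sample, bucket_width)

# Crude text rendering: one '#' per transaction in the bucket.
for start, count in distribution.items():
    end = start + bucket_width
    print(f"{start:5.0f}-{end:<5.0f}s {'#' * count} ({count})")
```

The spikes still appear, but as single counts far to the right of a tall bar of fast transactions, rather than as peaks dominating a time-series graph.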

What I'm wondering about now is: why are there two clear groups of slow transactions, between 46 and 58 seconds and between 76 and 98 seconds? I've heard about defect clustering, but slow transaction clustering sounds strange... but they are clustered.

Adrian 
Saturday, August 30, 2008, 00:38
Hi Steve,

If you (or your colleague) are desperately curious about this, I'd start by looking for some kind of lock or transaction contention event with a timeout or re-try interval of around 30 seconds. Looking at the distribution on the graph, it's a pretty rare event so might be hard to track down, but that's the first thing that occurred to me.
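One rough way to check the "interval of around 30 seconds" idea is to group the slow response times into clusters and look at the spacing between the cluster centres. The sketch below is only an illustration: the sample times are invented to fall inside the two ranges mentioned in the post, and the 10-second gap threshold and function name are arbitrary assumptions.

```python
def cluster_by_gap(times_s, max_gap_s=10.0):
    """Group times into clusters, starting a new cluster whenever the
    gap to the previous (sorted) value exceeds max_gap_s."""
    clusters = []
    for t in sorted(times_s):
        if clusters and t - clusters[-1][-1] <= max_gap_s:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


# Hypothetical slow response times (seconds), not the real test data.
slow = [46.0, 50.0, 58.0, 76.0, 84.0, 90.0, 98.0]

clusters = cluster_by_gap(slow)
centers = [sum(c) / len(c) for c in clusters]
# Distance between neighbouring cluster centres.
spacings = [b - a for a, b in zip(centers, centers[1:])]

print("cluster centres:", [round(c, 1) for c in centers])
print("spacing between centres:", [round(s, 1) for s in spacings])
```

If the spacing on the real data sits consistently near one value, that would be a hint (not proof) that a fixed timeout or retry interval of about that length is involved.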

If there's any correlation between the occurrence of a slow transaction and heavy system load, that would also suggest the same kind of thing.

Just a thought!

Cheers,

Adrian.
