Had a really good conversation about metrics the other day. We’ve been discussing ways to express how our systems are performing, delivering value, and staying available – and I’d like to use the same general structure for all systems, regardless of function (transactions, integrations, analytics) or platform (Wintel, AS/400, Open Systems). For each type of metric, we need to understand two dimensions:
- Performance against some target: This can be either a baseline (a minimum or average expected score) or a threshold (a level above which the system loses efficiency); we may want to define both.
- Trend over time: We may need to show sustained improvement, or use trends to predict when a threshold will be reached and plan accordingly.
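To make the two dimensions concrete, here is a rough sketch of how a reading might be checked against a target and how a trend might be used to guess when a threshold will be reached. Everything named here (`Target`, `check`, `periods_until`) is made up for illustration, and the straight-line projection is only one simple assumption about how a trend behaves; real growth is often lumpier.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Target:
    """Hypothetical container for the two kinds of target."""
    baseline: Optional[float] = None   # minimum (or average) expected score
    threshold: Optional[float] = None  # level above which efficiency suffers


def check(value: float, target: Target) -> dict:
    """Compare one reading against whichever targets are defined."""
    return {
        "meets_baseline": None if target.baseline is None else value >= target.baseline,
        "under_threshold": None if target.threshold is None else value <= target.threshold,
    }


def periods_until(readings: Sequence[float], threshold: float) -> Optional[float]:
    """Fit a least-squares line to evenly spaced readings and estimate how many
    periods after the last reading that line reaches the threshold.
    Returns None when the trend is flat or moving away from the threshold."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (n - 1))


# Assumed sample data: disk usage (%) over four months, 80% expansion threshold.
disk_used = [50, 55, 60, 65]
print(check(disk_used[-1], Target(threshold=80)))
print(periods_until(disk_used, 80))
```

With these sample numbers the disk is still under its threshold today, and the trend line gives an estimate of how many months are left to plan the next expansion.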
We defined four fundamental types of metrics we want to understand and monitor for all systems:
- Availability: I need to be able to access and use the system; expectations vary (24×7 vs “9-to-5”), but I should be able to quantify readiness. This metric requires a baseline, such as “98% uptime”.
- Capacity: Expressed in growth rates, but can also refer to a system’s ability to work on some volume of tasks at any one time. Here, we might set a minimum baseline when first implementing the system, and a threshold to monitor before the next expansion (think disk space or memory).
- Performance: Think speed, especially response time; how fast can I get stuff done? Here, we should set a baseline expectation; application response time for a transactional system might be quite different from that of an analytics system.
- Thruput: How much work gets processed over time; how much can I get done in a given period? A typical baseline would be a target rate (for example, project tasks completed per month).
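As a quick illustration of why expectations matter for the Availability baseline, the sketch below measures the same outage against a 24×7 schedule and a "9-to-5" schedule. The function name, the hours, and the assumption that all downtime falls inside scheduled hours are mine, not part of any standard.

```python
def availability_pct(scheduled_hours: float, downtime_hours: float) -> float:
    """Percentage of scheduled time the system was usable.
    Assumes every hour of downtime fell inside the scheduled window."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    return 100.0 * (scheduled_hours - downtime_hours) / scheduled_hours


BASELINE = 98.0          # the "98% uptime" example baseline
DOWNTIME = 4.0           # assumed: four hours of outage in the month

# Assumed schedules: a 30-day month around the clock vs. 22 working days of 8 hours.
schedules = {"24x7": 30 * 24, "9-to-5": 22 * 8}

for name, hours in schedules.items():
    pct = availability_pct(hours, DOWNTIME)
    status = "meets" if pct >= BASELINE else "misses"
    print(f"{name}: {pct:.2f}% ({status} the {BASELINE}% baseline)")
```

The same four hours of downtime clears the baseline on the 24×7 schedule but misses it on the 9-to-5 one, which is why the expected hours have to be agreed before the number means anything.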
As we reviewed this framework, it became apparent that we could use it for people and groups as well as systems. Some examples:
- Availability might track the mundane, like attendance – but why not track skills development and training (our “availability” to work on different types of technology)?
- Capacity measures total time available for projects (versus system maintenance). Process improvement reduces the time required for maintenance, which frees up more time for projects.
- Performance could include qualitative assessments based on peer reviews, surveys, etc.
- Common Thruput measures include service desk calls answered, problem tickets resolved, projects completed, and programs written.
The next challenge will be to define metrics that can fit into this framework. More to follow …