Let's talk about metrics...

Today I read an article on metrics and it was interesting. Here's the link to the original article.

I am kind of a metrics geek. When done well, a metrics program can be of extreme value to a security program. When done badly, however, it can cloud your vision and make it difficult to notice that your radar is off by a few degrees. The article addressed several key areas of a security metrics program that I would like to augment. Let's take the proposed metrics from the article and see what we can do with them.

Metrics are hard. There are some key characteristics that we need to understand to determine if a metric is useful for us.

Base Metrics vs. Compound Metrics

Base metrics are derived directly from the data you have gathered. They are raw cardinal numbers that you can apply formulas to. “200 new vulnerabilities found in our core infrastructure by our vulnerability management tool” is a base metric. Compound metrics are the result of a formula. “A 20% increase of vulnerabilities in our core infrastructure” is a simple example of a compound metric: we've found 200 new vulnerabilities and we knew there were already 1,000 from a previous data gathering effort. The formula used for this would be "new vulnerabilities"/("existing vulnerabilities"/100), which gives 200/(1,000/100) = 20. All too often we find ourselves getting lost in compound metrics before we have a good grasp of base metrics. We also make the mistake of assuming that base metrics are boring and unimportant. This is not true. A good metrics program is built on a strong definition of base metrics.
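To make the distinction concrete, here is a minimal Python sketch of the example above. The function and variable names are my own for illustration; the numbers are the ones from the paragraph.

```python
def percent_increase(new_count: int, existing_count: int) -> float:
    """Compound metric: new items as a percentage of the existing baseline."""
    if existing_count <= 0:
        raise ValueError("existing_count must be positive")
    # Same formula as in the text: new / (existing / 100)
    return new_count / (existing_count / 100)


# Base metrics: raw counts taken straight from the data source.
new_vulnerabilities = 200
existing_vulnerabilities = 1000

# Compound metric: derived from the two base metrics.
increase = percent_increase(new_vulnerabilities, existing_vulnerabilities)
print(f"{increase:.0f}% increase")  # 20% increase
```

Note that the compound metric cannot be computed, or trusted, unless both base metrics are well defined first.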

Key Characteristics

Before incorporating a metric into a metrics program, you want to know a few things about it.

First we need to see the difference between cardinal and ordinal numbers. 5, representing the number of ports open on a server, is a cardinal number. However, 5 on a scale of 1 to 5 representing a level of risk is an ordinal number. This is important because you should never apply a formula to an ordinal number: the distance between a 2 and a 3 is not necessarily the same as the distance between a 4 and a 5. Unless you want to express something qualitatively, ordinals should not be considered as base metrics. You can still use ordinal numbers in your reporting. For example, you can define that patch adoption between 75% and 85% is a 4 on a scale of 1 (bad) to 5 (good). This will make your metric easier to digest for your audience.
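As a rough sketch of that last point, the mapping from a cardinal percentage to an ordinal score for reporting could look like this in Python. Only the 75% to 85% band for a score of 4 comes from the example above; the other cut-offs, and the choice to count exactly 85% as a 5, are assumptions made purely for illustration.

```python
import bisect

# Lower bounds (in percent) for scores 2, 3, 4 and 5.
# Only the 75-85 band for a score of 4 is from the text;
# 25 and 50 are hypothetical placeholders.
THRESHOLDS = [25.0, 50.0, 75.0, 85.0]


def patch_adoption_score(adoption_percent: float) -> int:
    """Map a patch adoption percentage to an ordinal score from 1 (bad) to 5 (good)."""
    if not 0.0 <= adoption_percent <= 100.0:
        raise ValueError("adoption_percent must be between 0 and 100")
    # bisect_right counts how many lower bounds the value has reached.
    return bisect.bisect_right(THRESHOLDS, adoption_percent) + 1


print(patch_adoption_score(80.0))  # 4
```

The score is for presentation only. Any averaging or trending should still be done on the underlying percentage, not on the 1 to 5 value.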

You also want to know your data source. This determines ownership of the source data, which is important information to maintain, especially within a larger organization. You don't want to suddenly lose a data source on which several of your metrics rely and have no clue where it came from or where it went.

Another key characteristic is periodicity. You want to know at what frequency you will be able to gather or receive the data. Continuously? Daily? Weekly? Monthly? Quarterly? …

This is important because a lot of the reporting you do with your metrics will be used to identify trends. If the base metrics and your reporting do not have similar periodicities, your trends may have very little meaning. Continuing with our vulnerability management example, this means that if you can only gather "new vulnerabilities" every 2 weeks it does not make sense to report on "vulnerability increase" weekly.
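One way to guard against this mismatch is a simple sanity check when a report is defined. The sketch below is an illustration under my own naming, not a feature of any particular tool: it compares the reporting interval with the slowest gathering interval among the base metrics involved.

```python
from datetime import timedelta


def reporting_period_is_meaningful(gathering_periods, reporting_period):
    """Return True if no base metric is gathered less often than we report."""
    # The slowest base metric limits how often a trend can change.
    return reporting_period >= max(gathering_periods)


# "New vulnerabilities" can only be gathered every 2 weeks.
gathering = [timedelta(weeks=2)]

print(reporting_period_is_meaningful(gathering, timedelta(weeks=1)))  # False
print(reporting_period_is_meaningful(gathering, timedelta(weeks=2)))  # True
```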

Now that we understand this, let's run with some of the metrics from the article and see if we can reasonably construct them. We're in for a joyride of sorts.

Average Time to Detect

(page 1)

Type: Compound Metric, preferably expressed in hours.
Base Metrics: Time of occurrence (ToO), time of detection (ToD), # of incidents
Formula: sum of (ToD-ToO) of all incidents / # of incidents
Source: Your preferred ticketing system
Gathering method: manual at first, but this could be automated.
Periodicity: I'd consider monthly but depending on your incident load, you could go for weekly as well.
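A minimal Python sketch of this formula follows. It assumes the ticketing system can export, per incident, a time of occurrence and a time of detection; the sample timestamps are made up for illustration.

```python
from datetime import datetime


def average_time_to_detect_hours(incidents):
    """incidents: list of (time_of_occurrence, time_of_detection) pairs."""
    if not incidents:
        raise ValueError("no incidents in this reporting period")
    # sum of (ToD - ToO) over all incidents, converted to hours
    total_seconds = sum((tod - too).total_seconds() for too, tod in incidents)
    return total_seconds / 3600 / len(incidents)


# Hypothetical incidents for one reporting period.
incidents = [
    (datetime(2024, 1, 3, 8, 0), datetime(2024, 1, 3, 14, 0)),    # 6 hours
    (datetime(2024, 1, 10, 22, 0), datetime(2024, 1, 11, 4, 0)),  # 6 hours
    (datetime(2024, 1, 20, 9, 0), datetime(2024, 1, 21, 9, 0)),   # 24 hours
]

print(average_time_to_detect_hours(incidents))  # 12.0
```

The empty-period check matters in practice: a month without incidents has no average, and reporting it as 0 hours would be misleading.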

Average Time to Respond

(page 1)

Type: Compound Metric, preferably expressed in hours.
Base Metrics: Time of occurrence (ToO), time of response (ToR), # of incidents
Formula: sum of (ToR-ToO) of all incidents / # of incidents
Source: Your preferred ticketing system
Gathering method: manual at first, but this could be automated.
Periodicity: I'd consider monthly but depending on your incident load, you could go for weekly as well.

You already see that for one metric, we often need 3 base metrics. In this case they come from the same source but that is definitely not always the case. Let's continue.

On page 3 of the article we find an interesting reference to a metric that is described as “critical defects fixed against those reported”. My management would not be happy with such a name so I'll call it DFR (Defect Fix Ratio). The formula I derive from the article is ("# of critical defects reported" - "# of critical defects fixed"). The best way to look at the problems with this metric is through the following table. We're assuming a monthly periodicity:

Month	Defects reported	Defects fixed	DFR	Defects open
January	7	2	5	5
February	10	5	5	10
March	10	3	7	17
April	5	5	0	17
May	3	0	3	20
June	2	5	-3	17

Do you see where I am going with this? The proposed metric would trend down because it ignores the accumulating defects. If you take into account the accumulation of defects that builds up because you're faster at detecting issues than you are at fixing them (as represented in the last column), you can set a clear KPI against "Defects Open" that tells you when your development team is swamped with defects to fix.
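The "Defects Open" column is simply a running total of the monthly DFR. Here is a short Python sketch using the reported and fixed counts from the table; the KPI threshold of 15 is a hypothetical value chosen only to show the idea.

```python
from itertools import accumulate

months = ["January", "February", "March", "April", "May", "June"]
reported = [7, 10, 10, 5, 3, 2]
fixed = [2, 5, 3, 5, 0, 5]

# DFR per month: reported minus fixed, as derived from the article.
dfr = [r - f for r, f in zip(reported, fixed)]

# Defects open: running total of the monthly DFR values.
defects_open = list(accumulate(dfr))

# Hypothetical KPI: flag any month that ends with more than 15 open defects.
KPI_MAX_OPEN = 15
over_threshold = [m for m, n in zip(months, defects_open) if n > KPI_MAX_OPEN]

print(over_threshold)  # ['March', 'April', 'May', 'June']
```

The DFR looks healthy in April and June, yet the backlog stays above the threshold in both months, which is exactly what the per-month number hides.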

Another interesting proposed metric is called "Fully Revealed Incident Rate" … I needed to ponder this one for a few minutes, but I cannot come up with a reasonable way to describe it, or to gather it for that matter. “Fully Revealed” is a binary state: either an incident is or it isn't. If our source were a ticketing system, we could compare closed vs. open tickets related to security incidents. But that doesn't seem to be what "Fully Revealed" means in this context. I understand that the opposite would be “not fully revealed” and that the state of the ticket would be closed, just like for a fully revealed incident, but there would not be a clear indicator of its state. This would require manual work to analyze each incident ticket on its own, and in my experience this does not produce very reliable metrics.

It might be that the metric as described represents the number of abandoned investigations (closed before root cause was determined). If your ticketing system has a field that indicates whether the investigation was fully completed, this would make sense. And that's another important point! Sometimes you will need to modify data sources to fit your data needs. This again requires time and effort from the organization. Make sure that you identify what you need and build relationships with the data owners that can deliver that data.

We'll end with another relatively well-described metric, found on page 9 of the article:

Percentage Of Security Incidents Detected By An Automated Control

Type: Compound metric, expressed as a percentage.
Base Metrics: # of incidents detected, # of incidents detected by an automated tool
Assumption: our data source allows us to easily tell the difference (e.g. there's a source field of some sort)
Formula: # of incidents detected by an automated tool / (# of incidents detected/100)
Source: SIEM, Incident Management Tool, Ticketing System, …
Gathering Method: preferably automatic
Periodicity: Monthly
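As a sketch of how this could be computed in Python: the "detection_source" field and its "automated" and "manual" values are hypothetical stand-ins for whatever source field your tool provides, and the sample incidents are made up.

```python
def percent_detected_by_automation(incidents):
    """incidents: list of dicts with a (hypothetical) 'detection_source' field."""
    total = len(incidents)
    if total == 0:
        raise ValueError("no incidents in this reporting period")
    automated = sum(1 for i in incidents if i["detection_source"] == "automated")
    # Algebraically the same as: automated / (total / 100)
    return 100 * automated / total


# Hypothetical incidents for one month.
incidents = [
    {"id": 1, "detection_source": "automated"},
    {"id": 2, "detection_source": "manual"},
    {"id": 3, "detection_source": "automated"},
    {"id": 4, "detection_source": "automated"},
]

print(percent_detected_by_automation(incidents))  # 75.0
```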

In closing, I'm the first to admit that I make a lot of assumptions around the proposed metrics. A metrics program is a very customized effort for every organization but there are some key points around the construction of metrics that I believe deserve attention and that I hope to have illustrated with this blog post.

Of course, if you are interested in a deeper debate on how to build an efficient security metrics program, I'm always up for a good conversation.

[Thanks to my awesome colleagues Guillaume Ross and Maranda Cigna for reviewing this post and making it better.]

[note: the first 2 metrics described were on page 1 of the original article. When revisiting the article after writing this blog post, the described metrics had disappeared. I am not sure if this is a glitch in the publishing platform or if they were removed for other reasons. I still think they are interesting to illustrate as they show the dependency of a compound metric on base metrics.]