Fixing vulnerability data quality requires fixing the architecture first
In this Help Net Security interview, Art Manion, Deputy Director at Tharros, examines why vulnerability data across repositories stays inconsistent and hard to trust.
The problem starts with systems that were not designed to collect or manage that data well. They introduce the idea of Minimum Viable Vulnerability Enumeration (MVVE), a minimal set of assertions needed to confirm that two systems describe the same vulnerability, and find that no true minimum exists: the necessary assertions vary by case and change over time. They argue that before writing new specifications or building new tools, the community needs shared terms and principles, and that metrics like CVSS scores often distract from the harder work of assessing actual risk in context.

When two repositories disagree about whether a patch fixes a vulnerability, is that a data quality problem, a governance problem, or a definitional problem? And does the distinction matter for how we fix it?
This is likely all three problems, in some combination, and with some degree of overlap. The set of vulnerable and fixed software products and versions may be inaccurate or incomplete. Governance may not adequately detect and resolve such inaccuracies. Definitions, vocabulary, and grammar are neither strict enough nor shared widely enough. One of the principles we’re proposing is that vulnerability record quality is an architecture problem before it is a data problem. We should not expect high-quality data from a system that is not designed to collect, manage, and convey it in the first place.
Producers and consumers of vulnerability information have a variety of skills, experience, and knowledge. Perhaps more importantly, both producers and consumers have unequal access to information, and both the information and access to it change over time. Another principle is that vulnerability records must be managed over time. The system must accommodate, and even encourage, change so that it adapts and evolves with our understanding. We must accept that incomplete information and legitimate disagreement are permanent features of the landscape and manage records accordingly.
What is the minimal set of assertions that would allow two independent systems to confirm they are talking about the same vulnerability, without either system having to trust the other’s authority?
We set out to define this minimal set (MVVE) and found that there likely is no such minimum. There are shared elements, such as specifying an affected software product, identifying the conditions under which exploitation would succeed, and naming one or more compromised security properties. Beyond that, the number and type of assertions needed to deduplicate and disambiguate a vulnerability varies.
We expect a variable set of assertions, over time, both within and across repositories. But sorting out lists and types of assertions is secondary. As we’ve been researching the problems with vulnerability data, we found we first need a better foundation of shared terms and concepts, then we can construct conventions for proper assertions.
If you strip away the severity scores, the advisory prose, the CWE assignments, and the affected product lists, what is left? Is what remains sufficient to anchor cross-repository deduplication, or does stripping it down expose a void?
There is no doubt that our existing record formats have opportunities for improvement. We did not start out with “What do we have?” but instead asked “What do we need?” This led to “What vulnerability management tasks and decisions do vulnerability records support?” One of the most critical initial tasks is identification of the affected software products. Recent research showed that “Naming inconsistencies were identified in 50.18% of vendor names used in CPEs within the official NVD database.” If we can’t accurately identify affected products, the rest doesn’t matter. (CPE isn’t unique in this failure mode; other software identification systems suffer similarly.)
One challenge in measuring the quality of something like a CPE assertion is that it may look perfectly reasonable to a human yet still fail to properly identify the software, especially in terms of automation and machine usability. Reliably matching identifiers to the software products they are meant to name requires significant manual effort. Reducing or even eliminating that manual effort starts in the architecture and design of the identification system and its assertions.
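The identification failure Manion describes can be illustrated with a small sketch. The CPE 2.3 strings below are hypothetical, constructed for illustration rather than taken from the NVD; a naive exact-match deduplicator treats cosmetic spelling differences in vendor and product names as entirely different products:

```python
def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its named components."""
    fields = ["part", "vendor", "product", "version", "update", "edition",
              "language", "sw_edition", "target_sw", "target_hw", "other"]
    parts = cpe.split(":")[2:]  # drop the "cpe" and "2.3" prefix fields
    return dict(zip(fields, parts))

def same_product(cpe_a: str, cpe_b: str) -> bool:
    """Naive deduplication: exact string match on vendor and product."""
    a, b = parse_cpe23(cpe_a), parse_cpe23(cpe_b)
    return a["vendor"] == b["vendor"] and a["product"] == b["product"]

# Two records a human would read as the same product (hypothetical entries):
record_1 = "cpe:2.3:a:examplecorp:widget_server:2.4.1:*:*:*:*:*:*:*"
record_2 = "cpe:2.3:a:example_corp:widget-server:2.4.1:*:*:*:*:*:*:*"

print(same_product(record_1, record_1))  # True
print(same_product(record_1, record_2))  # False: spellings differ cosmetically
```

A human reviewer spots the equivalence instantly; the machine does not, which is exactly the gap between "looks reasonable" and "machine-usable" that the interview points to.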
Once we have designed the system to capture the right level of detail, we then need to focus on our ability to trust the information. Another principle we will be discussing: “Every assertion in a record must be able to answer for itself.” So when we record assertions, they need to be simple, precise, observable, useful, and include provenance. The new vulnerability record is a collection of assertions, growing (and changing) over time, bound to a vulnerability identifier, and machine-usable. These assertions describe the vulnerability, enabling identification, deduplication, and vulnerability management, and they need to be independently verifiable and refutable.
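The record shape described here, an identifier bound to a growing collection of provenance-bearing assertions, could be sketched roughly as follows. The field names and the example identifier are my assumptions for illustration, not a proposed specification:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Assertion:
    """One simple, observable claim about a vulnerability, with provenance."""
    subject: str           # what the claim concerns, e.g. "affected_product"
    value: str             # the claim itself
    source: str            # who asserted it (provenance)
    observed_at: datetime  # when it was asserted
    evidence: str = ""     # how it can be independently verified or refuted

@dataclass
class VulnerabilityRecord:
    """A vulnerability identifier bound to a growing set of assertions."""
    vuln_id: str
    assertions: list[Assertion] = field(default_factory=list)

    def record_assertion(self, a: Assertion) -> None:
        # Assertions accumulate over time; nothing is silently overwritten,
        # so disagreement and change remain visible in the record.
        self.assertions.append(a)

record = VulnerabilityRecord("VULN-2025-0001")  # hypothetical identifier
record.record_assertion(Assertion(
    subject="affected_product",
    value="examplecorp widget_server < 2.4.2",
    source="vendor_advisory",
    observed_at=datetime(2025, 3, 1, tzinfo=timezone.utc),
    evidence="https://example.com/advisory/123",
))
print(len(record.assertions))  # 1
```

Appending rather than replacing assertions reflects the earlier principle that records must be managed over time: a later, conflicting assertion sits alongside the original with its own provenance instead of erasing it.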
Metrics-driven incentives, whether response time, coverage counts, or CVSS throughput, have a distorting effect on record quality. What specific distortions do you observe most frequently, and are any of them invisible to the people producing them?
Quantity is not quality. Not to say that coverage and counts aren’t useful, but for example, having a CVSS base score or a CWE ID in a vulnerability record doesn’t mean that information is accurate, precise, or even useful. There’s a tendency to focus on the things you are measuring, simply because you are measuring them. The measurements themselves can become the goal.
Consider CVSS. Vulnerability repositories typically provide CVSS Base scores and vectors, which are intended to convey proximate technical severity. On the surface, there’s nothing wrong with this. Repositories cannot easily provide consumers with the local, context-dependent information needed to assess risk. Consumers need to supply this context, which is probably more important than proximate technical severity. But attention spent on relatively inexpensive CVSS Base scores (“watch out, 9.8!”) distracts from the work of determining context and more holistically assessing risk. Counting vulnerability records with CVSS scores or measuring distributions of CVSS Base scores is possible, but does it help?
Different language games, definitions, and overly abstract metrics also lead to distortion. How can two different but similarly qualified analysts looking at the same vulnerability information come up with different CWE IDs or CVSS Attack Complexity vectors? Many of the common assertions we use today fail the tests of atomicity and observability. Disagreements naturally occur, and metrics based on these assertions will be distorted.
The security community has a long history of producing elegant specifications that get adopted selectively and implemented inconsistently. What is different about this proposal that would prevent that outcome?
We can’t guarantee the outcome, but the current state of the practice is untenable. We’re not about to draft a new specification, elegant or otherwise. A new format, specification, or a handful of fields aren’t going to resolve the foundational and philosophical problems we are facing. The premise of our work is that we first need to develop a set of principles and requirements with which to design and build better vulnerability repositories. Maybe then it will be time to write a specification, build a new repository, or make changes to existing repositories. But first we need a solid foundation.