Refactor NVD node configuration parsing #546
Merged
This refactors the NVD node configuration parsing. The previous implementation, written when porting from the v5 schema to v6, did not consider non-application CPE parts when crafting new AffectedCPE entries for the DB. The structure of the code was based on what was there in v5 and was not a good fit to build on.
For instance, the original code transposed affected CPEs and platform CPEs before attempting to find vulnerable CPE candidates (e.g. AND(OR(app cpes), OR(platform cpes)) --> OR(AND(app cpe, platform cpe), ...)). This transposition is no longer needed.

To test the correctness of the functionality, the existing transform function tests were kept intact and updated to reflect the additional functionality that was missing. Additional tests were added to show how different transformer configs behave (e.g. with or without o and h CPE parts being considered) and to highlight when incompatible topologies are encountered.
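For readers unfamiliar with the old behavior, the transposition mentioned above amounts to a cross product of the two OR groups. The following is only a minimal sketch of that idea; the `pair` type and `transpose` function are hypothetical names chosen for illustration and are not taken from the actual code.

```go
package main

import "fmt"

// pair is a hypothetical type holding one (application, platform) CPE combination.
type pair struct {
	App      string
	Platform string
}

// transpose expands AND(OR(apps), OR(platforms)) into the flat list of
// AND(app, platform) pairs, i.e. the cross product of the two OR groups.
func transpose(apps, platforms []string) []pair {
	out := make([]pair, 0, len(apps)*len(platforms))
	for _, a := range apps {
		for _, p := range platforms {
			out = append(out, pair{App: a, Platform: p})
		}
	}
	return out
}

func main() {
	// Placeholder CPE strings, made up for this example.
	apps := []string{
		"cpe:2.3:a:example:app_one:*:*:*:*:*:*:*:*",
		"cpe:2.3:a:example:app_two:*:*:*:*:*:*:*:*",
	}
	platforms := []string{
		"cpe:2.3:o:example:some_os:*:*:*:*:*:*:*:*",
	}
	for _, p := range transpose(apps, platforms) {
		fmt.Printf("AND(%s, %s)\n", p.App, p.Platform)
	}
}
```

The number of pairs grows as the product of the two group sizes, which is one reason avoiding the transposition is attractive.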
We now also log warnings when dropping criteria from node configurations:
These nodes are dropped today, but silently. This PR adjusts this behavior to at least call out dropped criteria with a warning, similar to how we report CPEs that cannot be parsed. This lets us see in the logs when new cases crop up and decide whether the node configuration parser needs to account for more cases.
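As a rough illustration of the "warn instead of dropping silently" pattern, here is a minimal sketch. Everything in it is an assumption made for the example: the `criterion` type, the `filterCriteria` function, the `supported` predicate, and the message format are hypothetical and do not reflect the actual parser or its logger.

```go
package main

import (
	"fmt"
	"log"
)

// criterion is a hypothetical stand-in for one CPE match entry in a node configuration.
type criterion struct {
	CPE string
}

// filterCriteria keeps the criteria accepted by supported and reports every
// dropped one through warn, so that no drop goes unnoticed.
func filterCriteria(vulnID string, in []criterion, supported func(criterion) bool, warn func(msg string)) []criterion {
	var kept []criterion
	for _, c := range in {
		if supported(c) {
			kept = append(kept, c)
			continue
		}
		warn(fmt.Sprintf("%s: dropping unsupported criterion %q", vulnID, c.CPE))
	}
	return kept
}

func main() {
	in := []criterion{
		{CPE: "cpe:2.3:a:example:app:*:*:*:*:*:*:*:*"},
		{CPE: ""}, // assumed to be unsupported in this sketch
	}
	// The predicate below is an arbitrary placeholder rule.
	supported := func(c criterion) bool { return c.CPE != "" }
	warn := func(msg string) { log.Println("WARN", msg) }

	kept := filterCriteria("CVE-0000-0000", in, supported, warn)
	fmt.Println("kept criteria:", len(kept))
}
```

Passing the warning sink in as a function keeps the filtering logic testable without capturing log output.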
Another case that was not considered when porting from the v5 to the v6 schema is that all CPE fields should now be considered significant when deduplicating matched CPEs; with v5 only a small subset was supported. This PR now considers all fields except version and update when deduplicating CPEs. This will ultimately lead to a difference in the number of a part CPEs written to the DB compared to what we've written previously.

In terms of how many additional CPEs we're storing, how that affects the count of affected CPE / blob records, and the distributed DB size, here's the final breakdown:
select count(*) from affected_cpe_handles
select count(*) from blobs
select count(*) from cpes where part == "a"
select count(*) from cpes where part == "o"
select count(*) from cpes where part == "h"
vulnerability.db file size
tar.zst file size
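The deduplication rule described earlier (compare every CPE field except version and update) could be sketched roughly as follows. The `cpe` struct, its field names, and the `dedupe` function are hypothetical and chosen for this example; the sketch also assumes the first occurrence of each duplicate group is the one kept, which may differ from the real implementation.

```go
package main

import "fmt"

// cpe is a hypothetical, simplified representation of the CPE 2.3 attributes.
type cpe struct {
	Part, Vendor, Product, Version, Update    string
	Edition, Language, SoftwareEdition        string
	TargetSoftware, TargetHardware, Other     string
}

// dedupeKey returns a copy of c with the version and update fields blanked,
// so two CPEs differing only in those fields produce the same key.
func dedupeKey(c cpe) cpe {
	c.Version = ""
	c.Update = ""
	return c
}

// dedupe keeps the first CPE seen for each key and preserves input order.
func dedupe(in []cpe) []cpe {
	seen := make(map[cpe]struct{}, len(in))
	var out []cpe
	for _, c := range in {
		k := dedupeKey(c)
		if _, ok := seen[k]; ok {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, c)
	}
	return out
}

func main() {
	in := []cpe{
		{Part: "a", Vendor: "example", Product: "app", Version: "1.0"},
		{Part: "a", Vendor: "example", Product: "app", Version: "2.0"},                         // duplicate: only version differs
		{Part: "a", Vendor: "example", Product: "app", Version: "1.0", TargetSoftware: "node"}, // distinct: target software differs
	}
	fmt.Println("unique CPEs:", len(dedupe(in)))
}
```

Because the struct contains only string fields, it is comparable and can serve directly as a map key, which avoids building and parsing a string key.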