8000 feat: add Bitnami matcher by juan131 · Pull Request #2538 · anchore/grype · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

feat: add Bitnami matcher #2538

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft

Conversation

juan131
Copy link
@juan131 juan131 commented Mar 18, 2025

Summary

After anchore/vunnel#447, anchore/grype-db#217 and anchore/syft#3065 this PR attempts to use Bitmani vulndb data for matching package vulnerabilities.

Signed-off-by: juan131 <juan.ariza@broadcom.com>
Comment on lines 23 to 26
// Bitnami packages' metadata are built from the package URL (instead of CPE, hence CPE match is not supported)
// ref: https://github.com/anchore/syft/blob/main/syft/pkg/bitnami.go#L3-L13
// ref: https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/bitnami/package.go#L18-L45
return internal.MatchPackageByEcosystemAndCPEs(store, p, m.Type(), false)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willmurphyscode should we include some mechanisms to match based on the pURL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you want to do something besides CPEs here. CPEs are pretty bad, and it looks like we're not making up CPEs for the bitnami packages we find (which is probably a good thing).

Since Bitnami is neither a language ecosystem or a distro, you'll want to add a similar top level function similar to either

func MatchPackageByEcosystemPackageName(provider vulnerability.Provider, p pkg.Package, packageName string, matcherType match.MatcherType) ([]match.Match, []match.IgnoredMatch, error) {
or
func MatchPackageByDistro(provider vulnerability.Provider, p pkg.Package, upstreamMatcher match.MatcherType) ([]match.Match, []match.IgnoredMatch, error) {

But basically the pattern will be searching by package type + package name + package version (essentially a PURL).

I think this CPE searching code can be removed - we don't expect (or want) to use CPEs where we can avoid it.

@wagoodman might have some more information on building the actual matcher and search changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context @willmurphyscode!
FYI: I was assigned a top priority task so you should expect a couple of days delay on my work here. I'll probably resume it next Monday.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a 1st approach at b6e1858

juan131 added 2 commits March 27, 2025 16:46
Signed-off-by: juan131 <juan.ariza@broadcom.com>
Signed-off-by: juan131 <juan.ariza@broadcom.com>
@juan131 juan131 requested a review from willmurphyscode March 31, 2025 11:58
Copy link
Contributor
@willmurphyscode willmurphyscode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I think the big question I have is: Bitnami has its own go-version lib, and uses strings like 7.4.2-3 to indicate the third bitnami release of upstream version 7.4.2 (please correct me if I'm wrong). On the other hand, your OSV data has type SEMVER on its affected version ranges. But it seems to me that since you have -3 or whatever on the end of version ranges, you're using an ecosystem-specific matcher.

I was sort of expecting a bitnami-specific entry in the version comparison files at https://github.com/anchore/grype/tree/main/grype/version, and I realize I was sort of expecting the OSV data to have ranges of type ECOSYSTEM. Did I miss something here? Do we not need to pull in https://github.com/bitnami/go-version and use it?

Happy to sync up on this as well.

@@ -72,6 +72,8 @@ func KnownPackageSpecifierOverrides() []PackageSpecifierOverride {

// jenkins plugins are a special case since they are always considered to be within the java ecosystem
{Ecosystem: string(pkg.JenkinsPluginPkg), ReplacementEcosystem: ptr(string(pkg.JavaPkg))},
// Bitnami is a special case since it's not a language ecosystem but is a package type
{Ecosystem: "bitnami", ReplacementEcosystem: ptr(string(pkg.BitnamiPkg))},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't needed. Since pkg.Bitnami is the string "bitnami" anyway ( se https://github.com/anchore/syft/blob/main/syft/pkg/type.go#L16 ) this line is a no-op.

)

var typeOrder = map[Type]int{
ExactDirectMatch: 1,
ExactIndirectMatch: 2,
CPEMatch: 3,
PURLMatch: 4,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want a new match type. This is exact-direct since we are searching a vulnerability source we know should govern this package for this exact package. (Indirect is its own type because it signals that we searched via an upstream/source package name, because CPE is its own way of searching. But a PURL search is effectively an exact direct search in this case).


var ErrEmptyPURLMatch = errors.New("attempted PURL match against package with no PURL")

func MatchPackageByPURL(provider vulnerability.Provider, p pkg.Package, upstreamMatcher match.MatcherType) ([]match.Match, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would be simpler as a facade over

func MatchPackageByEcosystemPackageName(provider vulnerability.Provider, p pkg.Package, packageName string, matcherType match.MatcherType) ([]match.Match, []match.IgnoredMatch, error) {
?

It looks like its copying some things from the CPE matcher that we don't need, like the confidence of 0.9 and the special JVM version handling.

@@ -85,6 +85,8 @@ func FormatFromPkg(p pkg.Package) Format {
switch p.Type {
case syftPkg.ApkPkg:
return ApkFormat
case syftPkg.BitnamiPkg:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This surprised me. I'll write a longer comment later, but I thought Bitnami had custom versions, like 7.4.2-12, and would need a custom version object. Many ecosystems have a custom version object.

@@ -7,15 +7,15 @@ import (
"github.com/scylladb/go-set/strset"
)

type CPEPackageParameter struct {
type PackageParameter struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of the changes in this file go away if we're not making PURL its own match type or really its own search type, but using an ecosystem search. (Ecosystem search is in language.go because most ecosystems are languages. I thought that Bitnami would be more of an outlier in this regard, but it seems not.)

@willmurphyscode
Copy link
Contributor

Hi @juan131 it seems like I wrote a ton of words on here and maybe didn't help very much. I'll try to distill it a bit (sorry for reviewing super verbosely earlier):

  1. I think we should not add whole infrastructure around the PURL search, but should rather do basically a package name and ecosystem like the language matchers do. This will make the diff much smaller.
  2. I want to know whether we need a custom version comparator or not. Do you happen to know whether regular semver will correctly order versions with the bitnami-added number after a hyphen thing? Like 14.4.2-2 is less than 14.4.2-4?

Let me know if you'd like to sync up, or if I can do anything to clarify or help out here.

@juan131
Copy link
Author
juan131 commented Apr 21, 2025

Hi @willmurphyscode

Sorry for the lack of updates on this PR! Easter came and a big part of the team was on vacations so I had to focus on other stuff.

I think we should not add whole infrastructure around the PURL search, but should rather do basically a package name and ecosystem like the language matchers do. This will make the diff much smaller.

Great! Also versioning, right?

Do you happen to know whether regular semver will correctly order versions with the bitnami-added number after a hyphen thing?

As you mentioned, Bitnami does uses its own versioning. It's almost identical to semver with a small modification: pre-releases don't exist and instead they're considered revisions. It's explained in the link below:

Like 14.4.2-2 is less than 14.4.2-4?

Exactly! Although we use an r prefix on revisions, e.g: 14.4.2-r2. That's why we created that Go module so it's easier for other projects (such as grype) to compare versions.

@willmurphyscode
Copy link
Contributor

@juan131 Thanks for getting back to me!

I have two big questions regarding the versioning:

  1. In your OSV data, you use version type "SEMVER" instead of "ECOSYSTEM", but it sounds like custom version comparison is required beyond strict semver, so I'm surprised here. Is the OSV data encoding the right type of version range events?
  2. In the Grype matching, I was expecting to see your Go module for comparing these versions used. You can see examples of other customer comparators being implemented, for example here's one for PEP440-compliant Python package versions: https://github.com/anchore/grype/blob/main/grype/version/pep440_version.go . I was expecting to see this diff add bitnami_version.go in that same directory, and I wasn't sure why it didn't.

@juan131
Copy link
Author
juan131 commented Apr 23, 2025

Hi @willmurphyscode

As you pointed out, we do use "type": "SEMVER" on our vulndb OSV records and it's because those records follow Semantic Versioning strictly. However, we use our "particular" versioning (using revisions instead of pre-releases) in our SBOM (SPDX files included in Bitnami containers) (that's why we're using it on Syft, see https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/bitnami/package.go#L10)

This is sth we need to take into account so here at Grype. Does it make sense?

@juan131
Copy link
Author
juan131 commented Apr 23, 2025

There's an important caveat with having two versioning:

What happens when a CVE lists a last_affected version? For instance, take this CVE as reference that says last affected version is 1.5.0.

If Syft inspects an ArgoCD container and the SPDX shipped on it includes sth like this:

        {
            "SPDXID": "SPDXRef-argo-cd",
            "name": "argo-cd",
            "versionInfo": "1.5.0-1",
            "downloadLocation": "git+https://github.com/argoproj/argo-cd/#refs/tags/v1.5.0",
            "licenseConcluded": "Apache-2.0",
            "licenseDeclared": "Apache-2.0",
            "filesAnalyzed": false,
            "externalRefs": [
                {
                    "referenceCategory": "SECURITY",
                    "referenceType": "cpe23Type",
                    "referenceLocator": "cpe:2.3:*:linuxfoundation:argo_continuous_delivery:1.5.0:*:*:*:*:kubernetes:*:*"
                },
                {
                    "referenceCategory": "PACKAGE-MANAGER",
                    "referenceType": "purl",
                    "referenceLocator": "pkg:bitnami/argo-cd@1.5.0-1?arch=arm64&distro=debian-12"
                }
            ],
            "copyrightText": "NOASSERTION"
        },

The version would be 1.5.0-1 which, following our "particular" versioning that considers the -1 as a revision, means that 1.5.0-1 > 1.5.0. Hence, in theory, Grype could consider the image is not affected by that CVE since last affected version is older but I'm afraid that's not correct.

Why? That revision implies we created a new "Bitnami package revision" for ArgoCD and that new revision may change the bash logic we use during the container initialization or the flags we used to compile the ArgoCD binaries. However, that new package revision uses exactly the same upstream source code we used to create the previous one. As a consequence, given Bitnami Vulndb data is created based on the upstream project vulnerabilities, 1.5.0-1 is actually affected by the CVE.

@willmurphyscode
Copy link
Contributor

Why? That revision implies we created a new "Bitnami package revision" for ArgoCD and that new revision may change the bash logic we use during the container initialization or the flags we used to compile the ArgoCD binaries. However, that new package revision uses exactly the same upstream source code we used to create the previous one. As a consequence, given Bitnami Vulndb data is created based on the upstream project vulnerabilities, 1.5.0-1 is actually affected by the CVE.

Are you saying we should only use the semver component for version comparisons and drop everything after the hyphen? That is, nothing is every fixed by a hyphen, but is only ever fixed by pulling in more upstream code? I'm never going to see a fixed event with 15.1.1-3 when upstream's 15.1.1 is vulnerable?

Or maybe you're saying last_affected doesn't imply a fix (which it doesn't). Maybe last_affected should not emit a version constraint at all, until there's an actual fix event?

@juan131
Copy link
Author
juan131 commented Apr 23, 2025

Are you saying we should only use the semver component for version comparisons and drop everything after the hyphen? That is, nothing is every fixed by a hyphen, but is only ever fixed by pulling in more upstream code? I'm never going to see a fixed event with 15.1.1-3 when upstream's 15.1.1 is vulnerable?

Yes, I know it's weird but I think it's the way to go. In the unlikely case that a Bitnami revision (-X) fixes an upstream vulnerability (e.g. code is patched at compilation or configuration makes it unexploitable), then it's Bitnami responsibliy to emit the corresponding VEX data explaining the reasons why the CVE should be ignored. However, without that VEX data that says the opposite, Grype should assume that, if Bitnami vulndb says a version is affected, there's no unaffected revision.

@willmurphyscode
Copy link
Contributor

However, without that VEX data that says the opposite, Grype should assume that, if Bitnami vulndb says a version is affected, there's no unaffected revision.

Is that VEX data something we'll have access too? Should we be adding that to the Vunnel provider? Has that every been used? How often?

I know you all opened #1826 which is about supporting VEX documents. Is that why?

@juan131
Copy link
Author
juan131 commented Apr 28, 2025

Is that VEX data something we'll have access too? Has that ever been used? How often?

We are currently producing VEX data but don't expose it publicly. It's something we exclusively offer to our TAC customers.

I know you all opened #1826 which is about supporting VEX documents. Is that why?

Yes, it is important for us that scanners have support for VEX so our customers can consume the VEX data we produce via the scanners. (e.g. we can say "run Grype passing this VEX data via the xxx flag").

Should we be adding that to the Vunnel provider?

I don't think VEX data should be used by default to discard vulnerabilities (regardless we make it public in the future or not). It should be only applied if the user activates some flags acknowledging awareness about its use.

@willmurphyscode
Copy link
Contributor

@juan131 thanks for answering my questions about the version comparison. I think we still may need some custom version comparison logic in Grype for bitnami packages:

$ grype bitnami/redis:7.4.1
... 
NAME   INSTALLED  FIXED-IN              TYPE     VULNERABILITY         SEVERITY
redis  7.4.1-1    7.4.1, 7.2.8          bitnami  BIT-redis-2024-31227  Unknown

This looks incorrect to me, because it looks like redis 7.4.1-1 should include the fix from 7.4.1. Looking at the osv.dev data for the vuln it looks like it should be fixed.

So I think we need a custom version object that will look at the exression 7.4.1-1 < 7.4.1 and return false while the current one returns true.

I also think we don't need the new MatchPackageByEcosystemAndPURL, since a PURL already describes an ecosystem and we already have a function for searching by package name and ecosystem:

func MatchPackageByEcosystemPackageName(provider vulnerability.Provider, p pkg.Package, packageName string, matcherType match.MatcherType) ([]match.Match, []match.IgnoredMatch, error) {

So I think at a high level the two changes we need to get this in are:

  1. Remove the new MatchPackageByEcosystemAndPURL and the changes it implied, and try switching to MatchPackageByEcosystemPackageName (which has the same matching behavior on this branch)
  2. Add a version type for the bitnami version that can evaluate the "semver plus hyphen" that bitnami uses on one side against the strict semver the data has on the other. In other words https://github.com/anchore/grype/pull/2538/files#r2025394534 needs to be addressed even though the fix versions have semver.

Let me know if you have questions! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use the upstream Bitmani vulndb data for matching
2 participants
0