8000 fix(matchUpstreamMavenPackages): Only search Maven if the metadata POM artifact ID and group ID are missing. by tdunlap607 · Pull Request #2547 · anchore/grype · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

fix(matchUpstreamMavenPackages): Only search Maven if the metadata POM artifact ID and group ID are missing. #2547

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 17, 2025

Conversation

tdunlap607
Copy link
Contributor
@tdunlap607 tdunlap607 commented Mar 20, 2025

Summary

This PR addresses issue #2216: Should only check maven central if pom info is missing. The goal is to reduce the number of Maven searches, to help reduce the possibility of rate limits (Maven Error 403 Forbidden).

Fixes #2216

Changes

Updated matchUpstreamMavenPackages to first check for metadata.PomArtifactID and metadata.PomGroupId.

  • If both are available, Maven search is skipped, and the existing POM data is used for matching.
  • If not, the existing logic remains, fetching POM data from Maven before performing the match.
  • Additional debug logging to show when Maven searches are either skipped or performed.

Testing

  • Updated the TestMatcherJava_matchUpstreamMavenPackage unit test to account for the new logic. This included added data to grype/matcher/java/matcher_mocks_test.go
  • Added an integration test (TestMavenSearch_MatchUpstreamMavenPackages) comparing these updates to Grype v0.90.0, confirming that when POM data is available, we can skip Maven searches while maintaining the same results. An additional config test file was added under grype/matcher/java/test-fixtures/config/matcher_integration_test_config.yaml to enable Maven searches for the integration test.

Here’s an example output from the integration test:

=== RUN   TestMavenSearch_MatchUpstreamMavenPackages
    ./grype/grype/matcher/java/matcher_integration_test.go:727: 'go run ./cmd/grype' - Searching Maven count: 205, Skipping Maven count: 394
    ./grype/grype/matcher/java/matcher_integration_test.go:728: 'grype' - Searching Maven count: 0, Skipping Maven count: 0
    ./grype/grype/matcher/java/matcher_integration_test.go:731: 'go run ./cmd/grype solr@sha256:16983468366aaf62417bb6a2a4b703b486b199b8461192df131455071c263916' took: 2m3.969340291s
    ./grype/grype/matcher/java/matcher_integration_test.go:732: 'grype solr@sha256:16983468366aaf62417bb6a2a4b703b486b199b8461192df131455071c263916' took: 4m21.7019325s
    ./grype/grype/matcher/java/matcher_integration_test.go:735: 'go run ./cmd/grype' - Total vulnerabilities: 158
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     high: 41
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     critical: 8
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     unknown: 0
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     negligible: 8
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     low: 45
    ./grype/grype/matcher/java/matcher_integration_test.go:737:     medium: 56
    ./grype/grype/matcher/java/matcher_integration_test.go:740: 'grype' - Total vulnerabilities: 158
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     low: 45
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     medium: 56
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     high: 41
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     critical: 8
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     unknown: 0
    ./grype/grype/matcher/java/matcher_integration_test.go:742:     negligible: 8
    ./grype/grype/matcher/java/matcher_integration_test.go:749: 'grype' is slower by 2m17.732592209s
--- PASS: TestMavenSearch_MatchUpstreamMavenPackages (385.67s)
PASS
ok      github.com/anchore/grype/grype/matcher/java     386.360s

Happy to chat more or make additional changes! :)

cc @luhring

tdunlap607 and others added 3 commits March 18, 2025 14:34
…M artifact ID and group ID are missing.

Signed-off-by: tdunlap607 <trevor.dunlap@chainguard.dev>
…tion test demonstrating skipped searches

Signed-off-by: tdunlap607 <trevor.dunlap@chainguard.dev>
// The config file (./test-fixtures/config/matcher_integration_test_config.yaml) enables the maven search
// Additionally, increased the rate-limit to 500ms to avoid further 403 Forbiddens from Maven
// This integration test takes approximately 400s
func TestMavenSearch_MatchUpstreamMavenPackages(t *testing.T) {
Copy link
Contributor
@wagoodman wagoodman Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thi test is more like a CLI test (running a compiled version of grype) which isn't ideal. I think a small refactor would make this much easier to write a unit test for and the test code smaller/easier to reason about -- by decoupling the decision to search by sha or not into another function:

func (m *Matcher) shouldSearchMavenBySha(p pkg.Package) (bool, []string) {
	digests := []string{}
	
	if metadata, ok := p.Metadata.(pkg.JavaMetadata); ok {
		if metadata.PomArtifactID == "" || metadata.PomGroupID == "" {
			for _, digest := range metadata.ArchiveDigests {
				if digest.Algorithm == "sha1" {
					digests = append(digests, digest.Value)
				}
			}
		}
	}
	
	return len(digests) > 0, digests
}

or something like this. Now the test can assert the correctness of the decision without mocks or needing to test the running times between two configurations.

Can you refactor this approach or another that improves the testability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @wagoodman, thank you for the quick review!!

And yes, I agree, it was a bit messy for that integration test. I like your proposal for the refactor, I'll give that a go today. Thanks.

…determine if we need to search Maven

Signed-off-by: tdunlap607 <trevor.dunlap@chainguard.dev>
Signed-off-by: tdunlap607 <trevor.dunlap@chainguard.dev>
@tdunlap607
Copy link
Contributor Author

I've updated the PR to now include the following changes based on @wagoodman’s suggestion above:

  1. Introduced a new method: shouldSearchMavenBySha, determines whether Maven should be searched based on the presence of POM metadata. Doing this improved testability and isolation of if we should search Maven.
  2. The added method shouldSearchMavenBySha required a minor refactor to matchUpstreamMavenPackages to account for the new logic.
  3. Added a new unit test in matcher_test.go (TestMatcherJava_shouldSearchMavenBySha) to test the new method.
  4. Removed the unnecessary integration test I originally included in the PR (TestMavenSearch_MatchUpstreamMavenPackages).

}
// If we need to search Maven but no valid SHA-1 digests exist, return an error
if len(digests) == 0 {
return true, nil, fmt.Errorf("missing SHA-1 digest; cannot search Maven for package %s", p.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to bail early / return an error if we have no digest nor pom info. I think we should still let the default "skipping maven search" still occur. If that's true, then this function won't need an error returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that sounds good to me. I've updated the code :)

…MavenBySha

Signed-off-by: tdunlap607 <trevor.dunlap@chainguard.dev>
@tdunlap607
Copy link
Contributor Author

Hey @wagoodman, just a friendly check-in to see if any other changes are required for this PR. Thanks!

@spiffcs spiffcs enabled auto-merge (squash) April 17, 2025 16:27
@spiffcs
Copy link
Contributor
spiffcs commented Apr 17, 2025

Hey @wagoodman, just a friendly check-in to see if any other changes are required for this PR. Thanks!

Alex is out this week - Thanks for the changes @tdunlap607 this LGTM

I've run the internal checks and will enable auto merge in a second.

@spiffcs spiffcs merged commit 9f75891 into anchore:main Apr 17, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should only check maven central if pom info is missing
3 participants
0