feat: add Debian archive (.deb) file cataloger by popey · Pull Request #3704 · anchore/syft · GitHub

feat: add Debian archive (.deb) file cataloger #3704

Merged: 7 commits, Mar 19, 2025

popey (Contributor) commented Mar 3, 2025

Add a cataloger that parses Debian package (.deb) archive files directly, allowing Syft to discover packages from .deb files without requiring them to be installed on the system.

Closes #3315

Key features:

  • Parse .deb AR archives to extract package metadata
  • Support for gzip, xz, and zstd compressed control files
  • Extract package metadata from control files
  • Process file information from md5sums files
  • Mark configuration files from conffiles entries
  • Parse licenses from data files
  • Handle trailing slashes in archive member names
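To make the first few points concrete, here is a minimal Python sketch of the general technique. It is not the Go code from this PR: the function names (`iter_ar_members`, `read_control_fields`, `parse_control`) are hypothetical, members are read fully into memory for brevity, and only the compression formats that Python's `tarfile` auto-detects are covered, so zstd is not handled here.

```python
import io
import posixpath
import tarfile

AR_MAGIC = b"!<arch>\n"
AR_HEADER_SIZE = 60  # fixed-size ar member header


def iter_ar_members(stream):
    """Yield (name, data) for each member of an ar archive (binary stream)."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(AR_HEADER_SIZE)
        if len(header) < AR_HEADER_SIZE:
            return  # end of archive
        if header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # The name field is 16 bytes, space padded; some tools append a
        # trailing "/" to the member name, so strip that as well.
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        data = stream.read(size)
        if size % 2:
            stream.read(1)  # member data is padded to an even length
        yield name, data


def parse_control(text):
    """Parse 'Key: value' fields, folding indented continuation lines."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and key is not None:
            fields[key] += "\n" + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def read_control_fields(deb_stream):
    """Return the control fields of a .deb read from a binary stream."""
    for name, data in iter_ar_members(deb_stream):
        if name.startswith("control.tar"):
            # mode "r:*" auto-detects gzip, bzip2, and xz compression.
            with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
                for member in tar:
                    if posixpath.basename(member.name) == "control":
                        raw = tar.extractfile(member).read()
                        return parse_control(raw.decode("utf-8"))
    raise ValueError("no control file found")
```

Calling `read_control_fields(open("some.deb", "rb"))` (the file name is a placeholder) would return a dict with keys such as `Package` and `Version`, assuming the archive follows the usual layout of `debian-binary`, `control.tar.*`, and `data.tar.*` members.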

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

popey added 2 commits March 3, 2025 14:29
Add a cataloger that parses Debian package (.deb) archive files directly,
allowing Syft to discover packages from .deb files without requiring
them to be installed on the system. This implements issue #3315.

Key features:
- Parse .deb AR archives to extract package metadata
- Support for gzip, xz, and zstd compressed control files
- Extract package metadata from control files
- Process file information from md5sums files
- Mark configuration files from conffiles entries
- Handle trailing slashes in archive member names

Signed-off-by: Alan Pope <alan.pope@anchore.com>
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
@github-actions github-actions bot added the json-schema (Changes the json schema) label Mar 19, 2025
wagoodman (Contributor) left a comment

I made a few changes, but overall looks great!

Changes made:

  • Added a specific DPKG archive metadata type. Although it is currently identical to the DB entry type, keeping them separate lets us change one independently of the other. This required a JSON schema update.
  • Added license extraction from the embedded data.tar.* file.
  • Removed the binary test fixture and used the common container image + pkgtest test harness instead.
  • Removed io.ReadAll() uses and instead plumbed readers through to avoid allocating large chunks of memory.
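The reader-passing idea in the last point is not specific to Go. As a rough illustration only (a Python sketch, not the code in this PR), the functions below consume md5sums-style and conffiles-style input line by line from readers instead of loading whole files first. The names `iter_md5sums` and `file_records` and the record keys are hypothetical; the only assumptions about the formats are that md5sums lines look like `<digest>  <relative path>` and that conffiles lists one absolute path per line.

```python
import io


def iter_md5sums(reader):
    """Lazily yield (absolute_path, digest) from a text reader of md5sums lines."""
    for line in reader:
        parts = line.rstrip("\n").split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, path = parts
        # md5sums paths are relative to the filesystem root; make them absolute.
        yield "/" + path.lstrip("/"), digest


def file_records(md5sums_reader, conffiles_reader):
    """Yield one record per file, flagging those listed in conffiles."""
    # The conffiles list is typically small, so holding it in a set is cheap.
    conffiles = {line.strip() for line in conffiles_reader if line.strip()}
    for path, digest in iter_md5sums(md5sums_reader):
        yield {"path": path, "md5": digest, "is_config": path in conffiles}
```

Because both functions are generators over the reader, memory use stays proportional to one line plus the conffiles set, regardless of how many files the package ships.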

@wagoodman wagoodman enabled auto-merge (squash) March 19, 2025 19:40
@wagoodman wagoodman merged commit 5fa8e9c into main Mar 19, 2025
13 checks passed
@wagoodman wagoodman deleted the debian-catalog-work branch March 19, 2025 20:03
Labels: json-schema (Changes the json schema)
Linked issue: Catalog deb archives directly
2 participants