Description
What would you like to be added:
I would like Syft to return the full license string, even when it exceeds 64 characters, instead of hashing it and returning `LicenseRef-<hash>`.
This could be done by:
- Returning the original license string alongside the hashed ID for better traceability
- Adding a flag that outputs the full license string regardless of length
Why is this needed:
Currently, Syft hashes license strings longer than 64 characters with SHA-256 and replaces them with:
`LicenseRef-<sha256 hash>`
This limits traceability during license scans and compliance checks. The redislabs/k8s-controller:7.8.2-6 image is a good example.
The following license strings were found in the RPM DB on this image using:
rpm -qa --qf '%{NAME}: %{LICENSE}\n'
glibc-minimal-langpack: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
glibc-common: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
glibc: LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL
Syft hashes the string and returns:
LicenseRef-cedbc2fa4301332b3d3569627696d986a63b3f3a293a2759a611c7c3deebd428
I verified this in Python:
import hashlib
print(hashlib.sha256(b"LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL").hexdigest())
cedbc2fa4301332b3d3569627696d986a63b3f3a293a2759a611c7c3deebd428
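Until the full string is available in the output, the same check can be turned into a reverse lookup. The following is a minimal sketch, not part of Syft: `build_lookup` is a hypothetical helper that takes lines in the `name: license` shape produced by the `rpm` query above and maps each hashed ID back to the license string that produced it, assuming the 64-character threshold and the `LicenseRef-` prefix described in this issue.

```python
import hashlib

# Assumption: prefix and threshold as described in this issue.
PREFIX = "LicenseRef-"
MAX_READABLE_LEN = 64


def build_lookup(rpm_lines):
    """Map LicenseRef-<sha256> IDs back to the license strings behind them."""
    lookup = {}
    for line in rpm_lines:
        # Each line looks like "name: license"; split on the first ": " only.
        _, sep, license_text = line.partition(": ")
        if not sep:
            continue  # skip lines that are not in the expected shape
        license_text = license_text.strip()
        raw = license_text.encode("utf-8")
        # Only strings longer than the threshold get hashed.
        if len(raw) > MAX_READABLE_LEN:
            lookup[PREFIX + hashlib.sha256(raw).hexdigest()] = license_text
    return lookup


if __name__ == "__main__":
    glibc_license = (
        "LGPLv2+ and LGPLv2+ with exceptions and GPLv2+ and GPLv2+ with "
        "exceptions and BSD and Inner-Net and ISC and Public Domain and GFDL"
    )
    lines = ["glibc: " + glibc_license, "glibc-common: " + glibc_license]
    for ref, text in build_lookup(lines).items():
        print(ref, "->", text)
```

This only helps when the original package database is still at hand, which is the traceability gap this request is about.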
Additional context:
The behaviour is defined here: https://github.com/anchore/syft/blob/main/syft/format/internal/spdxutil/helpers/license.go
Particularly:
if len(l.Value) <= 64 {
	// if the license text is less than the size of the hash,
	// just use it directly so the id is more readable
	candidate.ID = spdxlicense.LicenseRefPrefix + SanitizeElementID(l.Value)
} else {
	hash := sha256.Sum256([]byte(l.Value))
	candidate.ID = fmt.Sprintf("%s%x", spdxlicense.LicenseRefPrefix, hash)
}
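For anyone who wants to reproduce the ID outside of Syft, the branch above can be mirrored in Python. This is a rough sketch under stated assumptions: `sanitize_element_id` is a stand-in that replaces every character outside `[a-zA-Z0-9.-]` with `-`, which may not match Syft's actual `SanitizeElementID`, and the prefix is assumed to be `LicenseRef-` as seen in the output above.

```python
import hashlib
import re

# Assumption: value of spdxlicense.LicenseRefPrefix, taken from the output above.
LICENSE_REF_PREFIX = "LicenseRef-"

# Assumption: approximation of SanitizeElementID; the real rules may differ.
_UNSAFE_CHARS = re.compile(r"[^a-zA-Z0-9.-]")


def sanitize_element_id(value: str) -> str:
    """Replace characters assumed to be invalid in an SPDX ID with '-'."""
    return _UNSAFE_CHARS.sub("-", value)


def license_ref_id(value: str) -> str:
    """Mirror the short-vs-hashed branch quoted from license.go."""
    raw = value.encode("utf-8")
    # Go's len() on a string counts bytes, so compare the encoded length.
    if len(raw) <= 64:
        return LICENSE_REF_PREFIX + sanitize_element_id(value)
    # %x on the [32]byte digest yields 64 lowercase hex characters.
    return LICENSE_REF_PREFIX + hashlib.sha256(raw).hexdigest()


if __name__ == "__main__":
    print(license_ref_id("GPLv2+ with exceptions"))
    print(license_ref_id("LGPLv2+ and " * 10))
```

Short strings stay readable after sanitizing, while anything over 64 bytes collapses to an opaque digest, which is where the original text is lost.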
Environment:
Syft version:
syft 1.20.0