Howto add additional metadata that can be collected by reuse spdx #105

Open
choeppler opened this issue Apr 10, 2025 · 5 comments

@choeppler

What would I like to achieve?

I'm in the process of transitioning some templates to the REUSE standard and am wondering how to document the origin of third-party code. For example, given the following snippet, license and copyright information can be conveyed as specified by the REUSE standard and can be extracted by calling reuse spdx:

    // SPDX-SnippetBegin
    // The following function is from Awesome Project V 0.9
    //   (https://github.com/awesome/project/tree/v0.9)
    // SPDX-SnippetCopyrightText: 2008, 2011 John McMaster
    // SPDX-SnippetCopyrightText: 2012-2014 Awesome Inc., Other Ltd.
    // SPDX-License-Identifier: LicenseRef-MIT-AwesomeProject
    namespace awesome {
        void sayHello() {
            std::cout << "Hello, awesome world!" << std::endl;
        }
    }
    // SPDX-SnippetEnd

However, I don't know how the original location (i.e., lines 2 and 3 of the example above) should be specified under REUSE (e.g., as a purl, download location, commit hash, file comment, ...).

Possible Solution

I understand that the REUSE standard is focused on license and copyright information and that we probably do not want to formally specify how to deal with the use case described above (or the many others that may pop up). On the other hand, I think it would be very valuable to handle the snippet's original location in the example above in a way that's compatible with the spec and, ideally, with the same tool.

It seems quite natural to record additional information, such as the "origin" of a third-party snippet, with some other SPDX-* tag. So, how about the following approach:

  • add a section to the docs or to the FAQ on "How to add additional information" possibly with an opinionated suggestion on how to deal with the use-cases we know about
  • add a feature to the reuse tool's spdx command that simply collects additional SPDX tag-value data, using the same logic as for parsing the license and copyright information from source files and REUSE.toml files

That way the spec on what REUSE compliance means stays concise and focused on license and copyright information. Still, there would be an easy way to add additional information, which can be extracted with the reuse tool and then processed further by other means.
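To make the second bullet more concrete, here is a minimal Python sketch of what "just collecting" SPDX tag-value data from a source file could look like. It is not how the reuse tool parses files; the function name `collect_spdx_tags`, the regular expression, and the use of `SPDX-PackageDownloadLocation` inside a snippet are all assumptions made for illustration.

```python
import re

# Matches "SPDX-<TagName>: <value>" anywhere on a line, so a leading
# comment prefix such as "//" or "#" is simply skipped over.
# Assumption: values run to the end of the line (a trailing "*/" would
# be captured as part of the value).
SPDX_TAG = re.compile(r"SPDX-(?P<tag>[A-Za-z-]+):\s*(?P<value>.+?)\s*$")


def collect_spdx_tags(text):
    """Return (tag, value) pairs for every SPDX-* tag line in text."""
    pairs = []
    for line in text.splitlines():
        match = SPDX_TAG.search(line)
        if match:
            pairs.append((match.group("tag"), match.group("value")))
    return pairs


# Hypothetical snippet header carrying one extra, non-licensing tag.
SNIPPET = """\
// SPDX-SnippetBegin
// SPDX-PackageDownloadLocation: https://github.com/awesome/project/tree/v0.9
// SPDX-SnippetCopyrightText: 2008, 2011 John McMaster
// SPDX-License-Identifier: LicenseRef-MIT-AwesomeProject
void sayHello();
// SPDX-SnippetEnd
"""

if __name__ == "__main__":
    for tag, value in collect_spdx_tags(SNIPPET):
        print(f"{tag}: {value}")
```

Markers without a value, such as `SPDX-SnippetBegin` and `SPDX-SnippetEnd`, are skipped because the pattern requires a colon. A real implementation would also need to associate the collected tags with the enclosing file or snippet.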

References

This is a follow-up to a recent thread "[REUSE] Listing the "source" of third-party artifacts" on the mailing list and also relates to the following issues:

@mxmehl
Member
mxmehl commented Apr 10, 2025

Thanks for creating this issue. I agree it's a tool and FAQ question and not related to the spec.

I generally support the idea of gathering additional SPDX tags and reporting them e.g. for reuse lint --json and reuse spdx, ideally controllable with an optional flag. However, it's obviously a question of how much effort this would take as the tool so far is focussed on license and copyright.

@silverhook
Contributor

I just spoke with @zvr and @pombredanne on this topic (in broad strokes; any errors here are my own) and it seems we could do this in two ways:

  1. with the External repository identifier and Purl: SPDX-ExternalRef: PACKAGE-MANAGER purl pkg:github/fsfe/reuse-website
  2. with the Package download location field: SPDX-PackageDownloadLocation: https://github.com/fsfe/reuse-website/ – where I argue we could also use Purl, because while the spec does not list Purl here as an example, it does say it has to be a URL, which Purl is. (@zvr, please correct me if I’m wrong here)
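As an illustration of option 1, the value of an SPDX external reference in tag-value form has three whitespace-separated parts: category, type, and locator. The Python sketch below splits such a value so that a downstream script could pick out the purl; the names `ExternalRef` and `parse_external_ref` are hypothetical and not part of the reuse tool.

```python
from typing import NamedTuple


class ExternalRef(NamedTuple):
    """The three parts of an SPDX external reference value."""
    category: str
    ref_type: str
    locator: str


def parse_external_ref(value):
    """Split '<category> <type> <locator>' into an ExternalRef."""
    parts = value.split(None, 2)
    if len(parts) != 3:
        raise ValueError(
            f"expected '<category> <type> <locator>', got {value!r}"
        )
    return ExternalRef(*parts)


if __name__ == "__main__":
    ref = parse_external_ref("PACKAGE-MANAGER purl pkg:github/fsfe/reuse-website")
    print(ref.category, ref.ref_type, ref.locator)
```

Option 2 needs no such parsing, since the value of the download location field is a single URL.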

@mxmehl
Member
mxmehl commented Apr 10, 2025

Thanks for coming up with ideas for the exact field, @silverhook, which is one part of the solution. The general idea would be that REUSE is agnostic about which SPDX fields to pick up for a report, so that different use cases can be covered as long as SPDX tags are used.

@choeppler
Author

@silverhook, thank you for suggesting some concrete SPDX tags. From what I know about Purls, I wouldn't be able to use them for everything, since snippets could also come from sources that are not packages (e.g., blogs or forums), could I? So PackageDownloadLocation seems to be a good candidate; if it could also hold a Purl, all the better.

@silverhook
Contributor

Sounds good to me.

Depending on what you want to achieve, also consider Software Heritage Persistent IDs as another option for a persistent external identifier tag.
