-
-
Notifications
You must be signed in to change notification settings - Fork 35
Create PURL CLI tool and library #267 8000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I think this only needs a few unit tests and is good to merge, you may want to extract the whole body of validate()m
@pombredanne I just noticed that despite this option we added last week
I can successfully run this command without adding
Is that what we want? I thought your point was that you wanted to require the |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@pombredanne I've refactored the Re the additional services we want to add (starting with fetchcode), do we want to merge this current PR (subject to your additional comments) and then create a new PR under the same issue as I add those additional services to the PURL CLI tool/library? |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #247 Signed-off-by: John M. Horan johnmhoran@gmail.com
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@johnmhoran We should reorganize the new code and tests that you've created into its own project. For now, I think we should copy the |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@JonoYang With apologies, I just saw your comment in this PR now -- was busy vetting and refactoring so I could commit and push. Perhaps we could chat briefly at some point before I start the reorg just to make sure I understand. And query whether I should do that before or after I receive some sort of feedback on the PR (definitely including the tests ;-). |
packagedb/tests/test_purlcli.py
Outdated
|
||
|
||
class TestPURLCLI_validate(object): | ||
@pytest.mark.parametrize( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pombredanne How do you feel about using the pytest.mark.parametrize
to hold the test input and expected results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works. If the list is small you can use it like there:
- https://github.com/nexB/scancode-toolkit/blob/f70bbb7d9d9bab40a9d504e664bc945b6a1630e8/tests/textcode/test_markup.py#L41
- if this becomes more involved, use a small class to keep the test data expressive and readable.... long lists of tuples are NOT readable: https://github.com/nexB/python-inspector/blob/0ad23abfe051a13afb3126c4112a54ea0288e118/tests/test_utils_pypi.py#L21
- if you have a lot of test/excepted file pairs, this can work https://github.com/nexB/pip-requirements-parser/blob/6629b466001c53ff0def85e42100ca3510023314/tests/test_requirements_parser.py
(For very large test suites, the toolkit approach using YAML test data files for license and copyright can work too, but is much more work and I would likely rewrite this using pytest.mark.parametrize today though the bulk would stay the same: data files for expectations, test files for data and objects to read these )
We can have a call about this |
@JonoYang Re the restructuring, I've added several files in the new project folder, including
This does not appear relevant to our new |
@JonoYang I've added the new structure and basic files but cannot import |
@JonoYang I ran |
@JonoYang I've added
|
You would need to install your new purldb-toolkit by going to the purldb-toolkit directory and running I would add This will ensure that we install purldb-toolkit to the venv when we run
I would not put purldb-toolkit as a dependency of purldb since purldb does not use code from purldb-toolkit. |
@JonoYang I've followed your guidance and get this error when running
|
Hmm, I think you may have to state a version in the
|
purlcli.py
Outdated
"bitbucket", | ||
"rubygems", | ||
] | ||
purl_type = purl.split(":")[1].split("/")[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would create a PackageURL
object from purl
string, and then access the type
field of that object
purl_type = purl.split(":")[1].split("/")[0] | |
purl = PackageURL.from_string(purl) | |
if purl.type not in SUPPORTED_ECOSYSTEMS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JonoYang Talk about over-engineering. I was so proud of devising purl.split(":")[1].split("/")[0]
, and overlooked the straightforward approach that you identified and that I've now incorporated. Thank you. ;-) Reran my purlcli tests, all 42 passed.
Reran make test
and ... error -- not sure what this is telling me to do:
. . .
_________________________________________________________________________________________________ ERROR collecting purldb-toolkit/setup.py _________________________________________________________________________________________________
import file mismatch:
imported module 'setup' has this __file__ attribute:
/home/jmh/dev/nexb/purldb/setup.py
which is not the same as the test file we want to collect:
/home/jmh/dev/nexb/purldb/purldb-toolkit/setup.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
========================================================================================================= short test summary info ==========================================================================================================
ERROR purldb-toolkit/setup.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================================================= 1 error in 3.11s =============================================================================================================
make: *** [Makefile:110: test] Error 2
(venv) Thu Jan 18, 2024 12:27 PM /home/jmh/dev/nexb/purldb jmh (247-create-purl-cli-tool)
$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test setup is a bit convoluted. Pytest didn't play nice with the directory layout I created, where we have modules like matchcode-toolkit
and purldb-toolkit
. To stop that error, I had to tell pytest to ignore those directories, then I added in extra commands to pytest to just test those specific directories.
We want to update this line: https://github.com/nexB/purldb/blob/247-create-purl-cli-tool/Makefile#L110
to ${ACTIVATE} DJANGO_SETTINGS_MODULE=purldb_project.settings ${PYTHON_EXE} -m pytest -vvs --ignore matchcode-toolkit --ignore purldb-toolkit --ignore packagedb/tests/test_throttling.py
and then add another line to test the purldb-toolkit
specifically
${ACTIVATE} ${PYTHON_EXE} -m pytest -vvs purldb-toolkit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did the pytest fine-tuning, reran 'make test'. Only failures were the same 2 as the other day, which we expect atm.
FAILED minecode/tests/test_maven.py::MavenEnd2EndTest::test_visit_and_map_with_index - AssertionError: Lists differ: [{'ur[31 chars]ven2/cnuernber/dtype-next/0.4.2/dtype-next-0.4[49087 chars]one}] != [{'ur[31 chars]ven2/.index/nexus-mav
en-repository-index.532.g[49087 chars]one}]
FAILED minecode/tests/test_ls.py::ParseDirectoryListingTest::test_parse_listing_from_lslr - AssertionError: Lists differ: [{'pa[1527 chars] '2023-01', 'target': None}, {'path': 'dists/e[974 chars]one}] != [{'pa[1527 chars] '2024-01', 't
arget': None}, {'path': 'dists/e[974 chars]one}]
. . .
2 failed, 583 passed, 2 skipped, 30 xfailed, 71 warnings in 81.68s (0:01:21)
. . .
and all 42 of my purlcli tests pass. Ready to commit and push.
Try replacing line 6 with From https://stackoverflow.com/a/77527178, looks like something changed in setuptools. |
Thanks @JonoYang . ;-) I'll make your suggested change, but first I see the last line of the error message is
which seems to refer to line 12 of pyproject.toml
which is identical to line 12 of the matchcode-toolkit pyproject.toml. FWIW. Making your change now, will then rerun |
@JonoYang Made your change and the pip install succeeded. Using matchcode-toolkit as a model, in But in |
@JonoYang Now I'm getting weird errors from my pytest commands -- seems related to
|
I'm encountering the same problems you are when I try installing the |
Thanks @JonoYang ! 🤞 |
On line 10 in |
I realize rerunning pip again was unnecessary but I did it before I realized -- and #$@%^$ VSCode's warning is plain wrong -- running my revised command -- |
@JonoYang We have 4 failed tests among the GH tests, e.g., https://github.com/nexB/purldb/actions/runs/7576042489/job/20633970493?pr=267
That was one of the makefile lines I just edited. But I think the actual failure data starts on line 1490:
Not related to my purlcli work. |
I would update the expected test results for this test: On line https://github.com/nexB/purldb/blob/main/minecode/tests/test_ls.py#L60, change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnmhoran I've suggested some variable name changes and to add docstrings for some functions. Other than that, I think the PR is looking good.
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
Expected test results have been updated and committed. |
@JonoYang I believe I've addressed all of your comments. Thank you for taking the time to provide very helpful feedback. Will commit and push shortly. |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@JonoYang All 42 tests in BUT: 2 failed tests in GH, both referring to
Haven't seen this before -- a result of the directory restructuring? |
This is because you have this import statement here is usually used in the context of a django project test: https://github.com/nexB/purldb/blob/3ceace31dad0adf3650027aa4147baf2dfe404fd/purldb-toolkit/tests/test_purlcli.py#L10 This line should be removed. |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@JonoYang Fixed as requested, committed and pushed, but again:
and the culprit this time:
I suspect these errors are triggered by unused imports, which describes the last and this error. There's a group of other unused imports, remnants of the multiple stages of testing. I'm breaking for dinner now but will attend to this after dinner -- plan to remove all unused imports, which I should always do in any event, and will keep you posted. Thanks mucho for all of your help @JonoYang |
Reference: #247 Signed-off-by: John M. Horan <johnmhoran@gmail.com>
@JonoYang Removed the remaining no-longer-used imports from |
Signed-off-by: Jono Yang <jyang@nexb.com>
@johnmhoran Thanks for the changes! I've updated |
Create PURL CLI tool and library
Implements #247.