Move CI job and fix cuda12.2 cusparse matrix, coo exception, workspace reallcation by yhmtsai · Pull Request #1843 · ginkgo-project/ginkgo · GitHub

Move CI job and fix cuda12.2 cusparse matrix, coo exception, workspace reallcation #1843


Merged
yhmtsai merged 5 commits into develop from fix-ci-and-error
May 17, 2025


yhmtsai
Member
@yhmtsai yhmtsai commented May 16, 2025

This PR extracts part of #1841. It does not add the spack availability or more jobs.

  • move the jobs from amdci to tum
    • CUDA versions below 11.8 are now only build-checked, not tested, because the current setup cannot always provide a GPU supported by these compilers
    • the cudamemcheck task uses the cuda12 docker image and compute-sanitizer instead of cuda11, for the same reason. TODO: check whether it works
    • some old cuda jobs can be moved to P100 now, so the above changes are not needed
    • waiting for the cuda memcheck result from https://gitlab.com/ginkgo-project/ginkgo-public-ci/-/jobs/10059330364
  • fix the coo exception on an empty matrix
  • fix the row gatherer workspace reallocation
  • fix cuda12.2 csr not accepting nullptr as the row ptr in the beginning
  • use csr for the non-local matrix in the test because some backends do not support Coo spmv on 16-bit operations
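
The description above does not show the code changes themselves. As a rough illustration of the three kinds of fixes listed (empty-matrix handling in COO, a non-null CSR row pointer, and workspace reallocation), here is a minimal host-only sketch. All names (`CsrSketch`, `CooSketch`, `coo_spmv`, `WorkspaceSketch`) are hypothetical and are not Ginkgo's classes or the actual patch; the guards shown are only one plausible reading of each bullet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (not gko::matrix::Csr). The row-pointer array
// always holds num_rows + 1 entries, so even a matrix with zero rows exposes
// a valid, non-null pointer to a backend that rejects nullptr.
struct CsrSketch {
    std::size_t num_rows;
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> values;

    explicit CsrSketch(std::size_t rows) : num_rows(rows), row_ptrs(rows + 1, 0)
    {}
};

// Hypothetical COO container (not gko::matrix::Coo).
struct CooSketch {
    std::size_t num_rows;
    std::vector<int> row_idxs;
    std::vector<int> col_idxs;
    std::vector<double> values;
};

// y = A * x for a COO matrix. An empty matrix only zeroes the output and
// returns early instead of running the accumulation loop on no data.
void coo_spmv(const CooSketch& a, const std::vector<double>& x,
              std::vector<double>& y)
{
    assert(a.row_idxs.size() == a.values.size());
    assert(a.col_idxs.size() == a.values.size());
    y.assign(a.num_rows, 0.0);
    if (a.values.empty()) {
        return;
    }
    for (std::size_t k = 0; k < a.values.size(); ++k) {
        y[a.row_idxs[k]] += a.values[k] * x[a.col_idxs[k]];
    }
}

// Hypothetical workspace: the buffer grows only when a request exceeds the
// current size; smaller or equal requests reuse the existing storage.
class WorkspaceSketch {
public:
    double* get(std::size_t n)
    {
        if (n > buffer_.size()) {
            buffer_.resize(n);
            ++reallocations_;
        }
        return buffer_.data();
    }

    std::size_t reallocations() const { return reallocations_; }

private:
    std::vector<double> buffer_;
    std::size_t reallocations_ = 0;
};
```

In this sketch, `CsrSketch(0)` still owns a one-element row-pointer array, `coo_spmv` on a matrix without stored entries leaves `y` filled with zeros, and repeated `get` calls with a size no larger than an earlier request do not trigger another reallocation.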

@yhmtsai yhmtsai requested a review from MarcelKoch May 16, 2025 11:38
@yhmtsai yhmtsai self-assigned this May 16, 2025
@yhmtsai yhmtsai added the 1:ST:ready-for-review and 1:ST:run-full-test labels May 16, 2025
@ginkgo-bot ginkgo-bot added the reg:build, reg:ci-cd, mod:core, mod:cuda, type:preconditioner, type:matrix-format, mod:hip, and mod:dpcpp labels May 16, 2025
Member
@MarcelKoch MarcelKoch left a comment

Thanks for fixing the CI!

@yhmtsai yhmtsai force-pushed the fix-ci-and-error branch from 9148570 to ce96788 Compare May 16, 2025 12:30
@yhmtsai yhmtsai force-pushed the fix-ci-and-error branch from ce96788 to 3bb6def Compare May 16, 2025 13:02
@yhmtsai yhmtsai requested a review from MarcelKoch May 16, 2025 14:24
.gitlab-ci.yml Outdated
@@ -656,15 +656,12 @@ undefinedsanitizer:
       -DCTEST_MEMORYCHECK_TYPE=UndefinedBehaviorSanitizer

 cudamemcheck:
-  stage: QoS_tools
+  stage: build
Member

should this stay?

Member Author

I would like to make sure that it works before I merge this.
But I wonder whether it exceeds the slurm time limit.

Member Author

I will revert it before merging, once the job has finished at least once.

Member Author

It passed. I have also restructured the commit history and reverted the temporary check.

@yhmtsai yhmtsai force-pushed the fix-ci-and-error branch from 6fffdde to d38eea1 Compare May 16, 2025 17:43
@yhmtsai yhmtsai requested a review from MarcelKoch May 16, 2025 17:44
Quality Gate failed

Failed conditions
25.9% Duplication on New Code (required ≤ 20%)

See analysis details on SonarQube Cloud

codecov bot commented May 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.42%. Comparing base (4766cbf) to head (d38eea1).
Report is 6 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1843      +/-   ##
===========================================
- Coverage    89.42%   89.42%   -0.01%     
===========================================
  Files          849      849              
  Lines        71025    71023       -2     
===========================================
- Hits         63516    63514       -2     
  Misses        7509     7509              


@yhmtsai yhmtsai changed the title Move CI job and fix issue found in #1841 Move CI job and fix cuda12.2 cusparse matrix, coo exception, workspace reallcation May 17, 2025
@yhmtsai yhmtsai merged commit 13eb133 into develop May 17, 2025
26 of 34 checks passed
@yhmtsai yhmtsai deleted the fix-ci-and-error branch May 17, 2025 20:21
Labels
1:ST:ready-for-review, 1:ST:run-full-test, mod:core, mod:cuda, mod:dpcpp, mod:hip, reg:build, reg:ci-cd, type:matrix-format, type:preconditioner
4 participants