TSAN fixes #1743

MarcelKoch · 2024-12-06T16:19:13Z

This PR includes fixes for errors reported by the TSAN. The changes include:

RCM: locking the write mutex when checking the queue size
SparsityCsr: using atomic load/store when checking if other threads have reported that the matrix is unsorted
Par IC/ILU (t): use atomic load/stores in the same way as the CUDA/HIP implementations
MG: 'fix' a write-after-write race in the kcycle stopping criteria. Honestly, I don't think this is necessary, but it stops TSAN from complaining.
Batch Solvers: fixed the generation of the 3pt stencil matrix for tests, which caused data races.
NOT FIXED: the PGM contains data races in the match_edge function. I didn't fix this, since that would require extracting the common kernel implementations into each backend, which I found a bit overkill. There is also a data race in the non-deterministic part of assign_to_exist_agg, but I ignored that, since it is marked as non-deterministic anyway.
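
To illustrate the SparsityCsr item above, here is a minimal sketch of the "shared flag" pattern with OpenMP atomics. It is not the code from this PR: the helper names `load`/`store`, the function `is_sorted_by_column_index`, and the use of `int` for the flag are assumptions made for the example.

```cpp
#include <vector>

// Hypothetical helpers (not the PR's actual signatures): atomic read and
// write of a shared int through OpenMP atomics.
inline int load(const int* ptr)
{
    int value;
#pragma omp atomic read
    value = *ptr;
    return value;
}

inline void store(int* ptr, int value)
{
#pragma omp atomic write
    *ptr = value;
}

// Returns true if the column indices of every row are in ascending order.
bool is_sorted_by_column_index(const std::vector<int>& row_ptrs,
                               const std::vector<int>& col_idxs)
{
    int is_sorted = 1;  // shared between all threads
    const auto num_rows = static_cast<int>(row_ptrs.size()) - 1;
#pragma omp parallel for
    for (int row = 0; row < num_rows; ++row) {
        // Skip the row if another thread already reported an unsorted row.
        // A plain read here would race with the store below.
        if (load(&is_sorted)) {
            for (int nz = row_ptrs[row] + 1; nz < row_ptrs[row + 1]; ++nz) {
                if (col_idxs[nz - 1] > col_idxs[nz]) {
                    store(&is_sorted, 0);
                    break;
                }
            }
        }
    }
    return is_sorted != 0;
}
```

Every thread only ever writes the same value (0), so the result does not depend on which thread writes first; the atomics are there to make the concurrent read and write well-defined.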

This is not really necessary, but it keeps the TSAN happy.

upsj

LGTM - Thanks for finding and fixing these issues! Two points I would like to discuss (shouldn't impact the PR though):

1. naming - the functions in CUDA and HIP are named load_..., where the fact that a memory ordering is specified makes it clear that we are dealing with atomics. For consistency it would be best if OpenMP and CUDA/HIP use the same names. They can become pretty long though (load_relaxed_local), so I would like to avoid adding another atomic_ prefix to them. Do we need to use it?
2. With PGM being the only kernel that has this issue, I would probably suggest splitting it into OpenMP, SYCL and CUDA/HIP again, also since there is some optimization potential in terms of load balancing and memory access patterns. The alternative (putting atomic operations into the unified headers) seems to invite using the unified kernels for too much.

upsj · 2024-12-06T19:29:42Z

omp/components/atomic.hpp

+// implementation of double and float. The compiler will throw an error if the
+// templated version is implemented. GCC doesn't throw an error.
+template <typename T>
+struct atomic_store_helper;


Do we actually need this? Or can we get the same result with just overloaded functions? We only use atomics for a fixed set of types (value types and index types)

You're right. I first wanted to use the templated implementation, but as that doesn't work with clang, we can just use simple overloads.
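
A rough sketch of what "simple overloads" could look like, assuming the fixed set of types is `float`, `double` and 32/64-bit integers. The exact set and signatures in omp/components/atomic.hpp may differ; the names `load` and `store` follow the naming discussed in this thread.

```cpp
#include <cstdint>

// Sketch only: one overload per supported type instead of a templated
// helper struct. No memory ordering clause is given on the pragmas.
inline void store(float* addr, float val)
{
#pragma omp atomic write
    *addr = val;
}

inline void store(double* addr, double val)
{
#pragma omp atomic write
    *addr = val;
}

inline void store(std::int32_t* addr, std::int32_t val)
{
#pragma omp atomic write
    *addr = val;
}

inline void store(std::int64_t* addr, std::int64_t val)
{
#pragma omp atomic write
    *addr = val;
}

inline float load(const float* addr)
{
    float val;
#pragma omp atomic read
    val = *addr;
    return val;
}

inline double load(const double* addr)
{
    double val;
#pragma omp atomic read
    val = *addr;
    return val;
}

inline std::int32_t load(const std::int32_t* addr)
{
    std::int32_t val;
#pragma omp atomic read
    val = *addr;
    return val;
}

inline std::int64_t load(const std::int64_t* addr)
{
    std::int64_t val;
#pragma omp atomic read
    val = *addr;
    return val;
}
```

With overloads, calling `store` on an unsupported type fails at overload resolution, which gives a similar compile-time guard to leaving a helper struct undefined.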

MarcelKoch · 2024-12-07T13:55:47Z

@upsj regarding 1.: I will go with just load|store, since OpenMP 4.5 doesn't support specifying the memory ordering.

Regarding 2.: I also would not support adding atomics to the unified kernels. I've now added the separate kernels; it's a bit more LOC, but the kernels are simple enough that it should be fine.
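
For readers following along, a hedged sketch of what a separate OpenMP match_edge kernel with atomic accesses might look like. This is an illustration of the idea, not the merged code: the `std::vector` interface, the `int` index type, and the `load`/`store` helpers are assumptions for the example.

```cpp
#include <vector>

// Hypothetical atomic helpers, as in the load/store naming discussed above.
inline int load(const int* ptr)
{
    int value;
#pragma omp atomic read
    value = *ptr;
    return value;
}

inline void store(int* ptr, int value)
{
#pragma omp atomic write
    *ptr = value;
}

// Pairs up vertices that are each other's strongest neighbor. The smaller
// index of the pair is used as the aggregate id; -1 means "unassigned".
void match_edge(const std::vector<int>& strongest_neighbor,
                std::vector<int>& agg)
{
    const auto size = static_cast<int>(agg.size());
#pragma omp parallel for
    for (int i = 0; i < size; ++i) {
        // agg[i] may be written concurrently by the thread that handles
        // the partner of i, so it is read atomically.
        if (load(&agg[i]) != -1) {
            continue;
        }
        const auto neighbor = strongest_neighbor[i];
        if (neighbor != -1 && strongest_neighbor[neighbor] == i &&
            i <= neighbor) {
            store(&agg[i], i);
            store(&agg[neighbor], i);
        }
    }
}
```

Only the thread with the smaller index of a matched pair writes, so the final result is the same regardless of scheduling; the atomics remove the racy plain read of `agg[i]`.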

upsj · 2024-12-07T13:58:38Z

I would consider still adding the _relaxed suffix, because this is the weakest possible atomic memory ordering, and likely close to what the default would be (just to avoid accidentally relying on too much ordering between operations)

MarcelKoch · 2024-12-07T14:02:49Z

I would rather leave it out to indicate that we can't make assumptions on the memory ordering at all.

upsj · 2024-12-07T14:04:23Z

relaxed is as close to "no assumptions" as possible in terms of memory ordering. But not pushing any harder than that 😄
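
For reference, this is roughly what a `_relaxed` suffix would correspond to in standard C++ terms. The sketch uses `std::atomic` rather than OpenMP pragmas, and the wrapper names `load_relaxed`/`store_relaxed` are hypothetical, chosen only to mirror the naming under discussion.

```cpp
#include <atomic>

// Relaxed operations are atomic (no torn reads or writes), but they impose
// no ordering on surrounding, unrelated memory accesses.
template <typename T>
T load_relaxed(const std::atomic<T>& value)
{
    return value.load(std::memory_order_relaxed);
}

template <typename T>
void store_relaxed(std::atomic<T>& value, T new_value)
{
    value.store(new_value, std::memory_order_relaxed);
}
```

Code that only needs "some thread has set this flag" semantics, like the sortedness check, can work with this guarantee alone; code that publishes other data through the flag would need stronger ordering.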

sonarqubecloud · 2024-12-07T20:44:52Z

Quality Gate passed

Issues
14 New issues
0 Accepted issues

Measures
0 Security Hotspots
90.0% Coverage on New Code
0.0% Duplication on New Code


codecov · 2024-12-07T20:54:03Z

Codecov Report

Attention: Patch coverage is 95.55556% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.52%. Comparing base (f95fc48) to head (05f6082).

Files with missing lines	Patch %	Lines
omp/components/atomic.hpp	96.96%	1 Missing ⚠️
omp/factorization/par_ilut_kernels.cpp	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1743      +/-   ##
===========================================
+ Coverage    89.51%   89.52%   +0.01%     
===========================================
  Files          797      797              
  Lines        65865    65889      +24     
===========================================
+ Hits         58960    58989      +29     
+ Misses        6905     6900       -5

MarcelKoch added 3 commits December 6, 2024 15:56

[omp] fix RCM RAW

aa203c7

[omp] fix sparsity csr RAW

565921f

[omp] fix kcycle stop WAW

364e414

This is not really necessary, but it keeps the TSAN happy.

MarcelKoch self-assigned this Dec 6, 2024

MarcelKoch added this to the Ginkgo 1.9.0 milestone Dec 6, 2024

MarcelKoch requested a review from a team December 6, 2024 16:19

upsj approved these changes Dec 6, 2024

MarcelKoch added the 1:ST:run-full-test label Dec 7, 2024

MarcelKoch added 3 commits December 8, 2024 12:50

[test] fix 3pt generation for batch solver

bfab785

[omp] fix par IC and variants using atomic load/stores

82e2b53

[mg] use atomics for match_edge

6ad5853

MarcelKoch force-pushed the tsan-fixes branch from 05f6082 to 6ad5853 Compare December 8, 2024 13:00

MarcelKoch merged commit 059823f into develop Dec 8, 2024
8 of 11 checks passed

MarcelKoch deleted the tsan-fixes branch December 8, 2024 13:55
