Investigate removing the sycl::reqd_sub_group_size kernel attribute in reduce-then-scan kernels

Summary:
Currently, reduce-then-scan kernels in oneDPL are reliant on underlying hardware supporting sub-group sizes of 32 (and 16 for some special cases). We enforce this through the usage of the sycl::reqd_sub_group_size attribute and a fallback to the legacy scan-then-propagate implementation if sub-group sizes of 32 are not present.

Problem Statement:
Our reduce-then-scan implementation should ideally be able to support hardware with arbitrary sub-group sizes without sacrificing performance. This would enable us to use reduce-then-scan across all GPU accelerators instead of relying on our legacy implementation. It would also allow us to gracefully handle the scenarios which require special workarounds due to sub-group hardware bugs such as #2133.

Additional Context:
The decision to require a sub-group size of 32 was made to be able to use this sub-group size as a compile time constant throughout the reduce-then-scan implementation as opposed to relying on the sub_group class' get_local_range runtime query. Brief experimentation was performed during initial development with removing this sub-group size requirement and relying on run-time sub-group size queries, but large performance hits were observed.

The critical portion of code that is affected by this change is the sub-group level scan:

oneDPL/include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h

Line 83 in a5b8b61

    
           __inclusive_sub_group_masked_scan(const __dpl_sycl::__sub_group& __sub_group, _MaskOp __mask_fn,

. I had experimented with replacing the sub-group size integral template with a run-time sub-group size variable and enabling partial loop unrolling, but I saw large performance hits. We should take a deeper dive into this issue and what can be done to resolve it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions