Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs #3699

maxhgerlach · 2022-09-13T09:15:43Z

Checklist before submitting

Did you read the contributor guide?
Did you update the docs?
Did you write any tests to validate this change?
Did you update the CHANGELOG, if this change affects users?

Description

This fixes a crash in Horovod when scalar inputs are passed to hvd.reducescatter or hvd.grouped_reducescatter. Such calls now raise exceptions that user code can catch and handle appropriately.

Initially I thought it would be a good idea to have hvd.reducescatter(3.14) do exactly the same thing as hvd.reducescatter([3.14]), i.e., return a single-element 1D tensor on rank 0 and an empty 1D tensor on the other ranks. However, on reflection, I feel it would be confusing and potentially error-prone if zero-rank inputs produced higher-rank outputs. It seems less surprising to handle this case explicitly in user code if necessary (just as is the case with hvd.allgather).
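To illustrate what handling this explicitly in user code could look like, here is a minimal single-process sketch. It does not call Horovod: `simulate_reducescatter` and `promote_scalar` are hypothetical helpers written only for this illustration, the reduction is assumed to be a sum, and the `ValueError` is an assumed exception type rather than the one Horovod raises.

```python
import numpy as np


def simulate_reducescatter(per_rank_inputs):
    """Single-process stand-in for a sum reducescatter over len(per_rank_inputs) ranks.

    Hypothetical helper for illustration only; it is not part of Horovod.
    """
    arrays = [np.asarray(x, dtype=np.float64) for x in per_rank_inputs]
    if any(a.ndim == 0 for a in arrays):
        # Mirrors the behavior described in this PR: scalar inputs are rejected.
        # The exception type used here is an assumption.
        raise ValueError("reducescatter requires inputs with at least one dimension")
    # Reduce element-wise across the simulated ranks.
    reduced = np.sum(arrays, axis=0)
    # Scatter along the first dimension; earlier ranks receive any extra rows.
    return np.array_split(reduced, len(arrays), axis=0)


def promote_scalar(value):
    """Explicitly turn a zero-rank value into a 1-D array in user code."""
    arr = np.asarray(value, dtype=np.float64)
    return arr.reshape(1) if arr.ndim == 0 else arr


# Two simulated ranks, each contributing the scalar 3.14 after explicit promotion.
outputs = simulate_reducescatter([promote_scalar(3.14), promote_scalar(3.14)])
shapes = [out.shape for out in outputs]  # rank 0 holds one element, rank 1 none
```

With the promotion done by the caller, the change in rank is visible at the call site instead of happening silently inside the collective.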

Fixes #3698.

github-actions · 2022-09-13T11:22:19Z

Unit Test Results

1 125 files +76    1 125 suites +76    10h 32m 55s ⏱️ -1h 1m 31s
839 tests +8    781 ✔️ +8    58 💤 ±0    0 ❌ ±0
23 320 runs +2 044    16 306 ✔️ +1 308    7 014 💤 +736    0 ❌ ±0

Results for commit 8451913. ± Comparison against base commit 427b633.

♻️ This comment has been updated with latest results.

github-actions · 2022-09-13T11:22:33Z

Unit Test Results (with flaky tests)

1 254 files -6    1 254 suites -6    11h 28m 38s ⏱️ -56m 33s
839 tests +8    778 ✔️ +6    58 💤 ±0    2 ❌ +1    1 🔥 +1
25 810 runs +369    17 775 ✔️ +306    8 032 💤 +61    2 ❌ +1    1 🔥 +1

For more details on these failures and errors, see this check.

Results for commit 8451913. ± Comparison against base commit 427b633.

♻️ This comment has been updated with latest results.

romerojosh

LGTM @maxhgerlach! I introduced a conflict merging in another PR, but once that is resolved, good to merge.

maxhgerlach · 2022-09-20T06:30:31Z

Thanks @romerojosh

Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs

b49cbf5

Signed-off-by: Max H. Gerlach <git@maxgerlach.de>

maxhgerlach force-pushed the reducescatter-scalar branch from 1808d76 to b49cbf5 September 13, 2022 13:58

maxhgerlach changed the title ~~Fix reducescatter() and grouped_reducescatter() for scalar inputs~~ Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs Sep 13, 2022

Fix error checking in torch test

c79b5f1

Signed-off-by: Max H. Gerlach <git@maxgerlach.de>

maxhgerlach marked this pull request as ready for review September 14, 2022 09:25

maxhgerlach requested a review from romerojosh September 14, 2022 09:26

romerojosh approved these changes Sep 19, 2022

Merge branch 'master' into reducescatter-scalar

8451913

maxhgerlach merged commit 37a6d83 into master Sep 20, 2022

maxhgerlach deleted the reducescatter-scalar branch September 20, 2022 06:30
