8000 Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs by maxhgerlach · Pull Request #3699 · horovod/horovod · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs #3699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 20, 2022

Conversation

maxhgerlach
Copy link
Collaborator
@maxhgerlach maxhgerlach commented Sep 13, 2022

Checklist before submitting

  • Did you read the contributor guide?
  • Did you update the docs?
  • Did you write any tests to validate this change?
  • Did you update the CHANGELOG, if this change affects users?

Description

This fixes Horovod crashing when scalar inputs are passed to hvd.reducescatter or hvd.grouped_reducescatter. Attempts to do so now raise exceptions that users can deal with appropriately.

Initially I thought it would be a good idea to have hvd.reducescatter(3.14) do exactly the same thing as hvd.reducescatter([3.14]), i.e., return a single-element 1D tensor on rank 0 and an empty 1D tensor on the other ranks. However, having given this some consideration, I feel that it would be confusing and potentially error prone if zero-rank inputs produced higher ranked outputs. It seems less surprising to deal with such a possibility explicitly in user code if necessary (just as it is the case with hvd.allgather).

Fixes #3698.

@github-actions
Copy link
github-actions bot commented Sep 13, 2022

Unit Test Results

  1 125 files  +     76    1 125 suites  +76   10h 32m 55s ⏱️ - 1h 1m 31s
     839 tests +       8       781 ✔️ +       8       58 💤 ±    0  0 ±0 
23 320 runs  +2 044  16 306 ✔️ +1 308  7 014 💤 +736  0 ±0 

Results for commit 8451913. ± Comparison against base commit 427b633.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
github-actions bot commented Sep 13, 2022

Unit Test Results (with flaky tests)

  1 254 files   -     6    1 254 suites   - 6   11h 28m 38s ⏱️ - 56m 33s
     839 tests +    8       778 ✔️ +    6       58 💤 ±  0  2 +1  1 🔥 +1 
25 810 runs  +369  17 775 ✔️ +306  8 032 💤 +61  2 +1  1 🔥 +1 

For more details on these failures and errors, see this check.

Results for commit 8451913. ± Comparison against base commit 427b633.

♻️ This comment has been updated with latest results.

…ions for scalar inputs

Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
@maxhgerlach maxhgerlach changed the title Fix reducescatter() and grouped_reducescatter() for scalar inputs Fix reducescatter() and grouped_reducescatter() to raise clean exceptions for scalar inputs Sep 13, 2022
Signed-off-by: Max H. Gerlach <git@maxgerlach.de>
@maxhgerlach maxhgerlach marked this pull request as ready for review September 14, 2022 09:25
Copy link
Collaborator
@romerojosh romerojosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @maxhgerlach! I introduced a conflict merging in another PR, but once that is resolved, good to merge.

@maxhgerlach maxhgerlach merged commit 37a6d83 into master Sep 20, 2022
@maxhgerlach maxhgerlach deleted the reducescatter-scalar branch September 20, 2022 06:30
@maxhgerlach
Copy link
Collaborator Author

Thanks @romerojosh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

reducescatter() and grouped_reducescatter() crash for scalar inputs
2 participants
0