FP16 hierarchical allreduce (see the first sketch after this list)
PyTorch 1.0, TF-Keras, FP16 ops on GPU
Parallelized hierarchical allreduce
Support for the upcoming PyTorch release
Add compatibility with PyTorch 0.4.1
Support for IBM PowerAI DDL & APIs to restore optimizer state (see the optimizer-state sketch after this list)
Critical Bugfix: PyTorch must wait for GPU data before allreduce
Critical Bugfix: non-fused allreduce produces incorrect results
Hierarchical allreduce & differentiable ops (see the differentiable allreduce sketch after this list)
Fix PyTorch master break - use proper THTensor_storage() API
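Several of the releases above concern FP16 communication and hierarchical allreduce. The following is a minimal PyTorch sketch, not taken from the release notes, assuming Horovod's hvd.Compression.fp16 gradient compression and the HOROVOD_HIERARCHICAL_ALLREDUCE environment variable; the model, learning rate, and launch command are placeholders.

```python
# Minimal sketch (assumed usage): FP16 gradient compression with Horovod's
# PyTorch optimizer wrapper. Hierarchical allreduce is toggled at launch
# time through an environment variable, e.g.:
#   HOROVOD_HIERARCHICAL_ALLREDUCE=1 mpirun -np 4 python train.py
import torch
import horovod.torch as hvd

hvd.init()
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Compress gradients to FP16 before they are exchanged between workers.
optimizer = hvd.DistributedOptimizer(
    optimizer,
    named_parameters=model.named_parameters(),
    compression=hvd.Compression.fp16,
)

# Make sure every worker starts from the same weights.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
```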
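For the "APIs to restore optimizer state" release, a typical pattern is to load a checkpoint on rank 0 and broadcast it to the other workers. This optimizer-state sketch assumes Horovod's PyTorch API (hvd.broadcast_parameters and hvd.broadcast_optimizer_state); the checkpoint path and model are hypothetical.

```python
# Minimal sketch (assumed usage): resume training with every worker holding
# the same model weights and optimizer state as rank 0.
import torch
import horovod.torch as hvd

hvd.init()

model = torch.nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

if hvd.rank() == 0:
    checkpoint = torch.load("checkpoint.pt")  # hypothetical checkpoint file
    model.load_state_dict(checkpoint["model"])
    optimizer.load_state_dict(checkpoint["optimizer"])

# Broadcast the restored parameters and optimizer state from rank 0.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)
```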
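The "differentiable ops" entry refers to collective operations that participate in autograd. The differentiable allreduce sketch below assumes Horovod's hvd.allreduce for PyTorch, which averages a tensor across workers while letting gradients flow back through the operation; the tensor shape is a placeholder.

```python
# Minimal sketch (assumed usage): gradients propagate through hvd.allreduce.
import torch
import horovod.torch as hvd

hvd.init()

x = torch.ones(4, requires_grad=True)
y = hvd.allreduce(x)   # averaged across workers, differentiable
y.sum().backward()
print(x.grad)          # gradient flowed back through the allreduce
```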