PyTorch/XLA is a Python package that uses the XLA deep learning compiler to connect the PyTorch deep learning framework and Cloud TPUs. You can try it right now, for free, on a single Cloud TPU with Google Colab, and use it in production and on Cloud TPU Pods with Google Cloud.
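For a quick sense of what that looks like in code, here is a minimal sketch of treating a TPU core as an ordinary PyTorch device (this assumes the `torch_xla` package is installed, as it is in the Colabs below):

```python
import torch
import torch_xla.core.xla_model as xm

# Acquire a Cloud TPU core as a regular PyTorch device.
device = xm.xla_device()

# Tensors and modules move to it like any other device; operations
# on them are compiled and executed by XLA on the TPU.
t1 = torch.randn(3, 3, device=device)
t2 = torch.randn(3, 3, device=device)
print(t1 + t2)
print(t1.device)  # e.g. xla:1
```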
Take a look at one of our Colab notebooks to quickly try different PyTorch networks running on Cloud TPUs and learn how to use Cloud TPUs as PyTorch devices (a condensed training sketch follows the list):
- Getting Started with PyTorch on Cloud TPUs
- Training AlexNet on Fashion MNIST with a single Cloud TPU Core
- Training AlexNet on Fashion MNIST with multiple Cloud TPU Cores
- Fast Neural Style Transfer (NeurIPS 2019 Demo)
- Training A Simple Convolutional Network on MNIST
- Training a ResNet18 Network on CIFAR10
- ImageNet Inference with ResNet50
- Training DC-GAN using Colab Cloud TPU
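The single-core notebooks above all follow the same basic training pattern; here is a condensed sketch (the model and data are stand-ins, not taken from any particular notebook):

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()
model = nn.Linear(28 * 28, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # Stand-in batch; a real notebook would pull these from a DataLoader.
    data = torch.randn(64, 28 * 28, device=device)
    target = torch.randint(0, 10, (64,), device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(data), target)
    loss.backward()
    # On a single TPU core this replaces optimizer.step(); barrier=True
    # tells XLA to materialize the step's lazily built graph.
    xm.optimizer_step(optimizer, barrier=True)
```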
The rest of this README covers:
- Running PyTorch on Cloud TPUs in production on Google Cloud, where networks also run faster than on Google Colab.
- API & Best Practices
- Performance Profiling and Auto-Metrics Analysis (a quick metrics example follows this list)
- Troubleshooting
- Providing Feedback
- Building and Contributing to PyTorch/XLA
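As a taste of the auto-metrics analysis listed above, PyTorch/XLA can print a built-in metrics report at any point; a minimal sketch (the exact counters you see depend on the workload):

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.debug.metrics as met

device = xm.xla_device()
a = torch.randn(4, 4, device=device)
b = (a @ a).sum()
xm.mark_step()  # force compilation/execution so counters get populated

# Dumps compile/execute counts and timings, useful for spotting
# recompilations and excessive device-to-host transfers.
print(met.metrics_report())
```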
Additional information on PyTorch/XLA, including a description of its semantics and functions, is available at PyTorch.org.
Google Cloud Platform lets you deploy PyTorch networks running on Cloud TPUs. This guide is split into two parts:
- Training on a single Cloud TPU
- Training on a Cloud TPU Pod
Please check out our Cloud TPU VM User Guide. Cloud TPU VM is currently in public preview and provides direct access to the TPU host. To learn more about the Cloud TPU System Architecture, please check out this doc.
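On a Cloud TPU VM all of the host's cores are available to a single Python program; here is a minimal sketch of the classic multi-core launch pattern (the model and data are placeholders, and real code would typically also use `ParallelLoader` to feed batches):

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp

def _mp_fn(index):
    # Each spawned process drives one TPU core.
    device = xm.xla_device()
    model = nn.Linear(10, 2).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(10):
        # Placeholder batch; real code would use a per-core data loader.
        data = torch.randn(32, 10, device=device)
        target = torch.randint(0, 2, (32,), device=device)
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(data), target)
        loss.backward()
        # All-reduces gradients across cores before stepping; barrier=True
        # materializes the step (ParallelLoader would otherwise do this).
        xm.optimizer_step(optimizer, barrier=True)

if __name__ == '__main__':
    xmp.spawn(_mp_fn, args=(), nprocs=8)  # 8 cores on a v2-8/v3-8
```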
The following tutorials are available to help you train models on a single Cloud TPU:
To start, create a Cloud TPU node with the corresponding release you wish to consume (TPU software version, e.g. `pytorch-1.9`):
Once you've created a Cloud TPU node, you can train your PyTorch models by either: