To build:
- Clone the PyTorch repo as per instructions:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch/
- Clone the PyTorch/XLA repo:
git clone --recursive https://github.com/pytorch/xla.git
- If a file named xla/.torch_commit_id exists, use its content to check out the matching PyTorch commit:
git checkout $(cat xla/.torch_commit_id)
- Apply PyTorch patches:
xla/scripts/apply_patches.sh
- Install the Lark parser used for automatic code generation:
pip install lark-parser
- Currently PyTorch does not build with GCC 6.x, 7.x, or 8.x (various kinds of ICEs). Clang 7.x is known to work, so install that in your VM:
sudo apt-get install clang-7 clang++-7
export CC=clang-7 CXX=clang++-7
You may need to add the following line to your /etc/apt/sources.list file:
deb http://deb.debian.org/debian/ testing main
and then run the following command before trying the Clang install again:
sudo apt-get update
- Build PyTorch from source following the regular instructions:
python setup.py install
- Install Bazel following the instructions. You should install version >= 0.24.1.
- Build the PyTorch/XLA source:
cd xla/
python setup.py install
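Once both builds complete, a quick import check can confirm that the packages are visible to Python (a minimal sketch; run it from a directory outside the two source trees so that the in-tree torch/ and torch_xla/ folders do not shadow the installed packages):
python -c "import torch; print(torch.__version__)"
python -c "import torch_xla"
If both commands run without errors, the builds are installed correctly.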
To run the tests, follow one of the options below:
- Run on local CPU using the XRT client:
export XRT_DEVICE_MAP="CPU:0;/job:localservice/replica:0/task:0/device:XLA_CPU:0"
export XRT_WORKERS="localservice:0;grpc://localhost:40934"
Select any free TCP port you prefer instead of 40934 (totally arbitrary). A smoke-test sketch follows after these options.
- Run on Cloud TPU using the XRT client, with one of the following configurations:
- Set the XRT_TPU_CONFIG environment variable:
export XRT_TPU_CONFIG="tpu_worker;0;<IP of the TPU node>:8470"
- Create a $HOME/.pytorch_tpu.conf file with the following content:
worker: tpu_worker <IP of the TPU node>:8470
- Note that the IP of the TPU node can change if the TPU node is reset. If PyTorch seems to hang at startup, verify that the IP of your TPU node is still the same as the one you configured.
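With either configuration in place, a quick smoke test can confirm that a tensor can be created on the XLA device (a minimal sketch, assuming the torch_xla.core.xla_model module provided by the build above):
python -c "import torch; import torch_xla.core.xla_model as xm; print(torch.randn(2, 2, device=xm.xla_device()))"
If the setup is correct this prints a small XLA tensor; on a TPU configuration it may take a while the first time, while the connection to the TPU node is established.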
Then run test/run_tests.sh and test/cpp/run_tests.sh to verify that the setup is working.
It is recommended to use Conda environments to isolate PyTorch/TPU packages from other packages. To install Anaconda, follow the instructions. Then create an environment dedicated to PyTorch/TPU and activate it (activation needs to happen every time you want to work in that environment):
conda create --name pytorch_tpu --clone base
source activate pytorch_tpu
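To confirm that the environment is active, the following optional check should list pytorch_tpu with an asterisk next to it:
conda env list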
Install the gsutil package, following the instructions, to allow access to GCS (Google Cloud Storage).
Then run:
scripts/update_torch_wheels.sh
The same script can be run again when you want to update the PyTorch/TPU wheels.
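Before running the script, a quick way to confirm that gsutil is installed and on the PATH (optional check):
gsutil version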