
Running on GPU nodes in the new PPI

Recipe

  1. Connect to MET using VPN

  2. Before you can reach the actual compute/GPU nodes, you need to log in to one of the login nodes:

$ ssh -AX <MET username>@ppi-r8login-a1.int.met.no

or

$ ssh -AX <MET username>@ppi-r8login-b1.int.met.no

  3. After that, start an interactive job on a GPU node (here GPU node 1 in room A with 10 GB of memory allocated) and set up the environment:

$ qlogin -q gpu-r8.q -l h=gpu-01.ppi.met.no,h_rss=10G,mem_free=10G
$ source /modules/rhel8/conda/install/etc/profile.d/conda.sh
$ conda activate gpuocean
$ module use /modules/MET/rhel8/user-modules
$ module load cuda
$ cd <root dir for notebooks or scripts>

$ jupyter notebook --no-browser --ip=$(hostname -f)

(Then connect to the URL printed by Jupyter. If you are in room A, you need to add ".int.met.no" after the hostname.)

or, for batch/script execution:

$ python some_script.py

Note that some of the commands above can be collected in your .bashrc; a minimal sketch is given after this list.

  4. Remember to exit the notebook server and log out of the compute/GPU node (to kill the interactive job and free the resources) when you are finished.
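
As a minimal sketch of the .bashrc helper mentioned in the note above: the function name gpuocean_env is hypothetical, while the paths and module names are the ones used in step 3.

# Hypothetical helper for ~/.bashrc, collecting the environment setup from step 3
gpuocean_env () {
    source /modules/rhel8/conda/install/etc/profile.d/conda.sh
    conda activate gpuocean
    module use /modules/MET/rhel8/user-modules
    module load cuda
}

After re-sourcing .bashrc, running gpuocean_env inside the qlogin session performs the conda and CUDA setup in one step.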

References

Information about "new" PPI (running RHEL 8) and the queue system (Sun Grid Engine) can be found in this section: https://dokit.met.no/brukerdok/post-processing_infrastructure?s[]=ppi#new_ppi_implementation_rhel8

See hostnames and commands for interactive logins at https://dokit.met.no/brukerdok/post-processing_infrastructure?s[]=ppi#gpu_nodes_in_rhel8

For batch jobs, use the qsub command. An example job script can be found at https://dokit.met.no/brukerdok/post-processing_infrastructure?s[]=ppi#basic_example_submission_script (try with just the absolutely necessary queue/job variables first, like memory and queue); a rough sketch is shown below.
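
As a rough illustration only (not taken from the MET documentation), a minimal SGE submission script could reuse the queue and memory flags from the interactive qlogin example above; the job name and script name are hypothetical placeholders.

#!/bin/bash
# Minimal illustrative submission script
# Request the GPU queue and memory, as in the interactive qlogin example above
#$ -q gpu-r8.q
#$ -l h_rss=10G,mem_free=10G
# Hypothetical job name; run from the submission directory
#$ -N gpuocean_job
#$ -cwd

source /modules/rhel8/conda/install/etc/profile.d/conda.sh
conda activate gpuocean
module use /modules/MET/rhel8/user-modules
module load cuda

python some_script.py

Submit the script with qsub and monitor it with qstat; start from this minimal set of variables and add more only if needed.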
