Thanks for your great work!
I am wondering what average GPU utilization you see when you train LoRAT on your devices. It seems that you used V100 and 4090 GPUs in the paper.
When I train LoRAT with your code, without any modification, GPU utilization is only about 60% on an H100. I have tested different values of `num_train_workers` and `num_io_threads_per_worker`, and GPU utilization stays at about 60% almost all the time. My file system is CPFS (Cloud Parallel File Storage by Aliyun), which is designed for high-performance computing.
I suspect the bottleneck is disk I/O, even though CPFS itself should be fast enough. So maybe we should switch to a dataset format designed for high-throughput reading, such as LMDB, WebDataset, or Parquet, instead of repeatedly reading individual image files from disk.
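One way to check whether data loading really is the bottleneck is to time how long the training loop waits for each batch versus how long it spends in the training step. Below is a minimal sketch using only the Python standard library. `measure_data_wait`, `batches`, and `train_step` are hypothetical names for illustration and are not part of the LoRAT code; the idea is to wrap whatever batch iterator and step function the training script already has.

```python
import time
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def measure_data_wait(
    batches: Iterable[T],
    train_step: Callable[[T], None],
    max_steps: int = 100,
) -> Tuple[float, float]:
    """Return (seconds spent waiting for batches, seconds spent in train_step).

    Note: on a GPU, kernels run asynchronously, so train_step should end with
    a synchronization (for PyTorch, torch.cuda.synchronize()) for the compute
    time to be meaningful.
    """
    wait = 0.0
    compute = 0.0
    it = iter(batches)
    for _ in range(max_steps):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # blocks while the loader is still producing data
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute


if __name__ == "__main__":
    # Stand-ins for a real loader and a real training step.
    def slow_loader(n: int, delay: float):
        for i in range(n):
            time.sleep(delay)  # simulated I/O latency
            yield i

    wait, compute = measure_data_wait(
        slow_loader(5, 0.02), lambda batch: time.sleep(0.01)
    )
    print(f"data wait fraction: {wait / (wait + compute):.0%}")
```

If the wait fraction is large, the loader is starving the GPU and a different storage format may help. If it is close to zero, the 60% utilization probably has another cause.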
Do you have any plans to support other dataset formats? I would be glad to help if you decide to add such support.
If you have any questions, please feel free to contact me.
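To make the idea concrete, here is a rough sketch of packing many small image files into a few large tar shards, which is the general approach WebDataset-style formats take so that reads become large and sequential. It uses only the standard-library `tarfile` module, not the `webdataset` package, and `write_shards` and `read_shard` are hypothetical helpers, not existing LoRAT functions. Annotations and the tracker's sampling logic are left out.

```python
import tarfile
from pathlib import Path
from typing import Iterable, Iterator, List, Tuple, Union

PathLike = Union[str, Path]


def write_shards(
    image_paths: Iterable[PathLike],
    out_dir: PathLike,
    samples_per_shard: int = 1000,
) -> List[Path]:
    """Pack image files into uncompressed tar shards and return the shard paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = [Path(p) for p in image_paths]
    shards: List[Path] = []
    for start in range(0, len(paths), samples_per_shard):
        shard_path = out / f"shard-{start // samples_per_shard:06d}.tar"
        with tarfile.open(shard_path, "w") as tar:
            chunk = paths[start:start + samples_per_shard]
            for idx, p in enumerate(chunk, start=start):
                # Store each file under a zero-padded key plus its extension.
                tar.add(p, arcname=f"{idx:08d}{p.suffix.lower()}")
        shards.append(shard_path)
    return shards


def read_shard(shard_path: PathLike) -> Iterator[Tuple[str, bytes]]:
    """Yield (member name, raw bytes) for each file in a shard, in order."""
    with tarfile.open(shard_path, "r") as tar:
        for member in tar:
            if member.isfile():
                fileobj = tar.extractfile(member)
                if fileobj is not None:
                    yield member.name, fileobj.read()
```

Each worker would then read whole shards sequentially and decode the images in memory. Whether this fits the random frame sampling that tracking datasets need is an open question that a real implementation would have to address.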
Best regards.