Add support for CUDA sparse BA solver #2717
Conversation
@S-o-T FYI, since I believe you added this new feature to Ceres.

```cpp
#if (CERES_VERSION_MAJOR >= 3 || \
     (CERES_VERSION_MAJOR == 2 && CERES_VERSION_MINOR >= 2)) && \
    !defined(CERES_NO_CUDSS) && defined(CUDA_ENABLED)
  if (options_.use_gpu) {
```
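For reference, a minimal sketch of how a guard like this could be wired into a backend-selection helper; the function name and the CPU fallback are assumptions for illustration, not the actual COLMAP code:

```cpp
#include <ceres/types.h>
#include <ceres/version.h>

// Illustrative helper mirroring the guarded check above: prefer the CUDA
// sparse backend when Ceres >= 2.2 was built with cuDSS and CUDA is enabled,
// otherwise fall back to a CPU sparse backend. Everything except the guard
// and the use_gpu flag is a hypothetical sketch.
ceres::SparseLinearAlgebraLibraryType ChooseSparseBackend(bool use_gpu) {
#if (CERES_VERSION_MAJOR >= 3 || \
     (CERES_VERSION_MAJOR == 2 && CERES_VERSION_MINOR >= 2)) && \
    !defined(CERES_NO_CUDSS) && defined(CUDA_ENABLED)
  if (use_gpu) {
    return ceres::CUDA_SPARSE;
  }
#endif
  return ceres::SUITE_SPARSE;
}
```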
I think it might be worth some experimentation/tuning to find a problem-size threshold for switching from the CPU to the GPU sparse backend, although it might be too sensitive to the choice between SuiteSparse and Eigen (at least until the MKL sparse backend is merged into ceres-solver).
Yes, the current parameters are not tuned. It will be difficult to find universal thresholds here that generalize across different CPU/GPU models; it was a little easier before, as we switched between algorithms that all run on the CPU. As such, I decided to explicitly expose the parameters through the option manager for the bundle adjuster, and we might need to do the same for the bundle adjustment that runs as part of the mapper. For now, this feature is disabled by default, as I want to gain some more experience with it over the next weeks.
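To make the discussion concrete, a minimal sketch of what such a problem-size heuristic could look like; the parameter name and threshold value below are hypothetical and untuned, not the options exposed by this PR:

```cpp
#include <ceres/types.h>

// Purely illustrative heuristic: only switch to the GPU sparse backend above
// a certain problem size. The member name and default value are assumptions;
// a real threshold would need tuning per CPU/GPU combination.
struct SparseBackendHeuristic {
  int min_num_images_for_gpu = 500;  // hypothetical, untuned

  ceres::SparseLinearAlgebraLibraryType Select(int num_images,
                                               bool gpu_available) const {
    if (gpu_available && num_images >= min_num_images_for_gpu) {
      return ceres::CUDA_SPARSE;
    }
    return ceres::SUITE_SPARSE;
  }
};
```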
I'm getting the following when trying to use the ba_use_gpu flag:
The only instance of ceres I know of on my machine is the ceres.dll inside the colmap/bin folder. I am using CUDA 11.8; FWIW, nvidia-smi tells me the CUDA version is 12.3. My CUDA_HOME and CUDA_PATH both point to … Am I missing something or using this incorrectly? I am only trying this because I'm getting sudden crashes during bundle adjustment: I have tried numerous known-working datasets that never crashed before, now everything I process crashes during BA, and I am trying to pinpoint which dependencies on my system might have changed.
You need to custom-compile the latest Ceres development version with CUDA support. This is not part of the current set of pre-compiled colmap binaries.
Okay, I can certainly do that. I updated my original message, but I'll include it in this reply as well. Sorry to hijack this topic; I'm happy to open a new issue, but was trying to avoid that. "I am only trying this because I'm getting sudden crashes during bundle adjustment and have tried numerous, known working datasets that never crashed before and now everything I process crashes during BA and I am trying to pinpoint what dependencies on my system might have changed." Perhaps something in vcpkg from the VCPKG_ROOT env variable?
Looks like ceres is causing the problem for me. I am doing another run now with a ceres dll (with CUDA support) that I built using vcpkg.
You'll have to recompile colmap from scratch; you cannot just recompile ceres.dll and replace it. First, the colmap build system detects CUDA support in Ceres at compile time, not at runtime. Second, C++ does not have a stable ABI, so you cannot mix and match different compiler or standard library versions, and it will be difficult to match the exact combination of OS/compiler/stdlib/etc. on your system.
This PR implements enhancement #2643. It builds upon Ceres' recent CUDA_SPARSE solver type.
Initial experiments show significant runtime improvements. On my machine with an Intel Core i9 10920X and an NVIDIA RTX 2070, I see a consistent 3x speedup for reconstructions with ~500-5000 images. For smaller problems with ~100 images, the runtime is roughly equivalent. These experiments were done using CUDA 12.5 and cuDSS 0.3.0.
For now, the feature is disabled by default and must be enabled explicitly via the option. This is because there is no robustness yet against situations where the GPU does not have enough memory, and some of the thresholds that determine when to use sparse direct vs. indirect solvers still need to be tuned for this new scenario.
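To illustrate the direct vs. indirect trade-off mentioned above, here is a minimal sketch of how a bundle adjuster might pick between the two Ceres Schur solvers; the image-count threshold and function name are hypothetical, not the logic added in this PR:

```cpp
#include <ceres/ceres.h>

// Illustrative only: use a sparse direct Schur solver (optionally CUDA-backed)
// for small to medium problems, and an iterative Schur solver with a
// Schur-Jacobi preconditioner for very large ones. The threshold is a
// hypothetical, untuned value.
void ConfigureBundleAdjustmentSolver(int num_images, bool use_gpu,
                                     ceres::Solver::Options* options) {
  constexpr int kMaxImagesForDirectSolver = 2000;  // hypothetical threshold
  if (num_images <= kMaxImagesForDirectSolver) {
    options->linear_solver_type = ceres::SPARSE_SCHUR;
    options->sparse_linear_algebra_library_type =
        use_gpu ? ceres::CUDA_SPARSE : ceres::SUITE_SPARSE;
  } else {
    options->linear_solver_type = ceres::ITERATIVE_SCHUR;
    options->preconditioner_type = ceres::SCHUR_JACOBI;
  }
}
```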