[WIP] Compute the determinant of a matrix on the GPU by juancamilog · Pull Request #6193 · Theano/Theano · GitHub

[WIP] Compute the determinant of a matrix on the GPU #6193


Open: juancamilog wants to merge 2 commits into master from cusolver_determinant
Conversation

juancamilog
Contributor

This PR adds a GpuLU op that computes the LU factorization of a square matrix on the GPU. Two helper functions, gpu_det and gpu_slogdet, are built on top of GpuLU. They are in turn used to define lifter optimizations that replace the theano.tensor.nlinalg.Det op with a graph that computes the determinant from the LU factorization, as explained here: https://en.wikipedia.org/wiki/LU_decomposition#Computing_the_determinant
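
For readers unfamiliar with the trick, here is a minimal CPU sketch in pure Python of how a determinant and a sign/log-determinant pair can be read off an LU factorization with partial pivoting. It is not the code in this PR: the names `lu_factor`, `det_from_lu` and `slogdet_from_lu` are hypothetical, and the packed-LU-plus-pivot-indices layout is only assumed to resemble what a getrf-style routine returns.

```python
import math


def lu_factor(a):
    """LU factorization with partial pivoting (illustrative, O(n^3)).

    Returns (lu, piv): lu packs L (unit diagonal, below the diagonal)
    and U (on and above the diagonal); piv[k] is the row that was
    swapped with row k at step k.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]
    piv = list(range(n))
    for k in range(n):
        # Choose the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        piv[k] = p
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
        if lu[k][k] == 0.0:
            # Whole column is zero below the diagonal: matrix is singular.
            continue
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def det_from_lu(lu, piv):
    """det(A) = (-1)^(number of row swaps) * product of diag(U)."""
    det = 1.0
    for k, p in enumerate(piv):
        det *= lu[k][k]
        if p != k:
            det = -det
    return det


def slogdet_from_lu(lu, piv):
    """Return (sign, log|det|), which avoids overflow for large matrices."""
    sign, logabs = 1.0, 0.0
    for k, p in enumerate(piv):
        d = lu[k][k]
        if d == 0.0:
            return 0.0, -math.inf
        if p != k:
            sign = -sign
        if d < 0.0:
            sign = -sign
        logabs += math.log(abs(d))
    return sign, logabs


lu, piv = lu_factor([[4.0, 3.0], [6.0, 3.0]])
det = det_from_lu(lu, piv)
sign, logabs = slogdet_from_lu(lu, piv)
```

The sign/log form is useful because the product of the diagonal can overflow or underflow for large matrices, while the sum of logarithms stays well scaled.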

For small matrices on a Titan Xp, computing the determinant on the GPU is slightly slower than on the CPU. However, keeping the operation on the GPU removes a memory transfer, which can result in a significant speedup when the Det op is called inside a scan loop. For matrices larger than 1000x1000, the GPU version is faster.

I've marked this as WIP because I believe a better job can be done by writing a GpuDet Op that implements the same algorithm in the Op's c_code. I'm looking at how the Magma Ops do this, but I'm not sure how to call cuSOLVER directly from C code.

juancamilog force-pushed the cusolver_determinant branch from a187e75 to bf9a2b9 on August 2, 2017 23:55