[feature request] Add matrix functions #9983
Comments
Matrix power can be easily implemented in PyTorch by following the |
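In the meantime, a minimal sketch of what such an implementation could look like, using nothing but repeated torch.matmul (naive, n - 1 multiplications; the function name is just for illustration):

```python
import torch

def naive_matrix_power(A, n):
    # Naive integer matrix power: multiply A into the identity n times.
    # Assumes A is a square 2D tensor and n is a non-negative integer.
    result = torch.eye(A.size(0), dtype=A.dtype, device=A.device)
    for _ in range(n):
        result = result @ A
    return result

A = torch.rand(3, 3, dtype=torch.float64)
assert torch.allclose(naive_matrix_power(A, 3), A @ A @ A)
```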
Are these worth porting into ATen? If so, I can take it up. |
Yes, please! |
A generalization based on SVD (https://en.wikipedia.org/wiki/Matrix_function) may also be useful (for simple graph signal processing): diagonalize the matrix somehow (svd by default, maybe some optimized svd for symmetric matrices), apply the user-supplied function to eigenvalues, re-multiply the factors back. For diagonalizable matrices, this way of doing matrix power may also be faster (for large powers). This would also enable matrix square root, though for matrix square root there seem to be specialized approx algorithms based on newton iteration. |
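For symmetric (Hermitian) matrices, the idea above is only a few lines with the current torch.linalg API; a hedged sketch (the function name and the choice of torch.linalg.eigh are mine, and gradients through eigh degrade when eigenvalues are nearly repeated):

```python
import torch

def spectral_matrix_function(A, f):
    # Apply a scalar function f to a symmetric/Hermitian matrix via its
    # eigendecomposition: f(A) = V diag(f(lambda)) V^H. Batches are supported.
    lam, V = torch.linalg.eigh(A)
    return V @ torch.diag_embed(f(lam).to(V.dtype)) @ V.mH

# Usage sketch: exponential, logarithm and square root of an SPD matrix.
B = torch.rand(4, 4, dtype=torch.float64)
A = B @ B.mT + torch.eye(4, dtype=torch.float64)   # symmetric positive definite
expA = spectral_matrix_function(A, torch.exp)
logA = spectral_matrix_function(A, torch.log)
sqrtA = spectral_matrix_function(A, torch.sqrt)
assert torch.allclose(sqrtA @ sqrtA, A, atol=1e-10)
```

This also covers non-integer powers, e.g. `spectral_matrix_function(A, lambda x: x ** 0.5)`, as long as the matrix stays symmetric positive definite.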
@zou3519 @fmassa I went through the SciPy implementation for
|
Upvote for this, matrix exponential |
The approach by @vadimkantorov seems to make the most sense IMO because there's already a (differentiable) PyTorch function for SVD, and after that the exponentiation is just an elementwise exp on the diagonal.
|
Already implemented |
OK great to hear that @ssnl! Can you point out the function (also for reference to anyone else that happens to stumble on this thread)? Can't find it in the docs... |
This issue is partially resolved: |
@harpone here it is for reference https://pytorch.org/docs/master/torch.html?highlight=matrix_power#torch.matrix_power |
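A quick usage sketch of that function for anyone skimming (float64 just to keep the comparison tight):

```python
import torch

A = torch.rand(3, 3, dtype=torch.float64)
# Integer matrix power: A @ A @ A computed for you.
assert torch.allclose(torch.matrix_power(A, 3), A @ A @ A)
```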
Sorry about that. I updated the issue title and desc to reflect the current status :) |
@ssnl It would be nice if matrix_power also supported non-integer powers like 0.5 and other cases. It might require a full SVD approach, though, or some custom Newton-step approximation (I saw one in a paper on fast matrix square root estimation that avoids doing an SVD). |
@vadimkantorov Yep that would be nice. Do you still have a link to the paper? |
@ssnl For Newton-step for matrix square root: https://arxiv.org/abs/1707.06772 |
Since this involves a paper, maybe this can go into |
I think a larger feature to decide is about supporting non-integer powers in matrix_power. (The paper is just about a particular faster matrix square root approximation scheme - I agree, it can go to contrib.) |
Could the developers please tell us when we should expect implementations of the expm and logm functions in PyTorch? |
This might be helpful and CC @SkafteNicki |
Would this allow back-prop with functions involving matrix_expo? |
That guy seems to have implemented this as well. I tried to build his module but got stuck with linkage issues. The pure pytorch example was working correctly. |
@ferrine I never got the cuda implementation to work, because I got stuck with some linkage to cuSOLVER. The pure pytorch example works fine and can back-prop gradients. The cuda implementation was just to get a faster implementation, since I use matrix exponentials quite a bit myself. |
I've managed to implement (with a simple test) matrix exponential using torch script here: |
Just in case you need a taste for the limitations of various matrix exponential algorithms: https://www.cs.cornell.edu/cv/ResearchPDF/19ways+.pdf |
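For illustration only, here is a minimal sketch of the simplest family of methods from that survey, a truncated Taylor series with scaling and squaring; it has no error control, so treat it as a toy, not as the algorithm PyTorch would ship:

```python
import torch

def expm_taylor(A, taylor_order=18):
    # 1) Scale A by a power of two so the truncated series converges quickly.
    norm = A.abs().sum(dim=-1).max()  # crude bound on the infinity norm
    s = int(torch.clamp(torch.ceil(torch.log2(norm)), min=0).item()) if norm > 0 else 0
    A_scaled = A / (2 ** s)
    # 2) Truncated Taylor series: exp(X) ~ sum_{k=0..order} X^k / k!
    eye = torch.eye(A.size(-1), dtype=A.dtype, device=A.device)
    result, term = eye.clone(), eye.clone()
    for k in range(1, taylor_order + 1):
        term = term @ A_scaled / k
        result = result + term
    # 3) Undo the scaling: exp(A) = exp(A / 2^s) squared s times.
    for _ in range(s):
        result = result @ result
    return result

# Sanity check against the eigendecomposition of a symmetric matrix.
B = torch.rand(4, 4, dtype=torch.float64)
A = (B + B.T) / 2
lam, V = torch.linalg.eigh(A)
assert torch.allclose(expm_taylor(A), V @ torch.diag(lam.exp()) @ V.T, atol=1e-10)
```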
Are there any updates on this? I'm looking forward to the matrix log! |
Sorry, no update on matrix log or matrix sqrt. We are planning to review better SciPy linear algebra compatibility in the near future, however. |
@nikitaved has done work on matrix sqrt, which should be unblocked once his |
also looking forward to matrix sqrt! thanks for sharing an update on this |
The issue I see with the matrix sqrt is its implementation for CUDA: it is not so straightforward to translate SciPy's implementation to the GPU (no Schur decomposition, data dependencies in loops). It is possible, however, to implement an iterative algorithm either only for the GPU, or for both the CPU and CUDA. If we agree on iterative algorithms, then I could start working on it. |
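To make the iterative option concrete, here is a sketch in the spirit of the coupled Newton-Schulz iteration from the paper linked earlier; it only needs matmuls, so it is GPU- and batch-friendly, but it converges only for matrices whose (rescaled) spectrum lies in the convergence region, e.g. symmetric positive definite ones. The function name and iteration count are arbitrary choices for the sketch:

```python
import torch

def sqrtm_newton_schulz(A, num_iters=20):
    # Coupled Newton-Schulz iteration: Y -> sqrt(A/||A||_F), Z -> its inverse.
    n = A.size(-1)
    norm = torch.linalg.matrix_norm(A, ord='fro', keepdim=True)
    Y = A / norm
    I = torch.eye(n, dtype=A.dtype, device=A.device)
    Z = I.expand_as(A).clone()
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y, Z = Y @ T, T @ Z
    return Y * norm.sqrt()  # undo the initial rescaling

# Usage sketch on a batch of symmetric positive definite matrices.
B = torch.rand(8, 5, 5, dtype=torch.float64)
A = B @ B.mT + 5 * torch.eye(5, dtype=torch.float64)
S = sqrtm_newton_schulz(A)
assert torch.allclose(S @ S, A, atol=1e-8)
```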
Any idea when matrix log will be available? |
As @nikitaved mentioned above, it is particularly tricky to implement on the GPU. In the meantime, here is a differentiable wrapper around scipy.linalg.logm:

```python
import scipy.linalg
import torch


def adjoint(A, E, f):
    # Gradient of a matrix function f via the block-matrix trick: the upper-right
    # block of f([[A^H, E], [0, A^H]]) is the Fréchet derivative of f at A^H
    # applied to E, which is exactly what the backward pass needs.
    A_H = A.mH.to(E.dtype)
    n = A.size(0)
    M = torch.zeros(2 * n, 2 * n, dtype=E.dtype, device=E.device)
    M[:n, :n] = A_H
    M[n:, n:] = A_H
    M[:n, n:] = E
    return f(M)[:n, n:].to(A.dtype)


def logm_scipy(A):
    # Round-trip through SciPy on the CPU, then move back to A's device.
    return torch.from_numpy(scipy.linalg.logm(A.cpu(), disp=False)[0]).to(A.device)


class Logm(torch.autograd.Function):
    @staticmethod
    def forward(ctx, A):
        assert A.ndim == 2 and A.size(0) == A.size(1)  # Square matrix
        assert A.dtype in (torch.float32, torch.float64, torch.complex64, torch.complex128)
        ctx.save_for_backward(A)
        return logm_scipy(A)

    @staticmethod
    def backward(ctx, G):
        A, = ctx.saved_tensors
        return adjoint(A, G, logm_scipy)


logm = Logm.apply

A = torch.rand(3, 3, dtype=torch.float64, requires_grad=True)
torch.autograd.gradcheck(logm, A)
A = torch.rand(3, 3, dtype=torch.complex128, requires_grad=True)
torch.autograd.gradcheck(logm, A)
```

It does not support batches, as scipy.linalg.logm only works on one square matrix at a time. If you replace the call to scipy.linalg.logm with another SciPy matrix function, the same wrapper gives you that function with the correct gradients. |
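If batching is needed despite the SciPy round trip, a simple (and admittedly slow) workaround is to loop over the batch dimension in Python; a sketch assuming the `logm` defined above, with matrices kept close to the identity so that a real logarithm exists:

```python
import torch

def batched_logm(A):
    # Flatten the batch dimensions, apply the single-matrix logm to each slice,
    # and stack the results back. Gradients flow through each slice.
    flat = A.reshape(-1, A.size(-2), A.size(-1))
    return torch.stack([logm(M) for M in flat]).reshape(A.shape)

A = (torch.eye(3, dtype=torch.float64)
     + 0.1 * torch.rand(4, 3, 3, dtype=torch.float64)).requires_grad_()
torch.autograd.gradcheck(batched_logm, A)
```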
Any updates on the matrix square root? |
Not much yet, for the same reasons as for the matrix logarithm. In the meantime, the same SciPy-based trick works:

```python
import scipy.linalg
import torch


def adjoint(A, E, f):
    # Same block-matrix trick as for the matrix logarithm above.
    A_H = A.T.conj().to(E.dtype)
    n = A.size(0)
    M = torch.zeros(2 * n, 2 * n, dtype=E.dtype, device=E.device)
    M[:n, :n] = A_H
    M[n:, n:] = A_H
    M[:n, n:] = E
    return f(M)[:n, n:].to(A.dtype)


def sqrtm_scipy(A):
    return torch.from_numpy(scipy.linalg.sqrtm(A.cpu(), disp=False)[0]).to(A.device)


class Sqrtm(torch.autograd.Function):
    @staticmethod
    def forward(ctx, A):
        assert A.ndim == 2 and A.size(0) == A.size(1)  # Square matrix
        assert A.dtype in (torch.float32, torch.float64, torch.complex64, torch.complex128)
        ctx.save_for_backward(A)
        return sqrtm_scipy(A)

    @staticmethod
    def backward(ctx, G):
        A, = ctx.saved_tensors
        return adjoint(A, G, sqrtm_scipy)


sqrtm = Sqrtm.apply

A = torch.rand(3, 3, dtype=torch.float64, requires_grad=True)
torch.autograd.gradcheck(sqrtm, A)
A = torch.rand(3, 3, dtype=torch.complex128, requires_grad=True)
torch.autograd.gradcheck(sqrtm, A)
```
|
@nikitaved, @lezcano any progress with implementing |
Those issues are not fixed in TF. TF doesn't provide any better functionality or substantial performance improvements than
|
Here's a simple implementation of the matrix logarithm that does not use SciPy. Importantly, it can run on devices other than the CPU and can be backpropagated through, but it assumes that the matrix is diagonalizable. Hopefully this helps:

```python
import numpy as np
import torch


def torch_logm(A):
    # Diagonalize A, take the elementwise log of the eigenvalues, multiply back.
    lam, V = torch.linalg.eig(A)
    V_inv = torch.inverse(V).to(torch.complex128)
    V = V.to(torch.complex128)
    if not torch.allclose(A.to(torch.complex128), V @ torch.diag(lam).to(torch.complex128) @ V_inv):
        raise ValueError("Matrix is not diagonalizable, cannot compute matrix logarithm!")
    log_A_prime = torch.diag(lam.log())
    return V @ log_A_prime @ V_inv
```

Quick example usage:

```python
>>> A = torch.tensor(np.random.rand(3, 3), requires_grad=True).to('cuda')
>>> A.retain_grad()
>>> torch_logm(A).sum().backward()
>>> A.grad
# tensor([[0.4875, 1.0091, 0.7602],
#         [0.1207, 0.2930, 1.2993],
#         [0.6632, 0.6987, 0.3534]], device='cuda:0', dtype=torch.float64)
```
|
The issue with this implementation is that the computation is very badly conditioned, as there is no stable algorithm to compute the eigenvalues of an arbitrary matrix. This particular implementation also has a number of other issues: for example, it computes the inverse of the eigenvector matrix explicitly, which is also problematic. Even more, it recomputes the eigenvalues a second time. If you want a stable and numerically correct implementation of this function, albeit with some limitations due to the use of SciPy, see #9983 (comment). Feel free to have a peek at the SciPy implementation of this function to see why it is so tricky to implement, and even more, why it is so tricky to implement on a GPU (it has lots of data-dependent parts). |
Thanks for the quick response! I've edited the above to remove the extra computation. Also, why is computing the inverse of the eigenvectors explicitly a problem? How would you go about not doing so? |
These questions related to classical linear algebra may be better suited for the forum: https://discuss.pytorch.org/ Feel free to ping me there and I may be able to help. |
Hi there. I implemented an approximate version of the matrix logarithm based on the discussion above (supports GPU, batches, and complex dtypes). It works well in my coursework (density matrices, eigenvalues between 0 and 1, quantum stuff). You may try it if you need one.
(I am not sure whether it works in general or how accurate it is. Also, its performance may not be good.) Edit: sorry for the interruption, please ignore this; it is not a good implementation (in both accuracy and performance). |
For constant integer exponents, matrix power may also be implemented or lowered as a product of power-of-two-exponent matrix powers (corresponding to the binary representation of the exponent) #84569 (comment)
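A small Python sketch of that lowering, i.e. exponentiation by squaring (about log2(n) multiplications instead of n - 1; this is just the idea, not the ATen code referenced below):

```python
import torch

def matrix_power_by_squaring(A, n):
    # Multiply in A^(2^k) for every set bit of the exponent n.
    assert n >= 0
    result = torch.eye(A.size(-1), dtype=A.dtype, device=A.device)
    base = A
    while n > 0:
        if n & 1:
            result = result @ base
        n >>= 1
        if n > 0:
            base = base @ base
    return result

A = torch.rand(3, 3, dtype=torch.float64)
assert torch.allclose(matrix_power_by_squaring(A, 5), A @ A @ A @ A @ A)
```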
That's currently available in |
Then it's interesting whether it uses https://en.wikipedia.org/wiki/Exponentiation_by_squaring and when it makes sense for lowering (the used memory would be higher)... |
It is implemented this way, yes. Any other way would be just too slow. See pytorch/aten/src/ATen/native/LinearAlgebra.cpp, lines 655 to 667 (at 777ac63).
|
Probably worth mentioning this in the docs explicitly. I also wonder if it's supported for various scripting / onnx export. At least, does it lower x.matrix_power(2) to x @ x? |
Can I BUMP this with another request for torch.matrix_log? |
Add matrix power as implemented by numpy.linalg.matrix_power, matrix exponential as implemented by scipy.linalg.expm, and matrix logarithm as implemented by scipy.linalg.logm.
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @jianyuh @nikitaved @pearu @mruberry @heitorschueroff @walterddr @IvanYashchuk @xwang233 @lezcano @rgommers