-
Notifications
You must be signed in to change notification settings - Fork 263
Simplify creation of identity matrices in the Siegel manifold #1918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify creation of identity matrices in the Siegel manifold #1918
Conversation
Once the function
This error comes from the identity matrix created in the function |
Hello @ninamiolane, @SaitejaUtpala and @luisfpereira, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please address my comments? After we change accordingly, everything related with GPUs has to be solved at the backend level, though I think it should probably work fine already.
geomstats/geometry/siegel.py
Outdated
@@ -231,7 +228,7 @@ def inner_product(self, tangent_vec_a, tangent_vec_b, base_point): | |||
inner_product : array-like, shape=[...,] | |||
Inner-product. | |||
""" | |||
identity = _create_identity_mat(base_point.shape, dtype=base_point.dtype) | |||
identity = _eye_like(base_point) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking back to the code, I've realized that every time we use this function, we actually only need to create a (n, n) identity matrix with same type as base_point
. Therefore, I suggest we remove the function _eye_like
and do:
identity = gs.eye(space.shape, dtype=base.dtype)
I think it will work fine gpu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @luisfpereira,
No, using:
identity = gs.eye(space.shape, dtype=base.dtype)
is not sufficient for two reasons.
The first reason is the shape.
The first input of the function torch.eye should be an integer n
, not a shape.
Then, a 2D tensor is returned.
I would like my identity matrix to be 2D or 3D (in this case it is a repetition of 2D identity matrices along the first axis), depending on the shape of the input arrays of my function.
The second reason is the device.
When using the PyTorch backend with GPUs enable, the following code:
a = torch.tensor([[0.5 + 0.2j, 0], [0, 0.5]], requires_grad=True, device=device)
returns
tensor([[0.5000+0.2000j, 0.0000+0.0000j],
[0.0000+0.0000j, 0.5000+0.0000j]], device='cuda:0', requires_grad=True)
Then
gs.eye(a.shape[0], dtype=a.dtype).device
returns
device(type='cpu')
which is not what we want.
However,
gs.eye(a.shape[0], dtype=a.dtype, device=a.device).device
returns
device(type='cuda', index=0)
which is what we want.
So it is not sufficient to use the dtype
optional input argument when using the PyTorch backend on GPUs, we have to use the device
optional input argument.
However, note that for the NumPy backend, np.eye
do not accept a device
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It think that the easiest way to do what I want is to use _eye_like
as I have proposed, using the function gs.zeros_like
which is designed to create a zero matrix compatible with the input matrix for both the NumPy
and the PyTorch
backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the shapes, even though it looks like you need to create 3d arrays, actually it suffices (and is better) to create 2d arrays, as the operations where they are used take advantage of broadcasting (thus, we save some memory).
For the device, I'm not suggesting passing device
as input, I'm suggesting that when we set globally the device to be the GPU, then any tensor is created by default on the GPU. I thought this was actually the behavior of pytorch
. But if not, we can easily implement this behavior in the backend, only concerning pytorch.
This is the part that we need to test, probably not the full code.
geomstats/algebra_utils.py
Outdated
identity = gs.repeat( | ||
gs.zeros_like(gs.reshape(vector, (-1, num_columns))[:1, :]), | ||
repeats=num_columns, | ||
axis=0, | ||
) | ||
identity[range(num_columns), range(num_columns)] = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
identity = gs.eye(num_cols, dtype=vector.dtype
suffices, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is not sufficient for two reasons:
- the shape of the output would be 2D, we want it to be 2D or 3D.
- the
device
would always be CPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
Thanks @YannCabanes (see my review up). GPUs are in our head, but it is very unlikely that we can test code on them with github continuous integration (as far as I know, we can only access CPUs). We can definitely start creating some tests to run locally and mark them as My first intuition, is that we need to be careful mostly with the backend functions that create arrays, to ensure they are created in the right device. Most of it should be already taken care of by Anyway, I think a good starting point is to solve the problem in the places we already know. Then we need to find out a strategy to identify places where we've done similar mistakes. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1918 +/- ##
==========================================
+ Coverage 90.29% 90.55% +0.27%
==========================================
Files 143 148 +5
Lines 13407 13554 +147
==========================================
+ Hits 12105 12273 +168
+ Misses 1302 1281 -21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Hello @luisfpereira, Would you like to start adding PyTorch GPU tests in this PR?
What do you think? |
I think we can add the tests in a separate PR, as it is preferable to solve this first. Then, I suggest we add them in the backends folder. I would like to add precise tests for the gpu, as it is costly to be always running too many (reduntant?) tests. Ideally, we want to check the device where some tensors live. |
Hello @luisfpereira, It is fine for me to add the tests in a different PR. The modifications of the code you are suggesting to create the identity matrices using: However, with the modifications that you suggested, if the user does not write
then the following error is returned:
Are we willing to fix this error for the users who do not use |
With the current design of the backend (which does not take into account a device), we have to assume that the user will do with torch.device('cuda:1'):
r = torch.randn(2, 3) In the long run we can try something out where the innerly created tensors check the device of the arrays of the computations they'll be involved in. But we definitely need to make this change in a principled way and probably take a look to how it is done in other backends, to ensure we can also use them in the future (thinking about |
Hello @luisfpereira, |
Thanks @YannCabanes. Merging. |
The function
siegel.metric.isometry
returns an error when used on PyTorch GPU.When I run the following code in a Google Colab environment with the execution type "T4 GPU":
I obtain the following error message:
Indeed, the identity matrix created with the function
_create_identity_mat
is always created on CPU.To fix this error, this PR replaces the function
_create_identity_mat
by the function_eye_like
.