Memory leak in high order derivative? (pytorch 0.2) #2498
Can you try building trunk and running your test to see if it still leaks?
@gchanan As far as I know, this issue is related to the BatchNorm layers. If I remove all of those layers, memory usage stays constant.
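A hypothetical way to check that observation (the `strip_batchnorm` helper below is illustrative, not from the thread, and uses the modern `nn.Identity`, which postdates 0.2): replace every BatchNorm layer with a no-op and rerun the loop.

```python
import torch.nn as nn

def strip_batchnorm(module):
    # Recursively replace BatchNorm1d/2d/3d children with no-op Identity
    # layers, leaving the rest of the network unchanged.
    for name, child in module.named_children():
        if isinstance(child, nn.modules.batchnorm._BatchNorm):
            setattr(module, name, nn.Identity())
        else:
            strip_batchnorm(child)
    return module
```

If memory stays flat after this substitution, the leak is isolated to the BatchNorm double-backward path.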
@LIU-Xuanqing did you try installing from source as I mentioned above?
@gchanan Aha... I used the wrong source, from https://github.com/pytorch/pytorch/releases/tag/v0.2.0. It works using the master branch, thanks!
Hi Xuanqing, @xuanqing94 I'm also working on second-order derivatives. Can you help with two questions here?
Thank you in advance.
@chao1224 Hi,
Thanks for answering, @xuanqing94.
Not sure if this implementation is efficient, but when calculating a Hessian-vector product in a loop, it terminates with an out-of-memory error after ~30 iterations (Titan X). Basically, my implementation first calculates the loss, then the gradient with respect to the weights, takes an inner product with v, and then takes the gradient with respect to the weights again. Similar code works fine in Theano.
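For reference, a minimal sketch of that pattern (the toy `model`, data, and `v` below are illustrative placeholders, not the original code):

```python
import torch
import torch.nn.functional as F

# Toy model with a BatchNorm layer, matching the kind of network described.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 10),
    torch.nn.BatchNorm1d(10),
    torch.nn.Linear(10, 1),
)
params = list(model.parameters())
x = torch.randn(32, 10)
y = torch.randn(32, 1)
# v holds one tensor per parameter, with matching shapes.
v = [torch.randn_like(p) for p in params]

for step in range(100):
    loss = F.mse_loss(model(x), y)
    # First backward pass: dL/dw, keeping the graph for a second pass.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Scalar inner product <dL/dw, v>.
    inner_prod = sum((g * vi).sum() for g, vi in zip(grads, v))
    # Second backward pass: the Hessian-vector product Hv.
    Hv = torch.autograd.grad(inner_prod, params, create_graph=True)
```

The `create_graph=True` on the second `grad` call mirrors the call in the traceback below; on the affected 0.2 build, this kind of loop reportedly ran out of memory after roughly 30 iterations.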
Here is the traceback:
THCudaCheck FAIL file=xxx/pytorch-0.2.0/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "./main.py", line 50, in
Hv = torch.autograd.grad(inner_prod, grad_params, create_graph=True)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/init.py", line 153, in grad
inputs, only_inputs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functions/thnn/batchnorm_double_backwards.py", line 70, in batchnorm_double_backwards_fn
gI_2t = (gOinmu_sum * sigma2_eps_neg_3_2).div(M) * (ggI_sum.div(M) - ggI)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 820, in sub
return self.sub(other)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 332, in sub
return self._sub(other, False)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 326, in _sub
return Sub.apply(self, other, inplace)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/basic_ops.py", line 34, in forward
return a.sub(b)
RuntimeError: cuda runtime error (2) : out of memory at xxx/pytorch-0.2.0/torch/lib/THC/generic/THCStorage.cu:66