Evaluating the gradient within the log probability function? · Issue #29 · AdamCobb/hamiltorch
Open
@nmudur

Description


I have a neural network-based log probability function $\log p_{NN}(\theta \mid x, \vec{t})$. If I increase the size of $\vec{t}$, my code essentially creates a batch of $x$ repeated len($\vec{t}$) times. While I am able to refactor my code to compute this log probability over smaller batches of length len($\vec{t}_{\mathrm{batch}}$) and sum the contributions, I still run into memory issues when computing the gradient.
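
For concreteness, the batched accumulation I have in mind looks roughly like the sketch below; `log_p_nn` is a hypothetical stand-in for my actual NN-based log probability, not hamiltorch code:

```python
import torch

def log_prob_and_grad(theta, x, t, batch_size=32):
    # Accumulate log p and d(log p)/d(theta) over batches of t so that the
    # computation graph for all len(t) repeats of x is never held in memory
    # at once. `log_p_nn` is a hypothetical stand-in for the NN-based
    # log probability described above.
    theta = theta.detach().requires_grad_(True)
    total_logp = torch.zeros(())
    grad = torch.zeros_like(theta)
    for start in range(0, len(t), batch_size):
        t_batch = t[start:start + batch_size]
        logp = log_p_nn(theta, x, t_batch)  # scalar log prob contribution for this batch
        (batch_grad,) = torch.autograd.grad(logp, theta)
        total_logp = total_logp + logp.detach()
        grad = grad + batch_grad
    return total_logp, grad
```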

I further refactored the log probability function to accumulate the gradients while evaluating the log probability, so it now returns both the log probability and its gradient with respect to the parameters $\theta$. However, the `pass_grad` argument seems to accommodate only a constant tensor or a function that returns a tensor of dimension D. The NN-based log probability is also stochastic, so I can't wrap the gradient as a separate function and pass it in separately: a stand-alone gradient function would be evaluated with different randomness than the log probability itself.
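
One workaround I considered is caching the gradient computed alongside the log probability and having the `pass_grad` function return the cached value. This is only a rough sketch and it assumes hamiltorch queries `pass_grad` at the same parameter values it just passed to the log probability function, which I have not verified against the hamiltorch source:

```python
import torch

# Cache the gradient computed alongside the log probability so that a
# stochastic log_prob_func and pass_grad stay consistent with each other.
# Assumes pass_grad(params) is called for the same params just passed to
# log_prob_func (unverified assumption); x and t are taken from the
# enclosing scope for brevity.
_cache = {"params": None, "grad": None}

def log_prob_func(params):
    logp, grad = log_prob_and_grad(params, x, t)  # from the sketch above
    _cache["params"] = params.detach().clone()
    _cache["grad"] = grad
    return logp

def grad_func(params):
    # Fall back to recomputation if the cached parameters do not match.
    if _cache["params"] is None or not torch.equal(_cache["params"], params.detach()):
        _, grad = log_prob_and_grad(params, x, t)
        return grad
    return _cache["grad"]

# Hypothetical usage with hamiltorch.sample (argument values illustrative only):
# params_hmc = hamiltorch.sample(log_prob_func=log_prob_func, params_init=theta0,
#                                pass_grad=grad_func, num_samples=100,
#                                step_size=0.01, num_steps_per_sample=10)
```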

Ideally, I would like to restructure the code so that it evaluates the gradients at the same time as the log probability. I was going to modify my local copy of the hamiltorch package to do this, but I thought I'd first check whether there's already a function in the package that handles this, or a better workaround, in case other users have encountered this before?
