Closed
Thanks for the great implementation, Robin and @lucidrains! I wanted to ask whether PyTorch will use the same random seed during the original forward pass of F and during the second forward pass of F that happens in the backward pass of the reversible block. If not, could you:
- During the forward pass of the reversible block: sample random seeds seed_F and seed_G, set them just before the forward passes of F and G using torch.manual_seed(seed), and save these seeds for use during backprop.
- During the backward pass of the reversible block: retrieve the saved seeds seed_F and seed_G from ctx and set them before the second forward passes of F and G, to ensure the same outputs as in the original forward pass.
I am asking because, in the Reformer's LSHAttention, the hash functions are sampled randomly, and it becomes a problem if the same hashes are not reproduced during the backward pass.
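For illustration, here is a minimal sketch of the idea (the `SeededBlock` wrapper and its `set_seed` argument are hypothetical names, not part of this repository): it records a fresh seed on the original forward pass and re-seeds the default generator before the replayed forward pass, so dropout masks, LSH hashes, etc. come out identical.

```python
import torch
import torch.nn as nn

class SeededBlock(nn.Module):
    """Illustrative sketch only: wraps F or G so its forward pass can be
    replayed with identical randomness during the backward pass."""

    def __init__(self, net):
        super().__init__()
        self.net = net
        self.seed = None

    def forward(self, x, set_seed=False):
        if set_seed:
            # original forward: draw a fresh seed and remember it
            # (torch.seed() re-seeds the default generator and returns the seed)
            self.seed = torch.seed()
        else:
            # replayed forward during backward: restore the recorded seed
            torch.manual_seed(self.seed)
        return self.net(x)


if __name__ == "__main__":
    f = SeededBlock(nn.Dropout(p=0.5))
    x = torch.randn(2, 8)
    y1 = f(x, set_seed=True)    # original forward pass
    y2 = f(x, set_seed=False)   # replayed forward pass (as in backprop)
    assert torch.equal(y1, y2)  # identical dropout masks
```

When the sub-blocks run on the GPU, one would presumably also want to save and restore the CUDA RNG state (e.g. via torch.cuda.get_rng_state / torch.cuda.set_rng_state) in the same places.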
Thanks in advance,
Ankit