8000
We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
There was an error while loading. Please reload this page.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Carried on Adam solver (originally #2856) for merge.
I completed the tests and rebased it to latest master.
Authorship belongs to @PatWie, and is preserved in git commit.
Original message in #2856 :
This commit implements the Adam solver by Kingma et. al for CPU and GPU. All solver parameters are defined in the caffe.proto. This also adds an example for the MNIST dataset.
As you may see, now both solver.cpp and test_gradient_based_solver.cpp are growing to 1000+ lines. This problem will be addressed in #2890.
Sorry, something went wrong.
c482eb0
c330b84
Encountered error larger than margin on solver tests. Perhaps this is simply numerical issue. Looking into details. It was a mistake I made when implementing test code ComputeLeastSquaresUpdate for Adam, where I forgot to power beta1 and beta2 by t. Now everything is fine.
ComputeLeastSquaresUpdate
9d60495
cca6a5c
@shelhamer @jeffdonahue @philkr @PatWie Please take a look if you have time. I think this should be ready to merge.
This is last piece of the solver trilogy in #2860. After merging this one, we can address #2890.
There was an error while loading. Please relo 8000 ad this page.
There was a problem hiding this comment.
The reason will be displayed to describe this comment to others. Learn more.
We need to cite the ADAM paper somewhere*. I suggest putting a reference here, e.g. in a doxygen formatted comment like this. Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added.
*We probably also need to go back and add references for some of the other recently merged solvers.
Thanks for the rebase @ronghanghu and thanks @PatWie for the original implementation! See above comment; otherwise looks good.
Citation added for Adam.
Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added. *We probably also need to go back and add references for some of the other recently merged solvers.
Let's address that in #2890 .
af31ef9
0cd8f32
No need to create shared_ptrs for these val_* variables, is there? (I suggest using the raw pointer, e.g. Blob<Dtype>* val_m = this->history_[param_id].get();.)
shared_ptr
val_*
Blob<Dtype>* val_m = this->history_[param_id].get();
Yes, you are right. I should use raw pointers.
Thanks for adding the citation. After a final glance I noticed the one other thing I commented above; sorry about not noticing before. Feel free to merge after addressing that.
Looks good.
17afec4
bf42e6e
Adam solver
4e4c89b
Cite Adam paper in solver.hpp
Changed from shared ptrs to raw ptrs in AdamSolver<Dtype>::ComputeUpdateValue.
AdamSolver<Dtype>::ComputeUpdateValue
Merge pull request #2918 from ronghanghu/adam
cbca8fe
Successfully merging this pull request may close these issues.