8000 Fitting a keras model using scipy.optimize.minimize by ncullen93 · Pull Request #3064 · keras-team/keras · GitHub

Fitting a keras model using scipy.optimize.minimize #3064


Closed · wants to merge 2 commits

Conversation

ncullen93

Many people (including myself) have been asking about exposing a Keras model to an external optimizer. Doing so requires evaluating the loss and gradients as concrete arrays, which is complicated by the symbolic nature of the Theano/TensorFlow backends. I wrote this code to fit an arbitrary Keras model with any method from scipy.optimize.minimize; it should work similarly for any other optimization library. A sketch of the general pattern appears after the attached example.

While I understand that SGD and its variants are useful for computer vision problems, routines such as L-BFGS-B are essential for training neural networks that model real biological neural networks (e.g. those in the visual system). My attached example shows how a sparse autoencoder trained on natural images fails to recover the expected oriented edge filters using SGD (or any other Keras optimizer; trust me, I tried) but works perfectly when trained with the L-BFGS-B routine.

This example is taken from the Stanford course: http://ufldl.stanford.edu/wiki/index.php/Exercise:Sparse_Autoencoder

In short, this code is essential for researchers in the visual neuroscience domain (myself included) who want the flexible and intuitive model building of Keras but need other optimizers.

Any code review/suggestions are obviously welcome. If this idea doesn't fit the scope of the project, no worries. I know it would be best to turn this code into its own "optimizer", but that would involve some changes to the Keras internals.

Also, it doesn't work on TensorFlow because of the weird K.learning_phase() issue. I figured others would have a good suggestion for that.

Thanks,
Nick

example.zip
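
For readers of this thread, here is a minimal sketch of the general pattern (my own reconstruction, not the code in this PR): it assumes a single-input, single-output model already compiled against a Keras 1.x-era backend API, and the names fit_with_scipy, unpack, and loss_and_grads are illustrative. On the TensorFlow backend, a model using dropout or batch normalization would additionally need K.learning_phase() fed as an input, which is the issue mentioned above.

import numpy as np
from scipy.optimize import minimize
from keras import backend as K

def fit_with_scipy(model, x, y, method='L-BFGS-B', **minimize_kwargs):
    """Fit a compiled Keras model by handing its loss and gradients to scipy."""
    weights = model.trainable_weights
    shapes = [K.get_value(w).shape for w in weights]

    # Symbolic gradients of the compiled loss w.r.t. every trainable weight,
    # evaluated together with the loss in a single backend call.
    grads = K.gradients(model.total_loss, weights)
    f = K.function(model.inputs + model.targets + model.sample_weights,
                   [model.total_loss] + grads)

    def unpack(theta):
        # Split scipy's flat parameter vector back into per-weight arrays.
        arrays, i = [], 0
        for shape in shapes:
            n = int(np.prod(shape))
            arrays.append(theta[i:i + n].reshape(shape))
            i += n
        return arrays

    def loss_and_grads(theta):
        # Push scipy's current iterate into the model, then evaluate the
        # loss and gradients and flatten everything back out for scipy.
        for w, value in zip(weights, unpack(theta)):
            K.set_value(w, value.astype(K.floatx()))
        outs = f([x, y, np.ones(len(x))])
        flat_grad = np.concatenate([g.reshape(g.size) for g in outs[1:]])
        return float(outs[0]), flat_grad.astype('float64')

    theta0 = np.concatenate([K.get_value(w).reshape(-1) for w in weights])
    return minimize(loss_and_grads, theta0, jac=True, method=method,
                    **minimize_kwargs)

Note that, unlike SGD, minimize sees the full batch deterministically on every call, which is part of why L-BFGS-B behaves so differently on the sparse autoencoder example.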

@ncullen93 (Author)

Forgot some code in the example folder; it should work by running 'sparse_ae.py' -- you should get oriented edge filters. If you uncomment the "ae.fit(..)" line and comment out the "fit_scipy" line, thereby using SGD, you should see random/bad filters instead. (The toggle is sketched after the attachment.)

example.zip
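
The toggle described above amounts to something like this (a sketch reusing the assumed fit_with_scipy helper from the previous comment, not the exact code in the zip):

# SGD path: gives random/bad filters on this problem, per the report above.
# ae.fit(X, X, ...)

# scipy path: L-BFGS-B recovers the oriented edge filters.
res = fit_with_scipy(ae, X, X, method='L-BFGS-B')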

@EderSantana (Contributor)

There is already code in the examples folder doing something similar: https://github.com/fchollet/keras/blob/master/examples/deep_dream.py#L222-L223 (the pattern is sketched below).

Maybe we should have raised awareness about this possibility?
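
For context, that example optimizes over the input image rather than the model weights, but the loss-and-gradients plumbing is the same. The pattern is roughly the following (a paraphrased sketch, not the verbatim example code; dream and loss are the symbolic input and objective defined in that script, and img_shape and x0 are assumed):

import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from keras import backend as K

# Compile one backend function returning loss and gradient w.r.t. the input.
grads = K.gradients(loss, dream)[0]
f_outputs = K.function([dream], [loss, grads])

def eval_loss_and_grads(x):
    # scipy works with flat float64 vectors; the backend wants image tensors.
    outs = f_outputs([x.reshape(img_shape)])
    return float(outs[0]), outs[1].flatten().astype('float64')

# fmin_l_bfgs_b accepts a single callable returning (loss, gradient).
x, min_val, info = fmin_l_bfgs_b(eval_loss_and_grads, x0.flatten(), maxfun=20)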

The review below concerns the gradient-flattening helper in the diff:
""" Flattens a set tensor variables (gradients) """
x = np.empty(0)
for g in grads:
x = np.concatenate((x, g.reshape(-1)))

Hi! Thanks for this PR. Saved me some time.
I'm using GPU-optimized Theano as my backend, which means that g on this line is a CudaNdarray-type object. Executing g.reshape(-1) raises the exception:

*** ValueError: size must remain unchanged, changed from 2000 to -1

Changing this to g.reshape(g.size) fixes the issue for me. I haven't investigated the problem further, but maybe others will run into this as well. You might want to change this before the PR is merged?
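
For reference, a backend-agnostic version of the flattening helper along the lines of this fix might look like the following (a sketch; the CudaNdarray behavior is taken from the report above):

import numpy as np

def flatten_grads(grads):
    # reshape(g.size) instead of reshape(-1): CudaNdarray does not accept
    # -1 as an inferred dimension the way numpy arrays do.
    return np.concatenate([np.asarray(g.reshape(g.size)) for g in grads])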

@fchollet closed this Nov 29, 2016