Fitting a keras model using scipy.optimize.minimize #3064
Closed
Many people (including myself) have asked about exposing a Keras model to an external optimizer. This requires getting the loss and gradients, which is complicated by the symbolic nature of Theano/TensorFlow. I wrote this code to fit an arbitrary Keras model with any method from scipy.optimize.minimize; it should work similarly with any other optimization code.
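The core idea can be sketched without Keras at all: flatten the model's weights into a single vector, write an objective that returns the (loss, gradient) pair for that vector, and hand it to scipy.optimize.minimize. Here is a minimal, hedged sketch using plain NumPy linear regression as a stand-in for the model; in the real code the (loss, gradient) evaluation would come from the compiled symbolic graph (e.g. via K.gradients and K.function) rather than the hand-written formulas below.

```python
import numpy as np
from scipy.optimize import minimize

# Toy "model": linear regression, standing in for a Keras model whose
# weights have been flattened into a single parameter vector.
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

def loss_and_grad(w):
    # Mean-squared-error loss and its gradient w.r.t. the flat
    # parameter vector -- the same (loss, grad) pair the Keras version
    # would evaluate on the current weights via the backend.
    resid = X @ w - y
    loss = 0.5 * np.mean(resid ** 2)
    grad = X.T @ resid / len(y)
    return loss, grad

# jac=True tells scipy the objective returns (loss, gradient) together,
# so the gradient is not re-evaluated by finite differences.
res = minimize(loss_and_grad, x0=np.zeros(3), jac=True, method='L-BFGS-B')
print(res.x)  # should be close to true_w
```

For a real Keras model the only extra machinery is packing/unpacking the weight list to and from the flat vector before each evaluation; the minimize call itself is unchanged.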
While I understand that SGD and its variants are useful for computer vision problems, routines such as L-BFGS-B are essential for training neural networks aimed at modeling real biological neural networks (e.g., those in the visual system). My attached example code shows how a sparse autoencoder trained on natural images fails to learn the expected oriented edge filters using SGD (or any other Keras optimizer; trust me, I tried), but works perfectly when trained with the L-BFGS-B routine.
This example is taken from the Stanford UFLDL course: http://ufldl.stanford.edu/wiki/index.php/Exercise:Sparse_Autoencoder
In short, this code is essential for researchers in visual neuroscience (such as myself) who want the flexible and intuitive model building of Keras but need other optimizers.
Any code review/suggestions are obviously welcome. If this idea doesn't fit the scope of the project, no worries. I know it would be best to turn this code into its own optimizer, but that would involve some changes to the Keras internals.
Also, it doesn't work on TensorFlow because of the K.learning_phase() issue; I figured others would have a good suggestion for that.
Thanks,
Nick
example.zip