Feature Request: General translation-invariant filters #5618


Open · VHarisop opened this issue Feb 24, 2017 · 2 comments


@VHarisop commented Feb 24, 2017

Hello everyone,

I have been using Theano lately to experiment with morphological neural networks, the lattice-theoretic counterparts of "traditional" models like the perceptron. Instead of a dot product, as in a multiply-accumulate activation:

z = T.dot(W, x)   # letting x be a vector

the activation of those models is

z = T.max(W + x, axis=1)   # or T.min()

As these activations are essentially dilations and erosions, we would like to experiment with their performance as building blocks of convolutional layers. However, as far as I know, Theano does not provide a general way to define a translation-invariant filter on images, apart from the convolution operator.

My suggestion would be to add this functionality to the theano.tensor.nnet.abstract_conv module, possibly extending BaseAbstractConv2d so that it supports any type of translation-invariant filtering (not necessarily linear).
If you agree that this addition could be useful to others, I would also be willing to work on its implementation, provided I get some feedback or guidance on which parts of the codebase should be modified and how to do so efficiently.
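For concreteness, here is a minimal runnable version of the dense activation above (variable names and shapes are illustrative):

import numpy as np
import theano
import theano.tensor as T

# Dense morphological (dilation) layer: z_i = max_j (W_ij + x_j).
x = T.vector('x')
W = T.matrix('W')
z = T.max(W + x, axis=1)        # erosion variant: T.min(W + x, axis=1)

f = theano.function([W, x], z)
print(f(np.zeros((3, 4), dtype=theano.config.floatX),
        np.arange(4, dtype=theano.config.floatX)))   # -> [3. 3. 3.]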

@lamblin (Member) commented Feb 24, 2017

A generalization of Images2Neibs, together with an implementation of its gradient, could be a general base if you want to experiment with different activations. You would not get all the speed possible, but you would gain flexibility, since you would have all the patches explicitly and could apply any operation to them.
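A rough sketch of this patch-based approach, using the existing images2neibs (filter shape and names here are illustrative assumptions):

import theano
import theano.tensor as T
from theano.tensor.nnet.neighbours import images2neibs

img = T.tensor4('img')          # (batch, channels, rows, cols)
w = T.vector('w')               # flattened (fh, fw) structuring element
fh, fw = 3, 3                   # hypothetical filter shape

# Each row of `patches` is one flattened (fh, fw) window, so any
# reduction can be applied to it, not just a dot product.
patches = images2neibs(img, neib_shape=(fh, fw), neib_step=(1, 1))
z = T.max(patches + w, axis=1)  # dilation; T.min for erosion

# Caveat: with overlapping steps the gradient of Images2Neibs may not
# be implemented, which is the generalization mentioned above.
f = theano.function([img, w], z)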

For specific activations (for instance, max), starting from something like the implementation of Pool could make sense (you could see the operation as max pooling with a bias).
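To make the "max pooling with a bias" reading concrete, here is a reference version written with existing ops, for non-overlapping 2x2 windows (the window size and names are illustrative assumptions; a modified Pool would compute the same thing in one op):

import theano
import theano.tensor as T

x = T.tensor4('x')   # (batch, channels, rows, cols), rows and cols even
b = T.vector('b')    # one bias per position inside the 2x2 window (4 values)

n, c, r, k = x.shape[0], x.shape[1], x.shape[2], x.shape[3]
# Group each 2x2 window into the last axis, then reduce over it.
windows = (x.reshape((n, c, r // 2, 2, k // 2, 2))
            .dimshuffle(0, 1, 2, 4, 3, 5)
            .reshape((n, c, r // 2, k // 2, 4)))
z = T.max(windows + b, axis=4)   # plain max pooling would omit `+ b`

f = theano.function([x, b], z)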

For the moment, I'm not sure what a good abstract interface for that family of operations would be, or even whether it will end up being widely used, so I would say it is too early to start from AbstractConv or BaseAbstractConv (abstract ops are mostly useful when several implementations are available for the same kind of device); maybe we should start with a CPU or GPU implementation.

We can help you here, on theano-dev, or in the comments of a PR if you start one, but I can't guarantee we will always be responsive.

@khaotik (Contributor) commented Feb 28, 2017

Just a personal idea: if there were a way to generalize elemwise ops a bit, Theano should be able to generate a single efficient kernel for a nonlinear filter, hopefully for both CPU and GPU.

For example, see #5471; this could be done if pad were fused with elemwise.
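As a sketch of the kind of graph such fusion would target, a 'valid' dilation can be written purely with slicing and elemwise ops (filter shape and names are illustrative assumptions); a graph like this could in principle compile down to a single fused kernel:

import theano
import theano.tensor as T

x = T.tensor3('x')   # (batch, rows, cols)
w = T.matrix('w')    # (fh, fw) structuring element
fh, fw = 3, 3        # hypothetical filter shape

out_r = x.shape[1] - fh + 1
out_c = x.shape[2] - fw + 1
# One shifted, biased copy of the input per filter offset, then an
# elementwise max across the copies.
shifted = [x[:, i:i + out_r, j:j + out_c] + w[i, j]
           for i in range(fh) for j in range(fw)]
z = T.stack(shifted, axis=0).max(axis=0)

f = theano.function([x, w], z)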
