-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Full advanced Indexing support + gradient #1083
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…e general case
This is the right channel to go through. Thanks for submitting this! It will be great to have advanced indexing. When some more people get to the office we'll figure out where to put things / how to handle the dependencies. |
discrete_dtypes = map(str, scal.discrete_types) | ||
all_dtypes = map(str, scal.all_types) | ||
int_dtypes = map(str, scal.int_types) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove the uint_dtypes and all_dtypes variable? Even if they are not used in this file, we need them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what happened here.
Thanks for this PR. It will be really useful I think. For the advinc repo, I think you should put the containt of the advinc.c file in a c_code_support() fct of the Op. In the c_code() of this op, if it is an inc, you generate some c code that will call your function. I probably won't have the time to do it before December, but if you want to do it, I can answer questions. This documentation page could help you understand how we generate c code. http://www.deeplearning.net/software/theano/extending/cop.html |
I've done that once before, but this seems like it will make the code significantly more complicated, as I would have to manually put the inplace arguments into a tuple. Isn't there a place I can put code to be compiled when theano is installed and thus simply import the function into python? |
There is only 1 function we currently compile like you tell and we want to get rid of it. But this won't happen shortly. So if you choose to do so, I'll accept that. It is in the file theano/gof/cutils.py. But why doing a tuple is so complicated? Yes a little, but I think much less then what you did for this PR. Or I forget something? It is a long time I didn't do a tuple in C. |
@nouiz Okay, this should be ready to go now. |
OK, so short of forking the development version of NumPy and bundle it with Theano, I guess this is as good as it's going to get. |
I think we can probably just say something like "advanced indexing now fully supported, but currently requires numpy dev version, which can be a pain". For me and probably some others advanced indexing is really valuable, and those people would like to know that it's there. Of course, if you don't need advanced indexing then updating numpy is probably not worth it. |
I like @jsalvatier's suggestion -- consider not even mentioning that it's a pain. Just say you "advanced indexing and gradients are supported, but these currently require numpy dev version >= X (date)." |
@nouiz When I run the following program, I get a segfault (on the last line):
I'd really like to get this patch settled soon. |
I should also say, that the same C code works when I compile it in its own library. https://github.com/jsalvatier/advinc/blob/master3/advinc/advinc.c |
We are in a deadline for ICML the 15 fev and I need to help for optimization. So probably I won't be able to test it before Thursday/Friday. But I'll look at it. |
If I run your code with my current numpy version, it work, but don't use your faster version. If I try to install your numpy fork, the installation fail with an error about mapping.c not being available. How this file is supposed to be created from the mapping.c.src file? I install it like this: |
By current numpy version do you mean the current dev version? My numpy fork has been integrated into the numpy master, for which mapping.c.src is no longer relevant. |
ok, I used the numpy master and I'm able to reproduce the crash. But what is this PR? numpy/numpy#326 I was thinking it was what you needed? I check why it segfault. |
Oh, I should have closed the request, we decided to just add API support (see numpy/numpy#377) |
I made a PR to your branch with the fix. We need to import some stuff for numpy in the init of the python module. |
Fix segfault as we didn't imported numpy c module stuff.
Awesome! Thanks :) On Fri, Feb 8, 2013 at 11:19 AM, nouiz notifications@github.com wrote:
|
@@ -1095,7 +1095,8 @@ def output_types(self, *input_types): | |||
return upcast_out(*input_types[0]) | |||
|
|||
def grad(self, inputs, output_gradients): | |||
return [None, None] | |||
a,b = inputs | |||
return [a.zeros_like().astype(theano.config.floatX), b.zeros_like().astype(theano.config.floatX)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goodfeli can you confirm that both changes in this file are fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any word on this? I want to push out an alpha of pymc3, and this is PR is an important feature for pymc3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure it's correct.
@@ -4183,7 +4185,7 @@ def init_entry(entry, depth=0): | |||
rval = """ | |||
#define PyArray_set_dim(obj, idx, d) PyArray_DIMS(obj)[idx]=d | |||
#define PyArray_set_stride(obj, idx, d) PyArray_STRIDES(obj)[idx]=d | |||
#define PyArray_set_data(obj, ptr, base) PyArray_BYTES(obj)=ptr | |||
#define PyArray_set_data(obj, ptr, base) ((PyArrayObject*)(obj))->data=ptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you changed that? With the new NumPy C-API, we won't have direct access to the ->data field. Ok the old behavior won't work too, but I'm curious why you changed that? This is the type of change that could break stuff on some other compiler/OS. So if their is no well defined reason, I would prefer not to include that.
I didn't review the code in c_utils. I suppose it is copied from the numpy code, so it should be working correctly. I made a few small comments. Also, I didn't found existing tests for AdvancedSubtensor and AdvancedIncSubtensor. We need to have some. For example, I catch a case where if we don't do an implace AdvancedIncSubtensor, it would have crashed. The tests should be in tensor/tests/test_basic.py. There is already the classes TestIncSubtensor1 and T_subtensor. But as said, they don't test the op you changed. If you know existing test for this class, we need to add test for new shape/index that you add support for. Don't change the T_subtensor class, as they are reused for GPU tests. You can make a new class TestAdvancedSubtensor in the same spirit as TestIncSubtensor1. thanks for this big new feature. |
In found some indirect tests for advanced subtensor. In fact it is test for optimization that remove it from the graph by creating some crossentropy ops. Those tests are in theano/tensor/nnet/tests/test_nnet.py. But I don't think we can reuse them and this is not a good place for the rest of the op. |
Continued in gh-1269, closing. |
This patch implements full advanced indexing support as well as the gradient.
It currently relies on this package being installed (https://github.com/jsalvatier/advinc), which only has c function. Obviously this should go in Theano, but it's not clear to me where it should go. I would like some input here.
Also, the patch currently relies on numpy having an interface to MapIter (which is a recent patch), so it's also necessary to check that this works.
Since the patch isn't strictly ready, I'm not sure if this is the right channel to go through, so let me know.