This is an incomplete PR to support multiclass loss in Caffe.
I doubt this will ever be merged, but I wanted to share some modifications I made to `MemoryDataLayer` to accept multiple labels and to the softmax layers to handle multiclass problems.
A good discussion about multilabel, multiclass, and multitask problems in Caffe is here.
The main changes this PR introduces are:

- `MemoryDataLayer`: a new method called `addMatVectorMultilabel` that accepts a `vector<vector<int>>` as the label for each data `Mat`, so a `Mat` can easily be associated with multiple classes. The resulting label blob has the label values on the channel dimension (or, more exactly, on dimension 1, since the N-dimension support).
- `SoftmaxLossLayer`: modified to compute the loss on different labels and on slices of `bottom[0]`. The number of slices is specified with the prototxt parameter `slice` and must match `bottom[1]->shape(softmax_axis_)`.
- `SoftmaxLayer`: modified to compute the softmax independently on slices of `bottom[0]`, so the `SoftmaxLossLayer` can compute the loss for each slice by comparing it with the corresponding label (a small numeric sketch of this per-slice normalization follows the list).
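To make the per-slice behavior concrete, here is a rough standalone sketch (not the actual code in this PR; the `SlicedSoftmax` helper and its structure are just for illustration) of what "softmax over slices" computes for a single sample:

```cpp
// Illustrative sketch only: with num_output = 12 and slice = 3, each
// contiguous group of 4 scores is normalized independently, so every
// slice sums to 1 on its own.
#include <algorithm>
#include <cmath>
#include <vector>

std::vector<float> SlicedSoftmax(const std::vector<float>& scores, int slices) {
  const int slice_len = scores.size() / slices;  // e.g. 12 / 3 = 4 classes
  std::vector<float> out(scores.size());
  for (int s = 0; s < slices; ++s) {
    const int off = s * slice_len;
    // Subtract the slice max for numerical stability, as SoftmaxLayer does.
    float max_v = scores[off];
    for (int i = 1; i < slice_len; ++i)
      max_v = std::max(max_v, scores[off + i]);
    float sum = 0.0f;
    for (int i = 0; i < slice_len; ++i) {
      out[off + i] = std::exp(scores[off + i] - max_v);
      sum += out[off + i];
    }
    for (int i = 0; i < slice_len; ++i)
      out[off + i] /= sum;  // normalize within the slice only
  }
  return out;
}
```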
The idea is to feed the network with a `MemoryDataLayer` via `addMatVectorMultilabel`, then attach a `SoftmaxLossLayer`, specifying the parameter `slice` to be equal to the size of the vector of labels added. It is important to set the output feeding `bottom[0]` of the `SoftmaxLossLayer` to be a multiple of the number of classes and of the slices.

For example, let's say we want to add a vector of 3 labels, where each label can take 4 different classes; the input for `addMatVectorMultilabel` could be a vector like `{{0,0,0,1}, {1,0,0,0}, {0,1,0,0}}`.
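For concreteness, feeding that example could look something like the sketch below (assuming a signature that mirrors the existing `MemoryDataLayer::AddMatVector`, with label vectors replacing the single `int` label):

```cpp
// Hypothetical usage sketch: the exact signature of addMatVectorMultilabel
// is an assumption here, modeled on MemoryDataLayer::AddMatVector.
#include <opencv2/core/core.hpp>
#include <vector>
#include "caffe/layers/memory_data_layer.hpp"

void Feed(caffe::MemoryDataLayer<float>* data_layer) {
  std::vector<cv::Mat> images(1, cv::Mat(32, 32, CV_8UC3, cv::Scalar(0)));
  // The label argument from the example above: 3 labels, 4 classes each.
  std::vector<std::vector<int> > labels = {
      {0, 0, 0, 1}, {1, 0, 0, 0}, {0, 1, 0, 0}};
  data_layer->addMatVectorMultilabel(images, labels);
}
```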
Now, after some conv layers, we attach an `InnerProduct` layer with `num_output: 12` (3 * 4 = 12). Then we attach to the `InnerProductLayer` a `SoftmaxLossLayer` specifying `slice: 3`. So the softmax will be applied independently to each slice of 4 elements of the output of the `InnerProductLayer`, and the loss is calculated for each softmax.
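In prototxt, the tail of such a net could look something like this sketch (assuming the new `slice` field lives in `softmax_param`; layer and blob names are placeholders):

```
layer {
  name: "fc_multi"
  type: "InnerProduct"
  bottom: "conv_out"
  top: "fc_multi"
  inner_product_param {
    num_output: 12  # slice * classes = 3 * 4
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc_multi"
  bottom: "label"  # label blob filled by addMatVectorMultilabel
  top: "loss"
  softmax_param {
    slice: 3  # new parameter from this PR: one softmax per 4 outputs
  }
}
```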
Currently, the CPU versions of `SoftmaxLayer` and `SoftmaxLossLayer` are working, but the GPU ones are not. I wrote some modifications to the GPU part of `SoftmaxLayer`, but some index is wrong (the test doesn't pass when `batch_size > 2`). I don't know when I will have time to fix that; if someone wants to contribute, any help is appreciated :)