multi-label softmax support #3268
Conversation
- Support N labels along the softmax axis. The final loss is the average loss over all labels.
- Combined with ignore_label, it supports a variable number of labels per instance (see the sketch below).
…average loss value over all given labels.
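For illustration (this example is mine, not from the PR; the padding value and shapes are assumed), a fixed-shape (N, K) label blob can hold a variable number of labels per instance by padding the unused slots with ignore_label:

```cpp
#include <cstdio>
#include <vector>

// Hypothetical illustration (values assumed, not from the PR): with K = 3
// label slots per instance and ignore_label = -1 used as padding, instances
// with fewer than K labels still fit a fixed (N, K) label blob.
int main() {
  const int ignore_label = -1;
  // Label blob of shape (N, K) = (3, 3); -1 pads short label lists.
  std::vector<std::vector<int>> labels = {
      {5, 17, -1},  // instance 0 has 2 labels
      {2, -1, -1},  // instance 1 has 1 label
      {0,  4,  9},  // instance 2 has 3 labels
  };
  for (size_t i = 0; i < labels.size(); ++i) {
    int count = 0;
    for (int l : labels[i]) count += (l != ignore_label);
    std::printf("instance %zu: %d valid labels\n", i, count);
  }
  return 0;
}
```

The loss is then averaged only over the non-ignored entries.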
/cc @mtamburrano
…abel, since the size of the label blob is doubled. When the label blob has shape (10, 1, 2, 3), it also causes a check failure at an accuracy threshold of 5e-5.
}
DCHECK_GE(label_value, 0);
DCHECK_LT(label_value, prob_.shape(softmax_axis_));
loss -= log(std::max(prob_data[i * dim + label_value * inner_num_ + j],
                     Dtype(FLT_MIN)));
Are you sure this is right?
Shouldn't it be something like
loss -= log(std::max(prob_data[(i * dim) + (dim/label_num_*k) + label_value * inner_num_ + j], Dtype(FLT_MIN)));
?
I'm not sure how you plan to feed bottom[0] to match the dimension of the labels; shouldn't it be larger, with a size of previous_size * label_num_?
Let's say we had single labels over 3 classes, so an INNER_PRODUCT layer with num_output: 3 was enough. Now, if each input has 2 labels, each over 3 classes, the INNER_PRODUCT layer should have num_output: 6, and you should iterate over the prob_ blob with an offset that accounts for both the number of classes and the number of labels.
Is that right, or am I missing something?
I am not addressing the multi-class problem, but the multi-label problem, where multiple labels are assigned to each instance. In your example, supposing instance i is assigned the 1st and 3rd classes, the loss for that instance is simply the average of the losses on those two classes, i.e. of -log(prob_[i * dim + 0 * inner_num_ + j]) and -log(prob_[i * dim + 2 * inner_num_ + j]). The INNER_PRODUCT layer still has num_output: 3 in this case.
Generally, if there are M classes and each instance has K labels, the shape of the data blob is still (N, M) and the shape of the label blob is (N, K). Originally only (N, 1) was allowed.
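As a minimal CPU sketch of that scheme, assuming inner_num_ = 1 and an ignore_label parameter (this is my reading of the approach, not the PR's actual code; the function name and signature are hypothetical):

```cpp
#include <algorithm>
#include <cfloat>
#include <cmath>

// Sketch of the forward loss under the (N, M) prob / (N, K) label scheme
// described above. Illustration only, not the PR's code; the spatial
// dimension inner_num_ is taken to be 1 for simplicity.
template <typename Dtype>
Dtype MultiLabelSoftmaxLoss(const Dtype* prob_data,  // shape (N, M)
                            const int* label_data,   // shape (N, K)
                            int num, int classes, int label_num,
                            int ignore_label) {
  Dtype loss = 0;
  int count = 0;  // number of non-ignored labels actually used
  for (int i = 0; i < num; ++i) {
    for (int k = 0; k < label_num; ++k) {
      const int label_value = label_data[i * label_num + k];
      if (label_value == ignore_label) continue;  // variable label counts
      // Each label indexes into the SAME M-dimensional distribution,
      // so num_output stays M; only the label blob widens to K.
      loss -= std::log(std::max(prob_data[i * classes + label_value],
                                Dtype(FLT_MIN)));
      ++count;
    }
  }
  return count > 0 ? loss / count : Dtype(0);  // average over all labels
}
```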
OK, I get it; I assumed you were addressing multi-class problems.
Thank you
Just for reference, there was a Multi label Data and MultiLabel Accuracy PR (#523) previously.
Playing devil's advocate:
I'm looking for multi-label functionality in Caffe. How should I understand this task? @bhack, I've seen your comments on all the PR branches, so maybe you know more details.
@taras-sereda I got frustrated with the multiple PRs, so the other day I decided to take the time to lay out a few solutions I know of; the discussion thread is here: https://groups.google.com/forum/#!topic/caffe-users/RuT1TgwiRCo
@beniz Thanks for sharing.
@BlGene Regarding your first suggestion, I think the problem is how you get the N * K output. For each instance, supposing there are N labels and num_output is K, you would need to duplicate the K-dimensional output N times. This is much more expensive than simply reading N positions from the K-dimensional output. PR #523 only proposed a multilabel accuracy layer, which is for the test phase. Furthermore, a multilabel loss may often be used for web-tag classification, where the number of tags is huge. In our experiments we use around 30,000 tags (classes), and there are only about 20 positive tags per instance on average, so the labels are quite sparse. In this case a multi-hot encoded label ([1, 0, 0, 1, 0, 1]) is not practical.
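To put rough numbers on that sparsity argument (my arithmetic; the batch size is assumed, the tag counts are those quoted above):

```cpp
#include <cstdio>

// Back-of-the-envelope comparison of dense multi-hot labels vs the
// sparse (N, K) index form discussed above. Batch size is assumed.
int main() {
  const long batch = 256;      // assumed batch size
  const long classes = 30000;  // ~30,000 tags, as quoted above
  const long k = 20;           // ~20 positive tags per instance on average
  const long dense_bytes = batch * classes * sizeof(float);  // multi-hot
  const long sparse_bytes = batch * k * sizeof(float);       // (N, K) indices
  std::printf("dense:  %ld bytes (~%.1f MB)\n", dense_bytes, dense_bytes / 1e6);
  std::printf("sparse: %ld bytes (~%.1f KB)\n", sparse_bytes, sparse_bytes / 1e3);
  return 0;
}
```

At these figures the sparse form is roughly three orders of magnitude smaller per batch.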
In terms of classification performance, how does this multi-label version of the Softmax loss compare with the …
How do you load the multiple labels for the data? I cannot find any examples in your project. It seems that you didn't modify the data layer.