Appropriate layer to adapt · Issue #5 · erictzeng/adda · GitHub

Appropriate layer to adapt #5


Open
avisekiit opened this issue Dec 23, 2017 · 3 comments

Comments

@avisekiit

Hi,
Please correct me if I am wrong. From the code, it seems that we adapt the final fully connected layer (the 10-way classification output). However, this layer directly outputs the distribution of class probabilities, and it seems we cannot adapt it if we don't know the labels of the target dataset during ADDA.
For example, if we feed a digit 5 from the source domain and a digit 7 from the target domain, the distributions at the last layer will obviously be different. Aligning the last layer only makes sense if we are sure we feed images of the same class, which in turn means we would have to know the class labels of the target domain during the ADDA adaptation phase.
Is my understanding wrong?

@rshaojimmy

@avisekiit Yes, I have the same understanding and question. Could this be a trick used in the paper?

@jhoffman
Collaborator

The comparison of the last layers is only done in aggregate -- usually with batch sizes over 100 for each of the source and target. It's more a comparison of distribution shape than a similarity comparison between individual images. Thus, the target (and source) labels are not used during ADDA training.
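To illustrate what "comparison in aggregate" means, here is a minimal numpy sketch (hypothetical, not the repository's actual code): a logistic domain discriminator is trained on batches of last-layer logit vectors from the source and target. The only labels it ever sees are domain labels (which batch each vector came from); class labels are never used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the 10-way logits produced by the source and
# target encoders; in ADDA these would come from the networks themselves.
# No class labels appear anywhere below -- only domain membership.
src_logits = rng.normal(loc=1.0, size=(128, 10))   # batch of >100 source samples
tgt_logits = rng.normal(loc=-1.0, size=(128, 10))  # batch of >100 target samples

# Domain discriminator: logistic regression on the logit vectors.
w = np.zeros(10)
b = 0.0
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.vstack([src_logits, tgt_logits])
d = np.concatenate([np.ones(128), np.zeros(128)])  # domain labels: 1=source, 0=target

for _ in range(200):
    p = sigmoid(x @ w + b)
    # Gradient of the binary cross-entropy w.r.t. the discriminator params.
    w -= lr * (x.T @ (p - d)) / len(d)
    b -= lr * np.mean(p - d)

# The discriminator separates the two *distributions* of logits; in ADDA the
# target encoder would then be updated adversarially to fool it.
acc = np.mean((sigmoid(x @ w + b) > 0.5) == d)
print(acc)
```

The point of the sketch: the discriminator distinguishes the two batches purely by the shape of their logit distributions, which is why per-image class correspondence (and hence target labels) is unnecessary.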

@avisekiit
Author

@jhoffman Thanks for your reply. One follow-up thought: why is it better to adapt the very last layer, which mainly outputs classification logits, rather than some intermediate layer that captures high-level features? At least from the flow diagram in your paper, I got the impression that the source encoder and the classifier were two separate modular components, and that you adapt the features extracted by the encoders of both the target and the source.
Thanks again for your time.
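The modular split the question refers to can be sketched with hypothetical linear stand-ins (these names and shapes are illustrative, not from the repository): the encoder and classifier are separate maps, so a discriminator could in principle be attached either to the intermediate features or to the final logits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear stand-ins for the two modular components:
# in ADDA these are separate networks.
W_enc = rng.normal(size=(784, 64))   # encoder: flattened image -> feature
W_cls = rng.normal(size=(64, 10))    # classifier: feature -> 10-way logits

x = rng.normal(size=(32, 784))       # a batch of flattened images

features = x @ W_enc                 # intermediate high-level features
logits = features @ W_cls            # final classification logits

# Option A (what the question suggests): feed the discriminator features.
disc_in_features = features          # shape (32, 64)
# Option B (what the released code does, per this thread): feed it logits.
disc_in_logits = logits              # shape (32, 10)

print(disc_in_features.shape, disc_in_logits.shape)
```

Either attachment point yields an unlabeled batch of vectors for the domain discriminator; the thread's answer is that the released code happens to align the logit distributions.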
