Thanks for your excellent work! While trying to reproduce HRank on other networks, I found an interesting result:
- When I applied torch.matrix_rank to a network that uses leaky_relu, it turned out that almost all channels had the same rank, and that rank was always full. At first, I thought the reason was that the network I used is not redundant.
- Then I applied torch.matrix_rank to the resnet56-cifar10 network in your repository, and it worked fine. However, if I move torch.matrix_rank to the output of the BN layer (i.e. before the ReLU layer), the ranks also all become the same full rank!
- Hence, I put an extra ReLU layer after the original Leaky_ReLU in my own network, and the ranks behave normally, i.e. some channels have high ranks and others have low ranks (see the sketch after this list).
So I'm curious about two things:
- What is the meaning of a low rank or even a zero rank? It seems not to be related to the amount of information, but only to represent the distribution of activations.
- Is it reasonable to calculate the rank by adding an extra ReLU layer?