Artificial neural networks have achieved great success in the field of computer vision. However, deep neural networks themselves remain a black box to us, and there is no universal approach for judging the efficiency of a network structure. Besides that, state-of-the-art computer vision algorithms struggle on some tasks that are easy for humans. Our work applies a method that is promising for opening the black box, information bottleneck theory, to analyze a newly developed RNN cell model, hGRU, which manages to solve pattern recognition problems with long-range dependencies. Three points are made in our work: First, the hGRU is superior on long-range dependent problems. Second, the learning of the hGRU layer is concentrated. Third, the connection between the hGRU layer and the output layer restricts the model's performance.
The following figure reproduces the results of information bottleneck theory. Each trajectory shows one layer's dynamics over the whole training process. The leftmost trajectory belongs to the output layer and the others to hidden layers; as for layer order, earlier hidden layers appear farther to the right.
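The trajectories above are built by estimating each layer's mutual information with the labels at many checkpoints during training. A minimal sketch of the common binning estimator used in information-plane analyses is given below; the bin count and the identity `I(T; Y) = H(T) - H(T | Y)` are standard, but the specific function names and defaults here are our own assumptions, not the exact pipeline used in the experiments.

```python
import numpy as np

def mutual_information(t, y, n_bins=30):
    """Estimate I(T; Y) in bits by discretizing activations T.

    t: (n_samples, n_units) layer activations
    y: (n_samples,) integer labels
    n_bins is a hyperparameter of the binning estimator (assumed value).
    """
    # Bin every unit's activation, then treat each binned activation
    # pattern as one discrete symbol per sample.
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    digitized = np.digitize(t, edges[1:-1])          # (n_samples, n_units)
    _, t_ids = np.unique(digitized, axis=0, return_inverse=True)

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    # I(T; Y) = H(T) - H(T | Y), with H(T | Y) averaged over classes.
    h_t = entropy(t_ids)
    h_t_given_y = 0.0
    for c in np.unique(y):
        mask = y == c
        h_t_given_y += mask.mean() * entropy(t_ids[mask])
    return h_t - h_t_given_y
```

Evaluating this estimator on stored activations at successive checkpoints yields one point per checkpoint, and connecting them produces the trajectories shown in the information plane.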
This figure shows the information planes containing the output and hGRU layers for models with 4, 6, and 8 timesteps. The output-layer trajectory, on the left, has an irregular shape, while the hGRU trajectory appears as a concave curve. During training, both layers' mutual information with the labels increases from zero to around one bit. Only the output layer in the four-timestep information plane reaches a much lower mutual information with the labels than the others, which agrees with the fact that the four-timestep model fails on the pathfinder problem.
Three conclusions can be drawn from this result. First, the hGRU layer is superior at capturing information in long-range dependent problems. Note that the hGRU layer makes steady progress instead of moving in roundabout ways; its trajectory is relatively smooth. At the beginning of training the trajectory's slope is small, as the layer absorbs more task-unrelated information. After that, it takes in more and more task-related information as the slope grows, which suggests that it finds the correct way to capture target-relevant information after this first exploration stage. It then stays almost still for the following 35,000 steps. Second, the learning of the hGRU layer is concentrated: most of the learning happens around step 14,000. This agrees with our claim about the hGRU layer's superiority, because it learns within fewer training steps. Even in the four-timestep model, the hGRU layer finishes its learning before training step 21,000. Third, the information transmission between the hGRU layer and the output layer is the restriction on model performance. From step 21,000 to 56,000, the hGRU layer stays almost unchanged, waiting for the output layer to learn. By analogy, the output layer becomes the bottleneck for the whole network's performance. Enhancing the representational capacity between these two layers, for instance by adding some hidden layers, might be a good choice.
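The suggestion above, adding hidden layers between the hGRU layer and the output, can be sketched as a deeper readout head. The sketch below is a hypothetical PyTorch module under assumed dimensions (channel count, hidden width, class count); the original model's actual readout may differ, and only the idea of inserting extra layers comes from the text.

```python
import torch
import torch.nn as nn

class DeeperReadout(nn.Module):
    """Hypothetical readout: final hGRU state -> extra hidden layers -> logits.

    All dimensions here are assumptions for illustration, not the
    configuration used in the original experiments.
    """

    def __init__(self, hgru_channels=25, hidden_dim=64, n_classes=2):
        super().__init__()
        # Global average pooling collapses the spatial hGRU state map
        # into a feature vector (an assumed design choice).
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Linear(hgru_channels, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_classes),  # class logits
        )

    def forward(self, h):
        # h: (batch, channels, height, width) final hGRU hidden state
        z = self.pool(h).flatten(1)
        return self.mlp(z)
```

Whether the extra capacity actually shortens the output layer's slow learning phase observed between steps 21,000 and 56,000 would need to be verified empirically on the information plane.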