This PyG example implements the RECT model (more specifically, its supervised part RECT-L) proposed in the paper Network Embedding with Completely-imbalanced Labels. The authors' original implementation can be found here.
Three of PyG's built-in datasets (Cora, Citeseer, and Pubmed) are used in this example with their default train/val/test splits. Since this paper considers the zero-shot (i.e., completely-imbalanced) label setting, the "unseen" classes must be removed from the training set, as suggested in the paper. In this example, for each dataset we simply remove one to three classes (treating them as the unseen classes) from the labeled training set. We then obtain graph embeddings with the different models. Finally, with the obtained embeddings and the original balanced labels, we train a logistic regression classifier to evaluate model performance.
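The pipeline described above can be sketched framework-agnostically: mask out labeled training nodes whose classes are "unseen", learn embeddings on the reduced training set, then evaluate with a logistic regression classifier trained on the embeddings and the original balanced labels. The synthetic data and variable names below are illustrative stand-ins, not taken from the example scripts:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for a citation graph: 100 nodes, 16-dim embeddings,
# 6 classes (as in Citeseer), with a 60/40 train/test split.
num_nodes, num_classes = 100, 6
labels = rng.integers(0, num_classes, size=num_nodes)
embeddings = rng.normal(size=(num_nodes, 16)) + labels[:, None]  # class-correlated
train_mask = np.zeros(num_nodes, dtype=bool)
train_mask[:60] = True

# Zero-shot label setting: drop nodes of the unseen classes from the
# labeled training set (here {1, 2, 5}, as in the Citeseer run below).
unseen = {1, 2, 5}
seen_train_mask = train_mask & ~np.isin(labels, list(unseen))

# ... an embedding model (e.g. RECT-L) would be trained here using only
# the nodes in seen_train_mask; we reuse the synthetic embeddings directly.

# Final evaluation: logistic regression on the embeddings with the
# original balanced labels, scored on the held-out test nodes.
clf = LogisticRegression(max_iter=1000).fit(
    embeddings[train_mask], labels[train_mask])
acc = clf.score(embeddings[~train_mask], labels[~train_mask])
print(f"test accuracy: {acc:.2f}")
```

Note that `seen_train_mask` restricts only the embedding-learning stage; the final classifier sees all original labels, which is what makes the setting "completely imbalanced" rather than standard few-shot.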
# Reproduce RECT-L on the "citeseer" dataset in the zero-shot label setting
python rect.py --dataset citeseer --removed-classes 1 2 5

# Reproduce GCN on the "citeseer" dataset in the zero-shot label setting, and also evaluate the original node features
python run_gcn_feats.py --dataset citeseer --removed-classes 1 2 5
The performance results are as follows:
Table 1: Node classification results with some classes treated as "unseen".

| Model | Citeseer {1, 2, 5} | Citeseer {3, 4} | Cora {1, 2, 3} | Cora {3, 4, 6} | Pubmed {2} |
|---|---|---|---|---|---|
| RECT-L | 66.30 | 68.20 | 74.60 | 71.20 | 75.30 |
| GCN | 51.80 | 55.70 | 55.80 | 57.10 | 59.80 |
| NodeFeats | 61.40 | 61.40 | 57.50 | 57.50 | 73.10 |