This repository contains the code for our NeurIPS 2024 paper, *A Surprisingly Simple Approach to Generalized Few-Shot Semantic Segmentation*.
Abstract: The goal of generalized few-shot semantic segmentation (GFSS) is to recognize novel-class objects through training with a few annotated examples and a base-class model that has learned the knowledge about base classes. Unlike classic few-shot semantic segmentation, GFSS aims to classify pixels into both base and novel classes, making GFSS a more practical setting. To this end, existing methods rely on several techniques, such as carefully customized models, various combinations of loss functions, and transductive learning. However, we found that a simple rule and standard supervised learning substantially improve performance. In this paper, we propose a simple yet effective method for GFSS that does not employ the techniques mentioned above. We also theoretically show that our method perfectly maintains the segmentation performance of the base-class model over most of the base classes. Through numerical experiments, we demonstrate the effectiveness of the proposed method. In particular, our method improves the novel-class segmentation performance in the 1-shot scenario by 6.1% on PASCAL-$5^i$, 4.7% on PASCAL-$10^i$, and 1.0% on COCO-$20^i$.
We used Python 3.9 in our experiments; the list of packages is available in the `pyproject.toml` file. You can install them with uv:

```shell
uv sync
. .venv/bin/activate
```
Please follow the **Download data** instructions in the DIaM repository.
Please follow the **Pre-trained backbone and models** instructions in the DIaM repository.
- Default configuration files can be found in `config/`.
- Data are located in `data/`.
- `lists/` contains the train/val splits for each dataset.
- All the code is provided in `src/`.
- The testing script is located at the root of the repo.
To test the model, use the `test.sh` script, whose general syntax is:

```shell
bash test.sh {benchmark} {shot} {method} {[gpu_ids]} {log_path}
```
This script tests successively on all folds of the benchmark and reports the results individually. The overall performance is the average over all the folds. Some example commands are presented below, with their description in the comments.
```shell
bash test.sh pascal5i 1 BCM [0] out.log   # PASCAL-5i benchmark, 1-shot
bash test.sh pascal10i 5 BCM [0] out.log  # PASCAL-10i benchmark, 5-shot
bash test.sh coco20i 5 BCM [0] out.log    # COCO-20i benchmark, 5-shot
```
If you run out of memory, reduce `batch_size_val` in the config files.
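For illustration, the relevant entry might look like the following in one of the files under `config/` (the surrounding layout and the example value are assumptions; only the `batch_size_val` key name comes from this repo):

```yaml
# Hypothetical excerpt from a config file — only batch_size_val is a real key here.
batch_size_val: 50   # reduce this value if evaluation runs out of GPU memory
```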
To reproduce the results, please first download the pre-trained models from here (also mentioned in the "download pre-trained models" section) and then run the `test.sh` script with different inputs, as explained above.
| Benchmark | Fold | 1-Shot Base | 1-Shot Novel | 1-Shot Mean | 5-Shot Base | 5-Shot Novel | 5-Shot Mean |
|---|---|---|---|---|---|---|---|
| PASCAL-5i | 0 | 71.60 | 36.39 | 54.00 | 71.62 | 54.83 | 63.22 |
| | 1 | 69.52 | 49.80 | 59.66 | 69.62 | 61.99 | 65.80 |
| | 2 | 69.45 | 37.93 | 53.69 | 69.49 | 55.38 | 62.44 |
| | 3 | 74.02 | 40.35 | 57.19 | 74.17 | 49.26 | 61.72 |
| | mean | 71.15 | 41.12 | 56.14 | 71.23 | 55.37 | 63.30 |
| COCO-20i | 0 | 49.74 | 14.53 | 32.13 | 49.82 | 25.63 | 37.73 |
| | 1 | 48.13 | 21.88 | 35.01 | 48.50 | 35.40 | 41.95 |
| | 2 | 49.09 | 20.56 | 34.83 | 51.20 | 30.34 | 40.77 |
| | 3 | 50.74 | 16.14 | 33.44 | 49.99 | 33.44 | 40.45 |
| | mean | 49.43 | 18.28 | 33.85 | 49.88 | 30.57 | 40.23 |
| PASCAL-10i | 0 | 68.38 | 37.96 | 53.17 | 68.49 | 56.50 | 62.49 |
| | 1 | 71.77 | 33.93 | 52.85 | 71.75 | 50.46 | 61.11 |
| | mean | 70.08 | 35.95 | 53.01 | 70.12 | 53.48 | 61.80 |
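The per-benchmark "mean" rows are the unweighted averages of the fold rows, matching the averaging described above. A minimal sketch of that check in Python, using the PASCAL-5i 1-shot columns from the table:

```python
# Per-fold 1-shot mIoU for PASCAL-5i, copied from the results table above.
base = [71.60, 69.52, 69.45, 74.02]
novel = [36.39, 49.80, 37.93, 40.35]

def fold_mean(scores):
    """Unweighted average over folds, as reported in the 'mean' rows."""
    return sum(scores) / len(scores)

print(f"Base:  {fold_mean(base):.2f}")   # 71.15
print(f"Novel: {fold_mean(novel):.2f}")  # 41.12
```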
We gratefully thank the authors of DIaM, RePRI, BAM, PFENet, and PyTorch Semantic Segmentation, whose code inspired parts of ours.
If you find this project useful, please consider citing our paper:
@inproceedings{sakai2024bcm,
title={A Surprisingly Simple Approach to Generalized Few-Shot Semantic Segmentation},
author={Sakai, Tomoya and Qiu, Haoxiang and Katsuki, Takayuki and Kimura, Daiki and Osogami, Takayuki and Inoue, Tadanobu},
booktitle={Advances in Neural Information Processing Systems 37 (NeurIPS 2024)},
pages={},
volume={37},
year={2024}
}