A multi-task learning framework designed for simultaneous depth estimation and semantic segmentation using the Swin Transformer architecture.
- [June 30] Paper accepted at the IROS 2024 conference 🔥🔥🔥
To get started, follow these steps:

- Only for ROS installation (otherwise skip this part):

  ```shell
  cd catkin_ws/src
  catkin_create_pkg SwinMTL_ROS std_msgs rospy
  cd ..
  catkin_make
  source devel/setup.bash
  cd src/SwinMTL_ROS/src
  git clone https://github.com/PardisTaghavi/SwinMTL.git
  chmod +x inference_ros.py
  mv ./launch/ ./..
  ```

- Clone the repository:

  ```shell
  git clone https://github.com/PardisTaghavi/SwinMTL.git
  cd SwinMTL
  ```

- Create a conda environment and activate it:

  ```shell
  conda env create --file environment.yml
  conda activate prc
  ```
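After activating the environment, a quick sanity check can confirm that key dependencies resolve before running anything heavier. This is a hypothetical helper, not part of the repository, and the default package names below are assumptions rather than the contents of `environment.yml`:

```python
import importlib.util

def missing_packages(packages=("torch", "timm", "cv2")):
    """Return the subset of `packages` that cannot be imported.

    The default package list is an assumption, not taken from
    the project's environment.yml.
    """
    return [p for p in packages if importlib.util.find_spec(p) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```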
To run testing for the project, follow these steps:

- Download pretrained models:
  - Access the pretrained models here.
  - Download the pretrained models you need.

- Move pretrained models:
  - Create a new folder named `model_zoo` in the project directory and move the pretrained models into it.
  - Refer to `testLive.ipynb` for testing.
To run the ROS node, use the provided launch file:

```shell
roslaunch SwinMTL_ROS swinmtl_launch.launch
```
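The command above assumes the launch file moved into the workspace during the ROS setup. For orientation, a minimal launch file for this kind of node might look like the sketch below; only the `inference_ros.py` script name comes from the setup steps, while the node name is an assumption:

```xml
<launch>
  <!-- Minimal sketch: runs the inference script made executable during setup.
       The node name "swinmtl_inference" is an assumption, not the
       repository's actual configuration. -->
  <node pkg="SwinMTL_ROS" type="inference_ros.py"
        name="swinmtl_inference" output="screen"/>
</launch>
```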
If you find our project useful, please consider citing:
```bibtex
@inproceedings{taghavi2024swinmtl,
  title={SwinMTL: A shared architecture for simultaneous depth estimation and semantic segmentation from monocular camera images},
  author={Taghavi, Pardis and Langari, Reza and Pandey, Gaurav},
  booktitle={2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages={4957--4964},
  year={2024},
  organization={IEEE}
}
```