To use the Dockerfile:

    cd /<path_to_repo>/SiamDW
    docker build -t siamrpn_image .
    docker run -it --gpus all --name siamrpn siamrpn_image

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

We are hiring research interns for visual tracking and neural architecture search projects: houwen.peng@microsoft.com

News

🏆 We are the winner of the VOT-19 RGB-D challenge [codes and models]
🏆 We were runners-up in the VOT-19 Long-term and RGB-T challenges [codes and models]
☀️☀️ We have added results on the VOT-18, VOT-19, GOT10K, VISDRONE19, and LaSOT datasets.
☀️☀️ The training and testing code of SiamFC+ and SiamRPN+ has been released.
☀️☀️ Our paper has been accepted by CVPR2019 (Oral).
☀️☀️ We provide a parameter tuning toolkit for the siamese tracking framework.

Introduction

Siamese networks have drawn great attention in visual tracking because of their balanced accuracy and speed. However, the backbone network used in these trackers is still the classical AlexNet, which does not take full advantage of the capability of modern deep neural networks.

Our proposals improve the performance of fully convolutional siamese trackers by:

1. introducing CIR and CIR-D units to unveil the power of deeper and wider networks such as ResNet and Inception;
2. designing backbone networks according to an analysis of the internal network factors (e.g. receptive field, stride, output feature size) that affect tracking performance.
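To make the internal network factors named above concrete, the following minimal sketch computes the receptive field, total stride, and output feature size of a plain stack of convolution and pooling layers. It is not code from this repository: the function name `network_factors`, the AlexNet-style layer list, and the 127-pixel exemplar size are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Each layer is described as (kernel_size, stride, padding).
Layer = Tuple[int, int, int]

# Assumption: an AlexNet-style, padding-free backbone as commonly described
# for SiamFC. This list is illustrative and is not read from this repository.
ALEXNET_LIKE: List[Layer] = [
    (11, 2, 0),  # conv1
    (3, 2, 0),   # pool1
    (5, 1, 0),   # conv2
    (3, 2, 0),   # pool2
    (3, 1, 0),   # conv3
    (3, 1, 0),   # conv4
    (3, 1, 0),   # conv5
]


def network_factors(layers: List[Layer], input_size: int) -> Tuple[int, int, int]:
    """Return (receptive_field, total_stride, output_size) for a layer stack."""
    receptive_field = 1  # one output unit initially sees one input pixel
    total_stride = 1     # spacing, in input pixels, between adjacent units
    size = input_size    # spatial size of the current feature map
    for kernel, stride, padding in layers:
        # A kernel of width k widens the view by (k - 1) steps of the
        # stride accumulated so far.
        receptive_field += (kernel - 1) * total_stride
        total_stride *= stride
        # Standard output-size formula for convolution and pooling.
        size = (size + 2 * padding - kernel) // stride + 1
    return receptive_field, total_stride, size


if __name__ == "__main__":
    rf, stride, ofs = network_factors(ALEXNET_LIKE, input_size=127)
    print(f"receptive field={rf}, stride={stride}, output feature size={ofs}")
```

Under these assumptions the stack has a receptive field of 87 pixels, a total stride of 8, and a 6x6 output feature map for a 127x127 input. Changing a kernel size, stride, or padding in the list shows how each choice shifts all three factors at once.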
Main Results

Main results on VOT and OTB

| Models          | OTB13 | OTB15 | VOT15 | VOT16 | VOT17 |
|-----------------|-------|-------|-------|-------|-------|
| Alex-FC         | 0.608 | 0.579 | 0.289 | 0.235 | 0.188 |
| Alex-RPN        | -     | 0.637 | 0.349 | 0.344 | 0.244 |
| CIResNet22-FC   | 0.663 | 0.644 | 0.318 | 0.303 | 0.234 |
| CIResIncep22-FC | 0.662 | 0.642 | 0.310 | 0.295 | 0.236 |
| CIResNext23-FC  | 0.659 | 0.633 | 0.297 | 0.278 | 0.229 |
| CIResNet22-RPN  | 0.674 | 0.666 | 0.381 | 0.376 | 0.294 |

Main results trained with GOT-10k (SiamFC)

| Models          | OTB13      | OTB15      | VOT15    | VOT16    | VOT17    |
|-----------------|------------|------------|----------|----------|----------|
| Alex-FC         | -          | -          | -        | -        | 0.188    |
| CIResNet22-FC   | 0.664      | 0.654      | 0.361    | 0.335    | 0.266    |
| CIResNet22W-FC  | 0.689      | 0.674      | 0.368    | 0.352    | 0.269    |
| CIResIncep22-FC | 0.673      | 0.650      | 0.332    | 0.305    | 0.251    |
| CIResNext22-FC  | 0.668      | 0.651      | 0.336    | 0.304    | 0.246    |
| Raw Results     | 📎 OTB2013 | 📎 OTB2015 | 📎 VOT15 | 📎 VOT16 | 📎 VOT17 |

Some of the reproduced results listed above are slightly better than those in the paper.
We recently found that training on the GOT10K dataset achieves better performance for SiamFC, so we also provide results for models trained on GOT10K.

Newly added results

| Benchmark   | VOT18    | VOT19    | GOT10K    | VISDRONE19  | LaSOT    |
|-------------|----------|----------|-----------|-------------|----------|
| Performance | 0.270    | 0.242    | 0.416     | 0.383       | 0.384    |
| Raw Results | 📎 VOT18 | 📎 VOT19 | 📎 GOT10K | 📎 VISDRONE | 📎 LaSOT |

We add results of SiamFCRes22W on recent benchmarks.
Download the model pretrained on GOT10K and its hyper-parameters.

Environment

The code was developed with:

CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
GPU: NVIDIA GTX1080

Quick Start

Test

See details in test.md

Train

See details in train.md

Citation

If any part of our paper or code is helpful to your work, please cite:

    @InProceedings{SiamDW_2019_CVPR,
      author    = {Zhang, Zhipeng and Peng, Houwen},
      title     = {Deeper and Wider Siamese Networks for Real-Time Visual Tracking},
      booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
      month     = {June},
      year      = {2019}
    }

License

Licensed under an MIT license.