This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to use for training and testing on various datasets. The codebase mainly uses ResNet50/101/152 as the backbone and can be easily adapted to other basic classification structures. Implemented networks include PSPNet and PSANet, which ranked 1st in the ImageNet Scene Parsing Challenge 2016 @ECCV16, the LSUN Semantic Segmentation Challenge 2017 @CVPR17, and the WAD Drivable Area Segmentation Challenge 2018 @CVPR18. Sample datasets used in experiments are ADE20K, PASCAL VOC 2012, and Cityscapes.
master: uses the official nn.SyncBatchNorm; only multiprocessing training is supported; tested with PyTorch 1.4.0.
1.0.0: both multithreading training (nn.DataParallel) and multiprocessing training (nn.parallel.DistributedDataParallel, recommended) are supported, and the latter is much faster. Uses syncbn from EncNet and apex; tested with PyTorch 1.0.0.
Highlight:
Requirement:
Clone the repository:
git clone https://github.com/hszhao/semseg.git
Train:
Download related datasets and symlink the paths to them as follows (you can alternatively modify the relevant paths specified in folder config):
cd semseg
mkdir -p dataset
ln -s /path_to_ade20k_dataset dataset/ade20k
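Where shell symlinks are inconvenient, the same linking step can be scripted. The following is a minimal sketch and not part of the repository: the helper name link_dataset is hypothetical, and /path_to_ade20k_dataset remains the placeholder from the command above.

```python
import os
from pathlib import Path


def link_dataset(source: str, name: str, root: str = "dataset") -> Path:
    """Create root/name as a symlink pointing at source (hypothetical helper)."""
    root_dir = Path(root)
    root_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p dataset`
    link = root_dir / name
    if link.is_symlink():
        link.unlink()  # replace a stale link
    # If a real file or folder already exists at `link`, os.symlink raises
    # FileExistsError instead of overwriting it.
    os.symlink(os.path.abspath(source), link, target_is_directory=True)
    return link
```

Run from inside the semseg folder, link_dataset("/path_to_ade20k_dataset", "ade20k") would mirror the mkdir and ln -s commands above.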
Download ImageNet pre-trained models and put them under folder initmodel for weight initialization. Remember to use the right dataset format detailed in FAQ.md.
Specify the GPUs to use in the config, then start training:
sh tool/train.sh ade20k pspnet50
If you use SLURM as the node manager, uncomment the relevant lines in train.sh and then start training:
sbatch tool/train.sh ade20k pspnet50
Test:
Download trained segmentation models and put them under the folder specified in the config, or modify the specified paths.
For full testing (get listed performance):
sh tool/test.sh ade20k pspnet50
Quick demo on one image:
PYTHONPATH=./ python tool/demo.py --config=config/ade20k/ade20k_pspnet50.yaml --image=figure/demo/ADE_val_00001515.jpg TEST.scales '[1.0]'
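The trailing TEST.scales '[1.0]' in the demo command appears to override a value from the YAML config on the command line. As a rough illustration of how such KEY VALUE overrides can be applied, here is a minimal sketch. It is not the repository's config code: the function apply_overrides, the nested-dictionary layout, and the sample values are assumptions made for this example.

```python
import ast
from typing import Any, Dict, List

Config = Dict[str, Dict[str, Any]]


def apply_overrides(cfg: Config, opts: List[str]) -> Config:
    """Apply alternating KEY VALUE pairs such as ["TEST.scales", "[1.0]"] to cfg."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for dotted_key, raw in zip(opts[0::2], opts[1::2]):
        section, _, key = dotted_key.partition(".")
        if section not in cfg or key not in cfg[section]:
            raise KeyError(f"unknown config key: {dotted_key}")
        try:
            value = ast.literal_eval(raw)  # "[1.0]" -> [1.0], "8" -> 8
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as file paths unchanged
        cfg[section][key] = value
    return cfg


# Illustrative config fragment; the scale values are made up for this example.
cfg = {"TEST": {"scales": [0.5, 1.0, 1.5], "base_size": 512}}
apply_overrides(cfg, ["TEST.scales", "[1.0]"])
print(cfg["TEST"]["scales"])  # [1.0]
```

Rejecting unknown keys is a design choice of this sketch: a mistyped override fails loudly instead of silently adding a new setting.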
Visualization: tensorboardX is incorporated for better visualization.
tensorboard --logdir=exp/ade20k
Other: names and colors are in folder dataset, and some sample lists can be accessed.
Description: mIoU/mAcc/aAcc stand for mean IoU, mean per-class accuracy, and all-pixel accuracy, respectively. ss denotes single-scale testing and ms denotes multi-scale testing. Training time is measured on a server with 8 GeForce RTX 2080 Ti GPUs. General parameters across different datasets are listed below:
ADE20K: Train Parameters: classes(150), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(100). Test Parameters: classes(150), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).
PASCAL VOC 2012: Train Parameters: classes(21), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(50). Test Parameters: classes(21), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).
Cityscapes: Train Parameters: classes(19), train_h(713/709-PSP/A), train_w(713/709-PSP/A), epochs(200). Test Parameters: classes(19), test_h(713/709-PSP/A), test_w(713/709-PSP/A), base_size(2048).
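The mIoU, mAcc, and aAcc metrics named in the description above can be made concrete with a small example. This is a minimal sketch under stated assumptions, not the repository's evaluation code: the function name segmentation_metrics is hypothetical, the input is a confusion matrix with ground-truth classes as rows, and classes that never occur are skipped when averaging.

```python
from typing import List, Tuple


def segmentation_metrics(confusion: List[List[int]]) -> Tuple[float, float, float]:
    """Return (mIoU, mAcc, aAcc) from a square confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i and
    whose predicted class is j.
    """
    n = len(confusion)
    ious: List[float] = []
    accs: List[float] = []
    correct = 0
    total = 0
    for c in range(n):
        tp = confusion[c][c]                           # correctly labeled pixels of class c
        gt = sum(confusion[c])                         # pixels whose ground truth is c
        pred = sum(confusion[r][c] for r in range(n))  # pixels predicted as c
        union = gt + pred - tp
        if union > 0:   # assumption: skip classes absent from both label and prediction
            ious.append(tp / union)
        if gt > 0:      # assumption: skip classes absent from the ground truth
            accs.append(tp / gt)
        correct += tp
        total += gt
    if total == 0:
        raise ValueError("confusion matrix contains no pixels")
    miou = sum(ious) / len(ious)   # mean IoU over classes
    macc = sum(accs) / len(accs)   # mean accuracy of each class
    aacc = correct / total         # all-pixel accuracy
    return miou, macc, aacc


miou, macc, aacc = segmentation_metrics([[3, 1], [1, 5]])
```

For the two-class matrix above, 8 of the 10 pixels lie on the diagonal, so aAcc is 0.8, while mIoU and mAcc average the per-class ratios and therefore weight the smaller class as heavily as the larger one.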
If you find the code or trained models useful, please consider citing:
@misc{semseg2019,
author={Zhao, Hengshuang},
title={semseg},
howpublished={\url{https://github.com/hszhao/semseg}},
year={2019}
}
@inproceedings{zhao2017pspnet,
title={Pyramid Scene Parsing Network},
author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
booktitle={CVPR},
year={2017}
}
@inproceedings{zhao2018psanet,
title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
booktitle={ECCV},
year={2018}
}
Some questions are collected in FAQ.md. You are welcome to send pull requests or give advice. Contact information: hengshuangzhao at gmail.com.