Semantic segmentation: semantic segmentation and scene understanding on the MIT ADE20K dataset with PyTorch, with project source code (hands-on project).zip

101 files in total

py: 68

yaml: 14

odgt: 4

Tags: semantic segmentation, PyTorch, scene understanding, project source code

Uploaded 2024-05-26 21:00:27 · 2.93 MB · ZIP

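This archive contains a complete semantic segmentation project built with the PyTorch deep learning framework and trained and validated on the MIT ADE20K dataset. In addition to the full code, it touches on scene understanding, which makes it a useful resource for developers studying computer vision.

Semantic segmentation is a core computer vision task: it assigns a class label to every pixel of an image so that objects and background regions can be told apart. It is widely used in autonomous driving, medical image analysis, and remote sensing. PyTorch is a popular deep learning framework, valued for its ease of use and flexibility.

MIT ADE20K is a large-scale, densely annotated dataset for semantic segmentation. It contains more than 20,000 images of indoor and outdoor scenes, labeled with 150 categories that cover both objects and background regions. Its variety helps models trained on it generalize better.

Scene understanding extends semantic segmentation: beyond labeling the individual parts of an image, the model is expected to capture how those parts relate to each other and how the scene as a whole is structured. In a city street scene, for example, this means not only recognizing cars, pedestrians, and buildings, but also understanding the relative positions of the road, the sidewalk, and the buildings.

The source code covers data loading, model construction, training, evaluation, and visualization. Working through it shows how PyTorch handles a large dataset and how segmentation networks are designed and tuned; the architectures in this archive pair ResNet, MobileNetV2, and HRNetV2 encoders with C1, PPM, and UPerNet decoders. It also illustrates how data augmentation can improve model performance and how a loss function such as cross-entropy guides training.

In practice, you can run the code step by step, observe how a model performs on MIT ADE20K, tune hyperparameters, and try new network structures or training strategies to improve segmentation accuracy. The project combines theory with hands-on practice and serves as a reference for anyone working on semantic segmentation and scene understanding.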
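To make the per-pixel classification and cross-entropy loss mentioned above concrete, here is a minimal NumPy sketch. It is not taken from the project source; the function name `pixelwise_cross_entropy` and the `ignore_index=-1` convention for unlabeled pixels are assumptions made for illustration.

```python
import numpy as np

def pixelwise_cross_entropy(logits, labels, ignore_index=-1):
    """Mean cross-entropy over all labeled pixels (illustrative helper).

    logits: float array of shape (C, H, W), one score per class and pixel.
    labels: int array of shape (H, W); pixels equal to ignore_index are skipped.
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Pick the log-probability of the true class at every labeled pixel.
    rows, cols = np.nonzero(labels != ignore_index)
    if rows.size == 0:
        return 0.0  # nothing to supervise
    picked = log_probs[labels[rows, cols], rows, cols]
    return float(-picked.mean())

# Toy example: 3 classes, 2x2 image, one unlabeled pixel.
logits = np.zeros((3, 2, 2))
labels = np.array([[0, 1], [2, -1]])
loss = pixelwise_cross_entropy(logits, labels)

# The predicted segmentation is the highest-scoring class at each pixel.
pred = logits.argmax(axis=0)
```

With all-zero logits every class is equally likely, so the loss equals ln(3); in a real training loop the logits would come from the network and the loss would be minimized by gradient descent.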


Archive contents (101 files):

object150_info.csv 6KB

DemoSegmenter.ipynb 7KB

color150.mat 502B

README.md 7KB

README.md 466B

training.odgt 3.62MB

validation.odgt 367KB

ADE_val_00001519.png 698KB

ADE_val_00000278.png 616KB

models.py 21KB

hrnet.py 16KB

dataloader.py 16KB

batchnorm.py 13KB

dataset.py 12KB

train.py 9KB

eval_multipro.py 7KB

resnet.py 7KB

eval.py 6KB

test.py 6KB

utils.py 6KB

resnext.py 5KB

mobilenet.py 5KB

comm.py 4KB

sampler.py 4KB

test_sync_batchnorm.py 3KB

dataset.py 3KB

data_parallel.py 3KB

defaults.py 3KB

replicate.py 3KB

ipynb_drop_output.py 3KB

distributed.py 2KB

test_numeric_batchnorm.py 2KB

th.py 1KB

unittest.py 835B

setup.py 817B

utils.py 577B

__init__.py 449B

__init__.py 110B

__init__.py 95B

__init__.py 92B

__init__.py 63B

__init__.py 53B

__init__.py 32B

__init__.py 18B

__init__.py 0B

demo_test.sh 910B

setup_notebooks.sh 447B

download_ADE20K.sh 221B

requirements.txt 57B

ade20k-mobilenetv2dilated-c1_deepsup.yaml 762B

ade20k-resnet101dilated-ppm_deepsup.yaml 761B

ade20k-resnet50dilated-ppm_deepsup.yaml 759B

ade20k-resnet18dilated-ppm_deepsup.yaml 758B

ade20k-resnet101-upernet.yaml 740B

ade20k-resnet50-upernet.yaml 738B

ade20k-hrnetv2.yaml 725B


# Semantic Segmentation on MIT ADE20K dataset in PyTorch

<img src="./teaser/ADE_val_00000278.png" width="900"/>
<img src="./teaser/ADE_val_00001519.png" width="900"/>

## Supported models

We split our models into encoder and decoder, where encoders are usually modified directly from classification networks, and decoders consist of final convolutions and upsampling. We have provided some pre-configured models in the ```config``` folder.

Encoder:
- MobileNetV2dilated
- ResNet18/ResNet18dilated
- ResNet50/ResNet50dilated
- ResNet101/ResNet101dilated
- HRNetV2 (W48)

Decoder:
- C1 (one convolution module)
- C1_deepsup (C1 + deep supervision trick)
- PPM (Pyramid Pooling Module, see [PSPNet](https://hszhao.github.io/projects/pspnet) paper for details.)
- PPM_deepsup (PPM + deep supervision trick)
- UPerNet (Pyramid Pooling + FPN head, see [UperNet](https://arxiv.org/abs/1807.10221) for details.)

## Performance:

IMPORTANT: The base ResNet in our repository is a customized one (different from the one in torchvision). The base models will be automatically downloaded when needed.
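As a rough illustration of the Mean IoU and Pixel Accuracy metrics reported in this section, the sketch below computes both for a single pair of label maps with NumPy. It is a simplified, per-image version written for this note, not the repository's evaluation code; benchmark numbers are typically accumulated over the whole validation set. The name `segmentation_scores` and the `ignore_index=-1` convention are assumptions.

```python
import numpy as np

def segmentation_scores(pred, target, num_classes, ignore_index=-1):
    """Return (pixel_accuracy, mean_iou) for one prediction/label pair."""
    # Drop unlabeled pixels before scoring.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    pixel_acc = float((pred == target).mean())

    ious = []
    for c in range(num_classes):
        pred_c, target_c = pred == c, target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps, skip it
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return pixel_acc, float(np.mean(ious))

# Toy example: one of four pixels is misclassified.
target = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
pixel_acc, mean_iou = segmentation_scores(pred, target, num_classes=3)
```

Here the pixel accuracy is 3/4, while the per-class IoUs are 1/2 and 2/3, so the mean IoU is lower than the accuracy. In most rows of the table, the Overall Score appears to be the average of Mean IoU and Pixel Accuracy, for example (38.00 + 78.64) / 2 = 58.32.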
<table><tbody>
<tr>
<th valign="bottom">Architecture</th>
<th valign="bottom">MultiScale Testing</th>
<th valign="bottom">Mean IoU</th>
<th valign="bottom">Pixel Accuracy (%)</th>
<th valign="bottom">Overall Score</th>
<th valign="bottom">Inference Speed (fps)</th>
</tr>
<tr>
<td rowspan="2">MobileNetV2dilated + C1_deepsup</td>
<td>No</td><td>34.84</td><td>75.75</td><td>54.07</td><td>17.2</td>
</tr>
<tr>
<td>Yes</td><td>33.84</td><td>76.80</td><td>55.32</td><td>10.3</td>
</tr>
<tr>
<td rowspan="2">MobileNetV2dilated + PPM_deepsup</td>
<td>No</td><td>35.76</td><td>77.77</td><td>56.27</td><td>14.9</td>
</tr>
<tr>
<td>Yes</td><td>36.28</td><td>78.26</td><td>57.27</td><td>6.7</td>
</tr>
<tr>
<td rowspan="2">ResNet18dilated + C1_deepsup</td>
<td>No</td><td>33.82</td><td>76.05</td><td>54.94</td><td>13.9</td>
</tr>
<tr>
<td>Yes</td><td>35.34</td><td>77.41</td><td>56.38</td><td>5.8</td>
</tr>
<tr>
<td rowspan="2">ResNet18dilated + PPM_deepsup</td>
<td>No</td><td>38.00</td><td>78.64</td><td>58.32</td><td>11.7</td>
</tr>
<tr>
<td>Yes</td><td>38.81</td><td>79.29</td><td>59.05</td><td>4.2</td>
</tr>
<tr>
<td rowspan="2">ResNet50dilated + PPM_deepsup</td>
<td>No</td><td>41.26</td><td>79.73</td><td>60.50</td><td>8.3</td>
</tr>
<tr>
<td>Yes</td><td>42.14</td><td>80.13</td><td>61.14</td><td>2.6</td>
</tr>
<tr>
<td rowspan="2">ResNet101dilated + PPM_deepsup</td>
<td>No</td><td>42.19</td><td>80.59</td><td>61.39</td><td>6.8</td>
</tr>
<tr>
<td>Yes</td><td>42.53</td><td>80.91</td><td>61.72</td><td>2.0</td>
</tr>
<tr>
<td rowspan="2">UperNet50</td>
<td>No</td><td>40.44</td><td>79.80</td><td>60.12</td><td>8.4</td>
</tr>
<tr>
<td>Yes</td><td>41.55</td><td>80.23</td><td>60.89</td><td>2.9</td>
</tr>
<tr>
<td rowspan="2">UperNet101</td>
<td>No</td><td>42.00</td><td>80.79</td><td>61.40</td><td>7.8</td>
</tr>
<tr>
<td>Yes</td><td>42.66</td><td>81.01</td><td>61.84</td><td>2.3</td>
</tr>
<tr>
<td rowspan="2">HRNetV2</td>
<td>No</td><td>42.03</td><td>80.77</td><td>61.40</td><td>5.8</td>
</tr>
<tr>
<td>Yes</td><td>43.20</td><td>81.47</td><td>62.34</td><td>1.9</td>
</tr>
</tbody></table>

The training is benchmarked on a server with 8 NVIDIA Pascal Titan Xp GPUs (12GB GPU memory); the inference speed is benchmarked on a single NVIDIA Pascal Titan Xp GPU, without visualization.

## Environment

The code is developed under the following configurations.
- Hardware: >=4 GPUs for training, >=1 GPU for testing (set ```[--gpus GPUS]``` accordingly)
- Software: Ubuntu 16.04.3 LTS, ***CUDA>=8.0, Python>=3.5, PyTorch>=0.4.0***
- Dependencies: numpy, scipy, opencv, yacs, tqdm

## Quick start: Test on an image using our trained model

1. Here is a simple demo to do inference on a single image:
```bash
chmod +x demo_test.sh
./demo_test.sh
```
This script downloads a trained model (ResNet50dilated + PPM_deepsup) and a test image, runs the test script, and saves the predicted segmentation (.png) to the working directory.

2. To test on an image or a folder of images (```$PATH_IMG```), you can simply do the following:
```bash
python3 -u test.py --imgs $PATH_IMG --gpu $GPU --cfg $CFG
```

## Training

1. Download the ADE20K scene parsing dataset:
```bash
chmod +x download_ADE20K.sh
./download_ADE20K.sh
```

2. Train a model by selecting the GPUs (```$GPUS```) and configuration file (```$CFG```) to use. During training, checkpoints are saved in the folder ```ckpt``` by default.
```bash
python3 train.py --gpus $GPUS --cfg $CFG
```
- To choose which GPUs to use, you can either do ```--gpus 0-7```, or ```--gpus 0,2,4,6```.

For example, you can start with our provided configurations:

* Train MobileNetV2dilated + C1_deepsup
```bash
python3 train.py --gpus $GPUS --cfg config/ade20k-mobilenetv2dilated-c1_deepsup.yaml
```

* Train ResNet50dilated + PPM_deepsup
```bash
python3 train.py --gpus $GPUS --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
```

* Train UPerNet101
```bash
python3 train.py --gpus $GPUS --cfg config/ade20k-resnet101-upernet.yaml
```

3. You can also override options on the command line, for example ```python3 train.py TRAIN.num_epoch 10 ```.

## Evaluation

1. Evaluate a trained model on the validation set. Add ```VAL.visualize True``` to the arguments to output visualizations as shown in the teaser.

For example:

* Evaluate MobileNetV2dilated + C1_deepsup
```bash
python3 eval_multipro.py --gpus $GPUS --cfg config/ade20k-mobilenetv2dilated-c1_deepsup.yaml
```

* Evaluate ResNet50dilated + PPM_deepsup
```bash
python3 eval_multipro.py --gpus $GPUS --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
```

* Evaluate UPerNet101
```bash
python3 eval_multipro.py --gpus $GPUS --cfg config/ade20k-resnet101-upernet.yaml
```

## Integration with other projects

This library can be installed via `pip` to easily integrate with another codebase:
```bash
pip install git+https://github.com/CSAILVision/semantic-segmentation-pytorch.git@master
```

Now this library can easily be consumed programmatically. For example:
```python
from mit_semseg.config import cfg
from mit_semseg.dataset import TestDataset
from mit_semseg.models import ModelBuilder, SegmentationModule
```
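Returning to the ```--gpus``` option used by the training and evaluation commands: it accepts either a range such as ```0-7``` or a comma-separated list such as ```0,2,4,6```. The sketch below shows one way such a value could be expanded into a list of device ids. The helper `parse_gpu_spec` is hypothetical and written for this note; it is not the repository's own parsing code.

```python
def parse_gpu_spec(spec):
    """Expand a string such as "0-7" or "0,2,4,6" into a list of GPU ids.

    Hypothetical helper for illustration; ranges are inclusive on both ends.
    """
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            if end < start:
                raise ValueError(f"invalid range: {part!r}")
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(part))
    return ids

all_eight = parse_gpu_spec("0-7")       # ids 0 through 7
even_only = parse_gpu_spec("0,2,4,6")   # every other device
```

Mixing both forms (for instance `"0-1,4"`) also works with this sketch, although the README only documents the two forms shown above.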
