Semantic Segmentation

Article link: https://blog.csdn.net/m0_63276919/article/details/143895372



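Semantic segmentation classifies every pixel of an image into a category, so both the labels and the predictions for semantic regions are at the pixel level. Compared with the rectangular bounding boxes used in object detection, these pixel-level region boundaries are clearly finer. A typical application is background blurring.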
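To make "pixel-level prediction" concrete, here is a minimal sketch in plain Python. It assumes a model has already produced one score per class for every pixel; the `scores` values, the class names, and the helper `predict_label_map` are all made up for illustration and are not part of the original post.

```python
# Hypothetical class names and scores, for illustration only.
CLASSES = ["background", "dog", "cat"]


def predict_label_map(scores):
    """Turn per-class score maps of shape [C][H][W] into a label map [H][W]."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            # Choose the class whose score is highest at this pixel.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        label_map.append(row)
    return label_map


# A 2x2 image with three score maps, one per class.
scores = [
    [[0.90, 0.20], [0.10, 0.30]],  # background
    [[0.05, 0.70], [0.80, 0.10]],  # dog
    [[0.05, 0.10], [0.10, 0.60]],  # cat
]
label_map = predict_label_map(scores)
print(label_map)
print([[CLASSES[c] for c in row] for row in label_map])
```

The output has the same height and width as the input image and holds one class index per pixel, which is exactly the form a semantic segmentation label takes.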


Image Segmentation and Instance Segmentation

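Image segmentation divides an image into several constituent regions. Methods for this kind of problem usually exploit the correlation between pixels in the image. They need no pixel-level label information during training, and at prediction time they cannot guarantee that the resulting regions carry the semantics we want. Taking the image in the figure below as input, image segmentation might split the dog into two regions: one covering the mostly black mouth and eyes, and another covering the mostly yellow rest of the body.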

[Figure]
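The dog example above can be reproduced on a toy scale. The sketch below is only an illustration under stated assumptions: it uses a simple two-cluster grouping of grayscale intensities (a stand-in for label-free segmentation methods in general, not the method of any particular library), and the pixel values and the helper `two_means_segmentation` are invented for the example.

```python
def two_means_segmentation(gray, iterations=10):
    """Split pixels into two regions by intensity, using no labels at all."""
    pixels = [v for row in gray for v in row]
    # Start the two cluster centers at the darkest and brightest pixel.
    low, high = float(min(pixels)), float(max(pixels))
    for _ in range(iterations):
        dark = [v for v in pixels if abs(v - low) <= abs(v - high)]
        bright = [v for v in pixels if abs(v - low) > abs(v - high)]
        if not dark or not bright:
            break
        # Move each center to the mean of the pixels assigned to it.
        low, high = sum(dark) / len(dark), sum(bright) / len(bright)
    # Region 0 is closer to the dark center, region 1 to the bright center.
    return [[0 if abs(v - low) <= abs(v - high) else 1 for v in row]
            for row in gray]


# Toy grayscale "dog": dark eyes and muzzle (about 20) on a light coat (about 200).
gray = [
    [200, 20, 200, 20],
    [200, 200, 210, 190],
    [205, 30, 25, 195],
]
regions = two_means_segmentation(gray)
for row in regions:
    print(row)
```

Every pixel here belongs to the same dog, yet the grouping yields two regions, because it only sees intensity and knows nothing about the class "dog".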

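Instance segmentation, also called simultaneous detection and segmentation, studies how to identify the pixel-level region of each object instance in an image. Unlike semantic segmentation, instance segmentation must distinguish not only semantics but also different object instances. For example, if an image contains two dogs, instance segmentation must determine which of the two dogs each pixel belongs to.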

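Semantic segmentation vs. instance segmentation:

Semantic segmentation: only cares about which class each pixel belongs to.

Instance segmentation: also distinguishes the individual instances within each class.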
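The difference in output format can be sketched as follows. This is only an illustration: it derives instance ids from a semantic mask with a connected-component search, which works here because the two toy dogs do not touch. Real instance segmentation models do not rely on this shortcut, and the mask values and the helper `split_into_instances` are assumptions made for the example.

```python
from collections import deque


def split_into_instances(semantic_mask, target_class):
    """Give each 4-connected blob of `target_class` its own id (1, 2, ...)."""
    height, width = len(semantic_mask), len(semantic_mask[0])
    instance_mask = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if semantic_mask[y][x] != target_class or instance_mask[y][x]:
                continue
            # Found an unvisited pixel of the class: start a new instance.
            next_id += 1
            instance_mask[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic_mask[ny][nx] == target_class
                            and instance_mask[ny][nx] == 0):
                        instance_mask[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_mask, next_id


DOG = 1  # hypothetical class index
# Semantic mask: every dog pixel carries the same value.
semantic_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_mask, num_dogs = split_into_instances(semantic_mask, DOG)
for row in instance_mask:
    print(row)
print(num_dogs)
```

In the semantic mask both dogs are labeled 1, while the instance mask gives the left dog id 1 and the right dog id 2.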


Code Implementation

%matplotlib inline
import os
import torch
import torchvision
from d2l import torch as d2l

# Download and extract the Pascal VOC2012 dataset
d2l.DATA_HUB['voc2012'] = (d2l.DATA_URL + 'VOCtrainval_11-May-2012.tar',
                           '4e443f8a2eca6b1dac8a6c57641b67dd40621a49')
voc_dir = d2l.download_extract('voc2012', 'VOCdevkit/VOC2012')

# Load the images
def read_voc_images(voc_dir, is_train=True):
    """Read all VOC feature images and their segmentation labels."""
    txt_fname = os.path.join(voc_dir, 'ImageSets', 'Segmentation',
                             'train.txt' if is_train else 'val.txt')
    mode = torchvision.io.image.ImageReadMode.RGB
    with open(txt_fname, 'r') as f:
        images = f.read().split()
    features, labels = [], []
    for i, fname in enumerate(images):
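        # Read each input image and its pixel-level label. Labels are PNGs
        # (larger files that preserve image quality); inputs are JPEGs
        # (smaller files, suited to colorful, detailed photos).
        features.append(torchvision.io.read_image(os.path.join(
            voc_dir, 'JPEGImages', f'{fname}.jpg')))
        labels.append(torchvision.io.read_image(os.path.join(
            voc_dir, 'SegmentationClass', f'{fname}.png'), mode))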
    return features, labels

train_features, train_labels = read_voc_images(voc_dir, True)
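`read_voc_images` loads every image of the split into memory at once. As a lighter-weight variation, the sketch below only resolves the file paths, so images could be read later on demand. The helper name `list_voc_pairs` and the demo file names are hypothetical; the directory layout is the one used by the function above.

```python
import os
import tempfile


def list_voc_pairs(voc_dir, is_train=True):
    """Return (jpg_path, png_path) pairs for a split without loading pixels."""
    txt_fname = os.path.join(voc_dir, 'ImageSets', 'Segmentation',
                             'train.txt' if is_train else 'val.txt')
    with open(txt_fname, 'r') as f:
        names = f.read().split()
    return [(os.path.join(voc_dir, 'JPEGImages', f'{name}.jpg'),
             os.path.join(voc_dir, 'SegmentationClass', f'{name}.png'))
            for name in names]


# Demo on a throwaway directory that mimics the VOC layout
# (the names img_a and img_b are made up).
with tempfile.TemporaryDirectory() as demo_dir:
    split_dir = os.path.join(demo_dir, 'ImageSets', 'Segmentation')
    os.makedirs(split_dir)
    with open(os.path.join(split_dir, 'train.txt'), 'w') as f:
        f.write('img_a\nimg_b\n')
    pairs = list_voc_pairs(demo_dir, is_train=True)
    pair_names = [(os.path.basename(jpg), os.path.basename(png))
                  for jpg, png in pairs]
print(pair_names)
```

Each name in the split file maps to one JPEG input and one PNG label with the same stem, which is what keeps features and labels aligned.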

Display the images

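n = 5
# First row: n input images; second row: their n label images
imgs = train_features[0:n] + train_labels[0:n]
# read_image returns (C, H, W); plotting expects (H, W, C)
imgs = [img.permute(1,2,0) for img in imgs]
d2l.show_images(imgs, 2, n);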
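The `permute(1,2,0)` call is needed because `torchvision.io.read_image` returns tensors in (channels, height, width) order, while Matplotlib's image display expects (height, width, channels). The following sketch reproduces that reordering on nested lists so it can be followed without PyTorch; the helper `chw_to_hwc` and the pixel values are assumptions made for illustration.

```python
def chw_to_hwc(image):
    """Reorder a [C][H][W] nested list into [H][W][C]."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    return [[[image[c][y][x] for c in range(channels)]
             for x in range(width)]
            for y in range(height)]


# A 1x2 RGB image stored channel-first: one plane per color channel.
chw = [
    [[255, 0]],    # red plane
    [[0, 128]],    # green plane
    [[0, 64]],     # blue plane
]
hwc = chw_to_hwc(chw)
print(hwc)
```

After the reordering, each pixel holds its own (R, G, B) triple, which is the layout a plotting function reads directly.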