Published: 2025-09-01 01:10:59

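### Training with Few Data Points and Techniques Combining Computer Vision and Natural Language Processing
#### 1. Siamese Networks
##### 1.1 How a Siamese network works
A Siamese network takes two images, a reference image and a query image, and decides whether they show the same person. It works as follows:
1. Pass the first image through a convolutional network.
2. Pass the second image through the same network, with the same weights, as in step 1.
3. Take the network's output for each image as that image's encoding (feature vector).
4. Compute the difference between the two feature vectors.
5. Pass the difference vector through a sigmoid activation to obtain a score indicating whether the two images are similar.
The name "Siamese" comes from passing the two images through twin networks (the network is duplicated, with shared weights, to process both images) to obtain an encoding for each image, and then comparing the encodings to obtain a similarity score. If the similarity score is above a threshold (equivalently, if the dissimilarity score is below one), the two images are treated as showing the same person.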
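The five steps can be traced on a toy example. The following is only a minimal sketch in plain Python, not the network built later: `WEIGHTS` is a made-up fixed matrix standing in for the shared convolutional encoder, and `scale` and `bias` are made-up values standing in for a learned output layer.

```python
import math

# Hypothetical stand-in for the shared encoder: one fixed weight matrix
# that is reused for BOTH inputs (this sharing is what makes it "Siamese").
WEIGHTS = [[0.5, -0.2, 0.1, 0.4],
           [-0.3, 0.8, 0.2, -0.1]]

def encode(image):
    """Map a flattened 4-pixel 'image' to a 2-dimensional encoding."""
    return [sum(w * p for w, p in zip(row, image)) for row in WEIGHTS]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def similarity(img_a, img_b, scale=-4.0, bias=2.0):
    """Encode both images, take the difference, squash it with a sigmoid."""
    code_a = encode(img_a)                               # steps 1 and 3
    code_b = encode(img_b)                               # steps 2 and 3, same weights
    diff = [abs(a - b) for a, b in zip(code_a, code_b)]  # step 4
    return sigmoid(scale * sum(diff) + bias)             # step 5

reference = [0.9, 0.1, 0.4, 0.7]
near_copy = [0.9, 0.1, 0.4, 0.6]
different = [0.1, 0.9, 0.8, 0.0]

print(similarity(reference, reference))
print(similarity(reference, near_copy))
print(similarity(reference, different))
```

The score is highest for identical inputs and falls as the encodings move apart, so a threshold on it separates "same" from "different" pairs.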
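##### 1.2 Coding a Siamese network
The steps and code for implementing a Siamese network are given below. Note that this implementation compares the two encodings with a Euclidean distance and a contrastive loss rather than with a sigmoid output.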
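1. **Import the required packages and download the dataset**:
```python
# Notebook cell: lines starting with "!" are shell commands (Jupyter/Colab).
!pip install torch_snippets
# The code below relies on this wildcard import for names such as
# torch, nn, F, np, Dataset, DataLoader, Glob, read, fname, parent, choose, randint.
from torch_snippets import *
!wget https://www.dropbox.com/s/ua1rr8btkmpqjxh/face-detection.zip
!unzip face-detection.zip
device = 'cuda' if torch.cuda.is_available() else 'cpu'
```
The training data consists of 38 folders, one per person, each holding 10 sample images of that person. The test data consists of 3 folders, one per person, each holding 10 images.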
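2. **Define the dataset class `SiameseNetworkDataset`**:
```python
class SiameseNetworkDataset(Dataset):
    def __init__(self, folder, transform=None, should_invert=True):
        # should_invert is accepted but not used anywhere in this class.
        self.folder = folder
        self.items = Glob(f'{self.folder}/*/*')  # one sub-folder per person
        self.transform = transform

    def __getitem__(self, ix):
        itemA = self.items[ix]
        person = fname(parent(itemA))  # the folder name identifies the person
        same_person = randint(2)       # 0 or 1, chosen at random
        if same_person:
            # Second image comes from the same person's folder.
            itemB = choose(Glob(f'{self.folder}/{person}/*', silent=True))
        else:
            # Resample until the second image belongs to a different person.
            while True:
                itemB = choose(self.items)
                if person != fname(parent(itemB)):
                    break
        imgA = read(itemA)
        imgB = read(itemB)
        if self.transform:
            imgA = self.transform(imgA)
            imgB = self.transform(imgB)
        # Label: 0 = same person, 1 = different person.
        return imgA, imgB, np.array([1-same_person])

    def __len__(self):
        return len(self.items)
```
Each item of this class is a pair of images together with a label: 0 means the two images show the same person and 1 means they show different people.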
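The pair-sampling logic can be tried without any image files. The sketch below is an illustration only: `people`, `items`, and `sample_pair` are hypothetical names, and the standard-library `random` module stands in for the `randint` and `choose` helpers used in the class.

```python
import random

# Hypothetical miniature dataset: person name -> image file names.
people = {
    "alice": ["a1.png", "a2.png", "a3.png"],
    "bob": ["b1.png", "b2.png"],
    "carol": ["c1.png", "c2.png", "c3.png"],
}
# Flat list of (person, file) items, playing the role of self.items.
items = [(person, f) for person, files in people.items() for f in files]

def sample_pair(ix, rng):
    """Return (item_a, item_b, label); label 0 = same person, 1 = different."""
    person_a, _ = items[ix]
    same_person = rng.randint(0, 1)  # coin flip: 0 or 1
    if same_person:
        item_b = (person_a, rng.choice(people[person_a]))
    else:
        # Resample until the second image belongs to someone else.
        while True:
            item_b = rng.choice(items)
            if item_b[0] != person_a:
                break
    return items[ix], item_b, 1 - same_person

rng = random.Random(0)
for ix in range(len(items)):
    (pa, fa), (pb, fb), label = sample_pair(ix, rng)
    print(f"{fa:7s} {fb:7s} label={label}")
```

Because the coin flip is fair, roughly half of the pairs are positive (label 0) and half negative (label 1), regardless of how many people the dataset contains. As in the class above, a positive pair may contain the same file twice.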
3. **Define the transforms and prepare the datasets and data loaders**:
```python
from torchvision import transforms

# Training: light augmentation (flip, small affine jitter), then resize and normalize.
trn_tfms = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(5, (0.01,0.2), scale=(0.9,1.1)),
    transforms.Resize((100,100)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
# Validation: no augmentation, only resize and normalize.
val_tfms = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((100,100)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
trn_ds = SiameseNetworkDataset(folder="./data/faces/training/", transform=trn_tfms)
val_ds = SiameseNetworkDataset(folder="./data/faces/testing/", transform=val_tfms)
trn_dl = DataLoader(trn_ds, shuffle=True, batch_size=64)
val_dl = DataLoader(val_ds, shuffle=False, batch_size=64)
```
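To see what the final two transforms do to pixel values, the arithmetic can be reproduced by hand. This is a small sketch in plain Python; `normalize` is a hypothetical helper that mirrors the per-pixel formula `(x - mean) / std`, not the torchvision class itself.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Same per-pixel arithmetic as normalization: (x - mean) / std."""
    return (pixel - mean) / std

for raw in (0, 128, 255):
    scaled = raw / 255  # ToTensor rescales 8-bit values to the range [0, 1]
    print(raw, round(normalize(scaled), 3))
```

With a mean and standard deviation of 0.5, pixel values in [0, 1] end up in [-1, 1], centred on zero.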
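4. **Define the neural network architecture**:
    - **Define the convolution block `convBlock`**:
```python
def convBlock(ni, no):
    # Dropout -> 3x3 convolution (spatial size preserved) -> ReLU -> batch norm
    return nn.Sequential(
        nn.Dropout(0.2),
        nn.Conv2d(ni, no, kernel_size=3, padding=1, padding_mode='reflect'),
        nn.ReLU(inplace=True),
        nn.BatchNorm2d(no),
    )
```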
    - **Define the `SiameseNetwork` architecture**:
```python
class SiameseNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            convBlock(1,4),
            convBlock(4,8),
            convBlock(8,8),
            nn.Flatten(),
            # 8 channels x 100 x 100 pixels after the three blocks
            nn.Linear(8*100*100, 500), nn.ReLU(inplace=True),
            nn.Linear(500, 500), nn.ReLU(inplace=True),
            nn.Linear(500, 5)  # 5-dimensional encoding per image
        )

    def forward(self, input1, input2):
        # Both inputs go through the same layers, so the weights are shared.
        output1 = self.features(input1)
        output2 = self.features(input2)
        return output1, output2
```
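The input size `8*100*100` of the first linear layer follows from the convolution settings. The sketch below checks it with the standard output-size formula; `conv_out_size` is a hypothetical helper, and the 100x100 single-channel input is assumed from `Resize((100,100))` and `convBlock(1,4)`.

```python
def conv_out_size(size, kernel=3, padding=1, stride=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

size, channels = 100, 1          # assumed 100x100 single-channel input
for out_channels in (4, 8, 8):   # the three convBlock calls
    size = conv_out_size(size)   # 3x3 kernel with padding 1 keeps the size
    channels = out_channels

flat_features = channels * size * size
print(size, channels, flat_features)  # 100 8 80000
```

Since there is no pooling or striding, the feature map stays at 100x100, and the first linear layer alone holds 80000 x 500 = 40 million weights.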
5. **Define the contrastive loss function `ContrastiveLoss`**:
```python
class ContrastiveLoss(torch.nn.Module):
    def __init__(self, margin=2.0):
        super().__init__()
        self.margin = margin

    def forward(self, output1, output2, label):
        # label: 0 = same person, 1 = different person
        euclidean_distance = F.pairwise_distance(output1, output2, keepdim=True)
        # Same-person pairs are pulled together; different-person pairs are
        # pushed apart until their distance reaches the margin.
        loss_contrastive = torch.mean(
            (1-label) * torch.pow(euclidean_distance, 2) +
            (label) * torch.pow(torch.clamp(self.margin - euclidean_distance, min=0.0), 2)
        )
        # A pair is predicted "different" when its distance exceeds 0.6.
        acc = ((euclidean_distance>0.6)==label).float().mean()
        return loss_contrastive, acc
```
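The behaviour of the loss is easiest to see on individual pairs. The following sketch reproduces the per-pair arithmetic in plain Python; `contrastive_loss` and `predicted_label` are hypothetical scalar helpers, and the distances are made-up values.

```python
def contrastive_loss(distance, label, margin=2.0):
    """Per-pair contrastive loss; label 0 = same person, 1 = different."""
    same_term = (1 - label) * distance ** 2
    diff_term = label * max(margin - distance, 0.0) ** 2
    return same_term + diff_term

def predicted_label(distance, threshold=0.6):
    """Distances above the threshold are read as 'different person' (1)."""
    return int(distance > threshold)

pairs = [(0.5, 0), (1.5, 0), (0.5, 1), (2.5, 1)]
for distance, label in pairs:
    print(distance, label, contrastive_loss(distance, label),
          predicted_label(distance) == label)
```

A same-person pair is penalised by its squared distance, while a different-person pair is penalised only while it sits inside the margin: at a distance of 2.5 with a margin of 2.0 the loss is zero.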
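6. **Define the training and validation functions**:
```python
def train_batch(model, data, optimizer, criterion):
    imgsA, imgsB, labels = [t.to(device) for t in data]
    optimizer.zero_grad()
    codesA, codesB = model(imgsA, imgsB)
    loss, acc = criterion(codesA, codesB, labels)
    loss.backward()
    optimizer.step()
    return loss.item(), acc.item()

@torch.no_grad()
def validate_batch(model, data, criterion):
    imgsA, imgsB, labels = [t.to(device) for t in data]
    codesA, codesB = model(imgsA, imgsB)
    # The source breaks off after "loss,"; the remaining two lines are
    # assumed to mirror train_batch without the backward pass.
    loss, acc = criterion(codesA, codesB, labels)
    return loss.item(), acc.item()
```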