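LLM Notes 2: Modifying the Longformer Model for an Extractive Summarization Task

Contents

Getting LongformerForTokenClassification to run

Changing the 7-class pretrained model to 2 classes

Using the class labels to pick out the matching subword tokens

Converting tokens back into whole words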


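Getting LongformerForTokenClassification to run

Documentation:

https://huggingface.co/docs/transformers/en/model_doc/longformer#transformers.LongformerForTokenClassification

Download the pretrained model (the checkpoint used below is brad1141/Longformer-finetuned-norm, saved locally under tmp/):

https://huggingface.co/docs/transformers/en/model_doc/longformer#transformers.LongformerForTokenClassification

The documentation example is adapted below so that the outputs are read correctly during both prediction and training, whether the model returns a tuple or an output object: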

from transformers import AutoTokenizer, LongformerForTokenClassification
import torch

# To load straight from the Hub instead of the local copy:
# tokenizer = AutoTokenizer.from_pretrained("brad1141/Longformer-finetuned-norm")
# model = LongformerForTokenClassification.from_pretrained("brad1141/Longformer-finetuned-norm")
tokenizer = AutoTokenizer.from_pretrained("tmp/Longformer-finetuned-norm")
model = LongformerForTokenClassification.from_pretrained("tmp/Longformer-finetuned-norm")

inputs = tokenizer(
    "HuggingFace is a company based in Paris and New York", add_special_tokens=False, return_tensors="pt"
)

# Prediction
with torch.no_grad():
    outputs = model(**inputs)
    # If the output is a tuple, unpack it manually
    if isinstance(outputs, tuple):
        logits, = outputs
    else:
        logits = outputs.logits

predicted_token_class_ids = logits.argmax(-1)

# Note that tokens are classified rather than input words which means that
# there might be more predicted token classes than words.
# Multiple token classes might account for the same word
predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]
print("predicted_tokens_classes:", predicted_tokens_classes)

# Training (the predictions are reused as labels only to check that a loss is returned)
labels = predicted_token_class_ids
# loss = model(**inputs, labels=labels).loss
outputs = model(**inputs, labels=labels)
if isinstance(outputs, tuple):
    loss, logits = outputs
else:
    loss = outputs.loss
print("loss:", round(loss.item(), 2))
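The training call above only confirms that a loss comes back. As a rough sketch of what that loss is, the snippet below applies token-level cross-entropy to stand-in tensors. The shapes, the random labels, and the use of -100 to mark positions to skip are assumptions for illustration, not values taken from the checkpoint.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_labels = 7

# Stand-ins for the real tensors: 1 sentence, 12 tokens, 7 classes.
logits = torch.randn(1, 12, num_labels)
labels = torch.randint(0, num_labels, (1, 12))  # hypothetical gold class ids
labels[0, -2:] = -100                           # positions excluded from the loss

# Flatten to (tokens, classes) and (tokens,) before computing cross-entropy.
flat_logits = logits.view(-1, num_labels)
flat_labels = labels.view(-1)
loss = F.cross_entropy(flat_logits, flat_labels, ignore_index=-100)

# Same value computed by hand: mean cross-entropy over the kept positions only.
kept = flat_labels != -100
manual = F.cross_entropy(flat_logits[kept], flat_labels[kept])

print("loss:", round(loss.item(), 2))
```

The labels tensor therefore needs one class id per token, with the same (batch_size, sequence_length) shape as the token ids.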

At this point the output is a per-token classification, as in an NER task:

predicted_tokens_classes ['Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence']

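An important debugging step is to work out what each dimension of the model output means. Both the source code and the documentation describe this. For Longformer, the documentation gives: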

logits (torch.FloatTensor of shape (batch_size, sequence_length, config.num_labels)) — Classification scores (before SoftMax).
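To make the three axes concrete, here is a minimal sketch that uses a random tensor in place of the real model output. The sizes 1, 12 and 7 are illustrative stand-ins for batch_size, sequence_length and config.num_labels.

```python
import torch

torch.manual_seed(0)

# Stand-in for model(**inputs).logits: 1 sentence, 12 tokens, 7 labels.
batch_size, sequence_length, num_labels = 1, 12, 7
logits = torch.randn(batch_size, sequence_length, num_labels)

# Softmax over the last axis turns the raw scores into per-token probabilities.
probs = logits.softmax(dim=-1)

# Argmax over the last axis picks one class id per token.
predicted_ids = logits.argmax(dim=-1)

print(logits.shape)         # torch.Size([1, 12, 7])
print(predicted_ids.shape)  # torch.Size([1, 12])
```

Reducing over the last axis is what turns a (batch_size, sequence_length, num_labels) score tensor into one label per token.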

Changing the 7-class pretrained model to 2 classes

In the example the logits have shape [1, 12, 7], where sequence_length is the number of tokens in the sentence and config.num_labels is derived from id2label in the config file:

  "id2label": {
    "0": "Lead",
    "1": "Position",
    "2": "Evidence",
    "3": "Claim",
    "4": "Concluding Statement",
    "5": "Counterclaim",
    "6": "Rebuttal"
  },

Save a backup copy of the original config file, then reduce the classes to two:

  "id2label": {
    "0": "Non-dataset description",
    "1": "Dataset description"
  },
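The backup-and-edit step can also be scripted. The sketch below is one possible way to do it with the standard library. The helper name set_labels, the .bak suffix, and the decision to rewrite label2id as well are all assumptions, and the demo runs on a throwaway file rather than the real checkpoint directory.

```python
import json
import shutil
import tempfile
from pathlib import Path


def set_labels(config_path, labels):
    """Back up a config.json, then overwrite its label maps (hypothetical helper)."""
    config_path = Path(config_path)
    backup_path = config_path.with_name(config_path.name + ".bak")
    shutil.copyfile(config_path, backup_path)  # keep the original next to the edited file

    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["id2label"] = {str(i): name for i, name in enumerate(labels)}
    # Assumption: the inverse map is kept in sync, although the post only shows id2label.
    config["label2id"] = {name: i for i, name in enumerate(labels)}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config, backup_path


# Demo on a temporary file that mimics the 7-class config.
old_labels = ["Lead", "Position", "Evidence", "Claim",
              "Concluding Statement", "Counterclaim", "Rebuttal"]
demo_config = Path(tempfile.mkdtemp()) / "config.json"
demo_config.write_text(
    json.dumps({"id2label": {str(i): n for i, n in enumerate(old_labels)}}),
    encoding="utf-8",
)

new_config, backup = set_labels(
    demo_config, ["Non-dataset description", "Dataset description"]
)
print(new_config["id2label"])
```

A 2-label config no longer matches the 7-class classifier weights stored in the checkpoint, so loading with it may report a size mismatch unless the classification head is handled separately.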

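To change Longformer's output from 7 classes to 2, the model's classification layer (classifier) has to be adjusted:

Load the pretrained LongformerForTokenClassification model.

Replace the model's classification layer.

Re-initialize the weights of the new classification layer.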

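import torch.nn as nn

# Change the classification layer to 2 classes
model.num_labels = 2
model.classifier = nn.Linear(model.config.hidden_size, model.num_labels)

# Initialize the classification layer weights
model.classifier.weight.data.normal_(mean=0.0, std=model.config.initializer_range)
if model.classifier.bias is not None:
    # The original snippet is cut off here; zeroing the bias is an assumed completion.
    model.classifier.bias.data.zero_()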
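The head swap can be checked without downloading any weights. The sketch below applies the same idea to a toy module. TinyTokenClassifier and replace_head are hypothetical names, the hidden size of 16 is arbitrary, and zeroing the bias is an assumed initialization choice.

```python
import torch
from torch import nn


class TinyTokenClassifier(nn.Module):
    """Hypothetical stand-in for a token-classification model (not Longformer)."""

    def __init__(self, hidden_size=16, num_labels=7):
        super().__init__()
        self.num_labels = num_labels
        self.encoder = nn.Linear(hidden_size, hidden_size)    # plays the pretrained body
        self.classifier = nn.Linear(hidden_size, num_labels)  # plays the 7-class head

    def forward(self, hidden_states):
        return self.classifier(torch.tanh(self.encoder(hidden_states)))


def replace_head(model, num_labels, init_std=0.02):
    """Swap in a freshly initialized linear head with num_labels outputs."""
    in_features = model.classifier.in_features
    model.num_labels = num_labels
    model.classifier = nn.Linear(in_features, num_labels)
    nn.init.normal_(model.classifier.weight, mean=0.0, std=init_std)
    nn.init.zeros_(model.classifier.bias)  # assumed choice: start the bias at zero
    return model


torch.manual_seed(0)
toy = TinyTokenClassifier()
x = torch.randn(1, 12, 16)                # (batch_size, sequence_length, hidden_size)

shape_before = toy(x).shape               # 7 scores per token
encoder_weight = toy.encoder.weight.clone()
replace_head(toy, num_labels=2)
shape_after = toy(x).shape                # 2 scores per token

print(shape_before, shape_after)
```

Only the head is replaced, so the body keeps its weights while the new 2-class layer still has to be fine-tuned. Transformers also accepts num_labels=2 together with ignore_mismatched_sizes=True in from_pretrained, which should give a comparable result without the manual swap.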
