Getting LongformerForTokenClassification Working
Corresponding documentation:
Download the pretrained model:
Adjusting how the output is read during prediction and training
    from transformers import AutoTokenizer, LongformerForTokenClassification
    import torch

    # tokenizer = AutoTokenizer.from_pretrained("brad1141/Longformer-finetuned-norm")
    # model = LongformerForTokenClassification.from_pretrained("brad1141/Longformer-finetuned-norm")
    tokenizer = AutoTokenizer.from_pretrained("tmp/Longformer-finetuned-norm")
    model = LongformerForTokenClassification.from_pretrained("tmp/Longformer-finetuned-norm")

    inputs = tokenizer(
        "HuggingFace is a company based in Paris and New York",
        add_special_tokens=False,
        return_tensors="pt",
    )

    # Prediction
    with torch.no_grad():
        outputs = model(**inputs)

    # If the output is a tuple, unpack it manually (logits come first when no labels are passed)
    if isinstance(outputs, tuple):
        logits = outputs[0]
    else:
        logits = outputs.logits

    predicted_token_class_ids = logits.argmax(-1)

    # Note that tokens are classified rather than input words, which means that
    # there might be more predicted token classes than words.
    # Multiple token classes might account for the same word.
    predicted_tokens_classes = [model.config.id2label[t.item()] for t in predicted_token_class_ids[0]]
    print("predicted_tokens_classes:", predicted_tokens_classes)

    # Training
    labels = predicted_token_class_ids
    # loss = model(**inputs, labels=labels).loss
    outputs = model(**inputs, labels=labels)
    if isinstance(outputs, tuple):
        # With labels passed, the tuple starts with (loss, logits)
        loss, logits = outputs[:2]
    else:
        loss = outputs.loss
    print("loss:", round(loss.item(), 2))
The output is currently a per-token classification, as in an NER task:
    predicted_tokens_classes ['Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence', 'Evidence']
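An important step in debugging is to work out what each dimension of the model output means. This can be found in the source files and in the documentation.
Here, for Longformer: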
logits (torch.FloatTensor of shape (batch_size, sequence_length, config.num_labels)) — Classification scores (before SoftMax).
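To make the three dimensions concrete, here is a small dependency-free sketch that uses nested Python lists in place of a tensor. The random scores and the sizes 1, 12 and 7 are illustrative assumptions, not real model output.

```python
import random

# Illustrative sizes (assumptions): one sentence, 12 tokens, 7 classes.
batch_size, sequence_length, num_labels = 1, 12, 7

random.seed(0)
# logits[b][t][c] is the raw (pre-softmax) score of class c for token t of sentence b.
logits = [
    [[random.gauss(0.0, 1.0) for _ in range(num_labels)] for _ in range(sequence_length)]
    for _ in range(batch_size)
]


def argmax(scores):
    """Return the index of the largest score in a flat list."""
    return max(range(len(scores)), key=scores.__getitem__)


# Collapse the last (class) dimension: one predicted class id per token.
predicted_ids = [[argmax(token_scores) for token_scores in sentence] for sentence in logits]

print(len(predicted_ids), len(predicted_ids[0]))  # batch_size, sequence_length
```

With a real tensor, the same reduction over the class dimension is what logits.argmax(-1) performs.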
Changing the 7-class pretrained model to 2 classes
In the example, logits has shape [1, 12, 7], where sequence_length is the number of tokens in the sentence. config.num_labels is derived from id2label in the config file:
    "id2label": {
        "0": "Lead",
        "1": "Position",
        "2": "Evidence",
        "3": "Claim",
        "4": "Concluding Statement",
        "5": "Counterclaim",
        "6": "Rebuttal"
    },
Here, keep a copy of the original config file, then change the labels to two classes:
    "id2label": {
        "0": "Non-dataset description",
        "1": "Dataset description"
    },
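To change the Longformer output from 7 classes to 2, the model's classification layer (classifier layer) has to be adjusted:
1. Load the pretrained LongformerForTokenClassification model.
2. Replace the model's classification layer.
3. Re-initialize the weights of the new classification layer.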
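    import torch.nn as nn

    # Replace the classification layer with a 2-class head
    model.num_labels = 2
    model.classifier = nn.Linear(model.config.hidden_size, model.num_labels)

    # Initialize the weights of the new classification layer
    model.classifier.weight.data.normal_(mean=0.0, std=model.config.initializer_range)
    if model.classifier.bias is not None:
        # The source is cut off here; zeroing the bias is the assumed continuation
        model.classifier.bias.data.zero_()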
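A quick way to check the effect of the new head is to look at the output shape. The sketch below uses a stand-alone linear layer and random hidden states rather than the real model; hidden_size = 768 and initializer_range = 0.02 are assumed values that should be read from model.config in practice.

```python
import torch
from torch import nn

hidden_size = 768          # assumption: read model.config.hidden_size in practice
initializer_range = 0.02   # assumption: read model.config.initializer_range in practice
num_labels = 2

# Same construction and initialization as the replaced classification layer.
classifier = nn.Linear(hidden_size, num_labels)
classifier.weight.data.normal_(mean=0.0, std=initializer_range)
classifier.bias.data.zero_()

# Random stand-in for the encoder output of one 12-token sentence.
hidden_states = torch.randn(1, 12, hidden_size)
with torch.no_grad():
    logits = classifier(hidden_states)

print(tuple(logits.shape))  # (1, 12, 2)
```

The last dimension drops from 7 to 2 while batch_size and sequence_length stay the same. Because the new layer starts from random weights, it needs fine-tuning before its predictions mean anything.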