Implementing an LSTM Language Model with PyTorch

Baijiu in my cup

Last modified 2023-05-08 21:47:27

License: CC 4.0 BY-SA

Tags: pytorch, lstm, language model

First published 2023-05-06 22:53:27

Article link: https://blog.csdn.net/qq_50224852/article/details/130537022

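This article shows how to build an LSTM language model in PyTorch without batching. The key is understanding and reshaping the input and output dimensions, especially the differences between `batch_first=False` and `batch_first=True`. The model consists of an embedding layer, an LSTM layer, and a fully connected layer, and it predicts the probability of the next word.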

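There are plenty of explanations of the nn.LSTM parameters online. My own understanding of them is fairly shallow, so I will not copy and paste them here.

First, note that nothing below uses batching: for text analysis, sentences are fed in one at a time. As a result, we do not need to worry about padding batches for the LSTM layer.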

An important LSTM parameter: batch_first
s: sentence length, b: batch_size, e: embedding_dim, h: hid_dim

batch_first=False (default)
- Input: (s,b,e)
- Output: (s,b,h)
batch_first=True
- Input: (b,s,e)
- Output: (b,s,h)
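
The two layouts above can be checked directly with a small sketch. The sizes below (5, 1, 8, 16) are arbitrary values chosen for illustration and do not come from the article; the inputs are random tensors standing in for embedded words.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumptions, not from the article)
seq_len, batch_size, embedding_dim, hid_dim = 5, 1, 8, 16

# Default layout (batch_first=False): input (s, b, e), output (s, b, h)
lstm_seq_first = nn.LSTM(embedding_dim, hid_dim)
x_seq_first = torch.randn(seq_len, batch_size, embedding_dim)
out_seq_first, (h_n, c_n) = lstm_seq_first(x_seq_first)

# batch_first=True: input (b, s, e), output (b, s, h)
lstm_batch_first = nn.LSTM(embedding_dim, hid_dim, batch_first=True)
x_batch_first = torch.randn(batch_size, seq_len, embedding_dim)
out_batch_first, (h_n_bf, c_n_bf) = lstm_batch_first(x_batch_first)

print(out_seq_first.shape)    # torch.Size([5, 1, 16])
print(out_batch_first.shape)  # torch.Size([1, 5, 16])

# The final hidden state is (num_layers, b, h) in both cases;
# batch_first only changes the layout of the input and the output sequence.
print(h_n.shape, h_n_bf.shape)
```

Note that `batch_first` affects only the input tensor and the output sequence, not the returned hidden and cell states.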

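Because we are not batching, batch_size=1.

Here is the catch: the data going into and out of the LSTM is three-dimensional, while a single embedded sentence has only two dimensions, (s,e). The input and output therefore need to be reshaped.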

Methods (look up the detailed usage as needed):

unsqueeze(0), squeeze(0)
.view()
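
As a rough illustration of these reshaping calls, the sketch below adds and removes a batch dimension of size 1 on a random tensor that stands in for one embedded sentence. The sizes and variable names are assumptions made for the example.

```python
import torch

seq_len, embedding_dim = 5, 8                  # illustrative sizes
embeds = torch.randn(seq_len, embedding_dim)   # (s, e): one sentence, no batch axis

# Default layout (batch_first=False): insert the batch axis in the middle
seq_first = embeds.view(seq_len, 1, -1)        # (s, 1, e)

# batch_first=True layout: insert the batch axis at the front
batch_first = embeds.unsqueeze(0)              # (1, s, e)

# Drop the batch axis again after the LSTM
restored_a = seq_first.view(seq_len, -1)       # (s, e)
restored_b = batch_first.squeeze(0)            # (s, e)

print(seq_first.shape, batch_first.shape)
print(restored_a.shape, restored_b.shape)
```

`embeds.unsqueeze(1)` would give the same (s, 1, e) result as the `.view()` call; which one to use is mostly a matter of taste.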

import torch.nn as nn
import torch.nn.functional as F

# Version 1: batch_first=False (default); reshape with .view()
class LSTM_LM(nn.Module):

    def __init__(self, vocab_size, embedding_dim, hid_dim=128):
        super(LSTM_LM, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hid_dim)
        self.fc = nn.Linear(hid_dim, vocab_size)

    def forward(self, inputs):
        # inputs: (s,) word indices of a single sentence
        embeds = self.embeddings(inputs)                # (s, e)
        embeds = embeds.view(embeds.shape[0], 1, -1)    # (s, 1, e)
        out, _ = self.lstm(embeds)                      # (s, 1, h)
        s, _, _ = out.size()
        out = self.fc(out.view(s, -1))                  # (s, vocab_size)
        log_probs = F.log_softmax(out, dim=1)
        return log_probs

import torch.nn as nn
import torch.nn.functional as F

# Version 2: batch_first=True; reshape with unsqueeze(0) / squeeze(0)
class LSTM_LM(nn.Module):

    def __init__(self, vocab_size, embedding_dim, hid_dim=128):
        super(LSTM_LM, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, vocab_size)

    def forward(self, inputs):
        # inputs: (s,) word indices of a single sentence
        embeds = self.embeddings(inputs).unsqueeze(0)   # (1, s, e)
        out, _ = self.lstm(embeds)                      # (1, s, h)
        out = out.squeeze(0)                            # (s, h)
        out = self.fc(out)                              # (s, vocab_size)
        log_probs = F.log_softmax(out, dim=1)
        return log_probs
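
To show how such a model might be used, here is a minimal sketch of a single training step on one sentence. Everything outside the model itself is an assumption made for illustration: the toy sentence, the vocabulary built from it, the layer sizes, the SGD optimizer, and the learning rate. The model is restated compactly so the block runs on its own, and the attribute names `lstm` and `fc` are illustrative. Since the model returns log-probabilities, `nn.NLLLoss` is the matching loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTM_LM(nn.Module):
    """Compact restatement of the batch_first=True model above."""
    def __init__(self, vocab_size, embedding_dim, hid_dim=128):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hid_dim, batch_first=True)
        self.fc = nn.Linear(hid_dim, vocab_size)

    def forward(self, inputs):
        embeds = self.embeddings(inputs).unsqueeze(0)   # (1, s, e)
        out, _ = self.lstm(embeds)                      # (1, s, h)
        return F.log_softmax(self.fc(out.squeeze(0)), dim=1)  # (s, vocab_size)

torch.manual_seed(0)

# Hypothetical toy data: one sentence and a vocabulary built from it
sentence = "the cat sat on the mat".split()
vocab = sorted(set(sentence))
word_to_ix = {w: i for i, w in enumerate(vocab)}
token_ids = torch.tensor([word_to_ix[w] for w in sentence], dtype=torch.long)

# Next-word prediction: the input is every word but the last,
# the target is the same sequence shifted left by one position
inputs, targets = token_ids[:-1], token_ids[1:]

model = LSTM_LM(vocab_size=len(vocab), embedding_dim=16, hid_dim=32)
loss_fn = nn.NLLLoss()                                  # expects log-probabilities
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One optimization step on this single sentence (no batching)
model.zero_grad()
log_probs = model(inputs)                               # (s - 1, vocab_size)
loss = loss_fn(log_probs, targets)
loss.backward()
optimizer.step()

print(log_probs.shape, loss.item())
```

In practice this step would be wrapped in a loop over all sentences and repeated for several epochs.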