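Here is the situation. I see that MAX_SENTENCE_LENGTH defaults to 250. With the default, both training and decoding work fine, and the results are very good.
Some sentences in my dataset are around 1000 long, however, and with the default of 250 those sentences disappear from the result file that decode generates. I also saw an if check in the source code that lets sentences of any length through when MAX_SENTENCE_LENGTH is negative, so I set MAX_SENTENCE_LENGTH to -1. Training then ran without errors, but decoding fails on a statement in the CRF layer. The full error message is below.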
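To make the filtering behaviour described above concrete, here is a minimal sketch of such a length check. The function name `keep_sentence`, the strict `<` comparison, and the sample data are assumptions for illustration only; the actual check in the LatticeLSTM source may differ in detail.

```python
# Hypothetical sketch of the length filter described above; the real check
# lives in the LatticeLSTM source and may differ in detail.
MAX_SENTENCE_LENGTH = 250  # default value mentioned in the post


def keep_sentence(tokens, max_sent_length=MAX_SENTENCE_LENGTH):
    """Return True if a sentence passes the length filter.

    A negative max_sent_length disables the filter (assumed behaviour,
    based on the if check described in the post).
    """
    if max_sent_length < 0:
        return True
    return len(tokens) < max_sent_length


long_sentence = ["x"] * 1000
print(keep_sentence(long_sentence))      # filtered out under the default limit
print(keep_sentence(long_sentence, -1))  # kept when the limit is negative
```

Under these assumptions, a 1000-token sentence is silently dropped with the default of 250 and only reaches the model once the limit is set to a negative value.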
File "C:\Users\jmy\PycharmProjects\LatticeLSTM\model\crf.py", line 171, in _viterbi_decode
last_partition = torch.gather(partition_history, 1, last_position).view(batch_size,tag_size,1)
RuntimeError: Invalid index in gather at C:\w\1\s\tmp_conda_3.7_021303\conda\conda-bld\pytorch_1565316900252\work\aten\src\TH/generic/THTensorEvenMoreMath.cpp:472
The error says the index passed to gather is invalid, so I printed the first and third arguments of gather, namely partition_history and last_position.
The printing code looks like this:
print(partition_history.size())
print(last_position.size())
last_partition = torch.gather(partition_history, 1, last_position).view(batch_size,tag_size,1)
print("success")
The output was:
torch.Size([1, 312, 7])
torch.Size([1, 1, 7])
success
torch.Size([1, 164, 7])
torch.Size([1, 1, 7])
success
torch.Size([1, 219, 7])
torch.Size([1, 1, 7])
success
torch.Size([1, 256, 7])
torch.Size([1, 1, 7])
Traceback (most recent call last):
File "main.py", line 449, in <module>
decode_results = load_model_decode(model_dir, data, 'raw', gpu, seg)
File "main.py", line 355, in load_model_decode
speed, acc, p, r, f, pred_results = evaluate(data, model, name)
File "main.py", line 153, in evaluate
tag_seq = model(gaz_list,batch_word, batch_biword, batch_wordlen, batch_char, batch_charlen, batch_charrecover, mask)
File "C:\Users\jmy\Anaconda3\envs\python3.7\lib\site-packages\torch\nn\modules\module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "C:\Users\jmy\PycharmProjects\LatticeLSTM\model\bilstmcrf.py", line 42, in forward
scores, tag_seq = self.crf._viterbi_decode(outs, mask)
File "C:\Users\jmy\PycharmProjects\LatticeLSTM\model\crf.py", line 171, in _viterbi_decode
last_partition = torch.gather(partition_history, 1, last_position).view(batch_size,tag_size,1)
RuntimeError: Invalid index in gather at C:\w\1\s\tmp_conda_3.7_021303\conda\conda-bld\pytorch_1565316900252\work\aten\src\TH/generic/THTensorEvenMoreMath.cpp:472
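The printed sizes are the same kind of shape in the failing call as in the ones that succeed, and the input of length 312 went through while the one of length 256 failed, so the sizes alone do not explain the error. torch.gather also requires every value in the index tensor to lie in the range 0 to size-1 along the gathered dimension, which printing `.size()` does not reveal. Below is a minimal pure-Python sketch of a range check on the index values. The helper name `find_invalid_indices` is hypothetical; with PyTorch the values could be obtained with something like `last_position.view(-1).tolist()`.

```python
def find_invalid_indices(index_values, dim_size):
    """Return (position, value) pairs for indices outside [0, dim_size - 1]."""
    return [(i, v) for i, v in enumerate(index_values)
            if not 0 <= v < dim_size]


# The failing call has 256 entries along dim 1 of partition_history.
seq_len = 256

# Hypothetical index values, one per tag (tag_size is 7 in the output above).
print(find_invalid_indices([255] * 7, seq_len))      # every index is in range
print(find_invalid_indices([-1] * 7, seq_len)[:1])   # negative index is reported
print(find_invalid_indices([256] * 7, seq_len)[:1])  # index == size is reported
```

Printing `last_position.min().item()` and `last_position.max().item()` next to the sizes just before the gather call would show whether the index falls outside this range for the failing sentence.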
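Does the author know what is going wrong here? Why does the MAX_SENTENCE_LENGTH setting cause a problem at this line? With the default of 250, training and decoding both work perfectly. I have been stuck on this for a long time and would really appreciate an explanation. Thank you very much.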