Model() got multiple values for argument 'nr_class' - SpaCy multi-classification model (BERT integration)
Problem description
Hi, I am working on implementing a multi-class classification model (5 classes) with the new spaCy model en_pytt_bertbaseuncased_lg. The code for the new pipe is here:
nlp = spacy.load('en_pytt_bertbaseuncased_lg')
textcat = nlp.create_pipe(
    'pytt_textcat',
    config={
        "nr_class": 5,
        "exclusive_classes": True,
    }
)
nlp.add_pipe(textcat, last=True)
textcat.add_label("class1")
textcat.add_label("class2")
textcat.add_label("class3")
textcat.add_label("class4")
textcat.add_label("class5")
The training code is as follows and is based on the example from here (https://pypi.org/project/spacy-pytorch-transformers/):
def extract_cat(x):
    for key in x.keys():
        if x[key]:
            return key

# get names of other pipes to disable them during training
n_iter = 250  # number of epochs
train_data = list(zip(train_texts, [{"cats": cats} for cats in train_cats]))
dev_cats_single = [extract_cat(x) for x in dev_cats]
train_cats_single = [extract_cat(x) for x in train_cats]
cats = list(set(train_cats_single))
recall = {}
for c in cats:
    if c is not None:
        recall['dev_' + c] = []
        recall['train_' + c] = []

optimizer = nlp.resume_training()
batch_sizes = compounding(1.0, round(len(train_texts) / 2), 1.001)
for i in range(n_iter):
    random.shuffle(train_data)
    losses = {}
    batches = minibatch(train_data, size=batch_sizes)
    for batch in batches:
        texts, annotations = zip(*batch)
        nlp.update(texts, annotations, sgd=optimizer, drop=0.2, losses=losses)
    print(i, losses)
So the structure of my data looks like this:
[('TEXT TEXT TEXT',
  {'cats': {'class1': False,
            'class2': False,
            'class3': False,
            'class4': True,
            'class5': False}}), ... ]
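For reference, records in this shape can be built from plain label strings. The following is only a minimal sketch: the helper name make_record and the LABELS list are illustrative assumptions and are not part of spaCy or spacy-pytorch-transformers.

```python
# Hypothetical helper (not a spaCy API): build one training record in the
# (text, {"cats": {...}}) shape shown above from a single gold label.
LABELS = ["class1", "class2", "class3", "class4", "class5"]


def make_record(text, gold_label, labels=LABELS):
    if gold_label not in labels:
        raise ValueError("unknown label: %r" % (gold_label,))
    # Exactly one label is True because the classes are mutually exclusive.
    cats = {label: label == gold_label for label in labels}
    return (text, {"cats": cats})


record = make_record("TEXT TEXT TEXT", "class4")
print(record)
```

Building the dict from one shared label list keeps every record's keys identical, which makes it easy to check that all five classes are present in each example.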
I am not sure why I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-32-1588a4eadc8d> in <module>
21
22
---> 23 optimizer = nlp.resume_training()
24 batch_sizes = compounding(1.0, round(len(train_texts)/2), 1.001)
25
TypeError: Model() got multiple values for argument 'nr_class'
If I take out the nr_class argument, I get this error instead:
ValueError: operands could not be broadcast together with shapes (1,2) (1,5)
I actually thought this would happen because I didn't specify the nr_class argument. Is that correct?
Recommended answer
This is a regression in the most recent version we released of spacy-pytorch-transformers. Sorry about this!
The root cause is that this is another case of the evils of **kwargs. I'm looking forward to refining the spaCy API to prevent these issues in future.
You can see the offending line here: https://github.com/explosion/spacy-pytorch-transformers/blob/c1def95e1df783c69bff9bc8b40b5461800e9231/spacy_pytorch_transformers/pipeline/textcat.py#L71 . We provide nr_class as a positional argument, which overlaps with the explicit nr_class argument you passed in through the config.
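The collision can be reproduced in plain Python without spaCy. The sketch below is a stand-in, not the library's actual code: the Model function, its parameters, and the value passed positionally are assumptions chosen only to mirror the pattern described above (one positional nr_class plus a forwarded **cfg that also contains nr_class).

```python
# Stand-in for the real model factory; only the signature matters here.
def Model(nr_class, exclusive_classes=False, **cfg):
    return {"nr_class": nr_class, "exclusive_classes": exclusive_classes}


# User-supplied config, as passed to create_pipe() in the question.
cfg = {"nr_class": 5, "exclusive_classes": True}

# Assumed value the caller computes itself and passes positionally.
nr_class_from_caller = 5

message = None
try:
    # nr_class arrives twice: once positionally and once via **cfg,
    # so Python refuses to bind the parameter.
    Model(nr_class_from_caller, **cfg)
except TypeError as err:
    message = str(err)

print(message)
```

Because **cfg is expanded at call time, nothing warns about the duplicate key until the call is actually made, which is why the error only surfaces at nlp.resume_training().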
To work around the problem, you can simply remove the nr_class key from the config dict you're passing into nlp.create_pipe().
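A minimal sketch of that workaround, assuming the same nlp.create_pipe() and add_label() calls used in the question. The helper name create_textcat is hypothetical; it only strips the conflicting key before the config is handed to the pipeline.

```python
def create_textcat(nlp, labels, config=None):
    """Hypothetical helper: create a 'pytt_textcat' pipe without nr_class."""
    cfg = dict(config or {})   # copy so the caller's dict is left untouched
    cfg.pop("nr_class", None)  # drop the key that collides inside the library
    textcat = nlp.create_pipe("pytt_textcat", config=cfg)
    for label in labels:
        textcat.add_label(label)
    return textcat
```

With a loaded pipeline this would be called as create_textcat(nlp, ["class1", "class2", "class3", "class4", "class5"], {"exclusive_classes": True}), followed by nlp.add_pipe(textcat, last=True) as in the question.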