Adding skip_special_tokens=True to FillMaskPipeline (#9783)
* We most likely don't want special tokens in this output.

* Adding `skip_special_tokens=True` to FillMaskPipeline

- It's backward incompatible.
- It makes more sense for pipelines to remove references to special_tokens
  (all of the other pipelines do that).
- Keeping special tokens makes it hard for users to actually remove them,
  because all models have different tokens (<s>, <cls>, [CLS], ...).

* Fixing `token_str` in the same vein, and actually fixing the tests too!
@@ -179,10 +179,10 @@ class FillMaskPipeline(Pipeline):
                 tokens = tokens[np.where(tokens != self.tokenizer.pad_token_id)]
                 result.append(
                     {
-                        "sequence": self.tokenizer.decode(tokens),
+                        "sequence": self.tokenizer.decode(tokens, skip_special_tokens=True),
                         "score": v,
                         "token": p,
-                        "token_str": self.tokenizer.convert_ids_to_tokens(p),
+                        "token_str": self.tokenizer.decode(p),
                     }
                 )
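The effect of the two changed lines can be illustrated with a minimal sketch. It uses a hypothetical `ToyTokenizer` and a made-up vocabulary, not the real tokenizer classes. The `Ġ` word-boundary marker and the `<s>`/`</s>` special tokens are assumptions modeled on byte-level BPE vocabularies, and the toy `decode` takes a list of ids, whereas the pipeline passes a single id.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; NOT the library's implementation."""

    def __init__(self, vocab: List[str], special_tokens: List[str]) -> None:
        self.id_to_token: Dict[int, str] = dict(enumerate(vocab))
        self.special_tokens = set(special_tokens)

    def convert_ids_to_tokens(self, token_id: int) -> str:
        # Returns the raw vocabulary entry, word-boundary marker included.
        return self.id_to_token[token_id]

    def decode(self, ids: List[int], skip_special_tokens: bool = False) -> str:
        tokens = [self.id_to_token[i] for i in ids]
        if skip_special_tokens:
            tokens = [t for t in tokens if t not in self.special_tokens]
        # In this toy vocabulary, "Ġ" marks a leading space.
        return "".join(tokens).replace("Ġ", " ")


# Made-up vocabulary for illustration only.
vocab = ["<s>", "</s>", "Paris", "Ġis", "Ġthe", "Ġcapital", "Ġof", "ĠFrance", "."]
tokenizer = ToyTokenizer(vocab, special_tokens=["<s>", "</s>"])

ids = [0, 2, 3, 4, 5, 6, 7, 8, 1]

# "sequence" before the change: special tokens leak into the output.
with_special = tokenizer.decode(ids)
# "sequence" after the change: special tokens are stripped.
without_special = tokenizer.decode(ids, skip_special_tokens=True)

# "token_str" before the change: the raw vocabulary entry.
raw_token = tokenizer.convert_ids_to_tokens(7)
# "token_str" after the change: the decoded, human-readable text.
decoded_token = tokenizer.decode([7])

print(repr(with_special))     # '<s>Paris is the capital of France.</s>'
print(repr(without_special))  # 'Paris is the capital of France.'
print(repr(raw_token))        # 'ĠFrance'
print(repr(decoded_token))    # ' France'
```

With decoding handled inside the pipeline, callers no longer need to know which special tokens or boundary markers a given model uses. This is also why the change is backward incompatible for anyone who parsed the old strings.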