Switch return_dict to True by default. (#8530)

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests
This commit is contained in:
Sylvain Gugger
2020-11-16 11:43:00 -05:00
committed by GitHub
parent 0d0a0785fd
commit 1073a2bde5
106 changed files with 138 additions and 234 deletions

View File

@@ -23,7 +23,7 @@ target_str = "us rejects charges against its ambassador in bolivia"
input_ids = tokenizer(input_str, return_tensors="pt").input_ids
labels = tokenizer(target_str, return_tensors="pt").input_ids
loss = model(input_ids, labels=labels, return_dict=True).loss
loss = model(input_ids, labels=labels).loss
```
### Citation

View File

@@ -26,7 +26,7 @@ target_str = "us rejects charges against its ambassador in bolivia"
input_ids = tokenizer(input_str, return_tensors="pt").input_ids
labels = tokenizer(target_str, return_tensors="pt").input_ids
loss = model(input_ids, labels=labels, return_dict=True).loss
loss = model(input_ids, labels=labels).loss
```
Note that since this model is a multi-lingual model it can be fine-tuned on all kinds of other languages.

View File

@@ -45,7 +45,7 @@ from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np
tokenizer = AutoTokenizer.from_pretrained('mrm8488/codebert-base-finetuned-detect-insecure-code')
model = AutoModelForSequenceClassification.from_pretrained('mrm8488/codebert-base-finetuned-detect-insecure-code', return_dict=True)
model = AutoModelForSequenceClassification.from_pretrained('mrm8488/codebert-base-finetuned-detect-insecure-code')
inputs = tokenizer("your code here", return_tensors="pt", truncation=True, padding='max_length')
labels = torch.tensor([1]).unsqueeze(0) # Batch size 1

View File

@@ -13,7 +13,7 @@ sentences = ["Hello World", "Hallo Welt"]
encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=64, return_tensors='pt')
with torch.no_grad():
model_output = model(**encoded_input, return_dict=True)
model_output = model(**encoded_input)
embeddings = model_output.pooler_output
embeddings = torch.nn.functional.normalize(embeddings)