Pass datasets trust_remote_code (#31406)
* Pass datasets trust_remote_code
* Pass trust_remote_code in more tests
* Add trust_remote_dataset_code arg to some tests
* Revert "Temporarily pin datasets upper version to fix CI"
  This reverts commit b7672826ca.
* Pass trust_remote_code in librispeech_asr_dummy docstrings
* Revert "Pin datasets<2.20.0 for examples"
  This reverts commit 833fc17a3e.
* Pass trust_remote_code to all examples
* Revert "Add trust_remote_dataset_code arg to some tests" to research_projects
* Pass trust_remote_code to tests
* Pass trust_remote_code to docstrings
* Fix flax examples tests requirements
* Pass trust_remote_dataset_code arg to tests
* Replace trust_remote_dataset_code with trust_remote_code in one example
* Fix duplicate trust_remote_code
* Replace args.trust_remote_dataset_code with args.trust_remote_code
* Replace trust_remote_dataset_code with trust_remote_code in parser
* Replace trust_remote_dataset_code with trust_remote_code in dataclasses
* Replace trust_remote_dataset_code with trust_remote_code arg
committed by GitHub
parent 485fd81471
commit a14b055b65
@@ -56,7 +56,7 @@ if __name__ == "__main__":
     tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path)

     # Load dataset
-    train_dataset, test_dataset = load_dataset("imdb", split=["train", "test"])
+    train_dataset, test_dataset = load_dataset("stanfordnlp/imdb", split=["train", "test"])
     train_dataset = train_dataset.shuffle().select(range(5000)) # smaller the size for train dataset to 5k
     test_dataset = test_dataset.shuffle().select(range(500)) # smaller the size for test dataset to 500
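For context, the pattern named in the commit message (a trust_remote_code argument in the parser that is forwarded to the dataset loader) might look roughly like the sketch below. It is an illustration only, not code from this commit: the helper names parse_args, dataset_kwargs, and load_splits and the --dataset_name flag are hypothetical, and it assumes an installed datasets release whose load_dataset accepts the trust_remote_code keyword.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI flags; --dataset_name is a hypothetical flag for this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset_name", default="stanfordnlp/imdb")
    parser.add_argument(
        "--trust_remote_code",
        action="store_true",
        help="Allow running dataset loading scripts fetched from the Hub.",
    )
    return parser.parse_args(argv)


def dataset_kwargs(args):
    """Build the keyword arguments that will be forwarded to load_dataset."""
    return {
        "path": args.dataset_name,
        "split": ["train", "test"],
        # Assumption: the installed datasets version accepts this keyword.
        "trust_remote_code": args.trust_remote_code,
    }


def load_splits(args):
    """Load the train and test splits (needs the datasets package and network access)."""
    from datasets import load_dataset  # imported lazily so the helpers above stay dependency-free

    return load_dataset(**dataset_kwargs(args))
```

Calling load_splits(parse_args()) would then return the two splits in the same order as the split list. Keeping the flag off by default means remote loading scripts run only when the user opts in explicitly.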