Correct validation_split_percentage argument from int (e.g. 5) to float (0.05) (#12897)
* Fixed train_test_split test_size argument
* `Seq2SeqTrainer` set max_length and num_beams only when non None (#12899)
* set max_length and num_beams only when non None
* fix instance variables
* fix code style
* [FLAX] Minor fixes in CLM example (#12914)
* readme: fix retrieval of vocab size for flax clm example
* examples: fix flax clm example when using training/evaluation files
* Fix module path for symbolic_trace example

Co-authored-by: cchen-dialpad <47165889+cchen-dialpad@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
@@ -438,7 +438,7 @@ def main():
         f"Validation file not found: using {data_args.validation_split_percentage}% of the dataset as validation as provided in data_args"
     )
     train_indices, val_indices = train_test_split(
-        list(range(len(train_dataset))), test_size=data_args.validation_split_percentage
+        list(range(len(train_dataset))), test_size=data_args.validation_split_percentage / 100
     )

     eval_dataset = train_dataset.select(val_indices)
@@ -499,7 +499,7 @@ def main():
         f"Validation file not found: using {data_args.validation_split_percentage}% of the dataset as validation as provided in data_args"
     )
     train_indices, val_indices = train_test_split(
-        list(range(len(train_dataset))), test_size=data_args.validation_split_percentage
+        list(range(len(train_dataset))), test_size=data_args.validation_split_percentage / 100
     )

     eval_dataset = train_dataset.select(val_indices)
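
Note on why the division matters: scikit-learn's `train_test_split` reads an integer `test_size` as an absolute number of samples and a float between 0 and 1 as a fraction of the dataset. The sketch below is not part of this commit. It uses a hypothetical stdlib-only helper, `split_indices`, that imitates that rule so the effect can be checked without installing scikit-learn. The dataset size of 1000 and the seed are assumptions chosen for illustration.

```python
import math
import random


def split_indices(n_samples, test_size, seed=0):
    """Hypothetical stand-in imitating how scikit-learn interprets test_size.

    int   -> absolute number of validation samples
    float -> fraction of the dataset, strictly between 0 and 1
    """
    if isinstance(test_size, int):
        n_test = test_size
    elif 0.0 < test_size < 1.0:
        n_test = math.ceil(test_size * n_samples)
    else:
        raise ValueError(f"float test_size must be in (0, 1), got {test_size}")
    if not 0 < n_test < n_samples:
        raise ValueError(f"test_size={test_size} leaves an empty split")

    # Shuffle a copy of the index range, then cut off the validation part.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


validation_split_percentage = 5  # a percentage, as in data_args
n_samples = 1000                 # assumed dataset size for this example

# Before the fix: the integer 5 is read as "5 samples", not "5%".
_, val_before = split_indices(n_samples, validation_split_percentage)

# After the fix: 5 / 100 = 0.05 is read as a fraction of the dataset.
_, val_after = split_indices(n_samples, validation_split_percentage / 100)

print(len(val_before), len(val_after))
```

With the old call the validation set would hold 5 examples regardless of dataset size, while the log message above it claims 5% of the dataset. Dividing by 100 makes the split match the message.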