HuggingFace_transformer

Author	SHA1	Message	Date
Yih-Dar	44fe1a1cc4	Avoid using uncessary `get_values(MODEL_MAPPING)` (#29362 ) * more fixes * more fixes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-29 17:19:17 +08:00
Younes Belkada	b647acdb53	FIX [`CI`] `require_read_token` in the llama FA2 test (#29361 ) Update test_modeling_llama.py	2024-02-29 04:49:01 +01:00
Younes Belkada	8d8ac9c2df	FIX [`CI`]: Fix failing tests for peft integration (#29330 ) fix failing tests for peft integration	2024-02-29 03:56:16 +01:00
Younes Belkada	1aee9afd1c	FIX [`CI` / `starcoder2`] Change starcoder2 path to correct one for slow tests (#29359 ) change starcoder2 path to correct one	2024-02-29 03:52:13 +01:00
Arthur	8a8a0a4ae0	[`Llama ROPE`] Fix torch export but also slow downs in forward (#29198 ) * remove control flow * update gptneox * update .... * nits * Actually let's just break. Otherwise we are silently failing which imo is not optimal * version BC * fix tests * fix eager causal * nit * add a test * style * nits * nits * more nits for the test * update and fix * make sure cuda graphs are not skipped * read token is needed for meta llama * update! * fiixup * compile test should be slow * fix thet fix copies * stle 🫠	2024-02-28 10:45:53 +01:00
Younes Belkada	ad00c482c7	FIX [`Gemma` / `CI`] Make sure our runners have access to the model (#29242 ) * pu hf token in gemma tests * update suggestion * add to flax * revert * fix * fixup * forward contrib credits from discussion --------- Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>	2024-02-28 06:25:23 +01:00
RaymondLi0	63caa370e6	Starcoder2 model - bis (#29215 ) * Copy model * changes * misc * fixes * add embed and residual dropout (#30) * misc * remove rms norm and gated MLP * remove copied mentions where its not a copy anymore * remove unused _shape * copied from mistral instead * fix copies * fix copies * add not doctested * fix * fix copyright * Update docs/source/en/model_doc/starcoder2.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix doc * revert some changes * add fa2 tests * fix styling nit * fix * push dummy docs --------- Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-28 01:24:34 +01:00
Raushan Turganbay	ddf7ac4237	Token level timestamps for long-form generation in Whisper (#29148 )	2024-02-27 18:15:26 +00:00
Andrei Panferov	e3fc90ae68	Cleaner Cache `dtype` and `device` extraction for CUDA graph generation for quantizers compatibility (#29079 ) * input_layernorm as the beacon of hope * cleaner dtype extraction * AQLM + CUDA graph test * is available check * shorter text test	2024-02-27 09:32:39 +01:00
FredericOdermatt	871ba71dfa	GenerationConfig validate both constraints and force_words_ids (#29163 ) GenerationConfig validate both options for constrained decoding: constraints and force_words_ids	2024-02-27 01:43:52 +01:00
Eduardo Pacheco	3fcfbe7549	Adding SegGPT (#27735 ) * First commit * Improvements * More improvements * Converted original checkpoint to HF checkpoint * Fix style * Fixed forward * More improvements * More improvements * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove asserts * Remove unnecessary attributes * Changed model name to camel case * Improve forward doc * Improve tests * More improvements * Fix copies * Fix doc * Make SegGptImageProcessor more flexible * Added few-shot test * Fix style * Update READMEs and docs * Update READMEs * Make inputs required * Add SegGptForImageSegmentation * Make tests pass * Rename to out_indicies * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fixed naming convention * Copying SegGptMlp from modeling_sam.py * Some minor improvements * Remove mlp_ratio * Fix docstrings * Fixed docstring match * Objects defined before use * Storing only patch_size and beta for SegGptLoss * removed _prepare_inputs method * Removed modified from headers * Renamed to output_indicies * Removed unnecessary einsums * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixing issues * Raise error as soon as possible * More fixes * Fix merge * Added palette to SegGptImageProcessor * Fixed typo * Fixed shape typo * Added permute before doing palette to class mapping * Fixed style * Fixed and added tests * Fixed docstrings * Matching SegFormer API for post_processing_semantic_segmentation * Fixed copies * Fixed SegGptImageProcessor to handle both binary and RGB masks * Updated docstrings of SegGptImageProcessor * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/seggpt.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Object definitions above & fix style * Renamed output_indices to intermediate_feature_indices * Removed unnecessary check on bool_masked_pos * Loss first in the outputs * Added validation for do_normalize * Improved SegGptImageProcessor and added new tests * Added comment * Added docstrings to SegGptLoss * Reimplemented ensemble condition logic in SegGptEncoder * Update src/transformers/models/seggpt/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated docstrings to use post_process_semantic_segmentation * Fixed typo on docstrings * moved pixel values test to test_image_processing_seggpt * Addressed comments * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated docstrings for SegGptLoss * Address comments * Added SegGpt example to model docs * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * moved patchify and unpatchify * Rename checkpoint * Renamed intermediate_features to intermediate_hidden_states for consistency * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Replaced post_process_masks for post_process_semantic_segmentation in the docs --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 18:17:19 +00:00
Raushan Turganbay	8f2f0f0f85	Track each row separately for stopping criteria (#29116 )	2024-02-26 16:06:16 +00:00
Merve Noyan	7c4995f93d	Add feature extraction mapping for automatic metadata update (#28944 ) * add feature extraction mapping * added prefix * ruff check * minor fix * Update modeling_auto.py * fix typo * remove prefix to make variable public/importable * Update src/transformers/models/auto/modeling_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixes * addressed comments * nit * fix-copies * remove from tests * this should fix * Update tests/models/convnextv2/test_modeling_convnextv2.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-26 10:35:37 +00:00
Matt	371b572e55	Allow remote code repo names to contain "." (#29175 ) * stash commit * stash commit * It works! * Remove unnecessary change * We don't actually need the cache_dir! * Update docstring * Add test * Add test with custom cache dir too * Update model repo path	2024-02-23 12:46:31 +00:00
Sanchit Gandhi	2a9b1f80c4	[Gemma] Fix eager attention (#29187 ) * fix modelling code * add tests * fix tests * add some logit tests * style * fix fix	2024-02-22 01:07:52 +01:00
Arthur	594c1277b2	[ `gemma`] Adds support for Gemma 💎 (#29167 ) * inital commit * update * update conversion checkpoint * update conversion script * nits * some fixes * nits * merge * fix permute * nits * fix * nits * nits * nits * fix rope * fix both rope * nites * style * make sure flax works * fix flax init code * fix foward * nits * print flax generation out * current code * nits * SIIIIIIIIIIIIIIIIIII * update * add new tokenizer * correct fast tokenizer * fix conversion * more comments * fix modeling and conversion * nits and nits * nits testing * add some tokenization tests * add some edge cases * add slow tests and fix them * fixup * fix copies for modeling * fix copies * add 7B slow tests * fix * fix * fix tests * make tokenizer cis go green * styling * last tokenizer nits * update jax tests * fix flax for 7b * add jit testing 🤗 * cleanups * isolated nit, inv_freq for rotary_emb.inv_freq * propagate to jax * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adjust test * fix conversion script * change name * correct file names * update conversion script * Fix bos and eos token ids in the model configuration (#3) * update modelling * update conversion script * add static cache for gemma * fix sdpa generate * fix batched * multiple fixes * fix FA2 * final fix * Rename a few missing strings and filenames (#4) * merge with upstream main * fix copies * fix copies * fix fixup * fix fixup * fix * fix * final tests * fix fx gemma tests * fix fx bf16/fp16 tests * update slow fx tests * fx slow tests: one logits, one generation * move jit test standalone * Apply suggestions from code review * nits * tokenizer updates * more tokenization updates: custom GemmaSentencepieceExtrator * style * Update src/transformers/cache_utils.py * Update src/transformers/models/gemma/__init__.py * Update tests/models/gemma/test_modeling_flax_gemma.py * small nits * style * update tokenization test * fix the rotary embedding * with style * fix slow tests * WARNING this commit might be very important for precisions * Update tests/models/gemma/test_modeling_flax_gemma.py * Update src/transformers/models/gemma/configuration_gemma.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Update src/transformers/models/gemma/modeling_flax_gemma.py Co-authored-by: Lysandre Debut <hi@lysand.re> * small nits here and there! * forgotten nit * remove on the fly computation of inv_freq * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_flax_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * nit conversion script link * fix some tests * add not doctest and pr doctest * repo consistency * fix last CIs 🚀 * update all readmes --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-02-21 14:21:28 +01:00
Ekaterina Aidova	1d0ea7abe0	support SDPA Attention in stablelm (#29106 ) * support SDPA Attention in stablelm * add integration test * add fallback for output_attentions * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/stablelm/test_modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * handle non-contiguous states --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-21 13:12:49 +01:00
Joao Gante	3994fa5baf	🚨 Llama: update rope scaling to match static cache changes (#29143 )	2024-02-21 09:47:41 +00:00
amyeroberts	e770f0316d	[`pipeline`] Add pool option to image feature extraction pipeline (#28985 ) * Add pool option * PR comments - error message and exact outputs check	2024-02-20 20:22:08 +00:00
Joao Gante	857fd8eaab	Generate: missing generation config eos token setting in encoder-decoder tests (#29146 )	2024-02-20 16:17:51 +00:00
Pablo Montalvo	1c81132e80	Raise unused kwargs image processor (#29063 ) * draft processor arg capture * add missing vivit model * add new common test for image preprocess signature * fix quality * fix up * add back missing validations * quality * move info level to warning for unused kwargs	2024-02-20 16:20:20 +01:00
amyeroberts	0996a10077	Revert low cpu mem tie weights (#29135 ) * Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)" This reverts commit `725f4ad1cc`. * Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)" This reverts commit `4156f517ce`.	2024-02-20 12:06:46 +00:00
Arthur	15cfe38942	[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues (#28010 ) * add add_dummy_prefix_space option to slow * checking kwargs might be better. Should be there for all spm tokenizer IMO * nits * fix copies * more copied * nits * add prefix space * nit * nits * Update src/transformers/convert_slow_tokenizer.py * fix inti * revert wrong styling * fix * nits * style * updates * make sure we use slow tokenizer for conversion instead of looking for the decoder * support llama ast well * update llama tokenizer fast * nits * nits nits nits * update the doc * update * update to fix tests * skip unrelated tailing test * Update src/transformers/convert_slow_tokenizer.py * add proper testing * test decode as well * more testing * format * fix llama test * Apply suggestions from code review	2024-02-20 12:50:31 +01:00
Younes Belkada	efdd436663	FIX [`PEFT` / `Trainer` ] Handle better peft + quantized compiled models (#29055 ) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment	2024-02-20 12:45:08 +01:00
Joao Gante	a7755d2409	Generate: unset GenerationConfig parameters do not raise warning (#29119 )	2024-02-20 11:34:31 +00:00
Joao Gante	7d312ad2e9	Llama: fix batched generation (#29109 )	2024-02-20 10:23:17 +00:00
Younes Belkada	ff76e7c212	FIX [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit tests (#29122 ) * forgot to push the changes for 4bit .. * trigger CI	2024-02-20 11:11:15 +01:00
Younes Belkada	f7ef7cec6c	FEAT [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082 ) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-20 02:43:02 +01:00
Titus	5ce90f3212	Bnb test fix for different hardwares (#29066 ) * generated text on A10G * generated text in CI * Apply suggestions from code review add explanatory comments Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-19 18:04:44 +00:00
Max Baak	08cd694ef0	ENH: added new output_logits option to generate function (#28667 ) output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logit scores, ie. the values before they undergo logit processing and/or warping. The latter happens by default for the regular output scores. It's useful to have the unprocessed logit scores in certain circumstances. For example, unprocessed logit scores are very useful with causallm models when one wants to determine the probability of a certain answer, e.g. when asking a question with a yes/no answer. In that case getting the next-token probabilities of both "yes" and "no" (and/or their relative ratio) is of interest for classification. The reason for getting these _before_ logit processing and/or warping is b/c a) that can change the probabilities or b) reject the tokens of interest / reduce the number of tokens to just 1. For an example use-case see paper TabLLM: Few-shot Classification of Tabular Data with Large Language Models by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. https://arxiv.org/abs/2210.10723 In addition: - added dedicated unit test: tests/generation/test_utils/test_return_unprocessed_logit_scores which tests return of logics with output_logits=True in generation. - set output_logits=True in all other generation unit tests, that also have output_scores=True. Implemented @gante's and @amyeroberts review feedback Co-authored-by: kx79wq <max.baak@ing.com>	2024-02-19 17:34:17 +00:00
Lysandre Debut	9830858671	Fix the `bert-base-cased` tokenizer configuration test (#29105 ) Fix test	2024-02-19 13:23:25 +01:00
Younes Belkada	a75a6c9315	FIX [`bnb` / `tests`]: Fix currently failing bnb tests (#29092 ) Update test_mixed_int8.py	2024-02-19 10:39:12 +01:00
Matt	2f1003be86	Add chat support to text generation pipeline (#28945 ) * Add chat support to text generation pipeline * Better handling of single elements * Deprecate ConversationalPipeline * stash commit * Add missing add_special_tokens kwarg * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline * Add ✨TF✨ tests * @require_tf * Add type hint * Add specific deprecation version * Remove unnecessary do_sample * Remove todo - the discrepancy has been resolved * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/pipelines/text_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 16:41:01 +00:00
Zach Mueller	636b03244c	Fix trainer test wrt DeepSpeed + auto_find_bs (#29061 ) * FIx trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 10:04:24 -05:00
Richard Lee	be42c24d14	Honor trust_remote_code for custom tokenizers (#28854 ) * pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config add test * change directories back to previous directory after test * fix ruff check * Add a note to that block for future in case we want to remove it later --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2024-02-16 13:40:23 +00:00
Sourab Mangrulkar	b262808656	fix failing trainer ds tests (#29057 )	2024-02-16 17:18:45 +05:30
Jonathan Mamou	258da40efd	fix num_assistant_tokens with heuristic schedule (#28759 ) * fix heuristic num_assistant_tokens_schedule * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update utils.py check that candidate_generator.assistant_model exists since some some speculations (like ngram and PLD) don't have assistant_model attribute * Update src/transformers/generation/candidate_generator.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * merge conflict * fix docstring * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 11:44:58 +00:00
Raushan Turganbay	aee11fe427	Fix max_length criteria when using inputs_embeds (#28994 ) * fix max_length for inputs_embeds * make style * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Static Cache: load models with MQA or GQA (#28975) * fix * fix tests * fix tests * Update src/transformers/generation/utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more fixes * make style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-16 11:25:12 +00:00
Lysandre Debut	f497f564bb	Update all references to canonical models (#29001 ) * Script & Manual edition * Update	2024-02-16 08:16:58 +01:00
amyeroberts	4156f517ce	Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043 ) * Patch to skip currently failing tests * Whoops - wrong place	2024-02-15 17:26:33 +00:00
Donggeun Yu	5b6fa2306a	DeformableDetrModel support fp16 (#29013 ) * Update ms_deform_attn_cuda.cu * Update ms_deform_attn_cuda.cuh * Update modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_deformable_detr.py * python utils/check_copies.py --fix_and_overwrite * Fix dtype missmatch error * Update test_modeling_deformable_detr.py * Update test_modeling_deformable_detr.py * Update modeling_deformable_detr.py * Update modeling_deformable_detr.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-15 12:31:09 +00:00
Arthur	f3788b09e1	Fix static generation when compiling! (#28937 ) * wow I was scared! * fix everything * nits * make it BC? * add todo * nits * is_tracing should still be used to pass tracing tests * nits * some nits to make sure genration works with static cache uncompiled * fix sdpa * fix FA2 for both static and dynamic in a better way? * style * fix-copies * fix fix copies * fix sequential beam searcg * style * use `keys_to_ignore` * nit * correct dtype inference when init * :( the fix for FA2 is still not optimal to investigate! * styling * nits * nit * this might work better * add comment * Update src/transformers/models/llama/modeling_llama.py * "position_ids" -> "cache_position" * style * nit * Remove changes that should no be propagatted just yet * Apply suggestions from code review * Styling * make sure we raise an errir for static cache with FA2 enabled * move to the bottom of the signature * style * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py * nit in the name --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-15 06:27:40 +01:00
Younes Belkada	7a0fccc6eb	FIX [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009 ) * fix trainer tags * add test	2024-02-14 23:56:35 +01:00
amyeroberts	0199a484eb	Backbone kwargs in config (#28784 ) * Enable instantiating model with pretrained backbone weights * Clarify pretrained import * Use load_backbone instead * Add backbone_kwargs to config * Pass kwargs to constructors * Fix up * Input verification * Add tests * Tidy up * Update tests/utils/test_backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-14 20:46:44 +00:00
JB (Don)	725f4ad1cc	Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948 ) * Add tie_weights() to LM heads and set bias in set_output_embeddings() The bias were not tied correctly in some LM heads, and this change should fix that. * Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin * Adding _tie_weights() to MPNet and Vilt * Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device * Rename to test name to save_load to match the convention	2024-02-14 20:39:01 +00:00
Raushan Turganbay	354775bc57	Fix flaky test vision encoder-decoder generate (#28923 )	2024-02-14 15:40:57 +00:00
Zach Mueller	0507e69d34	Introduce AcceleratorConfig dataclass (#28664 ) * Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-02-14 10:18:09 -05:00
Huazhong Ji	69ca640dd6	Set the dataset format used by `test_trainer` to float32 (#28920 ) Co-authored-by: unit_test <test@unit.com>	2024-02-14 13:55:12 +00:00
Andrei Panferov	1ecf5f7c98	AQLM quantizer support (#28928 ) * aqlm init * calibration and dtypes * docs * Readme update * is_aqlm_available * Simpler link in docs * Test TODO real reference * init _import_structure fix * AqlmConfig autodoc * integration aqlm * integrations in tests * docstring fix * legacy typing * Less typings * More kernels information * Performance -> Accuracy * correct tests * remoced multi-gpu test * Update docs/source/en/quantization.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Brought back multi-gpu tests * Update src/transformers/integrations/aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/aqlm_integration/test_aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-02-14 09:25:41 +01:00
NielsRogge	63ffd56d02	Add SiglipForImageClassification and CLIPForImageClassification (#28952 ) * First draft * Add CLIPForImageClassification * Remove scripts * Fix doctests	2024-02-14 08:41:31 +01:00

1 2 3 4 5 ...

3453 Commits