Arthur
594c1277b2
[ gemma] Adds support for Gemma 💎 (#29167)
* inital commit
* update
* update conversion checkpoint
* update conversion script
* nits
* some fixes
* nits
* merge
* fix permute
* nits
* fix
* nits
* nits
* nits
* fix rope
* fix both rope
* nites
* style
* make sure flax works
* fix flax init code
* fix foward
* nits
* print flax generation out
* current code
* nits
* SIIIIIIIIIIIIIIIIIII
* update
* add new tokenizer
* correct fast tokenizer
* fix conversion
* more comments
* fix modeling and conversion
* nits and nits
* nits testing
* add some tokenization tests
* add some edge cases
* add slow tests and fix them
* fixup
* fix copies for modeling
* fix copies
* add 7B slow tests
* fix
* fix
* fix tests
* make tokenizer cis go green
* styling
* last tokenizer nits
* update jax tests
* fix flax for 7b
* add jit testing 🤗
* cleanups
* isolated nit, inv_freq for rotary_emb.inv_freq
* propagate to jax
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* adjust test
* fix conversion script
* change name
* correct file names
* update conversion script
* Fix bos and eos token ids in the model configuration (#3)
* update modelling
* update conversion script
* add static cache for gemma
* fix sdpa generate
* fix batched
* multiple fixes
* fix FA2
* final fix
* Rename a few missing strings and filenames (#4)
* merge with upstream main
* fix copies
* fix copies
* fix fixup
* fix fixup
* fix
* fix
* final tests
* fix fx gemma tests
* fix fx bf16/fp16 tests
* update slow fx tests
* fx slow tests: one logits, one generation
* move jit test standalone
* Apply suggestions from code review
* nits
* tokenizer updates
* more tokenization updates: custom GemmaSentencepieceExtrator
* style
* Update src/transformers/cache_utils.py
* Update src/transformers/models/gemma/__init__.py
* Update tests/models/gemma/test_modeling_flax_gemma.py
* small nits
* style
* update tokenization test
* fix the rotary embedding
* with style
* fix slow tests
* WARNING this commit might be very important for precisions
* Update tests/models/gemma/test_modeling_flax_gemma.py
* Update src/transformers/models/gemma/configuration_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Update src/transformers/models/gemma/modeling_flax_gemma.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* small nits here and there!
* forgotten nit
* remove on the fly computation of inv_freq
* revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_flax_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_tokenization_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update tests/models/gemma/test_modeling_gemma.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* nit conversion script link
* fix some tests
* add not doctest and pr doctest
* repo consistency
* fix last CIs 🚀
* update all readmes
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-02-21 14:21:28 +01:00
..
2022-11-08 19:54:41 +00:00
2021-02-15 07:55:10 -05:00
2023-04-06 18:08:14 +02:00
2023-05-18 14:14:43 -04:00
2024-02-14 20:46:44 +00:00
2024-02-16 08:16:58 +01:00
2024-02-20 12:08:31 +00:00
2023-08-10 10:53:22 +02:00
2024-02-21 14:21:28 +01:00
2023-11-03 12:47:07 +01:00
2023-08-10 10:53:22 +02:00
2023-12-01 15:51:10 +01:00
2023-03-13 19:11:19 +01:00
2024-01-15 18:36:40 +00:00
2023-06-06 18:17:41 +02:00
2023-12-22 12:56:11 +01:00
2024-02-20 12:08:31 +00:00
2023-08-17 07:58:35 +02:00
2021-02-15 07:55:10 -05:00
2023-11-28 10:05:34 +01:00
2023-08-17 07:58:35 +02:00
2021-10-07 12:44:23 +05:30
2023-02-28 17:12:44 +01:00
2024-01-31 15:58:17 +01:00
2023-02-28 17:12:44 +01:00
2023-02-03 12:57:02 -05:00
2023-04-21 20:36:35 +02:00
2023-03-01 17:53:29 +01:00
2024-02-21 14:21:28 +01:00
2023-10-30 10:48:24 +01:00
2024-01-31 18:04:43 +01:00
2023-03-30 21:06:35 +02:00
2022-06-02 10:24:16 +02:00
2023-08-17 07:58:35 +02:00
2023-11-30 20:24:43 +01:00
2023-08-17 07:58:35 +02:00
2024-01-31 18:04:43 +01:00
2024-02-16 11:31:51 +01:00
2023-11-14 20:05:54 +00:00
2023-04-06 22:52:59 +02:00