Lysandre Debut
f497f564bb
Update all references to canonical models (#29001)
...
* Script & Manual edition
* Update
2024-02-16 08:16:58 +01:00
Klaus Hipp
fe3df9d5b3
[Docs] Add language identifiers to fenced code blocks (#28955)
...
Add language identifiers to code blocks
2024-02-12 10:48:31 -08:00
Klaus Hipp
2749e479f3
[Docs] Fix broken links and syntax issues (#28918)
...
* Fix model documentation links in attention.md
* Fix external link syntax
* Fix target anchor names of section links
* Fix copyright statement comments
* Fix documentation headings
2024-02-08 14:13:35 -08:00
Hamza FILALI
002566f398
Improving Training Performance and Scalability Documentation (#28497)
...
* Improving Training Performance and Scaling documentation by adding PEFT techniques to suggestions to reduce memory requirements for training
* Update docs/source/en/perf_train_gpu_one.md
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-01-16 11:30:26 +01:00
fxmarty
c13a43aaf2
Reflect RoCm support in the documentation (#27636)
...
* reflect RoCm support in the documentation
* Update docs/source/en/main_classes/trainer.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* fix review comments
* use ROCm instead of RoCm
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-11-25 00:59:17 +09:00
Maria Khalusova
9beb2737d7
[docs] fixed links with 404 (#27327)
...
* fixed links with 404
* make style
2023-11-06 19:45:03 +00:00
Younes Belkada
368a58e61c
[core] Integrate Flash attention 2 in most used models (#25598)
...
* v1
* oops
* working v1
* fixup
* add some TODOs
* fixup
* padding support + try with module replacement
* nit
* alternative design
* oops
* add `use_cache` support for llama
* v1 falcon
* nit
* a bit of refactor
* nit
* nits nits
* add v1 padding support falcon (even though it seemed to work before)
* nit
* falcon works
* fixup
* v1 tests
* nit
* fix generation llama flash
* update tests
* fix tests + nits
* fix copies
* fix nit
* test- padding mask
* stype
* add more mem efficient support
* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fixup
* nit
* fixup
* remove it from config when saving
* fixup
* revert docstring
* add more checks
* use values
* oops
* new version
* fixup
* add same trick for falcon
* nit
* add another test
* change tests
* fix issues with GC and also falcon
* fixup
* oops
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add init_rope
* updates
* fix copies
* fixup
* fixup
* more clarification
* fixup
* right padding tests
* add docs
* add FA in docker image
* more clarifications
* add some figures
* add todo
* rectify comment
* Change to FA2
* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* split in two lines
* change test name
* add more tests
* some clean up
* remove `rearrange` deps
* add more docs
* revert changes on dockerfile
* Revert "revert changes on dockerfile"
This reverts commit 8d72a66b4b9b771abc3f15a9b9506b4246d62d8e.
* revert changes on dockerfile
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* address some comments
* docs
* use inheritance
* Update src/transformers/testing_utils.py
Co-authored-by: Lysandre Debut <hi@lysand.re>
* fixup
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
* final comments
* clean up
* style
* add cast + warning for PEFT models
* fixup
---------
Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
2023-09-22 17:42:10 +02:00
Vibhor Kumar
99fc3ac8ac
Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer (#25807)
...
* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-31 10:55:10 +01:00
Younes Belkada
940d1a76b0
[Docs / BetterTransformer] Added more details about flash attention + SDPA (#25265)
...
* added more details about flash attention
* correct and add more details
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* few modifs
* more details
* up
* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* adapt from suggestion
* Apply suggestions from code review
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
* trigger CI
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix nits and copies
* add new section
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
2023-08-18 10:32:28 +02:00
Xuehai Pan
6bc61aa7af
Set TF32 flag for PyTorch cuDNN backend (#25075)
2023-07-25 08:04:48 -04:00
Maria Khalusova
75317aefb3
[docs] Performance docs tidy up, part 1 (#23963)
...
* first pass at the single gpu doc
* overview: improved clarity and navigation
* WIP
* updated intro and deepspeed sections
* improved torch.compile section
* more improvements
* minor improvements
* make style
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* mdx -> md
* link fix
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-24 08:57:24 -04:00
Sylvain Gugger
eb849f6604
Migrate doc files to Markdown. (#24376)
...
* Rename index.mdx to index.md
* With saved modifs
* Address review comment
* Treat all files
* .mdx -> .md
* Remove special char
* Update utils/tests_fetcher.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2023-06-20 18:07:47 -04:00