Eric Bezzam
2c0af41ce5
Fix bad markdown links ( #39819 )
...
Fix bad markdown links.
2025-07-31 09:14:14 -07:00
Tommy Chiang
4fcf455517
Fix broken links ( #39809 )
...
Replace links of the form `[text]((url))` with `[text](url)`, which is
the correct Markdown link syntax.
2025-07-31 13:23:04 +00:00
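The kind of replacement described in the commit above can be sketched with a regular expression. This is only an illustration, not the script used in the PR: the function name `fix_double_paren_links` and the pattern are hypothetical, and the pattern assumes the URL itself contains no parentheses or whitespace.

```python
import re

# Matches [text]((url)): link text followed by a URL wrapped in doubled parentheses.
# Assumption: the URL contains no parentheses or whitespace.
DOUBLE_PAREN_LINK = re.compile(r"\[([^\]]+)\]\(\(([^()\s]+)\)\)")


def fix_double_paren_links(markdown: str) -> str:
    """Rewrite [text]((url)) as [text](url); well-formed links are left untouched."""
    return DOUBLE_PAREN_LINK.sub(r"[\1](\2)", markdown)


broken = "See [the docs]((https://example.com/docs)) and [home](https://example.com)."
print(fix_double_paren_links(broken))
```

Only the doubled-parenthesis link is rewritten; the second, already valid link passes through unchanged.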
Joao Gante
6c3f27ba61
[docs] fix korean docs yet again ( #39813 )
...
fix korean docs yet again
2025-07-31 09:13:25 +00:00
Joao Gante
33aa49df9d
[docs] Ko doc fixes after toc update ( #39660 )
...
* update docs
* doc builder working
* make fixup
2025-07-29 17:05:26 +01:00
Jaehyeon Shin
1d061536cf
🌐 [i18n-KO] Translated how_to_hack_models.md to Korean ( #39536 )
...
* docs: ko: how_to_hack_models.md
* feat: nmt draft
* fix: manual edits
2025-07-29 08:09:16 -07:00
박종범
43fe41c0a8
🌐 [i18n-KO] Translated perf_train_gpu_one.md to Korean ( #39552 )
...
* docs: ko: perf_train_gpu_one.md
* feat: nmt draft
* fix: manual edits
* fix: Manually added missing backticks
* Update docs/source/ko/perf_train_gpu_one.md
fix: remove space between heading and GPU anchor
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Update docs/source/ko/perf_train_gpu_one.md
fix: clarify table headers to indicate training speed boost and memory savings
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Update docs/source/ko/perf_train_gpu_one.md
fix: improve readability
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/perf_train_gpu_one.md
fix: rephrase explanation of data preloading to improve readability
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
---------
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
2025-07-29 08:08:57 -07:00
Ahn Joon Sung
9f38763731
🌐 [i18n-KO] Translated pipeline_gradio.md to Korean ( #39520 )
...
* docs: ko: pipeline_gradio.md
* feat: nmt draft
* fix: manual edits
* docs: ko: pipeline_gradio.md
2025-07-29 08:04:30 -07:00
Lio (임승섭)
f72311796b
🌐 [i18n-KO] Translated tokenizer.md to Korean ( #39532 )
...
* docs: ko: tokenizer.md
* feat: nmt draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Yijun Lee <yijun-lee@users.noreply.github.com >
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
* fix: resolve suggestions
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
---------
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
2025-07-29 08:04:14 -07:00
Kim Juwon
d346d46752
🌐 [i18n-KO] Translated tvp.md to Korean ( #39578 )
...
* docs: ko: tvp.md
* feat: nmt draft
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
---------
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
2025-07-29 08:04:00 -07:00
Ahnjj_DEV
2f59c15b33
🌐 [i18n-KO] Translated albert.md to Korean ( #39524 )
...
* docs: ko: albert.md
* feat: nmt draft
* fix: manual edits
2025-07-29 08:03:40 -07:00
Minseo Kim
98386dcee9
🌐 [i18n-KO] Translated main_classes/peft.md ( #39515 )
...
* docs: ko: main_classes/peft.md
* feat: nmt draft
* docs: add missing TOC to documentation for `PeftAdapterMixin` section
Added a table of contents (TOC) to the documentation, specifically for the `transformers.integrations.PeftAdapterMixin` section, following the structure and content outlined in [this link](https://huggingface.co/docs/transformers/main/en/main_classes/peft#transformers.integrations.PeftAdapterMixin).
* fix: Improve naturalness of purpose expression in Korean
Changed '관리하기 위한' to '관리할 수 있도록' for more natural Korean expression when describing the purpose of providing functions.
* fix: Simplify plural form and make expression more concise
Changed '~할 수 없기 때문에' to '~할 수 없어' for more concise expression while maintaining clarity.
* fix: Replace technical term '주입' with more natural '적용'
Changed '주입할 수 없어' to '적용할 수 없어' for better readability.
Considered alternatives:
'삽입': Too literal translation of 'inject'
'입력': Could be misunderstood as data input
'통합': Implies merging two systems
'추가': Simple but less precise
'적용' was chosen as it's the most natural and widely used term in Korean technical documentation for this context.
* fix: update toctree path for PEFT to lowercase
Changed the toctree path from 'PEFT' (uppercase) to 'peft' (lowercase) to match the correct directory naming convention and prevent broken links.
* docs: update as per reviewer feedback after rebase
2025-07-29 08:03:17 -07:00
lgai-exaone
c06d4cd6ce
Add EXAONE 4.0 model ( #39129 )
...
* Add EXAONE 4.0 model
* Refactor EXAONE 4.0 modeling code
* Fix cache slicing on SWA + FA2
* Fix cache slicing on FA2 + HybridCache
* Update EXAONE 4.0 modeling code for main branch
* Update o_proj for asymmetric projection
* Address PR feedback
* Add EXAONE 4.0 docs
* Update EXAONE 4.0 modeling code for main branch
* update
* fix updates
* updates
* fix
* fix
* fix
---------
Co-authored-by: Arthur <arthur.zucker@gmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-07-25 19:58:28 +02:00
Lysandre Debut
f90de364c2
Rename huggingface_cli to hf ( #39630 )
...
* Rename huggingface_cli to hf
* hfh
2025-07-25 14:10:04 +02:00
Joao Gante
e3760501b0
[docs] fix ko cache docs ( #39644 )
...
fix ko docs
2025-07-25 10:06:03 +01:00
Woojun Jung
601260fd96
Update docs/source/ko/_toctree.yml ( #39516 )
...
docs: update `docs/source/ko/_toctree.yml`
2025-07-22 09:00:42 -07:00
Manuel de Prada Corral
c338fd43b0
[cache refactor] Move all the caching logic to a per-layer approach ( #39106 )
...
* Squash for refactor: Replace monolithic cache classes with modular LayeredCache (#38077)
- Introduces CacheLayer and Cache base classes
- Ports Static, Dynamic, Offloaded, Quantized, Hybrid, etc. to use layers
- Implements method/attr dispatch across layers to reduce boilerplate
- Adds CacheProcessor hooks for offloading, quantization, etc.
- Updates and passes tests
* fix quantized, add tests
* remove CacheProcessorList
* raushan review, arthur review
* joao review: minor things
* remove cache configs, make CacheLayer a mixin (joaos review)
* back to storage inside Cache()
* remove cachebase for decorator
* no more __getattr__
* fix tests
* joaos review except docs
* fix ast deprecations for python 3.14: replace node.n by node.value and use `ast.Constant`
More verbose exceptions in `fix_docstring` on docstring formatting issues.
* Revert "back to storage inside Cache()"
This reverts commit 27916bc2737806bf849ce2148cb1e66d59573913.
* cyril review
* simplify cache export
* fix lfm2 cache
* HybridChunked to layer
* BC proxy object for cache.key_cache[i]=...
* reorder classes
* bfff come on LFM2
* better tests for hybrid and hybridChunked
* complete coverage for hybrid chunked caches (prefill chunking)
* reimplementing HybridChunked
* cyril review
* fix ci
* docs for cache refactor
* docs
* oopsie
* oopsie
* fix after merge
* cyril review
* arthur review
* opsie
* fix lfm2
* opsie2
2025-07-22 16:10:25 +02:00
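One bullet in the commit above mentions replacing `node.n` with `node.value` and using `ast.Constant`. A minimal sketch of that pattern with the standard-library `ast` module follows. The helper `numeric_literals` is hypothetical and is not taken from the repository's `fix_docstring` code.

```python
import ast


def numeric_literals(source: str) -> list:
    """Collect int/float literal values from Python source via ast.Constant."""
    values = []
    for node in ast.walk(ast.parse(source)):
        # Older code matched a dedicated numeric node type and read `node.n`;
        # here every literal is an ast.Constant and the payload is `node.value`.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            # bool is a subclass of int, so filter it out explicitly.
            if not isinstance(node.value, bool):
                values.append(node.value)
    return values


print(sorted(numeric_literals("x = 1 + 2.5\ny = True")))
```

Because `ast.Constant` also covers strings, booleans, and `None`, the type check on `node.value` replaces the distinction that the older node classes used to make.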
김민서
2da97f0943
🌐 [i18n-KO] Translated perf_infer_gpu_multi.md to Korean ( #39441 )
...
* docs: ko: perf_infer_gpu_many.md
* feat: nmt draft
* docs: refine KO translation and enhance naturalness
* docs: add missing TOC to documentation
* Align toctree and filename with original: perf_infer_gpu_multi
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Refine Korean translation
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
* Update docs/source/ko/perf_infer_gpu_multi.md
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
---------
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com >
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
2025-07-21 09:14:15 -07:00
MaCAT
4652677c89
🌐 [i18n-KO] Translated quark.md to Korean ( #39268 )
...
* initial translation
* removed english parts
* maintain consistency
* Update docs/source/ko/quantization/quark.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Update docs/source/ko/quantization/quark.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Update docs/source/ko/quantization/quark.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* Update docs/source/ko/quantization/quark.md
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
* add toctree
* fixed indentation
---------
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
2025-07-09 09:29:51 -07:00
Joao Gante
6f1a43896c
[CI] fix docs ( #39273 )
...
* fix docs
* add ko glossary file to toctree
2025-07-08 11:31:03 +01:00
Joosun Hwang
9698052560
Add Korean translation for glossary.md ( #38804 )
...
* Add Korean translation for glossary.md
* Update docs/source/ko/glossary.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/glossary.md
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
---------
Co-authored-by: Joosun40 <77312900+Joosun40@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
2025-07-07 09:12:55 -07:00
Joao Gante
1d45d90e5d
[tests] remove TF tests (uses of require_tf) ( #38944 )
...
* remove uses of require_tf
* remove redundant import guards
* this class has no tests
* nits
* del tf rng comment
2025-06-25 17:29:10 +00:00
Matt
508a704055
No more Tuple, List, Dict ( #38797 )
...
* No more Tuple, List, Dict
* make fixup
* More style fixes
* Docstring fixes with regex replacement
* Trigger tests
* Redo fixes after rebase
* Fix copies
* [test all]
* update
* [test all]
* update
* [test all]
* make style after rebase
* Patch the hf_argparser test
* Patch the hf_argparser test
* style fixes
* style fixes
* style fixes
* Fix docstrings in Cohere test
* [test all]
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-17 19:37:18 +01:00
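The title of the commit above suggests that `typing.Tuple`, `typing.List`, and `typing.Dict` annotations were replaced with the built-in generics `tuple`, `list`, and `dict`. That reading is an assumption based on the title alone. A small sketch of the style change, using a hypothetical function `count_tokens` and assuming Python 3.9 or later:

```python
# Older style (shown only for comparison):
#   from typing import Dict, List, Tuple
#   def count_tokens(batch: List[str]) -> Dict[str, Tuple[int, int]]: ...


# Newer style: built-in generics, no typing import needed (Python 3.9+).
def count_tokens(batch: list[str]) -> dict[str, tuple[int, int]]:
    """Map each string to (character count, whitespace-separated token count)."""
    return {text: (len(text), len(text.split())) for text in batch}


print(count_tokens(["hello world", "hi"]))
```

The runtime behavior is identical in both styles; only the annotations change.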
Quentin Gallouédec
c989ddd294
Simplify and update trl examples ( #38772 )
...
* Simplify and update trl examples
* Remove optim_args from SFTConfig in Trainer documentation
* Update docs/source/en/trainer.md
* Apply suggestions from code review
* Update docs/source/en/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Quentin Gallouédec <qgallouedec@Quentins-MacBook-Pro.local >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-13 12:03:49 +00:00
Quentin Gallouédec
de24fb63ed
Use HF papers ( #38184 )
...
* Use hf papers
* Hugging Face papers
* doi to hf papers
* style
2025-06-13 11:07:09 +00:00
Cyril Vallez
4b8ec667e9
Remove all traces of low_cpu_mem_usage ( #38792 )
...
* remove it from all py files
* remove it from the doc
* remove it from examples
* style
* remove traces of _fast_init
* Update test_peft_integration.py
* CIs
2025-06-12 16:39:33 +02:00
Joao Gante
beaed8ce01
[generate] move SinkCache to a custom_generate repo ( #38399 )
...
remove sink cache
2025-06-02 12:13:30 +02:00
Jinan Zhou
135163e9c5
Expose AutoModelForTimeSeriesPrediction for import ( #38307 )
...
* expose AutoModelForTimeSeriesPrediction for import
* add in docs
2025-05-23 13:09:29 +00:00
Kyungmin Lee
7db5d5b9ea
Fix typo ( #37964 )
2025-05-06 14:59:00 +01:00
Bogeum Kim
d20aa68193
🌐 [i18n-KO] Translated gpu_selection.md to Korean ( #36757 )
...
* Add _toctree.yml
* feat: serving.md draft
* Add _toctree.yml
* feat: gpu_selection.md nmt draft
* fix: TOC edit
* Update docs/source/ko/serving.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/ko/gpu_selection.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/ko/serving.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update _toctree.yml
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-01 08:44:12 -07:00
Lysandre Debut
d538293f62
Transformers cli clean command ( #37657 )
...
* transformers-cli -> transformers
* Chat command works with positional argument
* update doc references to transformers-cli
* doc headers
* deepspeed
---------
Co-authored-by: Joao Gante <joao@huggingface.co >
2025-04-30 12:15:43 +01:00
Kim Juwon
50f8caaa48
🌐 [i18n-KO] Translated electra.md to Korean ( #36763 )
...
* docs: ko: electra.md
* feat: nmt draft
* fix: manual edits
* fix: manual edits
2025-04-29 14:03:39 -07:00
Minki Kim
eb4afdd1fb
[i18n-KO] Translated keypoint_detection.md to Korean ( #36649 )
...
* fix: manual edits
* fix: manual edits
* fix: manual edits
* Update docs/source/ko/tasks/keypoint_detection.md
Anchor lower modify
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
connect letter
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
modify to usual words
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
modify extension word
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
modify to usual words
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
modify to usual words
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
* Update docs/source/ko/tasks/keypoint_detection.md
modify to usual representation
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
---------
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-25 12:24:12 -07:00
김가영
7bb619d710
🌐 [i18n-KO] Translated roberta.md to Korean ( #37069 )
...
* docs: ko: roberta.md
* fix: manual edits
* Apply suggestions from code review
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
---------
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com >
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com >
2025-04-24 10:00:24 -07:00
Jinyong Lee
fbfa1dd4db
🌐 [i18n-KO] Translated siglip.md to Korean ( #37145 )
...
* docs: ko: siglip.md
* feat: nmt draft
* fix: manual edits
* chore: Correct document title to kebab-case format
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply suggestions from code review
Convert unnatural language to natural Korean
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: Yijun Lee <119404328+yijun-lee@users.noreply.github.com >
2025-04-22 12:23:19 -07:00
Pavel Iakubovskii
4f58fc9c82
Deprecate modeling_utils.py classes ( #37298 )
...
* Move utils classes into models
* Add deprecation warnings
* Remove from docs
* Update config attributes check
2025-04-18 18:47:34 +01:00
Cypher Pepe
7bff4bdcf6
Fixed broken links ( #37466 )
...
* Update broken link
* Update broken link
2025-04-14 14:16:07 +01:00
Joao Gante
aaf129cdae
[agents] remove agents 🧹 ( #37368 )
2025-04-11 18:42:37 +01:00
Mehant Kammakomati
7d76876498
(Part 2) feat: allow for tp_size attr for tplizing the model ( #37054 )
...
* feat: custom tp_size, new transformers tp interface
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: review cmt - error when tp_plan not set for tp_size
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: nit in docs
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com >
2025-04-10 17:44:09 +02:00
Joao Gante
9a1c1fe7ed
[CI] green llama tests ( #37244 )
...
* green llama tests
* use cleanup instead
* better test comment; cleanup upgrade
* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
MinJu-Ha
0d6a60fe55
🌐 [i18n-KO] Translated qwen2_vl.md to Korean ( #36750 )
...
* fix: manual edits
* fix: resolve suggestions
* Update toctree.yml
2025-03-30 15:00:27 -07:00
MaCAT
25992b493c
🌐 [i18n-KO] Translated codegen.md to Korean ( #36698 )
...
* Initial translation
* Add _toctree.yml
2025-03-14 09:31:18 -07:00
Matt
1e4286fd59
Remove research projects ( #36645 )
...
* Remove research projects
* Add new README to explain where the projects went
* Trigger tests
* Cleanup all references to research_projects
2025-03-11 13:47:38 +00:00
Shaohon Chen
0440dbc0e1
Integrate SwanLab for offline/online experiment tracking and local visualization ( #36433 )
...
* add swanlab integration
* feat(integrate): add SwanLab as an optional experiment tracking tool in transformers
- Integrated SwanLab into the transformers library as an alternative for experiment tracking.
- Users can now log training metrics, hyperparameters, and other experiment details to SwanLab by setting `report_to="swanlab"` in the `TrainingArguments`.
- Added necessary dependencies and documentation for SwanLab integration.
* Fix the spelling error of SwanLabCallback in callback.md
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Fix typo in comment
* Fix typo in comment
* Fix typos and update comments
* fix annotation
* chore: opt some comments
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: AAssets <20010618@qq.com >
Co-authored-by: ZeYi Lin <944270057@qq.com >
Co-authored-by: KAAANG <79990647+SAKURA-CAT@users.noreply.github.com >
2025-03-06 17:35:30 +01:00
Joao Gante
99adc74462
[tests] remove flax-pt equivalence and cross tests ( #36283 )
2025-02-19 15:13:27 +00:00
Joao Gante
0863eef248
[tests] remove pt_tf equivalence tests ( #36253 )
2025-02-19 11:55:11 +00:00
Mehant Kammakomati
c3ba53303b
feat: add support for tensor parallel training workflow with accelerate ( #34194 )
...
* feat: add support for tensor parallel flow using accelerate
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: add tp degree to env variable
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: add version check for accelerate to allow TP
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* docs: tensor parallelism
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* nit: rename plugin name
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: guard accelerate version before allow tp
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* docs: add more docs and updates related to TP
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-02-18 14:05:46 +01:00
Thomas Bauwens
8f137b2427
Move DataCollatorForMultipleChoice from the docs to the package ( #34763 )
...
* Add implementation for DataCollatorForMultipleChoice based on docs.
* Add DataCollatorForMultipleChoice to import structure.
* Remove custom DataCollatorForMultipleChoice implementations from example scripts.
* Remove custom implementations of DataCollatorForMultipleChoice from docs in English, Spanish, Japanese and Korean.
* Refactor torch version of DataCollatorForMultipleChoice to be more easily understandable.
* Apply suggested changes and run make fixup.
* fix copies, style and fixup
* add missing documentation
* nits
* fix docstring
* style
* nits
* isort
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com >
2025-02-13 12:01:28 +01:00
Stas Bekman
9dc1efa5d4
DeepSpeed github repo move sync ( #36021 )
...
deepspeed github repo move
2025-02-05 08:19:31 -08:00
Joao Gante
90b46e983f
Remove old benchmark code ( #35730 )
...
* remove traces of the old deprecated benchmarks
* also remove old tf benchmark example, which uses deleted code
* run doc builder
2025-01-21 17:56:43 +00:00
Woojun Jung
3b1be043cd
🌐 [i18n-KO] Remove duplicates in toctree ( #35496 )
...
fix(docs): remove duplicates in toctree
2025-01-06 09:14:22 -08:00