Add Nemotron HF Support (#31699)
* Add nemotron support * fix inference * add unit test * add layernorm1p as a class to avoid meta device mismatch * test fixed * Add copied_from statements * remove pretraining_tp args * remove nemotronlayernorm * force LN computation done in FP32 * remove nemotrontokenizer and use llamatokenizer * license update * add option for kv_channels for minitron8b * remove assert * o_proj fixed * o_proj reshape * add gated_proj option * typo * remove todos * fix broken test after merging latest main * remove nezha/nat after meging main * chnage default config to 15b model * add nemo conversion script * rename conversion script * remove gate_proj option * pr comment resolved * fix unit test * rename kv_channels to head_dim * resolve PR issue * add nemotron md * fix broken tests * refactor rope for nemotron * test fix * remove linearscaling * whitespace and import * fix some copied-from * code style fix * reformatted * add position_embedding to nemotronattention * rope refactor to only use config, copied-from fix * format * Run make fix-copies * nemotron md with autodoc * doc fix * fix order * pass check_config_docstrings.py * fix config_attributes * remove all llama BC related code * Use PreTrainedTokenizerFast * ruff check examples * conversion script update * add nemotron to toctree
This commit is contained in:
@@ -222,6 +222,7 @@ Flax), PyTorch, and/or TensorFlow.
|
||||
| [MusicGen Melody](model_doc/musicgen_melody) | ✅ | ❌ | ❌ |
|
||||
| [MVP](model_doc/mvp) | ✅ | ❌ | ❌ |
|
||||
| [NAT](model_doc/nat) | ✅ | ❌ | ❌ |
|
||||
| [Nemotron](model_doc/nemotron) | ✅ | ❌ | ❌ |
|
||||
| [Nezha](model_doc/nezha) | ✅ | ❌ | ❌ |
|
||||
| [NLLB](model_doc/nllb) | ✅ | ❌ | ❌ |
|
||||
| [NLLB-MOE](model_doc/nllb-moe) | ✅ | ❌ | ❌ |
|
||||
|
||||
Reference in New Issue
Block a user