LLaMA Implementation (#21955)

* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:00:53 -04:00
parent 09922da4a7
commit 0041be5b3d
27 changed files with 1970 additions and 2 deletions
--- a/utils/check_config_docstrings.py
+++ b/utils/check_config_docstrings.py
@@ -41,6 +41,7 @@ CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK = {
    "SpeechEncoderDecoderConfig",
    "VisionEncoderDecoderConfig",
    "VisionTextDualEncoderConfig",
+    "LlamaConfig",
 }


--- a/utils/check_repo.py
+++ b/utils/check_repo.py
@@ -57,6 +57,7 @@ PRIVATE_MODELS = [
 # Being in this list is an exception and should **not** be the rule.
 IGNORE_NON_TESTED = PRIVATE_MODELS.copy() + [
    # models to ignore for not tested
+    "LlamaDecoder",  # Building part of bigger (tested) model.
    "Blip2QFormerModel",  # Building part of bigger (tested) model.
    "DetaEncoder",  # Building part of bigger (tested) model.
    "DetaDecoder",  # Building part of bigger (tested) model.