LLaMA Implementation (#21955)

* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>

Jason Phang

2023-03-16 09:00:53 -04:00

committed by

GitHub

parent 09922da4a7

commit 0041be5b3d

27 changed files with 1970 additions and 2 deletions

									
docs/source/en/_toctree.yml
									
@@ -319,6 +319,8 @@
         title: Jukebox
       - local: model_doc/led
         title: LED
+      - local: model_doc/llama
+        title: LLaMA
       - local: model_doc/longformer
         title: Longformer
       - local: model_doc/longt5
