mig-mfreitas
34b43211d7
Add YaRN and Dynamic-YaRN RoPE Scaling Methods (#30910)
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods
YaRN (Yet another RoPE extension method) combines the NTK-By-Parts
Interpolation and Attention Scaling methods, improving upon existing
RoPE interpolation methods for longer context window sizes.
Fine-tuned models maintain their original performance across benchmarks
while enabling efficient extrapolation and transfer learning for
quicker convergence, especially in compute-limited environments.
We implement YaRN and Dynamic-YaRN for the following list of models:
- LLaMA
- Falcon
- GPT-NeoX
- Olmo
- Persimmon
- Phi
- StableLM
- OpenLLaMA
New unit tests are added to assert YaRN's correct behavior on both
short and long sequence inputs.
For more details, please refer to https://arxiv.org/abs/2309.00071.
Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt>
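The combination described above can be illustrated with a minimal, dependency-free sketch. It is not the code from this PR: the function names, the default values (`beta_fast=32`, `beta_slow=1`, taken from the paper's LLaMA recommendations), and the pure-Python loop are assumptions made for illustration. NTK-by-parts keeps high-frequency dimensions unscaled, divides low-frequency dimensions by the scaling factor, and blends linearly in between; attention scaling uses the paper's `0.1 * ln(factor) + 1` fit.

```python
import math


def yarn_inv_freq(dim, base=10000.0, factor=8.0, original_max_pos=2048,
                  beta_fast=32.0, beta_slow=1.0):
    """Hypothetical helper: YaRN-style inverse frequencies for one head."""

    def correction_dim(num_rotations):
        # Dimension index whose wavelength completes `num_rotations`
        # full rotations over the original context length.
        return (dim * math.log(original_max_pos / (num_rotations * 2 * math.pi))
                / (2 * math.log(base)))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim - 1)
    if low == high:
        high += 0.001  # avoid division by zero in the ramp

    inv_freq = []
    for i in range(dim // 2):
        extrapolated = 1.0 / (base ** (2 * i / dim))  # original RoPE
        interpolated = extrapolated / factor          # position interpolation
        ramp = min(max((i - low) / (high - low), 0.0), 1.0)
        keep = 1.0 - ramp  # 1 -> keep original frequency, 0 -> interpolate
        inv_freq.append(interpolated * (1.0 - keep) + extrapolated * keep)
    return inv_freq


def yarn_attention_factor(factor):
    """Hypothetical helper: scale applied to the cos/sin embeddings."""
    return 0.1 * math.log(factor) + 1.0
```

With these assumed defaults and `dim=64`, the first dimensions keep their original frequency while the last ones are divided by `factor`, which is the behaviour the short- and long-sequence tests mentioned above are meant to guard.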
* Refactor YaRN implementation for LLaMA
Iterate on YaRN implementation for LLaMA and remove diff from remaining
models for increased PR modularity.
This commit includes the following changes:
- Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries
- Remove unnecessary attributes ('extrapolation_factor' and 'finetuned')
from YaRN classes
- Inherit 'forward' method in YaRN classes from superclass
- Rename 'yarn' method to 'compute_yarn_scaling'
- Extend YaRN tests with further assertions
- Fix style inconsistencies
Co-authored-by: Miguel Monte e Freitas <miguelmontefreitas@tecnico.ulisboa.pt>
* Refactor Tensor Building Logic for YaRN
- Comply with the tensor building logic introduced in #30743
- Add a reference to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment
Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com>
* remove unwanted file
---------
Co-authored-by: Miguel Almeida <miguel.pessanha.almeida@tecnico.ulisboa.pt>
Co-authored-by: mig-mfreitas <mig-mfreitas@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
2024-07-23 10:07:58 +01:00