Niklas Muennighoff
ecd61c6286
Add OLMoE (#32406)
* Add OLMoE
* Add OLMoE
* Updates
* Make norm optional; add keys
* Add output
* Add
* Fix dtype
* Fix eos config
* Update
* Add OLMoE
* Fix OLMoE path
* Format
* Format
* Rmv copy statement
* Rmv copy statement
* Format
* Add copies
* Cp rotary
* Fix naming
* Fix naming
* Update RoPE integration; num_logits_to_keep; Add copy statements
* Add eps to config
* Format
* Add aux loss
* Adapt router_aux_loss_coef
* Update md
* Adapt
* adapt tests
2024-09-03 18:43:12 +02:00