HuggingFace_transformer/tests/models
Andreas Madsen b4b613b102 Implement Roberta PreLayerNorm (#20305)
* Copy RoBERTa

* formatting

* implement RoBERTa with prelayer normalization

* update test expectations

* add documentation

* add conversion script for DinkyTrain weights

* update checkpoint repo

Unfortunately, the original checkpoints assume a hacked RoBERTa model

* add RoBERTa-PreLayerNorm docs to toc

* run utils/check_copies.py

* lint files

* remove unused import

* fix check_repo wrongly reporting that a test is missing

* fix import error caused by rebase

* run make fix-copies

* add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS

* Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup: Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing Flax header

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* expected_slice -> EXPECTED_SLICE

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update copies after rebase

* add missing copied from statements

* make fix-copies

* make prelayernorm explicit in code

* fix checkpoint path for the original implementation

* add flax integration tests

* improve docs

* update utils/documentation_tests.txt

* lint files

* Remove Copyright notice

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fix-copies

* Remove EXPECTED_SLICE calculation comments

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-19 09:30:17 +01:00