Sanchit Gandhi
e93103632b
Add bloom flax (#25094)
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00