HuggingFace_transformer/tests
Kristian Holsheimer f8eda599bd [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes (#8791)
* [FlaxBert] Fix non-broadcastable attention mask for batched forward-passes

* [FlaxRoberta] Fix non-broadcastable attention mask

* Use jax.numpy instead of ordinary numpy (otherwise not jit-able)

* Partially revert "Use jax.numpy ..."

* Add tests for batched forward passes

* Avoid unnecessary OOMs due to preallocation of GPU memory by XLA

* Auto-fix style

* Re-enable GPU memory preallocation but with mem fraction < 1/parallelism
2020-11-27 13:21:19 +01:00
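
The last two bullets of the commit message concern XLA's GPU memory preallocation when several test processes share one GPU. Below is a minimal sketch of that idea, not the code from this commit. It assumes the JAX environment variables `XLA_PYTHON_CLIENT_PREALLOCATE` and `XLA_PYTHON_CLIENT_MEM_FRACTION`, which must be set before `jax` is first imported. The helper name `configure_xla_memory`, the `headroom` parameter, and the parallelism of 8 are hypothetical and chosen only for illustration.

```python
import os


def configure_xla_memory(parallelism: int, headroom: float = 0.96) -> float:
    """Keep preallocation on, but cap each process below 1/parallelism of GPU memory.

    Hypothetical helper: `headroom` (< 1) is an assumed safety margin so that
    `parallelism` concurrent processes together stay under the full GPU memory.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    if not 0.0 < headroom < 1.0:
        raise ValueError("headroom must be strictly between 0 and 1")

    fraction = headroom / parallelism  # strictly less than 1 / parallelism

    # These variables are read when the XLA backend initializes, so set them
    # before the first `import jax` in the process.
    os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "true"
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(fraction)
    return fraction


# Assumed setup: 8 test workers sharing one GPU.
fraction = configure_xla_memory(parallelism=8)
```

Disabling preallocation altogether (`XLA_PYTHON_CLIENT_PREALLOCATE=false`) also avoids the out-of-memory errors, at the cost of on-demand allocation; capping the fraction keeps preallocation while leaving room for the other workers.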