* Add flash-attention-2 backend for ESM-2
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
* Update extended_attention_mask for flash-attention-2
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
* Add test_flash_attn_2_equivalence test
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
---------
Signed-off-by: Peter St. John <pstjohn@nvidia.com>