HuggingFace_transformer/transformers/tests
Rostislav Nedelchev 76c0bc06d5 [XLNet] Changed post-processing of attention w.r.t. target_mapping
Whenever target_mapping is provided as an input, XLNet outputs two different attention streams.
Depending on that, the attention output is one of the two:
- a list of tensors (the usual case for most transformers)
- a list of 2-tuples of tensors, one tensor for each of the attention streams
Docs and unit tests have been updated.
2019-11-30 21:01:04 +01:00
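The two output shapes described in the commit message can be handled uniformly by downstream code. The sketch below is only an illustration under stated assumptions: `split_attention_streams` is a hypothetical helper, not part of the transformers API, and the commit message does not say which stream comes first in each 2-tuple, so the streams are simply called "first" and "second". It uses no library calls, so any per-layer object (for example a tensor) can be passed through unchanged.

```python
from typing import Any, List, Optional, Sequence, Tuple


def split_attention_streams(
    attentions: Sequence[Any],
) -> Tuple[List[Any], Optional[List[Any]]]:
    """Hypothetical helper (not a transformers function).

    Accepts either of the two shapes named in the commit message:
      - one entry per layer (single attention stream), or
      - one 2-tuple per layer (two attention streams).
    Returns (first_stream, second_stream); second_stream is None
    when only a single stream is present.
    """
    first: List[Any] = []
    second: List[Any] = []
    for layer in attentions:
        if isinstance(layer, tuple):
            # Two-stream case: each layer carries a pair of entries.
            if len(layer) != 2:
                raise ValueError("expected a 2-tuple per layer")
            a, b = layer
            first.append(a)
            second.append(b)
        else:
            # Single-stream case: the layer entry is used as-is.
            first.append(layer)
    # Reject inputs that mix the two shapes across layers.
    if second and len(second) != len(first):
        raise ValueError("mixed single-stream and two-stream layers")
    return first, (second if second else None)
```

A caller would check whether the second element of the result is None to tell which case applies, instead of inspecting the type of each layer entry at every use site.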