Files
HuggingFace_transformer/tests
Teven 4b506a37e3 Xlnet outputs (#5883)
Slightly breaking change, changes functionality for `use_cache` in XLNet: if use_cache is True and mem_len is 0 or None (which is the case in the base model config), the model behaves like GPT-2 and returns mems to be used as past in generation. At training time `use_cache` is overriden and always True.
2020-07-18 17:33:13 +02:00
..
2020-05-07 13:48:44 -04:00
2020-07-18 17:33:13 +02:00