PyTorch XLNet

2020-01-17 14:59:31 -05:00
parent 83fa8d9fb5
commit cd656fb21a
2 changed files with 355 additions and 388 deletions
--- a/docs/source/model_doc/xlnet.rst
+++ b/docs/source/model_doc/xlnet.rst
@@ -1,6 +1,22 @@
 XLNet
 ----------------------------------------------------

+The XLNet model was proposed in `XLNet: Generalized Autoregressive Pretraining for Language Understanding`_
+by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V. Le.
+XLNet is an extension of the Transformer-XL model, pre-trained with an autoregressive method
+that learns bidirectional contexts by maximizing the expected likelihood over all permutations
+of the input sequence factorization order.
+
+The specific attention pattern can be controlled at training and test time using the `perm_mask` input.
+
+Because training a fully autoregressive model over many factorization orders is difficult,
+XLNet is pretrained using only a subset of the output tokens as targets, which are selected
+with the `target_mapping` input.
+
+To use XLNet for sequential decoding (i.e. not in a fully bidirectional setting), use the `perm_mask` and
+`target_mapping` inputs to control the attention span and outputs (see the examples in `examples/run_generation.py`).
+
+
 ``XLNetConfig``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~