From 52679fbc2e6bdc6e8722e27d9bc6550295c0a68b Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Tue, 28 Apr 2020 14:32:31 +0200
Subject: [PATCH] add dialogpt training tips (#3996)

---
 docs/source/model_doc/dialogpt.rst | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/docs/source/model_doc/dialogpt.rst b/docs/source/model_doc/dialogpt.rst
index 90b19a74d5..4381698829 100644
--- a/docs/source/model_doc/dialogpt.rst
+++ b/docs/source/model_doc/dialogpt.rst
@@ -22,7 +22,17 @@ Tips:
   the right rather than the left.
 - DialoGPT was trained with a causal language modeling (CLM) objective on conversational data and is therefore powerful at response generation in open-domain dialogue systems.
 - DialoGPT enables the user to create a chat bot in just 10 lines of code as shown on `DialoGPT's model card `_.
-
+
+Training:
+
+In order to train or fine-tune DialoGPT, one can use causal language modeling training.
+To cite the official paper:
+*We follow the OpenAI GPT-2 to model a multiturn dialogue session
+as a long text and frame the generation task as language modeling. We first
+concatenate all dialog turns within a dialogue session into a long text
+x_1,..., x_N (N is the sequence length), ended by the end-of-text token.*
+For more information please confer to the original paper.
+
 DialoGPT's architecture is based on the GPT2 model, so one can refer to GPT2's `docstring `_.
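
The training tip quoted in this patch (concatenate all turns of a session into one long text ended by the end-of-text token, then train with a language modeling objective) can be illustrated with a minimal sketch in plain Python. The helper names `concat_dialogue` and `clm_pairs` are hypothetical and not part of any library. Placing the end-of-text token after every turn, rather than only at the end of the session, is an assumption made here. The token ids are toy values standing in for a real tokenizer's output.

```python
from typing import List, Tuple

EOS_TOKEN = "<|endoftext|>"  # GPT-2's end-of-text token string


def concat_dialogue(turns: List[str], eos_token: str = EOS_TOKEN) -> str:
    """Flatten one dialogue session into a single training text.

    Assumption: every turn is followed by the end-of-text token, so the
    session as a whole also ends with it.
    """
    return "".join(turn.strip() + eos_token for turn in turns)


def clm_pairs(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Build next-token prediction pairs: targets[i] is the token after inputs[i]."""
    return token_ids[:-1], token_ids[1:]


# A two-turn session used purely for illustration.
session = ["Does money buy happiness?", "Depends how much money you spend on it."]
text = concat_dialogue(session)
print(text)

# Toy ids (0 plays the role of the end-of-text id); a real tokenizer would produce these.
toy_ids = [10, 11, 12, 0, 20, 21, 0]
inputs, targets = clm_pairs(toy_ids)
print(inputs)
print(targets)
```

In this sketch the model would be asked to predict `targets[i]` from `inputs[:i + 1]`, which is what a causal language modeling loss computes over the concatenated session.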