Documentation specification (#4294)
128
docs/README.md
@@ -67,3 +67,131 @@ It should build the static app that will be available under `/docs/_build/html`

Accepted files are reStructuredText (.rst) and Markdown (.md). Create a file with its extension and put it
in the source directory. You can then link it to the toc-tree by putting the filename without the extension.

## Writing Documentation - Specification

The `huggingface/transformers` documentation follows the
[Google documentation](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style. It is
mostly written in reStructuredText
([Sphinx simple documentation](https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html),
[Sourceforge complete documentation](https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html)).

### Adding a new section

A section is a page held in the `Notes` toc-tree of the documentation. Adding a new section is done in two steps:

- Add a new file under `./source`. This file can be either reStructuredText (.rst) or Markdown (.md).
- Link that file in `./source/index.rst` on the correct toc-tree.

### Adding a new model

When adding a new model:

- Create a file `xxx.rst` under `./source/model_doc`.
- Link that file in `./source/index.rst` on the `model_doc` toc-tree.
- Write a short overview of the model:
    - Overview with paper & authors
    - Paper abstract
    - Tips and tricks and how to use it best
- Add the classes that should be linked in the model. This generally includes the configuration, the tokenizer, and
  every model of that class (the base model, alongside models with additional heads), both in PyTorch and TensorFlow.
  The order is generally:
    - Configuration
    - Tokenizer
    - PyTorch base model
    - PyTorch head models
    - TensorFlow base model
    - TensorFlow head models

These classes should be added using the RST syntax, usually as follows:

```
XXXConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XXXConfig
    :members:
```

This will include every public method of the configuration. If you do not want a method to be displayed
in the documentation, list explicitly the methods that should appear in the docs:

```
XXXTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.XXXTokenizer
    :members: build_inputs_with_special_tokens, get_special_tokens_mask,
        create_token_type_ids_from_sequences, save_vocabulary
```

### Writing source documentation

Values that should be put in `code` should either be surrounded by double backticks: \`\`like so\`\` or be written as an object
using the :obj: syntax: :obj:\`like so\`.

When mentioning a class, it is recommended to use the :class: syntax, as the mentioned class will be automatically
linked by Sphinx: :class:\`transformers.XXXClass\`

When mentioning a function or method, it is recommended to use the :func: syntax, as it will be automatically
linked by Sphinx: :func:\`transformers.XXXClass.method\`

Links should be written as follows (note the double underscore at the end): \`text for the link <./local-link-or-global-link#loc>\`__

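As a rough sketch of how these inline conventions look together inside a docstring, consider the snippet below. `XXXClass` and its `describe` method are hypothetical names used only for illustration, and the link target is a placeholder borrowed from the glossary.

```python
class XXXClass:
    """Hypothetical class used only to illustrate the inline markup conventions."""

    def describe(self, verbose=False):
        """
        Return a short description of this object.

        Set ``verbose`` to :obj:`True` for a longer description. The result is a
        plain :obj:`str`. See :class:`transformers.PretrainedConfig` for a
        documented class and :func:`transformers.PreTrainedTokenizer.encode`
        for a documented method.

        `What are input IDs? <../glossary.html#input-ids>`__
        """
        # The class name is the whole "description" in this toy example.
        name = type(self).__name__
        return f"{name} (verbose)" if verbose else name
```

Inside a Python docstring the backticks are written directly; the backslashes shown above are only needed to escape them in this Markdown file.
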
#### Defining arguments in a method

Arguments should be defined with the `Args:` prefix, followed by a line break and an indentation.
Each argument name should be followed by its type, with its shape if it is a tensor, and a line break.
Another indentation is necessary before writing the description of the argument.

Here's an example showcasing everything so far:

```
Args:
    input_ids (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`):
        Indices of input sequence tokens in the vocabulary.

        Indices can be obtained using :class:`transformers.AlbertTokenizer`.
        See :func:`transformers.PreTrainedTokenizer.encode` and
        :func:`transformers.PreTrainedTokenizer.encode_plus` for details.

        `What are input IDs? <../glossary.html#input-ids>`__
```

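To show where such an `Args:` block sits in real code, here is a minimal sketch of a complete function. `pad_input_ids` is a hypothetical helper invented for this example and is not part of the library.

```python
from typing import List


def pad_input_ids(input_ids: List[int], max_length: int, pad_token_id: int = 0) -> List[int]:
    """
    Pad or truncate a sequence of token indices to a fixed length.

    Args:
        input_ids (:obj:`List[int]`):
            Indices of input sequence tokens in the vocabulary.
        max_length (:obj:`int`):
            Length of the returned list.
        pad_token_id (:obj:`int`):
            Index appended until the list reaches :obj:`max_length`.
    """
    # Truncate first, then append as many padding indices as are still missing.
    truncated = input_ids[:max_length]
    return truncated + [pad_token_id] * (max_length - len(truncated))
```

Each argument gets its own entry: name and type on one line, description indented one level further.
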
#### Writing a multi-line code block

Multi-line code blocks can be useful for displaying examples. They are written like so:

```
Example::

    # first line of code
    # second line
    # etc
```

The `Example` string at the beginning can be replaced by anything as long as there are two colons following it.

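For illustration, a docstring using such a literal block might look like the sketch below. `count_tokens` is a hypothetical function that exists only to carry the docstring.

```python
def count_tokens(text):
    """
    Count the whitespace-separated tokens in a string.

    Example::

        n = count_tokens("hello big world")
        # n is now 3
    """
    # str.split() with no argument splits on any run of whitespace.
    return len(text.split())
```

The blank line after `Example::` and the extra indentation of the code are what mark the block as literal text.
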
#### Writing a return block

Return values should be defined with the `Returns:` prefix, followed by a line break and an indentation.
The first line should be the type of the return, followed by a line break. No need to indent further for the elements
building the return.

Here's an example for a tuple return, comprising several objects:

```
Returns:
    :obj:`tuple(torch.FloatTensor)` comprising various elements depending on the configuration (:class:`~transformers.BertConfig`) and inputs:
    loss (`optional`, returned when ``masked_lm_labels`` is provided) ``torch.FloatTensor`` of shape ``(1,)``:
        Total loss as the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
    prediction_scores (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, config.vocab_size)`):
        Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
```

Here's an example for a single value return:

```
Returns:
    A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
```

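The single value case can be sketched as a full function. `special_tokens_mask` below is a hypothetical, simplified helper written for this example; it is not the tokenizer method of the same purpose in the library.

```python
from typing import List


def special_tokens_mask(token_ids: List[int], special_ids: List[int]) -> List[int]:
    """
    Mark which positions of a sequence hold special tokens.

    Args:
        token_ids (:obj:`List[int]`):
            Indices of the tokens in the sequence.
        special_ids (:obj:`List[int]`):
            Indices that should be treated as special tokens.

    Returns:
        :obj:`List[int]`:
        A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
    """
    # A set makes the membership test independent of the number of special ids.
    special = set(special_ids)
    return [1 if token_id in special else 0 for token_id in token_ids]
```

The return type comes first, on its own line, and the description follows at the same indentation level.
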