diff --git a/README.md b/README.md
index 9f1037c954..ea5ddf144c 100644
--- a/README.md
+++ b/README.md
@@ -8,14 +8,14 @@ This implementation is provided with [Google's pre-trained models](https://githu
 
 | Section | Description |
 |-|-|
-| [Installation](##installation) | How to install the package |
-| [Overview](##overview) | Overview of the package |
-| [Usage](##usage) | Quickstart examples |
-| [Doc](##doc) | Detailed documentation |
-| [Examples](##examples) | Detailed examples on how to fine-tune Bert |
-| [Notebooks](##notebooks) | Introduction on the provided Jupyter Notebooks |
-| [TPU](##tup) | Notes on TPU support and pretraining scripts |
-| [Command-line interface](##Command-line-interface) | Convert a TensorFlow checkpoint in a PyTorch dump |
+| [Installation](#installation) | How to install the package |
+| [Overview](#overview) | Overview of the package |
+| [Usage](#usage) | Quickstart examples |
+| [Doc](#doc) | Detailed documentation |
+| [Examples](#examples) | Detailed examples on how to fine-tune Bert |
+| [Notebooks](#notebooks) | Introduction on the provided Jupyter Notebooks |
+| [TPU](#tup) | Notes on TPU support and pretraining scripts |
+| [Command-line interface](#Command-line-interface) | Convert a TensorFlow checkpoint in a PyTorch dump |
 
 ## Installation
 
@@ -44,7 +44,7 @@ python -m pytest -sv tests/
 
 ## Overview
 
-This package comprises the following classes that can be imported in Python and are detailed in the [Doc](##doc) section of this readme:
+This package comprises the following classes that can be imported in Python and are detailed in the [Doc](#doc) section of this readme:
 
 - Six PyTorch models (`torch.nn.Module`) for Bert with pre-trained weights:
   - `BertModel` - raw BERT Transformer model (**fully pre-trained**),
@@ -72,22 +72,22 @@ The repository further comprises:
   - [`run_classifier.py`](./examples/run_classifier.py) - Show how to fine-tune an instance of `BertForSequenceClassification` on GLUE's MRPC task,
   - [`run_squad.py`](./examples/run_squad.py) - Show how to fine-tune an instance of `BertForQuestionAnswering` on SQuAD v1.0 task.
 
-  These examples are detailed in the [Examples](##examples) section of this readme.
+  These examples are detailed in the [Examples](#examples) section of this readme.
 
 - Three notebooks that were used to check that the TensorFlow and PyTorch models behave identically (in the [`notebooks` folder](./notebooks)):
   - [`Comparing-TF-and-PT-models.ipynb`](./notebooks/Comparing-TF-and-PT-models.ipynb) - Compare the hidden states predicted by `BertModel`,
   - [`Comparing-TF-and-PT-models-SQuAD.ipynb`](./notebooks/Comparing-TF-and-PT-models-SQuAD.ipynb) - Compare the spans predicted by `BertForQuestionAnswering` instances,
   - [`Comparing-TF-and-PT-models-MLM-NSP.ipynb`](./notebooks/Comparing-TF-and-PT-models-MLM-NSP.ipynb) - Compare the predictions of the `BertForPretraining` instances.
 
-  These notebooks are detailed in the [Notebooks](##notebooks) section of this readme.
+  These notebooks are detailed in the [Notebooks](#notebooks) section of this readme.
 
 - A command-line interface to convert any TensorFlow checkpoint in a PyTorch dump:
 
-  This CLI is detailed in the [Command-line interface](##Command-line-interface) section of this readme.
+  This CLI is detailed in the [Command-line interface](#Command-line-interface) section of this readme.
 
 ## Usage
 
-Here is a quick-start example using `BertTokenizer`, `BertModel` and `BertForMaskedLM` class with Google AI's pre-trained `Bert base uncased` model. See the [doc section](##doc) below for all the details on these classes.
+Here is a quick-start example using `BertTokenizer`, `BertModel` and `BertForMaskedLM` class with Google AI's pre-trained `Bert base uncased` model. See the [doc section](#doc) below for all the details on these classes.
 
 First let's prepare a tokenized input with `BertTokenizer`
 
@@ -216,7 +216,7 @@ An example on how to use this class is given in the `extract_features.py` script
 - the masked language modeling head, and
 - the next sentence classification head.
 
-*Inputs* comprises the inputs of the [`BertModel`](####-1.-`BertModel`) class plus two optional labels:
+*Inputs* comprises the inputs of the [`BertModel`](#-1.-`BertModel`) class plus two optional labels:
 
 - `masked_lm_labels`: masked language modeling labels: torch.LongTensor of shape [batch_size, sequence_length] with indices selected in [-1, 0, ..., vocab_size]. All labels set to -1 are ignored (masked), the loss is only computed for the labels set in [0, ..., vocab_size]
 - `next_sentence_label`: next sentence classification loss: torch.LongTensor of shape [batch_size] with indices selected in [0, 1]. 0 => next sentence is the continuation, 1 => next sentence is a random sentence.
@@ -232,7 +232,7 @@ An example on how to use this class is given in the `extract_features.py` script
 
 `BertForMaskedLM` includes the `BertModel` Transformer followed by the (possibly) pre-trained masked language modeling head.
 
-*Inputs* comprises the inputs of the [`BertModel`](####-1.-`BertModel`) class plus optional label:
+*Inputs* comprises the inputs of the [`BertModel`](#-1.-`BertModel`) class plus optional label:
 
 - `masked_lm_labels`: masked language modeling labels: torch.LongTensor of shape [batch_size, sequence_length] with indices selected in [-1, 0, ..., vocab_size]. All labels set to -1 are ignored (masked), the loss is only computed for the labels set in [0, ..., vocab_size]
 
@@ -245,7 +245,7 @@ An example on how to use this class is given in the `extract_features.py` script
 
 `BertForNextSentencePrediction` includes the `BertModel` Transformer followed by the next sentence classification head.
 
-*Inputs* comprises the inputs of the [`BertModel`](####-1.-`BertModel`) class plus an optional label:
+*Inputs* comprises the inputs of the [`BertModel`](#-1.-`BertModel`) class plus an optional label:
 
 - `next_sentence_label`: next sentence classification loss: torch.LongTensor of shape [batch_size] with indices selected in [0, 1]. 0 => next sentence is the continuation, 1 => next sentence is a random sentence.
 
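
Note (not part of the patch): every hunk above makes the same mechanical change, collapsing a run of leading `#` characters in an intra-document link target to a single `#`. A minimal sketch of that rewrite follows. The function name `fix_anchor_links` and the regular expression are hypothetical, introduced here for illustration only, and the sketch assumes that only inline links of the form `[text](##anchor)` need rewriting.

```python
import re

# Matches the target of an inline Markdown link that begins with two or more '#'.
# Headings such as "## Usage" are not matched because they lack the "](" prefix.
_MULTI_HASH_TARGET = re.compile(r"\]\(#{2,}([^)\s]*)\)")


def fix_anchor_links(markdown: str) -> str:
    """Collapse leading '#' runs in intra-document link targets to a single '#'."""
    return _MULTI_HASH_TARGET.sub(r"](#\1)", markdown)


if __name__ == "__main__":
    before = "| [Doc](##doc) | Detailed documentation |"
    print(fix_anchor_links(before))  # | [Doc](#doc) | Detailed documentation |
```

The sketch only removes the extra `#` characters, as the patch does; it does not check that the resulting anchor matches a heading in the file.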