Update all references to canonical models (#29001)
* Script & Manual edition
* Update
@@ -159,13 +159,13 @@ to be used, but that everybody in team is on the same page on what type of model
 To give an example, a well-defined project would be the following:

 - task: summarization
-- model: [t5-small](https://huggingface.co/t5-small)
+- model: [google-t5/t5-small](https://huggingface.co/google-t5/t5-small)
 - dataset: [CNN/Daily mail](https://huggingface.co/datasets/cnn_dailymail)
 - training script: [run_summarization_flax.py](https://github.com/huggingface/transformers/blob/main/examples/flax/summarization/run_summarization_flax.py)
 - outcome: t5 model that can summarize news
-- work flow: adapt `run_summarization_flax.py` to work with `t5-small`.
+- work flow: adapt `run_summarization_flax.py` to work with `google-t5/t5-small`.

-This example is a very easy and not the most interesting project since a `t5-small`
+This example is a very easy and not the most interesting project since a `google-t5/t5-small`
 summarization model exists already for CNN/Daily mail and pretty much no code has to be
 written.
 A well-defined project does not need to have the dataset be part of
@@ -335,7 +335,7 @@ dataset = load_dataset('oscar', "unshuffled_deduplicated_en", split='train', str

 dummy_input = next(iter(dataset))["text"]

-tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
+tokenizer = RobertaTokenizerFast.from_pretrained("FacebookAI/roberta-base")
 input_ids = tokenizer(dummy_input, return_tensors="np").input_ids[:, :10]

 model = FlaxRobertaModel.from_pretrained("julien-c/dummy-unknown")
@@ -492,7 +492,7 @@ dataset = load_dataset('oscar', "unshuffled_deduplicated_en", split='train', str

 dummy_input = next(iter(dataset))["text"]

-tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
+tokenizer = RobertaTokenizerFast.from_pretrained("FacebookAI/roberta-base")
 input_ids = tokenizer(dummy_input, return_tensors="np").input_ids[:, :10]

 model = FlaxRobertaModel.from_pretrained("julien-c/dummy-unknown")
@@ -518,7 +518,7 @@ be available in a couple of days.
 - [BigBird](https://github.com/huggingface/transformers/blob/main/src/transformers/models/big_bird/modeling_flax_big_bird.py)
 - [CLIP](https://github.com/huggingface/transformers/blob/main/src/transformers/models/clip/modeling_flax_clip.py)
 - [ELECTRA](https://github.com/huggingface/transformers/blob/main/src/transformers/models/electra/modeling_flax_electra.py)
-- [GPT2](https://github.com/huggingface/transformers/blob/main/src/transformers/models/gpt2/modeling_flax_gpt2.py)
+- [GPT2](https://github.com/huggingface/transformers/blob/main/src/transformers/models/openai-community/gpt2/modeling_flax_gpt2.py)
 - [(TODO) MBART](https://github.com/huggingface/transformers/blob/main/src/transformers/models/mbart/modeling_flax_mbart.py)
 - [RoBERTa](https://github.com/huggingface/transformers/blob/main/src/transformers/models/roberta/modeling_flax_roberta.py)
 - [T5](https://github.com/huggingface/transformers/blob/main/src/transformers/models/t5/modeling_flax_t5.py)
@@ -729,7 +729,7 @@ Let's use the base `FlaxRobertaModel` without any heads as an example.
 from transformers import FlaxRobertaModel, RobertaTokenizerFast
 import jax

-tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
+tokenizer = RobertaTokenizerFast.from_pretrained("FacebookAI/roberta-base")
 inputs = tokenizer("JAX/Flax is amazing ", padding="max_length", max_length=128, return_tensors="np")

 model = FlaxRobertaModel.from_pretrained("julien-c/dummy-unknown")
@@ -1011,7 +1011,7 @@ and run the following commands in a Python shell to save a config.
 ```python
 from transformers import RobertaConfig

-config = RobertaConfig.from_pretrained("roberta-base")
+config = RobertaConfig.from_pretrained("FacebookAI/roberta-base")
 config.save_pretrained("./")
 ```

@@ -1193,12 +1193,12 @@ All the widgets are open sourced in the `huggingface_hub` [repo](https://github.
 **NLP**
 * **Conversational:** To have the best conversations!. [Example](https://huggingface.co/microsoft/DialoGPT-large?).
 * **Feature Extraction:** Retrieve the input embeddings. [Example](https://huggingface.co/sentence-transformers/distilbert-base-nli-mean-tokens?text=test).
-* **Fill Mask:** Predict potential words for a mask token. [Example](https://huggingface.co/bert-base-uncased?).
-* **Question Answering:** Given a context and a question, predict the answer. [Example](https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad).
+* **Fill Mask:** Predict potential words for a mask token. [Example](https://huggingface.co/google-bert/bert-base-uncased?).
+* **Question Answering:** Given a context and a question, predict the answer. [Example](https://huggingface.co/google-bert/bert-large-uncased-whole-word-masking-finetuned-squad).
 * **Sentence Simmilarity:** Predict how similar a set of sentences are. Useful for Sentence Transformers.
 * **Summarization:** Given a text, output a summary of it. [Example](https://huggingface.co/sshleifer/distilbart-cnn-12-6).
 * **Table Question Answering:** Given a table and a question, predict the answer. [Example](https://huggingface.co/google/tapas-base-finetuned-wtq).
-* **Text Generation:** Generate text based on a prompt. [Example](https://huggingface.co/gpt2)
+* **Text Generation:** Generate text based on a prompt. [Example](https://huggingface.co/openai-community/gpt2)
 * **Token Classification:** Useful for tasks such as Named Entity Recognition and Part of Speech. [Example](https://huggingface.co/dslim/bert-base-NER).
 * **Zero-Shot Classification:** Too cool to explain with words. Here is an [example](https://huggingface.co/typeform/distilbert-base-uncased-mnli)
 * ([WIP](https://github.com/huggingface/huggingface_hub/issues/99)) **Table to Text Generation**.
@@ -31,7 +31,7 @@ without ever having to download the full dataset.
 In the following, we demonstrate how to train a bi-directional transformer model
 using masked language modeling objective as introduced in [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805).
 More specifically, we demonstrate how JAX/Flax and dataset streaming can be leveraged
-to pre-train [**`roberta-base`**](https://huggingface.co/roberta-base)
+to pre-train [**`FacebookAI/roberta-base`**](https://huggingface.co/FacebookAI/roberta-base)
 in English on a single TPUv3-8 pod for 10000 update steps.

 The example script uses the 🤗 Datasets library. You can easily customize them to your needs if you need extra processing on your datasets.
@@ -80,8 +80,8 @@ from transformers import RobertaTokenizerFast, RobertaConfig

 model_dir = "./english-roberta-base-dummy"

-tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
-config = RobertaConfig.from_pretrained("roberta-base")
+tokenizer = RobertaTokenizerFast.from_pretrained("FacebookAI/roberta-base")
+config = RobertaConfig.from_pretrained("FacebookAI/roberta-base")

 tokenizer.save_pretrained(model_dir)
 config.save_pretrained(model_dir)
@@ -32,7 +32,7 @@ Models written in JAX/Flax are **immutable** and updated in a purely functional
 way which enables simple and efficient model parallelism.

 In this example we will use the vision model from [CLIP](https://huggingface.co/models?filter=clip)
-as the image encoder and [`roberta-base`](https://huggingface.co/roberta-base) as the text encoder.
+as the image encoder and [`FacebookAI/roberta-base`](https://huggingface.co/FacebookAI/roberta-base) as the text encoder.
 Note that one can also use the [ViT](https://huggingface.co/models?filter=vit) model as image encoder and any other BERT or ROBERTa model as text encoder.
 To train the model on languages other than English one should choose a text encoder trained on the desired
 language and a image-text dataset in that language. One such dataset is [WIT](https://github.com/google-research-datasets/wit).
@@ -76,7 +76,7 @@ Here is an example of how to load the model using pre-trained text and vision mo
 ```python
 from modeling_hybrid_clip import FlaxHybridCLIP

-model = FlaxHybridCLIP.from_text_vision_pretrained("bert-base-uncased", "openai/clip-vit-base-patch32")
+model = FlaxHybridCLIP.from_text_vision_pretrained("google-bert/bert-base-uncased", "openai/clip-vit-base-patch32")

 # save the model
 model.save_pretrained("bert-clip")
@@ -89,7 +89,7 @@ If the checkpoints are in PyTorch then one could pass `text_from_pt=True` and `v
 PyTorch checkpoints convert them to flax and load the model.

 ```python
-model = FlaxHybridCLIP.from_text_vision_pretrained("bert-base-uncased", "openai/clip-vit-base-patch32", text_from_pt=True, vision_from_pt=True)
+model = FlaxHybridCLIP.from_text_vision_pretrained("google-bert/bert-base-uncased", "openai/clip-vit-base-patch32", text_from_pt=True, vision_from_pt=True)
 ```

 This loads both the text and vision encoders using pre-trained weights, the projection layers are randomly
@@ -154,9 +154,9 @@ Next we can run the example script to train the model:
 ```bash
 python run_hybrid_clip.py \
 --output_dir ${MODEL_DIR} \
---text_model_name_or_path="roberta-base" \
+--text_model_name_or_path="FacebookAI/roberta-base" \
 --vision_model_name_or_path="openai/clip-vit-base-patch32" \
---tokenizer_name="roberta-base" \
+--tokenizer_name="FacebookAI/roberta-base" \
 --train_file="coco_dataset/train_dataset.json" \
 --validation_file="coco_dataset/validation_dataset.json" \
 --do_train --do_eval \
@@ -314,8 +314,6 @@ class FlaxHybridCLIP(FlaxPreTrainedModel):
 Information necessary to initiate the text model. Can be either:

 - A string, the `model id` of a pretrained model hosted inside a model repo on huggingface.co.
-Valid model ids can be located at the root-level, like ``bert-base-uncased``, or namespaced under
-a user or organization name, like ``dbmdz/bert-base-german-cased``.
 - A path to a `directory` containing model weights saved using
 :func:`~transformers.FlaxPreTrainedModel.save_pretrained`, e.g., ``./my_model_directory/``.
 - A path or url to a `PyTorch checkpoint folder` (e.g, ``./pt_model``). In
@@ -327,8 +325,6 @@ class FlaxHybridCLIP(FlaxPreTrainedModel):
 Information necessary to initiate the vision model. Can be either:

 - A string, the `model id` of a pretrained model hosted inside a model repo on huggingface.co.
-Valid model ids can be located at the root-level, like ``bert-base-uncased``, or namespaced under
-a user or organization name, like ``dbmdz/bert-base-german-cased``.
 - A path to a `directory` containing model weights saved using
 :func:`~transformers.FlaxPreTrainedModel.save_pretrained`, e.g., ``./my_model_directory/``.
 - A path or url to a `PyTorch checkpoint folder` (e.g, ``./pt_model``). In
@@ -354,7 +350,7 @@ class FlaxHybridCLIP(FlaxPreTrainedModel):
 >>> from transformers import FlaxHybridCLIP
 >>> # initialize a model from pretrained BERT and CLIP models. Note that the projection layers will be randomly initialized.
 >>> # If using CLIP's vision model the vision projection layer will be initialized using pre-trained weights
->>> model = FlaxHybridCLIP.from_text_vision_pretrained('bert-base-uncased', 'openai/clip-vit-base-patch32')
+>>> model = FlaxHybridCLIP.from_text_vision_pretrained('google-bert/bert-base-uncased', 'openai/clip-vit-base-patch32')
 >>> # saving model after fine-tuning
 >>> model.save_pretrained("./bert-clip")
 >>> # load fine-tuned model
@@ -54,7 +54,7 @@ model.save_pretrained("gpt-neo-1.3B")
 ```bash
 python run_clm_mp.py \
 --model_name_or_path gpt-neo-1.3B \
---tokenizer_name gpt2 \
+--tokenizer_name openai-community/gpt2 \
 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
 --do_train --do_eval \
 --block_size 1024 \