[docs] flax/jax purge (#40372)

flax/jax purge
Joao Gante
2025-08-25 10:25:00 +01:00
committed by GitHub
parent 14b89fed24
commit 0031c044f8
7 changed files with 16 additions and 183 deletions

@@ -68,8 +68,7 @@ already reported** (use the search bar on GitHub under Issues). Your issue shoul
Once you've confirmed the bug hasn't already been reported, please include the following information in your issue so we can quickly resolve it:
* Your **OS type and version** and **Python**, **PyTorch** and
**TensorFlow** versions when applicable.
* Your **OS type and version**, and your **Python** and **PyTorch** versions when applicable.
* A short, self-contained, code snippet that allows us to reproduce the bug in
less than 30s.
* The *full* traceback if an exception is raised.
@@ -165,8 +164,7 @@ You'll need **[Python 3.9](https://github.com/huggingface/transformers/blob/main
mode with the `-e` flag.
Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
(PyTorch, TensorFlow and/or Flax) then do:
failure with this command. If that's the case make sure to install PyTorch then do:
```bash
pip install -e ".[quality]"

@@ -20,7 +20,7 @@ rendered properly in your Markdown viewer.
# Installation
Transformers works with [PyTorch](https://pytorch.org/get-started/locally/), [TensorFlow 2.0](https://www.tensorflow.org/install/pip), and [Flax](https://flax.readthedocs.io/en/latest/). It has been tested on Python 3.9+, PyTorch 2.1+, TensorFlow 2.6+, and Flax 0.4.1+.
Transformers works with [PyTorch](https://pytorch.org/get-started/locally/). It has been tested on Python 3.9+ and PyTorch 2.2+.
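As a rough illustration of the version requirements above, the sketch below checks whether the running interpreter and an installed `torch` package meet the stated minimums. The helper names (`parse_major_minor`, `meets_minimum`, `check_environment`) are hypothetical and not part of Transformers; the check reads package metadata from the standard library rather than importing PyTorch.

```py
import re
import sys
from importlib import metadata

# Minimums taken from the sentence above; adjust if the docs change.
MIN_PYTHON = (3, 9)
MIN_TORCH = (2, 2)


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '2.2.1+cu121'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare only the major and minor components against the minimum."""
    return parse_major_minor(version) >= minimum


def check_environment() -> dict:
    """Report whether Python and torch (if installed) meet the minimums."""
    results = {"python": sys.version_info[:2] >= MIN_PYTHON}
    try:
        results["torch"] = meets_minimum(metadata.version("torch"), MIN_TORCH)
    except metadata.PackageNotFoundError:
        # torch is not installed in this environment
        results["torch"] = False
    return results


if __name__ == "__main__":
    print(check_environment())
```

Comparing tuples of integers avoids the string-comparison pitfall where `"2.10"` would sort before `"2.2"`.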
## Virtual environment
@@ -74,7 +74,7 @@ uv pip install transformers
</hfoption>
</hfoptions>
For GPU acceleration, install the appropriate CUDA drivers for [PyTorch](https://pytorch.org/get-started/locally) and [TensorFlow](https://www.tensorflow.org/install/pip).
For GPU acceleration, install the appropriate CUDA drivers for [PyTorch](https://pytorch.org/get-started/locally).
Run the command below to check if your system detects an NVIDIA GPU.
@@ -84,42 +84,11 @@ nvidia-smi
To install a CPU-only version of Transformers and a machine learning framework, run the following command.
<hfoptions id="cpu-only">
<hfoption id="PyTorch">
```bash
pip install 'transformers[torch]'
uv pip install 'transformers[torch]'
```
</hfoption>
<hfoption id="TensorFlow">
For Apple M1 hardware, you need to install CMake and pkg-config first.
```bash
brew install cmake
brew install pkg-config
```
Install TensorFlow 2.0.
```bash
pip install 'transformers[tf-cpu]'
uv pip install 'transformers[tf-cpu]'
```
</hfoption>
<hfoption id="Flax">
```bash
pip install 'transformers[flax]'
uv pip install 'transformers[flax]'
```
</hfoption>
</hfoptions>
Test whether the install was successful with the following command. It should return a label and score for the provided text.
```bash

@@ -73,53 +73,9 @@ A model repository also includes an inference [widget](https://hf.co/docs/hub/mo
Check out the Hub [Models](https://hf.co/docs/hub/models) documentation for more information.
## Model framework conversion
Reach a wider audience by making a model available in PyTorch, TensorFlow, and Flax. While users can still load a model if they're using a different framework, it is slower because Transformers needs to convert the checkpoint on the fly. It is faster to convert the checkpoint first.
<hfoptions id="convert">
<hfoption id="PyTorch">
Set `from_tf=True` to convert a checkpoint from TensorFlow to PyTorch and then save it.
```py
from transformers import DistilBertForSequenceClassification
pt_model = DistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_tf=True)
pt_model.save_pretrained("path/to/awesome-name-you-picked")
```
</hfoption>
<hfoption id="TensorFlow">
Set `from_pt=True` to convert a checkpoint from PyTorch to TensorFlow and then save it.
```py
from transformers import TFDistilBertForSequenceClassification
tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)
tf_model.save_pretrained("path/to/awesome-name-you-picked")
```
</hfoption>
<hfoption id="Flax">
Set `from_pt=True` to convert a checkpoint from PyTorch to Flax and then save it.
```py
from transformers import FlaxDistilBertForSequenceClassification
flax_model = FlaxDistilBertForSequenceClassification.from_pretrained(
"path/to/awesome-name-you-picked", from_pt=True
)
flax_model.save_pretrained("path/to/awesome-name-you-picked")
```
</hfoption>
</hfoptions>
## Uploading a model
There are several ways to upload a model to the Hub depending on your workflow preference. You can push a model with [`Trainer`], a callback for TensorFlow models, call [`~PreTrainedModel.push_to_hub`] directly on a model, or use the Hub web interface.
There are several ways to upload a model to the Hub depending on your workflow preference. You can push a model with [`Trainer`], call [`~PreTrainedModel.push_to_hub`] directly on a model, or use the Hub web interface.
<Youtube id="Z1-XMy-GNLQ"/>
@@ -143,19 +99,6 @@ trainer = Trainer(
trainer.push_to_hub()
```
### PushToHubCallback
For TensorFlow models, add the [`PushToHubCallback`] to the [fit](https://keras.io/api/models/model_training_apis/#fit-method) method.
```py
from transformers import PushToHubCallback
push_to_hub_callback = PushToHubCallback(
output_dir="./your_model_save_path", tokenizer=tokenizer, hub_model_id="your-username/my-awesome-model"
)
model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3, callbacks=push_to_hub_callback)
```
### PushToHubMixin
The [`~utils.PushToHubMixin`] provides functionality for pushing a model or tokenizer to the Hub.
@@ -166,7 +109,7 @@ Call [`~utils.PushToHubMixin.push_to_hub`] directly on a model to upload it to t
model.push_to_hub("my-awesome-model")
```
Other objects like a tokenizer or TensorFlow model are also pushed to the Hub in the same way.
Other objects like a tokenizer are also pushed to the Hub in the same way.
```py
tokenizer.push_to_hub("my-awesome-model")

@@ -45,43 +45,6 @@ There are two general types of models you can load:
1. A barebones model, like [`AutoModel`] or [`LlamaModel`], that outputs hidden states.
2. A model with a specific *head* attached, like [`AutoModelForCausalLM`] or [`LlamaForCausalLM`], for performing specific tasks.
For each model type, there is a separate class for each machine learning framework (PyTorch, TensorFlow, Flax). Pick the corresponding prefix for the framework you're using.
<hfoptions id="backend">
<hfoption id="PyTorch">
```py
from transformers import AutoModelForCausalLM, MistralForCausalLM
# load with AutoClass or model-specific class
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", dtype="auto", device_map="auto")
model = MistralForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1", dtype="auto", device_map="auto")
```
</hfoption>
<hfoption id="TensorFlow">
```py
from transformers import TFAutoModelForCausalLM, TFMistralForCausalLM
# load with AutoClass or model-specific class
model = TFAutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = TFMistralForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
```
</hfoption>
<hfoption id="Flax">
```py
from transformers import FlaxAutoModelForCausalLM, FlaxMistralForCausalLM
# load with AutoClass or model-specific class
model = FlaxAutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = FlaxMistralForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
```
</hfoption>
</hfoptions>
## Model classes

@@ -34,9 +34,9 @@ The library was designed with two strong goals in mind:
loads the related class instance and associated data (configurations' hyperparameters, tokenizers' vocabulary,
and models' weights) from a pretrained checkpoint provided on [Hugging Face Hub](https://huggingface.co/models) or your own saved checkpoint.
- On top of those three base classes, the library provides two APIs: [`pipeline`] for quickly
using a model for inference on a given task and [`Trainer`] to quickly train or fine-tune a PyTorch model (all TensorFlow models are compatible with `Keras.fit`).
using a model for inference on a given task and [`Trainer`] to quickly train or fine-tune a PyTorch model.
- As a consequence, this library is NOT a modular toolbox of building blocks for neural nets. If you want to
extend or build upon the library, just use regular Python, PyTorch, TensorFlow, Keras modules and inherit from the base
extend or build upon the library, just use regular Python or PyTorch and inherit from the base
classes of the library to reuse functionalities like model loading and saving. If you'd like to learn more about our coding philosophy for models, check out our [Repeat Yourself](https://huggingface.co/blog/transformers-design-philosophy) blog post.
2. Provide state-of-the-art models with performances as close as possible to the original models:
@@ -44,7 +44,7 @@ The library was designed with two strong goals in mind:
- We provide at least one example for each architecture which reproduces a result provided by the official authors
of said architecture.
- The code is usually as close to the original code base as possible which means some PyTorch code may not be as
*pytorchic* as it could be as a result of being converted TensorFlow code and vice versa.
*pytorchic* as it could be as a result of being converted from other Deep Learning frameworks.
A few other goals:
@@ -58,13 +58,11 @@ A few other goals:
- A simple and consistent way to add new tokens to the vocabulary and embeddings for fine-tuning.
- Simple ways to mask and prune Transformer heads.
- Easily switch between PyTorch, TensorFlow 2.0 and Flax, allowing training with one framework and inference with another.
## Main concepts
The library is built around three types of classes for each model:
- **Model classes** can be PyTorch models ([torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)), Keras models ([tf.keras.Model](https://www.tensorflow.org/api_docs/python/tf/keras/Model)) or JAX/Flax models ([flax.linen.Module](https://flax.readthedocs.io/en/latest/api_reference/flax.linen/module.html)) that work with the pretrained weights provided in the library.
- **Model classes** are PyTorch models ([torch.nn.Module](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)).
- **Configuration classes** store the hyperparameters required to build a model (such as the number of layers and hidden size). You don't always need to instantiate these yourself. In particular, if you are using a pretrained model without any modification, creating the model will automatically take care of instantiating the configuration (which is part of the model).
- **Preprocessing classes** convert the raw data into a format accepted by the model. A [tokenizer](main_classes/tokenizer) stores the vocabulary for each model and provides methods for encoding and decoding strings in a list of token embedding indices to be fed to a model. [Image processors](main_classes/image_processor) preprocess vision inputs, [feature extractors](main_classes/feature_extractor) preprocess audio inputs, and a [processor](main_classes/processors) handles multimodal inputs.
@@ -76,4 +74,3 @@ All these classes can be instantiated from pretrained instances, saved locally,
- `save_pretrained()` lets you save a model, configuration, and preprocessing class locally so that it can be reloaded using
`from_pretrained()`.
- `push_to_hub()` lets you share a model, configuration, and a preprocessing class to the Hub, so it is easily accessible to everyone.

@@ -40,7 +40,7 @@ or for an editable install:
pip install -e .[dev]
```
inside the Transformers repo. Since the number of optional dependencies of Transformers has grown a lot, it's possible you don't manage to get all of them. If the dev install fails, make sure to install the Deep Learning framework you are working with (PyTorch, TensorFlow and/or Flax) then do
inside the Transformers repo. Since the number of optional dependencies of Transformers has grown a lot, you might not manage to install all of them. If the dev install fails, make sure to install PyTorch then do
```bash
pip install transformers[quality]
@@ -55,7 +55,7 @@ pip install -e .[quality]
## Tests
All the jobs that begin with `ci/circleci: run_tests_` run parts of the Transformers testing suite. Each of those jobs focuses on a part of the library in a certain environment: for instance `ci/circleci: run_tests_pipelines_tf` runs the pipelines test in an environment where TensorFlow only is installed.
All the jobs that begin with `ci/circleci: run_tests_` run parts of the Transformers testing suite. Each of those jobs focuses on a part of the library in a certain environment: for instance `ci/circleci: run_tests_pipelines` runs the pipeline tests in an environment where all pipeline-related requirements are installed.
Note that to avoid running tests when there is no real change in the modules they are testing, only part of the test suite is run each time: a utility is run to determine the differences in the library between before and after the PR (what GitHub shows you in the "Files changed" tab) and picks the tests impacted by that diff. That utility can be run locally with:

@@ -16,13 +16,13 @@ rendered properly in your Markdown viewer.
# Training scripts
Transformers provides many example training scripts for deep learning frameworks (PyTorch, TensorFlow, Flax) and tasks in [transformers/examples](https://github.com/huggingface/transformers/tree/main/examples). There are additional scripts in [transformers/research projects](https://github.com/huggingface/transformers-research-projects/) and [transformers/legacy](https://github.com/huggingface/transformers/tree/main/examples/legacy), but these aren't actively maintained and requires a specific version of Transformers.
Transformers provides many example training scripts for PyTorch and tasks in [transformers/examples](https://github.com/huggingface/transformers/tree/main/examples). There are additional scripts in [transformers/research projects](https://github.com/huggingface/transformers-research-projects/) and [transformers/legacy](https://github.com/huggingface/transformers/tree/main/examples/legacy), but these aren't actively maintained and require a specific version of Transformers.
Example scripts are only examples and you may need to adapt the script to your use-case. To help you with this, most scripts are very transparent in how data is preprocessed, allowing you to edit it as necessary.
For any feature you'd like to implement in an example script, please discuss it on the [forum](https://discuss.huggingface.co/) or in an [issue](https://github.com/huggingface/transformers/issues) before submitting a pull request. While we welcome contributions, it is unlikely that a pull request adding more functionality at the cost of readability will be merged.
This guide will show you how to run an example summarization training script in [PyTorch](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization) and [TensorFlow](https://github.com/huggingface/transformers/tree/main/examples/tensorflow/summarization).
This guide will show you how to run an example summarization training script in [PyTorch](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization).
## Setup
@@ -58,10 +58,7 @@ Start with a smaller dataset by including the `max_train_samples`, `max_eval_sam
The example below fine-tunes [T5-small](https://huggingface.co/google-t5/t5-small) on the [CNN/DailyMail](https://huggingface.co/datasets/abisee/cnn_dailymail) dataset. T5 requires an additional `source_prefix` parameter to prompt it to summarize.
<hfoptions id="script">
<hfoption id="PyTorch">
The example script downloads and preprocesses a dataset, and then fine-tunes it with [`Trainer`] with a supported model architecture.
The example script downloads and preprocesses a dataset, and then fine-tunes it with [`Trainer`] with a supported model architecture.
Resuming training from a checkpoint is very useful if training is interrupted because you don't have to start over again. There are two ways to resume training from a checkpoint.
@@ -116,40 +113,6 @@ python xla_spawn.py --num_cores 8 pytorch/summarization/run_summarization.py \
...
```
</hfoption>
<hfoption id="TensorFlow">
```bash
python examples/tensorflow/summarization/run_summarization.py \
--model_name_or_path google-t5/t5-small \
# remove the `max_train_samples`, `max_eval_samples` and `max_predict_samples` if everything works
--max_train_samples 50 \
--max_eval_samples 50 \
--max_predict_samples 50 \
--dataset_name cnn_dailymail \
--dataset_config "3.0.0" \
--output_dir /tmp/tst-summarization \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 16 \
--num_train_epochs 3 \
--do_train \
--do_eval \
```
TensorFlow uses the [MirroredStrategy](https://www.tensorflow.org/guide/distributed_training#mirroredstrategy) for distributed training and doesn't require adding any additional parameters. The script uses multiple GPUs by default if they are available.
For TPU training, TensorFlow scripts use the [TPUStrategy](https://www.tensorflow.org/guide/distributed_training#tpustrategy). Pass the TPU resource name to the `--tpu` parameter.
```bash
python run_summarization.py \
--tpu name_of_tpu_resource \
...
...
```
</hfoption>
</hfoptions>
## Accelerate
[Accelerate](https://huggingface.co/docs/accelerate) is designed to simplify distributed training while offering complete visibility into the PyTorch training loop. If you're planning on training with a script with Accelerate, use the `_no_trainer.py` version of the script.
@@ -160,7 +123,7 @@ Install Accelerate from source to ensure you have the latest version.
pip install git+https://github.com/huggingface/accelerate
```
Run the [accelerate config](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config) command to answer a few questions about your training setup. This creates and saves a config file about your system.
Run the [accelerate config](https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config) command to answer a few questions about your training setup. This creates and saves a config file about your system.
```bash
accelerate config