byebye torch 2.0 (#37277)
* bump Torch 2.1 with broken compatibility `torch.compile`
* dep table
* remove usage of is_torch_greater_or_equal_than_2_1
* remove usage of is_torch_greater_or_equal_than_2_1
* remove if is_torch_greater_or_equal("2.1.0")
* remove torch >= "2.1.0"
* deal with 2.0.0
* PyTorch 2.0+ --> PyTorch 2.1+
* ruff 1
* difficult ruff
* address comment
* address comment
---------
Co-authored-by: Jirka B <j.borovec+github@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
@@ -70,7 +70,7 @@ Explore the [Hub](https://huggingface.com/) today to find a model and use Transf

 ## Installation

-Transformers works with Python 3.9+ [PyTorch](https://pytorch.org/get-started/locally/) 2.0+, [TensorFlow](https://www.tensorflow.org/install/pip) 2.6+, and [Flax](https://flax.readthedocs.io/en/latest/) 0.4.1+.
+Transformers works with Python 3.9+ [PyTorch](https://pytorch.org/get-started/locally/) 2.1+, [TensorFlow](https://www.tensorflow.org/install/pip) 2.6+, and [Flax](https://flax.readthedocs.io/en/latest/) 0.4.1+.

 Create and activate a virtual environment with [venv](https://docs.python.org/3/library/venv.html) or [uv](https://docs.astral.sh/uv/), a fast Rust-based Python package and project manager.

@@ -20,7 +20,7 @@ rendered properly in your Markdown viewer.

 # Installation

-Transformers works with [PyTorch](https://pytorch.org/get-started/locally/), [TensorFlow 2.0](https://www.tensorflow.org/install/pip), and [Flax](https://flax.readthedocs.io/en/latest/). It has been tested on Python 3.9+, PyTorch 2.0+, TensorFlow 2.6+, and Flax 0.4.1+.
+Transformers works with [PyTorch](https://pytorch.org/get-started/locally/), [TensorFlow 2.0](https://www.tensorflow.org/install/pip), and [Flax](https://flax.readthedocs.io/en/latest/). It has been tested on Python 3.9+, PyTorch 2.1+, TensorFlow 2.6+, and Flax 0.4.1+.

 ## Virtual environment

@@ -245,7 +245,7 @@ limitations under the License.

 ### باستخدام pip

-تم اختبار هذا المستودع على Python 3.9+، Flax 0.4.1+، PyTorch 2.0+، و TensorFlow 2.6+.
+تم اختبار هذا المستودع على Python 3.9+، Flax 0.4.1+، PyTorch 2.1+، و TensorFlow 2.6+.

 يجب تثبيت 🤗 Transformers في [بيئة افتراضية](https://docs.python.org/3/library/venv.html). إذا كنت غير معتاد على البيئات الافتراضية Python، فراجع [دليل المستخدم](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -246,7 +246,7 @@ Das Modell selbst ist ein reguläres [PyTorch `nn.Module`](https://pytorch.org/d

 ### Mit pip

-Dieses Repository wurde mit Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ und TensorFlow 2.6+ getestet.
+Dieses Repository wurde mit Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ und TensorFlow 2.6+ getestet.

 Sie sollten 🤗 Transformers in einer [virtuellen Umgebung](https://docs.python.org/3/library/venv.html) installieren. Wenn Sie mit virtuellen Python-Umgebungen nicht vertraut sind, schauen Sie sich den [Benutzerleitfaden](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) an.

@@ -222,7 +222,7 @@ El modelo en si es un [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.h

 ### Con pip

-Este repositorio está probado en Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ y TensorFlow 2.6+.
+Este repositorio está probado en Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ y TensorFlow 2.6+.

 Deberías instalar 🤗 Transformers en un [entorno virtual](https://docs.python.org/3/library/venv.html). Si no estas familiarizado con los entornos virtuales de Python, consulta la [guía de usuario](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -243,7 +243,7 @@ Le modèle lui-même est un module [`nn.Module` PyTorch](https://pytorch.org/doc

 ### Avec pip

-Ce référentiel est testé sur Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ et TensorFlow 2.6+.
+Ce référentiel est testé sur Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ et TensorFlow 2.6+.

 Vous devriez installer 🤗 Transformers dans un [environnement virtuel](https://docs.python.org/3/library/venv.html). Si vous n'êtes pas familier avec les environnements virtuels Python, consultez le [guide utilisateur](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -198,7 +198,7 @@ checkpoint: जाँच बिंदु

 ### पिप का उपयोग करना

-इस रिपॉजिटरी का परीक्षण Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ और TensorFlow 2.6+ के तहत किया गया है।
+इस रिपॉजिटरी का परीक्षण Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ और TensorFlow 2.6+ के तहत किया गया है।

 आप [वर्चुअल एनवायरनमेंट](https://docs.python.org/3/library/venv.html) में 🤗 ट्रांसफॉर्मर इंस्टॉल कर सकते हैं। यदि आप अभी तक पायथन के वर्चुअल एनवायरनमेंट से परिचित नहीं हैं, तो कृपया इसे [उपयोगकर्ता निर्देश](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) पढ़ें।

@@ -256,7 +256,7 @@ Hugging Faceチームによって作られた **[トランスフォーマーを

 ### pipにて

-このリポジトリは、Python 3.9+, Flax 0.4.1+, PyTorch 2.0+, TensorFlow 2.6+ でテストされています。
+このリポジトリは、Python 3.9+, Flax 0.4.1+, PyTorch 2.1+, TensorFlow 2.6+ でテストされています。

 🤗Transformersは[仮想環境](https://docs.python.org/3/library/venv.html)にインストールする必要があります。Pythonの仮想環境に慣れていない場合は、[ユーザーガイド](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)を確認してください。

@@ -242,7 +242,7 @@ Transformers에 달린 100,000개의 별을 축하하기 위해, 우리는 커

 ### pip로 설치하기

-이 저장소는 Python 3.9+, Flax 0.4.1+, PyTorch 2.0+, TensorFlow 2.6+에서 테스트 되었습니다.
+이 저장소는 Python 3.9+, Flax 0.4.1+, PyTorch 2.1+, TensorFlow 2.6+에서 테스트 되었습니다.

 [가상 환경](https://docs.python.org/3/library/venv.html)에 🤗 Transformers를 설치하세요. Python 가상 환경에 익숙하지 않다면, [사용자 가이드](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)를 확인하세요.

@@ -253,7 +253,7 @@ O modelo em si é um [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.ht

 ### Com pip

-Este repositório é testado no Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ e TensorFlow 2.6+.
+Este repositório é testado no Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ e TensorFlow 2.6+.

 Você deve instalar o 🤗 Transformers em um [ambiente virtual](https://docs.python.org/3/library/venv.html). Se você não está familiarizado com ambientes virtuais em Python, confira o [guia do usuário](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -244,7 +244,7 @@ Hugging Face Hub. Мы хотим, чтобы Transformers позволил ра

 ### С помощью pip

-Данный репозиторий протестирован на Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ и TensorFlow 2.6+.
+Данный репозиторий протестирован на Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ и TensorFlow 2.6+.

 Устанавливать 🤗 Transformers следует в [виртуальной среде](https://docs.python.org/3/library/venv.html). Если вы не знакомы с виртуальными средами Python, ознакомьтесь с [руководством пользователя](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -246,7 +246,7 @@ limitations under the License.

 ### పిప్ తో

-ఈ రిపోజిటరీ పైథాన్ 3.9+, ఫ్లాక్స్ 0.4.1+, PyTorch 2.0+ మరియు TensorFlow 2.6+లో పరీక్షించబడింది.
+ఈ రిపోజిటరీ పైథాన్ 3.9+, ఫ్లాక్స్ 0.4.1+, PyTorch 2.1+ మరియు TensorFlow 2.6+లో పరీక్షించబడింది.

 మీరు [వర్చువల్ వాతావరణం](https://docs.python.org/3/library/venv.html)లో 🤗 ట్రాన్స్ఫార్మర్లను ఇన్స్టాల్ చేయాలి. మీకు పైథాన్ వర్చువల్ పరిసరాల గురించి తెలియకుంటే, [యూజర్ గైడ్](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) చూడండి.

@@ -259,7 +259,7 @@ limitations under the License.

 #### ‏ pip کے ساتھ

-یہ ریپوزٹری Python 3.9+، Flax 0.4.1+، PyTorch 2.0+، اور TensorFlow 2.6+ پر ٹیسٹ کی گئی ہے۔
+یہ ریپوزٹری Python 3.9+، Flax 0.4.1+، PyTorch 2.1+، اور TensorFlow 2.6+ پر ٹیسٹ کی گئی ہے۔

 آپ کو 🤗 Transformers کو ایک [ورچوئل ماحول](https://docs.python.org/3/library/venv.html) میں انسٹال کرنا چاہیے۔ اگر آپ Python ورچوئل ماحول سے واقف نہیں ہیں، تو [یوزر گائیڈ](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) دیکھیں۔

@@ -245,7 +245,7 @@ Chính mô hình là một [Pytorch `nn.Module`](https://pytorch.org/docs/stable

 ### Sử dụng pip

-Thư viện này được kiểm tra trên Python 3.9+, Flax 0.4.1+, PyTorch 2.0+ và TensorFlow 2.6+.
+Thư viện này được kiểm tra trên Python 3.9+, Flax 0.4.1+, PyTorch 2.1+ và TensorFlow 2.6+.

 Bạn nên cài đặt 🤗 Transformers trong một [môi trường ảo Python](https://docs.python.org/3/library/venv.html). Nếu bạn chưa quen với môi trường ảo Python, hãy xem [hướng dẫn sử dụng](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).

@@ -198,7 +198,7 @@ checkpoint: 检查点

 ### 使用 pip

-这个仓库已在 Python 3.9+、Flax 0.4.1+、PyTorch 2.0+ 和 TensorFlow 2.6+ 下经过测试。
+这个仓库已在 Python 3.9+、Flax 0.4.1+、PyTorch 2.1+ 和 TensorFlow 2.6+ 下经过测试。

 你可以在[虚拟环境](https://docs.python.org/3/library/venv.html)中安装 🤗 Transformers。如果你还不熟悉 Python 的虚拟环境,请阅此[用户说明](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)。

@@ -210,7 +210,7 @@ Tokenizer 為所有的預訓練模型提供了預處理,並可以直接轉換

 ### 使用 pip

-這個 Repository 已在 Python 3.9+、Flax 0.4.1+、PyTorch 2.0+ 和 TensorFlow 2.6+ 下經過測試。
+這個 Repository 已在 Python 3.9+、Flax 0.4.1+、PyTorch 2.1+ 和 TensorFlow 2.6+ 下經過測試。

 你可以在[虛擬環境](https://docs.python.org/3/library/venv.html)中安裝 🤗 Transformers。如果你還不熟悉 Python 的虛擬環境,請閱此[使用者指引](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)。

setup.py
@@ -187,7 +187,7 @@ _deps = [
     "tiktoken",
     "timm<=1.0.11",
     "tokenizers>=0.21,<0.22",
-    "torch>=2.0",
+    "torch>=2.1",
     "torchaudio",
     "torchvision",
     "pyctcdecode>=0.4.0",
@@ -92,7 +92,7 @@ deps = {
     "tiktoken": "tiktoken",
     "timm": "timm<=1.0.11",
     "tokenizers": "tokenizers>=0.21,<0.22",
-    "torch": "torch>=2.0",
+    "torch": "torch>=2.1",
     "torchaudio": "torchaudio",
     "torchvision": "torchvision",
     "pyctcdecode": "pyctcdecode>=0.4.0",
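The two hunks above raise the declared floor from `torch>=2.0` to `torch>=2.1`. For downstream code that wants to check an installed version string against that floor without extra dependencies, a minimal sketch could look like the following. The helper names `release_tuple` and `meets_floor` are hypothetical and not part of Transformers; the sketch compares only the leading numeric release components, so it ignores local suffixes such as `+cu121` and treats pre-releases like the final release. `packaging.version`, which the removed code in this commit used, is the more robust choice when it is available.

```python
import re


def release_tuple(version_string: str) -> tuple:
    """Extract the leading numeric release, e.g. '2.1.0+cu121' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no numeric release in {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_floor(installed: str, floor: str = "2.1") -> bool:
    """Return True when `installed` is at least `floor` (hypothetical helper)."""
    have, need = release_tuple(installed), release_tuple(floor)
    width = max(len(have), len(need))
    # Pad with zeros so that "2.1" and "2.1.0" compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

Comparing integer tuples rather than raw strings matters here: as strings, `"2.10"` would sort before `"2.2"`.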
@@ -485,20 +485,15 @@ str_to_torch_dtype = {
     "F64": torch.float64,
     "I64": torch.int64,
     "F8_E4M3": torch.float8_e4m3fn,
+    "F8_E5M2": torch.float8_e5m2,
 }

-if is_torch_greater_or_equal("2.1.0"):
-    str_to_torch_dtype["F8_E4M3"] = torch.float8_e4m3fn

 if is_torch_greater_or_equal("2.3.0"):
     str_to_torch_dtype["U16"] = torch.uint16
     str_to_torch_dtype["U32"] = torch.uint32
     str_to_torch_dtype["U64"] = torch.uint64

-if is_torch_greater_or_equal("2.1.0"):
-    str_to_torch_dtype["F8_E4M3"] = torch.float8_e4m3fn
-    str_to_torch_dtype["F8_E5M2"] = torch.float8_e5m2
-

 def load_state_dict(
     checkpoint_file: Union[str, os.PathLike],
@@ -546,12 +541,7 @@ def load_state_dict(
                 map_location = "cpu"
         extra_args = {}
         # mmap can only be used with files serialized with zipfile-based format.
-        if (
-            isinstance(checkpoint_file, str)
-            and map_location != "meta"
-            and version.parse(torch.__version__) >= version.parse("2.1.0")
-            and is_zipfile(checkpoint_file)
-        ):
+        if isinstance(checkpoint_file, str) and map_location != "meta" and is_zipfile(checkpoint_file):
             extra_args = {"mmap": True}
         return torch.load(
             checkpoint_file,
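With the version test gone, the mmap decision in the hunk above depends on three things only: the checkpoint is given as a `str` path, the target is not the `"meta"` device, and the file is a zip archive. A minimal sketch of that decision in isolation, assuming the `is_zipfile` in the diff is the standard library's `zipfile.is_zipfile`; the helper name `torch_load_extra_args` is hypothetical and the sketch does not call `torch.load`:

```python
import zipfile


def torch_load_extra_args(checkpoint_file, map_location="cpu"):
    """Return the extra keyword arguments chosen by the simplified condition."""
    extra_args = {}
    # mmap only applies to string paths that point at a zip archive and are
    # not being loaded onto the "meta" device.
    if (
        isinstance(checkpoint_file, str)
        and map_location != "meta"
        and zipfile.is_zipfile(checkpoint_file)
    ):
        extra_args = {"mmap": True}
    return extra_args
```

Note that a `pathlib.Path` argument fails the `isinstance(..., str)` test, so under this condition it never takes the mmap path even when the file is a valid zip archive.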
@@ -34,10 +34,8 @@ from ...file_utils import (
 )
 from ...modeling_outputs import BaseModelOutput, BaseModelOutputWithCrossAttentions
 from ...modeling_utils import PreTrainedModel
-from ...pytorch_utils import is_torch_greater_or_equal_than_2_1
 from ...utils import is_accelerate_available, logging
 from ...utils.backbone_utils import load_backbone
-from ...utils.import_utils import is_torchdynamo_compiling
 from .configuration_mask2former import Mask2FormerConfig


@@ -2018,17 +2016,7 @@ class Mask2FormerMaskPredictor(nn.Module):
     ):
         mask_embeddings = self.mask_embedder(outputs.transpose(0, 1))

-        is_tracing = torch.jit.is_tracing() or isinstance(outputs, torch.fx.Proxy) or is_torchdynamo_compiling()
         # Sum up over the channels
-        if is_tracing and not is_torch_greater_or_equal_than_2_1:
-            # Equivalent to einsum('bqc, bchw -> bqhw') but jit friendly
-            batch_size, num_queries, num_channels = mask_embeddings.shape
-            _, _, height, width = pixel_embeddings.shape
-            outputs_mask = torch.zeros((batch_size, num_queries, height, width), device=mask_embeddings.device)
-            for c in range(num_channels):
-                outputs_mask += mask_embeddings[..., c][..., None, None] * pixel_embeddings[:, None, c]
-
-        else:
         outputs_mask = torch.einsum("bqc, bchw -> bqhw", mask_embeddings, pixel_embeddings)

         attention_mask = nn.functional.interpolate(
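The removed branch was described in its own comment as equivalent to `einsum('bqc, bchw -> bqhw')`: it accumulated the output one channel at a time instead of contracting over `c` in a single call. The sketch below illustrates that equivalence with plain nested Python lists standing in for tensors, so it runs without PyTorch. Both function names are hypothetical, and this is an illustration of the arithmetic, not the library's implementation.

```python
def contract_einsum(mask_embeddings, pixel_embeddings):
    """out[b][q][h][w] = sum over c of E[b][q][c] * P[b][c][h][w]."""
    out = []
    for emb_b, pix_b in zip(mask_embeddings, pixel_embeddings):
        height, width = len(pix_b[0]), len(pix_b[0][0])
        out.append(
            [
                [
                    [sum(e * pix_b[c][h][w] for c, e in enumerate(emb_q)) for w in range(width)]
                    for h in range(height)
                ]
                for emb_q in emb_b
            ]
        )
    return out


def contract_channel_loop(mask_embeddings, pixel_embeddings):
    """Same result, accumulated channel by channel like the removed branch."""
    batch, queries = len(mask_embeddings), len(mask_embeddings[0])
    channels = len(mask_embeddings[0][0])
    height, width = len(pixel_embeddings[0][0]), len(pixel_embeddings[0][0][0])
    # Start from zeros of shape (batch, queries, height, width).
    out = [[[[0] * width for _ in range(height)] for _ in range(queries)] for _ in range(batch)]
    for c in range(channels):
        for b in range(batch):
            for q in range(queries):
                for h in range(height):
                    for w in range(width):
                        out[b][q][h][w] += mask_embeddings[b][q][c] * pixel_embeddings[b][c][h][w]
    return out
```

Since both forms compute the same sum, dropping the loop changes only how the contraction is expressed, which is why the hunk can keep the single `torch.einsum` call unconditionally.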
@@ -27,7 +27,6 @@ from ...activations import ACT2FN
 from ...modeling_attn_mask_utils import _prepare_4d_attention_mask
 from ...modeling_outputs import BaseModelOutputWithCrossAttentions
 from ...modeling_utils import PreTrainedModel
-from ...pytorch_utils import is_torch_greater_or_equal_than_2_1
 from ...utils import (
     ModelOutput,
     add_start_docstrings,
@@ -39,7 +38,6 @@ from ...utils import (
     requires_backends,
 )
 from ...utils.backbone_utils import load_backbone
-from ...utils.import_utils import is_torchdynamo_compiling
 from ..detr import DetrConfig
 from .configuration_maskformer import MaskFormerConfig
 from .configuration_maskformer_swin import MaskFormerSwinConfig
@@ -1685,7 +1683,6 @@ class MaskFormerForInstanceSegmentation(MaskFormerPreTrainedModel):
         # get the auxiliary predictions (one for each decoder's layer)
         auxiliary_logits: List[str, Tensor] = []

-        is_tracing = torch.jit.is_tracing() or isinstance(outputs, torch.fx.Proxy) or is_torchdynamo_compiling()
         # This code is a little bit cumbersome, an improvement can be to return a list of predictions. If we have auxiliary loss then we are going to return more than one element in the list
         if self.config.use_auxiliary_loss:
             stacked_transformer_decoder_outputs = torch.stack(outputs.transformer_decoder_hidden_states)
@@ -1693,17 +1690,6 @@ class MaskFormerForInstanceSegmentation(MaskFormerPreTrainedModel):
             class_queries_logits = classes[-1]
             # get the masks
             mask_embeddings = self.mask_embedder(stacked_transformer_decoder_outputs)
-
-            if is_tracing and not is_torch_greater_or_equal_than_2_1:
-                # Equivalent to einsum('lbqc, bchw -> lbqhw') but jit friendly
-                num_embeddings, batch_size, num_queries, num_channels = mask_embeddings.shape
-                _, _, height, width = pixel_embeddings.shape
-                binaries_masks = torch.zeros(
-                    (num_embeddings, batch_size, num_queries, height, width), device=mask_embeddings.device
-                )
-                for c in range(num_channels):
-                    binaries_masks += mask_embeddings[..., c][..., None, None] * pixel_embeddings[None, :, None, c]
-            else:
             binaries_masks = torch.einsum("lbqc, bchw -> lbqhw", mask_embeddings, pixel_embeddings)

             masks_queries_logits = binaries_masks[-1]
@@ -1720,17 +1706,6 @@ class MaskFormerForInstanceSegmentation(MaskFormerPreTrainedModel):
             # get the masks
             mask_embeddings = self.mask_embedder(transformer_decoder_hidden_states)
             # sum up over the channels
-
-            if is_tracing and not is_torch_greater_or_equal_than_2_1:
-                # Equivalent to einsum('bqc, bchw -> bqhw') but jit friendly
-                batch_size, num_queries, num_channels = mask_embeddings.shape
-                _, _, height, width = pixel_embeddings.shape
-                masks_queries_logits = torch.zeros(
-                    (batch_size, num_queries, height, width), device=mask_embeddings.device
-                )
-                for c in range(num_channels):
-                    masks_queries_logits += mask_embeddings[..., c][..., None, None] * pixel_embeddings[:, None, c]
-            else:
             masks_queries_logits = torch.einsum("bqc, bchw -> bqhw", mask_embeddings, pixel_embeddings)

         return class_queries_logits, masks_queries_logits, auxiliary_logits
@@ -32,9 +32,9 @@ is_torch_greater_or_equal_than_2_6 = is_torch_greater_or_equal("2.6", accept_dev
 is_torch_greater_or_equal_than_2_4 = is_torch_greater_or_equal("2.4", accept_dev=True)
 is_torch_greater_or_equal_than_2_3 = is_torch_greater_or_equal("2.3", accept_dev=True)
 is_torch_greater_or_equal_than_2_2 = is_torch_greater_or_equal("2.2", accept_dev=True)
-is_torch_greater_or_equal_than_2_1 = is_torch_greater_or_equal("2.1", accept_dev=True)

 # For backwards compatibility (e.g. some remote codes on Hub using those variables).
+is_torch_greater_or_equal_than_2_1 = is_torch_greater_or_equal("2.1", accept_dev=True)
 is_torch_greater_or_equal_than_2_0 = is_torch_greater_or_equal("2.0", accept_dev=True)
 is_torch_greater_or_equal_than_1_13 = is_torch_greater_or_equal("1.13", accept_dev=True)
 is_torch_greater_or_equal_than_1_12 = is_torch_greater_or_equal("1.12", accept_dev=True)
@@ -11,11 +11,8 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-import importlib
 from typing import TYPE_CHECKING, Any, Dict, List, Optional

-from packaging import version
-
 from .base import HfQuantizer


@@ -48,9 +45,9 @@ class FbgemmFp8HfQuantizer(HfQuantizer):
         self.quantization_config = quantization_config

     def validate_environment(self, *args, **kwargs):
-        if not is_torch_available() or version.parse(importlib.metadata.version("torch")) < version.parse("2.1.0"):
+        if not is_torch_available():
             raise ImportError(
-                "Using fbgemm fp8 quantization requires torch > 2.1.0"
+                "Using fbgemm fp8 quantization requires torch >= 2.1.0"
                 "Please install the latest version of torch ( pip install --upgrade torch )"
             )
         if not is_fbgemm_gpu_available():
@@ -1,8 +1,5 @@
-import importlib
 from typing import TYPE_CHECKING, Any, Dict, List, Optional

-from packaging import version
-
 from ..utils import is_accelerate_available, is_torch_available, logging
 from .base import HfQuantizer
 from .quantizers_utils import get_module_from_name
@@ -32,7 +29,7 @@ class FineGrainedFP8HfQuantizer(HfQuantizer):
         self.quantization_config = quantization_config

     def validate_environment(self, *args, **kwargs):
-        if not is_torch_available() or version.parse(importlib.metadata.version("torch")) < version.parse("2.1.0"):
+        if not is_torch_available():
             raise ImportError(
                 "Using fp8 quantization requires torch >= 2.1.0"
                 "Please install the latest version of torch ( pip install --upgrade torch )"
@@ -1902,7 +1902,6 @@ class Trainer:
                     jit_model.forward = original_forward
                 autocast_handler = AutocastKwargs(cache_enabled=False)
                 with self.accelerator.autocast(autocast_handler=autocast_handler), torch.no_grad():
-                    if version.parse(version.parse(torch.__version__).base_version) >= version.parse("2.0.0"):
                     if isinstance(example_batch, dict):
                         jit_model = torch.jit.trace(jit_model, example_kwarg_inputs=example_batch, strict=False)
                     else:
@@ -1911,13 +1910,6 @@ class Trainer:
                             example_kwarg_inputs={key: example_batch[key] for key in example_batch},
                             strict=False,
                         )
-                    else:
-                        jit_inputs = []
-                        for key in example_batch:
-                            example_tensor = torch.ones_like(example_batch[key])
-                            jit_inputs.append(example_tensor)
-                        jit_inputs = tuple(jit_inputs)
-                        jit_model = torch.jit.trace(jit_model, jit_inputs, strict=False)
                 jit_model = torch.jit.freeze(jit_model)
                 with torch.no_grad():
                     jit_model(**example_batch)
@@ -24,7 +24,6 @@ from pathlib import Path
 from typing import Any, Optional, Union

 from huggingface_hub import get_full_repo_name
-from packaging import version

 from .debug_utils import DebugOption
 from .trainer_utils import (
@@ -1290,7 +1289,7 @@ class TrainingArguments:

     default_optim = "adamw_torch"
     # XXX: enable when pytorch==2.0.1 comes out - we want to give it time to get all the bugs sorted out
-    # if is_torch_available() and version.parse(version.parse(torch.__version__).base_version) >= version.parse("2.1.0"):
+    # if is_torch_available():
     # default_optim = "adamw_torch_fused"
     # and update the doc above to:
     # optim (`str` or [`training_args.OptimizerNames`], *optional*, defaults to `"adamw_torch_fused"` (for torch<2.1.0 `"adamw_torch"`):
@@ -1732,12 +1731,6 @@ class TrainingArguments:
|
|||||||
FutureWarning,
|
FutureWarning,
|
||||||
)
|
)
|
||||||
self.optim = OptimizerNames.ADAFACTOR
|
self.optim = OptimizerNames.ADAFACTOR
|
||||||
if self.optim == OptimizerNames.ADAMW_TORCH_FUSED and is_torch_available():
|
|
||||||
if version.parse(version.parse(torch.__version__).base_version) < version.parse("2.0.0"):
|
|
||||||
raise ValueError("--optim adamw_torch_fused requires PyTorch 2.0 or higher")
|
|
||||||
# there is a bug in fp16/AMP in pt-2.0.0
|
|
||||||
if version.parse(version.parse(torch.__version__).base_version) == version.parse("2.0.0") and self.fp16:
|
|
||||||
raise ValueError("--optim adamw_torch_fused with --fp16 requires PyTorch>2.0")
|
|
||||||
|
|
||||||
# We need to setup the accelerator config here *before* the first call to `self.device`
|
# We need to setup the accelerator config here *before* the first call to `self.device`
|
||||||
if is_accelerate_available():
|
if is_accelerate_available():
|
||||||
|
|||||||
@@ -379,15 +379,12 @@ def is_torch_sdpa_available():
|
|||||||
elif _torch_version == "N/A":
|
elif _torch_version == "N/A":
|
||||||
return False
|
return False
|
||||||
|
|
||||||
# NOTE: We require torch>=2.1 (and not torch>=2.0) to use SDPA in Transformers for two reasons:
|
|
||||||
# - Allow the global use of the `scale` argument introduced in https://github.com/pytorch/pytorch/pull/95259
|
|
||||||
# - Memory-efficient attention supports arbitrary attention_mask: https://github.com/pytorch/pytorch/pull/104310
|
|
||||||
# NOTE: MLU is OK with non-contiguous inputs.
|
# NOTE: MLU is OK with non-contiguous inputs.
|
||||||
if is_torch_mlu_available():
|
if is_torch_mlu_available():
|
||||||
return version.parse(_torch_version) >= version.parse("2.1.0")
|
return True
|
||||||
# NOTE: NPU can use SDPA in Transformers with torch>=2.1.0.
|
# NOTE: NPU can use SDPA in Transformers with torch>=2.1.0.
|
||||||
if is_torch_npu_available():
|
if is_torch_npu_available():
|
||||||
return version.parse(_torch_version) >= version.parse("2.1.0")
|
return True
|
||||||
# NOTE: We require torch>=2.1.1 to avoid a numerical issue in SDPA with non-contiguous inputs: https://github.com/pytorch/pytorch/issues/112577
|
# NOTE: We require torch>=2.1.1 to avoid a numerical issue in SDPA with non-contiguous inputs: https://github.com/pytorch/pytorch/issues/112577
|
||||||
return version.parse(_torch_version) >= version.parse("2.1.1")
|
return version.parse(_torch_version) >= version.parse("2.1.1")
|
||||||
|
|
||||||
@@ -833,7 +830,7 @@ def is_torchdynamo_available():
|
|||||||
if not is_torch_available():
|
if not is_torch_available():
|
||||||
return False
|
return False
|
||||||
|
|
||||||
return version.parse(_torch_version) >= version.parse("2.0.0")
|
return True
|
||||||
|
|
||||||
|
|
||||||
def is_torch_compile_available():
|
def is_torch_compile_available():
|
||||||
|
|||||||
@@ -47,10 +47,7 @@ from transformers.utils import (
|
|||||||
|
|
||||||
|
|
||||||
if is_torch_available():
|
if is_torch_available():
|
||||||
from transformers.pytorch_utils import is_torch_greater_or_equal_than_2_1
|
|
||||||
from transformers.trainer import FSDP_MODEL_NAME
|
from transformers.trainer import FSDP_MODEL_NAME
|
||||||
else:
|
|
||||||
is_torch_greater_or_equal_than_2_1 = False
|
|
||||||
|
|
||||||
# default torch.distributed port
|
# default torch.distributed port
|
||||||
DEFAULT_MASTER_PORT = "10999"
|
DEFAULT_MASTER_PORT = "10999"
|
||||||
@@ -260,7 +257,6 @@ class TrainerIntegrationFSDP(TestCasePlus, TrainerIntegrationCommon):
|
|||||||
@require_torch_multi_accelerator
|
@require_torch_multi_accelerator
|
||||||
@run_first
|
@run_first
|
||||||
@slow
|
@slow
|
||||||
@unittest.skipIf(not is_torch_greater_or_equal_than_2_1, reason="This test on pytorch 2.0 takes 4 hours.")
|
|
||||||
def test_basic_run_with_cpu_offload(self, dtype):
|
def test_basic_run_with_cpu_offload(self, dtype):
|
||||||
launcher = get_launcher(distributed=True, use_accelerate=False)
|
launcher = get_launcher(distributed=True, use_accelerate=False)
|
||||||
output_dir = self.get_auto_remove_tmp_dir()
|
output_dir = self.get_auto_remove_tmp_dir()
|
||||||
|
|||||||