Add dates to the model docs (#39320)
* added dates to the models with a single hf papers link
* added the dates for models with multiple papers
* half of no_papers models done
* rest of no_papers models also done, only the exceptions left
* added copyright disclaimer to sam_hw, cohere, cohere2 + dates
* some more fixes, hf links + typo
* some new models + a rough script
* the script looks robust, changed all paper links to hf
* minor change to handle technical reports along with blogs
* ran make fixup to remove the white space
* refactor
1
Makefile
@@ -52,6 +52,7 @@ repo-consistency:
 python utils/check_doctest_list.py
 python utils/update_metadata.py --check-only
 python utils/check_docstrings.py
+python utils/add_dates.py
 
 # this target runs checks on all files
 
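The new `repo-consistency` step runs `utils/add_dates.py`, whose source is not part of this excerpt. Judging only from the documentation hunks in this commit, its visible effect is one italic date line placed directly after the `-->` that closes the license comment. A minimal sketch of that insertion follows. The names `insert_date_line` and `DATE_TEMPLATE` are hypothetical, and the fallback for files without a license comment is an assumption; this is not the script's actual logic.

```python
DATE_TEMPLATE = (
    "*This model was released on {released} and added to "
    "Hugging Face Transformers on {added}.*"
)


def insert_date_line(markdown: str, released: str, added: str) -> str:
    """Return `markdown` with a date line placed after the license comment.

    Hypothetical helper: it mirrors the pattern visible in the diff, not the
    real utils/add_dates.py.
    """
    lines = markdown.split("\n")
    date_line = DATE_TEMPLATE.format(released=released, added=added)
    if date_line in lines:
        # Already present: leave the document untouched (idempotent).
        return markdown
    for index, line in enumerate(lines):
        if line.strip() == "-->":
            # Insert directly after the first line that closes a comment.
            lines.insert(index + 1, date_line)
            return "\n".join(lines)
    # Assumed fallback: no license comment found, so prepend the date line.
    return "\n".join([date_line, ""] + lines)
```

Running the helper twice with the same dates leaves the text unchanged, which is the property a consistency check in a Makefile target would most likely rely on.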

@@ -13,12 +13,13 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2024-11-21 and added to Hugging Face Transformers on 2025-07-08.*
 
 # AIMv2
 
 ## Overview
 
-The AIMv2 model was proposed in [Multimodal Autoregressive Pre-training of Large Vision Encoders](https://arxiv.org/abs/2411.14402) by Enrico Fini, Mustafa Shukor, Xiujun Li, Philipp Dufter, Michal Klein, David Haldimann, Sai Aitharaju, Victor Guilherme Turrisi da Costa, Louis Béthune, Zhe Gan, Alexander T Toshev, Marcin Eichner, Moin Nabi, Yinfei Yang, Joshua M. Susskind, Alaaeldin El-Nouby.
+The AIMv2 model was proposed in [Multimodal Autoregressive Pre-training of Large Vision Encoders](https://huggingface.co/papers/2411.14402) by Enrico Fini, Mustafa Shukor, Xiujun Li, Philipp Dufter, Michal Klein, David Haldimann, Sai Aitharaju, Victor Guilherme Turrisi da Costa, Louis Béthune, Zhe Gan, Alexander T Toshev, Marcin Eichner, Moin Nabi, Yinfei Yang, Joshua M. Susskind, Alaaeldin El-Nouby.
 
 The abstract from the paper is the following:
 
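The AIMv2 hunk shows the link rewrite mentioned in the commit message: `https://arxiv.org/abs/2411.14402` becomes `https://huggingface.co/papers/2411.14402`, keeping the paper identifier. A rough sketch of that substitution is below. The function name and the regular expression are illustrative assumptions, as is the choice to drop version suffixes and handle `/pdf/` links; the commit's own script may do this differently.

```python
import re

# Matches arXiv abstract or PDF links and captures the paper identifier.
ARXIV_LINK = re.compile(
    r"https?://arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?(?:\.pdf)?"
)


def to_hf_paper_links(text: str) -> str:
    """Rewrite arXiv links as https://huggingface.co/papers/<id> links."""
    return ARXIV_LINK.sub(r"https://huggingface.co/papers/\1", text)
```

Links that are not arXiv URLs, such as blog posts or GitHub repositories, pass through unchanged.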

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-09-26 and added to Hugging Face Transformers on 2020-11-16.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-02-11 and added to Hugging Face Transformers on 2023-03-01.*
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">
 <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-11-12 and added to Hugging Face Transformers on 2023-01-04.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2025-06-18 and added to Hugging Face Transformers on 2025-06-24.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">
@@ -24,7 +25,7 @@ rendered properly in your Markdown viewer.
 
 # Arcee
 
-Arcee is a decoder-only transformer model based on the Llama architecture with a key modification: it uses ReLU² (ReLU-squared) activation in the MLP blocks instead of SiLU, following recent research showing improved training efficiency with squared activations. This architecture is designed for efficient training and inference while maintaining the proven stability of the Llama design.
+[Arcee](https://www.arcee.ai/blog/deep-dive-afm-4-5b-the-first-arcee-foundational-model) is a decoder-only transformer model based on the Llama architecture with a key modification: it uses ReLU² (ReLU-squared) activation in the MLP blocks instead of SiLU, following recent research showing improved training efficiency with squared activations. This architecture is designed for efficient training and inference while maintaining the proven stability of the Llama design.
 
 The Arcee model is architecturally similar to Llama but uses `x * relu(x)` in MLP layers for improved gradient flow and is optimized for efficiency in both training and inference scenarios.
 
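The Arcee text names the activation twice, once as ReLU² and once as `x * relu(x)`. The two agree because `relu(x)` equals `x` for positive inputs and `0` otherwise. The scalar sketch below checks that identity and includes SiLU for contrast. It is plain Python for illustration only; the function names are made up here, and the model itself would apply the activation elementwise to tensors.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit on a single number."""
    return x if x > 0.0 else 0.0


def relu_squared(x: float) -> float:
    """ReLU² written as the square of relu(x)."""
    return relu(x) ** 2


def x_times_relu(x: float) -> float:
    """ReLU² written as x * relu(x), the form quoted in the Arcee text."""
    return x * relu(x)


def silu(x: float) -> float:
    """SiLU, x * sigmoid(x), shown only for comparison."""
    return x / (1.0 + math.exp(-x))
```

Unlike SiLU, which dips slightly below zero for negative inputs, ReLU² is exactly zero for every non-positive input.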

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2024-10-08 and added to Hugging Face Transformers on 2024-12-06.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-04-05 and added to Hugging Face Transformers on 2022-11-21.*
 
 # Audio Spectrogram Transformer
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-06-24 and added to Hugging Face Transformers on 2023-05-30.*
 
 # Autoformer
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2025-05-13 and added to Hugging Face Transformers on 2025-03-04.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2024-12-18 and added to Hugging Face Transformers on 2024-12-19.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -9,6 +9,7 @@ Unless required by applicable law or agreed to in writing, software distributed
 an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
 specific language governing permissions and limitations under the License.
 -->
+*This model was released on 2023-04-09 and added to Hugging Face Transformers on 2023-07-17.*
 
 # Bark
 
@@ -19,7 +20,7 @@ specific language governing permissions and limitations under the License.
 
 ## Overview
 
-Bark is a transformer-based text-to-speech model proposed by Suno AI in [suno-ai/bark](https://github.com/suno-ai/bark).
+[Bark](https://huggingface.co/suno/bark) is a transformer-based text-to-speech model proposed by Suno AI in [suno-ai/bark](https://github.com/suno-ai/bark).
 
 Bark is made of 4 main models:
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-10-29 and added to Hugging Face Transformers on 2020-11-16.*
 
 
 <div style="float: right;">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-10-23 and added to Hugging Face Transformers on 2020-11-27.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-09-20 and added to Hugging Face Transformers on 2021-10-18.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-06-15 and added to Hugging Face Transformers on 2021-08-04.*
 
 # BEiT
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-07-29 and added to Hugging Face Transformers on 2020-11-16.*
 
 # BertGeneration
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-03-24 and added to Hugging Face Transformers on 2020-11-16.*
 
 # BertJapanese
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2018-10-11 and added to Hugging Face Transformers on 2020-11-16.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-05-20 and added to Hugging Face Transformers on 2020-11-16.*
 
 # BERTweet
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-07-28 and added to Hugging Face Transformers on 2021-03-30.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-07-28 and added to Hugging Face Transformers on 2021-05-07.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-10-19 and added to Hugging Face Transformers on 2022-12-05.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-12-24 and added to Hugging Face Transformers on 2022-12-07.*
 
 # Big Transfer (BiT)
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2025-04-16 and added to Hugging Face Transformers on 2025-04-28.*
 
 # BitNet
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-04-28 and added to Hugging Face Transformers on 2021-01-05.*
 
 # Blenderbot Small
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-04-28 and added to Hugging Face Transformers on 2020-11-16.*
 
 # Blenderbot
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2023-01-30 and added to Hugging Face Transformers on 2023-02-09.*
 
 # BLIP-2
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-01-28 and added to Hugging Face Transformers on 2022-12-21.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-11-09 and added to Hugging Face Transformers on 2022-06-09.*
 
 # BLOOM
 
@@ -24,7 +25,7 @@ rendered properly in your Markdown viewer.
 
 ## Overview
 
-The BLOOM model has been proposed with its various versions through the [BigScience Workshop](https://bigscience.huggingface.co/). BigScience is inspired by other open science initiatives where researchers have pooled their time and resources to collectively achieve a higher impact.
+The [BLOOM](https://huggingface.co/papers/2211.05100) model has been proposed with its various versions through the [BigScience Workshop](https://bigscience.huggingface.co/). BigScience is inspired by other open science initiatives where researchers have pooled their time and resources to collectively achieve a higher impact.
 The architecture of BLOOM is essentially similar to GPT3 (auto-regressive model for next token prediction), but has been trained on 46 different languages and 13 programming languages.
 Several smaller versions of the models have been trained on the same dataset. BLOOM is available in the following versions:
 
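The BLOOM date line records a release date (2022-11-09) that is later than the date the model was added to the library (2022-06-09). A reviewer who wants to list such entries could parse the date line and compare the two dates. The sketch below assumes the exact sentence format used throughout this commit; the names `DATE_LINE`, `parse_date_line`, and `released_after_added` are hypothetical and do not come from the repository.

```python
import re
from datetime import date

# Matches the italic date sentence added to each model page in this commit.
DATE_LINE = re.compile(
    r"\*This model was released on (\d{4}-\d{2}-\d{2}) and added to "
    r"Hugging Face Transformers on (\d{4}-\d{2}-\d{2})\.\*"
)


def parse_date_line(line: str) -> tuple[date, date]:
    """Return (released, added) parsed from one date line."""
    match = DATE_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a date line: {line!r}")
    released, added = (date.fromisoformat(part) for part in match.groups())
    return released, added


def released_after_added(line: str) -> bool:
    """True when the recorded release date is later than the added date."""
    released, added = parse_date_line(line)
    return released > added
```

A `True` result is not necessarily an error, since the linked paper can appear after the code lands; the check only surfaces entries worth a second look.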

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2020-10-20 and added to Hugging Face Transformers on 2023-06-20.*
 
 # BORT
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-06-17 and added to Hugging Face Transformers on 2023-01-25.*
 
 # BridgeTower
 

@@ -9,6 +9,7 @@ Unless required by applicable law or agreed to in writing, software distributed
 an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
 specific language governing permissions and limitations under the License.
 -->
+*This model was released on 2021-08-10 and added to Hugging Face Transformers on 2023-09-15.*
 
 # BROS
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-05-28 and added to Hugging Face Transformers on 2021-06-01.*
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">
 <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2019-11-10 and added to Hugging Face Transformers on 2020-11-16.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-03-11 and added to Hugging Face Transformers on 2021-06-30.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2024-05-16 and added to Hugging Face Transformers on 2024-07-17.*
 
 # Chameleon
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-11-02 and added to Hugging Face Transformers on 2022-12-01.*
 
 # Chinese-CLIP
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-11-12 and added to Hugging Face Transformers on 2023-02-16.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-02-26 and added to Hugging Face Transformers on 2021-05-12.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2021-12-18 and added to Hugging Face Transformers on 2022-11-08.*
 
 # CLIPSeg
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2023-05-12 and added to Hugging Face Transformers on 2023-11-10.*
 
 # CLVP
 

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2023-08-24 and added to Hugging Face Transformers on 2023-08-25.*
 
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">

@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
 rendered properly in your Markdown viewer.
 
 -->
+*This model was released on 2022-03-25 and added to Hugging Face Transformers on 2022-06-24.*
 
 # CodeGen
 

@@ -1,3 +1,18 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+
+⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
+rendered properly in your Markdown viewer.
+-->
+*This model was released on 2024-03-12 and added to Hugging Face Transformers on 2024-03-15.*
+
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">
 <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
@@ -10,7 +25,7 @@
 
 # Cohere
 
-Cohere Command-R is a 35B parameter multilingual large language model designed for long context tasks like retrieval-augmented generation (RAG) and calling external APIs and tools. The model is specifically trained for grounded generation and supports both single-step and multi-step tool use. It supports a context length of 128K tokens.
+Cohere [Command-R](https://cohere.com/blog/command-r) is a 35B parameter multilingual large language model designed for long context tasks like retrieval-augmented generation (RAG) and calling external APIs and tools. The model is specifically trained for grounded generation and supports both single-step and multi-step tool use. It supports a context length of 128K tokens.
 
 You can find all the original Command-R checkpoints under the [Command Models](https://huggingface.co/collections/CohereForAI/command-models-67652b401665205e17b192ad) collection.
 

@@ -1,3 +1,18 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+
+⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
+rendered properly in your Markdown viewer.
+-->
+*This model was released on 2024-12-13 and added to Hugging Face Transformers on 2024-12-13.*
+
 <div style="float: right;">
 <div class="flex flex-wrap space-x-1">
 <img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
@@ -8,7 +23,7 @@
 </div>
 
 
-# Cohere2
+# Cohere 2
 
 [Cohere Command R7B](https://cohere.com/blog/command-r7b) is an open weights research release of a 7B billion parameter model. It is a multilingual model trained on 23 languages and has a context window of 128k. The model features three layers with sliding window attention and ROPE for efficient local context modeling and relative positional encoding. A fourth layer uses global attention without positional embeddings, enabling unrestricted token interactions across the entire sequence.
 
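The Cohere 2 paragraph describes three sliding-window layers with ROPE followed by a fourth layer that uses global attention without positional embeddings. One way to picture that layout is the sketch below, which assumes the four-layer group repeats through the whole stack (the paragraph does not say so explicitly). The function names and the `"sliding"` / `"global"` labels are illustrative and are not the Transformers configuration API.

```python
def attention_layout(num_layers: int, period: int = 4) -> list[str]:
    """Label each layer 'sliding' or 'global'.

    Assumption: every `period`-th layer is global and the rest use a
    sliding window, so the default gives the 3:1 pattern from the text.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    return [
        "global" if (index + 1) % period == 0 else "sliding"
        for index in range(num_layers)
    ]


def uses_rope(layer_kind: str) -> bool:
    """Per the description, only sliding-window layers carry ROPE."""
    return layer_kind == "sliding"
```

With eight layers, for example, the fourth and eighth are labelled global and the other six sliding.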
|
|||||||
@@ -1,3 +1,20 @@
|
|||||||
|
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
|
||||||
|
the License. You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
|
||||||
|
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||||
|
specific language governing permissions and limitations under the License.
|
||||||
|
|
||||||
|
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
|
||||||
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
|
-->
|
||||||
|
*This model was released on 2025-07-31 and added to Hugging Face Transformers on 2025-07-31.*
|
||||||
|
|
||||||
# Command A Vision
|
# Command A Vision
|
||||||
|
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -9,7 +26,7 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
Command A Vision is a state-of-the-art multimodal model designed to seamlessly integrate visual and textual information for a wide range of applications. By combining advanced computer vision techniques with natural language processing capabilities, Command A Vision enables users to analyze, understand, and generate insights from both visual and textual data.
|
Command A Vision ([blog post](https://cohere.com/blog/command-a-vision)) is a state-of-the-art multimodal model designed to seamlessly integrate visual and textual information for a wide range of applications. By combining advanced computer vision techniques with natural language processing capabilities, Command A Vision enables users to analyze, understand, and generate insights from both visual and textual data.
|
||||||
|
|
||||||
The model excels at tasks including image captioning, visual question answering, document understanding, and chart understanding. This makes it a versatile tool for AI practitioners. Its ability to process complex visual and textual inputs makes it useful in settings where text-only representations are imprecise or unavailable, like real-world image understanding and graphics-heavy document processing.
|
The model excels at tasks including image captioning, visual question answering, document understanding, and chart understanding. This makes it a versatile tool for AI practitioners. Its ability to process complex visual and textual inputs makes it useful in settings where text-only representations are imprecise or unavailable, like real-world image understanding and graphics-heavy document processing.
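
The date line this commit adds to each model page follows a fixed sentence shape, so it can be read back mechanically. The sketch below is an assumption-laden illustration, not the logic of `utils/add_dates.py`: the names `DATE_LINE` and `parse_date_line` are hypothetical, and the pattern is inferred only from the lines shown in this diff.

```python
import re
from datetime import date

# Pattern inferred from the date lines in this diff (hypothetical, not from add_dates.py).
DATE_LINE = re.compile(
    r"\*This model was released on (\d{4}-\d{2}-\d{2}) "
    r"and added to Hugging Face Transformers on (\d{4}-\d{2}-\d{2})\.\*"
)


def parse_date_line(line: str):
    """Return (release_date, added_date), or None if the line does not match."""
    match = DATE_LINE.search(line)
    if match is None:
        return None
    released, added = (date.fromisoformat(group) for group in match.groups())
    return released, added


# The Command A Vision page lists the same day for both dates.
line = "*This model was released on 2025-07-31 and added to Hugging Face Transformers on 2025-07-31.*"
released, added = parse_date_line(line)
lag_days = (added - released).days
```

Parsing into `date` objects rather than keeping strings makes it easy to compute the gap between release and integration, or to flag a page whose added date precedes its release date.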
|
||||||
|
|
||||||
|
|||||||
@@ -11,6 +11,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
|
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
|
||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-06-27 and added to Hugging Face Transformers on 2024-12-17.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-06-27 and added to Hugging Face Transformers on 2025-06-02.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2021-08-13 and added to Hugging Face Transformers on 2022-09-22.*
|
||||||
|
|
||||||
# Conditional DETR
|
# Conditional DETR
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-08-06 and added to Hugging Face Transformers on 2021-01-27.*
|
||||||
|
|
||||||
# ConvBERT
|
# ConvBERT
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-01-10 and added to Hugging Face Transformers on 2022-02-07.*
|
||||||
|
|
||||||
# ConvNeXT
|
# ConvNeXT
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2023-01-02 and added to Hugging Face Transformers on 2023-03-14.*
|
||||||
|
|
||||||
# ConvNeXt V2
|
# ConvNeXt V2
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-12-01 and added to Hugging Face Transformers on 2021-04-10.*
|
||||||
|
|
||||||
# CPM
|
# CPM
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-09-16 and added to Hugging Face Transformers on 2023-04-12.*
|
||||||
|
|
||||||
# CPMAnt
|
# CPMAnt
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2025-02-27 and added to Hugging Face Transformers on 2025-05-07.*
|
||||||
|
|
||||||
# Csm
|
# Csm
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2019-09-11 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
# CTRL
|
# CTRL
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2021-03-29 and added to Hugging Face Transformers on 2022-05-18.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -23,7 +24,7 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
# Convolutional Vision Transformer (CvT)
|
# Convolutional Vision Transformer (CvT)
|
||||||
|
|
||||||
Convolutional Vision Transformer (CvT) is a model that combines the strengths of convolutional neural networks (CNNs) and Vision transformers for the computer vision tasks. It introduces convolutional layers into the vision transformer architecture, allowing it to capture local patterns in images while maintaining the global context provided by self-attention mechanisms.
|
[Convolutional Vision Transformer (CvT)](https://huggingface.co/papers/2103.15808) is a model that combines the strengths of convolutional neural networks (CNNs) and Vision transformers for the computer vision tasks. It introduces convolutional layers into the vision transformer architecture, allowing it to capture local patterns in images while maintaining the global context provided by self-attention mechanisms.
|
||||||
|
|
||||||
You can find all the CvT checkpoints under the [Microsoft](https://huggingface.co/microsoft?search_models=cvt) organization.
|
You can find all the CvT checkpoints under the [Microsoft](https://huggingface.co/microsoft?search_models=cvt) organization.
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-10-17 and added to Hugging Face Transformers on 2025-04-29.*
|
||||||
|
|
||||||
# D-FINE
|
# D-FINE
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-01-28 and added to Hugging Face Transformers on 2025-02-04.*
|
||||||
|
|
||||||
# DAB-DETR
|
# DAB-DETR
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2023-06-11 and added to Hugging Face Transformers on 2024-08-19.*
|
||||||
|
|
||||||
# DAC
|
# DAC
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-02-07 and added to Hugging Face Transformers on 2022-03-01.*
|
||||||
|
|
||||||
# Data2Vec
|
# Data2Vec
|
||||||
|
|
||||||
|
|||||||
@@ -9,6 +9,7 @@ Unless required by applicable law or agreed to in writing, software distributed
|
|||||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||||
specific language governing permissions and limitations under the License.
|
specific language governing permissions and limitations under the License.
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-03-27 and added to Hugging Face Transformers on 2024-04-18.*
|
||||||
|
|
||||||
# DBRX
|
# DBRX
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-06-05 and added to Hugging Face Transformers on 2021-02-19.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-06-05 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2021-06-02 and added to Hugging Face Transformers on 2022-03-23.*
|
||||||
|
|
||||||
# Decision Transformer
|
# Decision Transformer
|
||||||
|
|
||||||
|
|||||||
@@ -13,12 +13,13 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-05-07 and added to Hugging Face Transformers on 2025-07-09.*
|
||||||
|
|
||||||
# DeepSeek-V2
|
# DeepSeek-V2
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The DeepSeek-V2 model was proposed in [DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model](https://arxiv.org/abs/2405.04434) by DeepSeek-AI Team.
|
The DeepSeek-V2 model was proposed in [DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model](https://huggingface.co/papers/2405.04434) by DeepSeek-AI Team.
|
||||||
|
|
||||||
The abstract from the paper is the following:
|
The abstract from the paper is the following:
|
||||||
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
|
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
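
Several hunks in this commit, including the one above, swap an arXiv link for the matching Hugging Face papers link. A rewrite of that kind could be sketched with a single regular expression. This is a hedged illustration: `ARXIV_LINK` and `to_hf_papers` are hypothetical names, and the sketch assumes new-style arXiv identifiers such as `2405.04434`; it does not reproduce whatever the commit's own script does.

```python
import re

# Matches arXiv abstract or PDF links, with an optional "www." prefix,
# and captures the paper identifier (assumed to be a new-style ID).
ARXIV_LINK = re.compile(
    r"https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?(?:\.pdf)?"
)


def to_hf_papers(text: str) -> str:
    """Rewrite arXiv links in `text` to huggingface.co/papers links."""
    return ARXIV_LINK.sub(r"https://huggingface.co/papers/\1", text)


before = "proposed in [DeepSeek-V2](https://arxiv.org/abs/2405.04434) by DeepSeek-AI Team."
after = to_hf_papers(before)
```

Text without arXiv links passes through unchanged, so the function can be applied to a whole Markdown file at once.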
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-12-27 and added to Hugging Face Transformers on 2025-03-28.*
|
||||||
|
|
||||||
# DeepSeek-V3
|
# DeepSeek-V3
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-03-08 and added to Hugging Face Transformers on 2025-07-25.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -24,7 +25,7 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
# DeepseekVL
|
# DeepseekVL
|
||||||
|
|
||||||
[Deepseek-VL](https://arxiv.org/abs/2403.05525) was introduced by the DeepSeek AI team. It is a vision-language model (VLM) designed to process both text and images for generating contextually relevant responses. The model leverages [LLaMA](./llama) as its text encoder, while [SigLip](./siglip) is used for encoding images.
|
[Deepseek-VL](https://huggingface.co/papers/2403.05525) was introduced by the DeepSeek AI team. It is a vision-language model (VLM) designed to process both text and images for generating contextually relevant responses. The model leverages [LLaMA](./llama) as its text encoder, while [SigLip](./siglip) is used for encoding images.
|
||||||
|
|
||||||
You can find all the original Deepseek-VL checkpoints under the [DeepSeek-community](https://huggingface.co/deepseek-community) organization.
|
You can find all the original Deepseek-VL checkpoints under the [DeepSeek-community](https://huggingface.co/deepseek-community) organization.
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-03-08 and added to Hugging Face Transformers on 2025-07-25.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -23,7 +24,7 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
# DeepseekVLHybrid
|
# DeepseekVLHybrid
|
||||||
|
|
||||||
[Deepseek-VL-Hybrid](https://arxiv.org/abs/2403.05525) was introduced by the DeepSeek AI team. It is a vision-language model (VLM) designed to process both text and images for generating contextually relevant responses. The model leverages [LLaMA](./llama) as its text encoder, while [SigLip](./siglip) is used for encoding low-resolution images and [SAM (Segment Anything Model)](./sam) is incorporated to handle high-resolution image encoding, enhancing the model’s ability to process fine-grained visual details. Deepseek-VL-Hybrid is a variant of Deepseek-VL that uses [SAM (Segment Anything Model)](./sam) to handle high-resolution image encoding.
|
[Deepseek-VL-Hybrid](https://huggingface.co/papers/2403.05525) was introduced by the DeepSeek AI team. It is a vision-language model (VLM) designed to process both text and images for generating contextually relevant responses. The model leverages [LLaMA](./llama) as its text encoder, while [SigLip](./siglip) is used for encoding low-resolution images and [SAM (Segment Anything Model)](./sam) is incorporated to handle high-resolution image encoding, enhancing the model’s ability to process fine-grained visual details. Deepseek-VL-Hybrid is a variant of Deepseek-VL that uses [SAM (Segment Anything Model)](./sam) to handle high-resolution image encoding.
|
||||||
|
|
||||||
You can find all the original Deepseek-VL-Hybrid checkpoints under the [DeepSeek-community](https://huggingface.co/deepseek-community) organization.
|
You can find all the original Deepseek-VL-Hybrid checkpoints under the [DeepSeek-community](https://huggingface.co/deepseek-community) organization.
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-10-08 and added to Hugging Face Transformers on 2022-09-14.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-12-23 and added to Hugging Face Transformers on 2021-04-13.*
|
||||||
|
|
||||||
# DeiT
|
# DeiT
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-12-20 and added to Hugging Face Transformers on 2023-06-20.*
|
||||||
|
|
||||||
# DePlot
|
# DePlot
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-01-19 and added to Hugging Face Transformers on 2024-01-25.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-06-13 and added to Hugging Face Transformers on 2024-07-05.*
|
||||||
|
|
||||||
# Depth Anything V2
|
# Depth Anything V2
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-10-02 and added to Hugging Face Transformers on 2025-02-10.*
|
||||||
|
|
||||||
# DepthPro
|
# DepthPro
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-12-12 and added to Hugging Face Transformers on 2023-06-20.*
|
||||||
|
|
||||||
# DETA
|
# DETA
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-05-26 and added to Hugging Face Transformers on 2021-06-09.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2025-04-21 and added to Hugging Face Transformers on 2025-06-26.*
|
||||||
|
|
||||||
# Dia
|
# Dia
|
||||||
|
|
||||||
@@ -26,7 +27,7 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
Dia is an open-source text-to-speech (TTS) model (1.6B parameters) developed by [Nari Labs](https://huggingface.co/nari-labs).
|
[Dia](https://github.com/nari-labs/dia) is an open-source text-to-speech (TTS) model (1.6B parameters) developed by [Nari Labs](https://huggingface.co/nari-labs).
|
||||||
It can generate highly realistic dialogue from transcript including non-verbal communications such as laughter and coughing.
|
It can generate highly realistic dialogue from transcript including non-verbal communications such as laughter and coughing.
|
||||||
Furthermore, emotion and tone control is also possible via audio conditioning (voice cloning).
|
Furthermore, emotion and tone control is also possible via audio conditioning (voice cloning).
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2019-11-01 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
# DialoGPT
|
# DialoGPT
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-10-07 and added to Hugging Face Transformers on 2025-01-07.*
|
||||||
|
|
||||||
# DiffLlama
|
# DiffLlama
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-09-29 and added to Hugging Face Transformers on 2022-11-18.*
|
||||||
|
|
||||||
# Dilated Neighborhood Attention Transformer
|
# Dilated Neighborhood Attention Transformer
|
||||||
|
|
||||||
|
|||||||
@@ -9,6 +9,7 @@ Unless required by applicable law or agreed to in writing, software distributed
|
|||||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||||
specific language governing permissions and limitations under the License.
|
specific language governing permissions and limitations under the License.
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2023-04-14 and added to Hugging Face Transformers on 2023-07-18.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -6,6 +6,7 @@ Unless required by applicable law or agreed to in writing, software distributed
|
|||||||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
|
||||||
specific language governing permissions and limitations under the License.
|
specific language governing permissions and limitations under the License.
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2023-09-28 and added to Hugging Face Transformers on 2024-12-24.*
|
||||||
|
|
||||||
# DINOv2 with Registers
|
# DINOv2 with Registers
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2019-10-02 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-03-04 and added to Hugging Face Transformers on 2022-03-10.*
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
|
<img alt="PyTorch" src="https://img.shields.io/badge/PyTorch-DE3412?style=flat&logo=pytorch&logoColor=white">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-12-27 and added to Hugging Face Transformers on 2025-07-08.*
|
||||||
|
|
||||||
# Doge
|
# Doge
|
||||||
|
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ Unless required by applicable law or agreed to in writing, software distributed
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
specific language governing permissions and limitations under the License. -->
|
specific language governing permissions and limitations under the License. -->
|
||||||
|
*This model was released on 2021-11-30 and added to Hugging Face Transformers on 2022-08-12.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -21,7 +22,7 @@ specific language governing permissions and limitations under the License. -->
|
|||||||
|
|
||||||
# Donut
|
# Donut
|
||||||
|
|
||||||
[Donut (Document Understanding Transformer)](https://huggingface.co/papers2111.15664) is a visual document understanding model that doesn't require an Optical Character Recognition (OCR) engine. Unlike traditional approaches that extract text using OCR before processing, Donut employs an end-to-end Transformer-based architecture to directly analyze document images. This eliminates OCR-related inefficiencies making it more accurate and adaptable to diverse languages and formats.
|
[Donut (Document Understanding Transformer)](https://huggingface.co/papers/2111.15664) is a visual document understanding model that doesn't require an Optical Character Recognition (OCR) engine. Unlike traditional approaches that extract text using OCR before processing, Donut employs an end-to-end Transformer-based architecture to directly analyze document images. This eliminates OCR-related inefficiencies making it more accurate and adaptable to diverse languages and formats.
|
||||||
|
|
||||||
Donut features vision encoder ([Swin](./swin)) and a text decoder ([BART](./bart)). Swin converts document images into embeddings and BART processes them into meaningful text sequences.
|
Donut features vision encoder ([Swin](./swin)) and a text decoder ([BART](./bart)). Swin converts document images into embeddings and BART processes them into meaningful text sequences.
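
The Donut hunk above repairs a papers link that was missing the slash before the identifier. A small check like the following could catch that class of mistake; it is only a sketch, with hypothetical names (`HF_PAPERS_LINK`, `WELL_FORMED`, `malformed_paper_links`) and an assumed link shape of `https://huggingface.co/papers/<arXiv-style id>`.

```python
import re

# Any URL that starts like a Hugging Face papers link, up to whitespace or ")".
HF_PAPERS_LINK = re.compile(r"https://huggingface\.co/papers[^\s)]*")
# The assumed well-formed shape: a slash followed by an arXiv-style identifier.
WELL_FORMED = re.compile(r"https://huggingface\.co/papers/\d{4}\.\d{4,5}")


def malformed_paper_links(text: str) -> list:
    """Return papers links in `text` that do not match the assumed shape."""
    return [
        url
        for url in HF_PAPERS_LINK.findall(text)
        if not WELL_FORMED.fullmatch(url)
    ]


old_line = "[Donut](https://huggingface.co/papers2111.15664) is a visual document understanding model"
new_line = "[Donut](https://huggingface.co/papers/2111.15664) is a visual document understanding model"
old_problems = malformed_paper_links(old_line)
new_problems = malformed_paper_links(new_line)
```

Here the removed line is flagged and the added line is not.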
|
||||||
|
|
||||||
|
|||||||
@@ -13,12 +13,13 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2025-06-06 and added to Hugging Face Transformers on 2025-06-25.*
|
||||||
|
|
||||||
# dots.llm1
|
# dots.llm1
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The `dots.llm1` model was proposed in [dots.llm1 technical report](https://www.arxiv.org/pdf/2506.05767) by rednote-hilab team.
|
The `dots.llm1` model was proposed in [dots.llm1 technical report](https://huggingface.co/papers/2506.05767) by rednote-hilab team.
|
||||||
|
|
||||||
The abstract from the report is the following:
|
The abstract from the report is the following:
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-04-10 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
# DPR
|
# DPR
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2021-03-24 and added to Hugging Face Transformers on 2022-03-28.*
|
||||||
|
|
||||||
# DPT
|
# DPT
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-06-02 and added to Hugging Face Transformers on 2023-06-20.*
|
||||||
|
|
||||||
# EfficientFormer
|
# EfficientFormer
|
||||||
|
|
||||||
|
|||||||
@@ -11,6 +11,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-03-07 and added to Hugging Face Transformers on 2025-07-22.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2019-05-28 and added to Hugging Face Transformers on 2023-02-20.*
|
||||||
|
|
||||||
# EfficientNet
|
# EfficientNet
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2020-03-23 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2024-09-27 and added to Hugging Face Transformers on 2025-01-10.*
|
||||||
|
|
||||||
# Emu3
|
# Emu3
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2022-10-24 and added to Hugging Face Transformers on 2023-06-14.*
|
||||||
|
|
||||||
# EnCodec
|
# EnCodec
|
||||||
|
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2017-06-12 and added to Hugging Face Transformers on 2020-11-16.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
|
|||||||
@@ -8,6 +8,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
|
⚠️ Note that this file is in Markdown but contain specific syntax for our doc-builder (similar to MDX) that may not be
|
||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2025-03-24 and added to Hugging Face Transformers on 2025-06-27.*
|
||||||
|
|
||||||
# EoMT
|
# EoMT
|
||||||
|
|
||||||
@@ -17,7 +18,7 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The Encoder-only Mask Transformer (EoMT) model was introduced in the CVPR 2025 Highlight Paper [Your ViT is Secretly an Image Segmentation Model](https://www.tue-mps.org/eomt) by Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, and Daan de Geus.
|
[The Encoder-only Mask Transformer](https://www.tue-mps.org/eomt) (EoMT) model was introduced in the CVPR 2025 Highlight Paper *[Your ViT is Secretly an Image Segmentation Model](https://huggingface.co/papers/2503.19108)* by Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, and Daan de Geus.
|
||||||
EoMT reveals Vision Transformers can perform image segmentation efficiently without task-specific components.
|
EoMT reveals Vision Transformers can perform image segmentation efficiently without task-specific components.
|
||||||
|
|
||||||
The abstract from the paper is the following:
|
The abstract from the paper is the following:
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ specific language governing permissions and limitations under the License.
|
|||||||
rendered properly in your Markdown viewer.
|
rendered properly in your Markdown viewer.
|
||||||
|
|
||||||
-->
|
-->
|
||||||
|
*This model was released on 2019-04-19 and added to Hugging Face Transformers on 2022-09-09.*
|
||||||
|
|
||||||
<div style="float: right;">
|
<div style="float: right;">
|
||||||
<div class="flex flex-wrap space-x-1">
|
<div class="flex flex-wrap space-x-1">
|
||||||
@@ -22,8 +23,8 @@ rendered properly in your Markdown viewer.
|
|||||||
|
|
||||||
# ERNIE
|
# ERNIE
|
||||||
|
|
||||||
[ERNIE1.0](https://arxiv.org/abs/1904.09223), [ERNIE2.0](https://ojs.aaai.org/index.php/AAAI/article/view/6428),
|
[ERNIE1.0](https://huggingface.co/papers/1904.09223), [ERNIE2.0](https://ojs.aaai.org/index.php/AAAI/article/view/6428),
|
||||||
[ERNIE3.0](https://arxiv.org/abs/2107.02137), [ERNIE-Gram](https://arxiv.org/abs/2010.12148), [ERNIE-health](https://arxiv.org/abs/2110.07244) are a series of powerful models proposed by baidu, especially in Chinese tasks.
|
[ERNIE3.0](https://huggingface.co/papers/2107.02137), [ERNIE-Gram](https://huggingface.co/papers/2010.12148), [ERNIE-health](https://huggingface.co/papers/2110.07244) are a series of powerful models proposed by baidu, especially in Chinese tasks.
|
||||||
|
|
||||||
ERNIE (Enhanced Representation through kNowledge IntEgration) is designed to learn language representation enhanced by knowledge masking strategies, which includes entity-level masking and phrase-level masking.
|
ERNIE (Enhanced Representation through kNowledge IntEgration) is designed to learn language representation enhanced by knowledge masking strategies, which includes entity-level masking and phrase-level masking.
|
||||||
|
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff.