Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814)
* add gpt-neox-japanese model and tokenizer as new model
* Correction to PR's comment for GPT NeoX Japanese
- Fix to be able to use gpu
- Add comment # Copied... at the top of RotaryEmbedding
- Implement nn.Linear instead of original linear class
- Add generation test under @slow
* fix bias treatment for gpt-neox-japanese
* Modify gpt-neox-japanese following PR
- add doc for bias_dropout_add
- style change following a PR comment
* add document for gpt-neox-japanese
* remove unused import from gpt-neox-japanese
* fix README for gpt-neox-japanese
66
docs/source/en/model_doc/gpt_neox_japanese.mdx
Normal file
@@ -0,0 +1,66 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# GPT-NeoX-Japanese

## Overview

We introduce GPT-NeoX-Japanese, an autoregressive language model for Japanese, trained on top of [EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
Japanese has a large vocabulary and is written with a combination of three scripts: hiragana, katakana, and kanji.
To address this distinct structure of the Japanese language, we use a [special sub-word tokenizer](https://github.com/tanreinama/Japanese-BPEEncoder_V2). We are very grateful to *tanreinama* for open-sourcing this incredibly helpful tokenizer.
Following the recommendations from Google's research on [PaLM](https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html), we have removed bias parameters from transformer blocks, achieving better model performance. Please refer to [this article](https://medium.com/ml-abeja/training-a-better-gpt-2-93b157662ae4) for details.

Development of the model was led by [Shinya Otani](https://github.com/SO0529), [Takayoshi Makabe](https://github.com/spider-man-tm), [Anuj Arora](https://github.com/Anuj040), and [Kyo Hattori](https://github.com/go5paopao) from [ABEJA, Inc.](https://www.abejainc.com/). For more information on this model-building activity, please see [this post (ja)](https://tech-blog.abeja.asia/entry/abeja-gpt-project-202207).
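
As a rough illustration of the script mix mentioned in the overview, the sketch below counts how many characters of a sentence fall into the hiragana, katakana, and kanji Unicode blocks. It is not part of the tokenizer or of the library; `classify_script` and `script_counts` are hypothetical helper names, and the kanji range covers only the basic CJK Unified Ideographs block.

```python
def classify_script(ch: str) -> str:
    """Return a coarse script label for one character, based on Unicode blocks."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:  # Hiragana block
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:  # Katakana block
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:  # CJK Unified Ideographs (basic block only)
        return "kanji"
    return "other"  # Latin letters, punctuation, digits, and so on


def script_counts(text: str) -> dict:
    """Count the characters of `text` per script label."""
    counts = {"hiragana": 0, "katakana": 0, "kanji": 0, "other": 0}
    for ch in text:
        counts[classify_script(ch)] += 1
    return counts


prompt = "人とAIが協調するためには、"
print(script_counts(prompt))
# {'hiragana': 8, 'katakana': 0, 'kanji': 3, 'other': 3}
```

Even this short sentence mixes kanji, hiragana, and Latin letters without any whitespace between words, which is one reason a Japanese-specific sub-word tokenizer is a natural choice here.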

### Generation

The `generate()` method can be used to generate text using the GPT-NeoX-Japanese model.

```python
>>> from transformers import GPTNeoXJapaneseForCausalLM, GPTNeoXJapaneseTokenizer

>>> model = GPTNeoXJapaneseForCausalLM.from_pretrained("abeja/gpt-neox-japanese-2.7b")
>>> tokenizer = GPTNeoXJapaneseTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")

>>> prompt = "人とAIが協調するためには、"

>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids

>>> gen_tokens = model.generate(
...     input_ids,
...     do_sample=True,
...     temperature=0.9,
...     max_length=100,
... )
>>> gen_text = tokenizer.batch_decode(gen_tokens, skip_special_tokens=True)[0]

>>> print(gen_text)
人とAIが協調するためには、AIと人が共存し、AIを正しく理解する必要があります。
```
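
The example above samples with `do_sample=True` and `temperature=0.9`, so the generated text will generally differ between runs. The following is a minimal, library-independent sketch of what temperature scaling does to a next-token distribution. The logits are made-up values and `softmax_with_temperature` and `sample_index` are hypothetical helper names, not functions from `transformers`.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for three candidate tokens (illustration only).
logits = [2.0, 1.0, 0.1]

for temperature in (1.0, 0.9, 0.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])

# A seeded generator makes this toy draw repeatable.
rng = random.Random(0)
print(sample_index(softmax_with_temperature(logits, 0.9), rng))
```

Temperatures below 1 concentrate probability on the highest-scoring token, while temperatures above 1 flatten the distribution, so 0.9 makes sampling slightly more conservative than sampling from the unscaled distribution.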

## GPTNeoXJapaneseConfig

[[autodoc]] GPTNeoXJapaneseConfig

## GPTNeoXJapaneseTokenizer

[[autodoc]] GPTNeoXJapaneseTokenizer

## GPTNeoXJapaneseModel

[[autodoc]] GPTNeoXJapaneseModel
    - forward

## GPTNeoXJapaneseForCausalLM

[[autodoc]] GPTNeoXJapaneseForCausalLM
    - forward