[MMS] Update docs with HF TTS implementation (#25907)
* [MMS] Update docs with HF TTS implementation * Update docs/source/en/model_doc/mms.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add uromanise to docs --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
This commit is contained in:
@@ -98,6 +98,54 @@ print(tokenizer.is_uroman)
|
||||
If required, you should apply the uroman package to your text inputs **prior** to passing them to the `VitsTokenizer`,
|
||||
since currently the tokenizer does not support performing the pre-processing itself.
|
||||
|
||||
To do this, first clone the uroman repository to your local machine and set the bash variable `UROMAN` to the local path:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/isi-nlp/uroman.git
|
||||
cd uroman
|
||||
export UROMAN=$(pwd)
|
||||
```
|
||||
|
||||
You can then pre-process the text input using the following code snippet. You can either rely on using the bash variable
|
||||
`UROMAN` to point to the uroman repository, or you can pass the uroman directory as an argument to the `uromaize` function:
|
||||
|
||||
```python
|
||||
import torch
|
||||
from transformers import VitsTokenizer, VitsModel, set_seed
|
||||
import os
|
||||
import subprocess
|
||||
|
||||
tokenizer = VitsTokenizer.from_pretrained("facebook/mms-tts-kor")
|
||||
model = VitsModel.from_pretrained("facebook/mms-tts-kor")
|
||||
|
||||
def uromanize(input_string, uroman_path):
|
||||
"""Convert non-Roman strings to Roman using the `uroman` perl package."""
|
||||
script_path = os.path.join(uroman_path, "bin", "uroman.pl")
|
||||
|
||||
command = ["perl", script_path]
|
||||
|
||||
process = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
|
||||
# Execute the perl command
|
||||
stdout, stderr = process.communicate(input=input_string.encode())
|
||||
|
||||
if process.returncode != 0:
|
||||
raise ValueError(f"Error {process.returncode}: {stderr.decode()}")
|
||||
|
||||
# Return the output as a string and skip the new-line character at the end
|
||||
return stdout.decode()[:-1]
|
||||
|
||||
text = "이봐 무슨 일이야"
|
||||
uromaized_text = uromanize(text, uroman_path=os.environ["UROMAN"])
|
||||
|
||||
inputs = tokenizer(text=uromaized_text, return_tensors="pt")
|
||||
|
||||
set_seed(555) # make deterministic
|
||||
with torch.no_grad():
|
||||
outputs = model(inputs["input_ids"])
|
||||
|
||||
waveform = outputs.waveform[0]
|
||||
```
|
||||
|
||||
## VitsConfig
|
||||
|
||||
[[autodoc]] VitsConfig
|
||||
|
||||
Reference in New Issue
Block a user