[Wav2Vec2] PyCTCDecode Integration to support language model boosted decoding (#14339)

* up

* up

* up

* make it cleaner

* correct

* make styhahalal

* add more tests

* finish

* small fix

* make style

* up

* tryout to solve cicrle ci

* up

* fix more tests

* fix more tests

* apply sylvains suggestions

* fix import

* correct docs

* add pyctcdecode only to speech tests

* fix more tests

* add tf, flax and pt tests

* add pt

* fix last tests

* fix more tests

* Apply suggestions from code review

* change lines

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* correct tests

* correct tests

* add doc string

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
This commit is contained in:
Patrick von Platen
2021-12-08 12:07:54 +01:00
committed by GitHub
parent 2e12d90b9e
commit 961732c276
16 changed files with 831 additions and 19 deletions

View File

@@ -51,7 +51,7 @@ To create the package for pypi.
pip install -i https://testpypi.python.org/pypi transformers
Check you can run the following commands:
python -c "from transformers import pipeline; classifier = pipeline('text-classification'); print(classifier('What a nice release'))"
python -c "from transformers import pipeline; classifier = pipeline('text-classification'); print(classifier('What a nice release'))"
python -c "from transformers import *"
9. Upload the final version to actual pypi:
@@ -59,7 +59,7 @@ To create the package for pypi.
10. Copy the release notes from RELEASE.md to the tag in github once everything is looking hunky-dory.
11. Run `make post-release` (or, for a patch release, `make post-patch`). If you were on a branch for the release,
11. Run `make post-release` (or, for a patch release, `make post-patch`). If you were on a branch for the release,
you need to go back to master before executing this.
"""
@@ -159,6 +159,7 @@ _deps = [
"tokenizers>=0.10.1,<0.11",
"torch>=1.0",
"torchaudio",
"pyctcdecode>=0.2.0",
"tqdm>=4.27",
"unidic>=1.0.2",
"unidic_lite>=1.0.7",
@@ -262,7 +263,7 @@ extras["sigopt"] = deps_list("sigopt")
extras["integrations"] = extras["optuna"] + extras["ray"] + extras["sigopt"]
extras["serving"] = deps_list("pydantic", "uvicorn", "fastapi", "starlette")
extras["audio"] = deps_list("librosa")
extras["audio"] = deps_list("librosa", "pyctcdecode")
extras["speech"] = deps_list("torchaudio") + extras["audio"] # `pip install ".[speech]"` is deprecated and `pip install ".[torch-speech]"` should be used instead
extras["torch-speech"] = deps_list("torchaudio") + extras["audio"]
extras["tf-speech"] = extras["audio"]