Support additional dictionaries for BERT Japanese tokenizers (#6515)

* Update BERT Japanese tokenizers

* Update CircleCI config to download unidic

* Specify to use the latest dictionary packages
Masatoshi Suzuki
2020-08-17 13:00:23 +09:00
committed by GitHub
parent 423eb5b1d7
commit 48c6c6139f
4 changed files with 97 additions and 15 deletions

@@ -150,6 +150,7 @@ jobs:
       - v0.3-{{ checksum "setup.py" }}
   - run: pip install --upgrade pip
   - run: pip install .[ja,testing]
+  - run: python -m unidic download
   - save_cache:
       key: v0.3-custom_tokenizers-{{ checksum "setup.py" }}
       paths: