Add custom tokenizer for zh and ja

Shijie Wu
2019-08-23 20:27:52 -04:00
parent 436ce07218
commit e85123d398
3 changed files with 61 additions and 22 deletions

@@ -11,4 +11,8 @@ regex
 # For XLNet
 sentencepiece
 # For XLM
-sacremoses
+sacremoses
+pythainlp
+kytea
+nltk
+jieba