Add a TF in-graph tokenizer for BERT (#17701)
* Add a TF in-graph tokenizer for BERT
* Add from_pretrained
* Add proper truncation, option handling to match other tokenizers
* Add proper imports and guards
* Add test, fix all the bugs exposed by said test
* Fix truncation of paired texts in graph mode, more test updates
* Small fixes, add a (very careful) test for savedmodel
* Add tensorflow-text dependency, make fixup
* Update documentation
* Update documentation
* make fixup
* Slight changes to tests
* Add some docstring examples
* Update tests
* Update tests and add proper lowercasing/normalization
* make fixup
* Add docstring for padding!
* Mark slow tests
* make fixup
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* Fall back to BertTokenizerFast if BertTokenizer is unavailable
* make fixup
* Properly handle tensorflow-text dummies
5
setup.py
@@ -155,6 +155,7 @@ _deps = [
     "starlette",
     "tensorflow-cpu>=2.3",
     "tensorflow>=2.3",
+    "tensorflow-text",
     "tf2onnx",
     "timeout-decorator",
     "timm",
@@ -238,8 +239,8 @@ extras = {}
 extras["ja"] = deps_list("fugashi", "ipadic", "unidic_lite", "unidic")
 extras["sklearn"] = deps_list("scikit-learn")
 
-extras["tf"] = deps_list("tensorflow", "onnxconverter-common", "tf2onnx")
-extras["tf-cpu"] = deps_list("tensorflow-cpu", "onnxconverter-common", "tf2onnx")
+extras["tf"] = deps_list("tensorflow", "onnxconverter-common", "tf2onnx", "tensorflow-text")
+extras["tf-cpu"] = deps_list("tensorflow-cpu", "onnxconverter-common", "tf2onnx", "tensorflow-text")
 
 extras["torch"] = deps_list("torch")
 extras["accelerate"] = deps_list("accelerate")
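For readers unfamiliar with how the two hunks interact, here is a minimal sketch of the lookup pattern they rely on. It is not the actual setup.py: the `_deps` list is cut down to the entries visible in this diff, `onnxconverter-common` is left out because its pinned specifier is not shown here, and the `_NAME` regex plus the body of `deps_list` are assumptions about how a bare package name is resolved to its full requirement string.

```python
import re

# Abbreviated stand-in for the _deps list; only entries visible in the diff.
_deps = [
    "tensorflow-cpu>=2.3",
    "tensorflow>=2.3",
    "tensorflow-text",  # the entry this commit adds
    "tf2onnx",
]

# Assumption: the bare package name is everything before the first
# version-specifier character or space.
_NAME = re.compile(r"^([^!=<>~ ]+)")

# Map bare name -> full requirement string, e.g. "tensorflow" -> "tensorflow>=2.3".
deps = {_NAME.match(spec).group(1): spec for spec in _deps}


def deps_list(*pkgs):
    """Resolve bare package names to their pinned requirement strings."""
    return [deps[pkg] for pkg in pkgs]


extras = {}
extras["tf"] = deps_list("tensorflow", "tf2onnx", "tensorflow-text")
extras["tf-cpu"] = deps_list("tensorflow-cpu", "tf2onnx", "tensorflow-text")
```

Under this pattern, a name passed to `deps_list` must already have an entry in `_deps`, otherwise the lookup raises `KeyError`. That is why the commit touches both places: the first hunk registers `tensorflow-text`, and the second adds it to the `tf` and `tf-cpu` extras.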