Sehoon Kim
63645b3b11
I-BERT model support (#10153)
* IBertConfig, IBertTokenizer added
* IBert Model names modified
* tokenizer bugfix
* embedding -> QuantEmbedding
* quant utils added
* quant_mode added to configuration
* QuantAct added, Embedding layer + QuantAct addition
* QuantAct added
* unused path removed, QKV quantized
* self attention layer all quantized, except softmax
* temporary commit
* all linear layers quantized
* quant_utils bugfix
* bugfix: requantization missing
* IntGELU added
* IntSoftmax added
* LayerNorm implemented
* LayerNorm implemented all
* names changed: roberta->ibert
* config does not inherit from Roberta
* No support for CausalLM
* static quantization added, quantize_model.py removed
* import modules uncommented
* copyrights fixed
* minor bugfix
* quant_modules, quant_utils merged as one file
* import * fixed
* unused runfile removed
* make style run
* configuration.py docstring fixed
* refactoring: comments removed, function name fixed
* unused dependency removed
* typo fixed
* comments(Copied from), assertion string added
* refactoring: super(..) -> super(), etc.
* refactoring
* refactoring
* make style
* refactoring
* cuda -> to(x.device)
* weight initialization removed
* QuantLinear set_param removed
* QuantEmbedding set_param removed
* IntLayerNorm set_param removed
* assert string added
* assertion error message fixed
* is_decoder removed
* enc-dec arguments/functions removed
* Converter removed
* quant_modules docstring fixed
* convert_slow_tokenizer rolled back
* quant_utils docstring fixed
* unused arguments (e.g. use_cache) removed from config
* weight initialization condition fixed
* x_min, x_max initialized with small values to avoid div-zero exceptions
* testing code for ibert
* test emb, linear, gelu, softmax added
* test ln and act added
* style reformatted
* force_dequant added
* error tests overridden
* make style
* Style + Docs
* force dequant tests added
* Fix fast tokenizer in init
* Fix doc
* Remove space
* docstring, IBertConfig, chunk_size
* test_modeling_ibert refactoring
* quant_modules.py refactoring
* e2e integration test added
* tokenizers removed
* IBertConfig added to tokenizer_auto.py
* bugfix
* fix docs & test
* fix style num 2
* final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-02-25 10:06:42 -05:00