SqueezeBERT architecture (#7083)

* configuration_squeezebert.py
  thin wrapper around bert tokenizer
  fix typos
  wip sb model code
  wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
  set up squeezebert to use BertModelOutput when returning results.
  squeezebert documentation
  formatting
  allow head mask that is an array of [None, ..., None]
  docs
  docs cont'd
  path to vocab
  docs and pointers to cloud files (WIP)
  line length and indentation
  squeezebert model cards
  formatting of model cards
  untrack modeling_squeezebert_scratchpad.py
  update aws paths to vocab and config files
  get rid of stub of NSP code, and advise users to pretrain with mlm only
  fix rebase issues
  redo rebase of modeling_auto.py
  fix issues with code formatting
  more code format auto-fixes
  move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
  tests for squeezebert modeling and tokenization
  fix typo
  move squeezebert before bert in modeling_auto.py to fix inheritance problem
  disable test_head_masking, since squeezebert doesn't yet implement head masking
  fix issues exposed by the test_modeling_squeezebert.py
  fix an issue exposed by test_tokenization_squeezebert.py
  fix issue exposed by test_modeling_squeezebert.py
  auto generated code style improvement
  issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
  update copyright
  resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
  docs
  add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
  autogenerated formatting tweaks
  integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
* tiny change to order of imports
2020-10-05 01:25:43 -07:00
parent e2c935f561
commit 02ef825be2
18 changed files with 1950 additions and 14 deletions
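
Note on the "move squeezebert before bert" entries in the commit message: the sketch below illustrates why entry order can matter when a class inherits from another. It assumes the auto-class lookup is a first-match `isinstance` scan over an ordered mapping. Every name in it (`ParentConfig`, `ChildConfig`, `ParentModel`, `ChildModel`, `resolve`) is a hypothetical stand-in for illustration, not code from this commit or from the library.

```python
from collections import OrderedDict


class ParentConfig:
    """Hypothetical stand-in for the class being inherited from."""


class ChildConfig(ParentConfig):
    """Hypothetical stand-in for a class that inherits from ParentConfig."""


class ParentModel:
    """Hypothetical model class paired with ParentConfig."""


class ChildModel:
    """Hypothetical model class paired with ChildConfig."""


def resolve(config, mapping):
    # Assumed lookup strategy: return the value of the first entry whose
    # key class matches the instance. A subclass instance also matches
    # its parent class, so the order of entries decides the result.
    for config_class, model_class in mapping.items():
        if isinstance(config, config_class):
            return model_class
    raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")


# Parent listed first: the child instance is captured by the parent entry.
parent_first = OrderedDict([(ParentConfig, ParentModel), (ChildConfig, ChildModel)])

# Child listed first: the more specific entry gets the first chance to match.
child_first = OrderedDict([(ChildConfig, ChildModel), (ParentConfig, ParentModel)])

wrong = resolve(ChildConfig(), parent_first)
right = resolve(ChildConfig(), child_first)
print(wrong.__name__, right.__name__)
```

Under this assumption, placing the subclass entry ahead of its parent is the smallest change that restores the intended lookup. An exact type comparison (`type(config) is config_class`) would be another option.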
--- a/templates/adding_a_new_model/README.md
+++ b/templates/adding_a_new_model/README.md
@@ -21,7 +21,7 @@ For a quick overview of the general philosphy of the library and its organizatio

 # Typical workflow for including a model

-Here an overview of the general workflow: 
+Here an overview of the general workflow:

 - [ ] Add model/configuration/tokenization classes.
 - [ ] Add conversion scripts.
@@ -69,7 +69,7 @@ Here is the workflow for documentation:
 - [ ] Create a new page `xxx.rst` in the folder `docs/source/model_doc` and add this file in `docs/source/index.rst`.

 Make sure to check you have no sphinx warnings when building the documentation locally and follow our
-[documentaiton guide](https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification).
+[documentation guide](https://github.com/huggingface/transformers/tree/master/docs#writing-documentation---specification).

 ## Final steps

--- a/templates/adding_a_new_model/modeling_xxx.py
+++ b/templates/adding_a_new_model/modeling_xxx.py
@@ -19,7 +19,6 @@
 ####################################################


-import logging
 import os

 import torch
@@ -37,9 +36,10 @@ from .modeling_outputs import (
    TokenClassifierOutput,
 )
 from .modeling_utils import PreTrainedModel
+from .utils import logging


-logger = logging.getLogger(__name__)
+logger = logging.get_logger(__name__)

 _CONFIG_FOR_DOC = "XXXConfig"
 _TOKENIZER_FOR_DOC = "XXXTokenizer"
@@ -433,7 +433,7 @@ class XxxForMaskedLM(XxxPreTrainedModel):
        )

        sequence_output = outputs[0]
-        prediction_scores = self.cls(sequence_output)
+        prediction_scores = self.lm_head(sequence_output)

        masked_lm_loss = None
        if labels is not None: