From ccbe839ee0b78a17e74dab218bfae7efe904ac3b Mon Sep 17 00:00:00 2001
From: Gabriele Sarti
Date: Wed, 25 Mar 2020 02:15:55 +0100
Subject: [PATCH] Added BioBERT-NLI model card (#3421)

---
 model_cards/gsarti/biobert-nli/README.md | 37 ++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 model_cards/gsarti/biobert-nli/README.md

diff --git a/model_cards/gsarti/biobert-nli/README.md b/model_cards/gsarti/biobert-nli/README.md
new file mode 100644
index 0000000000..0936cb00a1
--- /dev/null
+++ b/model_cards/gsarti/biobert-nli/README.md
@@ -0,0 +1,37 @@
+# BioBERT-NLI
+
+This is the model [BioBERT](https://github.com/dmis-lab/biobert) [1] fine-tuned on the [SNLI](https://nlp.stanford.edu/projects/snli/) and the [MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) datasets using the [`sentence-transformers` library](https://github.com/UKPLab/sentence-transformers/) to produce universal sentence embeddings [2].
+
+The model uses the original BERT wordpiece vocabulary and was trained using the **average pooling strategy** and a **softmax loss**.
+
+**Base model**: `monologg/biobert_v1.1_pubmed` from HuggingFace's `AutoModel`.
+
+**Training time**: ~6 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
+
+**Parameters**:
+
+| Parameter        | Value |
+|------------------|-------|
+| Batch size       | 64    |
+| Training steps   | 30000 |
+| Warmup steps     | 1450  |
+| Lowercasing      | False |
+| Max. Seq. Length | 128   |
+
+**Performance**: The model was evaluated on the test portion of the [STS dataset](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) using Spearman rank correlation. Its score is compared with that of a general BERT base model obtained with the same procedure, to verify that the two perform similarly.
+
+| Model                         | Score       |
+|-------------------------------|-------------|
+| `biobert-nli` (this)          | 73.40       |
+| `gsarti/scibert-nli`          | 74.50       |
+| `bert-base-nli-mean-tokens`[3]| 77.12       |
+
+A usage example for similarity-based scientific paper retrieval is provided in the [Covid Papers Browser](https://github.com/gsarti/covid-papers-browser) repository.
+
+**References:**
+
+[1] J. Lee et al., [BioBERT: a pre-trained biomedical language representation model for biomedical text mining](https://academic.oup.com/bioinformatics/article/36/4/1234/5566506)
+
+[2] A. Conneau et al., [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://www.aclweb.org/anthology/D17-1070/)
+
+[3] N. Reimers and I. Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://www.aclweb.org/anthology/D19-1410/)
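
The model card above says that sentence embeddings come from an average pooling strategy and that quality is measured with Spearman rank correlation on STS. The following is a minimal, dependency-free Python sketch of those two steps, not the `sentence-transformers` implementation. The helper names (`mean_pool`, `cosine`, `ranks`, `spearman`) are hypothetical, and scoring a sentence pair by cosine similarity is an assumption that the card does not state.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

Under these assumptions, an evaluation would pool the token vectors of each sentence with `mean_pool`, score each sentence pair with `cosine`, and pass the predicted similarities together with the gold STS scores to `spearman`.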