From 7eeca4d3999f000af5d16917fe0650ceac10aaa8 Mon Sep 17 00:00:00 2001
From: Patrick von Platen
Date: Fri, 18 Sep 2020 16:44:02 +0200
Subject: [PATCH] Create README.md

---
 model_cards/facebook/rag-token-base/README.md | 21 +++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 model_cards/facebook/rag-token-base/README.md

diff --git a/model_cards/facebook/rag-token-base/README.md b/model_cards/facebook/rag-token-base/README.md
new file mode 100644
index 0000000000..ee4c754f37
--- /dev/null
+++ b/model_cards/facebook/rag-token-base/README.md
@@ -0,0 +1,21 @@
+## RAG
+
+This is a "base" version of the RAG-Token model from the paper [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/pdf/2005.11401.pdf)
+by Patrick Lewis, Ethan Perez, Aleksandra Piktus et al.
+
+## Usage:
+
+```python
+from transformers import RagTokenizer, RagRetriever, RagTokenForGeneration
+
+tokenizer = RagTokenizer.from_pretrained("facebook/rag-token-base")
+retriever = RagRetriever.from_pretrained("facebook/rag-token-base", index_name="exact", use_dummy_dataset=True)
+model = RagTokenForGeneration.from_pretrained("facebook/rag-token-base", retriever=retriever)
+
+input_ids = tokenizer("What is the largest country in the world?", return_tensors="pt").input_ids
+
+generated = model.generate(input_ids=input_ids)
+generated_string = tokenizer.batch_decode(generated, skip_special_tokens=True)
+
+# => should give [' russia']. A pretty good answer for just a dummy dataset.
+```