Documentation
This commit is contained in:
@@ -55,4 +55,81 @@ Example usage
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
An example using these processors is given in the
|
||||
`run_glue.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_glue.py>`__ script.
|
||||
`run_glue.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_glue.py>`__ script.
|
||||
|
||||
|
||||
|
||||
SQuAD
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`The Stanford Question Answering Dataset (SQuAD) <https://rajpurkar.github.io/SQuAD-explorer//>`__ is a benchmark that evaluates
|
||||
the performance of models on question answering. Two versions are available, v1.1 and v2.0. The first version (v1.1) was released together with the paper
|
||||
`SQuAD: 100,000+ Questions for Machine Comprehension of Text <https://arxiv.org/abs/1606.05250>`__. The second version (v2.0) was released alongside
|
||||
the paper `Know What You Don't Know: Unanswerable Questions for SQuAD <https://arxiv.org/abs/1806.03822>`__.
|
||||
|
||||
This library hosts a processor for each of the two versions:
|
||||
|
||||
Processors
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Those processors are:
|
||||
- :class:`~transformers.data.processors.utils.SquadV1Processor`
|
||||
- :class:`~transformers.data.processors.utils.SquadV2Processor`
|
||||
|
||||
They both inherit from the abstract class :class:`~transformers.data.processors.utils.SquadProcessor`
|
||||
|
||||
.. autoclass:: transformers.data.processors.squad.SquadProcessor
|
||||
:members:
|
||||
|
||||
Additionally, the following method can be used to convert SQuAD examples into :class:`~transformers.data.processors.utils.SquadFeatures`
|
||||
that can be used as model inputs.
|
||||
|
||||
.. automethod:: transformers.data.processors.squad.squad_convert_examples_to_features
|
||||
|
||||
These processors as well as the aforementionned method can be used with files containing the data as well as with the `tensorflow_datasets` package.
|
||||
Examples are given below.
|
||||
|
||||
Example usage
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Here is an example using the processors as well as the conversion method using data files:
|
||||
|
||||
Example::
|
||||
|
||||
# Loading a V2 processor
|
||||
processor = SquadV2Processor()
|
||||
examples = processor.get_dev_examples(squad_v2_data_dir)
|
||||
|
||||
# Loading a V1 processor
|
||||
processor = SquadV1Processor()
|
||||
examples = processor.get_dev_examples(squad_v1_data_dir)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
|
||||
Using `tensorflow_datasets` is as easy as using a data file:
|
||||
|
||||
Example::
|
||||
|
||||
# tensorflow_datasets only handle Squad V1.
|
||||
tfds_examples = tfds.load("squad")
|
||||
examples = SquadV1Processor().get_examples_from_dataset(tfds_examples, evaluate=evaluate)
|
||||
|
||||
features = squad_convert_examples_to_features(
|
||||
examples=examples,
|
||||
tokenizer=tokenizer,
|
||||
max_seq_length=max_seq_length,
|
||||
doc_stride=args.doc_stride,
|
||||
max_query_length=max_query_length,
|
||||
is_training=not evaluate,
|
||||
)
|
||||
|
||||
|
||||
Another example using these processors is given in the
|
||||
`run_squad.py <https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_squad.py>`__ script.
|
||||
Reference in New Issue
Block a user