typos in seq2seq/readme (#5937)
@@ -14,7 +14,7 @@ wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/xsum.tar.gz
|
|||||||
tar -xzvf xsum.tar.gz
|
tar -xzvf xsum.tar.gz
|
||||||
export XSUM_DIR=${PWD}/xsum
|
export XSUM_DIR=${PWD}/xsum
|
||||||
```
|
```
|
||||||
this should make a directory called cnn_dm/ with files like `test.source`.
|
this should make a directory called `xsum/` with files like `test.source`.
|
||||||
To use your own data, copy that files format. Each article to be summarized is on its own line.
|
To use your own data, copy that files format. Each article to be summarized is on its own line.
|
||||||
|
|
||||||
CNN/DailyMail data
|
CNN/DailyMail data
|
||||||
@@ -22,8 +22,8 @@ CNN/DailyMail data
|
|||||||
cd examples/seq2seq
|
cd examples/seq2seq
|
||||||
wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/cnn_dm.tgz
|
wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/cnn_dm.tgz
|
||||||
tar -xzvf cnn_dm.tgz
|
tar -xzvf cnn_dm.tgz
|
||||||
|
|
||||||
export CNN_DIR=${PWD}/cnn_dm
|
export CNN_DIR=${PWD}/cnn_dm
|
||||||
|
this should make a directory called `cnn_dm/` with files like `test.source`.
|
||||||
```
|
```
|
||||||
|
|
||||||
WMT16 English-Romanian Translation Data:
|
WMT16 English-Romanian Translation Data:
|
||||||
@@ -32,6 +32,7 @@ cd examples/seq2seq
|
|||||||
wget https://s3.amazonaws.com/datasets.huggingface.co/translation/wmt_en_ro.tar.gz
|
wget https://s3.amazonaws.com/datasets.huggingface.co/translation/wmt_en_ro.tar.gz
|
||||||
tar -xzvf wmt_en_ro.tar.gz
|
tar -xzvf wmt_en_ro.tar.gz
|
||||||
export ENRO_DIR=${PWD}/wmt_en_ro
|
export ENRO_DIR=${PWD}/wmt_en_ro
|
||||||
|
this should make a directory called `wmt_en_ro/` with files like `test.source`.
|
||||||
```
|
```
|
||||||
|
|
||||||
If you are using your own data, it must be formatted as one directory with 6 files: train.source, train.target, val.source, val.target, test.source, test.target.
|
If you are using your own data, it must be formatted as one directory with 6 files: train.source, train.target, val.source, val.target, test.source, test.target.
|
||||||
|
|||||||
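Review note: the six-file layout required by the README text in this diff can be checked before training. The sketch below is a minimal illustration, not part of the repository; the function name `check_seq2seq_dir` is hypothetical, and it assumes only the file names listed in the README (`train`, `val`, `test`, each with a `.source` and a `.target` file).

```shell
# Hypothetical helper (not part of the repository): verify that a data
# directory contains the six files named in the README.
check_seq2seq_dir() {
  dir="$1"
  missing=0
  for split in train val test; do
    for side in source target; do
      # Report every missing file instead of stopping at the first one.
      if [ ! -f "$dir/$split.$side" ]; then
        echo "missing: $dir/$split.$side" >&2
        missing=1
      fi
    done
  done
  # Exit status 0 when all six files exist, 1 otherwise.
  return "$missing"
}
```

It could be called as `check_seq2seq_dir "$XSUM_DIR"` (or with `$CNN_DIR` or `$ENRO_DIR`) after the corresponding `export` line. It checks only that the files exist, not their contents.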