VQA task guide (#25244)
* initial commit

* semi-finished task guide draft

* image link

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tasks/visual_question_answering.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* feedback addressed

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* nits addressed

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@@ -75,6 +75,8 @@
       title: Image captioning
     - local: tasks/document_question_answering
       title: Document Question Answering
+    - local: tasks/visual_question_answering
+      title: Visual Question Answering
     - local: tasks/text-to-speech
       title: Text to speech
     title: Multimodal