MultiDoc2Dial is a new task and dataset for modeling goal-oriented dialogues grounded in multiple documents. Most previous work treats document-grounded dialogue modeling as a machine reading comprehension task based on a single given document or passage. We aim to address more realistic scenarios in which a goal-oriented, information-seeking conversation involves multiple topics and is hence grounded in different documents.
To facilitate such a study, we introduce a new dataset that contains dialogues grounded in multiple documents from four different domains, along with tasks that involve modeling the dialogue-based and document-based context to predict the next agent turn. Figure 1 illustrates a goal-oriented dialogue that corresponds to multiple documents.
@inproceedings{feng2021multidoc2dial,
title={MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents},
author={Feng, Song and Patel, Siva Sankalp and Wan, Hui and Joshi, Sachindra},
booktitle={EMNLP},
year={2021}
}
Data
Please check out our data: data and readme.
For the domain adaptation setup, check out data.
Figure 1: A sample goal-oriented dialogue (left) that is grounded in several documents (right).