Goal-oriented document-grounded dialogues often involve complex contexts for identifying the most
relevant information, which requires a better understanding of the inter-relations between conversations and
documents. Meanwhile, many online user-oriented documents use both semi-structured and unstructured content
to guide users to information in different contexts. Thus, we create a new goal-oriented
document-grounded dialogue dataset that captures more diverse scenarios derived from various document contents
from multiple domains such as ssa.gov and studentaid.gov. For data collection, we propose a
novel pipeline approach for dialogue data construction, which has been adapted and evaluated for several
domains. We introduce multiple dialogue modeling tasks that are supported by our dataset, and present the
Figure 1: A sample dialogue (left) that is grounded in a document (right). Numbers in brackets such as
"&" indicate the grounding spans in the document.
For more information, please see the paper and the data page.