Ming Zhu, Aman Ahuja, Chandan Reddy


Answering questions in many real-world applications often requires complex and precise information excerpted from texts spanned across a long document. However, currently no such annotated dataset is publicly available, which hinders the development of neural question-answering (QA) systems. To this end, we present MASH-QA, a Multiple Answer Spans Healthcare Question Answering dataset from the consumer health domain, where answers may need to be excerpted from multiple, non-consecutive parts of text spanned across a long document. We also propose MultiCo, a neural architecture that is able to capture the relevance among multiple answer spans, by using a query-based contextualized sentence selection approach, for forming the answer to the given question. We also demonstrate that conventional QA models are not suitable for this type of task and perform poorly in this setting. Extensive experiments are conducted, and the experimental results confirm the proposed model significantly outperforms the state-of-the-art QA models in this multi-span QA setting.

Ming Zhu, Aman Ahuja, Da-Cheng Juan, Wei Wei, Chandan K. Reddy: Question Answering with Long Multiple-Span Answers. EMNLP (Findings) 2020: 3840-3849


Ming Zhu

Aman Ahuja

Chandan Reddy

Publication Details

Date of publication:
Empirical Methods in Natural Language Processing
Page number(s):