Hanwen (Zoe) Zheng, Sijia Wang, Lifu Huang

Abstract

Document-level information extraction (IE) is a crucial task in natural language processing (NLP). This paper conducts a systematic review of recent document-level IE literature. In addition, we conduct a thorough error analysis of current state-of-the-art algorithms and identify their limitations as well as the remaining challenges for document-level IE. According to our findings, labeling noise, entity coreference resolution, and lack of reasoning severely affect the performance of document-level IE. The objective of this survey is to provide more insights and help NLP researchers further enhance document-level IE performance.

People

Hanwen (Zoe) Zheng

Sijia Wang

Lifu Huang


Publication Details

Date of publication:
September 23, 2023
Journal:
Cornell University
Publication note:

Hanwen Zheng, Sijia Wang, Lifu Huang: A Survey of Document-Level Information Extraction. CoRR abs/2309.13249 (2023)