
Employing the Correspondence of Relations and Connectives to Identify Implicit Discourse Relations via Label Embeddings



Discourse parsing reveals the discourse units (e.g., text spans, sentences, clauses) of a document and how these units are related to each other to form a coherent text.


This work focuses on the task of implicit discourse relation recognition (IDRR), which aims to identify the discourse relations (e.g., cause, contrast) between adjacent text spans in documents.


IDRR is a fundamental problem in discourse analysis (Knott, 2014; Webber et al., 1999) with important applications in question answering (Liakata et al., 2013; Jansen et al., 2014) and text summarization (Gerani et al., 2014; Yoshida et al., 2014), to name a few.


Due to its importance, IDRR has been studied actively in the literature, leading to recent advances based on deep learning (Chen et al., 2016; Qin et al., 2016; Zhang et al., 2016; Lan et al., 2017; Dai and Huang, 2018).


Consider the following two text spans (called arguments), taken from Qin et al. (2017), as an example:


Argument 1: Never mind.

Argument 2: You already know the answer

An IDRR model should be able to recognize that argument 2 is the cause of argument 1 (i.e., the Cause relation) in this case.


This is a challenging problem because the models must rely solely on the text of the arguments to predict the discourse relations accurately.


The problem would become more manageable if connective/marker cues (e.g., “but”, “so”) were provided to connect the two arguments according to their discourse relations (Qin et al., 2017).


In the example above, it is beneficial for the models to know that “because” can serve as a connective between the two arguments that is consistent with their discourse relation (i.e., Cause).


In fact, a human annotator can also benefit from the connectives between arguments when he or she needs to assign discourse relations for pairs of arguments (Qin et al., 2017).


This is demonstrated in the Penn Discourse Treebank dataset (PDTB) (Prasad et al., 2008), a major benchmark dataset for IDRR, where the annotators first insert connectives between the arguments (called the “implicit connectives”) to aid the later assignment of relations to the arguments (Qin et al., 2017).


Motivated by the relevance of connectives for IDRR, some recent work on deep learning has explored methods to transfer knowledge from the implicit connectives to support discourse relation prediction using multi-task learning frameworks (Qin et al., 2017; Bai and Zhao, 2018).


The typical approach is to simultaneously predict the discourse relations and the implicit connectives for the input arguments, with the model parameters for the two prediction tasks shared/tied to allow knowledge transfer (Liu et al., 2016; Wu et al., 2016; Lan et al., 2017; Bai and Zhao, 2018).
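As a rough illustration of this shared-parameter setup, the sketch below computes a joint loss from two classification heads that read the same feature vector. It is a minimal toy example, not the architecture of any cited model: the names `joint_loss`, `rel_weights`, `conn_weights`, and `conn_weight` are hypothetical, and `shared_features` merely stands in for the output of a shared argument encoder.

```python
import math

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def linear(weights, features):
    """Multiply a (num_labels x dim) weight matrix by a feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[gold_index])

def joint_loss(shared_features, rel_weights, conn_weights,
               gold_rel, gold_conn, conn_weight=1.0):
    """Relation loss plus a weighted connective loss over shared features.

    Both heads read the same `shared_features`, so gradients from the
    connective task would also update the shared encoder (assumed here,
    not implemented).
    """
    rel_probs = softmax(linear(rel_weights, shared_features))
    conn_probs = softmax(linear(conn_weights, shared_features))
    return (cross_entropy(rel_probs, gold_rel)
            + conn_weight * cross_entropy(conn_probs, gold_conn))

# Toy inputs: 3-dimensional shared features, 2 relations, 3 connectives.
shared = [0.5, -1.0, 2.0]
rel_w = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
conn_w = [[0.2, 0.0, 0.1], [0.3, 0.1, -0.2], [-0.1, 0.2, 0.0]]
loss = joint_loss(shared, rel_w, conn_w, gold_rel=0, gold_conn=2)
```

Setting `conn_weight=0.0` reduces this to a relation-only objective, which makes the contribution of the auxiliary connective task easy to isolate.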


Unfortunately, such multi-task learning models for IDRR share a common limitation: they fail to exploit the mapping between the implicit connectives and the discourse relations.


In particular, each implicit connective in the PDTB dataset can be naturally mapped to its corresponding discourse relations based on its semantics, and this mapping can be further employed to transfer knowledge from the connectives to the relations.


For instance, in the PDTB dataset, the connective “consequently” uniquely corresponds to the relation cause while the connective “in contrast” can be associated with the relation comparison.
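A connective-relation mapping of this kind can be pictured as a simple lookup table. The sketch below is illustrative only: it contains just the three connectives used as examples in the text, the label strings follow the text's own wording, and the helper name `connectives_by_relation` is hypothetical. The full PDTB connective inventory is not reproduced here.

```python
from collections import defaultdict

# Illustrative entries only, taken from the examples in the text.
CONNECTIVE_TO_RELATION = {
    "because": "Cause",
    "consequently": "Cause",
    "in contrast": "Comparison",
}

def connectives_by_relation(mapping):
    """Invert a connective -> relation mapping into relation -> connectives."""
    grouped = defaultdict(list)
    for connective, relation in mapping.items():
        grouped[relation].append(connective)
    # Sort for a deterministic ordering of connectives within each relation.
    return {relation: sorted(conns) for relation, conns in grouped.items()}

RELATION_TO_CONNECTIVES = connectives_by_relation(CONNECTIVE_TO_RELATION)
```

The inverted view groups the connectives that signal the same relation, which is the structure a model would need in order to pass information from connective predictions to relation predictions.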


In this work, we argue that the knowledge transfer facilitated by such a connective-relation mapping can indeed help to improve the performance of the multi-task learning models for IDRR with deep learning.


Consequently, in order to exploit the connective-relation mapping, we propose to embed the implicit connectives and the discourse relations into the same space, which is then used to transfer knowledge between connective and relation prediction via the mapping.
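To make the idea of a shared label space concrete, the sketch below places connective and relation labels in one vector space and measures how far each connective lies from the relation it maps to. This is only one conceivable way to use the mapping and is an assumption for illustration; the mechanisms actually proposed in this work are not specified in this section. All names (`init_embeddings`, `mapping_penalty`) and the choice of squared Euclidean distance are hypothetical.

```python
import random

def init_embeddings(labels, dim, rng):
    """Assign each label a random vector of length `dim`."""
    return {label: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for label in labels}

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mapping_penalty(conn_emb, rel_emb, mapping):
    """Mean squared distance between each connective and its mapped relation.

    Assumed illustration: minimizing this term would pull a connective's
    embedding toward the embedding of its corresponding relation.
    """
    total = sum(squared_distance(conn_emb[c], rel_emb[r])
                for c, r in mapping.items())
    return total / len(mapping)

rng = random.Random(0)
# Example connective -> relation pairs taken from the text.
mapping = {"because": "Cause", "consequently": "Cause", "in contrast": "Comparison"}
conn_emb = init_embeddings(sorted(mapping), dim=4, rng=rng)
rel_emb = init_embeddings(sorted(set(mapping.values())), dim=4, rng=rng)
penalty = mapping_penalty(conn_emb, rel_emb, mapping)
```

Because both label types live in vectors of the same dimension, any quantity defined on pairs of vectors (here a distance) can relate a connective directly to a relation, which is not possible when the two tasks only share encoder parameters.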


We introduce several mechanisms to encourage both knowledge sharing and representation distinction for the embeddings of the connectives and relations for IDRR.


In the experiments, we extensively demonstrate that the novel embeddings of connectives and relations along with the proposed mechanisms significantly improve the multi-task learning models for IDRR.


We achieve state-of-the-art performance for IDRR in several settings of the PDTB benchmark dataset.


Related Work

There has been much research on IDRR since the creation of the PDTB dataset (Prasad et al., 2008).


Early work manually designed various features for IDRR (Pitler et al., 2009; Lin et al., 2009; Wang et al., 2010; Zhou et al., 2010; Braud and Denis, 2015; Lei et al., 2018), while recent approaches have applied deep learning to significantly improve the performance of IDRR (Zhang et al., 2015; Ji et al., 2015a; Chen et al., 2016; Liu et al., 2016; Qin et al., 2016; Zhang et al., 2016; Cai and Zhao, 2017; Lan et al., 2017; Wu et al., 2017; Dai and Huang, 2018; Kishimoto et al., 2018).


The work most closely related to ours involves multi-task learning models for IDRR that employ connectives as auxiliary labels for the prediction of the discourse relations.


Among the feature-based approaches, Zhou et al. (2010) employ a pipelined approach that first predicts the connectives and then assigns discourse relations accordingly, while Lan et al. (2013) use the connective-relation mapping to automatically generate synthetic data.


Among the recent deep learning approaches to IDRR, Liu et al. (2016), Wu et al. (2016), Lan et al. (2017), and Bai and Zhao (2018) simultaneously predict connectives and relations with shared model parameters, while Qin et al. (2017) develop adversarial networks to encourage the relation models to mimic the features learned when the connectives are incorporated.


However, none of these works employs embeddings of connectives and relations to transfer knowledge with the connective-relation mapping and deep learning, as we do in this work.



