Dependency Parsing

分类: NLP基础

定义

依存句法分析是将句子中的词通过有向边（依存关系）连接成树结构的任务，每条边表示一个修饰词（dependent）指向其中心词（head），揭示句子的语法结构。是 CS224N 中重点讲授的经典 NLP 任务之一

基于转移的解析（Transition-based Parsing）定义三种操作：

神经网络依存解析器（Chen & Manning, 2014）用前馈网络对每步操作打分： $p(a | c) = \text{softmax}(W_2 \cdot \text{ReLU}(W_1 \cdot [e_{w}; e_{t}; e_{l}] + b_1))$

其中 $c$ 为当前解析配置， $e_w, e_t, e_l$ 分别为词、词性、依存标签的嵌入拼接

依存关系类型：nsubj（名词主语）、dobj（直接宾语）、amod（形容词修饰）、prep（介词）等，Universal Dependencies (UD) 定义了 37 种通用关系

Transition-based vs Graph-based：

Chen & Manning (2014)：首次用神经网络替代手工特征做依存解析，CS224N 作业的经典题目

评估指标：UAS（Unlabeled Attachment Score，只看 head 对不对）和 LAS（Labeled Attachment Score，head 和关系类型都要对）

实际应用：信息抽取、关系抽取、语义角色标注、问答系统

在 Transformer 时代，依存解析仍有价值——BERT 的注意力头被发现隐式编码了某些依存关系

Chen & Manning, 2014: A Fast and Accurate Dependency Parser using Neural Networks

Dozat & Manning, 2017: Deep Biaffine Attention for Neural Dependency Parsing

Universal Dependencies: 跨语言一致的依存标注框架