Publications
Neural Networks for Linguistic Structured Prediction and Their Interpretability
Abstract
Linguistic structured prediction, such as sequence labeling, syntactic and semantic parsing, and coreference resolution, is one of the first stages in deep language understanding; its importance has been well recognized in the natural language processing community, and it has been applied to a wide range of downstream tasks. Most traditional high-performance linguistic structured prediction models are linear statistical models, including Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs), which rely heavily on hand-crafted features and task-specific resources. Such task-specific knowledge, however, is costly to develop, making structured prediction models difficult to adapt to new tasks or new domains. In the past few years, non-linear neural networks that take distributed word representations as input have been broadly applied to NLP problems with great success. By using distributed representations as inputs, these systems learn hidden representations directly from data instead of relying on manually designed hand-crafted features. Despite the impressive empirical successes of applying neural networks to linguistic structured prediction tasks, at least two major problems remain: 1) there is no consistent architecture for, or at least for components of, different structured prediction tasks that can be trained in a truly end-to-end setting; 2) the end-to-end training paradigm comes at the expense of model interpretability: understanding the role of different parts of a deep neural network is difficult. In this thesis, we discuss these two major problems in current neural models and attempt to provide …
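For readers unfamiliar with the neural approach the abstract contrasts against feature-engineered models, the following is a minimal sketch, not taken from the thesis, of a sequence labeler that learns its representations from data: word indices are mapped to embeddings, a bidirectional LSTM builds contextual hidden states, and a linear layer produces per-token label scores. All names and sizes (BiLSTMTagger, emb_dim, hidden_dim) are illustrative assumptions.

```python
# Illustrative sketch only: embeddings -> BiLSTM -> per-token label scores.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_labels, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)            # distributed word representations
        self.encoder = nn.LSTM(emb_dim, hidden_dim,
                               batch_first=True, bidirectional=True)  # contextual hidden representations
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)   # per-token label scores

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        embedded = self.embed(token_ids)
        hidden, _ = self.encoder(embedded)
        return self.classifier(hidden)                             # (batch, seq_len, num_labels)

if __name__ == "__main__":
    # Tiny usage example with random inputs; a full structured prediction
    # system would typically add a CRF layer over these scores to model
    # dependencies between adjacent labels.
    model = BiLSTMTagger(vocab_size=1000, num_labels=9)
    tokens = torch.randint(0, 1000, (2, 7))                        # 2 sentences, 7 tokens each
    print(model(tokens).shape)                                     # torch.Size([2, 7, 9])
```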
- Date
- November 6, 2025
- Authors
- Xuezhe Ma
- Institution
- Carnegie Mellon University