[introduction] neural network-based abstract generation for opinions and arguments
TRANSCRIPT
Neural Network-Based Abstract Generation for Opinions and Arguments
Lu Wang and Wang Ling
NAACL 2016
Paper introduction
Presenter: Tomonori Kodaira
1
Abstract
• Abstractive document-to-sentence summarization
• When encoding the documents, a sampling step keeps the computational cost manageable: the top-K text units, ranked by estimated importance, are used
2
Introduction
• Authors present an attention-based NN model for generating abstractive summaries of opinionated text.
• Their system takes as input a set of text units, and then outputs a one-sentence abstractive summary.
• Two types of opinionated text: movie reviews, and arguments on controversial topics
3
Introduction
• Systems: an attention-based model (Bahdanau et al., 2014) combined with an importance-based sampling method.
• The importance score of a text unit is estimated from a regression model with pairwise preference-based sampling.
4
Data Collection
Rotten Tomatoes (www.rottentomatoes.com)
• The site hosts both professional critic reviews and user-generated reviews.
• Each movie has a one-sentence critic consensus.
Data
• 246,164 critic reviews and their opinion consensuses for 3,731 movies
• train: 2,458, validation: 536, test: 737 movies
Movie Reviews
5
Data Collection
idebate.org
• This site is a Wikipedia-style website for gathering pro and con arguments on controversial issues.
• Each point contains a one-sentence central claim.
Data
• 676 debates with 2,259 claims.
• train: 450, validation: 67, test: 150 debates
Arguments on controversial topics
6
The Neural Network-Based Abstract Generation Model
• A summary y is composed of a sequence of words y1, …, y|y|.
• The input consists of an arbitrary number of reviews or arguments -> text units x = {x1, …, xM}.
• Each text unit xk is composed of a sequence of words xk1, …, xk|xk|.
Problem Formulation
8
The Neural Network-Based Abstract Generation Model
• A sequence of word-level predictions: log P(y|x) = ∑j log P(yj | y1, …, yj-1, x), where P(yj | y1, …, yj-1, x) = softmax(hj).
• hj is the RNN's state variable: hj = g(yj-1, hj-1, s).
• g is an LSTM network (Hochreiter and Schmidhuber, 1997).
Decoder
9
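The word-level factorization on this slide can be sketched numerically. This is a minimal toy, not the paper's trained decoder: the logits h_j would come from the LSTM g, but here they are supplied directly so the softmax and the log-probability sum can be checked in isolation.

```python
import numpy as np

def softmax(h):
    # Numerically stable softmax over vocabulary logits.
    e = np.exp(h - h.max())
    return e / e.sum()

def sequence_log_prob(logits_per_step, target_ids):
    # log P(y|x) = sum_j log P(y_j | y_1..y_{j-1}, x),
    # with each conditional P(.|.) = softmax(h_j), as on the slide.
    total = 0.0
    for h_j, y_j in zip(logits_per_step, target_ids):
        total += np.log(softmax(h_j)[y_j])
    return total
```

With uniform logits over a 4-word vocabulary, each step contributes log(1/4), so a 2-word summary scores 2·log(0.25).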
The Neural Network-Based Abstract Generation Model
• LSTM
• The model concatenates the representation of the previous output word yj-1 and the input representation s into uj.
Decoder
10
The Neural Network-Based Abstract Generation Model
• The representation of the input text units, s, is computed with an attention model (Bahdanau et al., 2014): s = ∑i ai bi.
• Authors construct bi with a bidirectional LSTM, reusing the LSTM formulation with uj = xj.
• ai = softmax(v(bi, hj-1)), where v(bi, hj-1) = Ws·tanh(Wcg bi + Whg hj-1).
Encoder
11
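The three encoder equations above can be combined into one small function. This is a sketch with randomly initialized weights (the shapes of Ws, Wcg, Whg are assumptions for illustration); it only demonstrates that the attention weights ai form a distribution and that s is their weighted sum of the bi.

```python
import numpy as np

def attention_context(B, h_prev, Ws, Wcg, Whg):
    # Score each input representation b_i against the decoder state:
    #   v(b_i, h_{j-1}) = Ws . tanh(Wcg b_i + Whg h_{j-1})
    scores = np.array([Ws @ np.tanh(Wcg @ b + Whg @ h_prev) for b in B])
    # a_i = softmax over the scores.
    a = np.exp(scores - scores.max())
    a /= a.sum()
    # s = sum_i a_i b_i  (the attended input representation).
    s = (a[:, None] * B).sum(axis=0)
    return s, a
```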
The Neural Network-Based Abstract Generation Model
Their input consists of multiple separate text units, concatenated into one sequence z.
There are two problems:
• The model is sensitive to the order of the text units.
• z may contain thousands of words.
Attention Over Multiple Inputs
12
The Neural Network-Based Abstract Generation Model
Sub-sampling from the input
• They define an importance score f(xk) ∈ [0, 1] for each document xk.
• K candidates are sampled according to these scores.
Attention Over Multiple Inputs
13
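A simple deterministic variant of this sub-sampling (keeping the K highest-scoring text units rather than sampling stochastically, which is an assumption made here for clarity) can be sketched as:

```python
def top_k_units(units, scores, k):
    # Keep the K text units with the highest importance score f(x_k).
    order = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)
    return [units[i] for i in order[:k]]
```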
The Neural Network-Based Abstract Generation Model
A ridge regression model with a regularizer.
• Learn f(xk) = rk·w by minimizing ||Rw − L||₂² + λ·||R′w − L′||₂² + β·||w||₂².
• Each text unit xk is represented as a d-dimensional feature vector rk ∈ Rd.
Importance Estimation
14
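The regularized objective on this slide is quadratic in w, so it has a closed-form minimizer. A minimal sketch, assuming R/L are the features and labels of the main sample and R′/L′ those of the pairwise-preference sample:

```python
import numpy as np

def fit_importance(R, L, Rp, Lp, lam, beta):
    # Minimize ||Rw - L||^2 + lam*||R'w - L'||^2 + beta*||w||^2.
    # Setting the gradient to zero gives the normal equations:
    #   (R^T R + lam R'^T R' + beta I) w = R^T L + lam R'^T L'
    d = R.shape[1]
    A = R.T @ R + lam * (Rp.T @ Rp) + beta * np.eye(d)
    b = R.T @ L + lam * (Rp.T @ Lp)
    return np.linalg.solve(A, b)
```

With identical, noise-free targets in both terms and no shrinkage (beta = 0), the fit recovers them exactly.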
The Neural Network-Based Abstract Generation Model
• At test time, they re-rank the n-best summaries by their cosine similarity with the input text units.
• The one with the highest similarity is included in the final summary.
Post-processing
15
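The re-ranking step reduces to an argmax over cosine similarities. A sketch, assuming each candidate summary and the input are already represented as vectors (how those vectors are built is not specified on the slide):

```python
import numpy as np

def rerank(candidates, cand_vecs, input_vec):
    # Return the n-best summary whose vector has the highest
    # cosine similarity with the input representation.
    def cos(u, v):
        return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    sims = [cos(v, input_vec) for v in cand_vecs]
    return candidates[int(np.argmax(sims))]
```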
Experimental Setup
• Data Preprocessing: Stanford CoreNLP (Manning et al., 2014)
• Pre-trained Embeddings and Features: word embeddings of 300 dimensions. They extend their model with additional features.
16
• Hyperparameters: the LSTMs use states and cells of 150 dimensions; the attention uses 100 dimensions. Training is performed via Adagrad (Duchi et al., 2011).
• Evaluation: BLEU
• The importance-based sampling rate K is set to 5.
• Decoding: beam search with beam size 20.
Experimental Setup
17
Results
• MRR (Mean Reciprocal Rank)
• NDCG (Normalized Discounted Cumulative Gain)
Importance Estimation Evaluation
18
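The two ranking metrics named above have standard definitions that are easy to state in code; this is a generic sketch of MRR and NDCG, not the paper's evaluation script.

```python
import numpy as np

def mrr(first_relevant_ranks):
    # Mean Reciprocal Rank: average of 1/rank of the first relevant item,
    # over a list of queries (ranks are 1-based).
    return sum(1.0 / r for r in first_relevant_ranks) / len(first_relevant_ranks)

def ndcg(relevances):
    # NDCG: DCG of the given ranking divided by the DCG of the
    # ideal (descending-relevance) ordering of the same items.
    def dcg(rels):
        return sum(r / np.log2(i + 2) for i, r in enumerate(rels))
    ideal = sorted(relevances, reverse=True)
    return dcg(relevances) / dcg(ideal)
```

For example, first-relevant ranks of 1 and 2 give an MRR of 0.75, and a ranking already in ideal order has NDCG 1.0.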
Conclusion
• Authors presented a neural approach to generate abstractive summaries for opinionated text.
• They employed an attention-based method that finds salient information from different input text units.
• They deploy an importance-based sampling mechanism for model training.
• Their system obtained state-of-the-art results.
22