
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2102–2113, Florence, Italy, July 28 - August 2, 2019. © 2019 Association for Computational Linguistics


Learning to Select, Track, and Generate for Data-to-Text

Hayate Iso* (Nara Institute of Science and Technology / Artificial Intelligence Research Center, AIST), Yui Uehara (AIST), Tatsuya Ishigaki (Tokyo Institute of Technology / AIST), Hiroshi Noji (AIST), Eiji Aramaki (Nara Institute of Science and Technology / AIST), Ichiro Kobayashi (Ochanomizu University / AIST), Yusuke Miyao (The University of Tokyo / AIST), Naoaki Okazaki (Tokyo Institute of Technology / AIST), Hiroya Takamura (Tokyo Institute of Technology / AIST)

{iso.hayate.id3,aramaki}@is.naist.jp  [email protected]
{yui.uehara,ishigaki.t,hiroshi.noji,takamura.hiroya}@aist.go.jp
[email protected]  [email protected]

Abstract

We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. Our tracking module selects and keeps track of salient information and memorizes which record has been mentioned. Our generation module generates a summary conditioned on the state of the tracking module. Our model is considered to simulate the human-like writing process that gradually selects the information by determining the intermediate variables while writing the summary. In addition, we also explore the effectiveness of the writer information for generation. Experimental results show that our model outperforms existing models in all evaluation metrics even without writer information. Incorporating writer information further improves the performance, contributing to content planning and surface realization.

1 Introduction

Advances in sensor and data storage technologies have rapidly increased the amount of data produced in various fields such as weather, finance, and sports. In order to address the information overload caused by this massive amount of data, data-to-text generation technology, which expresses the contents of data in natural language, is becoming more important (Barzilay and Lapata, 2005). Recently, neural methods have been able to generate high-quality short summaries, especially from small pieces of data (Liu et al., 2018).

Despite this success, it remains challenging to generate a high-quality long summary from data (Wiseman et al., 2017). One reason for the difficulty is that the input data are too large for a naive model to find the salient parts, i.e., to determine which part of the data should be mentioned.

* Work was done during the internship at the Artificial Intelligence Research Center, AIST.

In addition, the salient part moves as the summary explains the data. For example, when generating a summary of a basketball game (Table 1 (b)) from the box score (Table 1 (a)), the input contains numerous data records about the game, e.g., that Jordan Clarkson scored 18 points. Existing models often refer to the same data record multiple times (Puduppully et al., 2019). The models may also mention an incorrect data record, e.g., "Kawhi Leonard added 19 points" when the summary should mention LaMarcus Aldridge, who scored 19 points. Thus, we need a model that finds salient parts, tracks transitions of salient parts, and expresses information faithful to the input.

In this paper, we propose a novel data-to-text generation model with two modules, one for saliency tracking and another for text generation. The tracking module keeps track of saliency in the input data: when the module detects a saliency transition, it selects a new data record[1] and updates the state of the tracking module. The text generation module generates a document conditioned on the current tracking state. Our model is considered to imitate the human-like writing process that gradually selects and tracks the data while generating the summary. In addition, we note some writer-specific patterns and characteristics: how data records are selected to be mentioned, and how data records are expressed as text, e.g., the order of data records and the word usage. We also incorporate writer information into our model.

The experimental results demonstrate that, even without writer information, our model achieves the best performance among the previous models in all evaluation metrics: 94.38% precision of relation generation, 42.40% F1 score of content selection, 19.38% normalized Damerau-Levenshtein Distance (DLD) of content ordering, and a 16.15 BLEU score. We also confirm that adding writer information further improves the performance.

[1] We use 'data record' and 'relation' interchangeably.

2 Related Work

2.1 Data-to-Text Generation

Data-to-text generation is the task of generating descriptions from structured or non-structured data, including sports commentary (Tanaka-Ishii et al., 1998; Chen and Mooney, 2008; Taniguchi et al., 2019), weather forecasts (Liang et al., 2009; Mei et al., 2016), biographical text from Wikipedia infoboxes (Lebret et al., 2016; Sha et al., 2018; Liu et al., 2018), and market comments from stock prices (Murakami et al., 2017; Aoki et al., 2018).

Neural generation methods have become the mainstream approach for data-to-text generation. The encoder-decoder framework (Cho et al., 2014; Sutskever et al., 2014) with attention (Bahdanau et al., 2015; Luong et al., 2015) and copy mechanisms (Gu et al., 2016; Gulcehre et al., 2016) has been successfully applied to data-to-text tasks. However, neural generation methods sometimes yield fluent but inadequate descriptions (Tu et al., 2017). In data-to-text generation, descriptions inconsistent with the input data are problematic.

Recently, Wiseman et al. (2017) introduced the ROTOWIRE dataset, which contains multi-sentence summaries of basketball games with box scores (Table 1). This dataset requires the selection of a salient subset of data records for generating descriptions. They also proposed automatic evaluation metrics for measuring the informativeness of generated summaries.

Puduppully et al. (2019) proposed a two-stage method that first predicts the sequence of data records to be mentioned and then generates a summary conditioned on the predicted sequence. Their idea is similar to ours in that both consider a sequence of data records as content planning. However, our proposal differs from theirs in that ours uses a recurrent neural network for saliency tracking, and our decoder dynamically chooses a data record to be mentioned without fixing a sequence of data records in advance.

2.2 Memory modules

Memory networks can be used to maintain and update representations of salient information (Weston et al., 2015; Sukhbaatar et al., 2015; Graves et al., 2016). This type of module is often used in natural language understanding to keep track of entity states (Kobayashi et al., 2016; Hoang et al., 2018; Bosselut et al., 2018).

Recently, entity tracking has become popular for generating coherent text (Kiddon et al., 2016; Ji et al., 2017; Yang et al., 2017; Clark et al., 2018). Kiddon et al. (2016) proposed a neural checklist model that updates predefined item states. Ji et al. (2017) proposed an entity representation for language modeling; their method updates the entity tracking states when an entity is introduced and selects the salient entity state.

Our model extends this entity tracking module to data-to-text generation tasks. The entity tracking module selects the salient entity and the appropriate attribute at each time step, updates their states, and generates a coherent summary from the selected data records.

3 Data

Through careful examination, we found that in the original ROTOWIRE dataset, some NBA games have two documents, one of which is sometimes in the training data and the other in the test or validation data. Such documents are similar to each other, though not identical. To make this dataset more reliable as an experimental dataset, we created a new version.

We ran the script provided by Wiseman et al. (2017) for crawling the ROTOWIRE website for NBA game summaries. The script collected approximately 78% of the documents in the original dataset; the remaining documents have disappeared. We also collected the box scores associated with the collected documents. We observed that some of the box scores had been modified compared with the original ROTOWIRE dataset.

The collected dataset contains 3,752 instances (i.e., pairs of a document and box scores). However, the four shortest documents were not summaries; they were, for example, announcements about the postponement of a match. We thus deleted these 4 instances and were left with 3,748 instances. We followed the dataset split of Wiseman et al. (2017) to split our dataset into training, development, and test data. We found 14 instances that did not have corresponding instances in the original data. We randomly assigned 9, 2, and 3 of those 14 instances to the training, development, and test data, respectively.


TEAM | H/V | WIN | LOSS | PTS | REB | AST | FG PCT | FG3 PCT | ...
KNICKS | H | 16 | 19 | 104 | 46 | 26 | 45 | 46 | ...
BUCKS | V | 18 | 16 | 105 | 42 | 20 | 47 | 32 | ...

PLAYER | H/V | PTS | REB | AST | BLK | STL | MIN | CITY | ...
CARMELO ANTHONY | H | 30 | 11 | 7 | 0 | 2 | 37 | NEW YORK | ...
DERRICK ROSE | H | 15 | 3 | 4 | 0 | 1 | 33 | NEW YORK | ...
COURTNEY LEE | H | 11 | 2 | 3 | 1 | 1 | 38 | NEW YORK | ...
GIANNIS ANTETOKOUNMPO | V | 27 | 13 | 4 | 3 | 1 | 39 | MILWAUKEE | ...
GREG MONROE | V | 18 | 9 | 4 | 1 | 3 | 31 | MILWAUKEE | ...
JABARI PARKER | V | 15 | 4 | 3 | 0 | 1 | 37 | MILWAUKEE | ...
MALCOLM BROGDON | V | 12 | 6 | 8 | 0 | 0 | 38 | MILWAUKEE | ...
MIRZA TELETOVIC | V | 13 | 1 | 0 | 0 | 0 | 21 | MILWAUKEE | ...
JOHN HENSON | V | 2 | 2 | 0 | 0 | 0 | 14 | MILWAUKEE | ...
... | ... | ... | ... | ... | ... | ... | ... | ... | ...

(a) Box score: the top table shows the number of wins and losses and a summary of each game. The bottom table shows the statistics of each player, such as points scored (PLAYER's PTS) and total rebounds (PLAYER's REB).

The Milwaukee Bucks defeated the New York Knicks, 105-104, at Madison Square Garden on Wednesday. The Knicks (16-19) checked in to Wednesday's contest looking to snap a five-game losing streak and, heading into the fourth quarter, they looked like they were well on their way to that goal. ... Antetokounmpo led the Bucks with 27 points, 13 rebounds, four assists, a steal and three blocks, his second consecutive double-double. Greg Monroe actually checked in as the second-leading scorer and did so in his customary bench role, posting 18 points, along with nine boards, four assists, three steals and a block. Jabari Parker contributed 15 points, four rebounds, three assists and a steal. Malcolm Brogdon went for 12 points, eight assists and six rebounds. Mirza Teletovic was productive in a reserve role as well, generating 13 points and a rebound. ... Courtney Lee checked in with 11 points, three assists, two rebounds, a steal and a block. ... The Bucks and Knicks face off once again in the second game of the home-and-home series, with the meeting taking place Friday night in Milwaukee.

(b) NBA basketball game summary: each summary describes the victory or defeat in the game and highlights valuable players.

Table 1: Example of input and output data: the task takes the box score (1a) as input and the summary document of the game (1b) as output. Extracted entities are shown in bold face. Extracted values are shown in green.

t   | 199           | 200           | 201         | 202           | 203    | 204 | 205           | 206      | 207 | 208           | 209
Yt  | Jabari        | Parker        | contributed | 15            | points | ,   | four          | rebounds | ,   | three         | assists
Zt  | 1             | 1             | 0           | 1             | 0      | 0   | 1             | 0        | 0   | 1             | 0
Et  | JABARI PARKER | JABARI PARKER | -           | JABARI PARKER | -      | -   | JABARI PARKER | -        | -   | JABARI PARKER | -
At  | FIRST NAME    | LAST NAME     | -           | PLAYER PTS    | -      | -   | PLAYER REB    | -        | -   | PLAYER AST    | -
Nt  | -             | -             | -           | 0             | -      | -   | 1             | -        | -   | 1             | -

Table 2: Running example of our model's generation process. At every time step t, the model predicts each random variable. The model first determines whether to refer to the data records (Zt = 1) or not (Zt = 0). If Zt = 1, the model selects the entity Et, its attribute At, and the binary variable Nt if needed. For example, at t = 202, the model predicts Z202 = 1 and then selects the entity JABARI PARKER and its attribute PLAYER PTS. Given these values, the model outputs the token 15 from the selected data record.
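To make the annotation of Table 2 concrete, the following is a minimal sketch that encodes a few of its time steps as plain Python data; the field names are chosen here only for illustration and are not from the paper.

    # A few annotated time steps of the Table 2 running example
    # (field names are illustrative). Steps with z = 0 carry no
    # entity, attribute, or number flag.
    steps = [
        {"t": 199, "y": "Jabari",      "z": 1, "entity": "JABARI PARKER", "attr": "FIRST NAME", "n": None},
        {"t": 200, "y": "Parker",      "z": 1, "entity": "JABARI PARKER", "attr": "LAST NAME",  "n": None},
        {"t": 201, "y": "contributed", "z": 0},
        {"t": 202, "y": "15",          "z": 1, "entity": "JABARI PARKER", "attr": "PLAYER PTS", "n": 0},
        {"t": 203, "y": "points",      "z": 0},
        {"t": 205, "y": "four",        "z": 1, "entity": "JABARI PARKER", "attr": "PLAYER REB", "n": 1},
    ]
    # N = 0 means the value is written in Arabic numerals ("15");
    # N = 1 means it is spelled out in English words ("four"),
    # following the annotation shown in Table 2.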

Finally, the sizes of our training, development, and test datasets are 2,714, 534, and 500, respectively. On average, each summary has 384 tokens and 644 data records. Each match has only one summary in our dataset, as far as we checked. We also collected the writer of each document. Our dataset contains 32 different writers. The most prolific writer in our dataset wrote 607 documents, and there are also writers who wrote fewer than ten documents. On average, each writer wrote 117 documents. We call our new dataset ROTOWIRE-MODIFIED.[2]

[2] For information about the dataset, please follow this link: https://github.com/aistairc/rotowire-modified

4 Saliency-Aware Text Generation

At the core of our model is a neural language model with a memory state h^LM to generate a summary y_{1:T} = (y_1, ..., y_T) given a set of data records x. Our model has another memory state h^ENT, which is used to remember the data records that have been referred to. h^ENT is also used to update h^LM, meaning that the referred data records affect the text generation.

Our model decides whether to refer to x, which data record r ∈ x to mention, and how to express a number. The selected data record is used to update h^ENT. Formally, we use the following four variables:

1. Z_t: a binary variable that determines whether the model refers to the input x at time step t (Z_t = 1).
2. E_t: at each time step t, this variable indicates the salient entity (e.g., HAWKS, LEBRON JAMES).
3. A_t: at each time step t, this variable indicates the salient attribute to be mentioned (e.g., PTS).
4. N_t: if attribute A_t of the salient entity E_t is a numeric attribute, this variable determines whether the value in the data records should be output in Arabic numerals (e.g., 50) or in English words (e.g., five).

To keep track of the salient entity, our model predicts these random variables at each time step t through its summary generation process. A running example of our model is shown in Table 2, and the full algorithm is described in Appendix A. In the following subsections, we explain how to initialize the model, predict these random variables, and generate a summary. Due to space limitations, bias vectors are omitted.

Before explaining our method, we describe our notation. Let E and A denote the sets of entities and attributes, respectively. Each record r ∈ x consists of an entity e ∈ E, an attribute a ∈ A, and its value x[e, a], and is therefore represented as r = (e, a, x[e, a]). For example, the box score in Table 1 has a record r such that e = ANTHONY DAVIS, a = PTS, and x[e, a] = 20.
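As a quick illustration of this notation, a box score can be viewed as a mapping from (entity, attribute) pairs to values, from which the records r = (e, a, x[e, a]) are enumerated. The sketch below uses a toy excerpt of Table 1; it is only a data-structure illustration, not the paper's preprocessing code.

    # Hypothetical toy box score x, following the notation above and
    # using a few cells from Table 1 (a).
    box_score = {
        ("CARMELO ANTHONY", "PTS"): 30,
        ("CARMELO ANTHONY", "REB"): 11,
        ("GIANNIS ANTETOKOUNMPO", "PTS"): 27,
    }

    def records(x):
        """Enumerate all records r = (e, a, x[e, a]) in a box score x."""
        return [(e, a, v) for (e, a), v in x.items()]

    entities = {e for (e, _) in box_score}    # the entities appearing in this game
    attributes = {a for (_, a) in box_score}  # the attributes appearing in this game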

4.1 Initialization

Let r denote the embedding of a data record r ∈ x. Let ē denote the dynamic embedding of entity e; note that ē depends on the set of data records, i.e., it depends on the game. We also use e for the static embedding of entity e, which, on the other hand, does not depend on the game.

Given the embeddings of entity e, attribute a, and its value v, we use a concatenation layer to combine the information from these vectors and produce the embedding of each data record (e, a, v), denoted as r_{e,a,v}:

  r_{e,a,v} = tanh(W^R (e ⊕ a ⊕ v)),    (1)

where ⊕ indicates the concatenation of vectors and W^R denotes a weight matrix.[3]

We obtain ē for entity e in the set of data records x by summing all of its data-record embeddings, each transformed by an attribute-specific matrix:

  ē = tanh( Σ_{a∈A} W^A_a r_{e,a,x[e,a]} ),    (2)

where W^A_a is a weight matrix for attribute a. Since ē depends on the game as above, ē is supposed to represent how entity e played in the game.

To initialize the hidden state of each module, we use the embedding of <SOD> for h^LM and the averaged ē embeddings for h^ENT.

[3] We also concatenate an embedding vector that represents whether the entity is on the home or away team.
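Equations (1) and (2) are simple feed-forward operations. The following minimal numpy sketch, with toy dimensions and randomly initialized stand-ins for the trained parameters, shows how a record embedding and the game-dependent entity embedding ē are computed; it is an illustration under these assumptions, not the authors' implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 8  # toy embedding size; the paper's settings are in Appendix B

    # Randomly initialized stand-ins for the trained parameters.
    W_R = rng.normal(size=(d, 3 * d))                           # W^R in Eq. (1)
    W_A = {a: rng.normal(size=(d, d)) for a in ("PTS", "REB")}  # W^A_a in Eq. (2)

    def record_embedding(e_static, a_vec, v_vec):
        """Eq. (1): r_{e,a,v} = tanh(W^R (e ⊕ a ⊕ v))."""
        return np.tanh(W_R @ np.concatenate([e_static, a_vec, v_vec]))

    def dynamic_entity_embedding(e_static, per_attribute):
        """Eq. (2): ē = tanh(Σ_{a∈A} W^A_a r_{e,a,x[e,a]}).

        per_attribute maps an attribute name to the (attribute, value)
        embedding pair of the entity's record in the current game.
        """
        total = np.zeros(d)
        for a, (a_vec, v_vec) in per_attribute.items():
            total += W_A[a] @ record_embedding(e_static, a_vec, v_vec)
        return np.tanh(total)

    # Toy usage: one entity with two attributes observed in the current game.
    e_static = rng.normal(size=d)
    per_attr = {a: (rng.normal(size=d), rng.normal(size=d)) for a in W_A}
    e_bar = dynamic_entity_embedding(e_static, per_attr)  # game-dependent ē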

4.2 Saliency transition

Generally, the saliency of text changes during text generation. In our work, we suppose that the saliency is represented as the entity and its attribute being talked about. We therefore propose a model that refers to a data record at each time point and transitions to another as the text goes on.

To determine whether to transition to another data record at time t, the model calculates the following probability:

  p(Z_t = 1 | h^LM_{t-1}, h^ENT_{t-1}) = σ(W^z (h^LM_{t-1} ⊕ h^ENT_{t-1})),    (3)

where σ(·) is the sigmoid function. If p(Z_t = 1 | h^LM_{t-1}, h^ENT_{t-1}) is high, the model transitions to another data record.

When the model decides to transition to another data record, it then determines which entity and attribute to refer to, and generates the next word (Section 4.3). On the other hand, if the model decides not to transition to another, it generates the next word without updating the tracking state, i.e., h^ENT_t = h^ENT_{t-1} (Section 4.4).

4.3 Selection and tracking

When the model refers to a new data record (Z_t = 1), it selects an entity and its attribute. It also tracks the saliency by putting the information about the selected entity and attribute into the memory vector h^ENT. The model first selects the salient entity and updates the memory state if the salient entity changes.

Specifically, the model first calculates the probability of selecting an entity:

  p(E_t = e | h^LM_{t-1}, h^ENT_{t-1}) ∝
      exp(h^ENT_s W^OLD h^LM_{t-1})    if e ∈ E_{t-1},
      exp(ē W^NEW h^LM_{t-1})          otherwise,    (4)

where E_{t-1} is the set of entities that have already been referred to by time step t, and s = max{s : s ≤ t-1, e = e_s} indicates the time step at which this entity was last mentioned.

The model selects the most probable entity as the next salient entity and updates the set of entities that have appeared (E_t = E_{t-1} ∪ {e_t}).

If the salient entity changes (e_t ≠ e_{t-1}), the model updates the hidden state of the tracking module h^ENT with a recurrent neural network with gated recurrent units (GRU; Chung et al., 2014):

  h^ENT'_t =
      h^ENT_{t-1}                        if e_t = e_{t-1},
      GRU_E(ē, h^ENT_{t-1})              else if e_t ∉ E_{t-1},
      GRU_E(W^S h^ENT_s, h^ENT_{t-1})    otherwise.    (5)


Note that if the selected entity at time step t, e_t, is identical to the previously selected entity e_{t-1}, the hidden state of the tracking module is not updated. If the selected entity e_t is new (e_t ∉ E_{t-1}), the hidden state of the tracking module is updated with the dynamic embedding ē of entity e_t as input. In contrast, if entity e_t has already appeared in the past (e_t ∈ E_{t-1}) but is not identical to the previous one (e_t ≠ e_{t-1}), we use h^ENT_s (i.e., the memory state when this entity last appeared) to fully exploit the local history of this entity.

Given the updated hidden state of the tracking module h^ENT'_t, we next select the attribute of the salient entity with the following probability:

  p(A_t = a | e_t, h^LM_{t-1}, h^ENT'_t) ∝ exp(r_{e_t,a,x[e_t,a]} W^ATTR (h^LM_{t-1} ⊕ h^ENT'_t)).    (6)

After selecting a_t, i.e., the most probable attribute of the salient entity, the tracking module updates the memory state h^ENT_t with the embedding of the data record r_{e_t,a_t,x[e_t,a_t]} introduced in Section 4.1:

  h^ENT_t = GRU_A(r_{e_t,a_t,x[e_t,a_t]}, h^ENT'_t).    (7)
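As a rough illustration of the selection-and-tracking step (Eqs. 3–7), the sketch below strings the pieces together in numpy with a minimal GRU cell and randomly initialized weights. It is a toy re-implementation written for readability under these assumptions, not the authors' code, and Eq. (4) is shown as unnormalized scores only.

    import numpy as np

    rng = np.random.default_rng(1)
    d = 8

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    class GRUCell:
        """Minimal GRU (Chung et al., 2014), standing in for GRU_E and GRU_A."""
        def __init__(self, d):
            self.Wz, self.Wr, self.Wh = (rng.normal(size=(d, 2 * d)) for _ in range(3))

        def __call__(self, x, h):
            z = sigmoid(self.Wz @ np.concatenate([x, h]))
            r = sigmoid(self.Wr @ np.concatenate([x, h]))
            h_new = np.tanh(self.Wh @ np.concatenate([x, r * h]))
            return (1 - z) * h + z * h_new

    GRU_E, GRU_A = GRUCell(d), GRUCell(d)
    W_z = rng.normal(size=2 * d)                                      # Eq. (3)
    W_old, W_new = rng.normal(size=(d, d)), rng.normal(size=(d, d))   # Eq. (4)
    W_S = rng.normal(size=(d, d))                                     # Eq. (5)
    W_attr = rng.normal(size=(d, 2 * d))                              # Eq. (6)

    def p_transition(h_lm, h_ent):
        """Eq. (3): probability of referring to a new data record at this step."""
        return sigmoid(W_z @ np.concatenate([h_lm, h_ent]))

    def entity_score(h_lm, mentioned, h_ent_last, e_bar):
        """Eq. (4): unnormalized score of one entity; uses its last tracking state
        h^ENT_s if it was mentioned before, and its dynamic embedding ē otherwise."""
        if mentioned:
            return np.exp(h_ent_last @ W_old @ h_lm)
        return np.exp(e_bar @ W_new @ h_lm)

    def update_for_entity(e_t, e_prev, mentioned, e_bar, h_ent_prev, h_ent_last):
        """Eq. (5): keep, start, or resume the tracking state for the selected entity."""
        if e_t == e_prev:
            return h_ent_prev
        if not mentioned:
            return GRU_E(e_bar, h_ent_prev)
        return GRU_E(W_S @ h_ent_last, h_ent_prev)

    def select_attribute(record_embs, h_lm, h_ent_prime):
        """Eqs. (6)-(7): pick the most probable attribute, then fold its record
        embedding into the tracking state."""
        ctx = np.concatenate([h_lm, h_ent_prime])
        scores = softmax(np.array([r @ W_attr @ ctx for r in record_embs.values()]))
        a_t = list(record_embs)[int(scores.argmax())]
        return a_t, GRU_A(record_embs[a_t], h_ent_prime)

In the real model, Eq. (4) is normalized over all entities and all of these decisions are trained with the supervised objective of Section 4.6.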

4.4 Summary generation

Given the two hidden states, one for the language model, h^LM_{t-1}, and the other for the tracking module, h^ENT_t, the model generates the next word y_t. We also incorporate a copy mechanism that copies the value of the salient data record x[e_t, a_t].

If the model refers to a new data record (Z_t = 1), it directly copies the value of the data record x[e_t, a_t]. However, the values of numerical attributes can be expressed in at least two different manners: Arabic numerals (e.g., 14) and English words (e.g., fourteen). We decide which one to use by the following probability:

  p(N_t = 1 | h^LM_{t-1}, h^ENT_t) = σ(W^N (h^LM_{t-1} ⊕ h^ENT_t)),    (8)

where W^N is a weight matrix. The model then updates the hidden state of the language model:

  h'_t = tanh(W^H (h^LM_{t-1} ⊕ h^ENT_t)),    (9)

where W^H is a weight matrix.

If the salient data record is the same as the previous one (Z_t = 0), the model predicts the next word y_t via a probability over words conditioned on the context vector h'_t:

  p(Y_t | h'_t) = softmax(W^Y h'_t).    (10)

Subsequently, the hidden state of the language model h^LM is updated:

  h^LM_t = LSTM(y_t ⊕ h'_t, h^LM_{t-1}),    (11)

where y_t is the embedding of the word generated at time step t.[4]

[4] In our initial experiments, we observed a word repetition problem when the tracking module was not updated while generating each sentence. To avoid this problem, we also update the tracking module with a special trainable vector v^REFRESH to refresh its state after our model generates a period: h^ENT_t = GRU_A(v^REFRESH, h^ENT_t).
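The decoding step can be pictured with a small numpy sketch using toy dimensions and random weights; the LSTM update of Eq. (11) is only indicated in a comment, and the helper names below are illustrative, not the paper's.

    import numpy as np

    rng = np.random.default_rng(2)
    d, vocab_size = 8, 20   # toy sizes

    W_N = rng.normal(size=2 * d)            # Eq. (8)
    W_H = rng.normal(size=(d, 2 * d))       # Eq. (9)
    W_Y = rng.normal(size=(vocab_size, d))  # Eq. (10)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def generation_step(h_lm_prev, h_ent, copy_value=None):
        """One decoding step.

        If the model referred to a data record at this step (Z_t = 1), copy_value
        is the value x[e_t, a_t] to copy; Eq. (8) then decides between Arabic
        numerals and English words. Otherwise the word comes from Eq. (10).
        """
        ctx = np.concatenate([h_lm_prev, h_ent])
        h_prime = np.tanh(W_H @ ctx)                 # Eq. (9)
        if copy_value is not None:
            spell_out = sigmoid(W_N @ ctx) >= 0.5    # Eq. (8): N_t = 1 -> English words
            word = to_english_words(copy_value) if spell_out else str(copy_value)
        else:
            word_dist = softmax(W_Y @ h_prime)       # Eq. (10)
            word = int(word_dist.argmax())           # index into a toy vocabulary
        # Eq. (11) would now feed (embedding(word) ⊕ h_prime) and h_lm_prev
        # into an LSTM cell to obtain h^LM_t.
        return word, h_prime

    def to_english_words(n):
        """Tiny stand-in for number verbalization, for the toy example only."""
        words = ["zero", "one", "two", "three", "four",
                 "five", "six", "seven", "eight", "nine"]
        return words[n] if 0 <= n < 10 else str(n)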

4.5 Incorporating writer information

We also incorporate information about the writer of the summaries into our model. Specifically, instead of using Equation (9), we concatenate the embedding w of a writer to h^LM_{t-1} ⊕ h^ENT_t to construct the context vector h'_t:

  h'_t = tanh(W'^H (h^LM_{t-1} ⊕ h^ENT_t ⊕ w)),    (12)

where W'^H is a new weight matrix. Since this new context vector h'_t is used for calculating the probability over words in Equation (10), the writer information will directly affect word generation, which is regarded as surface realization in terms of traditional text generation. Simultaneously, the context vector h'_t enhanced with the writer information is used to obtain h^LM_t, which is the hidden state of the language model and is further used to select the salient entity and attribute, as mentioned in Sections 4.2 and 4.3. Therefore, in our model, the writer information affects both surface realization and content planning.
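Relative to the decoding sketch above, the writer-conditioned variant changes only how the context vector is built; a minimal sketch of Eq. (12), assuming a per-writer embedding w of some small dimension:

    import numpy as np

    rng = np.random.default_rng(3)
    d, d_w = 8, 4                                     # toy hidden and writer-embedding sizes
    W_H_writer = rng.normal(size=(d, 2 * d + d_w))    # W'^H in Eq. (12)

    def context_with_writer(h_lm_prev, h_ent, w):
        """Eq. (12): h'_t = tanh(W'^H (h^LM_{t-1} ⊕ h^ENT_t ⊕ w))."""
        return np.tanh(W_H_writer @ np.concatenate([h_lm_prev, h_ent, w]))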

4.6 Learning objective

We apply fully supervised training that maximizes the following log-likelihood:

  log p(Y_{1:T}, Z_{1:T}, E_{1:T}, A_{1:T}, N_{1:T} | x)
    = Σ_{t=1}^{T} log p(Z_t = z_t | h^LM_{t-1}, h^ENT_{t-1})
    + Σ_{t: z_t=1} log p(E_t = e_t | h^LM_{t-1}, h^ENT_{t-1})
    + Σ_{t: z_t=1} log p(A_t = a_t | e_t, h^LM_{t-1}, h^ENT'_t)
    + Σ_{t: z_t=1, a_t is a numeric attribute} log p(N_t = n_t | h^LM_{t-1}, h^ENT_t)
    + Σ_{t: z_t=0} log p(Y_t = y_t | h'_t).
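Under the annotation scheme of Table 2, this objective is simply a sum of log-probabilities of the annotated decisions. A schematic version, assuming the per-step probability functions from the earlier sketches are available as callables, could look like the following; it is an illustration of the structure of the objective, not the training code.

    import numpy as np

    def log_likelihood(steps, p_z, p_e, p_a, p_n, p_y):
        """Schematic version of the Section 4.6 objective.

        `steps` is an annotated sequence (as in the Table 2 example) and the
        p_* arguments are callables returning the model probability of the
        annotated decision at step t.
        """
        total = 0.0
        for t, step in enumerate(steps):
            total += np.log(p_z(t, step["z"]))              # transition term, all t
            if step["z"] == 1:
                total += np.log(p_e(t, step["entity"]))     # entity selection
                total += np.log(p_a(t, step["attr"]))       # attribute selection
                if step.get("n") is not None:               # numeric attributes only
                    total += np.log(p_n(t, step["n"]))
            else:
                total += np.log(p_y(t, step["y"]))          # word prediction
        return total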


Method | RG # | RG P% | CS P% | CS R% | CS F1% | CO DLD% | BLEU
GOLD | 27.36 | 93.42 | 100. | 100. | 100. | 100. | 100.
TEMPLATES | 54.63 | 100. | 31.01 | 58.85 | 40.61 | 17.50 | 8.43
Wiseman et al. (2017) | 22.93 | 60.14 | 24.24 | 31.20 | 27.29 | 14.70 | 14.73
Puduppully et al. (2019) | 33.06 | 83.17 | 33.06 | 43.59 | 37.60 | 16.97 | 13.96
PROPOSED | 39.05 | 94.43 | 35.77 | 52.05 | 42.40 | 19.38 | 16.15

Table 3: Experimental results. The metrics evaluate whether important information is selected (CS), described accurately (RG), and presented in the correct order (CO).

5 Experiments

5.1 Experimental settings

We used ROTOWIRE-MODIFIED, described in Section 3, as the dataset for our experiments. The training, development, and test data contained 2,714, 534, and 500 games, respectively.

Since we take a supervised training approach, we need annotations of the random variables (i.e., Z_t, E_t, A_t, and N_t) in the training data, as shown in Table 2. Instead of simple lexical matching with r ∈ x, which is prone to annotation errors, we use the information extraction system provided by Wiseman et al. (2017). Although this system is trained on noisy rule-based annotations, we conjecture that it is more robust to errors because it is trained to minimize a marginalized loss function for ambiguous relations. All training details are described in Appendix B.

5.2 Models to be compared

We compare our model[5] against two baseline models. One is the model of Wiseman et al. (2017), which generates a summary with an attention-based encoder-decoder model. The other baseline is the model proposed by Puduppully et al. (2019), which first predicts the sequence of data records and then generates a summary conditioned on the predicted sequence. Wiseman et al. (2017)'s model refers to all data records at every time step, while Puduppully et al. (2019)'s model refers to a subset of the data records, predicted in the first stage. Unlike these models, our model uses one memory vector h^ENT_t that tracks the history of the data records during generation. We retrained the baselines on our new dataset. We also present the performance of the GOLD and TEMPLATES summaries. The GOLD summary is exactly identical to the reference summary, and each TEMPLATES summary is generated in the same manner as in Wiseman et al. (2017).

[5] Our code is available from https://github.com/aistairc/sports-reporter

In the latter half of our experiments, we examine the effect of adding information about writers. In addition to our model enhanced with writer information, we also add writer information to the model by Puduppully et al. (2019). Their method consists of two stages corresponding to content planning and surface realization. Therefore, by incorporating writer information into each of the two stages, we can clearly see which part of the model the writer information contributes to. For the Puduppully et al. (2019) model, we attach the writer information in the following three ways:

1. concatenating the writer embedding w with the input vector of the LSTM in the content planning decoder (stage 1);
2. concatenating the writer embedding w with the input vector of the LSTM in the text generator (stage 2);
3. using both 1 and 2 above.

For more details about each decoding stage, readers can refer to Puduppully et al. (2019).

5.3 Evaluation metrics

As evaluation metrics, we use the BLEU score (Papineni et al., 2002) and the extractive metrics proposed by Wiseman et al. (2017), i.e., relation generation (RG), content selection (CS), and content ordering (CO). The extractive metrics measure how well the relations extracted from the generated summary match the correct relations:[6]

[6] The model for extracting relation tuples was trained on tuples made from the entities (e.g., team name, city name, player name) and attribute values (e.g., "Lakers", "92") extracted from the summaries, and the corresponding attributes (e.g., "TEAM NAME", "PTS") found in the box- or line-score. The precision and recall of this extraction model on the test data are 93.4% and 75.0%, respectively.


- RG: the ratio of correct relations among all extracted relations, where a correct relation is one found in the input data records x. The average number of extracted relations is also reported.
- CS: precision and recall of the relations extracted from the generated summary against those extracted from the reference summary.
- CO: edit distance measured with the normalized Damerau-Levenshtein Distance (DLD) between the sequences of relations extracted from the generated and reference summaries.
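As a rough guide to how CS and CO are computed once relation tuples have been extracted, here is a small sketch; it treats relations as hashable tuples and implements the restricted Damerau-Levenshtein distance. The official evaluation scripts of Wiseman et al. (2017) should be used to obtain numbers comparable with the tables below.

    def content_selection(pred, gold):
        """CS precision/recall of predicted relation tuples against the reference ones."""
        pred_set, gold_set = set(pred), set(gold)
        tp = len(pred_set & gold_set)
        precision = tp / len(pred_set) if pred_set else 0.0
        recall = tp / len(gold_set) if gold_set else 0.0
        return precision, recall

    def normalized_dld(a, b):
        """CO: normalized Damerau-Levenshtein distance (with adjacent transpositions)
        between two sequences of relation tuples, scaled to [0, 1]."""
        n, m = len(a), len(b)
        d = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            d[i][0] = i
        for j in range(m + 1):
            d[0][j] = j
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
                if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[n][m] / max(n, m, 1)

    # Example with two short sequences of (entity, attribute, value) relations.
    gold = [("PARKER", "PTS", 15), ("PARKER", "REB", 4), ("PARKER", "AST", 3)]
    pred = [("PARKER", "PTS", 15), ("PARKER", "AST", 3)]
    print(content_selection(pred, gold))   # (1.0, 0.666...)
    print(1 - normalized_dld(pred, gold))  # higher is better for the CO score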

6 Results and Discussions

We first focus on the quality of the tracking module and the entity representations in Sections 6.1 to 6.4, where we use the model without writer information. We examine the effect of writer information in Section 6.5.

6.1 Saliency tracking-based model

As shown in Table 3, our model outperforms all baselines across all evaluation metrics.[7] One noticeable result is that our model achieves slightly higher RG precision than the gold summary. Owing to the extractive nature of the evaluation, a generated summary can beat the gold summary in the precision of relation generation; in fact, the template model achieves 100% precision of relation generation. The other noticeable result is that only our model exceeds the template model in the F1 score of content selection, and it also obtains the highest performance on content ordering. This implies that the tracking module encourages the model to select salient input records in the correct order.

[7] The scores of Puduppully et al. (2019)'s model dropped significantly from what they reported, especially on the BLEU metric. We speculate that this is mainly due to the reduced amount of our training data (Section 3). That is, their model might be more data-hungry than other models.

6.2 Qualitative analysis of entity embeddings

Our model has the dynamic entity embedding ē, which depends on the box score of each game, in addition to the static entity embedding e. We now analyze the difference between these two types of embeddings.

We present two-dimensional visualizations of both embeddings produced using PCA (Pearson, 1901).


Figure 1: Illustration of the static entity embeddings e. Players shown in colored letters are listed in the top-100 player ranking for the 2016-17 NBA season at https://www.washingtonpost.com/graphics/sports/nba-top-100-players-2016/. Only LeBron James is in red; the other top-100 players are in blue. Top-ranked players have similar representations e.

As shown in Figure 1, which visualizes the static entity embeddings e, the top-ranked players are located close to each other.

We also present visualizations of the dynamic entity embeddings ē in Figure 2. Although we did not carry out NBA-specific feature engineering (e.g., whether a player scored double digits or not)[8] to obtain the dynamic entity embedding ē, the embeddings of the players who performed well in a given game have similar representations. In addition, the embedding of the same player changes depending on the box score of each game. For instance, LeBron James recorded a double-double in a game on April 22, 2016. For this game, his embedding is located close to the embedding of Kevin Love, who also scored a double-double. However, he did not participate in the game on December 26, 2016, and his embedding for this game became closer to those of other players who also did not participate.

[8] In the NBA, a player who accumulates a double-digit score in one of five categories (points, rebounds, assists, steals, and blocked shots) in a game is regarded as a good player. If a player reaches double digits in two of those five categories, it is referred to as a double-double.
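Visualizations of this kind can be reproduced in spirit with an off-the-shelf PCA; a minimal sketch, assuming the entity embeddings have been exported as a matrix with a parallel list of player names (the toy call at the bottom uses random vectors in place of the trained embeddings):

    import numpy as np
    from sklearn.decomposition import PCA
    import matplotlib.pyplot as plt

    def plot_embeddings(names, vectors, highlight=("LeBron James",)):
        """Project entity embeddings to 2D with PCA and scatter-plot them."""
        coords = PCA(n_components=2).fit_transform(np.asarray(vectors))
        plt.figure(figsize=(6, 6))
        plt.scatter(coords[:, 0], coords[:, 1], s=8, color="gray")
        for name, (x, y) in zip(names, coords):
            color = "red" if name in highlight else "gray"
            plt.annotate(name, (x, y), fontsize=6, color=color)
        plt.tight_layout()
        plt.show()

    rng = np.random.default_rng(0)
    plot_embeddings(["LeBron James", "Kevin Love", "JR Smith"],
                    rng.normal(size=(3, 128)))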

6.3 Duplicate ratios of extracted relations

As Puduppully et al. (2019) pointed out, a generated summary may mention the same relation multiple times. Such duplicated relations are not favorable in terms of the brevity of the text.

Figure 3 shows the ratios of generated summaries with duplicate mentions of relations in the development data.


[Figure 2: two scatter plots of the dynamic entity embeddings ē, for the games of April 22, 2016 (left) and December 26, 2016 (right); see the caption below.]

Figure 2: Illustration of the dynamic entity embeddings ē. The left and right panels both show Cleveland Cavaliers vs. Detroit Pistons, on different dates. LeBron James is in red letters. Entities with orange symbols appeared only in the reference summary; entities with blue symbols appeared only in the generated summary; entities with green symbols appeared in both the reference and the generated summary; the others are shown with red symbols. Distinct marker shapes indicate players who scored in double digits, players who recorded a double-double, and players who did not participate in the game; ◦ represents other players.


Figure 3: Ratios of generated summaries with duplicate mentions of relations. Each label represents the number of duplicated relations within a document. While Wiseman et al. (2017)'s model exhibited 36.0% duplication and Puduppully et al. (2019)'s model exhibited 15.8%, our model exhibited only 4.2%.

While the models by Wiseman et al. (2017) and Puduppully et al. (2019) respectively showed duplicate ratios of 36.0% and 15.8%, our model exhibited 4.2%. This suggests that our model dramatically suppresses the generation of redundant relations. We speculate that the tracking module successfully memorizes which input records have been selected in h^ENT_s.

6.4 Qualitative analysis of output examples

Table 5 shows examples generated from validation inputs with Puduppully et al. (2019)'s model and our model. Whereas both generations seem to be fluent, the summary from Puduppully et al. (2019)'s model includes erroneous relations, colored in orange.

Specifically, the description of DERRICK ROSE's relations, "15 points, four assists, three rebounds and one steal in 33 minutes," is also used for other entities (e.g., JOHN HENSON and WILLY HERNANGOMEZ). This is because Puduppully et al. (2019)'s model has no tracking module, unlike our model, which mitigates redundant references and therefore rarely produces erroneous relations.

However, when complicated expressions such as parallel structures are used, our model also generates erroneous relations, as illustrated by the underlined sentences describing the two players who scored the same number of points. For example, "11-point efforts" is correct for COURTNEY LEE but not for DERRICK ROSE. As future work, it is necessary to develop a method that can handle such complicated relations.

6.5 Use of writer information

We first look at the results of extending Puduppully et al. (2019)'s model with the writer information w in Table 4. Adding w to content planning (stage 1) improved CS (37.60 to 47.25), CO (16.97 to 22.16), and the BLEU score (13.96 to 18.18). Adding w to the component for surface realization (stage 2) improved the BLEU score (13.96 to 17.81), while the effects on the other metrics were not very significant. Adding w to both stages produced the highest BLEU score, while the other metrics were not very different from those obtained by adding w to stage 1 alone. This result suggests that writer information contributes to both content planning and surface realization when it is properly used, and that improvements in content planning lead to much better performance in surface realization.

Our model also showed improvements in most metrics and achieved the best overall performance by incorporating the writer information w.


Method | RG # | RG P% | CS P% | CS R% | CS F1% | CO DLD% | BLEU
Puduppully et al. (2019) | 33.06 | 83.17 | 33.06 | 43.59 | 37.60 | 16.97 | 13.96
+ w in stage 1 | 28.43 | 84.75 | 45.00 | 49.73 | 47.25 | 22.16 | 18.18
+ w in stage 2 | 35.06 | 80.51 | 31.10 | 45.28 | 36.87 | 16.38 | 17.81
+ w in stage 1 & 2 | 28.00 | 82.27 | 44.37 | 48.71 | 46.44 | 22.41 | 18.90
PROPOSED | 39.05 | 94.38 | 35.77 | 52.05 | 42.40 | 19.38 | 16.15
+ w | 30.25 | 92.00 | 50.75 | 59.03 | 54.58 | 25.75 | 20.84

Table 4: Effects of writer information. w indicates that writer embeddings are used. Numbers in bold are the largest among the variants of each method.

The Milwaukee Bucks defeated the New York Knicks, 105-104, at Madison Square Garden on Wednesday evening. The Bucks (18-16) have been one of the hottest teams in the league, having won five of their last six games, and they have now won six of their last eight games. The Knicks (16-19) have now won six of their last six games, as they continue to battle for the eighth and final playoff spot in the Eastern Conference. Giannis Antetokounmpo led the way for Milwaukee, as he tallied 27 points, 13 rebounds, four assists, three blocked shots and one steal, in 39 minutes. Jabari Parker added 15 points, four rebounds, three assists, one steal and one block, and 6-of-8 from long range. John Henson added two points, two rebounds, one assist, three steals and one block. John Henson was the only other player to score in double digits for the Knicks, with 15 points, four assists, three rebounds and one steal, in 33 minutes. The Bucks were led by Derrick Rose, who tallied 15 points, four assists, three rebounds and one steal in 33 minutes. Willy Hernangomez started in place of Porzingis and finished with 15 points, four assists, three rebounds and one steal in 33 minutes. Willy Hernangomez started in place of Jose Calderon (knee) and responded with one rebound and one block. The Knicks were led by their starting backcourt of Carmelo Anthony and Carmelo Anthony, but combined for just 13 points on 5-of-16 shooting. The Bucks next head to Philadelphia to take on the Sixers on Friday night, while the Knicks remain home to face the Los Angeles Clippers on Wednesday.

(a) Puduppully et al. (2019)

The Milwaukee Bucks defeated the New York Knicks, 105-104, at Madison Square Garden on Saturday. The Bucks (18-16) checked in to Saturday's contest with a well, outscoring the Knicks (16-19) by a margin of 39-19 in the first quarter. However, New York by just a 25-foot lead at the end of the first quarter, the Bucks were able to pull away, as they outscored the Knicks by a 59-46 margin into the second. 45 points in the third quarter to seal the win for New York with the rest of the starters to seal the win. The Knicks were led by Giannis Antetokounmpo, who tallied a game-high 27 points, to go along with 13 rebounds, four assists, three blocks and a steal. The game was a crucial night for the Bucks' starting five, as the duo was the most effective shooters, as they posted Milwaukee to go on a pair of low low-wise (Carmelo Anthony) and Malcolm Brogdon. Anthony added 11 rebounds, seven assists and two steals to his team-high scoring total. Jabari Parker was right behind him with 15 points, four rebounds, three assists and a block. Greg Monroe was next with a bench-leading 18 points, along with nine rebounds, four assists and three steals. Brogdon posted 12 points, eight assists, six rebounds and a steal. Derrick Rose and Courtney Lee were next with a pair of {11 / 11}-point efforts. Rose also supplied four assists and three rebounds, while Lee complemented his scoring with three assists, a rebound and a steal. John Henson and Mirza Teletovic were next with a pair of {two / two}-point efforts. Teletovic also registered 13 points, and he added a rebound and an assist. Jason Terry supplied eight points, three rebounds and a pair of steals. The Cavs remain in last place in the Eastern Conference's Atlantic Division. They now head home to face the Toronto Raptors on Saturday night.

(b) Our model

Table 5: Example summaries generated with Puduppully et al. (2019)'s model (a) and our model (b). Names in bold face are salient entities. Blue numbers are correct relations derived from the input data records but not observed in the reference summary. Orange numbers are incorrect relations. Green numbers are correct relations mentioned in the reference summary.

As discussed in Section 4.5, w is supposed to affect both content planning and surface realization. Our experimental results are consistent with this discussion.

7 Conclusion

In this research, we proposed a new data-to-text model that produces a summary while tracking the salient information, imitating a human writing process. As a result, our model outperformed the existing models in all evaluation measures. We also explored the effects of incorporating writer information into data-to-text models. With writer information, our model successfully generated the highest-quality summaries, scoring 20.84 BLEU points.

Acknowledgments

We would like to thank the anonymous reviewers for their helpful suggestions. This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO), JST PRESTO (Grant Number JPMJPR1655), and the AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL).


References

Tatsuya Aoki, Akira Miyazawa, Tatsuya Ishigaki, Keiichi Goshima, Kasumi Aoki, Ichiro Kobayashi, Hiroya Takamura, and Yusuke Miyao. 2018. Generating Market Comments Referring to External Resources. In Proceedings of the 11th International Conference on Natural Language Generation, pages 135–139.

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the Third International Conference on Learning Representations.

Regina Barzilay and Mirella Lapata. 2005. Collective content selection for concept-to-text generation. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 331–338.

Antoine Bosselut, Omer Levy, Ari Holtzman, Corin Ennis, Dieter Fox, and Yejin Choi. 2018. Simulating Action Dynamics with Neural Process Networks. In Proceedings of the Sixth International Conference on Learning Representations.

David L. Chen and Raymond J. Mooney. 2008. Learning to sportscast: a test of grounded language acquisition. In Proceedings of the 25th International Conference on Machine Learning, pages 128–135.

Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 1724–1734.

Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.

Elizabeth Clark, Yangfeng Ji, and Noah A. Smith. 2018. Neural Text Generation in Stories Using Entity Representations as Context. In Proceedings of the 16th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2250–2260.

Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249–256.

Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwinska, Sergio Gomez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, et al. 2016. Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626):471.

Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, volume 1, pages 1631–1640.

Caglar Gulcehre, Sungjin Ahn, Ramesh Nallapati, Bowen Zhou, and Yoshua Bengio. 2016. Pointing the Unknown Words. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pages 140–149.

Luong Hoang, Sam Wiseman, and Alexander Rush. 2018. Entity Tracking Improves Cloze-style Reading Comprehension. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1049–1055.

Yangfeng Ji, Chenhao Tan, Sebastian Martschat, Yejin Choi, and Noah A. Smith. 2017. Dynamic Entity Representations in Neural Language Models. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1830–1839.

Chloe Kiddon, Luke Zettlemoyer, and Yejin Choi. 2016. Globally coherent text generation with neural checklist models. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 329–339.

Sosuke Kobayashi, Ran Tian, Naoaki Okazaki, and Kentaro Inui. 2016. Dynamic entity representation with max-pooling improves machine reading. In Proceedings of the 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 850–855.

Remi Lebret, David Grangier, and Michael Auli. 2016. Neural Text Generation from Structured Data with Application to the Biography Domain. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1203–1213.

Percy Liang, Michael I. Jordan, and Dan Klein. 2009. Learning semantic correspondences with less supervision. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 91–99.

Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, and Zhifang Sui. 2018. Table-to-text Generation by Structure-aware Seq2seq Learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.

Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1412–1421.

Hongyuan Mei, Mohit Bansal, and Matthew R. Walter. 2016. What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment. In Proceedings of the 15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 720–730.

Soichiro Murakami, Akihiko Watanabe, Akira Miyazawa, Keiichi Goshima, Toshihiko Yanase, Hiroya Takamura, and Yusuke Miyao. 2017. Learning to generate market comments from stock prices. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 1374–1384.

Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, et al. 2017. DyNet: The dynamic neural network toolkit. arXiv preprint arXiv:1701.03980.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 311–318.

Karl Pearson. 1901. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11):559–572.

Ratish Puduppully, Li Dong, and Mirella Lapata. 2019. Data-to-Text Generation with Content Selection and Planning. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence.

Sashank J. Reddi, Satyen Kale, and Sanjiv Kumar. 2018. On the convergence of Adam and beyond. In Proceedings of the Sixth International Conference on Learning Representations.

Lei Sha, Lili Mou, Tianyu Liu, Pascal Poupart, Sujian Li, Baobao Chang, and Zhifang Sui. 2018. Order-planning neural text generation from structured data. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence.

Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. 2015. End-to-end memory networks. In Advances in Neural Information Processing Systems, pages 2440–2448.

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112.

Kumiko Tanaka-Ishii, Koiti Hasida, and Itsuki Noda. 1998. Reactive content selection in the generation of real-time soccer commentary. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, pages 1282–1288.

Yasufumi Taniguchi, Yukun Feng, Hiroya Takamura, and Manabu Okumura. 2019. Generating Live Soccer-Match Commentary from Play Data. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence.

Zhaopeng Tu, Yang Liu, Zhengdong Lu, Xiaohua Liu, and Hang Li. 2017. Context gates for neural machine translation. Transactions of the Association for Computational Linguistics, 5:87–99.

Jason Weston, Sumit Chopra, and Antoine Bordes. 2015. Memory Networks. In Proceedings of the Third International Conference on Learning Representations.

Sam Wiseman, Stuart Shieber, and Alexander Rush. 2017. Challenges in Data-to-Document Generation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2253–2263.

Zichao Yang, Phil Blunsom, Chris Dyer, and Wang Ling. 2017. Reference-Aware Language Models. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1850–1859.

A Algorithm

The generation process of our model is shown in Algorithm 1. For a concise description, we omit the conditions in each probability notation. <SOD> and <EOD> represent the start and the end of the document, respectively.

B Experimental settings

We set the dimensions of the embeddings to 128 and those of the hidden states of the RNNs to 512; all parameters are initialized with the Xavier initialization (Glorot and Bengio, 2010). We set the maximum number of epochs to 30 and choose the model with the highest BLEU score on the development data. The initial learning rate is 2e-3, and AMSGrad is used for automatically adjusting the learning rate (Reddi et al., 2018). Our implementation uses DyNet (Neubig et al., 2017).
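For reference, Xavier (Glorot) initialization draws weights with a scale determined by the layer's fan-in and fan-out. A minimal numpy sketch of the uniform variant follows; it mirrors the formula of Glorot and Bengio (2010) and is not the paper's DyNet code.

    import numpy as np

    def xavier_uniform(fan_in, fan_out, rng=np.random.default_rng(0)):
        """Glorot/Xavier uniform init: U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
        a = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-a, a, size=(fan_out, fan_in))

    # Example: a weight matrix mapping a 128-dim embedding to a 512-dim hidden state.
    W = xavier_uniform(128, 512)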


Algorithm 1: Generation process
Input: data records x; annotations Z_{1:T}, E_{1:T}, A_{1:T}, N_{1:T}

Initialize {r_{e,a,v}}_{r∈x}, {ē}_{e∈E}, h^LM_0, h^ENT_0
t ← 0
e_t, y_t ← NONE, <SOD>
while y_t ≠ <EOD> do
    t ← t + 1
    if p(Z_t = 1) ≥ 0.5 then
        /* Select the entity */
        e_t ← argmax p(E_t = e'_t)
        if e_t ∉ E_{t-1} then
            /* e_t is a new entity */
            h^ENT'_t ← GRU_E(ē_t, h^ENT_{t-1})
            E_t ← E_{t-1} ∪ {e_t}
        else if e_t ≠ e_{t-1} then
            /* e_t has been observed before but differs from the previous entity */
            h^ENT'_t ← GRU_E(W^S h^ENT_s, h^ENT_{t-1}), where s = max{s : s ≤ t-1, e = e_s}
        else
            h^ENT'_t ← h^ENT_{t-1}
        /* Select an attribute for the entity e_t */
        a_t ← argmax p(A_t = a'_t)
        h^ENT_t ← GRU_A(r_{e_t,a_t,x[e_t,a_t]}, h^ENT'_t)
        if a_t is a numeric attribute and p(N_t = 1) ≥ 0.5 then
            y_t ← the English-word form of x[e_t, a_t]
        else
            y_t ← x[e_t, a_t]
        h'_t ← tanh(W^H (h^LM_{t-1} ⊕ h^ENT_t))
        h^LM_t ← LSTM(y_t ⊕ h'_t, h^LM_{t-1})
    else
        e_t, a_t, h^ENT_t ← e_{t-1}, a_{t-1}, h^ENT_{t-1}
        h'_t ← tanh(W^H (h^LM_{t-1} ⊕ h^ENT_t))
        y_t ← argmax p(Y_t)
        h^LM_t ← LSTM(y_t ⊕ h'_t, h^LM_{t-1})
    if y_t is "." then
        h^ENT_t ← GRU_A(v^REFRESH, h^ENT_t)
end while
return y_{1:t-1}