Speech and Language Processing An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition
Second Edition

Daniel Jurafsky, Stanford University
James H. Martin, University of Colorado at Boulder

PEARSON
Prentice Hall
Upper Saddle River, New Jersey 07458
Summary of Contents

Foreword xxiii
Preface xxv
About the Authors xxxi

1 Introduction 1

I Words
II Speech
III Syntax
IV Semantics and Pragmatics
V Applications
2 Regular Expressions and Automata 17
3 Words and Transducers 45
4 N-Grams 83
5 Part-of-Speech Tagging 123
6 Hidden Markov and Maximum Entropy Models 173
7 Phonetics 215
8 Speech Synthesis 249
9 Automatic Speech Recognition 285
10 Speech Recognition: Advanced Topics 335
11 Computational Phonology 361
12 Formal Grammars of English 385
13 Syntactic Parsing 427
14 Statistical Parsing 459
15 Features and Unification 489
16 Language and Complexity 529
17 The Representation of Meaning 545
18 Computational Semantics 583
19 Lexical Semantics 611
20 Computational Lexical Semantics 637
21 Computational Discourse 681
22 Information Extraction 725
23 Question Answering and Summarization 765
24 Dialogue and Conversational Agents 811
25 Machine Translation 859
Bibliography 909
Author Index 959
Subject Index 971
Contents

Foreword xxiii
Preface xxv
About the Authors xxxi
1 Introduction 1
1.1 Knowledge in Speech and Language Processing 2
1.2 Ambiguity 4
1.3 Models and Algorithms 5
1.4 Language, Thought, and Understanding 6
1.5 The State of the Art 8
1.6 Some Brief History 9
1.6.1 Foundational Insights: 1940s and 1950s 9
1.6.2 The Two Camps: 1957-1970 10
1.6.3 Four Paradigms: 1970-1983 11
1.6.4 Empiricism and Finite-State Models Redux: 1983-1993 12
1.6.5 The Field Comes Together: 1994-1999 12
1.6.6 The Rise of Machine Learning: 2000-2008 12
1.6.7 On Multiple Discoveries 13
1.6.8 A Final Brief Note on Psychology 14
1.7 Summary 14
Bibliographical and Historical Notes 15
I Words
2 Regular Expressions and Automata 17
2.1 Regular Expressions 17
2.1.1 Basic Regular Expression Patterns 18
2.1.2 Disjunction, Grouping, and Precedence 21
2.1.3 A Simple Example 22
2.1.4 A More Complex Example 23
2.1.5 Advanced Operators 24
2.1.6 Regular Expression Substitution, Memory, and ELIZA 25
2.2 Finite-State Automata 26
2.2.1 Use of an FSA to Recognize Sheeptalk 27
2.2.2 Formal Languages 30
2.2.3 Another Example 31
2.2.4 Non-Deterministic FSAs 32
2.2.5 Use of an NFSA to Accept Strings 33
2.2.6 Recognition as Search 35
2.2.7 Relation of Deterministic and Non-Deterministic Automata 38
2.3 Regular Languages and FSAs 38
2.4 Summary 41
Bibliographical and Historical Notes 42
Exercises 42
3 Words and Transducers 45
3.1 Survey of (Mostly) English Morphology 47
3.1.1 Inflectional Morphology 48
3.1.2 Derivational Morphology 50
3.1.3 Cliticization 51
3.1.4 Non-Concatenative Morphology 51
3.1.5 Agreement 52
3.2 Finite-State Morphological Parsing 52
3.3 Construction of a Finite-State Lexicon 54
3.4 Finite-State Transducers 57
3.4.1 Sequential Transducers and Determinism 59
3.5 FSTs for Morphological Parsing 60
3.6 Transducers and Orthographic Rules 62
3.7 The Combination of an FST Lexicon and Rules 65
3.8 Lexicon-Free FSTs: The Porter Stemmer 68
3.9 Word and Sentence Tokenization 68
3.9.1 Segmentation in Chinese 70
3.10 Detection and Correction of Spelling Errors 72
3.11 Minimum Edit Distance 73
3.12 Human Morphological Processing 77
3.13 Summary 79
Bibliographical and Historical Notes 80
Exercises 81
4 N-Grams 83
4.1 Word Counting in Corpora 85
4.2 Simple (Unsmoothed) N-Grams 86
4.3 Training and Test Sets 91
4.3.1 N-Gram Sensitivity to the Training Corpus 92
4.3.2 Unknown Words: Open Versus Closed Vocabulary Tasks 95
4.4 Evaluating N-Grams: Perplexity 95
4.5 Smoothing 97
4.5.1 Laplace Smoothing 98
4.5.2 Good-Turing Discounting 101
4.5.3 Some Advanced Issues in Good-Turing Estimation 102
4.6 Interpolation 104
4.7 Backoff 105
4.7.1 Advanced: Details of Computing Katz Backoff α and P* 107
4.8 Practical Issues: Toolkits and Data Formats 108
4.9 Advanced Issues in Language Modeling 109
4.9.1 Advanced Smoothing Methods: Kneser-Ney Smoothing 109
4.9.2 Class-Based N-Grams 111
4.9.3 Language Model Adaptation and Web Use 112
4.9.4 Using Longer-Distance Information: A Brief Summary 112
4.10 Advanced: Information Theory Background 114
4.10.1 Cross-Entropy for Comparing Models 116
4.11 Advanced: The Entropy of English and Entropy Rate Constancy 118
4.12 Summary 119
Bibliographical and Historical Notes 120
Exercises 121

5 Part-of-Speech Tagging 123
5.1 (Mostly) English Word Classes 124
5.2 Tagsets for English 130
5.3 Part-of-Speech Tagging 133
5.4 Rule-Based Part-of-Speech Tagging 135
5.5 HMM Part-of-Speech Tagging 139
5.5.1 Computing the Most Likely Tag Sequence: An Example 142
5.5.2 Formalizing Hidden Markov Model Taggers 144
5.5.3 Using the Viterbi Algorithm for HMM Tagging 145
5.5.4 Extending the HMM Algorithm to Trigrams 149
5.6 Transformation-Based Tagging 151
5.6.1 How TBL Rules Are Applied 152
5.6.2 How TBL Rules Are Learned 152
5.7 Evaluation and Error Analysis 153
5.7.1 Error Analysis 156
5.8 Advanced Issues in Part-of-Speech Tagging 157
5.8.1 Practical Issues: Tag Indeterminacy and Tokenization 157
5.8.2 Unknown Words 158
5.8.3 Part-of-Speech Tagging for Other Languages 160
5.8.4 Tagger Combination 163
5.9 Advanced: The Noisy Channel Model for Spelling 163
5.9.1 Contextual Spelling Error Correction 167
5.10 Summary 168
Bibliographical and Historical Notes 169
Exercises 171

6 Hidden Markov and Maximum Entropy Models 173
6.1 Markov Chains 174
6.2 The Hidden Markov Model 176
6.3 Likelihood Computation: The Forward Algorithm 179
6.4 Decoding: The Viterbi Algorithm 184
6.5 HMM Training: The Forward-Backward Algorithm 186
6.6 Maximum Entropy Models: Background 193
6.6.1 Linear Regression 194
6.6.2 Logistic Regression 197
6.6.3 Logistic Regression: Classification 199
6.6.4 Advanced: Learning in Logistic Regression 200
6.7 Maximum Entropy Modeling 201
6.7.1 Why We Call It Maximum Entropy 205
6.8 Maximum Entropy Markov Models 207
6.8.1 Decoding and Learning in MEMMs 210
6.9 Summary 211
Bibliographical and Historical Notes 212
Exercises 213
II Speech
7 Phonetics 215
7.1 Speech Sounds and Phonetic Transcription 216
7.2 Articulatory Phonetics 217
7.2.1 The Vocal Organs 218
7.2.2 Consonants: Place of Articulation 220
7.2.3 Consonants: Manner of Articulation 221
7.2.4 Vowels 222
7.2.5 Syllables 223
7.3 Phonological Categories and Pronunciation Variation 225
7.3.1 Phonetic Features 227
7.3.2 Predicting Phonetic Variation 228
7.3.3 Factors Influencing Phonetic Variation 229
7.4 Acoustic Phonetics and Signals 230
7.4.1 Waves 230
7.4.2 Speech Sound Waves 231
7.4.3 Frequency and Amplitude; Pitch and Loudness 233
7.4.4 Interpretation of Phones from a Waveform 236
7.4.5 Spectra and the Frequency Domain 236
7.4.6 The Source-Filter Model 240
7.5 Phonetic Resources 241
7.6 Advanced: Articulatory and Gestural Phonology 244
7.7 Summary 245
Bibliographical and Historical Notes 246
Exercises 247

8 Speech Synthesis 249
8.1 Text Normalization 251
8.1.1 Sentence Tokenization 251
8.1.2 Non-Standard Words 252
8.1.3 Homograph Disambiguation 256
8.2 Phonetic Analysis 257
8.2.1 Dictionary Lookup 257
8.2.2 Names 258
8.2.3 Grapheme-to-Phoneme Conversion 259
8.3 Prosodic Analysis 262
8.3.1 Prosodic Structure 262
8.3.2 Prosodic Prominence 263
8.3.3 Tune 265
8.3.4 More Sophisticated Models: ToBI 266
8.3.5 Computing Duration from Prosodic Labels 268
8.3.6 Computing F0 from Prosodic Labels 269
8.3.7 Final Result of Text Analysis: Internal Representation 271
8.4 Diphone Waveform Synthesis 272
8.4.1 Steps for Building a Diphone Database 272
8.4.2 Diphone Concatenation and TD-PSOLA for Prosody 274
8.5 Unit Selection (Waveform) Synthesis 276
8.6 Evaluation 280
Bibliographical and Historical Notes 281
Exercises 284

9 Automatic Speech Recognition 285
9.1 Speech Recognition Architecture 287
9.2 The Hidden Markov Model Applied to Speech 291
9.3 Feature Extraction: MFCC Vectors 295
9.3.1 Preemphasis 296
9.3.2 Windowing 296
9.3.3 Discrete Fourier Transform 298
9.3.4 Mel Filter Bank and Log 299
9.3.5 The Cepstrum: Inverse Discrete Fourier Transform 300
9.3.6 Deltas and Energy 302
9.3.7 Summary: MFCC 302
9.4 Acoustic Likelihood Computation 303
9.4.1 Vector Quantization 303
9.4.2 Gaussian PDFs 306
9.4.3 Probabilities, Log-Probabilities, and Distance Functions 313
9.5 The Lexicon and Language Model 314
9.6 Search and Decoding 314
9.7 Embedded Training 324
9.8 Evaluation: Word Error Rate 328
9.9 Summary 330
Bibliographical and Historical Notes 331
Exercises 333

10 Speech Recognition: Advanced Topics 335
10.1 Multipass Decoding: N-Best Lists and Lattices 335
10.2 A* ("Stack") Decoding 341
10.3 Context-Dependent Acoustic Models: Triphones 345
10.4 Discriminative Training 349
10.4.1 Maximum Mutual Information Estimation 350
10.4.2 Acoustic Models Based on Posterior Classifiers 351
10.5 Modeling Variation 352
10.5.1 Environmental Variation and Noise 352
10.5.2 Speaker Variation and Speaker Adaptation 353
10.5.3 Pronunciation Modeling: Variation Due to Genre 354
10.6 Metadata: Boundaries, Punctuation, and Disfluencies 356
10.7 Speech Recognition by Humans 358
10.8 Summary 359
Bibliographical and Historical Notes 359
Exercises 360

11 Computational Phonology 361
11.1 Finite-State Phonology 361
11.2 Advanced Finite-State Phonology 365
11.2.1 Harmony 365
11.2.2 Templatic Morphology 366
11.3 Computational Optimality Theory 367
11.3.1 Finite-State Transducer Models of Optimality Theory 369
11.3.2 Stochastic Models of Optimality Theory 370
11.4 Syllabification 372
11.5 Learning Phonology and Morphology 375
11.5.1 Learning Phonological Rules 375
11.5.2 Learning Morphology 377
11.5.3 Learning in Optimality Theory 380
11.6 Summary 381
Bibliographical and Historical Notes 381
Exercises 383

III Syntax

12 Formal Grammars of English 385
12.1 Constituency 386
12.2 Context-Free Grammars 387
12.2.1 Formal Definition of Context-Free Grammar 391
12.3 Some Grammar Rules for English 392
12.3.1 Sentence-Level Constructions 392
12.3.2 Clauses and Sentences 394
12.3.3 The Noun Phrase 394
12.3.4 Agreement 398
12.3.5 The Verb Phrase and Subcategorization 400
12.3.6 Auxiliaries 402
12.3.7 Coordination 403
12.4 Treebanks 404
12.4.1 Example: The Penn Treebank Project 404
12.4.2 Treebanks as Grammars 406
12.4.3 Treebank Searching 408
12.4.4 Heads and Head Finding 409
12.5 Grammar Equivalence and Normal Form 412
12.6 Finite-State and Context-Free Grammars 413
12.7 Dependency Grammars 414
12.7.1 The Relationship Between Dependencies and Heads 415
12.7.2 Categorial Grammar 417
12.8 Spoken Language Syntax 417
12.8.1 Disfluencies and Repair 418
12.8.2 Treebanks for Spoken Language 419
12.9 Grammars and Human Processing 420
12.10 Summary 421
Bibliographical and Historical Notes 422
Exercises 424
13 Syntactic Parsing 427
13.1 Parsing as Search 428
13.1.1 Top-Down Parsing 429
13.1.2 Bottom-Up Parsing 430
13.1.3 Comparing Top-Down and Bottom-Up Parsing 431
13.2 Ambiguity 432
13.3 Search in the Face of Ambiguity 434
13.4 Dynamic Programming Parsing Methods 435
13.4.1 CKY Parsing 436
13.4.2 The Earley Algorithm 443
13.4.3 Chart Parsing 448
13.5 Partial Parsing 450
13.5.1 Finite-State Rule-Based Chunking 452
13.5.2 Machine Learning-Based Approaches to Chunking 452
13.5.3 Chunking-System Evaluations 455
13.6 Summary 456
Bibliographical and Historical Notes 457
Exercises 458
14 Statistical Parsing 459
14.1 Probabilistic Context-Free Grammars 460
14.1.1 PCFGs for Disambiguation 461
14.1.2 PCFGs for Language Modeling 463
14.2 Probabilistic CKY Parsing of PCFGs 464
14.3 Ways to Learn PCFG Rule Probabilities 467
14.4 Problems with PCFGs 468
14.4.1 Independence Assumptions Miss Structural Dependencies Between Rules 468
14.4.2 Lack of Sensitivity to Lexical Dependencies 469
14.5 Improving PCFGs by Splitting Non-Terminals 471
14.6 Probabilistic Lexicalized CFGs 473
14.6.1 The Collins Parser 475
14.6.2 Advanced: Further Details of the Collins Parser 477
14.7 Evaluating Parsers 479
14.8 Advanced: Discriminative Reranking 481
14.9 Advanced: Parser-Based Language Modeling 482
14.10 Human Parsing 483
14.11 Summary 485
Bibliographical and Historical Notes 486
Exercises 488

15 Features and Unification 489
15.1 Feature Structures 490
15.2 Unification of Feature Structures 492
15.3 Feature Structures in the Grammar 497
15.3.1 Agreement 498
15.3.2 Head Features 500
15.3.3 Subcategorization 501
15.3.4 Long-Distance Dependencies 506
15.4 Implementation of Unification 507
15.4.1 Unification Data Structures 507
15.4.2 The Unification Algorithm 509
15.5 Parsing with Unification Constraints 513
15.5.1 Integration of Unification into an Earley Parser 514
15.5.2 Unification-Based Parsing 519
15.6 Types and Inheritance 521
15.6.1 Advanced: Extensions to Typing 524
15.6.2 Other Extensions to Unification 525
15.7 Summary 525
Bibliographical and Historical Notes 526
Exercises 527

16 Language and Complexity 529
16.1 The Chomsky Hierarchy 530
16.2 Ways to Tell if a Language Isn't Regular 532
16.2.1 The Pumping Lemma 533
16.2.2 Proofs that Various Natural Languages Are Not Regular 535
16.3 Is Natural Language Context Free? 537
16.4 Complexity and Human Processing 539
16.5 Summary 542
Bibliographical and Historical Notes 543
Exercises 544

IV Semantics and Pragmatics

17 The Representation of Meaning 545
17.1 Computational Desiderata for Representations 547
17.1.1 Verifiability 547
17.1.2 Unambiguous Representations 548
17.1.3 Canonical Form 549
17.1.4 Inference and Variables 550
17.1.5 Expressiveness 551
17.2 Model-Theoretic Semantics 552
17.3 First-Order Logic 555
17.3.1 Basic Elements of First-Order Logic 555
17.3.2 Variables and Quantifiers 557
17.3.3 Lambda Notation 559
17.3.4 The Semantics of First-Order Logic 560
17.3.5 Inference 561
17.4 Event and State Representations 563
17.4.1 Representing Time 566
17.4.2 Aspect 569
17.5 Description Logics 572
17.6 Embodied and Situated Approaches to Meaning 578
17.7 Summary 580
Bibliographical and Historical Notes 580
Exercises 582

18 Computational Semantics 583
18.1 Syntax-Driven Semantic Analysis 583
18.2 Semantic Augmentations to Syntactic Rules 585
18.3 Quantifier Scope Ambiguity and Underspecification 592
18.3.1 Store and Retrieve Approaches 592
18.3.2 Constraint-Based Approaches 595
18.4 Unification-Based Approaches to Semantic Analysis 598
18.5 Integration of Semantics into the Earley Parser 604
18.6 Idioms and Compositionality 605
18.7 Summary 607
Bibliographical and Historical Notes 607
Exercises 609

19 Lexical Semantics 611
19.1 Word Senses 612
19.2 Relations Between Senses 615
19.2.1 Synonymy and Antonymy 615
19.2.2 Hyponymy 616
19.2.3 Semantic Fields 617
19.3 WordNet: A Database of Lexical Relations 617
19.4 Event Participants 619
19.4.1 Thematic Roles 620
19.4.2 Diathesis Alternations 622
19.4.3 Problems with Thematic Roles 623
19.4.4 The Proposition Bank 624
19.4.5 FrameNet 625
19.4.6 Selectional Restrictions 627
19.5 Primitive Decomposition 629
19.6 Advanced: Metaphor 631
19.7 Summary 632
Bibliographical and Historical Notes 633
Exercises 634
20 Computational Lexical Semantics 637
20.1 Word Sense Disambiguation: Overview 638
20.2 Supervised Word Sense Disambiguation 639
20.2.1 Feature Extraction for Supervised Learning 640
20.2.2 Naive Bayes and Decision List Classifiers 641
20.3 WSD Evaluation, Baselines, and Ceilings 644
20.4 WSD: Dictionary and Thesaurus Methods 646
20.4.1 The Lesk Algorithm 646
20.4.2 Selectional Restrictions and Selectional Preferences 648
20.5 Minimally Supervised WSD: Bootstrapping 650
20.6 Word Similarity: Thesaurus Methods 652
20.7 Word Similarity: Distributional Methods 658
20.7.1 Defining a Word's Co-Occurrence Vectors 659
20.7.2 Measuring Association with Context 661
20.7.3 Defining Similarity Between Two Vectors 663
20.7.4 Evaluating Distributional Word Similarity 667
20.8 Hyponymy and Other Word Relations 667
20.9 Semantic Role Labeling 670
20.10 Advanced: Unsupervised Sense Disambiguation 674
20.11 Summary 675
Bibliographical and Historical Notes 676
Exercises 679
21 Computational Discourse 681
21.1 Discourse Segmentation 684
21.1.1 Unsupervised Discourse Segmentation 684
21.1.2 Supervised Discourse Segmentation 686
21.1.3 Discourse Segmentation Evaluation 688
21.2 Text Coherence 689
21.2.1 Rhetorical Structure Theory 690
21.2.2 Automatic Coherence Assignment 692
21.3 Reference Resolution 695
21.4 Reference Phenomena 698
21.4.1 Five Types of Referring Expressions 698
21.4.2 Information Status 700
21.5 Features for Pronominal Anaphora Resolution 701
21.5.1 Features for Filtering Potential Referents 701
21.5.2 Preferences in Pronoun Interpretation 702
21.6 Three Algorithms for Anaphora Resolution 704
21.6.1 Pronominal Anaphora Baseline: The Hobbs Algorithm 704
21.6.2 A Centering Algorithm for Anaphora Resolution 706
21.6.3 A Log-Linear Model for Pronominal Anaphora Resolution 708
21.6.4 Features for Pronominal Anaphora Resolution 709
21.7 Coreference Resolution 710
21.8 Evaluation of Coreference Resolution 712
21.9 Advanced: Inference-Based Coherence Resolution 713
21.10 Psycholinguistic Studies of Reference 718
21.11 Summary 719
Bibliographical and Historical Notes 720
Exercises 722
V Applications
22 Information Extraction 725
22.1 Named Entity Recognition 727
22.1.1 Ambiguity in Named Entity Recognition 729
22.1.2 NER as Sequence Labeling 729
22.1.3 Evaluation of Named Entity Recognition 732
22.1.4 Practical NER Architectures 734
22.2 Relation Detection and Classification 734
22.2.1 Supervised Learning Approaches to Relation Analysis 735
22.2.2 Lightly Supervised Approaches to Relation Analysis 738
22.2.3 Evaluation of Relation Analysis Systems 742
22.3 Temporal and Event Processing 743
22.3.1 Temporal Expression Recognition 743
22.3.2 Temporal Normalization 746
22.3.3 Event Detection and Analysis 749
22.3.4 TimeBank 750
22.4 Template Filling 752
22.4.1 Statistical Approaches to Template-Filling 752
22.4.2 Finite-State Template-Filling Systems 754
22.5 Advanced: Biomedical Information Extraction 757
22.5.1 Biological Named Entity Recognition 758
22.5.2 Gene Normalization 759
22.5.3 Biological Roles and Relations 760
22.6 Summary 762
Bibliographical and Historical Notes 762
Exercises 763
23 Question Answering and Summarization 765
23.1 Information Retrieval 767
23.1.1 The Vector Space Model 768
23.1.2 Term Weighting 770
23.1.3 Term Selection and Creation 772
23.1.4 Evaluation of Information-Retrieval Systems 772
23.1.5 Homonymy, Polysemy, and Synonymy 776
23.1.6 Ways to Improve User Queries 776
23.2 Factoid Question Answering 778
23.2.1 Question Processing 779
23.2.2 Passage Retrieval 781
23.2.3 Answer Processing 783
23.2.4 Evaluation of Factoid Answers 787
23.3 Summarization 787
23.4 Single-Document Summarization 790
23.4.1 Unsupervised Content Selection 790
23.4.2 Unsupervised Summarization Based on Rhetorical Parsing 792
23.4.3 Supervised Content Selection 794
23.4.4 Sentence Simplification 795
23.5 Multi-Document Summarization 796
23.5.1 Content Selection in Multi-Document Summarization 797
23.5.2 Information Ordering in Multi-Document Summarization 798
23.6 Focused Summarization and Question Answering 801
23.7 Summarization Evaluation 805
23.8 Summary 807
Bibliographical and Historical Notes 808
Exercises 810
24 Dialogue and Conversational Agents 811
24.1 Properties of Human Conversations 813
24.1.1 Turns and Turn-Taking 813
24.1.2 Language as Action: Speech Acts 815
24.1.3 Language as Joint Action: Grounding 816
24.1.4 Conversational Structure 818
24.1.5 Conversational Implicature 819
24.2 Basic Dialogue Systems 821
24.2.1 ASR Component 821
24.2.2 NLU Component 822
24.2.3 Generation and TTS Components 825
24.2.4 Dialogue Manager 827
24.2.5 Dealing with Errors: Confirmation and Rejection 831
24.3 VoiceXML 832
24.4 Dialogue System Design and Evaluation 836
24.4.1 Designing Dialogue Systems 836
24.4.2 Evaluating Dialogue Systems 836
24.5 Information-State and Dialogue Acts 838
24.5.1 Using Dialogue Acts 840
24.5.2 Interpreting Dialogue Acts 841
24.5.3 Detecting Correction Acts 844
24.5.4 Generating Dialogue Acts: Confirmation and Rejection 845
24.6 Markov Decision Process Architecture 846
24.7 Advanced: Plan-Based Dialogue Agents 850
24.7.1 Plan-Inferential Interpretation and Production 851
24.7.2 The Intentional Structure of Dialogue 853
24.8 Summary 855
Bibliographical and Historical Notes 856
Exercises 858
25 Machine Translation 859
25.1 Why Machine Translation Is Hard 862
25.1.1 Typology 862
25.1.2 Other Structural Divergences 864
25.1.3 Lexical Divergences 865
25.2 Classical MT and the Vauquois Triangle 867
25.2.1 Direct Translation 868
25.2.2 Transfer 870
25.2.3 Combined Direct and Transfer Approaches in Classic MT 872
25.2.4 The Interlingua Idea: Using Meaning 873
25.3 Statistical MT 874
25.4 P(F|E): The Phrase-Based Translation Model 877
25.5 Alignment in MT 879
25.5.1 IBM Model 1 880
25.5.2 HMM Alignment 883
25.6 Training Alignment Models 885
25.6.1 EM for Training Alignment Models 886
25.7 Symmetrizing Alignments for Phrase-Based MT 888
25.8 Decoding for Phrase-Based Statistical MT 890
25.9 MT Evaluation 894
25.9.1 Using Human Raters 894
25.9.2 Automatic Evaluation: BLEU 895
25.10 Advanced: Syntactic Models for MT 898
25.11 Advanced: IBM Model 3 and Fertility 899
25.11.1 Training for Model 3 903
25.12 Advanced: Log-Linear Models for MT 903
25.13 Summary 904
Bibliographical and Historical Notes 905
Exercises 907
Bibliography 909
Author Index 959
Subject Index 971