3. linguistic essential


Upload: erno

Post on 03-Feb-2016


Page 1: 3. Linguistic Essential

1

3. Linguistic Essential

Artificial Intelligence Lab, Kang Mi-young (강미영)


3. Linguistic Essential

3.1 Parts of Speech and Morphology
– Nouns and pronouns
– Determiners and adjectives
– Verbs
– Other parts of speech (adverbs, prepositions, particles)

3.2 Phrase Structure
– Phrase structure grammars
– Dependency: Arguments and adjuncts
– X' theory
– Phrase structure ambiguity

3.3 Semantics and Pragmatics


3.1 Parts of Speech and Morphology (1-1)

• Parts of speech (POS): syntactic or grammatical categories
– Words in the same category show similar syntactic behavior
– Three important parts of speech
  • Nouns: refer to people, animals, concepts, and things
  • Verbs: express the action in the sentence
  • Adjectives: describe properties of nouns
– Substitution test: the most basic test for whether words belong to the same class

The {sad, intelligent, green, fat, …} one is in the corner.

Children eat sweet candy.


3.1 Parts of Speech and Morphology (1-2)

• Many words have multiple parts of speech
  Too much boiling will candy the molasses. (candy: verb)
  Have a candy from the box. (candy: noun)

• Word classes
1. Open (or lexical) categories
  • nouns, verbs, adjectives
  • large number of members
  • new words are added
2. Closed (or functional) categories
  • prepositions (of, on), determiners (the, a)
  • only a few members
  • normally have a clear grammatical use

• The various parts of speech of a word are listed in a lexicon
• Systems of parts of speech:
– Tradition: about 8 categories
– Corpus linguists: sets of abbreviations for naming word classes (POS tags)


3.1 Parts of Speech and Morphology (2-1)

• Word categories are systematically related by morphological processes
– e.g., formation of the plural form from the singular form of a noun
  • dog → dog-s

• Morphology
– Very important in NLP
– Language is productive: new words are related to words we already know, so understanding the morphological process behind a new word lets us infer many of its syntactic and semantic properties


3.1 Parts of Speech and Morphology (2-2-1)

• Types of morphological processes

1. Inflection
• Systematic modifications of a root form by means of prefixes and suffixes
• Indicates grammatical distinctions (e.g., singular/plural)
– Varies features such as tense, number, and plurality
• Does not change word class or meaning significantly
• Inflectional forms of a word = manifestations of a single lexeme
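The idea that inflected forms are manifestations of one lexeme can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the suffix rule and the function names `pluralize` and `lexeme_forms` are hypothetical, and the irregular table holds only the two irregular nouns mentioned in this deck.

```python
# Irregular plurals have to be listed; they do not follow the suffix rule.
IRREGULAR_PLURALS = {"child": "children", "woman": "women"}


def pluralize(noun):
    """Return the plural of an English noun (rough, rule-based sketch)."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # Nouns ending in a sibilant spelling take -es (speech -> speeches).
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    # Default regular case: append the suffix -s (dog -> dogs).
    return noun + "s"


def lexeme_forms(noun):
    """Group the inflectional forms of a noun under a single lexeme."""
    return {noun, pluralize(noun)}


if __name__ == "__main__":
    for noun in ["dog", "speech", "child"]:
        print(noun, "->", sorted(lexeme_forms(noun)))
```

The word class stays the same for every form in the set, which is what separates inflection from derivation.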


3.1 Parts of Speech and Morphology (2-2-2)

2. Derivation
• Less systematic (wide → widely, but old → *oldly, difficult → *difficultly)
• More radical change of syntactic category
• Involves a change in meaning
• Suffixes of derivation: -en (weak-en), -able (accept-able), -er (teach-er)

Example of derivation: wide (adj) → widely (adv)
– a wide river: extending over a broad area
– It is widely believed: among a broad range of people


3.1 Parts of Speech and Morphology (2-2-3)

3. Compounding: merging two or more words into a new word
• Noun-noun compounds: combinations of two nouns
– written as separate words,
– pronounced as a single word,
– denote a single semantic concept (usually listed in the lexicon)
  e.g., tea kettle, disk drive, college degree
• Other compounds involve adjectives, verbs, and prepositions
  e.g., downmarket, (to) overtake, mad cow disease


3.1.1 Nouns and pronouns (1)

• Nouns refer to entities in the world: people, animals, things

– Number, e.g., in English (a morphologically poor language)
  plural form: + suffix -s
  Regular: dog : dogs [z], person : persons [z], speech : speeches [ɪz]
  Irregular: child : children, woman : women

– Gender
  English: he, she, it
  Latin: -a (fem), -us (masc): fili-us (son: male child), fili-a (daughter: female child)
  German: Mädchen (girl: neuter; the gender assignment is arbitrary)

Type of inflection   Instances
number               singular, plural
gender               feminine, masculine, neuter
case                 nominative, genitive, dative, accusative


3.1.1 Nouns and pronouns (2)

• Case: nouns appear in different forms when they have different functions (subject, object, etc.) in the sentence
  Latin: filius (subject), filium (object)
  English: no real case inflections

– The only case relationship marked in English is the genitive (describes the possessor): ’s, or ’ after words ending in s
– The genitive marker is a clitic (phrasal affix): it attaches to a whole phrase
  - the women’s house
  - The person you met’s house was broken into


3.1.1 Nouns and pronouns (3)

• Pronouns: act like variables in that they refer to a person or thing that is somehow salient in the discourse context
  After Mary arrived in the village, she looked for a bed-and-breakfast.

– Personal pronouns: the only words in English which appear in different forms when they are used as the subject and the object of the sentence
  • Nominative: subject case
  • Accusative: object case
– Possessive pronouns: my car, a friend of mine
– Reflexive pronouns: always refer to a nearby antecedent in the same sentence, normally the subject of the sentence
– Anaphors: refer to something very nearby in the text


3.1.1 Nouns and pronouns (4)

Noun tags:
NN      singular nouns
NNS     plural nouns
NNP     proper nouns (e.g., Mary, Korea)
NNPS    plural proper nouns
NR      adverbial nouns (e.g., west, tomorrow)
NRS     plural adverbial nouns
NN$     possessive singular nouns
NNS$    possessive plural nouns
NNP$    possessive singular proper nouns
NNPS$   possessive plural proper nouns
NR$     possessive adverbial nouns

Pronoun tags:
Nominative   PPS (3SG), PPSS (1SG, 2SG, PL)
Accusative   PPO
Possessive   PP$, PP$$ (second possessive)
Reflexive    PPL, PPLS (plural)


3.1.2 Words that accompany nouns (1)

• Determiners and adjectives
– Determiners: describe the particular reference of a noun
  • Articles (the; a, an)
  • Demonstratives (this, that)
– Adjectives: describe properties of nouns
  • Attributive (adnominal)
  • Predicative (complement of be)
  • Agreement: adjectives (and articles) agree with the noun in case, number, and gender
  • Comparative / superlative (as opposed to the positive form, e.g., rich)
    – -er / -est (richer / richest)
    – periphrastic forms (more intelligent / most intelligent)

• Quantifiers
– Pre-quantifiers (all, many)
– Nominal pronouns (one, something, anything)

• Interrogative pronouns / determiners: used with or instead of nouns


3.1.2 Words that accompany nouns (2)

Adjective tags:
JJ    positive adjectives
JJR   comparative adjectives
JJT   superlative adjectives
JJS   semantically superlative adjectives (e.g., chief)

Number tags:
CD    cardinals (one, two, …)
OD    ordinals (first, second, …)

Determiner tags:
AT    articles
DT    singular determiners (this, that)
DTS   plural determiners (these, those)
DTI   determiners with no singular/plural distinction (some, any)
DTX   double conjunctions (either, neither)

Wh-word tags:
WDT   wh-determiner (what, which)
WP$   possessive wh-pronoun (whose)
WPO   objective wh-pronoun (whom, which, that)
WPS   nominative wh-pronoun (who, which, that)

Other tags:
ABN   pre-quantifier (all, many)
PN    nominal pronoun (one, something)
EX    existential there (dummy subject at the start of a sentence)


3.1.3 Verbs

• Describe actions

• Morphological forms of verbs (illustrated with take)

Tag   Form                            Example
VB    base form                       take
VBD   past tense                      took
VBG   gerund and present participle   taking
VBN   past participle                 taken
MD    modal auxiliaries               can, may, must, could, might, …

be, have, do, …: see table 4.6


3.1.4 Other parts of speech

• Adverbs
– Modify a verb (specify place, time, manner, degree)
– Modify adjectives and other adverbs

• Prepositions
– Express spatial relationships

• Particles (the slide's Korean gloss: 후치사, "postposition", noted as distinct from a suffix, 접미사)
– Subclass of prepositions: most prepositions do double duty as particles
– Form phrasal verbs by entering into strong bonds with verbs
– Listed as separate lexical entries (different syntactic and semantic properties)
  • Preposition: She ran up a hill
  • Particle: She ran up a bill

• Conjunctions and complementizers


3.1.4 Other parts of speech

Tag   Category                                  Examples
RB    ordinary adverb                           simply, late, well, little
RBR   comparative adverb                        later, better, less
RBT   superlative adverb                        latest, best, least
QL    qualifier (= degree adverb; modifies      very, too, extremely
      only adjectives or adverbs)
QLP   post-qualifier                            enough, indeed
WQL   wh-qualifier                              how
WRB   wh-adverb                                 how, when, where
IN    prepositions
RP    particles
CC    conjunctions                              and, or, but
CS    subordinating conjunctions                that, because, if, before, …
      (complementizers: that)


3.2 Phrase structure (1)

• Syntax: study of the regularities and constraints of word order and phrase structure

• Constituents
– able to occur in various positions
– uniform syntactic possibilities for expansion
– Paradigmatic relationship: all elements that can be substituted for each other in a certain syntactic position are members of one paradigm
– Syntagmatic relationship: the relationship between two or more words (or phrases) that together form one phrase (syntagma)

Subject (one paradigm)          Verb   Object (one paradigm)
She                             saw    him
The woman                              the man
The tall woman                         the short man
The very tall woman                    the very short man
The tall woman with sad eyes           the short man with red hair
…                                      …


3.2 Phrase structure (2)

• Typical English phrase structure
– Rewrite rule

S
  NP  That man
  VP
    VBD  caught
    NP   the butterfly
    PP
      IN  with
      NP  a net


3.2 Phrase structure (3)

• Noun phrases (NPs)
– Head: noun
– Arguments of the verb
– Structure: (determiner) + (adjective phrase) + noun + (post-modifiers)
  • the noun is the head; the parenthesized elements are optional
  • post-modifiers: prepositional phrases, clausal modifiers

• Prepositional phrases (PPs)
– Head: preposition
– Contain a noun phrase complement
– Express spatial and temporal locations, etc.

• Verb phrases (VPs)
– Head: verb
– Organize all elements of the sentence

• Adjective phrases (APs): very sure of herself, quite certain to succeed


3.2.1 Phrase structure grammars (1)

• Word order
– Changing it changes the meaning: English
– Changing it does not change the meaning: Latin (= free word order language)
– Declaratives, interrogatives (inversion), imperatives

• Rewrite rules (first way of showing constituency)
– used to generate sentences
– Form: A → B, A → B C

S  → NP VP         AT  → the
NP → AT NNS        NNS → children
NP → AT NN         NNS → students
NP → NP PP         NNS → mountains
VP → VP PP         VBD → slept
VP → VBD           VBD → ate
VP → VBD NP        VBD → saw
etc.

Generation:

S
⇒ NP VP
⇒ AT NNS VBD
⇒ The children slept
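Generation with rewrite rules can be sketched as a small recursive expander. This is a minimal illustration, not a full grammar: it uses only the rules and words that appear on the slide, and it leaves out NP → AT NN, NP → NP PP, and VP → VP PP because the slide gives no NN, PP, or IN entries to expand them with. The names `RULES` and `generate` are hypothetical.

```python
import random

# Left-hand side -> list of possible right-hand sides.
# Symbols without an entry in RULES are treated as terminal words.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["AT", "NNS"]],
    "VP": [["VBD"], ["VBD", "NP"]],
    "AT": [["the"]],
    "NNS": [["children"], ["students"], ["mountains"]],
    "VBD": [["slept"], ["ate"], ["saw"]],
}


def generate(symbol, rng):
    """Expand `symbol` top-down, picking one rewrite rule at random."""
    if symbol not in RULES:
        return [symbol]  # terminal: an actual word
    words = []
    for child in rng.choice(RULES[symbol]):
        words.extend(generate(child, rng))
    return words


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(" ".join(generate("S", rng)))
```

Because the rules are purely syntactic, the generator can also produce semantically odd strings such as "the mountains ate the students"; the grammar only constrains constituent order.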


3.2.1 Phrase structure grammars (2)

• Tree (second way of showing constituency)
– Terminal nodes: the actual word forms of the sentence being analyzed
– Nonterminal nodes (internal nodes): syntactic groups, phrases
– The order of the daughters generates the word order of the sentence

S
  NP
    AT   The
    NNS  children
  VP
    VBD  slept

• Bracketing (third way of showing constituency): grouping

[S [NP [AT The] [NNS children]] [VP [VBD ate] [NP [AT the] [NN cake]]]]
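Trees and bracketings carry the same information, which can be checked by reading a bracketed string back into a nested structure. The reader below is a minimal sketch that assumes well-formed input with square brackets and single-token labels; `parse_brackets` and `leaves` are hypothetical helper names.

```python
def parse_brackets(text):
    """Turn '[S [NP ...] ...]' into nested (label, children) tuples."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "[":
            raise ValueError("expected '[' at token %d" % pos)
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(read_node())  # nonterminal daughter
            else:
                children.append(tokens[pos])  # terminal word
                pos += 1
        pos += 1  # consume the closing "]"
        return (label, children)

    return read_node()


def leaves(tree):
    """Read the terminal nodes left to right, giving the word order."""
    words = []
    for child in tree[1]:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


BRACKETED = "[S [NP [AT The] [NNS children]] [VP [VBD ate] [NP [AT the] [NN cake]]]]"

if __name__ == "__main__":
    tree = parse_brackets(BRACKETED)
    print(tree[0], [child[0] for child in tree[1]])
    print(" ".join(leaves(tree)))
```

Reading the leaves in daughter order recovers the sentence, matching the point that the order of daughters determines word order.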


3.2.1 Phrase structure grammars (3)

• Recursivity: recursive expansions
– A property of most formalizations of natural language syntax in terms of rewrite rules
– There are constellations in which rewrite rules can be applied an arbitrary number of times

• Non-local dependencies [a challenge for some Statistical NLP approaches]
– Words can be syntactically dependent even though they occur far apart in a sentence
  • Subject-verb agreement (number, person)
  • Long-distance dependencies
    – Wh-extraction
      Should Peter buy a book?
      Which book should Peter buy?
    – Empty nodes: written e (e.g., an NP node that dominates e)

S″
  NP  Which book
  S′
    MD  should
    S
      NP  Peter
      VP
        VB  buy
        NP  e
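Recursivity can be made concrete with the rule NP → NP PP, which may be applied any number of times before bottoming out. The sketch below is illustrative only: `expand_np` is a hypothetical helper, and the fixed phrases are borrowed from the deck's "the short man with red hair" example.

```python
def expand_np(depth):
    """Apply NP -> NP PP `depth` times, then stop with a simple NP."""
    if depth == 0:
        return "[NP the man]"
    # The NP on the right-hand side is itself expanded recursively.
    return "[NP " + expand_np(depth - 1) + " [PP with [NP red hair]]]"


if __name__ == "__main__":
    for depth in range(3):
        print(depth, expand_np(depth))
```

Each extra application adds one more PP, so a finite rule set describes an unbounded set of noun phrases.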


3.2.2 Dependency: Arguments and adjuncts (1)

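• Dependency (= dependency grammar): represents sentence structure on the basis of the dependency relations between sentence elements

• A grammar that describes the dependency (subordination) relations between sentence constituents: it states which of the differently ranked constituents in a sentence are governing constituents, and which dependent constituents are attached to each governing constituent

• Dependency relation: a binary relation between two sentence constituents, a governing constituent (governor, head) and a dependent constituent (dependent)
– Languages such as English (or French):
  • word order is fixed, so parsing based on phrase structure is feasible
– Languages with free word order, for which phrase structure rules are hard to set up (e.g., Korean and Japanese):
  • parsing based on dependency grammar is preferred
  • defining the basic sentence structure with phrase structure rules would require a rule for every word order and every ellipsis pattern

- English, French, etc.: a wrong word order gives an ungrammatical sentence
- Korean, etc.: the sentence stays grammatical (with subtle differences in meaning)

English                        Korean counterpart
I put a pen on the table.      나는 책상 위에 펜을 놓았다. (grammatical)
*A pen put I on the table.     펜을 나는 책상 위에 놓았다. (object fronted; still grammatical)
*I put a pen.                  나는 펜을 놓았다. (location omitted; still grammatical)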
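A dependency analysis is just a set of binary governor-dependent relations, so it can be stored as pairs. The sketch below encodes one plausible analysis of "I put a pen on the table"; the particular attachments are an illustrative assumption rather than an analysis given on the slide, and `dependents_of` and `find_roots` are hypothetical helper names.

```python
# Each entry is (governor, dependent): a binary relation between two words.
# The attachments are an assumed analysis for illustration.
DEPENDENCIES = [
    ("put", "I"),
    ("put", "pen"),
    ("pen", "a"),
    ("put", "on"),
    ("on", "table"),
    ("table", "the"),
]


def dependents_of(governor, dependencies=DEPENDENCIES):
    """List the words directly governed by `governor`."""
    return [dep for head, dep in dependencies if head == governor]


def find_roots(dependencies=DEPENDENCIES):
    """Words that govern others but are governed by nothing."""
    heads = {head for head, _ in dependencies}
    governed = {dep for _, dep in dependencies}
    return sorted(heads - governed)


if __name__ == "__main__":
    print("root:", find_roots())
    print("dependents of put:", dependents_of("put"))
```

The pairs record who depends on whom but say nothing about linear order, which is one reason this representation suits languages with relatively free word order.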


3.2.2 Dependency: Arguments and adjuncts (2)

• Arguments (dependents) of verbs
– Express entities that are centrally involved in the activity of the verb
  • NPs, PPs, VPs
– Semantic roles
  • Agent / Patient
– Grammatical relations
  • Subject
  • Object
    – direct object (patient)
    – indirect object (recipient): can be expressed as a prepositional phrase
– Roles (relations) change under voice alternations
  • Active and passive
    – English: the patient becomes the subject; the agent takes an oblique role (by-phrase)
    – Other languages: change in case marking or in the morphology of the verb


3.2.2 Dependency: Arguments and Adjuncts (3)

• Adjuncts (dependents)
– Less tightly linked to the verb
– Always optional (whereas many complements are obligatory)
– Move around more easily than complements
– Phrases describing time, place, manner of action, or state
  • yesterday, in Paris, with great interest …
– Adjuncts and complements are difficult to distinguish
  • Intermediate degrees of selection?
    [Statistical NLP: degree of association between a verb and a dependent]

He put the book on the table. (obligatory)
He gave his presentation on the stage. (optional)
He will retire in Florida.


3.2.2 Dependency: Arguments and adjuncts (4)

• Subcategorization
– A verb subcategorizes for a particular complement
  e.g., bring subcategorizes for an object
– Subcategorized arguments
  • Subject, object, prepositional phrase, predicative adjective, bare infinitive, infinitive with to, participial phrase, that-clause, question-form clauses
  • S′ (S-bar) constituent: relative clauses, main clause questions


3.2.2 Dependency: Arguments and adjuncts (5)

• Syntactic regularities about complements: subcategorization frames (patterns of arguments)
– A particular set of arguments that a verb can appear with

• Semantic regularities between constituents
– Selectional restrictions (preferences)
  • bark (dogs as subjects) / eat (edible things as objects)
  • Violation of selectional preferences: odd-sounding sentence

Subcategorization frames:

Frame                             Arguments
Intransitive verb                 NP[subject]
Transitive verb                   NP[subject], NP[object]
Ditransitive verb                 NP[subject], NP[direct object], NP[indirect object]
Intransitive with PP              NP[subject], PP
Transitive with PP                NP[subject], NP[object], PP
Sentential comp                   NP[subject], clause
Transitive with sentential comp   NP[subject], NP[object], clause
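The frame table above can be turned into a simple lookup that tests whether a verb is compatible with a given argument pattern. This is a sketch under stated assumptions: the frame names and argument lists come from the table, while the small verb lexicon `VERB_FRAMES` and the function `licenses` are hypothetical and chosen only to mirror examples used in this deck.

```python
# Frame name -> the argument pattern it requires (from the table above).
FRAMES = {
    "intransitive": ("NP[subject]",),
    "transitive": ("NP[subject]", "NP[object]"),
    "ditransitive": ("NP[subject]", "NP[direct object]", "NP[indirect object]"),
    "intransitive with PP": ("NP[subject]", "PP"),
    "transitive with PP": ("NP[subject]", "NP[object]", "PP"),
    "sentential comp": ("NP[subject]", "clause"),
    "transitive with sentential comp": ("NP[subject]", "NP[object]", "clause"),
}

# Hypothetical mini-lexicon: which frames each verb subcategorizes for.
VERB_FRAMES = {
    "sleep": {"intransitive"},
    "bring": {"transitive"},
    "put": {"transitive with PP"},
}


def licenses(verb, arguments):
    """True if some frame of `verb` matches the argument pattern exactly."""
    frames = VERB_FRAMES.get(verb, set())
    return any(FRAMES[name] == tuple(arguments) for name in frames)


if __name__ == "__main__":
    print(licenses("put", ["NP[subject]", "NP[object]", "PP"]))
    print(licenses("put", ["NP[subject]", "NP[object]"]))
```

With this toy lexicon, put is rejected without its PP, in line with the starred example "*I put a pen". A statistical treatment would replace the hard yes/no with a degree of association between verb and frame.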


3.2.3 X’ theory

• Phrase structure rules as presented above do not predict any systematicity in the way that phrases in natural languages are built, nor any regularities in the appearance of different kinds of dependents in clauses.

• Head of a phrase: a word

• A broad systematicity in the way dependents arrange themselves around a head in a phrase: head/complements

NP
  Det  The
  N′
    AP  definitive
    N′
      N   study
      PP  of subcategorization

Levels: N′ corresponds to the X′ level, N″ to the XP level
– Basic scheme: 2 levels
– A phrase can have more or fewer levels


3.2.4 Phrase structure ambiguity (1)

• Rewrite rules are used in parsing
– Parse = phrase structure tree that is constructed from a sentence

• Phrase structure ambiguity (syntactic ambiguity), e.g., 100 parses for one English sentence

a. Attachment ambiguity
• A phrase that could have been generated by two different nodes
• Different attachments have different meanings

The children ate the cake with a spoon.
– Attachment to the verb phrase: instrument (how the cake was eaten)
– Attachment to the noun phrase: which cake was eaten


3.2.4 Phrase structure ambiguity (2)

(a) PP attached to the verb phrase:

S
  NP
    AT   The
    NNS  children
  VP
    VP
      VBD  ate
      NP
        AT  the
        NN  cake
    PP
      IN  with
      NP
        AT  a
        NN  spoon

(b) PP attached to the noun phrase:

S
  NP
    AT   The
    NNS  children
  VP
    VBD  ate
    NP
      NP
        AT  the
        NN  cake
      PP
        IN  with
        NP
          AT  a
          NN  spoon
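The two attachments can be confirmed mechanically by counting parses with a CKY-style chart. The sketch below uses the binary rewrite rules from this section plus PP → IN NP; the lexicon entries for cake, spoon, a, and with are assumptions added so that the example sentence is covered, and the unary rule VP → VBD is left out to keep the chart strictly binary. `count_parses` is a hypothetical function name.

```python
from collections import defaultdict

# Binary rules as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "AT", "NNS"),
    ("NP", "AT", "NN"),
    ("NP", "NP", "PP"),
    ("VP", "VBD", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "IN", "NP"),
]

# Word -> POS tag (one tag per word; an assumption for this example).
LEXICON = {
    "the": "AT", "a": "AT", "children": "NNS", "ate": "VBD",
    "cake": "NN", "spoon": "NN", "with": "IN",
}


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` rooted in `start`."""
    n = len(words)
    # chart[(i, j, X)] = number of ways X can span words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        chart[(i, i + 1, LEXICON[word.lower()])] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY_RULES:
                    chart[(i, j, parent)] += (
                        chart[(i, k, left)] * chart[(k, j, right)]
                    )
    return chart[(0, n, start)]


if __name__ == "__main__":
    sentence = "The children ate the cake with a spoon".split()
    print(count_parses(sentence))
```

For the example sentence this grammar yields two parses, one per attachment site; without the PP the sentence has a single parse. Counts grow quickly as more PPs are added, which is how a sentence can end up with very many parses.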


3.2.4 Phrase structure ambiguity (3)

b. Garden paths (temporary ambiguity)
• Later words in the sentence do not seem to belong there
  → the reader has adopted a spurious parse
  → and must backtrack to construct the right parse
  The horse raced past the barn fell.
  (= The horse fell after it had been raced past the barn.)
• Rarely a problem in spoken language (intonational patterns, pauses, …)

c. No parse at all (not covered by the grammar)
• Syntactic ill-formedness (ungrammatical): no interpretation
  * Slept children the.
• Semantic abnormality: semantic, pragmatic, or cultural oddness
  # Colorless green ideas sleep furiously.


3.3 Semantics and Pragmatics (1)

• Semantics: study of the meaning of words, constructions, and utterances

1. Lexical semantics
– Lexical hierarchy
  • Hypernymy: animal (general) is a hypernym of cat (specialized)
  • Antonyms: words with opposite meanings; hot / cold, long / short
  • Meronymy: part-whole relationship
    – meronym (holonym): tire (car), leaf (tree)
  • Synonyms: words with the same or very similar meaning; car / automobile
– Ambiguity: refers to homonymy and polysemy
  • Homonyms: different words that are written the same way; bank
  • Polysemes: words whose meanings are related; branch
  • Homophony: a subcase of homonymy in which the words are written the same way and also have identical pronunciation; bass (low-pitched sound vs. the fish) is a homonym but not a homophone, because its two senses are pronounced differently
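A lexical hierarchy can be represented as a mapping from each word to its direct hypernym, with hypernymy then decided by walking up the chain. The entries below are a toy example: only the cat / animal pair and the two meronym pairs come from the slide, the remaining entries are assumptions added for illustration, and the function names are hypothetical.

```python
# Word -> its direct hypernym. Only "cat" -> "animal" is from the slide;
# the other entries are assumed for illustration.
HYPERNYM_OF = {
    "cat": "animal",
    "dog": "animal",
    "animal": "living thing",
}

# Meronym (part) -> holonym (whole), as on the slide.
HOLONYM_OF = {"tire": "car", "leaf": "tree"}


def hypernym_chain(word):
    """All increasingly general terms above `word`, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hypernym(general, specific):
    """True if `general` lies somewhere above `specific` in the hierarchy."""
    return general in hypernym_chain(specific)


if __name__ == "__main__":
    print(hypernym_chain("cat"))
    print(is_hypernym("animal", "cat"), is_hypernym("cat", "animal"))
    print("tire is part of", HOLONYM_OF["tire"])
```

The relation is directional: animal is a hypernym of cat, but not the other way around.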


3.3 Semantics and Pragmatics (2)

2. Study of how the meanings of individual words are combined into the meaning of sentences (and, beyond that, discourses)

– Compositionality: the meaning of the whole can be predicted from the meanings of the parts
  • Not always the case: white denotes different colors in white paper (white), white hair (grey), white skin (rosy), white wine (yellow)
– Collocations: meaning of the whole = sum of the meanings of the parts + some additional semantic component that cannot be predicted from the parts
– Idioms: the relationship between the meaning of the words and the meaning of the phrase is completely opaque
  • to kick the bucket (= die)
– Scope: quantifiers have a scope which extends over one or more phrases or clauses
  • Everyone didn’t go to the movie.
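The scope example has two readings, depending on whether the quantifier or the negation takes wide scope. The sketch below evaluates both readings against a small model; the model (who went to the movie) is made-up data for illustration, and the function names are hypothetical.

```python
def every_over_not(went):
    """'every' scopes over 'not': for every person, that person did not go."""
    return all(not attended for attended in went.values())


def not_over_every(went):
    """'not' scopes over 'every': it is not the case that everyone went."""
    return not all(went.values())


if __name__ == "__main__":
    # Hypothetical situation: one person went, one did not.
    went = {"Mary": True, "Peter": False}
    print("nobody went:       ", every_over_not(went))
    print("not everybody went:", not_over_every(went))
```

In a situation where some people went and some did not, the two readings disagree, which shows that the sentence is genuinely ambiguous rather than just vague.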


3.3 Semantics and Pragmatics (3)

– Discourse analysis
  • Relationships between sentences in a text
  • Part of pragmatics

– Pragmatics: study of how knowledge about the world and language conventions interact with literal meaning

– Anaphoric relations: important for information extraction
  Mary helped Peter get out of the cab. He thanked her.
  Mary helped the other passenger out of the cab. The man had asked her to help him because of his foot injury.
  • Answering a question such as "Which hurricane caused more than a billion dollars worth of damage?" needs pragmatic information