Fall 2005, Lecture Notes #6, EECS 595 / LING 541 / SI 661: Natural Language Processing
Lecture Notes #6
EECS 595 / LING 541 / SI 661
Natural Language Processing
Lexicalized and probabilistic parsing
• G = (N, Σ, P, S)
• Non-terminals (N)
• Terminals (Σ)
• Productions (P) augmented with probabilities: A → β [p]
Disambiguation as a probability problem
• P(T,S) = P(T) P(S|T) = P(T), because the parse tree T already contains every word of S, so P(S|T) = 1
• P(Tl) = .15 * .40 * .05 * .05 * .35 * .75 * .40 * .40 * .40 * .30 * .40 * .50 = 1.5 × 10⁻⁷
• P(Tr) = .15 * .40 * .40 * .05 * .05 * .75 * .40 * .40 * .40 * .30 * .40 * .50 = 1.7 × 10⁻⁷
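The disambiguation step above can be sketched in a few lines of Python. The factor lists are copied from the two products above; the function and variable names are illustrative assumptions, not part of the lecture.

```python
import math

# One factor per grammar rule used in each tree, copied from the products above.
left_tree_rules = [.15, .40, .05, .05, .35, .75, .40, .40, .40, .30, .40, .50]
right_tree_rules = [.15, .40, .40, .05, .05, .75, .40, .40, .40, .30, .40, .50]

def tree_probability(rule_probs):
    """P(T) is the product of the probabilities of all rules used in T."""
    return math.prod(rule_probs)

p_left = tree_probability(left_tree_rules)
p_right = tree_probability(right_tree_rules)

# Disambiguation: prefer the parse with the higher probability.
best = "Tr" if p_right > p_left else "Tl"
print(f"P(Tl) = {p_left:.2e}, P(Tr) = {p_right:.2e}, preferred: {best}")
```

The two trees differ in a single factor (.35 versus .40), so the ratio of the two probabilities is exactly .40 / .35.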
• Probabilistic Earley algorithm
– Top-down parser with dynamic programming
• Cocke-Younger-Kasami (CYK) algorithm
– Bottom-up parser with dynamic programming
• Probabilities come from a Treebank.
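A minimal sketch of the bottom-up, dynamic programming idea behind probabilistic CYK. It assumes a made-up toy grammar in Chomsky normal form: the rules, the probabilities and the function name are all illustrative and do not come from a treebank.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (probabilities are invented).
BINARY = {                      # (B, C) -> [(A, P(A -> B C))]
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Adj", "N"): [("NP", 0.4)],
}
LEXICAL = {                     # word -> [(A, P(A -> word))]
    "John": [("NP", 0.3)],
    "likes": [("V", 1.0)],
    "tabby": [("Adj", 1.0)],
    "cats": [("N", 1.0), ("NP", 0.3)],
}

def best_parse_probability(words, start="S"):
    """Return the probability of the most likely parse, or 0.0 if none exists."""
    n = len(words)
    # best[i, j][A] = probability of the best A-subtree spanning words[i:j]
    best = defaultdict(dict)
    for i, word in enumerate(words):
        for a, p in LEXICAL.get(word, []):
            best[i, i + 1][a] = p
    for span in range(2, n + 1):            # build longer spans from shorter ones
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for b, pb in best[i, k].items():
                    for c, pc in best[k, j].items():
                        for a, pr in BINARY.get((b, c), []):
                            p = pr * pb * pc
                            if p > best[i, j].get(a, 0.0):
                                best[i, j][a] = p
    return best[0, n].get(start, 0.0)

print(best_parse_probability("John likes tabby cats".split()))
```

Each table cell keeps only the best probability per non-terminal, which is what makes the search polynomial instead of enumerating every tree.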
• Lexical dependencies between head words
• Top-level predicate of a sentence is the root
• Useful for free word order languages
• Also simpler to parse
John likes tabby cats
NNP VBZ JJ NNS
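One way to store a dependency analysis of the example sentence is as a list of (word, head index) pairs. The head assignments below are an assumed analysis (the verb is the root, "John" and "cats" depend on it, "tabby" modifies "cats"); the helper names are illustrative.

```python
# (word, index of its head word); -1 marks the root of the sentence.
# The head assignments are an assumed analysis, not given in the notes.
sentence = [
    ("John", 1),    # depends on "likes"
    ("likes", -1),  # top-level predicate = root
    ("tabby", 3),   # modifies "cats"
    ("cats", 1),    # depends on "likes"
]

def root(sent):
    """Return the word whose head index is -1."""
    return next(word for word, head in sent if head == -1)

def dependents(sent, head_word):
    """Return all words whose head is `head_word`, in sentence order."""
    head_index = [word for word, _ in sent].index(head_word)
    return [word for word, head in sent if head == head_index]

print(root(sentence), dependents(sentence, "likes"))
```

Because the structure records only head links and not phrase order, the same representation works when the words appear in a different order, which is why it suits free word order languages.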
• Meaning representation languages: capturing the meaning of linguistic utterances using formal notation
• Example: deciding what to order at a restaurant by reading a menu
• Example: answering a question on an exam
• Semantic analysis: mapping between language and real life
• I have a car:
∃ x,y: Having(x) ∧ Haver(Speaker,x) ∧ HadThing(y,x) ∧ Car(y)
• Example: Does LeDog serve vegetarian food?
• Knowledge base (KB)
• Sample entry in KB: Serves(LeDog,VegetarianFood)
• Convert question to logical form and verify its truth value against the knowledge base
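A minimal sketch of verifying a logical form against a knowledge base. It assumes the KB is a plain set of ground facts and that anything not listed counts as false (a closed-world assumption that the notes do not state); the `holds` helper is hypothetical.

```python
# Knowledge base as a set of ground facts (predicate, arg1, arg2, ...).
KB = {
    ("Serves", "LeDog", "VegetarianFood"),
}

def holds(predicate, *args):
    """True if the ground fact is in the KB (closed-world assumption)."""
    return (predicate, *args) in KB

# "Does LeDog serve vegetarian food?" -> Serves(LeDog, VegetarianFood)
print(holds("Serves", "LeDog", "VegetarianFood"))
```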
• Example: I want to eat some place near UM. (multiple interpretations)
• Interpretation is important
• Preferred interpretations
• Vagueness: I want to eat Italian food. (What particular food?)
• Does LeDog have vegetarian dishes?
• Do they have vegetarian food at LeDog?
• Are vegetarian dishes served at LeDog?
• Does LeDog serve vegetarian fare?
• Having vs. serving
• Food vs. fare vs. dishes (each is ambiguous but one sense of each matches the others)
• Word sense disambiguation
Inference and variables; expressiveness
• Inference and variables:
– I’d like to find a restaurant that serves vegetarian food.
– Serves(x,VegetarianFood)
– System’s ability to draw valid conclusions based on the meaning representations of inputs and its store of background knowledge.
• Expressiveness:
– system must be able to handle a wide range of subject matter
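A request such as Serves(x,VegetarianFood) contains a variable, so answering it means finding bindings for x rather than checking a single fact. The sketch below assumes variables are written with a leading "?" and that the KB is a list of ground facts; the restaurant "TacoHut" and the helper names are invented for illustration.

```python
KB = [
    ("Serves", "LeDog", "VegetarianFood"),
    ("Serves", "Joe's", "VegetarianFood"),
    ("Serves", "TacoHut", "MexicanFood"),   # hypothetical entry
]

def match(pattern, fact):
    """Return variable bindings if `pattern` matches `fact`, else None."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):                 # variable: bind it consistently
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:                          # constant: must be identical
            return None
    return bindings

def query(pattern):
    """All bindings under which the pattern is a fact in the KB."""
    return [b for fact in KB if (b := match(pattern, fact)) is not None]

print(query(("Serves", "?x", "VegetarianFood")))
```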
• I want Italian food. NP want NP
• I want to spend less than five dollars. NP want Inf-VP
• I want it to be close by here. NP want NP Inf-VP
• Thematic roles: e.g. entity doing the wanting vs. entity that is wanted (linking surface arguments with the semantic (case) roles)
• Syntactic selection restrictions: I found to fly to Dallas.
• Semantic selection restrictions: The risotto wanted to spend less than ten dollars.
• Make a reservation for this evening for a table for two persons at eight: Reservation(Hearer,Today,8PM,2)
First-order predicate calculus (FOPC)
• Formula → AtomicFormula | Formula Connective Formula | Quantifier Variable … Formula | ¬ Formula | (Formula)
• AtomicFormula → Predicate(Term…)
• Term → Function(Term…) | Constant | Variable
• Connective → ∧ | ∨ | ⇒
• Quantifier → ∀ | ∃
• Constant → A | VegetarianFood | LeDog
• Variable → x | y | …
• Predicate → Serves | Near | …
• Function → LocationOf | CuisineOf | …
• I only have five dollars and I don’t have a lot of time.
• Have(Speaker,FiveDollars) ∧ ¬ Have(Speaker,LotOfTime)
• Variables:
– Have(x,FiveDollars) ∧ ¬ Have(x,LotOfTime)
• Note: grammar is recursive
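Because the grammar is recursive, a formula can be stored as a nested structure and evaluated by a recursive function. The sketch below is one possible encoding, assuming nested tuples for formulas and a set of ground facts as the model; it handles only variable-free formulas, and the operator names "not", "and", "or", "implies" are an arbitrary choice.

```python
def evaluate(formula, facts):
    """Recursively evaluate a variable-free formula against a set of facts."""
    op = formula[0]
    if op == "not":
        return not evaluate(formula[1], facts)
    if op == "and":
        return evaluate(formula[1], facts) and evaluate(formula[2], facts)
    if op == "or":
        return evaluate(formula[1], facts) or evaluate(formula[2], facts)
    if op == "implies":
        return (not evaluate(formula[1], facts)) or evaluate(formula[2], facts)
    return formula in facts          # atomic formula: look it up

# Have(Speaker,FiveDollars) ∧ ¬ Have(Speaker,LotOfTime)
sentence = ("and",
            ("Have", "Speaker", "FiveDollars"),
            ("not", ("Have", "Speaker", "LotOfTime")))

facts = {("Have", "Speaker", "FiveDollars")}
print(evaluate(sentence, facts))
```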
Semantics of FOPC
• FOPC sentences can be assigned a value of true or false.
• LeDog is near UM.
Variables and quantifiers
• A restaurant that serves Mexican food near UM.
• ∃ x: Restaurant(x) ∧ Serves(x,MexicanFood) ∧ Near(LocationOf(x),LocationOf(UM))
• All vegetarian restaurants serve vegetarian food.
• ∀ x: VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
• If this sentence is true, it holds for every substitution of x. Note that for any x that is not a vegetarian restaurant the condition is false, so the implication is trivially true for that x.
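Over a finite domain, ∀ and ∃ reduce to Python's `all` and `any`, which makes the behaviour of the implication easy to check. This is a sketch under assumed data: the domain, the entity "BurgerBarn" and the function names are invented for illustration.

```python
DOMAIN = ["Joe's", "LeDog", "BurgerBarn"]        # "BurgerBarn" is hypothetical
FACTS = {
    ("VegetarianRestaurant", "Joe's"),
    ("Serves", "Joe's", "VegetarianFood"),
    ("Serves", "LeDog", "VegetarianFood"),
}

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

def all_veg_restaurants_serve_veg(facts, domain):
    # ∀ x: VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
    return all(
        implies(("VegetarianRestaurant", x) in facts,
                ("Serves", x, "VegetarianFood") in facts)
        for x in domain
    )

def some_veg_restaurant_exists(facts, domain):
    # ∃ x: VegetarianRestaurant(x)
    return any(("VegetarianRestaurant", x) in facts for x in domain)

print(all_veg_restaurants_serve_veg(FACTS, DOMAIN))
```

"LeDog" and "BurgerBarn" are not vegetarian restaurants here, so they satisfy the implication trivially; adding VegetarianRestaurant(BurgerBarn) without a matching Serves fact would make the universal statement false.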
• Modus ponens:
VegetarianRestaurant(Joe’s)
∀ x: VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)
∴ Serves(Joe’s,VegetarianFood)
Uses of modus ponens
• Forward chaining: as individual facts are added to the database, all derived inferences are generated
• Backward chaining: starts from queries. Example: the Prolog programming language
• father(X, Y) :- parent(X, Y), male(X).
parent(john, bill).
parent(jane, bill).
female(jane).
male(john).
?- father(M, bill).
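To make backward chaining concrete, here is a small Python sketch that answers the Prolog query above by starting from the goal and working back to facts. It is a simplified illustration, not how Prolog is actually implemented: terms are flat tuples, strings starting with an uppercase letter are treated as variables, and all names are assumptions.

```python
import itertools

RULES = [  # (head, body); facts are rules with an empty body
    (("father", "X", "Y"), [("parent", "X", "Y"), ("male", "X")]),
    (("parent", "john", "bill"), []),
    (("parent", "jane", "bill"), []),
    (("female", "jane"), []),
    (("male", "john"), []),
]
_fresh = itertools.count()

def is_var(term):
    return term[:1].isupper()

def walk(term, subst):
    """Follow variable bindings until reaching a constant or a free variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Return an extended substitution that makes atoms a and b equal, or None."""
    if len(a) != len(b):
        return None
    subst = dict(subst)
    for x, y in zip(a, b):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None
    return subst

def rename(rule):
    """Give the rule's variables fresh names so separate uses do not clash."""
    n = next(_fresh)
    fix = lambda atom: tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    head, body = rule
    return fix(head), [fix(atom) for atom in body]

def solve(goals, subst):
    """Backward chaining: prove the first goal, then the remaining ones."""
    if not goals:
        yield subst
        return
    for rule in RULES:
        head, body = rename(rule)
        extended = unify(goals[0], head, subst)
        if extended is not None:
            yield from solve(body + goals[1:], extended)

answers = [walk("M", s) for s in solve([("father", "M", "bill")], {})]
print(answers)
```

The query is reduced to the subgoals parent(M, bill) and male(M); jane satisfies the first but not the second, so only one binding for M survives.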
Examples from Russell&Norvig (1)
• 7.2. p.213
• Not all students take both History and Biology.
• Only one student failed History.
• Only one student failed both History and Biology.
• The best score in History was better than the best score in Biology.
• Every person who dislikes all vegetarians is smart.
• No person likes a smart vegetarian.
• There is a woman who likes all men who are vegetarian.
• There is a barber who shaves all men in town who don't shave themselves.
• No person likes a professor unless the professor is smart.
• Politicians can fool some people all of the time or all people some of the time but they cannot fool all people all of the time.
Categories & Events
• Categories:
– VegetarianRestaurant(Joe’s) – categories are relations and not objects
– MostPopular(Joe’s,VegetarianRestaurant) – not FOPC!
– ISA(Joe’s,VegetarianRestaurant) – reification (turn all concepts into objects)
– AKO(VegetarianRestaurant,Restaurant)
• Events:
– Reservation(Hearer,Joe’s,Today,8PM,2)
– Problems:
• Determining the correct number of roles
• Representing facts about the roles associated with an event
• Ensuring that all the correct inferences can be drawn
• Ensuring that no incorrect inferences can be drawn
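Once categories are reified as objects, the ISA and AKO relations from the category examples above become ordinary facts that can be followed like links. A minimal sketch, assuming each object has one ISA link, each category has at most one AKO parent, and there are no cycles; the function name is illustrative.

```python
ISA = {"Joe's": "VegetarianRestaurant"}          # instance -> category
AKO = {"VegetarianRestaurant": "Restaurant"}     # category -> broader category

def categories(obj):
    """All categories an object belongs to, most specific first."""
    result = []
    category = ISA.get(obj)
    while category is not None:
        result.append(category)
        category = AKO.get(category)             # climb to the parent category
    return result

print(categories("Joe's"))
```

With categories as objects, a statement like MostPopular(Joe’s,VegetarianRestaurant) is also expressible, since its second argument is now a term rather than a relation.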
INCIDENT: DATE                  30 OCT 89
INCIDENT: LOCATION              EL SALVADOR
INCIDENT: TYPE                  ATTACK
INCIDENT: STAGE OF EXECUTION    ACCOMPLISHED
INCIDENT: INSTRUMENT ID
INCIDENT: INSTRUMENT TYPE
PERP: INCIDENT CATEGORY         TERRORIST ACT
PERP: INDIVIDUAL ID             "TERRORIST"
PERP: ORGANIZATION ID           "THE FMLN"
PERP: ORG. CONFIDENCE           REPORTED: "THE FMLN"
PHYS TGT: ID
PHYS TGT: TYPE
PHYS TGT: NUMBER
PHYS TGT: FOREIGN NATION
PHYS TGT: EFFECT OF INCIDENT
PHYS TGT: TOTAL NUMBER
HUM TGT: NAME
HUM TGT: DESCRIPTION            "1 CIVILIAN"
HUM TGT: TYPE                   CIVILIAN: "1 CIVILIAN"
HUM TGT: NUMBER                 1: "1 CIVILIAN"
HUM TGT: FOREIGN NATION
HUM TGT: EFFECT OF INCIDENT     DEATH: "1 CIVILIAN"
HUM TGT: TOTAL NUMBER
On October 30, 1989, one civilian was killed in a reported FMLN attack in El Salvador.
1. I ate
2. I ate a turkey sandwich
3. I ate a turkey sandwich at my desk
4. I ate at my desk
5. I ate lunch
6. I ate a turkey sandwich for lunch
7. I ate a turkey sandwich for lunch at my desk
- no fixed “arity” (problem for FOPC)
One possible solution
1. Eating1(Speaker)
2. Eating2(Speaker, TurkeySandwich)
3. Eating3(Speaker, TurkeySandwich, Desk)
4. Eating4(Speaker, Desk)
5. Eating5(Speaker, Lunch)
6. Eating6(Speaker, TurkeySandwich, Lunch)
7. Eating7(Speaker, TurkeySandwich, Lunch, Desk)
Meaning postulates are used to tie together the semantics of the predicates:
∀ w,x,y,z: Eating7(w,x,y,z) ⇒ Eating6(w,x,y)
Scalability issues again!
- Say that everything is a special case of Eating7 with some arguments unspecified: ∃w,x,y Eating(Speaker,w,x,y)
- Two problems again:
- Too many commitments (e.g., no eating except at meals: lunch, dinner, etc.)
- No way to individuate events:
∃w,x Eating(Speaker,w,x,Desk)
∃w,y Eating(Speaker,w,Lunch,y)
– cannot combine into ∃w Eating(Speaker,w,Lunch,Desk)
• ∃ w: Isa(w,Eating) ∧ Eater(w,Speaker) ∧ Eaten(w,TurkeySandwich) – equivalent to sentence 2.
• Reification:
– No need to specify fixed number of arguments for a given surface predicate
– No more roles are postulated than mentioned in the input
– No need for meaning postulates to specify logical connections among closely related examples
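The reified representation can be sketched as a set of binary facts about an event object, so the number of roles simply follows the input. The role names `MealEaten` and `PlaceEaten`, the generated event identifiers and the `make_event` helper are assumptions made for this example.

```python
import itertools

_event_ids = itertools.count(1)

def make_event(kind, **roles):
    """Create a fresh event constant and the facts that describe it."""
    w = f"w{next(_event_ids)}"
    facts = {("Isa", w, kind)}
    facts |= {(role, w, filler) for role, filler in roles.items()}
    return w, facts

# "I ate a turkey sandwich": only the roles mentioned in the input appear.
w_a, facts_a = make_event("Eating", Eater="Speaker", Eaten="TurkeySandwich")

# "I ate a turkey sandwich for lunch at my desk": same predicate vocabulary,
# two more role facts, and no Eating1 ... Eating7 variants are needed.
w_b, facts_b = make_event("Eating", Eater="Speaker", Eaten="TurkeySandwich",
                          MealEaten="Lunch", PlaceEaten="Desk")

print(sorted(facts_a))
```

Each event gets its own constant, so two statements about the same eating event can share it, which addresses the individuation problem of the single Eating predicate.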
1. I arrived in New York
2. I am arriving in New York
3. I will arrive in New York
• ∃ w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork)
• ∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes(e,Now)
• ∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now)
• ∃ i,s,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ StartPoint(i,s) ∧ Precedes(Now,s)
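The three temporal formulas above differ only in how the event's interval relates to Now. A minimal sketch, assuming time points are plain numbers, Precedes is strict "less than", and MemberOf means Now falls inside the interval; the class and function names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    start: float
    end: float

def tense(interval, now):
    """Classify an event interval relative to the current time."""
    if interval.end < now:        # EndPoint(i,e) ∧ Precedes(e,Now)
        return "past"
    if now < interval.start:      # StartPoint(i,s) ∧ Precedes(Now,s)
        return "future"
    return "present"              # MemberOf(i,Now)

now = 5
print(tense(Interval(1, 2), now), tense(Interval(4, 6), now), tense(Interval(7, 8), now))
```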
• We fly from San Francisco to Boston at 10.
• Flight 1390 will be at the gate an hour now.
– Use of tenses
• Flight 1902 arrived late.
• Flight 1902 had arrived late.
– “similar” tenses
• When Mary’s flight departed, I ate lunch
• When Mary’s flight departed, I had eaten lunch
– reference point
• Stative: I know my departure gate
• Activity: John is flying
– no particular end point
• Accomplishment: Sally booked her flight
– natural end point and result in a particular state
• Achievement: She found her gate
• Figuring out statives:
* I am needing the cheapest fare.
* I am wanting to go today.
* Need the cheapest fare!
• Want, believe, imagine, know - all introduce hypothetical worlds
• I believe that Mary ate British food.
• Reified example:
– ∃ u,v: Isa(u,Believing) ∧ Isa(v,Eating) ∧ Believer(u,Speaker) ∧ BelievedProp(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
However, this also implies:
– ∃ u,v: Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
• Modal operators:
– Believing(Speaker,Eating(Mary,BritishFood)) – not FOPC! Predicates in FOPC hold between objects, not between relations.
– Believes(Speaker, ∃ v: ISA(v,Eating) ∧ Eater(v,Mary) ∧ …
• Issues: If you are interested in baseball, the Red Sox are playing tonight.
Examples from Russell&Norvig (2)
• 7.3. p.214
• One more outburst like that and you'll be in contempt of court.
• Annie Hall is on TV tonight if you are interested.
• Either the Red Sox win or I am out ten dollars.
• The special this morning is ham and eggs.
• Maybe I will come to the party and maybe I won't.
• Well, I like Sandy and I don't like Sandy.