named entity recognition - acl 2011 presentation
DESCRIPTION
Given for the Multiword class.
TRANSCRIPT
The Web is not a PERSON, Berners-Lee is not an ORGANIZATION, and African-Americans are not LOCATIONS: An Analysis of the Performance of Named-Entity Recognition
Robert Krovetz (Lexicalresearch.com), Paul Deane, Nitin Madnani (ETS)
A Review by Richard Littauer (UdS)
The Background
Named-Entity Recognition (NER) is normally judged in the context of Information Extraction (IE).
Various competitions. Recently:
◦ non-English languages
◦ improving unsupervised learning methods
The Background
“There are no well-established standards for evaluation of NER.”
◦ Criteria for NER systems change between competitions
◦ Proprietary software
The Background
KDM wanted to identify MWEs… but false positives and tagging inconsistencies stopped this.
IE derives Recall and Precision from Information Retrieval.
NER is just a small part of this, so it is rarely evaluated independently.
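The IR-derived measures mentioned above can be sketched as set comparisons over (entity, tag) pairs. A minimal illustration; the gold and predicted entities below are invented for the example:

```python
# Precision/recall/F1 over predicted vs. gold entities, as IE
# evaluations borrow them from Information Retrieval.
# The example entities are hypothetical.

def prf(gold, predicted):
    """Compute precision, recall, and F1 over sets of (entity, tag) pairs."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # true positives: exact matches of span and tag
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {("Berners-Lee", "PERSON"), ("Web", "MISC"), ("MIT", "ORGANIZATION")}
pred = {("Berners-Lee", "ORGANIZATION"), ("Web", "MISC"), ("MIT", "ORGANIZATION")}
p, r, f = prf(gold, pred)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.67 0.67 0.67
```

Note that a mistagged entity (Berners-Lee as ORGANIZATION) costs both precision and recall, which is why NER errors buried inside an IE pipeline are easy to overlook.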
The Background
So, they want to test NER systems, and provide a unit test based on the problems encountered.
Evaluation
Compared three NER taggers:
Stanford:
◦ CRF, 100m training corpus
University of Illinois (LBJ):
◦ Regularized average perceptron, Reuters 1996 News Corpus
BBN IdentiFinder:
◦ HMMs, commercial
Evaluation
Agreement on Classification
Ambiguity in Discourse
◦ Stanford vs. LBJ on internal ETS 425m corpus
◦ All three on the American National Corpus
Stanford vs. LBJ
NER is reported as 85-95% accurate.
Roughly the same number of entities for both: 1.95m for Stanford, 1.8m for LBJ (a 7.6% difference).
However, errors:
Stanford vs. LBJ
Agreement:
Stanford vs. LBJ
Ambiguity:
Stanford vs. LBJ vs. IdentiFinder
Agreement:
Stanford vs. LBJ vs. IdentiFinder
Differences:
◦ How they are tokenized
◦ Number of entities recognized overall
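Pairwise agreement between taggers can be sketched as a token-level comparison. The two output dictionaries below are invented for illustration; a real comparison must first reconcile the tokenization differences noted above:

```python
# Token-level label agreement between two hypothetical tagger outputs.
# Real comparisons must first align tokenizations, which is one of
# the discrepancies between the systems compared here.

stanford = {"Berners-Lee": "PERSON", "Web": "MISC", "Amherst": "LOCATION"}
lbj = {"Berners-Lee": "PERSON", "Web": "ORGANIZATION", "Amherst": "ORGANIZATION"}

shared = stanford.keys() & lbj.keys()  # tokens both systems recognized
agreed = sum(1 for tok in shared if stanford[tok] == lbj[tok])
print(f"agreement: {agreed}/{len(shared)}")  # agreement: 1/3
```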
Stanford vs. LBJ vs. IdentiFinder
Ambiguity:
Unit Test
Created two documents that can be used as test texts:
◦ Different cases for true positives of PERSON, LOCATION, ORGANIZATION
◦ Entirely upper-case terms that are not NEs (e.g. AAARGH)
◦ Punctuated terms that are not NEs
◦ Terms with initials
◦ Acronyms (some expanded, some not)
◦ Last names in close proximity to first names
◦ Terms with prepositions (Mass. Inst. of Tech.)
◦ Terms with both a location and an organization (Amherst College)
Provided freely online.
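A unit test over such phenomena might be sketched as below. The tagger here is a trivial placeholder so the example runs, and the cases mirror two of the categories above; the paper's actual test documents are the ones provided online:

```python
# Sketch of a unit test for an NER tagger over the phenomena listed above.
# `tag` is a stand-in, not any of the systems compared; a real test
# would call Stanford, LBJ, or IdentiFinder instead.

def tag(text):
    """Placeholder tagger: looks up a toy lexicon; None means 'not an entity'."""
    lexicon = {"Amherst College": "ORGANIZATION", "AAARGH": None}
    return lexicon.get(text)

# (input, expected tag); None means the text should not be tagged at all
cases = [
    ("AAARGH", None),                      # all upper case, not a named entity
    ("Amherst College", "ORGANIZATION"),   # organization, despite the location inside
]

for text, expected in cases:
    assert tag(text) == expected, f"{text!r}: got {tag(text)!r}"
print("all cases passed")
```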
One NE Tag per Discourse
It is unusual for multiple occurrences of a token in a document to refer to different entities.
This is true even for homonyms. An exception: a location and its sports team.
Stanford and LBJ have features for non-local dependencies (NLD) to help with this.
KDM: two other uses for NLD:
◦ A source of error in evaluation
◦ A way to identify semantically related entities
These should be treated as exceptions.
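The one-tag-per-discourse heuristic can be sketched as a document-level majority vote over a token's occurrences. The tags below are invented for the example:

```python
# Collapse a token's per-occurrence tags to the document-level majority tag,
# a simple reading of the "one NE tag per discourse" heuristic.
from collections import Counter

def one_tag_per_discourse(occurrences):
    """Return the occurrence list with every tag replaced by the majority tag."""
    majority, _ = Counter(occurrences).most_common(1)[0]
    return [majority] * len(occurrences)

# Hypothetical tags for repeated mentions of "Chicago" in one document
tags = ["LOCATION", "LOCATION", "ORGANIZATION", "LOCATION"]
print(one_tag_per_discourse(tags))  # ['LOCATION', 'LOCATION', 'LOCATION', 'LOCATION']
```

The location/sports-team exception above is exactly the case where this vote goes wrong, so a real system should not apply it blindly.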
Discussion
There are guidelines for NER, but we need standards.
The community should focus on PERSON, ORGANISATION, LOCATION, and MISC:
◦ Harder to deal with than Dates and Times
◦ Disagreement between taggers
◦ MISC is necessary
◦ These have important value elsewhere
Discussion
To improve intrinsic evaluation for NER:
1. Create test sets for diverse domains.
2. Use standardized sets for different phenomena.
3. Report accuracy for POL separately.
4. Establish uncertainty in the tagging system.
Conclusion
The reported 90% accuracy is not real. We need to use only entities that are agreed on by multiple taggers.
Even cases where the taggers disagree are informative (hint: future work).
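Restricting evaluation to entities that all taggers agree on can be sketched as a set intersection. The three output sets below are hypothetical:

```python
# Keep only (entity, tag) pairs on which all three taggers agree exactly.
# The per-tagger output sets are invented for illustration.

stanford = {("Berners-Lee", "PERSON"), ("Web", "MISC")}
lbj = {("Berners-Lee", "PERSON"), ("Web", "ORGANIZATION")}
identifinder = {("Berners-Lee", "PERSON")}

agreed = stanford & lbj & identifinder
print(agreed)  # {('Berners-Lee', 'PERSON')}
```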
Unit test downloadable.
Cheers/PERSON
Richard/ORGANISATION thanks the Mword Class/LOCATION for listening to his talk about Berners-Lee/MISC