taas workshop 2014, term mining and terminology management in a corporate setting perspective, luigi...
DESCRIPTION
The time spent looking for and not finding information costs an organization a total of $6 million a year, not including opportunity costs or the costs of reworking existing information that could not be located. Only 41% of localization-mature organizations have some terminology management policy in place, almost solely translation-oriented. Today we will talk about how terminology management works and demonstrate its power through controlled languages, ontologies, search engine applications, content and knowledge management applications, and e-learning systems.
TRANSCRIPT
Wednesday, 4 June /10:50 – 11:20
Term Mining and Terminology Management in a Corporate Setting Perspective
Luigi Muzii, sQuid
TaaS Workshop 2014 4 June, Dublin (Ireland)
The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no 296312
Welcome Ivan Smolniov, ABBYY Language Services
Term Mining and Terminology Management A Corporate Setting Perspective
Awareness
Globally active organizations whose core business is not communications-related (translation, localization, information management, etc.) are generally unaware of the benefits of performing terminology management.
Kara Warburton, LISA, 2001
Translation-oriented terminology
Only 41% of localization-mature organizations have some terminology management policy in place, almost solely translation-oriented
Scope
• Technical documentation • Controlled languages • Translation and localization • Translation automation
• Content and Knowledge Management Systems • Knowledge organization • Taxonomies and ontologies
• Learning Management Systems • Knowledge nugget (knowledge representation) • Self-contained reusable educational entities (Learning Object Metadata, IEEE 1484.12.1)
• Marketing management • Customer service • SEM/SEO • Sentiment analysis
Integrations
Documentation
CMS Website
Marketing
Service & Support
LMS
CVS
Costs (IDC, 2004)
• Productivity of knowledge workers • 15% to 35% searching for information • Successfully completed 50% of the time or less • Only 21% found the information they needed 85% to 100% of the time
• $6 million a year looking for and not finding information
• 15% of time for duplicating existing information • Opportunity costs • Reworking existing information that could not be located
• $12 million a year
Terminology cost multiplier (Jörg Schütz/Rita Nübel)
Product data: 0.1-0.2
Documentation development: 0.5
Authoring: 1.0
Editing: 2.0
Approval: 5.0
Localization: 10.0
Maintenance: 20.0
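As a rough illustration of how the multiplier can be read, the sketch below scales a base cost by the stage at which a terminology problem is handled. It assumes that the seven stages and the seven values pair up in the order listed, that the authoring stage (1.0) is the reference point, and that the upper bound of the 0.1-0.2 range is used for product data. The function name and the base cost are hypothetical and are not taken from the slides.

```python
# Stage multipliers as listed on the slide.
# Assumption: stages and values are paired in the order they appear.
MULTIPLIERS = {
    "Product data": 0.2,  # slide gives a range of 0.1-0.2; upper bound used here
    "Documentation development": 0.5,
    "Authoring": 1.0,
    "Editing": 2.0,
    "Approval": 5.0,
    "Localization": 10.0,
    "Maintenance": 20.0,
}


def cost_at_stage(stage: str, base_cost: float) -> float:
    """Scale a base cost (cost at the authoring stage) by the stage multiplier."""
    return base_cost * MULTIPLIERS[stage]


if __name__ == "__main__":
    base = 100.0  # hypothetical cost unit, for illustration only
    for stage in MULTIPLIERS:
        print(f"{stage}: {cost_at_stage(stage, base):.1f}")
```

Under these assumptions, the same issue costs one hundred times more to handle at maintenance than at the product data stage.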
Costs/Benefits
• Huge costs in the short term • $150 per terminological entry (J.D. Edwards, 2001)
• The practical value does not match the technical value
Accuracy
Fundamental accuracy of statement is the one sole morality of writing.
Payback
• Cost reduction • Authoring, localization, training, customer service • Overhead • Time reduction in the production cycle • Immediate 1% payback for larger businesses
• Productivity increase • Time-to-market
• Qualitative improvements • Branding • Safety
Controlled languages
The most valuable of all talents is that of never using two words when one will do.
Fatal errors
• The Linate Airport disaster (Oct 8, 2001) • Deficiencies in the airport layout and procedures • Violations of ICAO regulations • Incorrect signs to runway
• Incorrect, uncorrected readback • Non-standard phraseology • Irrelevant term (extension) leading to fatal misunderstanding
Keywords advertising
Rem tene, verba sequentur (Keep to the subject, words will follow)
Marcus Porcius Cato (Cato the Censor)
The long tail
Rerum enim copia verborum copiam gignit (All this gives rise to a plethora of words)
Cicero
Term mining
• Complex knowledge-intensive task • Different approach for different scope
• Hard to grasp in a corporate setting perspective • Business intelligence
Mining terms
• Linguistic approach • Based on rules and dictionaries • Collocations • One language at a time
• Issues • Loans • Synonyms, variants, abbreviations • Ellipses • Improper usage
• Bitext • Knowledge bases • Knowledge discovery
• Statistical approach • Language independent • Based on frequency • Repeated sequences of syntagmas • The frequency threshold must be specified • Frequency does not necessarily mean importance • Much “noise”
• Monolingual corpus • Indices • Controlled languages • Keywords
• TQA
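The statistical approach described above can be sketched in a few lines: count repeated word sequences and keep those at or above a frequency threshold. This is a minimal illustration, not the method used by any of the tools named in this deck. The function name, the stopword list, and the default threshold are all assumptions chosen for the example.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on"}


def candidate_terms(text: str, max_len: int = 3, min_freq: int = 2):
    """Return (sequence, frequency) pairs for repeated word sequences.

    Sequences of 1..max_len tokens are counted; those that start or end
    with a stopword are skipped. The frequency threshold must be chosen
    by the user, as the slide points out.
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_freq]


if __name__ == "__main__":
    sample = (
        "Translation memory stores segments. "
        "The translation memory is reused. "
        "Term base and translation memory."
    )
    for term, freq in candidate_terms(sample):
        print(freq, term)
```

Because the sketch ignores sentence boundaries and any linguistic information, it also returns single words and fragments that are frequent but not terms, which is the "noise" the slide warns about.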
TaaS test drive
• Building a Localization Kit • 13688 words, 142 repetitions • memoQ Term Extraction • Statistical analysis • 815 term entries from the English document • 647 term entries from translation memory
• Tilde Wrapper System for CollTerm (TWSC) • Linguistic analysis enriched by statistical features • 3046 term entries
• Kilgray Terminology extractor • Statistical analysis • 3218 term entries
Terminology management in the cloud
Pros
• Zero TCO
• Availability and deployability • Collaboration features
Cons
• Limited scalability
• Security issues • Integration costs
ROI
The proof of performance, i.e. ROI considerations, of terminology management within the corporate setting is a challenge for future projects.
Stefan Kremer, 2005
Thank you