Digital Libraries: Theory and Development of Knowledge Architecture. Jian-Hua Yeh (葉建華)...
2
Outline
• Ontology
– The problem
– What is an ontology?
– Why develop an ontology?
– Usage of ontology
– Complexity and Processing of ontology
– OWL introduction
• Topic maps
– Concepts
3
The Problem
• With the increasing complexity of our systems and our IT needs, we need to go to human level interaction
• We need to maximize the amount of Semantics we can utilize
• From data and information level, we need to go to human semantic level interaction
[Figure: progression from Data (noise) through Information to Knowledge (human meaning). Raw codes such as "Run84", "ID=08", "ACC", "ID=34" are interpreted as "Tank", then as "Vehicle located at semi-mountainous terrain, obscured", which feeds a decision about a maneuver.]
• And represented semantics means multiple represented semantics, requiring semantic integration
4
Advancing Along the Interpretation Continuum
Interpretation Continuum: from human interpreted to computer interpreted
• DATA end: relatively unstructured, random
• KNOWLEDGE end: very structured, logical
Moving to the right depends on increasing automated semantic interpretation
Metadata richness along the continuum:
• Simple Metadata: XML
• Richer Metadata: RDF/S
• Very Rich Metadata: DAML+OIL
Example applications along the continuum:
• Info retrieval
• Web search
• Text summarization
• Content extraction
• Topic maps
• Reasoning services
• Ontology induction
• ...
Stages along the continuum:
• Display raw documents; all interpretation done by humans
• Find and correlate patterns in raw docs; display matches only
• Store and connect patterns via conceptual model (i.e., an ontology); link to docs to aid retrieval
• Automatically acquire concepts; evolve ontologies into domain theories; link to institution repositories (e.g., MII)
• Automatically span domain theories and institution repositories; inter-operate with fully interpreting computer
5
Dimensions of Interoperability & Integration
[Figure: Interoperability Scale running from 0% to 100%, plotted against 6 Levels of Interoperability and 3 Kinds of Integration. Labels in the figure: Data, Object, Component, Application, System, Enterprise, Community. Annotation: "Our interest lies here".]
6
Information Semantics
• Provide semantic representation (meaning) for our systems, our data, our documents, our agents
• Focus on machines more closely interacting at human conceptual level
• Spans Ontologies, Knowledge Representation, Semantic Web, Semantics in NLP, Knowledge Management
• Linking notion is Ontologies (rich formal models)
8
Triangle of Signification
• Corners: Terms, Concepts, Real (& Possible) World Referents
• Links: Sense, Reference/Denotation
• Example: the term “Joe” + “Montana” and the concept <Joe_Montana>
• Syntax: Symbols
• Semantics: Meaning
• Pragmatics: Use
• Intension, Extension
9
What is an Ontology?
• Many definitions of an ontology contradict one another.
• One formal definition
– A formal explicit description of concepts in a domain of discourse (classes), properties of each concept describing various features and attributes of the concept (slots), and restrictions on slots.
10
What is an Ontology? (2)
• Another definition
– The subject of ontology is the study of the categories of things that exist or may exist in some domain.
• A simple definition
– Ontology is about the exact description of things and their relationships.
• “An ontology is a specification of a conceptualization” [Gruber 95]
11
Ontology Background
Timeline (Smith 2002), on an axis from year 0 to 2000
• 384-322 BC: Aristotle
• 1613: ‘Ontology’ coined
• 1721: First occurrence in OED
• 1967: First occurrence of ontology in Information Science
15
Why Develop an Ontology?
• Semantic Interoperability
– Generalized database integration
– Virtual Enterprises
– e-commerce
• Information Retrieval
– Decoupling user vocabulary from data vocabulary
– Query answering over document sets
– Natural Language Processing
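As an illustration of decoupling user vocabulary from data vocabulary, here is a minimal Python sketch. The concept table, the user phrases, and the three documents are all invented for this example; a real system would draw them from an actual ontology and index.

```python
# Hypothetical mini-ontology: each concept lists the phrases users type
# and the terms that actually occur in the indexed data.
CONCEPTS = {
    "metal_cutting_machinery": {
        "user_terms": {"metal cutter", "cutting machine"},
        "data_terms": {"metal-cutting machinery", "milling insert", "turning insert"},
    },
}

# Hypothetical document collection.
DOCUMENTS = {
    "doc1": "Catalog of milling insert geometries",
    "doc2": "Art supplies price list",
    "doc3": "Turning insert wear report",
}

def expand_query(user_query):
    """Map a user phrase to the data vocabulary via the shared concept."""
    phrase = user_query.lower().strip()
    terms = set()
    for concept in CONCEPTS.values():
        if phrase in concept["user_terms"]:
            terms |= concept["data_terms"]
    # Fall back to the literal phrase when no concept matches.
    return terms or {phrase}

def search(user_query):
    """Return ids of documents containing any expanded data term."""
    terms = expand_query(user_query)
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if any(term in text.lower() for term in terms)
    )

results = search("metal cutter")
print(results)
```

The user never types "milling insert", yet documents indexed under that term are still found, because both vocabularies are tied to the same concept.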
16
Different Uses of Ontologies
• Application ontologies (run time)
– Offer terminological services, checking constraints between terms
– Limited expressivity (stringent computational requirements)
• Reference ontologies (development time)
– Establish consensus about meaning of terms (in general)
– Higher expressivity (less stringent computational requirements)
• Mutual understanding more important than mass interoperability
– Understanding disagreements
– Establish trustable mappings among application ontologies
17
Ontology Structure Levels
• The term ontology has been used to describe models with different degrees of structure (Ontology Spectrum)
– Less structure: Taxonomies (Semio taxonomies, Yahoo hierarchy, biological taxonomy), Database Schemas (many) and metadata schemes (ICML, ebXML, WSDL)
– More Structure: Thesauri (WordNet, CALL, DTIC), Conceptual Models (OO models, UML)
– Most Structure: Logical Theories (Ontolingua, TOVE, CYC, Semantic Web)
• Ontologies are usually expressed in a logic-based language
– Enabling detailed, sound, meaningful distinctions to be made among the classes, properties, & relations
– More expressive meaning but maintain “computability”
• Using ontologies, tomorrow's applications can be "intelligent"
– Work at the human conceptual level
18
Ontology: General Picture at Object Level
• Upper Ontology (Generic Common Knowledge)
• Middle Ontology (Domain-spanning Knowledge)
• Lower Ontology (individual domains)
• Lowest Ontology (sub-domains)
Example nodes, from general to specific: Most General Thing; Products/Services, Processes, Organizations, Locations; Metal Parts, Art Supplies; Washers
E-commerce Area of Interest: "Mostly This", "But Also This!"
21
How to Build an Ontology
Steps:
• Determine the domain and scope of ontology
• Consider reusing existing ontologies
• Enumerate important terms in the ontology
• Define classes and the class hierarchy
• Define the properties of the classes (slots)
• Define the facets of the slots (cardinality, value-type)
• Create instances
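The later steps (classes and hierarchy, slots, facets, instances) can be sketched in a few lines of Python. This is only an illustrative data structure, not an ontology language: the names `Slot`, `OntClass`, and `create_instance` are made up for this sketch, and the facets are limited to value type and cardinality.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    """A property of a class, with facets restricting its values."""
    name: str
    value_type: type      # facet: value type
    min_card: int = 0     # facet: minimum cardinality
    max_card: int = 1     # facet: maximum cardinality

@dataclass
class OntClass:
    """A class in the hierarchy; slots are inherited from the parent."""
    name: str
    parent: "OntClass | None" = None
    slots: list = field(default_factory=list)

    def all_slots(self):
        inherited = self.parent.all_slots() if self.parent else []
        return inherited + self.slots

def create_instance(cls, **values):
    """Create an instance (a dict) after checking every slot's facets."""
    for slot in cls.all_slots():
        raw = values.get(slot.name, [])
        items = raw if isinstance(raw, list) else [raw]
        if not slot.min_card <= len(items) <= slot.max_card:
            raise ValueError(f"{slot.name}: cardinality {len(items)} not allowed")
        for item in items:
            if not isinstance(item, slot.value_type):
                raise TypeError(f"{slot.name}: expected {slot.value_type.__name__}")
    unknown = set(values) - {s.name for s in cls.all_slots()}
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    return {"instance_of": cls.name, **values}

# Class hierarchy with slots and facets (example names are hypothetical).
machinery = OntClass("MetalWorkingMachinery", slots=[Slot("material", str, min_card=1)])
insert = OntClass("MillingInsert", parent=machinery, slots=[Slot("length_mm", float)])

# Create an instance; the inherited "material" slot is required.
item = create_instance(insert, material="HG_Steel", length_mm=12.5)
print(item)
```

Omitting `material` raises a `ValueError`, because the facet `min_card=1` on the inherited slot is violated.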
22
Benefits of Building Ontologies
Building an ontology is not a goal in itself.
Knowledge sharing and reuse:
• Communication between people
• Interoperability between software agents
• Reuse of domain knowledge
• Make domain knowledge explicit
• Analyze domain knowledge
23
The benefits: Modularisation
Bridging Scales and context with Ontologies
• Building blocks: Genes, Species, Protein, Function, Disease
• Gene in humans
• Protein coded by gene in humans
• Function of protein coded by gene in humans
• Disease caused by abnormality in function of protein coded by gene in humans
24
Thesaurus vs. Ontology
Thesaurus: Term Semantics (Weak)
• Controlled Vocabulary: Terms
• ‘Semantic’ Relations:
– Equivalent (=)
– Used For (Synonym) (UF)
– Broader Term (BT)
– Narrower Term (NT)
– Related Term (RT)
• Terms: Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning equipment, metal-milling equipment, milling insert, turning insert, etc.
• Relations: use, used-for, broader-term, narrower-term, related-term
Ontology: Logical-Conceptual Semantics (Strong)
• Logical Concepts: Concepts, Terms, Real (& Possible) World Referents
• Semantic Relations:
– Subclass Of
– Part Of
– Arbitrary Relations
– Meta-Properties on Relations
• Entities: Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning equipment, metal-milling equipment, milling insert, turning insert, etc.
• Relations: subclass-of; instance-of; part-of; has-geometry; performs; used-on; etc.
• Properties: geometry; material; length; operation; UN/SPSC-code; ISO-code; etc.
• Values: 1; 2; 3; “2.5 inches”; “85-degree-diamond”; “231716”; “boring”; “drilling”; etc.
• Axioms/Rules: If milling-insert(X) & operation(Y) & material(Z)=HG_Steel & performs(X, Y, Z), then has-geometry(X, 85-degree-diamond).
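One way to see the difference between a thesaurus link and an ontology relation is that subclass-of carries logical consequences such as transitivity. The following Python sketch follows asserted subclass-of links to infer every superclass; the dictionary layout and function names are assumptions made for illustration, and the class names reuse the slide's example terms.

```python
# Asserted subclass-of facts (child -> parent), using the slide's terms.
SUBCLASS_OF = {
    "milling insert": "metal-cutting machinery",
    "metal-cutting machinery": "metal working machinery",
}

def superclasses(cls):
    """Follow subclass-of links upward; transitivity licenses every step."""
    found = []
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:  # guard against accidental cycles in the data
            break
        seen.add(cls)
        found.append(cls)
    return found

def is_a(cls, ancestor):
    """True if ancestor is reachable from cls through subclass-of links."""
    return ancestor in superclasses(cls)

print(superclasses("milling insert"))
print(is_a("milling insert", "metal working machinery"))
```

Nothing in the data states directly that a milling insert is metal working machinery; the conclusion follows from the semantics of the relation. A thesaurus BT/NT pair, by contrast, is only a navigation hint and does not by itself justify such an inference.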
25
Ontology Spectrum: One View
From weak semantics to strong semantics:
• Taxonomy: Is Sub-Classification of
• Thesaurus: Has Narrower Meaning Than
• Conceptual Model: Is Subclass of
• Logical Theory: Is Disjoint Subclass of with transitivity property
Representations placed along the spectrum: Relational Model, XML; ER; Extended ER; DB Schemas, XML Schema; RDF/S; XTM; UML; Description Logic; DAML+OIL, OWL; First Order Logic; Modal Logic
Interoperability levels: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability
Source: Obrst, L. 2004
26
Ontology Spectrum: One View (cont.)
From weak semantics to strong semantics:
• Taxonomy: Is Sub-Classification of
• Thesaurus: Has Narrower Meaning Than
• Conceptual Model: Is Subclass of
• Logical Theory: Is Disjoint Subclass of with transitivity property
Representations placed along the spectrum: Relational Model, XML; ER; Extended ER; DB Schemas, XML Schema; RDF/S; XTM; UML; Description Logic; DAML+OIL, OWL; First Order Logic; Modal Logic
Annotations on the spectrum:
• Problem: Local; Semantic Expressivity: Low
• Problem: General; Semantic Expressivity: Medium
• Problem: Local; Semantic Expressivity: High
• Problem: Very General; Semantic Expressivity: Very High
Interoperability levels: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability
Source: Obrst, L. 2004
28
Emerging XML Stack Architecture for the Semantic Web + Grid + Agents
• Semantic Brokers
• Intelligent Agents
• Advanced Applications
• Use, Intent: Pragmatics
• Trust: Proof + Security + Identity
• Reasoning/Proof Methods
• OWL: Ontologies
• RDF Schema: Ontologies
• RDF: Instances (assertions)
• XML Schema: Encodings of Data Elements & Descriptions, Data Types, Local Models
• XML: Base Documents
• Grid & Semantic Grid: New System Services, Intelligent QoS
Sem-Grid Services Water, LISP?
Layers of the stack:
• Syntax: Data (XML)
• Structure (XML Schema)
• Semantics (RDF/RDF Schema)
• Higher Semantics (OWL)
• Reasoning/Proof (Inference Engine)
• Trust (Security/Identity)
• Use, Intent (Pragmatic Web)
• Intelligent Domain Services, Applications (Agents, Brokers, Policies)
30
What Problems Do Ontologies Help Solve?
• Heterogeneous database problem
– Different organizational units, Service Needers/Providers have radically different databases
– Different syntactically: what’s the format?
– Different structurally: how are they structured?
– Different semantically: what do they mean?
– They all speak different languages (access, description, schemas, meaning)
– Integration: rather than an N² problem, a single, adequate ontology reduces it to N
• Enterprise-wide system interoperability problem
– Currently: system-of-systems, vertical stovepipes
– Ontologies act as conceptual model representing enterprise consensus semantics
• Relevant document retrieval/question-answering problem
– What is the meaning of your query?
– What is the meaning of documents that would satisfy your query?
– Can you obtain only meaningful, relevant documents?
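The integration point in the heterogeneous database item can be made concrete by counting mappings. The sketch below assumes each direct database-to-database translation is counted separately in each direction; if mappings were bidirectional, the pairwise count would be halved, but it would still grow quadratically.

```python
def pairwise_mappings(n):
    """Directed mappings needed when every database maps to every other one."""
    return n * (n - 1)

def hub_mappings(n):
    """Mappings needed when each database maps once to a shared ontology."""
    return n

# Compare the two approaches as the number of databases grows.
for n in (2, 5, 10, 50):
    print(n, pairwise_mappings(n), hub_mappings(n))
```

With 10 databases this gives 90 direct mappings against 10 mappings to the ontology, which is the practical meaning of reducing an N² problem to N.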
31
OWL: Web Ontology Language
• OWL is built on top of RDF
• OWL is for processing information on the web
• OWL was designed to be interpreted by computers
• OWL was not designed for being read by people
• OWL is written in XML
• OWL has three sublanguages
• OWL is a web standard
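Since OWL is built on RDF and written in XML, a small OWL document can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a hand-written fragment; the three class names are invented for illustration, and this is plain XML processing, not an RDF or OWL reasoner.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, RDF Schema, and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# A tiny hand-written OWL fragment in RDF/XML (class names are hypothetical).
OWL_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="MetalWorkingMachinery"/>
  <owl:Class rdf:ID="MetalCuttingMachinery">
    <rdfs:subClassOf rdf:resource="#MetalWorkingMachinery"/>
  </owl:Class>
  <owl:Class rdf:ID="MillingInsert">
    <rdfs:subClassOf rdf:resource="#MetalCuttingMachinery"/>
  </owl:Class>
</rdf:RDF>"""

def read_classes(xml_text):
    """Return {class_id: [parent ids]} for the owl:Class elements."""
    root = ET.fromstring(xml_text)
    classes = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        class_id = cls.get(f"{{{RDF}}}ID")
        parents = [
            sub.get(f"{{{RDF}}}resource").lstrip("#")
            for sub in cls.findall(f"{{{RDFS}}}subClassOf")
        ]
        classes[class_id] = parents
    return classes

hierarchy = read_classes(OWL_DOC)
print(hierarchy)
```

The fragment shows the layering: `rdf:` and `rdfs:` vocabulary comes from RDF and RDF Schema, while `owl:Class` is added by OWL.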
32
Why OWL?
• OWL is a part of the "Semantic Web Vision" - a future where:
– Web information has exact meaning
– Web information can be processed by computers
– Computers can integrate information from the web
33
Origins of OWL
• DAML = DARPA Agent Markup Language
• OIL = Ontology Inference Layer
• DAML and OIL were combined into DAML+OIL, which led to OWL
• All were influenced by RDF
• OWL is now on track to become a W3C Recommendation!
34
OWL Sublanguages
• OWL has three sublanguages:
– OWL Lite
– OWL DL (includes OWL Lite)
– OWL Full (includes OWL DL)
35
OWL is Different from RDF
• OWL and RDF are much the same thing, but OWL is a stronger language with greater machine interpretability than RDF.
• OWL comes with a larger vocabulary and stronger syntax than RDF.
37
Where is the Technology Going
• “The Semantic Web is very exciting, and now just starting off in the same grassroots mode as the Web did 10 years ago ... In 10 years it will in turn have revolutionized the way we do business, collaborate and learn.”
– Tim Berners-Lee, CNET.com interview, 2001-12-12
• We can look forward to:
– Semantic Integration/Interoperability, not just data interoperability
– Applications with trans-community semantics
– Device interoperability in the ubiquitous computing future: achieved through semantics & contextual awareness
– True realization of intelligent agent interoperability
– Intelligent semantic information retrieval & search engines
– Next generation electronic commerce/business & web services
– Semantics beginning to be used once again in NLP: information extraction becomes knowledge extraction
Key to all of this is effective & efficient use of explicitly represented semantics (ontologies)!
38
What do we want the future to be?
• 2100 A.D: models, models, models
• There are no human-programmed programming languages
• There are only Models
Ontological Models
Knowledge Models
Belief Models
Application Models
Presentation Models
Target Platform Models
Transformations, Compilations
Executable Code
INFRASTRUCTURE
39
Ontology Example from Electronic Commerce: the general domain of machine tooling & manufacturing. Note that these are expressed in English, but usually would be expressed in a logic-based language.
Concept: Example
• Classes (general things): Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning equipment, metal-milling equipment, milling insert, turning insert, etc.
• Instances (particular things): An instance of metal-cutting machinery is the “OKK KCV 600 15L Vertical Spindle Direction, 1530x640x640mm 60.24"x25.20"x25.20" X-Y-Z Travels Coordinates, 30 Magazine Capacity, 50 Spindle Taper, 20kg 44 lbs Max Tool Weight, 1500 kg 3307 lbs Max Loadable Weight on Table, 27,600 lbs Machine Weight, CNC Vertical Machining Center”
• Relations (subclass-of (kind-of), instance-of, part-of, has-geometry, performs, used-on, etc.): A kind of metal working machinery is metal cutting machinery. A kind of metal cutting machinery is milling insert.
• Properties: Geometry, material, length, operation, ISO-code, etc.
• Values: 1; 2; 3; “2.5 inches”; “85-degree-diamond”; “231716”; “boring”; “drilling”; etc.
• Rules: If milling-insert(X) & operation(Y) & material(Z)=HG_Steel & performs(X, Y, Z), then has-geometry(X, 85-degree-diamond). [Meaning: if you need to do milling on High Grade Steel, then you need to use a milling insert (blade) which has an 85-degree diamond shape.]
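The rule above can be read as a simple forward inference over a set of facts. The Python sketch below is one possible reading, assuming `material(Z)=HG_Steel` means that Z is the material HG_Steel; the individuals `insert_42` and `insert_7` and the tuple encoding of facts are invented for illustration.

```python
# Facts as tuples: (predicate, arguments...). Individuals are hypothetical.
FACTS = {
    ("milling-insert", "insert_42"),
    ("milling-insert", "insert_7"),
    ("operation", "milling"),
    ("material", "HG_Steel"),
    ("material", "aluminium"),
    ("performs", "insert_42", "milling", "HG_Steel"),
    ("performs", "insert_7", "milling", "aluminium"),
}

def apply_geometry_rule(facts):
    """Derive has-geometry facts for inserts that perform an operation on HG_Steel."""
    derived = set()
    for fact in facts:
        if fact[0] != "performs":
            continue
        _, x, y, z = fact
        if (
            ("milling-insert", x) in facts
            and ("operation", y) in facts
            and ("material", z) in facts
            and z == "HG_Steel"
        ):
            derived.add(("has-geometry", x, "85-degree-diamond"))
    return derived

new_facts = apply_geometry_rule(FACTS)
print(new_facts)
```

Only `insert_42` receives the derived geometry, because `insert_7` works on a different material. A logic-based ontology language would state the rule declaratively and leave this matching to an inference engine.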
41
Topic Maps Introduction
• Goal: organize information for navigation
• Topic Maps are the online equivalent of printed indexes
• Topic Maps are a powerful way to manage link information such as glossaries, cross-references, thesauri, and catalogs; they enable the merging of structured and unstructured information.
43
Objects and Their Metadata
• What is metadata?
• Metadata as a finding aid
• Subjects and precision
44
Subject-based Classification
• Controlled vocabularies
• Taxonomies
• Thesauri
• Faceted classification
• Ontologies
• Other subject-based techniques
45
Topic Maps Concepts
• Topic
– A topic is a multi-headed link that points to all its occurrences
– Topic occurrence
– A topic type is a category to which a given topic instance belongs ("person", "city", "product", etc.)
– Topic name: base name, display name, sort name
46
Topic Maps Concepts (2)
• Types
– is-a relationships
• Occurrences
– Relate topics to the information they are relevant to
47
Topic Maps Concepts (3)
• Association
– Topics can be related together through an association expressing a given semantics
– Describes relationships
• Facet
– Multiple facets can be applied to view the topic in different ways
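The core concepts (topics, types, occurrences, associations) can be sketched as a small in-memory model. The class and method names below are assumptions made for illustration and do not follow any particular Topic Maps API; facets and name variants are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Topic:
    topic_id: str
    base_name: str
    topic_type: Optional[str] = None                 # id of the topic acting as its type
    occurrences: list = field(default_factory=list)  # locators of relevant resources

@dataclass
class Association:
    assoc_type: str
    members: dict  # role name -> topic id

@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    associations: list = field(default_factory=list)

    def add_topic(self, topic):
        self.topics[topic.topic_id] = topic

    def instances_of(self, type_id):
        """Ids of topics whose type is the given topic (is-a relationship)."""
        return sorted(t.topic_id for t in self.topics.values() if t.topic_type == type_id)

    def related(self, topic_id):
        """(association type, other topic id) pairs for a topic."""
        pairs = []
        for assoc in self.associations:
            if topic_id in assoc.members.values():
                for other in assoc.members.values():
                    if other != topic_id:
                        pairs.append((assoc.assoc_type, other))
        return pairs

# Example data: two topic types, two typed topics, one association.
tm = TopicMap()
tm.add_topic(Topic("person", "Person"))
tm.add_topic(Topic("city", "City"))
tm.add_topic(Topic("puccini", "Giacomo Puccini", topic_type="person",
                   occurrences=["http://example.org/puccini.html"]))
tm.add_topic(Topic("lucca", "Lucca", topic_type="city"))
tm.associations.append(Association("born-in", {"person": "puccini", "place": "lucca"}))

print(tm.instances_of("person"))
print(tm.related("puccini"))
```

Note that types are themselves topics, so "person" and "city" live in the same map as the topics they classify.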
50
XTM Element Types
• <topicRef>: Reference to a Topic element
• <subjectIndicatorRef>: Reference to a Subject Indicator
• <scope>: Reference to Topic(s) that comprise the Scope
• <instanceOf>: Points to a Topic representing a class
• <topicMap>: Topic Map document element
• <topic>: Topic element
• <subjectIdentity>: Subject reified by Topic
• <baseName>: Base Name of a Topic
• <baseNameString>: Base Name String container
• <variant>: Alternate forms of Base Name
• <variantName>: Container for Variant Name
• <parameters>: Processing context for Variant
• <association>: Topic Association
• <member>: Member in Topic Association
• <roleSpec>: Points to a Topic serving as an Association Role
• <occurrence>: Resources regarded as an Occurrence
• <resourceRef>: Reference to a Resource
• <resourceData>: Container for Resource data
• <mergeMap>: Merge with another Topic Map
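A few of these element types can be seen together in a small XTM fragment. The sketch below parses a hand-written document with Python's standard `xml.etree.ElementTree`; the topic ids and the example.org URL are invented, and the code is a plain XML reader rather than a conforming Topic Maps processor.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XTM 1.0 documents.
XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# A tiny hand-written topic map (ids and URL are hypothetical).
XTM_DOC = f"""<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="city">
    <baseName><baseNameString>City</baseNameString></baseName>
  </topic>
  <topic id="lucca">
    <instanceOf><topicRef xlink:href="#city"/></instanceOf>
    <baseName><baseNameString>Lucca</baseNameString></baseName>
    <occurrence><resourceRef xlink:href="http://example.org/lucca.html"/></occurrence>
  </topic>
</topicMap>"""

def read_topics(xml_text):
    """Return {topic id: {name, type, occurrences}} from an XTM string."""
    ns = {"xtm": XTM}
    href = f"{{{XLINK}}}href"
    topics = {}
    for topic in ET.fromstring(xml_text).findall("xtm:topic", ns):
        name = topic.findtext("xtm:baseName/xtm:baseNameString", namespaces=ns)
        type_ref = topic.find("xtm:instanceOf/xtm:topicRef", ns)
        topics[topic.get("id")] = {
            "name": name,
            "type": type_ref.get(href).lstrip("#") if type_ref is not None else None,
            "occurrences": [
                ref.get(href)
                for ref in topic.findall("xtm:occurrence/xtm:resourceRef", ns)
            ],
        }
    return topics

topics = read_topics(XTM_DOC)
print(topics["lucca"])
```

The fragment uses `<topicMap>`, `<topic>`, `<instanceOf>`, `<topicRef>`, `<baseName>`, `<baseNameString>`, `<occurrence>`, and `<resourceRef>` from the list above.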
51
The Comparison
• Traditional classifications in topic maps
• Merging metadata and classification
• Benefits and costs
• Searching
• Schemas
• Identity and merging