The Second International Workshop AIS-ADM-07
June 3-5, 2007, St. Petersburg, Russia
Ontos Solutions for Semantic Web: Text Mining, Navigation and Analytics
Vladimir Khoroshevsky, Computer Center RAS, 40 Vavilov str., GSP-1 Moscow, Russia
Irina Efimenko, Grigory Drobyazko, Polina Kananykina, Victor Klintsov, Dmitry Lisitsin, Viacheslav Seledkin, Anatoli Starostin, Vyacheslav Vorobyov
Ontos AG, 84/2 Vernadskogo Av., 119606 Moscow, Russia
AIS-ADM-07, St. Petersburg, Russia, June 3 - 5, 2007 Page 2
Agenda
Introduction
• Semantic Technologies Umbrella
Ontos Solutions for Semantic Web
• General View
• MAS Based Information-To-Knowledge Transformation
• Ontology Driven Text Mining & Web Mining
• Semantic Navigation and Analytics
Ontos Semantic Services Demos
• Portal MedTrust
• Semantic RSS and Semantic Navigation
Conclusion
• Challenges and Future Trends
Introduction. Semantic Technologies Umbrella
Semantic Web
MAS
“The Semantic Web will globalize KR, just as the WWW globalised hypertext”
Tim Berners-Lee
Introduction. Semantic Web and MAS
Ontos Solutions for Semantic Web. General View
Main R&D in Domain
• Ontology Driven Text Mining & Web Mining
• MAS Based Information-to-Knowledge Transformation
• RDF-storage Development and Implementation
• Semantic Navigation & Analytics
Daniel Hladky, CEO
Ontos International AG
Mittelstrasse 24, 2560 [email protected]: +41 79 353 50 43
Tel.: +41 32 332 82 70
Fax: +41 32 332 92 52
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Objectives
• Combining AI & IT experience within the NLP domain
• R&D in knowledge management
Approaches
• Use of IE technologies in NL text processing
• Enrichment of IE techniques with NLP on the basis of special linguistic models
• Representation of the meaning of NL texts in the form of cognitive maps
Results
• New generation of MIE systems
• Practical use of the results in commercial & non-commercial organizations
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Basic Principles
• Process those constructions that can be processed correctly, and do NOT process those that cannot yet be processed correctly
• Develop reusable components for multi-platform implementation
• Provide domain ontology-driven analysis
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Main Requirements
• Work with multilingual document collections (English, German and Russian texts)
• Work with monothematic document collections (first of all, the so-called “Business Duties” domain)
• Adequate processing of relevant objects and relations, according to the specific ontology
• Representation of processing results in the form of a cognitive map, which is a kind of semantic network
• Multi-platform implementation of all systems of the family
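As an illustration of the "cognitive map as a kind of semantic network" requirement, here is a minimal Python sketch. It is not the Ontos data model: the class name `CognitiveMap`, its methods, and the sample objects ("Example Corp", "Alice Example") are hypothetical. Only the relation labels `Employ` and `LocatedIn` are taken from the relation types listed in the slides.

```python
from collections import defaultdict


class CognitiveMap:
    """Tiny semantic network: typed objects joined by labeled, directed relations.

    Hypothetical illustration only; not the Ontos implementation.
    """

    def __init__(self):
        self.object_types = {}              # object name -> ontology class
        self.relations = defaultdict(set)   # source name -> {(label, target name)}

    def add_object(self, name, object_type):
        self.object_types[name] = object_type

    def add_relation(self, source, label, target):
        # Only objects that were registered beforehand may be linked.
        if source not in self.object_types or target not in self.object_types:
            raise KeyError("both ends of a relation must be known objects")
        self.relations[source].add((label, target))

    def neighbours(self, name, label=None):
        """Targets reachable from `name`, optionally filtered by relation label."""
        links = self.relations.get(name, ())
        return sorted(t for (l, t) in links if label is None or l == label)


cmap = CognitiveMap()
cmap.add_object("Example Corp", "Organization")
cmap.add_object("Alice Example", "Person")
cmap.add_object("Moscow", "Location")
cmap.add_relation("Example Corp", "Employ", "Alice Example")
cmap.add_relation("Example Corp", "LocatedIn", "Moscow")

print(cmap.neighbours("Example Corp"))
print(cmap.neighbours("Example Corp", label="LocatedIn"))
```

Keeping the object types separate from the relation edges mirrors the split between ontology classes and extracted relations; a real system would presumably persist both in an RDF store rather than in memory.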
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Domain Ontology “Business Duties”
Domain Ontology “MedTrust”
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
General Scheme of NL-text Processing
Web; doc, xls, pdf → Crawler, filters → plain text → OntosMiner™ → RDF-Store
RDF-Store:
• Oracle RDF Store
• MS SQL Server 2005
• InMemory DB
• IBM DB2
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Ontos Solution vs. GATE Solution
Applications
• Ontos Solution: Digester, Summarizer, LightOntos, Report Generator, Semantic Navigator, Semantic RSS
Additional Technological Components
• Ontos Solution: Dix (Dictionary SDK for developers), minDix (user-oriented Dictionary SDK), Cross-lingua XML-generator
• GATE Solution: Plugins
Main Modules
• Ontos Solution: Tokenizer, SentSplitter, Morph-tagger, POS-tagger, Gazetteer, Morph Gazetteer, NP-chunker, VP-chunker, Anaphora Resolver, NE-transducer, OrthoMatcher (rule-based), UnknownMatcher (rule-based), Minimizer, Semantic Tagger (on text), Semantic Tagger (by sent), XML-generator (model-driven)
• GATE Solution: Tokenizer, SentSplitter, POS-tagger, Gazetteer, NE-transducer, Coreferencer, OrthoMatcher (hardcoded)
Architecture
• Ontos Solution: Domain-based Chain
• GATE Solution: Creole-based Chain
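The module chains named on this slide (Tokenizer, SentSplitter, Gazetteer and so on) can be pictured as a sequence of processing steps that each enrich a shared document structure. The sketch below is a toy illustration under that assumption, not the OntosMiner or GATE API: the function names, the dictionary-based document, the regular expressions, and the gazetteer entries are all hypothetical.

```python
import re


def tokenizer(doc):
    # Split the text into word tokens and single punctuation marks.
    doc["tokens"] = re.findall(r"\w+|[^\w\s]", doc["text"])
    return doc


def sent_splitter(doc):
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", doc["text"].strip())
    doc["sentences"] = [p for p in parts if p]
    return doc


def make_gazetteer(entries):
    """Build a gazetteer module from a {token: entity type} lookup table."""
    def gazetteer(doc):
        doc["annotations"] = [
            (tok, entries[tok]) for tok in doc["tokens"] if tok in entries
        ]
        return doc
    return gazetteer


def run_chain(text, modules):
    """Run every module in order over one shared document dictionary."""
    doc = {"text": text}
    for module in modules:
        doc = module(doc)
    return doc


chain = [
    tokenizer,
    sent_splitter,
    make_gazetteer({"Moscow": "Location", "Ontos": "Organization"}),
]
result = run_chain("Ontos has an office in Moscow. It builds text mining tools.", chain)
print(result["sentences"])
print(result["annotations"])
```

Because each module only reads and extends the shared document, modules can be reordered or swapped per domain, which is one plausible reading of a "domain-based chain".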
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Extracted Named Entity Types
• People
• Organizations of various types (commercial, educational, etc.)
• Elections
• Mass Media and Media holdings
• Parties
• Government Structures
• Titles and JobTitles
• Scientific degrees
• Several kinds of Addresses
• Money
• Percent
• URL, e-mail, phone (international style)
• Locations
• Dates and Periods of Time
• Medicines, diseases, treatment methods, etc.
• Etc.
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Extracted Semantic Relation Types
Affiliate; Buy-Sell; Employ; Found; Graduate; Invest; JointVenture; Own; Rival; LocatedIn; EarnDegree
Reside; Belong; Investigate; Lobby; Participate; BeFriend; BeRelative; BePartner; Support; Medical Rels (be_indicated, etc.); etc.
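Extracted relations of the types listed on this slide are naturally stored as subject-relation-object triples, which is also the shape an RDF store expects. The following sketch assumes that representation; the `TripleStore` class, its methods, and the sample entities are hypothetical, and only a handful of relation names from the slide are whitelisted.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject relation object")

# A few of the relation types named in the slides; the full list is longer.
ALLOWED_RELATIONS = {"Employ", "Found", "Own", "LocatedIn"}


class TripleStore:
    """In-memory stand-in for an RDF store (illustration only)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, relation, obj):
        # Reject relation labels that the (toy) ontology does not define.
        if relation not in ALLOWED_RELATIONS:
            raise ValueError(f"unknown relation type: {relation}")
        self._triples.add(Triple(subject, relation, obj))

    def match(self, subject=None, relation=None, obj=None):
        """Return triples matching the given fields; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (relation is None or t.relation == relation)
            and (obj is None or t.object == obj)
        )


store = TripleStore()
store.add("Example Corp", "Employ", "Alice Example")
store.add("Bob Example", "Found", "Example Corp")
store.add("Example Corp", "LocatedIn", "Moscow")

for triple in store.match(subject="Example Corp"):
    print(triple)
```

Validating relation labels against the ontology at insertion time keeps the store consistent with the domain model; a production system would delegate both storage and pattern matching to the RDF backend.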
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Google Management Team
Ontos Solutions for Semantic Web. Ontology Driven Text Mining & Web Mining
Russian Text Processing Results
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
Information Processing Technological Cycle
Resources Crawling → Text Mining → Objects & Relations Identification & Merging → Data & Metadata Storing → Semantic Navigation & Intelligent Analytics
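The identification and merging step of this cycle has to decide when mentions found in different documents refer to the same object. The slides do not describe how Ontos does this, so the sketch below substitutes a deliberately naive rule (same normalized name and same object type) purely as a stand-in; the function names and sample mentions are hypothetical.

```python
def normalise(name):
    """Lower-case the name, drop full stops and collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").split())


def merge_mentions(mentions):
    """Group (doc_id, name, object_type) mentions into merged objects.

    Two mentions are merged when their normalized name and their type agree.
    Each merged object keeps every surface name and every source document.
    """
    merged = {}
    for doc_id, name, object_type in mentions:
        key = (normalise(name), object_type)
        entry = merged.setdefault(key, {"names": set(), "docs": set()})
        entry["names"].add(name)
        entry["docs"].add(doc_id)
    return merged


mentions = [
    ("doc1", "Example Corp.", "Organization"),
    ("doc2", "example corp", "Organization"),
    ("doc2", "Example Corp", "Person"),  # same string, different type: kept apart
]
merged = merge_mentions(mentions)
for key, entry in sorted(merged.items()):
    print(key, sorted(entry["docs"]))
```

Keeping the source documents on each merged object is what later allows a navigation card to list the relevant documents for an object.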
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
Ontos SOA General View
Diagram components: Internet, CRAWLERs, NLP SERVICE, OntosMiner (three instances), IDENTIFY & MERGE AGENTs, RDF-STORE, APPL SERVICEs
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
MAS Architecture
Diagram components:
• Sources and stores: Internet, Web Crawler (several instances), Doc Storage, Vocabs, Knowledge Base
• Brokers: Content Extraction Broker, Broker of Doc Processing, Broker of Doc Preprocessing, Broker of Doc Postprocessing, Broker of NE Transducing, Broker of Syntax, Broker of Semantic Tagger, Broker of Merging
• Agents (most shown in several instances): Agent Fragmentator, Tokenizer Agent, Morph Tagger Agent, Vocab Agent, VP Chunk Agent, NP Chunk Agent, NE Agent, SR Agent, Ortho Matcher Agent, Coref Agent, OWL Generator Agent, Merge Agent, Store Agent
Ontos Solutions for Semantic Web. MAS Based Information-To-Knowledge Transformation
Objects & Relationships Merging
Semantic Indexing
Ontos Solutions for Semantic Web. Semantic Navigation and Analytics
• Semantic Navigation
• On-the-Fly Digesting of Document Collections
• On-the-Fly Summarization of Documents & Document Collections in a Specified Target Language
• Visual Representation of the Meaning of Documents & Document Collections, and On-the-Fly Reporting on Demand
Ontos Solutions for Semantic Web. Semantic Navigation
Work Flow of Semantic Web Client
States: State-1, State-2, State-3, State-4
Actions:
• Choose an object of interest
• Open Object Navigation Card
• Open Relation Navigation Card
• Backward to previous point of semantic navigation
• Forward to next point
• Return to start point
• Hide/Visualize object attributes
• Change rank threshold
• Look through references
• Show navigation history, Generate digest
• Generate summary
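The backward, forward, and return-to-start actions of this work flow behave like a browser-style history over navigation points. The sketch below models just that part, under the assumption that opening a new card after going back discards the old forward branch (the slides do not say). The class name `NavigationHistory` and the card labels are hypothetical.

```python
class NavigationHistory:
    """Browser-style history over semantic navigation points (illustration only)."""

    def __init__(self, start):
        self._points = [start]
        self._pos = 0

    @property
    def current(self):
        return self._points[self._pos]

    def open(self, point):
        # Assumption: opening a card after going back drops the forward branch.
        del self._points[self._pos + 1:]
        self._points.append(point)
        self._pos += 1
        return self.current

    def back(self):
        # "Backward to previous point"; stays put at the start point.
        if self._pos > 0:
            self._pos -= 1
        return self.current

    def forward(self):
        # "Forward to next point"; stays put at the newest point.
        if self._pos < len(self._points) - 1:
            self._pos += 1
        return self.current

    def to_start(self):
        # "Return to start point".
        self._pos = 0
        return self.current

    def history(self):
        # "Show navigation history".
        return list(self._points)


nav = NavigationHistory("start")
nav.open("object card: Drug A")
nav.open("relation card: Drug A / Indicated To")
print(nav.back())
print(nav.forward())
print(nav.to_start())
print(nav.history())
```

The recorded history is also the natural input for the "Generate digest" action, since it lists exactly the objects and relations the user visited.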
Ontos Solutions for Semantic Web. Semantic Navigation
Object card
Object relations (How Interacts With, Indicated To, Contraindicated To, ...)
Relevant Docs
Ontos Solutions for Semantic Web. Cross Lingua Digesting & Summarization
General Work Flow
Diagram components: “Google Management” texts (English, Russian), OntosMiner, RDF Storage, Language-Independent Internal Representation, Semantic Digester, Semantic Summarizer
Ontos Solutions for Semantic Web. Semantic Digesting
Surf for Digesting
On-the-Fly Digesting of Document Collections
Ontos Solutions for Semantic Web. Semantic Summarization
“Report” summarization mode
Summary of a Document Collection on Demand
Ontos Semantic Services Demos. Portal MedTrust
Within the MedTrust portal you can order directly from a connected partner (drug store). The mashup function is built into the semantic navigation card.
Ontos Semantic Services Demos. Semantic RSS and Semantic Navigation
Conclusion. Main Challenges
Challenges of “Multi”
• Multi-source (adoption of the Semantic Web paradigm, e.g. for blogs, wikis, etc., because of their inherent multi-source nature)
• Multi-lingua (the collection under processing consists of texts in different languages)
• Multilingual (the text under processing consists of fragments in different languages)
Challenges of “Mono”
• Native-Language-Only Users (need information presented in different languages but understand only their native language)
Challenges of “Customizing”
• Domains (user needs relate to different domains)
• Ontologies (different domains and users need related ontologies)
• Adaptation (based on extension of ontologies and/or NLPs)
Challenges of “Back-End”
• Results representation (different domains and users need different representations of results)
• User-friendly interface
Challenges of “Dimension”
• Volume (terabytes of documents must be processed)
• Performance (NLP processing is a time-consuming task)
Conclusion. Future Trends
Solutions in “Multi” and “Mono”
• Multi-source, Multilingual Content Extraction
• Genre-Driven Content Extraction
• Cross-lingua Digesting and Summarization
Solutions in “Customizing”
• Ontologies Merging and Alignment
• User-Driven Acquisition and Management of Lexicons & Processing Rules
• Computer-Aided Language Processing
Solutions in “Back-End” and “Dimension”
• Representation of Results as Cognitive Maps
• Knowledge-Driven Semantic Navigation
• Multi-agent Architecture for Language Processing & Knowledge Management
• Grid Platform Usage
Conclusion. Outlook
Semantic Navigation with Analytics is available for a specified domain ontology
Pilots with first customers show that the frequency can be improved by such new technologies
Volume handling and performance
• At the moment it is rather difficult to imagine a “world ontology” that covers all possible topics and is supported by a super hardware infrastructure
Adding functions to the navigation card
• Social bookmarking
• Mashups
Semantic Search
• Extending domain ontologies to cover a wider range of people and areas of interest
• Social bookmarking / manual annotation
• Mashups: e.g. retrieving pictures for a named entity
Any Questions?
Thank You!