

“Achieving Information Resources Empowerment: A Digital Library and Knowledge Management Perspective”

The University of Arizona, Tucson, Arizona

Hsinchun Chen (陳炘鈞), Ph.D.
McClelland Professor of MIS, University of Arizona

• PI, NSF DLI-1, DLI-2, NSDL; Director, Artificial Intelligence Lab
• PI, NSF Digital Government Program, ITR
• Director, Hoffman E-Commerce Lab; PI, SAP, HP research programs; Founder, Knowledge Computing Corp.

Digital Library: Overview

Introduction

• The Internet is changing the way we live and do business.
• Opportunities for libraries, governments, and businesses: to better deliver their contents and services and interact with their many constituents (citizens, patrons, businesses, and other government partners).
• Exciting and innovative transformation could occur with the new technologies and practices: in addition to providing information, communication, and transaction services.
• Review and comparison: but with more focus on digital libraries, plus some examples and case studies.

Digital Library: Characteristics

• No need to leave the home or office: information is now readily available online via digital gateways furnished by a wide variety of information providers.
• Information is multimedia: electronically available in a wide variety of formats, many of which are large, complex (e.g., video and audio), and often integrated.

• Interface to the Web has evolved from browsing to searching: but the commercial technology has remained largely unchanged from its roots in the 1960s. New research presents new opportunities.

• Social impact matters as much as technological advancement: DL projects need to examine the broad social, economic, legal, ethical, and cross-cultural contexts and impacts.

DLI-1, DLI-2, NSDL, JCDL, ECDL, and ICADL: Towards Building A Global Digital Library

• NSF Digital Library Initiative Phase 1 (DLI-1), 1994-1998
– NSF CISE/IIS Special Program, $24M; NSF, DARPA, NASA funding; six projects: Stanford, Berkeley, UCSB, Michigan, CMU, UIUC.
– Technology focus, new and rich library content; bi-annual site visits and project meetings. Special activities: IEEE Computer, CACM, JASIS special issues, and many books and book chapters.

• NSF Digital Library Initiative Phase 2 (DLI-2), 1998-2003
– NSF CISE/IIS Special Program, $60M, 1998-; NSF, DARPA, NLM, LoC, NASA, NEH; 20+ projects: Stanford, Berkeley, UCSB, CMU, Arizona, and many others.
– Strong focus on integration of technologies, contents, and service. Annual NSF all-PI meeting with JCDL.

• National Science Digital Library (NSDL), 2000-
– NSF CISE/IIS Special Program, $45M, 60+ projects: strong education focus in many different application domains.
– Annual NSF all-PI meeting in DC. Core Integration effort: Cornell (Open Archives Initiative), UCAR, U. Mass., etc.
• Joint Conference on Digital Libraries (JCDL), 1996-
– ACM DL Conferences and IEEE DL Conferences, 1996-2000.
– JCDL 2001, Virginia, E. Fox; JCDL 2002, Oregon, G. Marchionini; NSF DLI-2 all-PI meeting held after JCDL.

• European Conference on Digital Libraries (ECDL), 1997-
– Many working group meetings held in different DL sub-areas.
• International Conference on Asian Digital Libraries (ICADL), 1998-
– ICADL 1998, Hong Kong, J. Yen; ICADL 1999, Taipei, Taiwan, Hsueh-hua Chen; ICADL 2000, Seoul, Korea, Key-Sun Choi; ICADL 2001, Bangalore, India, Shalini R. Urs (600 people); ICADL 2002, Singapore, S. Foo and E. Lim (450 people); ICADL 2003, KL, Malaysia, Baba
– Local content, cultural heritage, education and deployment, multilingual retrieval, other new technologies; many other national programs: China, India, Russia, Japan, etc.

Digital Library: Challenges

• Cultural and historical heritage: Many digital library and museum collections contain artifacts that are fragile, precious, and of historical significance.

• Heterogeneity of content and media types: Digital library collections have the widest range of content and media types, ranging from 3D chemical structures to tornado simulation models, from the statue of David to paintings of Van Gogh.


• Intellectual property issues: Unlike digital government or e-commerce applications that often derive their own content, digital libraries provide content management and retrieval services to many different information creators.

• Cost and sustainability issues: Many patrons often would like library services to be “free” or at least extremely affordable.

• Universal access and international collaboration: Digital library content is often of interest to not just people in one region, but possibly all over the world.

Digital Government and E-Commerce: Overview

Digital Government: Characteristics

• Multi-faceted roles of Federal Government: Government as a major user of information technologies, a collector and maintainer of very large data sets, and a provider of critical and often unique information services to individuals, states, businesses, and other customers.

• Potential for nearly ubiquitous access: to government information services by citizens/customers.
• Re-inventing the government: Enhancements derived from new information technology-based services can be expected to contribute to reinvented and economical government services, and more productive government employees.

Digital Government: US Government Goes Electronic

• 1986 Brooks Act amended: reducing government costs through volume buying, including IT purchases.
• 1996 Information Technology Management Reform Act: establishing the CIO position to manage IT resources.
• 1998 WebGov portal: announced in August 1998; it failed and was replaced by the FirstGov portal after a technology donation from Inktomi.
• 2000 Federal Rehabilitation Act: requiring all IT products be accessible to the disabled.

• 2000 FirstGov portal unveiled in June 2000.

• 2001 National Security Telecommunications and Information Systems Security Policy No. 11: mandating that off-the-shelf software used in defense be evaluated by an approved third party (NSA).
• 2001 Health Insurance Portability and Accountability Act (HIPAA): requiring health care information to be in compliance with privacy regulations.
• 2002 E-Government Act: funding additional e-government initiatives and creating the Office of Electronic Government.

Digital Government: Research Programs

• NSF DG Program, 1998- : areas such as law enforcement information sharing, citizen access to government statistical data, and comprehensive emergency management; Digital Government Research Center (DGRC) and annual NSF-sponsored Digital Government Conference (DG.O)

• EU: areas such as: online public service for information content; politics, e-democracy, e-voting; transactions, security, and digital signatures for e-government.

• Other regions: Many ongoing e-government (G2C) initiatives have also emerged in Asia and Pan-Pacific countries such as: China, Singapore, Japan, Korea, India, New Zealand, Australia, etc. E-government projects in Latin American countries have also been reported.

Digital Government: Challenges

• E-commerce is not at the heart of e-government: The core task of government is governance, the job of regulating society, not marketing and sales.
• Organizational and cultural inertia: Most government entities are not known for their efficiency or willingness to adopt changes.
• Government laws and legal regulations: Although well-intended, such laws and regulations often inhibit innovation or thinking “out-of-the-box.”
• Security and privacy issues: Government-provided services have an extra burden of guaranteeing security and privacy for citizens.
• Disparate and out-dated information infrastructure and systems: Many government departments at all levels often face budget shortfalls for years.
• Lack of IT funding and personnel: Some government units (local, state, and federal) are affluent, but most are not. IT spending often is not a priority.

E-Commerce: Characteristics

• Business/commercial initiatives: From Fortune 500 companies to Internet start-ups, from self-funded dotcoms to ventures funded by influential VCs (unlike digital library or digital government research).

• Quick evolution and extensive coverage in many magazines and newspapers: Business Process Re-Engineering (BPR), Total Quality Management (TQM), Enterprise Resource Planning (ERP), Supply-Chain Management (SCM), Knowledge Management (KM), Customer Relationship Management (CRM), etc.

E-Commerce: Challenges

• Internet time or library/government time: In a competitive business environment, “Internet time” often demands that a business act on its instinct and take risks.

• Build it, but will they come: With the intense business pressure to perform and significant injection of funding (at least before the Internet bubble burst), many companies invested significantly in major Internet-based e-commerce infrastructure and product initiatives.

• True innovations or marketing hype: With fast-moving and sometimes impulsive business behavior, marketing hype is often disguised as true innovation.

The Information-Communication-Transaction-Transformation (ICTT) Continuum: The Path to Innovation

• Information: content (e-library)

• Communication: interaction (e-government)

• Transaction: process and rule (e-commerce)

• Transformation: innovation (all)

ICTT: Information

• Definition: Library, government, or business “information” is created, categorized, indexed, and delivered to its target audiences through the Internet.

• Core competency of digital library research and services: metadata generation, data creation and management, content management, interoperability, system interfaces, etc.

• Many early G2C (government-to-citizen), G2B (government-to-business), and B2C services deliver information only: governments and business portals act as information (about regulations and products) providers.

ICTT: Communication

• Definition: E-services support two-way “communication,” whereby customers or citizens can communicate their needs or requests through web forms, email, or other Internet media.
• Core function for e-government: by providing effective communication channels to citizens.
• Many early B2C, G2C and G2B applications quickly evolved into such communication services: by adding simple web-based groupware functionalities such as web forms, email, bulletin boards, chat rooms, etc.
• Computer-Supported Collaborative Systems (or groupware) and recommender systems: can significantly improve communication services for all digital library, digital government, and e-commerce applications.

ICTT: Transaction

• Definition: Citizens and businesses are supported in conducting “transactions.”
• Transaction is the essence of e-commerce: “You are not successful unless they buy.” Many businesses support transactions among their suppliers (B2B) or customers (B2C) through ERP, SCM, and CRM systems.
• Digital government could support “citizen transactions”: such as income tax filing and returns, municipal service requests and tracking, business license applications and payments, etc.
• Significant adaptation needed for e-government and digital library: to be cost-effective for non-commercial applications. (Most governments and libraries cannot afford SAP R/3!)

ICTT Continuum: Transformation

• Definition: There is an opportunity for “transformation” for libraries, government agencies, and businesses through new technologies.
• Digital libraries: Traditional libraries need to re-examine their content management and service delivery assumptions and practices.
• E-Commerce: Business consulting professionals are creating new methodology and best practices to take advantage of the new business opportunities.
• E-government: New information technologies and innovative processes could significantly enhance many facets of government, e.g., e-politics and e-voting, law enforcement and litigation support, etc.

Knowledge Management: Overview

Unit of Analysis
• Data: 1980s
– Factual
– Structured, numeric
– Examples: Oracle, Sybase, DB2
• Information: 1990s
– Factual
– Unstructured, textual
– Examples: Yahoo!, Excalibur, Verity, Documentum
• Knowledge: 2000s
– Inferential, sensemaking, decision making
– Multimedia
– Examples: ???
• According to Alter (1996), Tobin (1996), and Beckman (1999):
– Data: facts, images, or sounds (+ interpretation + meaning =)
– Information: formatted, filtered, and summarized data (+ action + application =)
– Knowledge: instincts, ideas, rules, and procedures that guide actions and decisions

Data, Information and Knowledge: Application and Societal Relevance
• Ontologies, hierarchies, and subject headings
• Knowledge management systems and practices: knowledge maps
• Digital libraries, search engines, web mining, text mining, data mining, CRM, eCommerce
• Semantic web, multilingual web, multimedia web, and wireless web

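The Third Wave of Net Evolution

Timeline: 1965, 1975, 1985, 1995, 2000, 2010
• First wave: ARPANET. Company: IBM. Function: Server Access. Unit: Server. Example: Email.
• Second wave: Internet. Company: Microsoft/Netscape. Function: Info Access. Unit: File/Homepage. Example: WWW (“World Wide Wait”).
• Third wave: “Semantic Web.” Company: ???. Function: Knowledge Access. Unit: Concepts. Example: Concept Protocols.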

Knowledge Management Definition

“The system and managerial approach to collecting, processing, and organizing enterprise-specific knowledge assets for business functions and decision making.”

Knowledge Management Challenges
• “… making high-value corporate information and knowledge easily available to support decision making at the lowest, broadest possible levels …”
– Personnel Turn-over
– Organizational Resistance
– Manual Top-down Knowledge Creation
– Information Overload

Knowledge Management Landscape
• Research Community
– NSF / DARPA / NASA, Digital Library Initiative I & II, NSDL ($120M)
– NSF, Digital Government Initiative ($60M)
– NSF, Knowledge Networking Initiative ($50M)
– NSF, Information Technology Research ($300M)
• Business Community
– Intellectual Capital, Corporate Memory
– Knowledge Chain, Competitive Intelligence
• Enabling Technologies:
– Information Retrieval (Excalibur, Verity, Oracle Context)
– Electronic Document Management (Documentum, PC DOCS)
– Internet/Intranet (Yahoo!, Excite)
– Groupware (Lotus Notes, MS Exchange, Ventana)
• Consulting and System Integration:
– Best practices, human resources, organizational development, performance metrics, methodology, framework, ontology (Delphi, E&Y, Arthur Andersen, AMS, KPMG)

Knowledge Management Foundations

Knowledge Management Perspectives:

• Process perspective (management and behavior): consulting practices, methodology, best practices, e-learning, culture/reward, existing IT → new information, old IT, new but manual process

• Information perspective (information and library sciences): content management, manual ontologies → new information, manual process

• Knowledge Computing perspective (text mining, artificial intelligence): automated knowledge extraction, thesauri, knowledge maps → new IT, new knowledge, automated process

[Figure: KMS diagram. Area labels: Analysis; Content/Info; Tech Foundation; Infrastructure. Item labels: Consulting, Methodology, Cultural, Learning/Education, Best Practices, Human Resources; Content Mgmt, Ontology, Structure; Databases, ePortals, Email, Notes, Search Engine, User Modeling, Data Mining, Text Mining.]

KM Perspectives

• Dataware Technologies

(1) Identify the Business Problem

(2) Prepare for Change

(3) Create a KM Team

(4) Perform the Knowledge Audit and Analysis

(5) Define the Key Features of the Solution

(6) Implement the Building Blocks for KM

(7) Link Knowledge to People

• Andersen Consulting

(1) Acquire

(2) Create

(3) Synthesize

(4) Share

(5) Use to Achieve Organizational Goals

(6) Environment Conducive to Knowledge Sharing

• Ernst & Young

(1) Knowledge Generation

(2) Knowledge Representation

(3) Knowledge Codification

(4) Knowledge Application

KM Architecture (Source: GartnerGroup)

[Figure: layered Enterprise Knowledge Architecture. Labels: Web Browser; Web UI; Knowledge Maps; Conceptual; Knowledge Retrieval; KR Functions; Physical; Text Indexes; Database Indexes; Application Index; Text and Database Drivers; Databases; “Workgroup” Applications; Intranet and Extranet; Applications; Distributed Object Models; Platform Services; Network Services.]

Knowledge Retrieval Level (Source: GartnerGroup)

[Figure: KR Functions producing Retrieved Knowledge. Levels shown: Concept (“Yellow Pages”), Semantic, Collaboration, and Value (“Recommendation”).]

Techniques listed:
• Clustering: categorization (“table of contents”)
• Semantic networks (“index”)
• Dictionaries
• Thesauri
• Linguistic analysis
• Data extraction
• Collaborative filters
• Communities
• Trusted advisor
• Expert identification

Knowledge Retrieval Vendor Direction (Source: GartnerGroup)

[Figure: vendors moving toward Knowledge Retrieval. Axes: Technology Innovation and Content Experience. Market target groups: Newbies, IR Leaders, Niche Players. Also shown: Lotus, Netscape*, Microsoft. (* Not yet marketed)]

Vendor lists shown in the figure:
• grapeVINE, Sovereign Hill, CompassWare, Intraspect, KnowledgeX, WiseWire, Lycos, Autonomy, Perspecta
• Verity, Fulcrum, Excalibur, Dataware
• IDI, Oracle, Open Text, Folio, IBM, InText, PCDOCS, Documentum

KM Software Vendors

[Figure: quadrant chart. Axes: Ability to Execute and Completeness of Vision. Quadrants: Niche Players, Visionaries, Challengers, Leaders. Vendors plotted: Microsoft, Lotus, Dataware, Verity, Excalibur, Netscape, Documentum, IBM, Inference, Lycos/InMagic, CompassWare, KnowledgeX, Sovereign Hill, Semio, IDI, PCDOCS/Fulcrum, OpenText, Autonomy, GrapeVINE, InXight, WiseWire, Intraspect.]

From Federal Research to Commercial Start-ups

• U. Mass: Sovereign Hill
• MIT Media Lab: Perspecta
• Xerox PARC: InXight
• Battelle: ThemeMedia
• U. Waterloo: OpenText
• Cambridge U.: Autonomy
• U. Arizona: Knowledge Computing Corporation (KCC)

Two Approaches to Codifying Knowledge

Top-Down Approach
• Structured
• Manual
• Human-driven

Bottom-Up Approach
• Unstructured
• System-aided
• Data/Info-driven

Information Resources Empowerment: DG and KM as Catalyst

Examples and Case Studies

Medical Portal and Informatics:
• Goal:
– A “knowledge” portal for medical researchers in the US and the world.
• Content/Information:
– Comprehensive, high quality medical-related content: NLM databases, evidence-based medical databases
• Key Features:
– Comprehensive medical resources and ontologies
– Automatic medical thesaurus (48.5M terms) and medical knowledge map (MED Map and Cancer Map)
– Scalable for multilingual support: English, Chinese, Spanish, Arabic
• Funding:
– NSF DLI2 Program + NIH NLM Medical Informatics Program (S. Griffin + A. McCray)

Consulting HelpfulMED Cancer Space (Thesaurus)

Enter search term

Select relevant search terms

New terms are posted

Search again...

Or find relevant content
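
The consult-the-thesaurus loop above (enter a term, select suggested terms, search again) can be sketched with a small co-occurrence table. This is only an illustrative sketch: the slides do not show how the Cancer Space thesaurus is computed, so the raw co-occurrence counting, the function names `build_cooccurrence` and `suggest`, and the toy documents are all assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(docs):
    """Count how often two terms are indexed in the same document (assumed weighting)."""
    co = defaultdict(Counter)
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def suggest(co, term, k=3):
    """Return up to k terms that co-occur most often with the search term."""
    return [t for t, _ in co[term].most_common(k)]

# Hypothetical indexed documents, each reduced to a list of terms.
docs = [
    ["brain neoplasms", "glioma", "mri"],
    ["brain neoplasms", "glioma"],
    ["brain neoplasms", "chemotherapy"],
]
co = build_cooccurrence(docs)
top = suggest(co, "brain neoplasms", k=1)
```

A user would pick from the suggested terms and run the search again with the expanded query.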

Browsing HelpfulMED Cancer Map

1. Visual Site Browser
2. Top level map
3. Diagnosis, Differential
4. Brain Neoplasms
5. Brain Tumors

Browsing Taiwan Health Map

Simplified Chinese summary

Traditional Chinese summary

Chinese folder display; Chinese visualization with SOM

Simplified/Traditional Chinese summarization

Select websites from mainland China, Hong Kong and Taiwan

Select search engines from mainland China, Hong Kong and Taiwan

Results are from both Simplified and Traditional Chinese

Original encoding of the result

Traditional Chinese results have been converted into Simplified Chinese

Chinese Medical Intelligence Portal

Spanish Business Intelligence Portal

Meta searches 7 major sources and provides searching of its own collection (PIN)

Supports Boolean searching and allows the display of 10, 20, 30, 50, or 100 results per meta searcher

Keyword suggestion from Scirus and Concept Space

Detailed directory of Spanish business resources on the Web

Keyword:

comercio electronico

Search, Organize, or Visualize results

Results organized by meta searchers

Summarize in 3 or 5 sentences

Automatic keyword suggestion

Search Page; Result Page

Summarizer

A three-sentence summary on left; original page shown on right

Categorizer

Web pages grouped by key phrases extracted by mutual information algorithm (non-exclusive categorization)
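
As a rough illustration of grouping pages by key phrases, the sketch below scores adjacent word pairs with pointwise mutual information (PMI) and lists every page that contains each top phrase. The slides name a "mutual information algorithm" but give no formula, so the PMI scoring, the two-word phrase length, the `min_count` threshold, and the function names are assumptions rather than the portal's actual method.

```python
import math
from collections import Counter

def key_phrases(pages, min_count=2):
    """Score adjacent word pairs by PMI; rare pairs are skipped (assumed threshold)."""
    unigrams, bigrams = Counter(), Counter()
    for page in pages:
        words = page.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue
        p_ab = count / n_bi
        p_a, p_b = unigrams[a] / n_uni, unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores

def categorize(pages, top_k=3):
    """Map each top phrase to the indexes of the pages containing it.

    Categories are non-exclusive: a page can appear under several phrases.
    """
    scores = key_phrases(pages)
    best = sorted(scores, key=scores.get, reverse=True)[:top_k]
    groups = {}
    for phrase in best:
        groups[phrase] = [
            i for i, page in enumerate(pages)
            if phrase in zip(page.lower().split(), page.lower().split()[1:])
        ]
    return groups

# Hypothetical page texts, already stripped of markup.
pages = [
    "comercio electronico en mexico",
    "comercio electronico y banca",
    "banca en linea",
    "banca en mexico",
]
groups = categorize(pages)
```

Here page 0 and page 3 each land in two folders, which is what non-exclusive categorization means in practice.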

Visualizer

Web pages visualized by self-organizing map (SOM) algorithm
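
A minimal self-organizing map can be sketched in a few lines to show the idea behind the Visualizer: each page vector is assigned to the grid cell whose weights are closest, and training pulls that cell and its grid neighbors toward the page. The grid size, learning-rate and radius schedules, and the toy page vectors below are assumptions for illustration; the slides do not describe the portal's actual parameters.

```python
import math
import random

def best_cell(weights, vector):
    """Return the grid cell whose weight vector is closest to the input."""
    return min(weights, key=lambda cell: math.dist(weights[cell], vector))

def train_som(vectors, rows=3, cols=3, epochs=40, seed=0):
    """Train a small rows x cols map; schedules are simple linear decays (assumed)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        rate = 0.5 * decay
        radius = 1.0 + (max(rows, cols) / 2) * decay
        for vector in vectors:
            winner = best_cell(weights, vector)
            for cell, w in weights.items():
                grid_dist = math.dist(cell, winner)
                if grid_dist > radius:
                    continue
                # Cells near the winner on the grid move more than distant ones.
                influence = math.exp(-grid_dist ** 2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * influence * (vector[i] - w[i])
    return weights

# Hypothetical term-weight vectors for four pages.
pages = {
    "page A": [1.0, 0.0, 0.0],
    "page B": [0.9, 0.1, 0.0],
    "page C": [0.0, 0.0, 1.0],
    "page D": [0.0, 0.1, 0.9],
}
weights = train_som(list(pages.values()))
layout = {name: best_cell(weights, vec) for name, vec in pages.items()}
```

The `layout` dictionary gives each page a grid position, which a display layer could then draw as labeled regions.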

Search Page Spanish Business Taxonomy

Web sites about the topic “Electronic Commerce” in Spanish speaking countries

Arabic Medical Intelligence Portal

Provides a virtual Arabic keyboard to facilitate input

Search Page Result Page

Categorizer

Visualizer

NanoPort:
• Goal:
– A “knowledge” portal for nano researchers in the US and the world.
• Content:
– Comprehensive, high quality nano-related web content: 4 nano-related search engines, 5 online databases, and 3 online journals
• Key Features:
– Comprehensive nano resources
– Post-retrieval analysis: AZ SUM, AZ NP, AZ SOM
– AZ Web Weaver (WW) toolkit: “weaving” your own web
– Alerting and communication among researchers
• Funding:
– NSF Nano Science and Engineering Program (M. Roco)

Input keywords

Select search engines

Select online databases

Select online journals

Folder display; visualization using SOM

Summary

The original page

Highlight the summary in the original page with corresponding color

Click on the summary sentence and jump to its position in the original page

Summarize result dynamically

Communication Garden:
• Goal:
– Visualizing communication patterns and identifying experts in email/newsgroups.
• Content:
– Any email/newsgroup content, in any language
• Key Features:
– Linguistic analysis: AZ NP, MI
– Topic clustering: AZ SOM
– Glyph-based visualization: garden metaphor
• Funding:
– NSF Information and Data Management Program

Thread

Disadvantages:

•No sub-topic identification

•Difficult to identify experts

•Difficult to learn participants’ attitude toward the community

[Figure: Thread Representation. Labels: Time, Message, Person, Length of Time.]

[Figure: People Representation. Labels: Time, Message, Thread, Length of Time.]

Proposed Interface (Interaction Summary)

Visual Effects:

•Healthy sub-garden with many blooming high flowers = popular active sub-topic

•A long, blooming flower is a healthy thread

Proposed Interface (Expert Indicator)

Visual Effects:

•Healthy sub-garden with many blooming high flowers = popular sub-topic

•A long, blooming people flower is a recognized expert.

GeneScene: Transforming Biomedical Research

• Correctly extract gene pathway information from millions of abstracts

• Expedite comprehension of the literature

• Position results relative to others in the blink of an eye

• New hypotheses discovery
– Magnesium and migraines (Hearst, 1999)

GeneScene Overview

• Text Mining: Process Medline abstracts and extract gene relations automatically from the text.
• Data Mining: Process gene expression data (and existing knowledge) and use different algorithms to extract regulatory networks.
• Interface & Visualization: Allow searching for keywords; display a map of the relations extracted from the text and/or from the microarray.
• Knowledge Base: Integrate gene relations from literature and outside databases and provide knowledge for learning and evaluation in data mining.

[Figure: GeneScene architecture. Inputs: Medline (titles and abstracts), publications and meta information, microarray data, external databases, and ontologies (UMLS, GO, HUGO). Text Mining labels: lexical lookup, POS tagging, adjuster and tagger, AZ Noun Phraser, FSA, relation grammar, full parser, relation parsers, XML parser, feature structures, Concept Space (co-occurrence relations), relations in flat files. Data Mining labels: Bayesian networks, association rule mining, relations in flat files. Stores: GeneScene Text Mart, GeneScene Data Mart, Knowledge Base. Outputs: information retrieval and visualization. Other labels: Spring Algorithm, JIF.]

Problem (PBG)
• Title: Key roles for E2F1 in signaling p53-dependent apoptosis and in cell division within developing tumors.
• Abstract: Apoptosis induced by the p53 tumor suppressor can attenuate cancer growth in preclinical animal models. Inactivation of the pRb proteins in mouse brain epithelium by the T121 oncogene induces aberrant proliferation and p53-dependent apoptosis. p53 inactivation causes aggressive tumor growth due to an 85% reduction in apoptosis. Here, we show that E2F1 signals p53-dependent apoptosis since E2F1 deficiency causes an 80% apoptosis reduction. E2F1 acts upstream of p53 since transcriptional activation of p53 target genes is also impaired. Yet, E2F1 deficiency does not accelerate tumor growth. Unlike normal cells, tumor cell proliferation is impaired without E2F1, counterbalancing the effect of apoptosis reduction. These studies may explain the apparent paradox that E2F1 can act as both an oncogene and a tumor suppressor in experimental systems.

Action Protocols

[Figure: an expert reads the abstract sentence by sentence and draws a graphic representation of the pathway. Sentences read: "E2F1 signals p53-dependent apoptosis"; "E2F1 acts upstream of p53"; "E2F1 deficiency does not accelerate tumor growth". The expert infers ("So, I'm assuming... a straight-line pathway..."), errs and corrects, and arrives at a final graph over E2F1, p53, apoptosis, and tumor growth.]

Example: Combination

Inactivation of the pRb proteins in mouse brain epithelium by the T121 oncogene induces aberrant proliferation and p53-dependent apoptosis

Each template fills three slots: Agent, Action, Theme.

• By-template: Agent = T121 oncogene; Action = null; Theme = null
• Of-template: Agent = null; Action = inactivate; Theme = pRb proteins
• Combo: Agent = T121 oncogene; Action = inactivate; Theme = pRb proteins
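
The combination step in this example can be sketched as a slot-by-slot merge of two partially filled templates. The `Relation` type and the `combine` function are hypothetical names for illustration, and the rule that the by-template wins when both templates fill the same slot is an assumption; the slides only show the case where the filled slots do not overlap.

```python
from typing import NamedTuple, Optional

class Relation(NamedTuple):
    agent: Optional[str]
    action: Optional[str]
    theme: Optional[str]

def combine(by_t: Relation, of_t: Relation) -> Relation:
    """Fill each slot from whichever template supplies it (by-template first)."""
    return Relation(*(b if b is not None else o for b, o in zip(by_t, of_t)))

# Slot values taken from the example sentence above; None stands for "null".
by_template = Relation("T121 oncogene", None, None)
of_template = Relation(None, "inactivate", "pRb proteins")
combo = combine(by_template, of_template)
```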

Preposition: OF

[Figure: finite state automaton for the preposition OF, with states q0 to q10. Transition labels: NP only; OF; Negation; Adjective, noun, verb (-ed); Nominalization (-ion).]

Examples:

• q0 – q1 – q2 – q3: Dfp1/Him1 protein OF fission yeast
• q0 – q5 – q6 – q2 – q3: mRNA expression OF genes
• q0 – q6 – q2 – q3 – q9 – q10: the determination OF the biological characteristics OF human cancers
• q0 – q5 – q6 – q2 – q7 – q8 – q2 – q3: Time-dependent induction OF mRNA expression OF Wip1
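
To give a feel for how a finite state automaton walks through such OF phrases, here is a much simplified stand-in. It is not the q0 to q10 automaton from the slide, whose transitions cannot be read from this transcript: it uses only three hypothetical states (`start`, `in_np`, `after_of`), treats every run of non-OF tokens as one noun phrase, and ignores the negation, adjective, and nominalization distinctions.

```python
def parse_of_chain(text):
    """Split 'NP OF NP OF NP ...' into its noun phrases, or return None if rejected."""
    state = "start"
    phrases, current = [], []
    for token in text.split():
        is_of = token.lower() == "of"
        if state in ("start", "after_of"):
            if is_of:
                return None  # OF must follow a noun phrase
            current = [token]
            state = "in_np"
        else:  # state == "in_np"
            if is_of:
                phrases.append(" ".join(current))
                state = "after_of"
            else:
                current.append(token)
    if state != "in_np":
        return None  # empty input or a dangling OF
    phrases.append(" ".join(current))
    return phrases

chain = parse_of_chain("Time-dependent induction OF mRNA expression OF Wip1")
```

The accepted input ends in the `in_np` state, mirroring how the slide's examples each end on a noun phrase.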

Visualization

Line thickness indicates frequency of findings

Contradictory finding

All abstracts related to the search, or abstracts related to a term highlighted in the map are displayed

Preferences to limit the knowledge map, e.g. only abstracts with research on human cells


Select interesting relations to visualize

Double click to expand

Overview

Expanded node

Finding the truth: p38 acts as a negative feedback for Ras signaling

COPLINK: From Transaction to Transformation
• Goal:
– Supporting law enforcement information sharing and crime analysis.
• Content/Information:
– Police incident records, mug shots, gang information.
• Key Features:
– COPLINK Connect: linking legacy databases
– COPLINK Detect: detecting crime associations (“criminal thesaurus”)
– COPLINK Agent: wireless alerting
– COPLINK Visualization: revealing criminal networks
• Funding:
– NSF Digital Government Program (L. Brandt)

Finding criminals: English and Chinese interface

A narcotic network example

Switch between narcotic network and gang network

Show network and reset network

Adjust level of details

A point represents an individual labeled by his name

A line represents a link between two persons

A bubble represents a subgroup labeled by its leader's name

A line implies that some individuals in one group interact with some individuals in the other group. The thicker the link, the more individual interactions between the two groups

The size of a bubble is proportional to the number of individuals in the group

The rankings of the members of a selected group (green).
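
The reduced (group-level) view described in these captions can be sketched as a simple aggregation: bubble size is the member count of each subgroup, and link thickness is the number of individual links that cross between two subgroups. The `membership` mapping, the link list, and the function name `reduce_network` are hypothetical; how COPLINK actually detects subgroups is not shown in the slides.

```python
from collections import Counter

def reduce_network(membership, links):
    """Collapse person-to-person links into group sizes and group-to-group link counts."""
    sizes = Counter(membership.values())
    thickness = Counter()
    for a, b in links:
        group_a, group_b = membership[a], membership[b]
        if group_a != group_b:  # links inside one group do not connect bubbles
            thickness[frozenset((group_a, group_b))] += 1
    return sizes, thickness

# Hypothetical individuals assigned to two subgroups.
membership = {"p1": "A", "p2": "A", "p3": "B", "p4": "B", "p5": "B"}
links = [("p1", "p2"), ("p1", "p3"), ("p2", "p4"), ("p3", "p4")]
sizes, thickness = reduce_network(membership, links)
```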

A gang network example

The leader

A clique

A gatekeeper

The reduced network structure

Criminal Patterns Found

• The chain structure of the narcotic network

• Implications: disrupt the network by breaking the chain

• The star structure of the gang network

• Implications: disrupt the network by removing the leader
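
The two patterns can be told apart from link counts alone, as the sketch below suggests: in a star, one person is linked to everyone else and nobody else has more than one link; in a chain, two people have one link and the rest have two. This is an illustrative degree-based check, not the analysis used in COPLINK. It assumes the links are unique and form one connected network, and the names `degrees` and `classify` are hypothetical.

```python
from collections import defaultdict

def degrees(links):
    """Count the number of links attached to each person."""
    deg = defaultdict(int)
    for a, b in links:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)

def classify(links):
    """Return ('star', leader), ('chain', None), or ('other', None).

    Assumes unique links and a single connected network.
    """
    deg = degrees(links)
    n = len(deg)
    hub, hub_deg = max(deg.items(), key=lambda item: item[1])
    others_are_leaves = all(d == 1 for person, d in deg.items() if person != hub)
    if n > 3 and hub_deg == n - 1 and others_are_leaves:
        return "star", hub  # removing the hub disconnects everyone
    if len(links) == n - 1 and sorted(deg.values()) == [1, 1] + [2] * (n - 2):
        return "chain", None  # removing any middle person breaks the chain
    return "other", None

# Hypothetical networks.
gang = [("leader", "a"), ("leader", "b"), ("leader", "c"), ("leader", "d")]
narcotics = [("a", "b"), ("b", "c"), ("c", "d")]
gang_pattern = classify(gang)
narcotics_pattern = classify(narcotics)
```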

White gangs that were involved in murders and shootings

White gangs who sold crack cocaine

A group of black gangs

Expert Validation

“Yes, these two groups are together very often”

“(211) and (173) are best friends”


The Future

• Many active and high-impact research opportunities for researchers in information science, library science, computer science, public policy, and management information systems.

• Digital library researchers are well positioned to become the “agents of transformation” for the new Net of the 21st century.

The Questions

• Who/what is a “librarian”?

• How to transform data and information into knowledge?

• How to balance between technology, policy, users, and services?

For more information
• “Knowledge Management Systems,” H. Chen, 2002
• “Trailblazing a Path Towards Knowledge and Transformation,” H. Chen, 2003

• International Conference on Asian Digital Libraries, December 8-11, 2003, KL, Malaysia

• ACM/IEEE Joint Conference on Digital Libraries, June 7-11, 2004, Tucson, Arizona

• NSF International Digital Library Workshop, June 10-11, 2004, Tucson, Arizona (successful national DL projects)

For Project Information at AI Lab:

http://ai.bpa.arizona.edu

[email protected]