Network Learning: AI-driven Connectivist Framework for E-Learning 3.0


Upload: neil-rubens

Posted on 01-Nov-2014

Category: Education

TRANSCRIPT

Page 1: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Network Learning (ネットワーク ラーニング)

AI-driven Connectivist Framework for E-Learning 3.0

Special Coordination Funds for Promoting Science and Technology (振興調整費)

National University Corporation, The University of Electro-Communications (国立大学法人 電気通信大学)

Unique and Exciting Campus  

From Collective Intelligence to Connected Intelligence  

Neil Rubens Okamoto/Ueno Laboratory Graduate School of Information Systems Center for Frontier Science and Engineering University of Electro-Communications

Neil Rubens Active Intelligence Group Knowledge Systems Laboratory University of Electro-Communications Tokyo, Japan

Page 2: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Evolution of eLearning: eLearning 1.0

eLearning uses technology to enhance learning.

‣ eLearning 1.0:

‣ reading: content became easily accessible

‣ logging: user’s activities could be logged and analyzed

‣ Learning Theories:

‣ Behaviorism: learning is manifested by a change in behavior, environment shapes behavior, contiguity

‣ Cognitivism: how human memory works to promote learning


Page 3: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Evolution of eLearning: eLearning 2.0

‣ eLearning 2.0:

‣ writing: anybody can easily create content (e.g. blogs, wiki, etc.)

‣ socializing: interaction is easy (e.g. facebook, twitter, etc.)

‣ Learning Theories:

‣ Constructivism: constructing one's own knowledge from one's own experiences (enabled through writing)

‣ Social Learning: people learn from one another (enabled through socializing)


Page 4: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0
Page 5: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0
Page 6: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Broken Knowledge Cycle

‣ Problem: the current cycle of knowledge creation/utilization is inefficient!

‣ a large portion of created content is never utilized by others*: only 0.05% of Twitter messages attract attention (Wu et al., 2011); only 3% of users look beyond the top 3 search results (Infolosopher, 2011)

‣ large parts of created content are redundant (Drost, 2011)

‣ Peak Social: the point at which we can gain no new advantage from social activity (Siemens, 2011)

*there are some personal benefits, e.g. externalization, crystallization, etc.

[Diagram: knowledge cycle (create / utilize); existing knowledge partitioned into redundant, novel, and utilized.]

Page 7: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0
Page 8: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

"There is no data like more data" (Mercer at Arden House, 1985)

[Figure: point sets of 500, 2,000, and 8,000 points (Tan, Steinbach, Kumar, 2004).]

Information Overload: for Computers, not a Problem but an Opportunity

Page 9: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

h"p://www.kieranhealy.org/files/misc/SocCoreCites.jpg:

h"p://wiki.ubc.ca/images/f/ff/SocialWeb.jpg9

Social Network

http://datamining.typepad.com/photos/uncategorized/2007/04/08/twitter20070405.png

Messaging Networks

Citation Network

How can we use computers to learn in these settings?���

Page 10: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Figure: Nova Spivack's metaweb graph, plotting social connectivity against information connectivity; © Nova Spivack, http://novaspivack.typepad.com/nova_spivacks_weblog/metaweb_graph.GIF]

Our Focus

Page 11: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Connectivism (Learning Theory)

Connectivism: knowledge is distributed across a network of connections, and therefore learning consists of the ability to construct and traverse these networks (Siemens & Downes, 2008)

Connectivism compared with other learning theories, by property:

Learning theorists:
‣ Behaviourism: Thorndike, Pavlov, Watson, Guthrie, Hull, Tolman, Skinner
‣ Cognitivism: Koffka, Kohler, Lewin, Piaget, Ausubel, Bruner, Gagne
‣ Constructivism: Piaget, Vygotsky
‣ Humanism: Maslow, Rogers
‣ Connectivism: Siemens, Downes

How learning occurs:
‣ Behaviourism: black box, observable behaviour is the main focus
‣ Cognitivism: structured, computational
‣ Constructivism: social, meaning created by each learner (personal)
‣ Humanism: reflection on personal experience
‣ Connectivism: distributed within a network, social, technologically enhanced, recognizing and interpreting patterns

Influencing factors:
‣ Behaviourism: nature of reward, punishment, stimuli
‣ Cognitivism: existing schema, previous experiences
‣ Constructivism: engagement, participation, social, cultural
‣ Humanism: motivation, experiences, relationships
‣ Connectivism: diversity of network, strength of ties, context of occurrence

Role of memory:
‣ Behaviourism: memory is the hardwiring of repeated experiences, where reward and punishment are most influential
‣ Cognitivism: encoding, storage, retrieval
‣ Constructivism: prior knowledge remixed to current context
‣ Humanism: holds changing concept of self
‣ Connectivism: adaptive patterns, representative of current state, existing in networks

How transfer occurs:
‣ Behaviourism: stimulus, response
‣ Cognitivism: duplicating knowledge constructs of the "knower"
‣ Constructivism: socialization
‣ Humanism: facilitation, openness
‣ Connectivism: connecting to (adding) nodes and growing the network (social/conceptual/biological)

Types of learning best explained:
‣ Behaviourism: task-based learning
‣ Cognitivism: reasoning, clear objectives, problem solving
‣ Constructivism: social, vague ("ill-defined")
‣ Humanism: self-directed
‣ Connectivism: complex learning, rapidly changing core, diverse knowledge sources

Page 12: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

http://imgs.sfgate.com/c/pictures/2011/12/19/ba-BRIDGE20_SFC0105724887.jpg

Connectivism: Nice Theory

Need: Tools & Frameworks to Make It Practical

Page 13: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Methods  

Page 14: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Conceptual Framework

‣ Extraction Layer: documents
‣ Linking Layer: nodes
‣ Aggregation Layer: links, connections
‣ Analysis Layer: network

Use AI to:
§ connect contents
§ connect people
§ connect people & contents
§ connect models

[Images: http://www.progress.com/images/solutions/rbi/rbi-stack2-705w.jpg, http://mafra-toolkit.sourceforge.net/]
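To make the four layers concrete, here is a minimal Python sketch of how they could be composed; all function names and the toy co-occurrence logic are illustrative assumptions, not part of the actual framework implementation.

```python
# Hypothetical sketch of the four-layer framework; names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Network:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (node_a, node_b) pairs

def extraction_layer(documents):
    """Extract candidate nodes (here: distinct words standing in for concepts)."""
    return {word for doc in documents for word in doc.lower().split()}

def linking_layer(nodes, documents):
    """Link nodes that co-occur in the same document."""
    links = set()
    for doc in documents:
        present = [n for n in nodes if n in doc.lower().split()]
        links.update((a, b) for i, a in enumerate(present) for b in present[i + 1:])
    return links

def aggregation_layer(links):
    """Aggregate raw links into connections (here: deduplicate, ignore direction)."""
    return {tuple(sorted(link)) for link in links}

def analysis_layer(nodes, connections):
    """Assemble the network for downstream analysis."""
    return Network(nodes=nodes, edges=connections)

docs = ["learning networks connect people", "people create content"]
nodes = extraction_layer(docs)
network = analysis_layer(nodes, aggregation_layer(linking_layer(nodes, docs)))
print(len(network.nodes), "nodes,", len(network.edges), "connections")
```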

Page 15: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Modules

‣ Concept Extraction: documents → concepts
‣ Semantic Mapping: concepts, in the context of documents → linked concepts
‣ Knowledge Level Estimation: over extracted concepts
‣ Group Formation: for tasks
‣ Influence Estimation: from the interaction log

[Diagram: modules arranged over the framework layers: Extraction Layer (documents), Linking Layer (nodes), Aggregation Layer (links, connections), Analysis Layer (network).]

Page 16: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Module Pipeline (Example)

[Diagram: the user specifies a concept they want to learn about → Search Engine returns documents → Concept Extractor extracts concepts from the documents → Semantic Mapping links the concepts → Search Engine retrieves related discussions (Linking Layer → Analysis Layer).]

Page 17: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Sequence Diagram (Example)

user → system: "I want to know about term t_i."
system → user: semantics, contents, and the social discussion of t_i (u_i: "I think t_i is the same as t_j …"; u_j: "no, t_i is more like t_p …"; u_k: "you are both wrong, t_i is …").

user → system: "I want to know about terms t_i and t_k."
system → user: semantics, contents, and the social discussion, now covering both terms.

user → system: "I think t_i and t_k are similar …"
system: the statement joins the social discussion (u_m: "I think t_i and t_k are similar …"; u_i: "you are right").

Page 18: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Content-based (term-space) representation. A document $d_j$ is represented by a vector of term weights $\vec{V}(d_j) = [w_{1,j}, w_{2,j}, \ldots, w_{|T|,j}]^\top$, where

$$\mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}}, \qquad \mathrm{idf}_i = \log \frac{|D|}{|\{d : t_i \in d\}|}, \qquad w_{i,j} = \text{tf-idf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i.$$

Network-based (node-space) representation. The weights are instead derived from the document network, e.g. $w_{i,j} = \mathrm{dist}(i, j)$, or more generally some network-derived function $w_{i,j} = \delta(i, j)$, giving $\vec{V}(d_j) = [w_{1,j}, w_{2,j}, \ldots, w_{|N|,j}]^\top$.

Combined (term-network-space) representation. The two vectors are stacked, term-space weights on top and network-space weights below:

$$\vec{V}(d_j) = \begin{bmatrix} w_{1,j} \\ \vdots \\ w_{|T|,j} \\ w_{1,j} \\ \vdots \\ w_{|N|,j} \end{bmatrix}.$$

In any of these spaces, document similarity is the cosine similarity:

$$\mathrm{sim}(d_i, d_j) = \frac{\vec{V}(d_i) \cdot \vec{V}(d_j)}{\big\|\vec{V}(d_i)\big\| \, \big\|\vec{V}(d_j)\big\|}.$$

[Diagram: mock research papers and their vectors $\vec{V}(d_1), \vec{V}(d_2), \vec{V}(d_3)$ in term-space, network-space, and the combined term-network-space; panels labeled "Network" and "Content based".]
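As a minimal illustration of the term-space, node-space, and stacked term-network-space representations above, here is a NumPy sketch; the toy documents and the chain-shaped document network are assumptions, and the network weights follow the slide's $w_{i,j} = \mathrm{dist}(i, j)$ definition literally.

```python
import numpy as np

docs = ["networks connect learners", "learners create content", "content flows in networks"]
terms = sorted({t for d in docs for t in d.split()})

# Term-space: w_ij = tf-idf
tf = np.array([[d.split().count(t) / len(d.split()) for d in docs] for t in terms])
df = np.array([sum(t in d.split() for d in docs) for t in terms])
idf = np.log(len(docs) / df)
V_term = tf * idf[:, None]          # column j is the term-space vector of d_j

# Node-space: w_ij = dist(i, j), as defined on the slide, over a toy
# document network d1 - d2 - d3 (shortest-path distances)
V_node = np.array([[0, 1, 2],
                   [1, 0, 1],
                   [2, 1, 0]], dtype=float)

# Term-network-space: stack the two representations
V_hybrid = np.vstack([V_term, V_node])

def sim(V, i, j):
    """Cosine similarity between documents d_i and d_j in representation V."""
    vi, vj = V[:, i], V[:, j]
    return vi @ vj / (np.linalg.norm(vi) * np.linalg.norm(vj))

print(round(sim(V_term, 0, 2), 3), round(sim(V_hybrid, 0, 2), 3))
```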

Generalized Assignment Problem. Objective: given a paper $p$, choose a group of experts $\hat{M} \subseteq M$ (of a fixed size $s$) that collectively possesses the most expertise about $p$:

$$\text{maximize } R(\hat{M}, p) = \sum_{m \in \hat{M}} r(m, p) \qquad (1)$$

$$\text{subject to } |\hat{M}| = s \qquad (2)$$

Challenge: expertise is not additive:

$$R(M, p) \neq \sum_{m \in M} r(m, p). \qquad (3)$$

Group Expertise Estimation. Assumptions: $M^*_p$ are the authors of paper $p$, so $R(M^*_p, p) = 1$, and $R(M, p) = 0$ where $M \cap M^*_p = \emptyset$; in between,

$$R(M, p) = \frac{|M \cap M^*_p|}{|M^*_p|}. \qquad (4)$$

Learn $\hat{R}$ and use it to estimate group expertise. Use both semantic and structural features. Use an ensemble predictive model.

Training set: $T = (X_T, Y_T) = \{(x_i, f(x_i))\}_{x_i \in X_T}$; the function learned from $X_T$ (and the corresponding $Y_T$) is $\hat{R}_T$. Generalization error: $G(\hat{R}_T) = L(\hat{R}_T, f)$, to be minimized over the choice of training set: $\min_{X_T} G(\hat{R}_T)$.

With $g$ the optimal function (in the solution space), $\hat{R}$ the learned function, and the $\hat{R}_i$'s the functions learned from slightly different training sets, the generalization error decomposes as $E_G = B + V + C$, where

$$B = \big(E\hat{R}(x) - g(x)\big)^2, \qquad V = \big(\hat{R}(x) - E\hat{R}(x)\big)^2, \qquad C = \big(g(x) - f(x)\big)^2.$$
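Because group expertise is not additive (eq. 3), exhaustively scoring every size-$s$ group quickly becomes infeasible, and a greedy build-up against a learned group score is a natural baseline. The sketch below is one such baseline under a toy, coverage-style stand-in for the learned $\hat{R}$; it is not the framework's actual estimator.

```python
def group_score(group, paper):
    """Hypothetical stand-in for the learned group-expertise estimator R-hat.
    Toy scoring: fraction of the paper's concepts covered by at least one member;
    coverage is non-additive, so a second member knowing the same concept adds nothing."""
    covered = set().union(*[expertise[m] for m in group]) if group else set()
    return len(covered & paper) / len(paper)

def greedy_group(candidates, paper, s):
    """Greedily grow a group of size s, at each step adding the expert
    that most increases the (non-additive) group score."""
    group = []
    for _ in range(s):
        best = max((m for m in candidates if m not in group),
                   key=lambda m: group_score(group + [m], paper))
        group.append(best)
    return group

expertise = {  # toy expert profiles: concepts each expert knows
    "a": {"nlp", "search"}, "b": {"nlp", "ml"}, "c": {"networks", "ml"},
}
paper = {"nlp", "ml", "networks"}
print(greedy_group(list(expertise), paper, s=2))  # ['b', 'c'] covers all concepts
```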

The reverse could also be done, i.e. converting a textual representation into a network one.

[Table (partially garbled in source): characteristics of Network (N), Content (C), and the Proposed Hybrid, compared on Analysis, User Interface, Deployment, Execution Speed, and Type; for the hybrid, Analysis is "same as C part of N".]

Connecting Representations: Semantic + Network

Page 19: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Figure 7: Visualization (zoomable) of 416 hashtags in energy-related tweets, September 2010 through January 2011 (e.g. #energy, #green, #solar, #smartgrid, #climate, #cleanenergy, #renewableenergy, #energyefficiency); created with Network Explorer (Rubens et al., 2011).]

…approaches that help understand the large-scale conversations taking place on Twitter and elsewhere. This project has demonstrated the feasibility of using data mining techniques to gather and analyze vast amounts of data from ongoing social media conversations, and of analyzing the data for meaningful metrics that describe conversations about energy consumption behavior. The methods for this preliminary investigation included: development of a list of energy-related metaphors, terms, and general descriptors, as well as a list of energy conservation and reduction behaviors; monitoring term usage (location, frequency, context, clustering) on the Internet at frequent intervals; analyzing data for frequency and location of communication; and clustering of terms over time, such as changes in their proximity and occurrence, introduction of new terms, and fading of others. Our initial exploration confirmed that conversations about energy-related issues are indeed taking place in social media, specifically Twitter, and that these communications can be studied to better understand how to use technologically-enhanced word-of-mouth to stimulate user-generated persuasion. Using content analysis of full Tweets, network analysis of co-occurring hashtags, and semantic analysis of the co-occurring hashtags and their authors, this preliminary investigation identified descriptors, concerns, actions, and issues. We confirm that studying Twitter communications can provide actionable means for assessing engagement, identifying influencers, and identifying word-of-mouth communities that can accelerate change in energy efficiency behaviors.

An ecolinguistic taxonomy of over one hundred terms was established and included terms for: energy technologies, hardware, and software; communication behaviors; energy and climate change frames, metaphors, and visualizations; innovative energy efficiency and climate change programs; issues such as renewable energy, global warming, and energy insecurity; utilities, venture firms, and companies; and behaviors (high and low cost and impact).

By example, we have demonstrated that it is possible to capture an issue-based sample of the Tweetstream and cu…
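A minimal sketch of the co-occurring-hashtag network analysis described above, assuming networkx is available; the tweets, the extraction regex, and the use of degree centrality as a rough influence proxy are illustrative assumptions.

```python
import re
from collections import Counter
from itertools import combinations

import networkx as nx

tweets = [  # toy stand-ins for an issue-based sample of the Tweetstream
    "Save money with #solar and #energyefficiency #green",
    "#smartgrid pilots cut peak load #energy #green",
    "#solar jobs growing #green #energy",
]

# Count hashtag co-occurrences within each tweet
cooc = Counter()
for tweet in tweets:
    tags = sorted(set(re.findall(r"#\w+", tweet.lower())))
    cooc.update(combinations(tags, 2))

# Build the co-occurrence network; edge weight = number of co-mentions
G = nx.Graph()
for (a, b), w in cooc.items():
    G.add_edge(a, b, weight=w)

# Degree centrality as a rough proxy for identifying influential hashtags
for tag, c in sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1]):
    print(tag, round(c, 2))
```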

Connecting Concepts

Modeling of:
• topics
• lexicon
• semantics
• dynamics

Page 20: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Figure: co-occurrence map of domain-specific keywords from biomedical and public-health literature (e.g. epidemiology, public health, genetics, gene expression, oxidative stress, climate change, magnetic resonance imaging), illustrating domain-specific semantics.]

Domain-specific Semantics

Page 21: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Figure: citation network over ACL Anthology papers (IDs such as J93-1006, P03-1012, D09-1008).]

Connection Formation: Curriculum / Survey Design

Original Design: lacks information; limited diversity; limited context

Revised Design: information criterion-based

Page 22: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Diagram: input-index × feature-index matrices and their outputs in the Traditional vs. Collaborative settings.]

Learning in Collaborative Settings

Learning in Black-box Settings

[Figure 2: Vector-space interpretation of the proposed conversion and integration methods: documents $\vec{V}(d_1), \vec{V}(d_2), \vec{V}(d_3)$ represented in term-space, node-space, and the combined term-node-space [4], [6].]

[Figure 3: Utilizing training points selected by an active learning method (c) allows more accurate prediction of the true values (a) than selecting training points randomly (b) [5].]

Our motivation is to use the only data that is accessible in black-box settings: output estimates. We note that accuracy will improve only if the learner's output estimates change. Therefore we propose an active learning criterion that utilizes the information contained within the changes of the output estimates.

Many active learning methods are inapplicable in black-box settings, since they rely on knowledge of at least some aspect of the model's workings, as indicated by recent surveys [5]. Variance-based active learning approaches are applicable, but are not effective for a number of reasons. Since no information about the model is available, we propose to define an active learning criterion based on the indirect information available about the model: its output estimates. We note that a model's accuracy may improve only if its output estimates change (as a result of adding a new training point). In an attempt to speed up the improvements in the accuracy of the model estimates, we propose to estimate the usefulness of labeling based on the magnitude of its impact on the estimates. We show that defining an active learning criterion by taking into account changes in the output estimates is a promising practical approach [11].

C. Network Analysis

Network data structures are becoming an increasingly common data type, in part due to the social nature and inherent interconnectedness of many domains. Traditional machine learning has been focused on traditional (non-relational) data consisting of multi-dimensional samples, which makes it incompatible with network data structures. Motivated by this, we focus on developing machine learning methods that are applicable to complex data that includes networks, text, semantics, etc. In particular we concentrate on finding patterns within complex data and modeling network dynamics (especially with regard to semantics) [13], [12].

[Figure 4: For black-box models only the inputs and outputs are accessible; the internal workings are not, unlike for white-box models [11].]

[Figure 5: Intuition for the proposed method. Output estimates before a training point is added are denoted ŷt, after ŷt+1; adding a training point may influence many output estimates (a) or only a few (b) [11].]

III. APPLICATIONS

In this section we describe applications of the developed methods in diverse practical domains.

A. Expertise Finding

In today's knowledge-based economy, having the proper expertise is crucial to resolving many tasks. Expertise Finding (EF) is the area of research concerned with matching available experts to given tasks. A standard approach is to input a task description/proposal/paper into an EF system, and receive recommended experts as output. Traditionally, group formation (GF) models are constructed from the data to represent each of the underlying entities, e.g. the task description and candidate…

!"#$%&"'(!)*+,

-

./

!"#$"%&'

('%'#")$*"+$,%-'##,#

.,/')-'##,#

0

1

20$"1

314

315

316

31(

Figure 6: Decomposition of generalization error G into modelerror C, bias B, and variance V , where g denotes optimalfunction, f is a learned function fi’s are the learned functionsfrom a slightly different training set. Traditional active learningmethods concentrate on minimizing only the variance (V) partof the error, proposed methods takes into consideration all ofthe error components [11].


[Figure: distribution of the estimates ŷt+1 in relation to the current estimate ŷt and the true value y (16% / 84% tails).]

[Figure: ŷ after the training point a is added to the training set (making the number of training points equal to t + 1).]

[Figure: T1 = ‖ŷt − ŷt+1‖ and the value that it tries to approximate, ΔG (Section 3.1). Most importantly, high values of ‖ŷt − ŷt+1‖² should correspond to high values of ΔG, since those are the points that are likely to be chosen.]

[Figure: evaluation of active learning criteria (mean squared error vs. training set size; lower values are better, values differ at the 95% statistical significance level): Proposed, A-optimal, D-optimal, E-optimal, Transductive, Random, Optimal.]

Network Structure Learning (with a supervised-learning (SL) model)

Proposed Approach: use the change in the output estimates, ‖ŷt − ŷt+1‖², to estimate the improvement in the generalization error G from labeling xd, where ŷt+1 are the estimates after (xd, yd) was added to the training set. Adding a training point may cause many output estimates to change (a) or only a few (b).
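A minimal black-box sketch of this criterion: each candidate is scored by refitting the learner with hypothesized labels and measuring the resulting change in the output estimates. The ridge learner, the ±sigma label spread around the current prediction, and the toy data are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_pool = rng.uniform(-3, 3, size=(40, 1))          # unlabeled candidates
X_train = rng.uniform(-3, 3, size=(5, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(5)

def output_change(model, X_tr, y_tr, x_cand, X_ref, sigma=0.3):
    """Average ||y_hat_t - y_hat_{t+1}||^2 over hypothesized labels for x_cand.
    Treats the learner as a black box: only refits and compares output estimates."""
    y_t = model.fit(X_tr, y_tr).predict(X_ref)
    center = model.predict(x_cand.reshape(1, -1))[0]
    changes = []
    for y_hyp in (center - sigma, center + sigma):   # assumed label spread
        X_new = np.vstack([X_tr, x_cand])
        y_new = np.append(y_tr, y_hyp)
        y_t1 = Ridge(alpha=1.0).fit(X_new, y_new).predict(X_ref)
        changes.append(np.sum((y_t - y_t1) ** 2))
    return np.mean(changes)

model = Ridge(alpha=1.0)
scores = [output_change(model, X_train, y_train, x, X_pool) for x in X_pool]
best = X_pool[int(np.argmax(scores))]
print("query the label of x =", best)
```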

Network Structure Learning: Connecting/Pruning Nodes

Page 23: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

!"#$%&"'(!)*+,

-

./

!"#$"%&'

('%'#")$*"+$,%-'##,#

.,/')-'##,#

0

1

20$"1

314

315

316

31(

Figure 7: Decomposition of generalization error G into model error C, bias B, and variance V , where g denotesoptimal function, ⇥f is a learned function ⇥fi’s are the learned functions from a slightly di�erent training set.

6.2.1 Parameter Change-based

Parameter Change-based AL (Settles et al., 2008b) favors items that are likely to influence the model the most. Assuming that changes in the model's parameters are for the better, i.e. approach the optimal parameters, it is then beneficial to select an item that has the greatest impact on the model's parameters:

$$\Delta G_{\theta\text{-change}}(x_a) = - E_{y \sim Y}\, L\big(\theta_T,\ \theta_{T \cup (x_a, y)}\big), \qquad (22)$$

where $\theta_T$ are the model's parameters estimated from the current training set $T$, $\theta_{T \cup (x_a, y)}$ are the model's parameter estimates after a hypothetical rating $y$ of an item $x_a$ is added to the training set $T$, and $L$ is the loss function that measures the differences between the parameters.

6.2.2 Variance-based

In this approach the error is decomposed into three components: model error C (the difference between the optimal function approximation g, given the current model, and the true function f), bias B (the difference between the current approximation f̂ and an optimal one g), and variance V (how much the function approximation f̂ varies). In other words, we have:

$$G = C + B + V. \qquad (23)$$

One solution (Cohn et al., 1996) is to minimize the variance component V of the error by assuming that the bias component becomes negligible (if this assumption is not satisfied then this method may not be effective). There are a number of proposed methods that aim to select training inputs so as to reduce a certain measure of the variance of the model's parameters. The A-optimal design (Chan, 1981) seeks to select training input points so as to minimize the average variance of the parameter estimates, the D-optimal design (John & Draper, 1975) seeks to maximize the differential Shannon information content of the parameter estimates, and the Transductive Experimental design (Yu et al., 2006) seeks to find representative training points that may allow retaining most of the information of the test points. The AL method in (Sugiyama, 2006), in addition to the variance component, also takes into account the existence of the model error component.
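To make the decomposition in eq. (23) concrete, here is a small simulation sketch that estimates the model-error and variance terms for a deliberately misspecified (linear) learner by resampling training sets; the target function and learner are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(x)                  # true function f
x_test = np.linspace(-3, 3, 200)

def fit_line(n=20, noise=0.2):
    """Fit a deliberately misspecified (linear) learner to a fresh noisy sample."""
    x = rng.uniform(-3, 3, n)
    y = f(x) + noise * rng.standard_normal(n)
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope * x_test + intercept

preds = np.array([fit_line() for _ in range(500)])  # learned functions f-hat_i
mean_pred = preds.mean(axis=0)                      # approximates the optimal g

model_error = np.mean((mean_pred - f(x_test)) ** 2)  # ~ C: gap between g and f
variance = np.mean((preds - mean_pred) ** 2)         # ~ V: spread of f-hat around g
print(f"model error C ~ {model_error:.3f}, variance V ~ {variance:.3f}")
```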

6.2.3 Image Restoration-based

It is also possible to treat the problem of predicting the user's preferences as one of image restoration (Nakamura & Abe, 1998): that is, based on our limited knowledge of a user's preferences (a partial picture), we try to restore the complete picture of the user's likes and dislikes. The AL task is then to select the training points that would best allow us to restore the "image" of the user's preferences. It is interesting to note that this approach satisfies the desired properties of the AL methods outlined in Section 2. For example, if a point already exists in a region, then without sampling neighboring points the image in that region could likely be restored. This approach also may favor sampling close to the edges of image components (decision boundaries).

7 Ensemble-based Active Learning

Sometimes, instead of using a single model to predict a user's preferences, an ensemble of models may be beneficial. In other cases only a single model is used, but it is selected from a number of candidate models. The main advantage of this is the premise that different models are better suited to different users or different…

Bias-variance under covariate shift. The generalization error of a learned function $\hat{f}$ decomposes as

$$G = \sigma^2 + B + V,$$

where the expectation $E_\varepsilon$ is over the noise realizations $\{\varepsilon_i\}_{i=1}^n$,

$$B = \int_D \big(E_\varepsilon \hat{f}(x) - g(x)\big)^2 q(x)\,dx, \qquad V = E_\varepsilon \int_D \big(\hat{f}(x) - E_\varepsilon \hat{f}(x)\big)^2 q(x)\,dx.$$

When the training and test input distributions differ, $p(x) \neq q(x)$, importance-weighted least squares is used:

$$\min_\theta \sum_{i=1}^n \left(\frac{q(x_i)}{p(x_i)}\right)^{\lambda} \big(\hat{f}(x_i) - y_i\big)^2,$$

with flattening parameter $0 \le \lambda \le 1$ ($\lambda = 0$: ordinary least squares; $\lambda = 1$: full importance weighting).

Derivation of the criterion. For regularized least squares, $\hat{\theta}_t = A^{-1} X^\top y$ with $A = X^\top X + \varepsilon I$, where the term $\varepsilon I$ ($0 < \varepsilon \ll 1$) ensures that $A$ is invertible and $\hat{\theta}_t$ is well defined. After adding a candidate point $(x_a, y_a)$:

$$\hat{\theta}_{t+1} = (A + x_a x_a^\top)^{-1}(X^\top y + x_a y_a) = (A + x_a x_a^\top)^{-1} X^\top y + (A + x_a x_a^\top)^{-1} x_a y_a.$$

By the Sherman-Morrison formula,

$$(A + x_a x_a^\top)^{-1} = A^{-1} - \frac{A^{-1} x_a x_a^\top A^{-1}}{1 + x_a^\top A^{-1} x_a},$$

so

$$(A + x_a x_a^\top)^{-1} X^\top y = A^{-1} X^\top y - \frac{A^{-1} x_a x_a^\top A^{-1} X^\top y}{1 + x_a^\top A^{-1} x_a}, \qquad (A + x_a x_a^\top)^{-1} x_a y_a = A^{-1} x_a y_a - \frac{A^{-1} x_a x_a^\top A^{-1} x_a y_a}{1 + x_a^\top A^{-1} x_a},$$

and

$$\hat{\theta}_{t+1} - \hat{\theta}_t = \frac{A^{-1} x_a (y_a - x_a^\top \hat{\theta}_t)}{1 + x_a^\top A^{-1} x_a}.$$

Hence, on the evaluation inputs $X_*$,

$$\hat{y}_{t+1} - \hat{y}_t = \frac{X_* A^{-1} x_a (y_a - x_a^\top \hat{\theta}_t)}{1 + x_a^\top A^{-1} x_a},$$

and the criterion becomes

$$J(a) = \|\hat{y}_{t+1} - \hat{y}_t\|^2 = \left(\frac{y_a - x_a^\top \hat{\theta}_t}{1 + x_a^\top A^{-1} x_a}\right)^2 x_a^\top A^{-1} X_*^\top X_* A^{-1} x_a.$$

The criterion $T_1 = \|\hat{y}_t - \hat{y}_{t+1}\|$ approximates $\Delta G$; most importantly, high values of $\|\hat{y}_t - \hat{y}_{t+1}\|^2$ should correspond to high values of $\Delta G$, since those are the points that are likely to be chosen. The selected point is $\arg\max_a J(a)$. Since the label $y_a$ is not known before querying, the criterion is averaged over possible ratings: $\sum_r P(y_a = r)\, J(a \mid y_a = r)$.

Decomposition of the criterion. Writing

$$J(a) = \frac{(y_a - x_a^\top \hat{\theta}_t)^2 \; x_a^\top A^{-1} X_*^\top X_* A^{-1} x_a}{(1 + x_a^\top A^{-1} x_a)^2} = \frac{J_R\, J_S}{J_P},$$

with

$$J_R = (y_a - x_a^\top \hat{\theta}_t)^2, \qquad J_S = x_a^\top A^{-1} X_*^\top X_* A^{-1} x_a = \sum_{x_t \in X_*} (x_a^\top A^{-1} x_t)^2, \qquad J_P = (1 + x_a^\top A^{-1} x_a)^2,$$

$J_R$ measures the disagreement between the hypothesized label $y_a$ and the current prediction $x_a^\top \hat{\theta}_t$. Since $x_a \in X_*$,

$$J_S = (x_a^\top A^{-1} x_a)^2 + \sum_{x_t \in X_* \setminus x_a} (x_a^\top A^{-1} x_t)^2,$$

so

$$\frac{J_S}{J_P} = \frac{(x_a^\top A^{-1} x_a)^2}{(1 + x_a^\top A^{-1} x_a)^2} + \frac{\sum_{x_t \in X_* \setminus x_a} (x_a^\top A^{-1} x_t)^2}{(1 + x_a^\top A^{-1} x_a)^2} = J_O + J_T.$$

Interpretation of $J_O$ and $J_T$. Let $X^\top X = \sum_{i=1}^p \lambda_i \varphi_i \varphi_i^\top$ be the eigendecomposition, with $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_m > \lambda_{m+1} = \ldots = \lambda_d = 0$, so that $A = X^\top X + \varepsilon I = \sum_{i=1}^p \lambda_i \varphi_i \varphi_i^\top + \varepsilon I$. Setting $a = x_a^\top A^{-1} x_a$, we have $J_O = a^2 / (a + 1)^2$, which is increasing in $a$, and

$$a = \sum_{i=1}^p (x_a^\top \varphi_i)^2 \frac{1}{\lambda_i + \varepsilon} = \sum_{i=1}^m (x_a^\top \varphi_i)^2 \frac{1}{\lambda_i + \varepsilon} + \frac{1}{\varepsilon} \sum_{i=m+1}^p (x_a^\top \varphi_i)^2 \approx \frac{1}{\varepsilon} \sum_{i=m+1}^p (x_a^\top \varphi_i)^2,$$

since $\varepsilon \ll 1$. Thus $J_O$ favors points $x_a$ with large components along $\{\varphi_i\}_{i=m+1}^p$, i.e. outside the span of the current training inputs, rather than along $\{\varphi_i\}_{i=1}^m$.

For $J_T$, since $(1 + x_a^\top A^{-1} x_a)^2 \ge 1$,

$$J_T \le \sum_{x_t \in X_* \setminus x_a} (x_a^\top A^{-1} x_t)^2,$$

and

$$x_a^\top A^{-1} x_t = \sum_{i=1}^p (x_a^\top \varphi_i)(x_t^\top \varphi_i) \frac{1}{\lambda_i + \varepsilon} \approx \frac{1}{\varepsilon} \sum_{i=m+1}^p (x_a^\top \varphi_i)(x_t^\top \varphi_i),$$

so

$$J_T \approx \sum_{x_t \in X_* \setminus x_a} \left( \frac{1}{\varepsilon} \sum_{i=m+1}^d (x_a^\top \varphi_i)(x_t^\top \varphi_i) \right)^2,$$

i.e. $J_T$ relates the out-of-span components of $x_a$ to those of the remaining evaluation points $X_* \setminus x_a$.
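The rank-one update above can be checked numerically: the closed-form expression for $\hat{y}_{t+1} - \hat{y}_t$ should agree exactly with a full refit. This is a verification sketch on arbitrary toy data, not the original implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 3))          # current training inputs
y = rng.standard_normal(8)               # current training labels
X_star = rng.standard_normal((50, 3))    # evaluation inputs
x_a, y_a = rng.standard_normal(3), 0.7   # candidate point and hypothesized label
eps = 1e-2

A = X.T @ X + eps * np.eye(3)
theta_t = np.linalg.solve(A, X.T @ y)

# Closed form: y_hat_{t+1} - y_hat_t = X* A^-1 x_a (y_a - x_a^T theta_t) / (1 + x_a^T A^-1 x_a)
A_inv_xa = np.linalg.solve(A, x_a)
delta_closed = X_star @ A_inv_xa * (y_a - x_a @ theta_t) / (1 + x_a @ A_inv_xa)

# Direct refit with (x_a, y_a) appended to the training set
X_new, y_new = np.vstack([X, x_a]), np.append(y, y_a)
theta_t1 = np.linalg.solve(X_new.T @ X_new + eps * np.eye(3), X_new.T @ y_new)
delta_refit = X_star @ (theta_t1 - theta_t)

print(np.allclose(delta_closed, delta_refit))  # True: the update is exact
```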

Conceptual Justifications

Page 24: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Connecting People: Collaboration Networks

[Figure 1: Intra-University Networks: (a) Stanford University, (b) Harvard University, (c) MIT, (d) UC Berkeley. Networks of the above universities are expanded in a breadth-first manner up to a depth of 2, showing each university, its alumni, and the companies they are associated with through employment, investment, or other activities (Section III-A1). Node size reflects node degree (log-scaled). Legend: University / Alumnus / Company.]

…relation to the number of alumni nodes differs. Stanford University has a significantly higher ratio of companies per alumni in leadership roles, followed by Harvard, trailed by Berkeley and MIT. A high ratio of company nodes indicates that alumni have been involved with multiple companies, either through employment, advisory, or investment activities. In addition, the number of highly connected alumni (large nodes with many connections located on the perimeter) differs significantly between universities (we further explore this in Section III-B). One particular characteristic of highly connected alumni stands out, namely their collaboration patterns. Stanford's densely connected alumni are highly likely to collaborate with fellow alumni (indicated by the company nodes being pulled away from highly connected alumni towards other less-connected alumni in the center). In the networks of other universities, the collaboration of highly connected individuals with their fellow alumni is evidenced, but to a lesser degree.

Harvard alumni appear active in leadership positions in technology-based startups, even more so than MIT alumni (Figure 1b vs. 1c). A possible explanation for the relatively lower level of MIT alumni may be the focus of this dataset on leadership positions in organizations. While engineers play a key role, they often do so in a technology-development capacity rather than in the leadership positions that are visible in public-relations communications. Some support for this explanation may be seen in Figure 2 in the relatively large distance between Microsoft and the University of Washington, even though a large number of engineers at Microsoft are indeed from the University of Washington.

2) Inter-University Network: A graphic representation of the alumni-based inter-relations between universities, shown in Figure 2, was produced as follows. Four universities were selected for analysis: Stanford University, University of California, Berkeley, Harvard University, and the Massachusetts Institute of Technology (MIT). From these nodes we performed a breadth-first expansion up to the depth of three: the 1st level being the alumni of these universities, the 2nd level the companies with which the alumni have relations, and the 3rd level the entities/nodes linked to the previous levels, including financial organizations, company employees, etc. Since we are primarily interested in relationships among alumni, all other entities are faded out, except for the above-mentioned universities and their alumni. In addition, we glimpse at the relations between alumni and investment firms (a very important factor for entrepreneurism); therefore, nodes of financial organizations are not faded out either.

Figure 2: Inter-University Network (between Stanford, Harvard, MIT, Berkeley) (Section III-A2). The network is obtained by starting with the nodes of the above-mentioned universities and performing a breadth-first expansion up to the depth of 3. Node size is log-scaled by centrality; node types: university, person, company, financial organization.
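A sketch of the breadth-first expansion described above, using networkx on placeholder nodes (the actual dataset and node attributes are not reproduced here):

import networkx as nx

def bfs_subgraph(G, seeds, depth):
    """Return the subgraph of G within `depth` hops of any seed node."""
    keep, frontier = set(seeds), set(seeds)
    for _ in range(depth):
        frontier = {nbr for u in frontier for nbr in G.neighbors(u)} - keep
        keep |= frontier
    return G.subgraph(keep)

# Toy network: university -> alumnus -> company -> financial organization.
G = nx.Graph()
G.add_edges_from([
    ("Stanford", "alum_1"), ("alum_1", "Company_A"),
    ("Company_A", "FinOrg_1"), ("Stanford", "alum_2"),
    ("Berkeley", "alum_3"), ("alum_3", "Company_A"),
])
inter_univ = bfs_subgraph(G, ["Stanford", "Berkeley"], depth=3)
print(sorted(inter_univ.nodes()))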

Two distinct groups – universities (in the lower left corner) and financials (in the upper right corner) – are visible in the subdued edges in Figure 2. The distance from the universities to the cloud of 'financial' clusters also varies. In particular, Stanford and Berkeley are rather close to the financial cloud. This may be explained by the geographical proximity of these universities to one of the largest sources of venture funding – Silicon Valley. While the universities themselves are not embedded within the financial clusters, a noticeable proportion of alumni are deeply connected within them, having direct or indirect relations with multiple financial organizations. Stanford has the largest number of alumni connected to the financial cluster, followed by Harvard (even though the university itself is relatively distant from the financial cluster), then Berkeley, with only a few alumni from MIT. The proximity between alumni and their alma maters also appears to differ significantly: Berkeley alumni tend to be clustered together, MIT alumni to a somewhat lesser degree, while Stanford and Harvard alumni are rather dispersed. Proximity between universities differs as well. Stanford and Berkeley are close together (many alumni hold leadership positions in the same companies). One likely explanation for this network proximity is the geographical proximity of both universities to Silicon Valley, where many of the investment firms and startup companies are located. Harvard and MIT do not appear to have as strong relations with other universities in these settings.

Figure 3: Universities within the Business Network (partial snapshot) (Section III-A3). Node size is log-scaled by degree centrality; university nodes are highlighted, while the other entity types (companies, people, financial organizations) are faded out for better visibility. Note that node locations differ significantly from Figure 2 due to the additional forces exerted by the very large number of nodes and links of the complete network (144,685 nodes and 129,423 links).

3) Universities Within the Technology-Based Business Network: Through alumni, universities become indirectly linked to a variety of business entities – technology-based companies, the service organizations that support them, and investment firms. The positions of universities within the technology-based business network (Figure 3) are determined by their direct links to alumni only. However, the proximity and location of universities within the business network (Figure 3) differ from those in the inter-university network (Figure 2). It should be noted that a large number of nodes and links that were not included in the inter-university network are, in fact, included in the full network layout. The cluster- and force-based layout algorithms used in this analysis place nodes that have many interconnections close together. Moreover, both direct and indirect links influence the position of nodes within the network. Hence, the pattern of nodes differs significantly between the Business Network and the Inter-University Network.

Let us look at the proximity between universities and companies. While Microsoft and Yahoo are close to many major universities, Google appears distant from them. Discovering the precise explanation for this warrants further investigation, but let us suggest two hypotheses. As we briefly discussed in Section III-A2 and Section II-C, our dataset is focused on 'key' people within a company (e.g., those mentioned in press releases). Unlike many other companies, Google tends to give credit to its engineers, e.g., names of engineers are mentioned in press releases. In addition, Google has experienced very rapid employee growth, which has required establishing relationships with many universities to meet hiring goals.

B. Data Analysis

In addition to examining networks visually, we use several network measures to reveal the characteristics and patterns of the underlying network. One of the biggest advantages of numerical analysis of network data is the ability to analyze very large networks; in visual analysis, patterns in large networks quickly become difficult to discern (Figures 2, 3). For the numerical analysis, we used the full set of data described in Section II-C; the constructed network contains 144,685 nodes and 129,423 links, including over 2,100 educational institutions. Due to space limitations, we report the network properties of the 20 universities with the largest number of alumni in our dataset.

Social Network Metrics: Network metrics numerically express characteristics and patterns of the underlying network. For this analysis, we have chosen the following network measures: centrality (betweenness centrality and closeness centrality) and eccentricity. Centrality reflects the relative importance of a node within the graph. Betweenness and closeness centrality are typical measures of centrality [5], [3]. Betweenness centrality can be thought of as a kind of bridge/broker score: a measure of how much the connections between other nodes in the network would be disrupted by removing that node. If an alum has very novel and highly desired expertise, s/he may provide crucial and rather exclusive links for doing business in that domain. More precisely, betweenness centrality measures how frequently a node appears on the shortest paths between nodes in the network. Closeness centrality, on the other hand, reflects how close a node is to all other nodes (the inverse of its average shortest-path distance), while eccentricity is the maximum distance from a node to any other node.
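For reference, the three measures can be computed with networkx; a toy stand-in graph is used here (the full network analyzed above has 144,685 nodes):

import networkx as nx

G = nx.karate_club_graph()                 # stand-in for the alumni network
bc = nx.betweenness_centrality(G)          # bridge/broker score (shortest-path based)
cc = nx.closeness_centrality(G)            # inverse of mean shortest-path distance
ecc = nx.eccentricity(G)                   # maximum distance to any other node

top = max(bc, key=bc.get)                  # the strongest "broker" node
print(top, bc[top], cc[top], ecc[top])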

University-Company Network

(a) Betweenness Centrality of Alumni Nodes in Networks (University and alumni nodes are disconnected). [Chart residue removed: y-axis 'Betweenness Centrality', x-axis 'Alumni Index', legend listing the 20 universities; the plotted values are not recoverable.]

(b) Betweenness Centrality of Alumni Nodes in Networks (University and alumni nodes are connected). [Chart residue removed: same axes and legend as (a).]

Figure 4: Betweenness centrality (y-axis) of alumni nodes in the network, connected to the university node (b) and disconnected from the university node (a). The x-axis corresponds to alumni ordered in descending order of centrality for each of the universities. Note that the scale of the y-axis differs between (a) and (b).

Table II: Network Metrics of Alumni Nodes in Networks (University and alumni nodes are connected). Columns: # of Alumni, Median Betweenness Centrality, Median Eccentricity, Median Closeness Centrality. Note that the heading "# of Alumni" refers to the number of alumni in the IEN dataset (Section II-C). [Table values are not recoverable from the extraction.]

of these data-gathering and network-analysis approaches. Recent applications of network analysis have demonstrated their power in understanding social norms, inter-firm relationships, and influence. Continuing developments combining network analysis and machine learning are opening opportunities for predictive methods as well. Alumni and their connections – to several educational institutions and to several business entities – provide a data domain with deep dimension. Alumni networks can be both simple and complex. The complexity of the relationships among alumni active in technology-based business affords extensive inquiry: many-mode, small- and large-scale, directed, and time-scaled. The authors invite collaboration on these frontiers. Legends, novels, and films have been made about the academic cohort – a graduating class, a student course-project team, a laboratory group. Experiences at educational institutions are often profound and memorable. The connections formed through those experiences – through processes of self-discovery, learning, creativity, invention, and collaboration – enable the personal and professional contributions of graduates. Educational institutions stand to benefit substantially from better understanding the connections of their alumni. Insights from this understanding could be leveraged to guide the curricula and enrichment programs that comprise the educational experience. The power of visualization can be harnessed to develop a shared mental model among faculty, administrators, and donors, toward which resources will be applied. This might include curricular and extracurricular attention to students' personal and professional network development in a global environment, in which life-long and life-wide learning yields competitive advantage.

REFERENCES

[1] N. Rubens, K. Still, J. Huhtamaki, and M. G. Russell, "Leveraging social media for analysis of innovation players and their moves," tech. rep., Media X, Stanford University, Feb. 2010.

[2] L. C. Freeman, Encyclopedia of Complexity and Systems Science, ch. Methods of Social Network Visualization. Berlin: Springer, 2009.

[3] M. Bastian, S. Heymann, and M. Jacomy, "Gephi: An open source software for exploring and manipulating networks," 2009.

[4] S. Martin, W. M. Brown, R. Klavans, and K. Boyack, "OpenOrd: An open-source toolbox for large graph layout," in SPIE Conference on Visualization and Data Analysis (VDA), 2011.

[5] D. Hansen, B. Shneiderman, and M. Smith, Analyzing Social Networks with NodeXL: Insights from a Connected World. Morgan Kaufmann, 2010.


Page 25: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

[Diagram: expertise finding and group formation – a Task Description is matched against Researcher Profiles (what each researcher 'Knows') to assemble a Group.]

!"#$%&'#()*+,-.%/'010%"("&'2*(%+"+')3%

4'#'")(5')%

6'1'.&%

&"7"%2*.*.1%896'").*.1%

!"#$%&'(&)*+&*,-)&.(./)

(-.7)*:;7'&%$.-<='&1'%

!"#$%&'#()*+,-.%

!"#!$%&

!"#!$%&

#$'()(*+&,'%&),&!"#!$%&

Generalized Assignment Problem

Objective: given a paper $p$, choose a group of experts $M \subseteq \mathbb{M}$ (of a fixed size $s$) that collectively possesses the most expertise about $p$:

\[
\text{maximize } R(M, p) = \sum_{m \in M} r(m, p) \quad (1)
\qquad
\text{subject to } |M| = s. \quad (2)
\]

Challenge: expertise is not additive,

\[
R(M, p) \neq \sum_{m \in M} r(m, p). \quad (3)
\]

Group Expertise Estimation. Assumption: $M^*_p$ are the authors of paper $p$, so $R(M^*_p, p) = 1$, and $R(M, p) = 0$ when $M \cap M^*_p = \varnothing$; in between,

\[
R(M, p) = \frac{|M \cap M^*_p|}{|M^*_p|}. \quad (4)
\]

Learn $\hat{R}$ and use it to estimate group expertise, using both semantic and structural features and an ensemble predictive model.

Training set: $T = (X_T, Y_T) = \{(x_i, f(x_i)) : x_i \in X_T\}$. The function learned from $X_T$ (and the corresponding $Y_T$) is $\hat{R}_T$; its generalization error is $G(\hat{R}_T) = L(\hat{R}_T, f)$, and the goal is $\min_{X_T} G(\hat{R}_T)$.

Let $g$ be the optimal function in the solution space, $\hat{R}$ the learned function, and the $\hat{R}_i$'s functions learned from slightly different training sets. The expected generalization error then decomposes as

\[
EG = B + V + C,
\qquad
B = \big( E\hat{R}(x) - g(x) \big)^2,
\quad
V = \big( \hat{R}(x) - E\hat{R}(x) \big)^2,
\quad
C = \big( g(x) - f(x) \big)^2.
\]

Use the change in the estimates, $\|\hat{y}_t - \hat{y}_{t+1}\|^2$, to estimate the improvement in generalization error, where $\hat{y}_{t+1}$ are the estimates after $(x', y')$ was added to the training set. Let $G_t$ denote the generalization error when the number of training points equals $t$, and $G_{t+1}$ the generalization error after $(x', y')$ is added, so that $\Delta G = G_t - G_{t+1}$ is the improvement and

\[
\min_{x'} G_{t+1} = \max_{x'} \Delta G,
\qquad
\Delta G = J + K,
\quad
J = \|\hat{y}_t - \hat{y}_{t+1}\|^2,
\quad
K = 2\,\big\langle \hat{y}_{t+1} - \hat{y}_t,\; y^* - \hat{y}_{t+1} \big\rangle.
\]

$\Delta G$ can only be estimated, since the true output values $y^*$ are not accessible; in particular, the term $K$ requires estimating $y^*$ (all values).
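Since group expertise is not additive (Eq. 3), a learned group scorer has to be queried jointly rather than summed per member. A minimal sketch of greedy forward selection under a hypothetical scorer R_hat (here a toy topic-coverage function, not the ensemble model of the slides):

def greedy_group(candidates, p, R_hat, s):
    """Grow M one expert at a time, each step maximizing R_hat(M + {m}, p)."""
    M = set()
    while len(M) < s:
        best = max((m for m in candidates if m not in M),
                   key=lambda m: R_hat(M | {m}, p))
        M.add(best)
    return M

# Toy scorer with diminishing returns for overlapping expertise (illustrative).
expertise = {"a": {"ml"}, "b": {"ml", "networks"}, "c": {"education"}}

def R_hat(M, p):
    covered = set().union(*(expertise[m] for m in M)) if M else set()
    return len(covered & p) / len(p)

print(greedy_group(expertise, {"ml", "networks", "education"}, R_hat, s=2))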

Connecting People & Contents: Expertise Finding & Group Formation

Page 26: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

http://cmap.ccdmd.qc.ca/rid=1225215801935_1377957481_826/

Operationalization of complex conceptual models

Motivation / Internalization

Connecting Models

Page 27: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Table 10.1. Motivation Terms: Author-by-Factor Matrix. Authors (columns): Wlodkowski, Paulsen(a), Donald, Keller, MacKinnon, Panitz, Feldman(b), Nuhfer, Farmer, Theall(c), Pintrich, Forsyth(d), Chickering(e). Factors (rows), with the number of authors marking each (the column alignment was lost in extraction):

Inclusion (5), Community (4), Climate (2), Ownership (6);
Attitude (4), Affect (1), Interest (2), Awareness (2), Attention (1), Enthusiasm (1);
Meaning (2), Relevance (5), Value (4);
Competence (3), Empowerment (3), Confidence (2), Expectancy (3);
Leadership (2), High expectations (3), Structure (5), Feedback (3), Support (3);
Satisfaction (2), Rewards (2).

(a) Paulsen and Feldman (Chapter Two). (b) Feldman and Paulsen (Chapter Seven). (c) Theall, Birdsall, and Franklin, 1997. (d) Forsyth and McMillan, 1991. (e) Chickering and Gamson, 1987.

(Theall & Franklin, 1999)

Model Diversity

Page 28: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Complex Modeling

[Pipeline diagram] Dataset Construction: text (tweets) paired with motivation labels. Feature Extraction: conceptual extractors (relatedness, sentiment, ...) and textual extractors (part of speech, dependency, ...) produce features. Sub-Modeling: conceptual sub-models (Expectancy-Value Theory, Cognitive Dissonance Theory, ...) and computational sub-models (M1, M2, ...), with feature selection per sub-model. Aggregation: aggregators (A1-A4) combine the sub-model outputs, feeding a classifier built via ensemble construction. (The 'Typical Approach' is marked in the diagram for comparison.)
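A rough sklearn sketch of such a pipeline (the library choice, sub-models, and toy data are all assumptions, not the authors' implementation): textual features feed two sub-models whose soft-voted predictions play the role of the aggregation step:

from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

tweets = [
    "really want to ace this course",      # toy motivation-labeled tweets
    "so bored with this homework",
    "excited to learn network analysis",
    "cannot be bothered to study",
]
labels = [1, 0, 1, 0]                      # 1 = motivated, 0 = not (illustrative)

clf = make_pipeline(
    TfidfVectorizer(),                     # textual feature extraction
    VotingClassifier(                      # aggregation over sub-models M1, M2
        estimators=[("m1", LogisticRegression()), ("m2", MultinomialNB())],
        voting="soft",
    ),
)
clf.fit(tweets, labels)
print(clf.predict(["need to study for the exam"]))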

Page 29: Network Learning: AI-driven Connectivist Framework for E-Learning 3.0

Networks are crucial for learning

• use AI methods to construct / optimize networks
– connecting contents
– connecting people
– connecting people & contents
– connecting models

Summary

[Diagram] Concept Extraction: documents → concepts. Semantic Mapping: concepts mapped within their context (documents). Knowledge Level Estimation: over the concept network. Group Formation: for tasks. Influence Estimation: from the interaction log.