Network Questions: Structural
1. How many connections does the average node have?
2. Are some nodes more connected than others?
3. Is the entire network connected?
4. On average, how many links are there between nodes?
5. Are there clusters or groupings within which the connections are particularly strong?
6. What is the best way to characterize a complex network?
7. How can we tell if two networks are “different”?
8. Are there useful ways of classifying or categorizing networks?
slides from David P. Feldman
Network Questions: Communities
1. Are there clusters or groupings within which the connections are particularly strong?
2. What is the best way to discover communities, especially in large networks?
3. How can we tell if these communities are statistically significant?
4. What do these clusters tell us in specific applications?
Network Questions: Dynamics of
1. How can we model the growth of networks?
2. What are the important features of networks that our models should capture?
3. Are there “universal” models of network growth? What details matter and what details don’t?
4. To what extent are these models appropriate null models for statistical inference?
5. What’s the deal with power laws, anyway?
Network Questions: Dynamics on
1. How do diseases/computer viruses/innovations/rumors/revolutions propagate on networks?
2. What properties of networks are relevant to the answer of the above question?
3. If you wanted to prevent (or encourage) spread of something on a network, what should you do?
4. What types of networks are robust to random attack or failure?
5. What types of networks are robust to directed attack?
6. How are dynamics of and dynamics on coupled?
Network Questions: Algorithms
1. What types of networks are searchable or navigable?
2. What are good ways to visualize complex networks?
3. How does Google PageRank work?
4. If the internet were to double in size, would it still work?
Network Questions: Algorithms
There are also many domain-specific questions:
1. Are networks a sensible way to think about gene regulation or protein interactions or food webs?
2. What can social networks tell us about how people interact and form communities and make friends and enemies?
3. Lots and lots of other theoretical and methodological questions...
4. What else can be viewed as a network? Many applications await.
Network Questions: Outlook
Advances in available data, computing speed, and algorithms have made it possible to apply network analysis to a vast and growing number of phenomena. This means that there is lots of exciting, novel work being done.
This work is a mixture of awesome, exploratory, misleading, irrelevant, relevant, fascinating, ground-breaking, important, and just plain wrong.
It is relatively easy to fool oneself into seeing things that aren’t there when analyzing networks. This is the case with almost anything, not just networks.
For networks, how can we be more careful and scientific, and not just descriptive and empirical?
Lecture 3:
Mathematics of Networks
CS 765: Complex Networks
Slides are modified from Networks: Theory and Application by Lada Adamic
What are networks?
Networks are collections of points joined by lines.
“Network” ≡ “Graph”
points      lines             domain
vertices    edges, arcs       math
nodes       links             computer science
sites       bonds             physics
actors      ties, relations   sociology
[Figure: a small network with one node and one edge labeled]
Network elements: edges
Directed (also called arcs): A -> B (EBA)
  e.g. A likes B, A gave a gift to B, A is B’s child
Undirected: A <-> B or A – B
  e.g. A and B like each other, A and B are siblings, A and B are co-authors
Edge attributes:
  weight (e.g. frequency of communication)
  ranking (best friend, second best friend, …)
  type (friend, relative, co-worker)
  properties depending on the structure of the rest of the graph, e.g. betweenness
Multiedge: multiple edges between the same pair of nodes
Self-edge: an edge from a node to itself
Directed networks
[Figure: directed network in which each arrow is labeled 1 (first choice) or 2 (second choice). Nodes: Ada, Cora, Louise, Jean, Helen, Martha, Alice, Robin, Marion, Maxine, Lena, Hazel, Hilda, Frances, Eva, Ruth, Edna, Adele, Jane, Anna, Mary, Betty, Ella, Ellen, Laura, Irene]
girls’ school dormitory dining-table partners (Moreno, The sociometry reader, 1960)
first and second choices shown
Edge weights can have positive or negative values
One gene activates/inhibits another
One person trusts/distrusts another
Research challenge: how does one ‘propagate’ negative feelings in a social network? Is my enemy’s enemy my friend?
[Figure: transcription regulatory network in baker’s yeast]
Adjacency matrices
Representing edges (who is adjacent to whom) as a matrix:
$A_{ij} = 1$ if node i has an edge to node j
$A_{ij} = 0$ if node i does not have an edge to node j
$A_{ii} = 0$ unless the network has self-loops
If there is a self-loop, $A_{ii} = ?$
$A_{ij} = A_{ji}$ if the network is undirected, or if i and j share a reciprocated edge
Example (directed network on nodes 1 to 5):
$$
A = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0
\end{pmatrix}
$$
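The definition of the adjacency matrix can be checked with a short sketch. It assumes the directed edges are the nonzero entries of the five-node example, written as (source, target) pairs; the function names `adjacency_matrix` and `is_symmetric` are illustrative, not from the slides.

```python
# Directed edges of the five-node example, as (source, target) pairs.
# Node labels are 1-based, as on the slide.
edges = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]
n = 5


def adjacency_matrix(edges, n):
    """Return an n x n 0/1 matrix with A[i][j] = 1 if there is an edge i -> j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = 1  # shift 1-based labels to 0-based indices
    return A


def is_symmetric(A):
    """True if A[i][j] == A[j][i] for all i, j (every edge is reciprocated)."""
    size = len(A)
    return all(A[i][j] == A[j][i] for i in range(size) for j in range(size))


A = adjacency_matrix(edges, n)
for row in A:
    print(*row)

# The example is directed and not fully reciprocated, so A is not symmetric.
print(is_symmetric(A))
```

Only the pair 2 and 3 is reciprocated here, so $A_{23} = A_{32} = 1$ while, for instance, $A_{24} = 1$ but $A_{42} = 0$.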
Adjacency lists
Edge list:
2 3
2 4
3 2
3 4
4 5
5 2
5 1
An adjacency list is easier to work with if the network is large and sparse, and it lets you quickly retrieve all neighbors of a node:
1:
2: 3 4
3: 2 4
4: 5
5: 1 2
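As a rough illustration of the conversion from edge list to adjacency list, here is a minimal sketch using a plain dictionary. The function name `adjacency_list` is an assumption made for this example; the edges are the ones listed on the slide.

```python
# Edge list from the slide: each pair is (source, target).
edges = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]


def adjacency_list(edges, nodes):
    """Map each node to the sorted list of nodes it points to."""
    adj = {v: [] for v in nodes}  # include nodes with no outgoing edges
    for src, dst in edges:
        adj[src].append(dst)
    for v in adj:
        adj[v].sort()
    return adj


adj = adjacency_list(edges, range(1, 6))
for node, neighbors in adj.items():
    print(f"{node}:", *neighbors)

# Retrieving the neighbors of one node is a single dictionary lookup.
print(adj[5])
```

Storage grows with the number of edges rather than with the square of the number of nodes, which is why this form suits large, sparse networks.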
Nodes
Node network properties from immediate connections:
indegree: how many directed edges (arcs) are incident on a node
outdegree: how many directed edges (arcs) originate at a node
degree (in or out): number of edges incident on a node
Example from the figure: outdegree = 2, indegree = 3, degree = 5
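These degree definitions translate directly into row and column sums of an adjacency matrix. The sketch below assumes the convention used earlier in the lecture ($A_{ij} = 1$ for an edge from i to j) and reuses the five-node example matrix; the function names are illustrative.

```python
# Adjacency matrix of the five-node directed example (A[i][j] = 1 for an edge i -> j).
A = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]


def out_degrees(A):
    """Outdegree of node i is the sum of row i (edges leaving i)."""
    return [sum(row) for row in A]


def in_degrees(A):
    """Indegree of node j is the sum of column j (edges arriving at j)."""
    return [sum(col) for col in zip(*A)]


out_deg = out_degrees(A)
in_deg = in_degrees(A)
total_deg = [o + i for o, i in zip(out_deg, in_deg)]

print("outdegree:", out_deg)
print("indegree: ", in_deg)
print("degree:   ", total_deg)
```

Both degree sequences sum to the number of edges (7 in this example), since every edge leaves exactly one node and arrives at exactly one node.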
Hypergraphs
Edges join more than two nodes at a time (hyperedges)
Affiliation networks
Examples: families, subnetworks
Can be transformed to a bipartite network
[Figure: two drawings of nodes A, B, C, D]
Bipartite (two-mode) networks
Edges occur only between two groups of nodes, not within those groups.
For example, we may have:
  individuals and events
  directors and boards of directors
  customers and the items they purchase
  metabolites and the reactions they participate in
In matrix notation:
$B_{ij} = 1$ if node i from the first group links to node j from the second group
$B_{ij} = 0$ otherwise
B is usually not a square matrix! For example, with n customers and m products, B is n × m.
$$
B = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$
Going from a bipartite to a one-mode graph
One-mode projection:
  two nodes from the first group are connected if they link to the same node in the second group
  naturally high occurrence of cliques
  some loss of information
  can use weighted edges to preserve group occurrences
[Figure: two-mode network with group 1 and group 2 nodes]
Collapsing to a one-mode network
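i and k are linked if they both link to j
$P_{ij} = \sum_k B_{ki} B_{kj}$, that is, $P = B^T B$ (connects nodes of the second group)
$P' = B B^T$, that is, $P'_{ik} = \sum_j B_{ij} B_{kj}$ (connects nodes of the first group)
The transpose of a matrix swaps rows and columns: $(B^T)_{xy} = B_{yx}$
If B is an n × m matrix, $B^T$ is an m × n matrix
$$
B = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\qquad
B^T = \begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$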
Matrix multiplication
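General formula for matrix multiplication: $Z_{ij} = \sum_k X_{ik} Y_{kj}$
Let $Z = P'$, $X = B$, $Y = B^T$:
$$
P' =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 2 & 2 & 0 \\
1 & 1 & 2 & 4 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$
For example, the entry in row 4, column 3 is row 4 of B times column 3 of $B^T$:
$$
P'_{43} = \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
= 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 + 1 \cdot 0 = 2
$$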
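The projection can be reproduced with a few lines of plain Python. This is a minimal sketch that applies the general multiplication formula directly; the helper names `transpose` and `matmul` are assumptions for this example, and a numerical library would normally be used for larger matrices.

```python
# Bipartite matrix from the slides: rows are group-1 nodes, columns are group-2 nodes.
B = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]


def transpose(M):
    """Swap rows and columns: result[x][y] = M[y][x]."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Z[i][j] = sum over k of X[i][k] * Y[k][j]."""
    inner = len(Y)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


BT = transpose(B)
P_prime = matmul(B, BT)  # 5 x 5: one-mode projection onto group 1
P = matmul(BT, B)        # 4 x 4: one-mode projection onto group 2

for row in P_prime:
    print(*row)

# Entry in row 4, column 3 (1-based), matching the worked example.
print(P_prime[3][2])
```

Multiplying in the other order, `matmul(BT, B)`, gives the 4 × 4 projection onto the second group instead.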
Collapsing a two-mode network to a one-mode network
Assume the nodes in group 1 are people and the nodes in group 2 are movies
The diagonal entries of P’ give the number of movies each person has seen
The off-diagonal elements of P’ give the number of movies that both people have seen
P’ is symmetric
$$
P' = \begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 2 & 2 & 0 \\
1 & 1 & 2 & 4 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$
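Reading the projection matrix this way can be sketched in code. The example below takes P’ as given on the slide and, under the people-and-movies interpretation, pulls out the diagonal (movies seen per person) and the nonzero off-diagonal entries (weighted edges of the one-mode network). The variable names are illustrative only.

```python
# Projection matrix P' from the slide (people x people).
P_prime = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 1, 2, 2, 0],
    [1, 1, 2, 4, 1],
    [0, 0, 0, 1, 1],
]
n = len(P_prime)

# Diagonal entries: number of movies each person has seen.
movies_seen = [P_prime[i][i] for i in range(n)]

# Off-diagonal entries: number of movies two people have both seen.
# P' is symmetric, so only pairs with i < j are listed (1-based labels).
shared_movies = {
    (i + 1, j + 1): P_prime[i][j]
    for i in range(n)
    for j in range(i + 1, n)
    if P_prime[i][j] > 0
}

print("movies seen:", movies_seen)
for (a, b), weight in shared_movies.items():
    print(f"persons {a} and {b} share {weight} movie(s)")
```

The weights keep some of the information that a plain 0/1 projection would lose: persons 3 and 4 share two movies, while every other linked pair shares one.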