Topological Properties of Robust Biological & Computational Networks

Research paper by Saket Navlakha, Xin He, Christos Faloutsos, and Ziv Bar-Joseph



Overview

Focus:

Network robustness is an important principle in biology and engineering.

Topological Properties:

Redundancy & Sparseness are important properties used by robust networks.

Module topology is tightly linked to the level of environmental variability (noise) the module expects to encounter.

Modules internal to the cell that are less exposed to environmental noise are more connected and less robust than external modules.

Networks are evaluated and designed at the module level, so that each module can be optimized for high robustness in noisy or malicious environments.

Combined joint analysis of biological and computational networks leads to novel algorithms and insights benefiting both fields.

Introduction

What is robustness?

In computer science, robustness is the ability of a computer system to cope with errors during execution.

In biology, robustness (the ability to keep cellular processes running, or to maintain performance in the face of perturbations and uncertainty) is a long-recognized key property of living systems.

Robustness to failures, environmental and signaling noise, and attacks is a key requirement.

Protein interaction networks, in particular, are robust to most single and double node failures, to expression and environmental noise, and to viral and bacterial attacks.

How does connectivity affect the robustness of molecular interaction networks?

Structural redundancy

Advantage:

Redundant pathways / Parallel pathways

Dense subnetworks

Compensate for loss-of-function mutations

Disadvantage:

High connectivity leads to functional coupling of different components, making a network more susceptible to cascading local failures or attacks.

E.g., a clique

Sparsely connected networks

Bottlenecks

Weakly linked modules

Network motifs

Important considerations: High-degree nodes in the global protein interaction network in yeast were not likely to be robust.

In other words, these gene nodes tended to be fragile / essential (deletion of these genes in genetic knock-out experiments results in cell death)

These hubs contributed to overall higher connectivity of the network

They act as bottlenecks: removing them disconnects many other protein interactions

Higher connectivity promotes robustness to perturbation under stabilizing selection

What to do?

Adding a new interaction may also introduce new degrees of freedom for perturbations

But when such costs are taken into consideration in a cost-benefit analysis, sparser networks have been shown to promote more robustness

Environmental noise is generally considered equal for all nodes, but in fact it affects some proteins more than others (e.g., external versus internal proteins)

Observations in global molecular networks:

Processing in general occurs within local subnetworks or modules

Studying module-level properties may therefore be important in determining network robustness to mutations

Cellular localization was found to be a useful feature when predicting gene essentiality in yeast, and nuclear proteins in particular were shown to be enriched for such essential genes

Studies beyond yeast have likewise shown that gene essentiality is concentrated in specific sub-networks.

However, the exact relationship between the level of robustness necessary for a biological process to operate and the topological properties that give rise to this robustness has so far not been determined.

Extending the study based on these observations:

Modules that are subject to large environmental influences are called external modules

Modules that are relatively insulated from external noise sources are called internal modules

Robust genes and modules in Saccharomyces cerevisiae

Saccharomyces cerevisiae is a species of yeast, perhaps the most useful yeast, having been instrumental to winemaking, baking, and brewing

Robustness under single-gene deletion was analyzed by integrating protein-protein and protein-DNA interactions

Out of 5976 proteins, 19.4% (i.e., 1122) were determined to be essential in normal growth conditions

Further analysis:

Gene Essentiality

To determine whether gene robustness is better predicted using global or module-level topology

The global network is decomposed into 50 modules corresponding to biological processes that are required for cell survival and growth.

The correlation between gene essentiality and gene centrality increased by roughly 50% when using network features of a gene derived from its local module topology rather than from the global network topology

Robust genes and modules in Saccharomyces cerevisiae: further analysis (cont'd)

Infect size:

Previous studies have shown that the more connected a node is in the network, the more fragile it is

This implies that such nodes are important to immunize in order to increase network robustness

In fact, the cell needs to respond to noise

Based on the susceptible-infectious (SI) model, the larger the infect size of a node, the more essential it is

Module-level infect size is also more predictive of essentiality than infect size computed on the global topology

Highly essential modules were swamped by noise more quickly, and were denser with a higher eigenvalue, than robust modules.

Higher eigenvalue implies higher likelihood of an outbreak
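The infect-size idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: infect size is taken here to be the average number of infected nodes after a fixed number of SI rounds started from one node, and the function name `infect_size`, the transmission probability `beta`, the number of rounds, and the toy star graph are all hypothetical choices, not values from the paper.

```python
import random

def infect_size(adj, seed, beta=0.3, steps=3, trials=500, rng=None):
    """Average number of infected nodes after `steps` SI rounds started at `seed`.

    adj   : dict mapping each node to a list of its neighbors
    beta  : per-edge, per-round transmission probability (assumed value)
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(trials):
        infected = {seed}
        for _ in range(steps):
            newly = set()
            for u in infected:
                for v in adj[u]:
                    # In the SI model infected nodes never recover,
                    # so only susceptible neighbors can change state.
                    if v not in infected and rng.random() < beta:
                        newly.add(v)
            infected |= newly
        total += len(infected)
    return total / trials

# Toy star graph: node 0 is a hub connected to five leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
hub = infect_size(star, 0)
leaf = infect_size(star, 1)
```

On this toy graph the hub has a larger infect size than a leaf, which matches the slide's point that highly connected nodes spread perturbations further.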

Results

Predicting node and module robustness in biological networks. The yeast interaction network was decomposed into 50 Gene Ontology modules. Gene essentiality was predicted using topological features of the gene computed within its local module.

Kendall rank correlation coefficient is calculated.
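As a reminder of what this statistic measures, here is a minimal pure-Python sketch of the Kendall rank correlation (the tau-a variant, where tied pairs count as neither concordant nor discordant). The slides do not say which variant was used, and the `centrality` and `essential` values below are made-up illustration data, not results from the study.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1      # both sequences order the pair the same way
        elif s < 0:
            discordant += 1      # the sequences disagree on the order
        # s == 0 is a tie and contributes to neither count
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical data: a centrality score per gene and a 0/1 essentiality label.
centrality = [5, 3, 4, 1]
essential = [1, 0, 1, 0]
tau = kendall_tau(centrality, essential)
```

A value near 1 means that genes ranked higher by the topological feature also tend to be the essential ones.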

Topological differences between internal and external processes

Internal modules are less susceptible to noise and promote higher connectivity

External modules are more exposed to environmental noise and promote sparser connectivity

A module's connectivity is summarized by the largest eigenvalue of its adjacency matrix (λ)
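To make the contrast between dense and sparse modules concrete, the sketch below computes edge density and an eigenvalue for two toy graphs. It assumes λ means the largest eigenvalue of the adjacency matrix and estimates it by power iteration; the helper names and the 5-node clique and path are illustrative only, and the code assumes a non-empty, connected, undirected graph.

```python
def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return m / (n * (n - 1) / 2)

def largest_eigenvalue(adj, iters=500):
    """Estimate the largest adjacency eigenvalue by power iteration."""
    nodes = list(adj)
    vec = {u: 1.0 for u in nodes}
    for _ in range(iters):
        # Multiply by (A + I): the shift keeps the iteration from
        # oscillating on bipartite graphs such as paths.
        new = {u: vec[u] + sum(vec[v] for v in adj[u]) for u in nodes}
        norm = sum(val * val for val in new.values()) ** 0.5
        vec = {u: val / norm for u, val in new.items()}
    # Rayleigh quotient v^T A v for the unit-length vector v.
    return sum(vec[u] * sum(vec[v] for v in adj[u]) for u in nodes)

clique = {i: [j for j in range(5) if j != i] for i in range(5)}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

lam_clique = largest_eigenvalue(clique)
lam_path = largest_eigenvalue(path)
```

The clique has both a higher density and a higher eigenvalue than the path, which is the pattern the slides associate with internal modules.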

Relationship between module essentiality and topology in other conditions and species

Other Conditions: Gene deletion is extensively studied in normal growth conditions (YPD)

It can also be analyzed under different conditions such as heat shock response modules in heat shock experiments

Results for yeast are almost the same under four other conditions

Essential modules are more highly connected than the robust modules

Relationship between module essentiality and topology in other conditions and species

Other species: Protein-protein and protein-DNA interactions were collected for the bacterium Escherichia coli

Out of 2915 proteins, 21.1% (i.e., 616) were determined to be essential in normal growth conditions

The global network is decomposed into 38 modules that are relevant for prokaryotes

As in yeast, gene essentiality is predicted better using module-level topology than global topology

Similar distinction between robust (external) and fragile (internal) processes based on their function and topology.

Additional biological networks support the focus on modules

The Caenorhabditis elegans neural network is analyzed

Nodes are neurons and directed edges are chemical synapses

The network is decomposed into 8 modules

Neurons that participate in input or output are labelled external neurons, and all others internal neurons

It is observed that modules with a higher percentage of internal neurons have a higher eigenvalue, density, and infect size than external modules

This is similar to the yeast observations

Across bacterial species

A similar study is conducted on metabolic networks for 75 bacterial species

Nodes are metabolites and edges are enzymatic reactions transforming one metabolite into another

A similar result was found: the networks of species that thrive in stable environments have a significantly higher eigenvalue, density, and infect size than those of species that inhabit dynamic and variable environments

A Computational model for generating modules with varying topologies

A small change to a standard duplication-divergence model gives rise to a complex range of topological features

A random node u is duplicated into a topologically equivalent node u' that is connected to all neighbors of u

For each common neighbor x of u and u', with probability qmod, remove either (u,x) or (u',x).

The two duplicates are themselves connected with probability qcon

To adjust the duplication model to account for varying module topologies, common-neighbor retention (controlled by qmod) was analyzed

For each module, the Jaccard coefficient is calculated between each gene in the module and its paralogues.

It is observed that the higher the average Jaccard coefficient, the more likely the module is to be essential.

Varying qmod still adheres to the original duplication model

Lower values of qmod generate dense, clique-like networks characteristic of internal modules

Higher values of qmod generate sparse networks.
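The steps above can be sketched as a small generator. This is an illustrative reading of the slides, not the authors' implementation: the single-edge seed graph, the uniform choice of which edge to drop, and the names `duplication_model` and `edge_count` are assumptions.

```python
import random

def duplication_model(n, qmod, qcon, rng=None):
    """Grow an undirected graph to n nodes by duplication and divergence.

    qmod : probability that a shared neighbor loses one of its two edges
    qcon : probability that a node and its duplicate are connected
    """
    rng = rng or random.Random(0)
    adj = {0: {1}, 1: {0}}            # seed graph: one edge (assumption)
    while len(adj) < n:
        u = rng.choice(sorted(adj))   # pick a random node to duplicate
        dup = len(adj)                # label of the new duplicate u'
        adj[dup] = set(adj[u])        # u' starts with all neighbors of u
        for x in adj[dup]:
            adj[x].add(dup)
        # Divergence: every neighbor of u is now shared by u and u'.
        for x in sorted(adj[u]):
            if rng.random() < qmod:
                loser = u if rng.random() < 0.5 else dup
                adj[loser].discard(x)
                adj[x].discard(loser)
        if rng.random() < qcon:
            adj[u].add(dup)
            adj[dup].add(u)
    return adj

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

dense = duplication_model(20, qmod=0.0, qcon=1.0)   # nothing is ever removed
sparse = duplication_model(20, qmod=1.0, qcon=0.0)  # every shared edge pair is split
```

At the two extremes the behaviour is easy to check by hand: with qmod = 0 and qcon = 1 the graph stays a clique, while with qmod = 1 and qcon = 0 each duplication adds and removes the same number of edges, so the graph never gains an edge.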


Biological insights for the analysis of secure communication networks

Failures and attacks are common in biological networks / communication networks

E.g., the Internet is regularly targeted by worms that infect machines, and transport networks such as the power grid have experienced widespread failures

Attacked systems are detected and isolated from the network for maintenance

If the network is densely connected, residual connectivity is low because the virus spreads to many nodes

If the network is sparsely connected, many pairs of nodes become disconnected, which results in equally low residual connectivity

The same logic was applied to yeast, and highly essential modules were found to have the lowest residual connectivity after infection

This is consistent with the observation that internal modules have topologies that promote efficiency more than robustness
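A rough way to compute such a measure is sketched below. The slides do not define residual connectivity precisely, so this sketch assumes it is the fraction of all original node pairs that can still reach each other once the infected nodes are removed; the function name and toy graphs are hypothetical, and the graph is assumed to have at least two nodes.

```python
def residual_connectivity(adj, infected):
    """Fraction of original node pairs still connected after removing `infected`."""
    n = len(adj)
    alive = set(adj) - set(infected)
    seen = set()
    connected_pairs = 0
    for start in alive:
        if start in seen:
            continue
        # Depth-first search restricted to surviving nodes.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        # Every pair inside one surviving component is still connected.
        connected_pairs += size * (size - 1) // 2
    return connected_pairs / (n * (n - 1) // 2)

clique = {i: [j for j in range(5) if j != i] for i in range(5)}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

dense_case = residual_connectivity(clique, infected={0, 1, 2})  # virus spread widely
sparse_case = residual_connectivity(path, infected={2})         # one loss splits the path
```

Both toy cases end up with low residual connectivity for different reasons: the clique loses most of its nodes, while the path loses a single node but falls apart.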

Biological insights for the analysis of secure communication networks (cont’d…)

The benchmark is tested on two communication networks

Gnutella peer-to-peer (P2P) file sharing network

A sequence of five snapshots of the graph of routers (representing the Internet)

Observed that P2P networks provide a robust storage mechanism and constantly deal with the addition and removal of nodes and edges

P2P networks have higher residual connectivity and are more robust than the Internet

Designing robust communication networks for varying environmental conditions

In real-world applications, noise and susceptibility to attacks may not affect all nodes equally

A few sub-networks might be highly controlled or insulated from the outside and require less protection

But a few others may exist in highly variable or unknown environments and require more protection

Designing robust communication networks for varying environmental conditions (Cont’d…)

Each node in a module that is exposed to the virus becomes infected with probability γ and then spreads the virus as usual

With probability 1 - γ, it does not become infected and does not spread the virus to any neighbor

When γ is low, few nodes become infected; as γ increases, more nodes become infected

Finally, to design networks in accordance with γ, we varied the value of qmod in the generative model and found that when γ is low, clique-like networks confer the lowest residual routing distance after infection

As γ increases, sparser networks are preferred. Even in the noisiest of environments (γ = 1), the best value of qmod is 0.5, which implies that higher values of qmod result in networks that are initially too sparse to withstand the attack
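The role of γ can be illustrated with a short sketch of the modified infection process. It assumes that the seed node is always infected and that each other node makes a single infect-or-resist decision the first time the virus reaches it; the function name `spread_with_gamma` and the ring graph are hypothetical.

```python
import random

def spread_with_gamma(adj, seed, gamma, rng=None):
    """Return the set of infected nodes when exposed nodes are infected w.p. gamma."""
    rng = rng or random.Random(0)
    infected = {seed}      # the seed is infected by assumption
    resisted = set()       # nodes that were exposed but stayed healthy
    frontier = [seed]
    while frontier:
        u = frontier.pop()
        for v in adj[u]:
            if v in infected or v in resisted:
                continue   # each node decides only once (assumption)
            if rng.random() < gamma:
                infected.add(v)
                frontier.append(v)   # only infected nodes pass the virus on
            else:
                resisted.add(v)
    return infected

# Toy ring of six nodes.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

quiet = spread_with_gamma(ring, seed=0, gamma=0.0)   # insulated environment
noisy = spread_with_gamma(ring, seed=0, gamma=1.0)   # fully exposed environment
```

With γ = 0 only the seed is infected, and with γ = 1 the whole connected component is infected, so intermediate values of γ interpolate between an insulated and a fully exposed module.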


Summary

Based on the analyses and experiments conducted, this model can be used to design networks that balance robustness and efficiency according to the expected security risk

Queries…?