Topological Properties of Robust Biological & Computational Networks
Research Paper By…
Saket Navlakha
Xin He
Christos Faloutsos
Ziv Bar-Joseph
Overview
Focus:
Network robustness is an important principle in biology and engineering.
Topological Properties:
Redundancy & Sparseness are important properties used by robust networks.
Module topology is tightly linked to the level of environmental variability (noise) the module expects to encounter.
Modules internal to the cell that are less exposed to environmental noise are more connected and less robust than external modules.
Networks should be evaluated and designed at the module level, so that each module is optimized for high robustness in the noisy or malicious environment it is expected to face.
Joint analysis of biological and computational networks leads to novel algorithms and insights benefiting both fields.
Introduction
Robustness??
In computer science, robustness is the ability of a computer system to cope with errors during execution.
In biology, robustness, the ability of a cell to maintain performance in the face of perturbations and uncertainty, is a long-recognized key property of living systems.
Robustness to failures, environmental and signaling noise, and attacks is a key requirement.
Protein interaction networks, in particular, are robust to most single and double node failures, expression and environmental noise, and viral and bacterial attacks.
How does connectivity affect the robustness of molecular interaction networks?
Structural redundancy
Advantage:
Redundant pathways / Parallel pathways
Dense subnetworks
Compensate for loss-of-function mutations
Disadvantage:
High connectivity leads to functional coupling of different components, making a network more susceptible to cascading local failures or attacks.
E.g., a clique
Sparsely connected networks
Bottlenecks
Weakly linked modules
Network motifs
Important Considerations: High-degree nodes in the global protein interaction network in yeast were not likely to be robust.
In other words, these gene nodes tended to be fragile / essential (deletion of these genes in genetic knock-out experiments results in cell death)
These hubs contributed to overall higher connectivity of the network
They are also bottlenecks whose removal disconnects many other protein interactions
Higher connectivity promotes robustness to perturbation under stabilizing selection
What to do?
Adding a new interaction may also introduce new degrees of freedom for perturbations
When such costs are taken into consideration in a cost-benefit analysis, sparser networks have been shown to promote more robustness
Environmental noise is generally considered equal for all nodes, but in fact it affects some proteins more than others (e.g., external versus internal)
Observations in global molecular networks:
Processing in general occurs within local subnetworks or modules
Module-level properties may therefore be important in determining network robustness to mutations
Cellular localization was found to be a useful feature when predicting gene essentiality in yeast, and nuclear proteins in particular were shown to be enriched for essential genes
Studies beyond yeast have also shown that gene essentiality is tied to specific sub-networks.
However, the exact relationship between the level of robustness necessary for a biological process to operate and the topological properties that give rise to this robustness has so far not been determined.
Extended study based on these observations:
Modules that are subject to large environmental influences are called external modules
Modules that are relatively insulated from the external noise sources are called internal modules
Robust genes and modules in Saccharomyces cerevisiae
Saccharomyces cerevisiae is a species of yeast, perhaps the most useful yeast, having been instrumental to winemaking, baking, and brewing
Robustness under single gene deletion was analyzed by integrating protein-protein and protein-DNA interactions
Out of 5976 proteins, 19.4% (i.e., 1122) were determined to be essential in normal growth conditions
Further analysis:
Gene Essentiality
To determine whether gene robustness is better predicted using global or module-level topology
The global network is decomposed into 50 modules corresponding to biological processes that are required for cell survival and growth.
The correlation between gene essentiality and gene centrality increased by roughly 50% when using network features of a gene derived from its local module topology compared to global network topology
Robust genes and modules in Saccharomyces cerevisiae
Further analysis (cont’d…)
Infect size:
Previous studies have shown that the more connected a node is in the network, the more fragile it is
This implies that such nodes are important to immunize in order to increase network robustness
In fact, the cell needs to respond to noise
Based on the susceptible-infectious (SI) model, the larger the infect size of a node, the more essential it is
Module-level infect size is also more predictive of essentiality than infect size computed on the global topology
Highly essential modules were swamped by noise more quickly, and were denser with a higher eigenvalue, than robust modules.
Higher eigenvalue implies higher likelihood of an outbreak
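The infect-size idea can be illustrated with a small simulation. This is only a sketch under stated assumptions: infect size is taken here to mean the average number of infected nodes after a fixed number of SI rounds started from one node, and the transmission probability `beta`, the round count `steps`, and the toy clique and path graphs are illustrative choices that do not come from the slides.

```python
import random

def infect_size(adj, seed_node, beta=0.3, steps=3, trials=2000, rng=None):
    """Average number of infected nodes after `steps` rounds of SI spreading.

    Assumption: in each round, every infected node infects each susceptible
    neighbor independently with probability `beta`; nodes never recover.
    """
    rng = rng or random.Random(0)
    total = 0
    for _ in range(trials):
        infected = {seed_node}
        for _ in range(steps):
            newly = set()
            for u in infected:
                for v in adj[u]:
                    if v not in infected and rng.random() < beta:
                        newly.add(v)
            infected |= newly
        total += len(infected)
    return total / trials

def clique(n):
    """Dense toy module: every node is linked to every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def path(n):
    """Sparse toy module: nodes linked in a chain."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

rng = random.Random(42)
dense = infect_size(clique(8), 0, rng=rng)
sparse = infect_size(path(8), 0, rng=rng)
print(f"clique infect size: {dense:.2f}, path infect size: {sparse:.2f}")
```

In this toy setting the dense module is swamped much faster than the sparse one, which is the qualitative behaviour the slide describes.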
Results
Predicting node and module robustness in biological networks. The yeast interaction network was decomposed into 50 gene ontology modules. Gene essentiality was predicted using topological features of the gene computed within its local module.
The Kendall rank correlation coefficient between each topological feature and essentiality is calculated.
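For readers unfamiliar with the Kendall rank correlation, a minimal pure-Python sketch follows. It uses the tau-b variant because a binary essential / non-essential label produces many ties; whether the paper used tau-a or tau-b is not stated in the slides, so that choice is an assumption. The degree and essentiality values are made-up illustrative numbers, not data from the study.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation between two equal-length sequences."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied in both variables: counted nowhere
            elif dx == 0:
                ties_x += 1         # tied only in x
            elif dy == 0:
                ties_y += 1         # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x) *
                 (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

# Hypothetical example: within-module degree versus a 0/1 essentiality label.
module_degree = [1, 2, 3, 4, 5]
essential = [0, 0, 0, 1, 1]
tau = kendall_tau_b(module_degree, essential)
print(f"tau-b = {tau:.3f}")
```

The same function could be applied once with features computed on the global network and once with features computed inside each module, to compare the two correlations.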
Topological differences between internal and external processes
Internal modules are less susceptible to noise and promote higher connectivity
External modules are more exposed to environmental noise and promote sparser connectivity
Largest eigenvalue of a module's adjacency matrix (λ)
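A short sketch of how λ could be computed for a module, assuming λ denotes the largest adjacency eigenvalue (the quantity that governs outbreak likelihood in epidemic models). Power iteration is used here only as a dependency-free illustration; the toy graphs and function names are hypothetical.

```python
def largest_eigenvalue(adj, iterations=200):
    """Estimate the largest adjacency eigenvalue of a connected undirected graph.

    Power iteration is run on (A + I) instead of A; the shift keeps the
    iteration from oscillating on bipartite graphs such as paths.
    """
    nodes = list(adj)
    vec = {u: 1.0 for u in nodes}
    estimate = 0.0
    for _ in range(iterations):
        nxt = {u: vec[u] + sum(vec[v] for v in adj[u]) for u in nodes}
        norm = max(abs(val) for val in nxt.values())
        vec = {u: val / norm for u, val in nxt.items()}
        estimate = norm - 1.0   # undo the +I shift
    return estimate

def clique(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def path(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

lam_clique = largest_eigenvalue(clique(6))
lam_path = largest_eigenvalue(path(6))
print(f"clique: {lam_clique:.4f}, path: {lam_path:.4f}")
```

The dense, clique-like graph has a much larger λ than the sparse path on the same number of nodes, matching the internal versus external contrast above.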
Relationship between module essentiality and topology in other conditions and species
Other Conditions: Gene deletion is extensively studied in normal growth conditions (YPD)
It can also be analyzed under different conditions such as heat shock response modules in heat shock experiments
Results for yeast are almost the same under four other conditions
Essential modules are more highly connected than the robust modules
Relationship between module essentiality and topology in other conditions and species
Other Species: Protein-protein and protein-DNA interactions for the bacterium Escherichia coli are collected
Out of 2915 proteins, 21.1% (i.e., 616) were determined to be essential in normal growth conditions
The global network is decomposed into 38 modules that are relevant for prokaryotes
As in yeast, gene essentiality is better predicted using module-level topology than global topology
Similar distinction between robust (external) and fragile (internal) processes based on their function and topology.
Additional biological networks support the focus on modules
The Caenorhabditis elegans neural network is analyzed
Nodes are neurons and directed edges are chemical synapses
The network is decomposed into 8 modules
Neurons involved in input or output are labelled external neurons, and all others internal neurons
Modules with a higher percentage of internal neurons are observed to have higher eigenvalue, density and infect size than external modules
This is similar to the yeast observations
Across bacterial species
A similar study is conducted on metabolic networks for 75 bacterial species
Nodes are metabolites and edges are enzymatic reactions transforming one metabolite into another
A similar result was found: networks of species that thrive in stable environments have significantly higher eigenvalue, density and infect size than those of species that inhabit dynamic and variable environments
A Computational model for generating modules with varying topologies
A small change to a standard duplication-divergence model gives rise to a complex range of topological features
A random node u is duplicated into a topologically equivalent node u', which is connected to all neighbors of u
For each common neighbor x of u and u', remove either (u,x) or (u',x).
The two duplicates are connected to each other with probability qcon
To adjust the duplication model to account for varying module topologies, the parameter governing common-neighbor retainment (qmod) is analyzed
For each module, the Jaccard coefficient is calculated between each gene in the module and its paralogues.
It is observed that the higher the average Jaccard coefficient, the more likely the module is to be essential.
Varying qmod still adheres to the original duplication model
Lower values of qmod generate dense, clique-like networks characteristic of internal modules
Higher values of qmod generate sparser networks.
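The generative steps above can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes qmod is the probability that a common neighbor loses one of its two edges (consistent with low qmod giving dense networks), that the edge to drop is picked uniformly, and that growth starts from a single edge. The function and parameter names are hypothetical.

```python
import random

def duplication_model(n_nodes, q_mod, q_con, seed=0):
    """Grow an undirected graph by repeated node duplication and divergence."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}           # assumed seed graph: one edge
    while len(adj) < n_nodes:
        u = rng.choice(sorted(adj))  # random node to duplicate
        dup = len(adj)               # label of the new duplicate u'
        adj[dup] = set(adj[u])       # u' copies all neighbors of u
        for x in adj[u]:
            adj[x].add(dup)
        # Divergence: with probability q_mod, a common neighbor x keeps
        # only one of the edges (u, x) and (u', x).
        for x in sorted(adj[u] & adj[dup]):
            if rng.random() < q_mod:
                loser = u if rng.random() < 0.5 else dup
                adj[loser].discard(x)
                adj[x].discard(loser)
        # The two duplicates are linked with probability q_con.
        if rng.random() < q_con:
            adj[u].add(dup)
            adj[dup].add(u)
    return adj

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return edges / (n * (n - 1) / 2)

dense_net = duplication_model(30, q_mod=0.0, q_con=1.0)
sparse_net = duplication_model(30, q_mod=1.0, q_con=0.0)
for q in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(q, round(density(duplication_model(30, q, 0.5, seed=1)), 3))
```

Under these assumptions, qmod = 0 with qcon = 1 grows a full clique, while qmod = 1 with qcon = 0 never adds a net edge, so the two extremes bracket the dense and sparse regimes described above.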
Biological insights for the analysis of secure communication networks
Failures and attacks are common in biological networks / communication networks
E.g., the Internet is regularly targeted by worms that infect machines, and transport networks such as the power grid have experienced widespread failures
Attacked systems are detected and isolated from the network for maintenance
If the network is densely connected, residual connectivity is low because the virus spreads to many nodes
If it is sparsely connected, isolating infected nodes disconnects many pairs of nodes, resulting in equally low residual connectivity
The same logic was applied to yeast, where highly essential modules had the lowest residual connectivity after infection
This agrees with the observation that internal modules have topologies that promote efficiency more than robustness
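A minimal sketch of the residual-connectivity measurement follows. The definition used is an assumption made for illustration: infected nodes are removed, and residual connectivity is the fraction of all original node pairs that can still reach each other through surviving nodes. The SI parameters and toy graphs are likewise hypothetical.

```python
import random

def spread(adj, seed_node, beta, steps, rng):
    """Run SI spreading and return the set of infected nodes."""
    infected = {seed_node}
    for _ in range(steps):
        newly = {v for u in infected for v in adj[u]
                 if v not in infected and rng.random() < beta}
        infected |= newly
    return infected

def connected_pairs(adj, alive):
    """Count node pairs joined by a path that uses only surviving nodes."""
    seen, pairs = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        pairs += size * (size - 1) // 2
    return pairs

def residual_connectivity(adj, beta=0.3, steps=3, trials=500, seed=0):
    """Average fraction of original pairs still connected after isolation."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    all_pairs = len(nodes) * (len(nodes) - 1) // 2
    total = 0.0
    for _ in range(trials):
        infected = spread(adj, rng.choice(nodes), beta, steps, rng)
        alive = set(nodes) - infected
        total += connected_pairs(adj, alive) / all_pairs
    return total / trials

def clique(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def path(n):
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

for name, graph in (("clique", clique(10)), ("path", path(10))):
    print(name, round(residual_connectivity(graph), 3))
```

Both failure modes show up in the two extremes: the clique loses connectivity because the infection reaches most nodes, and the path loses it because removing even a few nodes splits the chain.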
Biological insights for the analysis of secure communication networks (cont’d…)
The benchmark is tested on two communication networks
Gnutella peer-to-peer (P2P) file sharing network
A sequence of five snapshots of a graph of routers (representing the Internet)
Observed that P2P networks provide a robust storage mechanism and constantly deal with addition and removal of nodes and edges
P2P networks have higher residual connectivity and are more robust than the Internet
Designing robust communication networks for varying environmental conditions
In real-world applications, noise and susceptibility to attacks may not affect all nodes equally
Some sub-networks may be highly controlled or insulated from the outside and require less protection
Others may exist in highly variable or unknown environments and require more protection
Designing robust communication networks for varying environmental conditions (Cont’d…)
For each node in a module: with probability γ, it becomes infected and spreads the virus as usual
With probability 1 − γ, it does not become infected and does not spread the virus to any neighbor
When γ is low, few nodes become infected; as γ increases, more nodes become infected
Finally, to design networks in accordance with γ, we varied the value of qmod in the generative model and found that when γ is low, clique-like networks confer the lowest residual routing distance after infection
As γ increases, sparser networks are preferred. Even in the noisiest environments (γ = 1), the best value of qmod is 0.5, which implies that higher values of qmod result in networks that are initially too sparse to withstand the attack
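The role of γ can be illustrated with a small sketch. It covers only the γ-gated infection step, not the residual routing distance or the qmod sweep. It assumes each node is independently marked susceptible with probability γ, that an attack landing on a non-susceptible node dies out, and that susceptible nodes spread with a hypothetical per-edge probability `beta`.

```python
import random

def gated_infection(adj, gamma, beta=0.5, steps=4, trials=1000, seed=0):
    """Average number of infected nodes when only a fraction gamma is exposed."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    total = 0
    for _ in range(trials):
        # With probability gamma a node can be infected and spread as usual;
        # with probability 1 - gamma it neither catches nor passes the virus.
        susceptible = {u for u in nodes if rng.random() < gamma}
        start = rng.choice(nodes)
        infected = {start} if start in susceptible else set()
        for _ in range(steps):
            newly = {v for u in infected for v in adj[u]
                     if v in susceptible and v not in infected
                     and rng.random() < beta}
            infected |= newly
        total += len(infected)
    return total / trials

def clique(n):
    return {i: [j for j in range(n) if j != i] for i in range(n)}

module = clique(8)
for gamma in (0.0, 0.2, 0.5, 0.9, 1.0):
    print(gamma, round(gated_infection(module, gamma), 2))
```

In this toy run the number of infected nodes grows with γ, which is why a dense topology that is acceptable in an insulated module becomes a liability in an exposed one.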
Summary
Based on the analyses and experiments conducted, this model can be used to design networks that balance robustness and efficiency according to the expected security risk