Using structure in protein function annotation:
predicting protein interactions
Donald Petrey, Cliff Qiangfeng Zhang, Raquel Norel, Barry Honig
Howard Hughes Medical Institute
Department of Biochemistry and Molecular Biophysics
Center for Computational Biology and Bioinformatics
Columbia University
Fold
Superfamily
Family
Classification
Discrete islands
Thioredoxin
Q8L5D4
Glutaredoxin-4
protein disulfide oxidoreductase
L-VVVDFS-A-----TWCGPCKMI-KPFFH-SLSEK
KSSLVVLY-A-----PWCSFSQAM-DESYN-DVAEK
P--ILLYM-KGSPKLPSCGFSAQA-VQALA-AC---
Iron-sulfur cluster assembly
[Figure: Cro repressor family. Labels: P22 Cro repressor, λ Cro repressor, Afe1, Xfaso 1, Pfl6. Pairwise sequence identities shown: 25%, 39%, 42%, 42%, 44%]
Continuous space
Putative active site (SCREEN)
Formyl-CoA transferase from O. formigenes
NESG Target TM1055 from T. maritima
Coenzyme-A
CoA from Formyl-CoA transferase
SAH from DNA methyltransferase
Tyrosine from tyrosyl tRNA synthetase
Thiamin diphosphate from DXP synthetase
TM1055
Structural neighbors of TM1055
• 1793 proteins
• 70 SCOP folds
• 3 CATH architectures
• 10 CATH topologies
• 48 CATH homologous superfamilies
• ~500 distinct ligands
“jelly roll” “β-propeller” “β-prism”
virus cell bacterium cell
“jelly roll” “β-propeller”
phagosome lysosome
“β-prism”
Experimental interactions (from BIND+Cellzome)
Modeled interactions: Davis FP, Braberg H, et al. (2006). Nucleic Acids Research 34(10): 2943-52
19,424 12,867
409
target sequences?
sequence similarity
structural similarity
template complex
Modeled complex
Structures from the same SCOP family (non-redundant): 8 (SCOP domain d.17.4.2)
Structures from the same SCOP superfamily (non-redundant): 23 (SCOP domain d.17.4)
Structures from the same SCOP fold (non-redundant): 44 (SCOP domain d.17)
Structural neighbors by structure alignment: 420 (PSD < 0.8; the SCOP domain ID of the green structure here is d.17.4.4)
Structure model
Overlap of the modeled interface with the predicted interface (shown in red)
good bad
Pelle
B. subtilis lethal factor
$$
LR(c_1, \ldots, c_n) = \prod_{i=1}^{n} LR(c_i) = \prod_{i=1}^{n} \frac{P(c_i \mid I_{xy})}{P(c_i \mid \sim I_{xy})}
$$
Gene co-expression profiles
RGS4 block RASD1
CKS1A interact SKP2
CD4 bind TFAP2A
GPNMB contain PPFIBP1
TACR1 require PARP1
GeneWays (literature) Structures
Figure 8. Using a Bayesian method to integrate PPI evidence from various sources. The likelihood ratio of an interaction between two proteins (x and y), $LR(c_1, \ldots, c_n)$, is inferred from different pieces of evidence ($c_i$). Here $P(c_i \mid I_{xy})$ and $P(c_i \mid \sim I_{xy})$ represent the probability that a “clue”, $c_i$, is observed for proteins x and y that are known to interact or not (represented as $I_{xy}$ and $\sim I_{xy}$).
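The evidence integration described in Figure 8 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the clues are conditionally independent, so that the per-clue likelihood ratios simply multiply. The function names and all probability values are hypothetical. The `posterior_odds` helper applies the standard Bayes rule in odds form (posterior odds = prior odds × likelihood ratio), which the slide itself does not spell out.

```python
import math


def likelihood_ratio(p_clue_given_interact, p_clue_given_not):
    """LR(c_i) = P(c_i | I_xy) / P(c_i | ~I_xy) for a single clue."""
    if p_clue_given_not <= 0:
        raise ValueError("P(c_i | ~I_xy) must be positive")
    return p_clue_given_interact / p_clue_given_not


def combined_likelihood_ratio(clues):
    """Multiply per-clue ratios; assumes clues are conditionally independent.

    `clues` is an iterable of (P(c_i | I_xy), P(c_i | ~I_xy)) pairs.
    """
    return math.prod(likelihood_ratio(p, q) for p, q in clues)


def posterior_odds(prior_odds, clues):
    """Bayes rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * combined_likelihood_ratio(clues)


# Hypothetical values for three evidence sources
# (e.g. structure, co-expression, literature); not taken from the talk.
example_clues = [(0.30, 0.03), (0.20, 0.10), (0.50, 0.25)]
combined = combined_likelihood_ratio(example_clues)
print(f"combined LR = {combined:.1f}")
print(f"posterior odds = {posterior_odds(1 / 1000, example_clues):.3f}")
```

With these made-up numbers the per-clue ratios are 10, 2 and 2. Each weak clue alone says little, but their product raises the odds of an interaction by a factor of about 40.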
Conclusions
• Structural information needs to be leveraged
• Interactively combining overall function annotation with analysis that depends on local bioinformatic/biophysical features.
• Infrastructure applies equally to analyzing subtle differences within families.
Acknowledgements
NIH grant U54-GM074958
Honig Lab
Markus Fischer
Cliff Zhang
Kely Norel