Lessons on Protein Structure from Lattice Model

HC Lee 李弘謙

Nanjing University Nanjing, China 2002 May 22 – 25

What is a protein?

• Large molecule: chain of amino acids

• Several tens to several thousand residues

• Folds to specific shape

• Biological machines

DNA & Gene

Now we know, for higher life forms: one gene, many proteins

Transcription and translation

Gene to Protein

What do proteins do?

• Link genotype and phenotype

• Structural and functional

– Structural: blood, muscle, bone, etc.

– Functional: catalytic (enzyme), metabolic, neural, reproductive

Aberrant gene > malfunctioning protein > disease

Protein Conformation

Alpha helix

Beta sheets

HIV reverse transcriptase

Understanding protein folding

Driving Force for Protein Folding

-Most important is the interaction of residues with water: residues are hydrophobic or hydrophilic

Miyazawa-Jernigan Statistical Interaction

Li-Tang-Wingreen’s representation of MJ Matrix

one-body, two-body

Theoretical analysis [Wang & Lee, PRL 84 (2000)]

Fit to one-body (a) and two-body (b) terms

[Figure: MJ matrix vs. theory]

Compare with MJ-matrix

Correct to first order; dominated by the one-body term (hydrophobicity)

Lattice Model

-Simple way to learn something about a very complex subject

Lattice model

• Represent space (or, in field theory, space-time) by a discrete lattice.

• Represent a structure by a path on the lattice.

• A peptide is a string of residues.

• A peptide whose residues occupy a path is in a state, or has a conformation.

• Residues may interact with each other according to their relative distance, or

• In a mean-field model, each residue interacts only with lattice sites.

Putting a binary peptide on 2D lattice

Random coil and compact path

Binary representation of peptide: 0101011010010110010110010

• The most important interaction for protein folding is that of residues with water: residues are hydrophobic or hydrophilic.

• In a real protein in its native conformation, hydrophobic residues like to be buried and hydrophilic residues like to be exposed to water.

• Simplest model: divide residues into hydrophobic and hydrophilic, structure into core and surface sites.

• Both peptide and structure are binary sequences.
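
The two-letter reduction of a peptide can be sketched in a few lines of Python. The hydrophobic set below is an assumption chosen for illustration (the talk does not say which classification was used), as is the convention that 1 means hydrophobic and 0 means hydrophilic.

```python
# Illustrative two-letter (HP) encoding of an amino-acid sequence.
# ASSUMPTION: this hydrophobic set is one plausible choice for illustration,
# not necessarily the classification used in the talk.
HYDROPHOBIC = frozenset("AVLIMFWC")


def binarize(sequence: str) -> str:
    """Map one-letter residue codes to '1' (hydrophobic) or '0' (hydrophilic)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in sequence.upper())


print(binarize("MKVLAEDF"))
```

With this encoding a peptide becomes a binary string of the same kind as the structure string, so the two can be compared position by position.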

Mean-Field HP Model

Structure-path on a 2D lattice

Pay attention only to whether the path is on a core (1) or a surface (0) site

Structure has a binary representation: 001100110000110000110011000011111100 (from Li et al. PRL 79 (1997) 765-768)
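
As a rough sketch of how such structure strings arise, the Python below enumerates every compact self-avoiding path on a small square lattice and writes 1 for core (interior) sites and 0 for surface sites. The function names are hypothetical, and a 3x3 lattice is used only to keep the enumeration tiny.

```python
from typing import Iterator, List, Set, Tuple

Site = Tuple[int, int]


def compact_paths(size: int) -> Iterator[List[Site]]:
    """Yield every self-avoiding path that visits all size*size sites once."""
    total = size * size

    def extend(path: List[Site], visited: Set[Site]) -> Iterator[List[Site]]:
        if len(path) == total:
            yield list(path)  # copy, because path is mutated while backtracking
            return
        x, y = path[-1]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                yield from extend(path, visited)
                visited.remove(nxt)
                path.pop()

    for x in range(size):
        for y in range(size):
            yield from extend([(x, y)], {(x, y)})


def structure_string(path: List[Site], size: int) -> str:
    """Encode a path as '1' for core (interior) sites and '0' for surface sites."""
    def is_core(site: Site) -> bool:
        return 0 < site[0] < size - 1 and 0 < site[1] < size - 1

    return "".join("1" if is_core(site) else "0" for site in path)


paths = list(compact_paths(3))
strings = {structure_string(p, 3) for p in paths}
print(len(paths), "directed compact paths;", len(strings), "distinct structure strings")
```

The same functions apply in principle to the 6x6 lattice of the slide, where each string has 36 bits, though this unoptimized enumeration would be slow there. Each path is counted once per direction; whether a path and its reverse count as one structure is a modeling choice not settled here.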

Designability of Structures

-Very, very few structures are good for proteins

Structure space >> observed structures

Protein Designability

The LTW model

The ground state of peptide p is the structure s closest to it in n-dimensional hyperspace. All peptides in the Voronoi volume of s have s as ground state.

The Hamiltonian H = ½ (p – s)^2 is a mapping of the set of peptides P to the set of structures S that partitions P into equivalence classes labeled by s in S. The target of each class is the ground state (conformation) of the class.

Designability of a structure is the number of peptides in the class mapped to that structure
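
A minimal Python sketch of this counting, assuming binary peptides and structures of equal length: each peptide goes to the structure that minimizes H, and a peptide whose minimum is shared by several structures is assigned to none (that tie rule is an assumption of this sketch). The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable


def energy(peptide: str, structure: str) -> float:
    """H = 1/2 * sum_i (p_i - s_i)^2 for binary strings of equal length."""
    return 0.5 * sum((int(p) - int(s)) ** 2 for p, s in zip(peptide, structure))


def designability(structures: Iterable[str], length: int) -> Dict[str, int]:
    """Count, for each structure, the peptides whose unique ground state it is."""
    candidates = sorted(set(structures))
    counts = Counter({s: 0 for s in candidates})
    for bits in product("01", repeat=length):
        peptide = "".join(bits)
        energies = [energy(peptide, s) for s in candidates]
        lowest = min(energies)
        # ASSUMPTION: a peptide with a degenerate ground state counts for no structure.
        if energies.count(lowest) == 1:
            counts[candidates[energies.index(lowest)]] += 1
    return dict(counts)


print(designability(["000", "111"], 3))
```

For binary strings H is half the Hamming distance, so the ground state is simply the nearest structure string. The loop visits all 2^length peptides, which is only practical for short chains.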

Voronoi volume

In hyperspace, all peptide sequences within the Voronoi volume of a structure are closer to that structure than to any other (from Li et al. PRL (1997)).

No. of structures vs designability

Li, Tang and Wingreen, PRL (1997)

Very few structures have high designability

[Figure: number of structures vs. designability]

• Shortest possible Hamming distance between two paths is proportional to the difference in their switchback numbers (n10)

• Few paths have high n10

• Path with high n10 has large Voronoi volume, hence high designability
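
The two quantities in these bullets are easy to compute for binary structure strings. In the sketch below, n10 is taken to be the number of core-to-surface switches, that is, occurrences of "10" in the string; this reading of the symbol is an assumption, not a definition given on the slide.

```python
def switchback_number(structure: str) -> int:
    """Count core-to-surface switches, i.e. occurrences of '10' in the string."""
    return structure.count("10")


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Structure string quoted earlier in the talk (6x6 lattice, 36 sites).
example = "001100110000110000110011000011111100"
print(switchback_number(example))
print(hamming_distance("0011", "0101"))
```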

Paths with high switchback numbers have high designability

[Shih et al. & HCL, PRL 84 (2000)]

High switchback number > high designability

Distribution of Hamming distance

Log distribution vs. switchback number

Designability vs n10; (a) 6x6 (b) 21-site triangular

Foldability of Peptides

-Vast majority of peptides do not fold

Alpha helices like paths with high switchback numbers

• Conformational degeneracy disfavors peptides with long strings of identical or similar residues

• Hence proteins rarely have long strings of contiguous hydrophobic or hydrophilic residues

• Alternating short stretches of hydrophobic and hydrophilic residues yield structurally non-degenerate and robust conformations

• The 0011 switchback motif simulates an alpha helix on the surface

• Empirically, most alpha helices are on the surface

Compare with real proteins

• Compare model high-designability peptides with binarized (by hydrophobicity) protein sequences in PDB

– Represent a peptide by the frequencies of occurrence of all binary words of fixed length l = 2k

– There are 2^(2k) such words; put the frequencies on a 2^k x 2^k lattice
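
One possible way to build such a frequency lattice is sketched below in Python. The layout (first k bits of a word select the row, last k bits the column) and the use of overlapping words are assumptions, since the slide does not specify either; the function name is hypothetical.

```python
from typing import List


def word_frequency_grid(sequence: str, k: int) -> List[List[int]]:
    """Count overlapping binary words of length 2k on a 2^k x 2^k grid."""
    side = 2 ** k
    width = 2 * k
    grid = [[0] * side for _ in range(side)]
    for start in range(len(sequence) - width + 1):
        word = sequence[start:start + width]
        # ASSUMPTION: first k bits give the row index, last k bits the column index.
        row = int(word[:k], 2)
        col = int(word[k:], 2)
        grid[row][col] += 1
    return grid


print(word_frequency_grid("0101", 1))
```

Two sets of sequences can then be compared grid against grid; the overlap measure used for that comparison is not given here and is left out of the sketch.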

[Shih et al. & HCL, PRE 65 (2002)]

[Figure: overlap of binary sequences vs. oligomer length; curves PDBAlpha-HP, HP-LS, PDBAll-PDBAlpha, PDBAll-HP]

Highly foldable peptides in the HP model resemble alpha helices in real proteins

[Shih et al. PRL 84 (2000)]

In the HP model, peptides that fold into high-designability conformations correspond to peptides that fold into alpha helices in real proteins

Many models give designability, but not all are correct

• Any Hamiltonian (H) is a mapping of peptide space (P) onto conformation space (C)

• For coarse-grained C, H partitions P into equivalence classes, each class corresponding to a point in C

• Designability results from a highly skewed distribution of the SIZES of the classes

• Example. The LS (Large-Small) model: structure dominated by steric effect; small residues inside, large residues outside. Almost same math as HP model; has designability but wrong physics.

[Figure: overlap of binary sequences vs. oligomer length; curves PDBAlpha-LS, HP-LS, PDBAll-PDBAlpha, PDBAll-LS]

Highly foldable peptides in the LS model do not resemble alpha helices in real proteins

[Shih et al. PRL 84 (2000)]

Unlike hydrophobicity, the steric effect does not play a dominant role in the determination of native structure

Folding Funnel and Free-energy Barrier

-Why is folding so fast yet so slow?

Folding funnel

Folding funnel (picture)

http://www.npaci.edu/envision/v15.4/proteinfolding.html

Free energy and entropy

Free Energy, Entropy and Monte Carlo

Free-energy barrier

Free-energy barrier [Guan, Su, Shih & Lee (2000)]

(a) Binding energy increases with compactness

(b) Entropy is lost rapidly as binding energy increases

(c) Free-energy barrier is formed by competition between energy gain and entropy loss

[Figure: panels (a)-(c); axis labels Log(S), |E/Enative|, No. of contacts, and G = (E – TS)/Enative; curve labels low T, high T, annealing, barrier]

Getting over the barrier takes all the folding time
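
The competition between energy gain and entropy loss can be illustrated with a toy calculation of G = E – TS along x = |E/Enative|. The functional forms below are assumptions chosen only so that entropy falls fastest at small x; they are not the lattice results of the cited work.

```python
from typing import Callable, List, Tuple


def free_energy_profile(
    energy: Callable[[float], float],
    entropy: Callable[[float], float],
    temperature: float,
    points: int = 11,
) -> List[Tuple[float, float]]:
    """Return (x, G) pairs with G = E(x) - T * S(x) for x evenly spaced in [0, 1]."""
    xs = [i / (points - 1) for i in range(points)]
    return [(x, energy(x) - temperature * entropy(x)) for x in xs]


# ASSUMED toy forms, for illustration only.
def toy_energy(x: float) -> float:
    """Energy falls linearly as the chain approaches the native state."""
    return -x


def toy_entropy(x: float) -> float:
    """Entropy drops fastest at small x and vanishes at the native state."""
    return (1.0 - x) ** 2


profile = free_energy_profile(toy_energy, toy_entropy, temperature=1.0)
barrier_x, barrier_g = max(profile, key=lambda pair: pair[1])
print(barrier_x, barrier_g)
```

With these toy forms and T = 1, the unfolded end (x = 0) and the native end (x = 1) both have G = -1, while G peaks at -0.75 at x = 0.5, so a barrier separates the two states.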

Summary of lessons

• Average hydrophobic/hydrophilic property of residues can be understood by simple physics.

• Lattice model is useful for examining coarse-grained phenomena.

• Long folding time is caused by the need to surmount a free-energy barrier formed by rapid loss of entropy.

• Designability of structure is a direct consequence of the hydrophobic/hydrophilic dichotomy of residues.

• Very few structures are highly designable; those that are have large switchback numbers.

• Very few peptides are foldable; many of those that are alternate rapidly between hydrophobic and hydrophilic residues.

• Highly foldable peptides folded into high-designability structures form robust proteins.

• They fold easily into alpha helices and, to a lesser extent, into beta sheets; hence alpha helices are formed very early in the folding process, then beta sheets.

Summary of lessons (cont’d)

Molecular Dynamics: atomistic description of protein folding

-A one-gigaflop PC would take one million days to fold a medium-small protein

Massively Distributed Computation

• Molecular dynamics

– Atomistic-level simulation is needed to understand protein folding and function relevant to biology and drug design

• Annealing time very long

– Boltzmann probability:

one machine x 1 M days = 1 M machines x one day

• Starting a program of massively distributed computation: use a screen-saver program for simulation

• of Vijay Pande, Stanford

The End. Thank you, everyone.
