ebisep2010 oboyle docking
TRANSCRIPT
-
8/18/2019 EBISep2010 OBoyle Docking
1/39
Protein-Ligand Docking
Dr. Noel O’Boyle
University College Cork
Sep 2010
EMBO Practical Course
Computational aspects of protein structure determination and analysis: from data to structure to function
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
Computer-aided drug design (CADD)
• Known protein structure, known ligand(s): Structure-based drug design (SBDD)
  - Protein-ligand docking
• Unknown protein structure, known ligand(s): Ligand-based drug design (LBDD)
  - 1 or more ligands: Similarity searching
  - Several ligands: Pharmacophore searching
  - Many ligands (20+): Quantitative Structure-Activity Relationships (QSAR)
• Known protein structure, no known ligand: De novo design
• Unknown protein structure, no known ligand: CADD of no use
  - Need experimental data of some sort
-
Pose vs. binding site
• Binding site (or “active site”): the part of the protein where the ligand binds
  - generally a cavity on the protein surface
  - can be identified by looking at the crystal structure of the protein bound with a known inhibitor
• Pose (or “binding mode”): the geometry of the ligand in the binding site
  - Geometry = location, orientation and conformation
• Protein-ligand docking is not about identifying the binding site
-
Uses of docking
• The main uses of protein-ligand docking are for
  - Virtual screening, to identify potential lead compounds from a large dataset (see next slide)
  - Pose prediction
• Pose prediction
• If we know exactly where and how a known ligand binds...
  - We can see which parts are important for binding
  - We can suggest changes to improve affinity
  - Avoid changes that will “clash” with the protein
-
Virtual screening
• Virtual screening is the computational or in silico analogue of biological screening
• The aim is to score, rank or filter a set of chemical structures using one or more computational procedures
  - Docking is just one way to do this
• It can be used
  - to help decide which compounds to screen (experimentally)
  - to help decide which libraries to synthesise
  - to help decide which compounds to purchase from an external company
  - to analyse the results of an experiment, such as an HTS run
-
Components of docking software
• Typically, protein-ligand docking software consists of two main components which work together:
• 1. Search algorithm
  - Generates a large number of poses of a molecule in the binding site
• 2. Scoring function
  - Calculates a score or binding affinity for a particular pose
• To give:
  - The pose of the molecule in the binding site
  - The binding affinity or a score representing the strength of binding
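How the two components cooperate can be sketched in a few lines. This is only a toy illustration, not how any real docking program works: the "pose" is reduced to a ligand position (no orientation or conformation), and `toy_score` and `random_search` are hypothetical names with a made-up scoring rule in which a higher score is better.

```python
import random


def toy_score(pose):
    """Hypothetical scoring function: best (zero) at the origin, lower elsewhere."""
    x, y, z = pose
    return -(x * x + y * y + z * z)


def random_search(score, n_poses=1000, box=5.0, seed=0):
    """Toy search algorithm: sample positions in a cubic box and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose, best_score = None, float("-inf")
    for _ in range(n_poses):
        pose = tuple(rng.uniform(-box, box) for _ in range(3))
        s = score(pose)  # the scoring function judges each generated pose
        if s > best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score


best_pose, best_score = random_search(toy_score)
```

The search routine only proposes poses and the scoring function only judges them, which is why docking programs can mix and match the two.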
-
Final points
• Large number of docking programs available
  - AutoDock, DOCK, eHiTS, FlexX, FRED, Glide, GOLD, LigandFit, QXP, Surflex-Dock... among others
  - Different scoring functions, different search algorithms, different approaches
  - See Section 12.5 in DC Young, Computational Drug Design (Wiley 2009) for a good overview of different packages
• Note: protein-ligand docking is not to be confused with the field of protein-protein docking (“protein docking”)
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
Preparing the protein structure
• PDB structures often contain water molecules
  - In general, all water molecules are removed except where it is known that they play an important role in coordinating to the ligand
• PDB structures are missing all hydrogen atoms
  - Many docking programs require the protein to have explicit hydrogens. In general these can be added unambiguously, except in the case of acidic/basic side chains
• An incorrect assignment of protonation states in the active site will give poor results
• Glutamate, Aspartate have COO- or COOH
  - OH is hydrogen bond donor, O- is not
• Histidine is a base and its neutral form has two tautomers
-
Preparing the protein structure
• For particular protein side chains, the PDB structure can be incorrect
• Crystallography gives electron density, not molecular structure
  - In poorly resolved crystal structures of proteins, isoelectronic groups can make it difficult to deduce the correct structure
• Affects asparagine, glutamine, histidine
• Important? Affects hydrogen bonding pattern
• May need to flip amide or imidazole
  - How to decide? Look at hydrogen bonding pattern in crystal structures containing ligands
-
Ligand Preparation
• A reasonable 3D structure is required as starting point
  - During docking, the bond lengths and angles in ligands are held fixed; only the torsion angles are changed
• The protonation state and tautomeric form of a particular ligand could influence its hydrogen bonding ability
  - Either protonate as expected for physiological pH and use a single tautomer
  - Or generate and dock all possible protonation states and tautomers, and retain the one with the highest score
[Figure: enol and ketone tautomers]
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
The search space
• The difficulty with protein-ligand docking is in part due to the fact that it involves many degrees of freedom
  - The translation and rotation of one molecule relative to another involves six degrees of freedom
  - There are in addition the conformational degrees of freedom of both the ligand and the protein
  - The solvent may also play a significant role in determining the protein-ligand geometry (often ignored though)
• The search algorithm generates poses, orientations of particular conformations of the molecule in the binding site
  - Tries to cover the search space, if not exhaustively, then as extensively as possible
  - There is a tradeoff between time and search space coverage
-
Ligand conformations
• Conformations are different three-dimensional structures of molecules that result from rotation about single bonds
  - That is, they have the same bond lengths and angles but different torsion angles
• For a molecule with N rotatable bonds, if each torsion angle is rotated in increments of θ degrees, the number of conformations is (360/θ)^N
  - If the torsion angles are incremented in steps of 30°, this means that a molecule with 5 rotatable bonds will have 12^5 ≈ 250K conformations
• Having too many rotatable bonds results in “combinatorial explosion”
• Also ring conformations
[Figure: Taxol]
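The combinatorial explosion is easy to check numerically. The short sketch below just evaluates the (360/θ)^N formula; the function name `n_conformations` is made up for illustration, and it assumes the increment divides 360 exactly.

```python
def n_conformations(n_rotatable_bonds, increment_degrees):
    """Number of torsion-angle combinations, (360 / increment) ** N."""
    steps_per_bond = 360 // increment_degrees  # assumes the increment divides 360
    return steps_per_bond ** n_rotatable_bonds


five_bonds = n_conformations(5, 30)   # 12 ** 5
ten_bonds = n_conformations(10, 30)   # 12 ** 10
print(five_bonds, ten_bonds)
```

Doubling the number of rotatable bonds from 5 to 10 squares the count, which is why systematic enumeration quickly becomes impractical.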
-
Search Algorithms
• We can classify the various search algorithms according to the degrees of freedom that they consider
• Rigid docking or flexible docking
  - With respect to the ligand structure
• Rigid docking
• The ligand is treated as a rigid structure during the docking
  - Only the translational and rotational degrees of freedom are considered
• To deal with the problem of ligand conformations, a large number of conformations of each ligand are generated in advance and each is docked separately
• Examples: FRED (Fast Rigid Exhaustive Docking) from OpenEye, and one of the earliest docking programs, DOCK
-
The DOCK algorithm: Rigid docking
• The DOCK algorithm developed by Kuntz and co-workers is generally considered one of the major advances in protein-ligand docking (Kuntz et al., JMB, 1982, 161, 269)
• The earliest version of the DOCK algorithm only considered rigid body docking and was designed to identify molecules with a high degree of shape complementarity to the protein binding site.
• The first stage of the DOCK method involves the construction of a “negative image” of the binding site consisting of a series of overlapping spheres of varying radii, derived from the molecular surface of the protein
AR Leach, VJ Gillet, An Introduction to Chemoinformatics
-
The DOCK algorithm: Rigid docking
AR Leach, VJ Gillet, An Introduction to Chemoinformatics
• Ligand atoms are then matched to the sphere centres so that the distances between the atoms equal the distances between the corresponding sphere centres, within some tolerance.
• The ligand conformation is then oriented into the binding site. After checking to ensure that there are no unacceptable steric interactions, it is then scored.
• New orientations are produced by generating new sets of matching ligand atoms and sphere centres. The procedure continues until all possible matches have been considered.
-
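The atom-to-sphere matching step of the DOCK algorithm described above can be illustrated with a brute-force sketch. This is not DOCK's actual implementation, which uses a far more efficient matching procedure; the function name `find_matches`, the default of three matched pairs and the tolerance value are all assumptions made for the example.

```python
from itertools import combinations, permutations
from math import dist


def find_matches(ligand_atoms, sphere_centres, n_match=3, tolerance=0.5):
    """Enumerate atom/sphere assignments whose internal distances agree within tolerance."""
    matches = []
    for atoms in combinations(range(len(ligand_atoms)), n_match):
        for spheres in permutations(range(len(sphere_centres)), n_match):
            consistent = all(
                abs(
                    dist(ligand_atoms[atoms[i]], ligand_atoms[atoms[j]])
                    - dist(sphere_centres[spheres[i]], sphere_centres[spheres[j]])
                ) <= tolerance
                for i, j in combinations(range(n_match), 2)
            )
            if consistent:
                matches.append(list(zip(atoms, spheres)))
    return matches


# Ligand atoms forming a 3-4-5 triangle, and sphere centres with the same shape elsewhere
ligand = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
spheres = [(10.0, 10.0, 10.0), (10.0, 13.0, 10.0), (14.0, 10.0, 10.0)]
matches = find_matches(ligand, spheres, tolerance=0.1)
```

Each match is a list of (atom index, sphere index) pairs; in a real program each match would then be turned into an orientation of the ligand, checked for steric clashes and scored.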
Flexible docking
• Flexible docking is the most common form of docking today
  - Conformations of each molecule are generated on-the-fly by the search algorithm during the docking process
  - The algorithm can avoid considering conformations that do not fit
• Exhaustive (systematic) searching is computationally too expensive as the search space is very large
• One common approach is to use stochastic search methods
  - These don't ...
-
Handling protein conformations
• Most docking software treats the protein as rigid
  - Rigid Receptor Approximation
• This approximation may be invalid for a particular protein-ligand complex as...
  - the protein may deform slightly to accommodate different ligands (ligand-induced fit)
  - protein side chains in the active site may adopt different conformations
• Some docking programs allow protein side-chain flexibility
  - For example, selected side chains are allowed to undergo torsional rotation around acyclic bonds
  - Increases the search space
• Larger protein movements can only be handled by separate dockings to different protein conformations
• Ensemble docking (e.g. GOLD 5.0)
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
Components of docking software
• Typically, protein-ligand docking software consists of two main components which work together:
• 1. Search algorithm
  - Generates a large number of poses of a molecule in the binding site
• 2. Scoring function
  - Calculates a score or binding affinity for a particular pose
• To give:
  - The pose of the molecule in the binding site
  - The binding affinity or a score representing the strength of binding
-
The perfect scoring function will...
• Accurately calculate the binding affinity
  - Will allow actives to be identified in a virtual screen
  - Be able to rank actives in terms of affinity
• Score the poses of an active higher than poses of an inactive
  - Will rank actives higher than inactives in a virtual screen
• Score the correct pose of the active higher than an incorrect pose of the active
  - Will allow the correct pose of the active to be identified
• “actives” = molecules with biological activity
-
Classes of scoring function
• Broadly speaking, scoring functions can be divided into the following classes:
• Forcefield-based
  - Based on terms from molecular mechanics forcefields
  - GoldScore, DOCK, AutoDock
• Empirical
  - Parameterised against experimental binding affinities
  - ChemScore, PLP, Glide SP/XP
• Knowledge-based potentials
  - Based on statistical analysis of observed pairwise distributions
  - PMF, DrugScore, ASP
-
Empirical scoring functions
-
Böhm ... where the value is directly proportional to the number of rotatable bonds in the ligand (NROT).
• In general, scoring functions assume that the free energy of binding can be written as a linear sum of terms to reflect the various contributions to binding
• Böhm ...
-
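The "linear sum of terms" idea behind empirical scoring functions can be sketched as below. The term names and all weights are hypothetical values chosen for illustration; they are not Böhm's fitted coefficients, and the only detail taken from the slides is that one term is proportional to the number of rotatable bonds (NROT).

```python
def linear_score(terms, weights, constant=0.0):
    """Estimate binding free energy as a constant plus a weighted sum of contributions."""
    return constant + sum(weights[name] * value for name, value in terms.items())


# Hypothetical weights: negative values favour binding, the NROT term penalises it
weights = {"hbond": -1.0, "ionic": -2.0, "lipophilic_area": -0.05, "nrot": 0.5}

# Hypothetical description of one pose: counts of interactions, contact area, NROT
terms = {"hbond": 3, "ionic": 1, "lipophilic_area": 100.0, "nrot": 4}

score = linear_score(terms, weights)
```

In an empirical scoring function the weights would be obtained by fitting against experimental binding affinities rather than being set by hand.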
Böhm ...
-
Knowledge-based potentials
• Statistical potentials
• Based on a comparison between the observed number of contacts between certain atom types (e.g. sp2-hybridised oxygens in the ligand and aromatic carbons in the protein) and the number of contacts one would expect if there were no interaction between the atoms (the reference state)
• Derived from an analysis of pairs of non-bonded interactions between proteins and ligands in PDB
  - Observed distributions of geometries of ligands in crystal structures are used to deduce the potential that gave rise to the distribution
  - Hence “knowledge-based” potential
-
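One common way to turn observed and expected contact counts into a potential is the inverse Boltzmann relation, E = -kT ln(observed/expected). The slides do not say which formula a given program uses, so the sketch below is an assumption; the function name, the value of kT (roughly 0.593 kcal/mol near room temperature) and the example counts are all illustrative.

```python
import math


def statistical_potential(observed, expected, kT=0.593):
    """Energy-like score per distance bin: -kT * ln(observed / expected)."""
    potential = []
    for obs, exp in zip(observed, expected):
        if obs == 0 or exp == 0:
            potential.append(None)  # too little data to estimate this bin
        else:
            potential.append(-kT * math.log(obs / exp))
    return potential


# Made-up contact counts in four distance bins versus a flat reference state
observed = [5, 40, 10, 0]
expected = [10, 10, 10, 10]
potential = statistical_potential(observed, expected)
```

Bins seen more often than the reference state get a negative (favourable) value, bins seen less often get a positive one, and empty bins are left undefined, which reflects the problem of rarely observed interactions.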
Knowledge-based potentials
For example, creating the distributions of ligand carbonyl oxygens to protein hydroxyl groups:
(imagine the minimum at 3.0 Å)
-
• Some pairwise interactions may occur seldom in the PDB
  - Resulting distribution may be inaccurate
• Doesn't ...
-
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
-
Pose prediction accuracy
• Given a set of actives with known crystal poses, can they be docked accurately?
• Accuracy measured by RMSD (root mean squared deviation) compared to known crystal structures
  - RMSD = square root of the average of (the difference between a particular coordinate in the crystal and that coordinate in the pose)^2
  - Within 2.0 Å RMSD considered cut-off for accuracy
  - More sophisticated measures have been proposed, but are not widely adopted
• In general, the best docking software predicts the correct pose about 70% of the time
• Note: it ...
-
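The RMSD measure used for pose prediction accuracy can be computed as in the sketch below. It assumes the crystal and docked coordinates are in the same reference frame with atoms listed in the same order, averages over atoms, and ignores molecular symmetry; the names `rmsd` and `is_accurate` are made up for the example.

```python
import math


def rmsd(crystal, pose):
    """Root mean squared deviation between matched atom coordinates (same units as input)."""
    if len(crystal) != len(pose) or not crystal:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (a - b) ** 2
        for atom_c, atom_p in zip(crystal, pose)
        for a, b in zip(atom_c, atom_p)
    )
    return math.sqrt(total / len(crystal))


def is_accurate(crystal, pose, cutoff=2.0):
    """Apply the 2.0 Angstrom cut-off mentioned in the slides."""
    return rmsd(crystal, pose) <= cutoff


crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom shifted by 1 Angstrom
value = rmsd(crystal, docked)
```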
AR Leach, VJ Gillet, An Introduction to Chemoinformatics
-
Assess performance of a virtual screen
• Need a dataset of Nact known actives, and inactives
• Dock all molecules, and rank each by score
• Ideally, all actives would be at the top of the list
  - In practice, we are interested in any improvement over what is expected by chance
• Define enrichment, E, as the number of actives found (Nfound) in the top X% of scores (typically 1% or 5%), compared to how many are expected by chance
  - E = Nfound / (Nact × X/100)
  - E > 1 implies “positive enrichment”, better than random
  - E < 1 implies “negative enrichment”, worse than random
• Why use a cut-off instead of looking at the mean rank of the actives?
  - Typically, the researchers might only have the resources to experimentally test the top 1% or 5% of compounds
• More sophisticated approaches have been developed (e.g. BEDROC) but enrichment is still widely used
-
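The enrichment calculation, E = Nfound / (Nact × X/100), can be written out as a short sketch. It assumes that a higher score means a better-ranked molecule (some programs use the opposite convention), and the function name and example data are made up for illustration.

```python
def enrichment(scores, is_active, top_percent=1.0):
    """E = Nfound / (Nact * X / 100), where Nfound counts actives in the top X% by score."""
    n_act = sum(1 for active in is_active if active)
    if n_act == 0:
        raise ValueError("dataset contains no actives")
    # Assumption: higher score = better, so sort in descending order
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_percent / 100))
    n_found = sum(1 for _, active in ranked[:n_top] if active)
    expected_by_chance = n_act * top_percent / 100
    return n_found / expected_by_chance


# 100 molecules scored 100 down to 1; 10 actives, 3 of them among the 5 best scores
scores = list(range(100, 0, -1))
is_active = [i in (0, 1, 2, 50, 51, 52, 53, 54, 55, 56) for i in range(100)]
e_top5 = enrichment(scores, is_active, top_percent=5)
```

With 10 actives, 0.5 actives are expected by chance in the top 5%, so finding 3 gives an enrichment of 6.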
Final thoughts
• Protein-ligand docking is an essential tool for computational drug design
  - Widely used in pharmaceutical companies
  - Many success stories (see Kolb et al. Curr. Opin. Biotech., 2009, 20, 429)
• But it ...
-
Protein-Ligand Docking
Dr. Noel O’Boyle
University College Cork
Sep 2010
EMBO Practical Course
Computational aspects of protein structure determination and analysis: from data to structure to function
Questions?