Computer Aided Drug Design Qin Xu

Post on 25-Jan-2021


  • Computer Aided Drug Design

    Qin Xu

  • Contact

    • 4-221 Life Science Building • Email: [email protected] • Tel: 34204348(O) • Website:

    http://cbb.sjtu.edu.cn/~qinxu/teaching.htm

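  • Readings

    • Molecular Simulation and Computer-Aided Drug Design (分子模拟与计算机辅助药物设计)
      – Dong-Qing Wei et al., Shanghai Jiao Tong University Press
    • Computer-Aided Drug Molecular Design (计算机辅助药物分子设计)
      – Xiaojie Xu et al., Chemical Industry Press
    • Introduction to Computer-Aided Drug Design (计算机辅助药物设计导论)
      – Deyong Ye, Chemical Industry Press
    • Others
      – Textbook of Drug Design and Discovery, Third Edition, by Povl Krogsgaard-Larsen, published by Oxford University Press
      – Computational Medicinal Chemistry for Drug Discovery, by Patrick Bultinck, et al., published by Marcel Dekker, 2003
      – Molecular Modeling for Beginners, by Alan Hinchliffe, published by John Wiley and Sons, 2002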

  • Course Outline

    • Introduction of CADD
    • Drug Targets
      – Sequence analysis
      – Protein structure prediction
      – Molecular simulation
    • Drug Design
      – Fingerprint
      – Pharmacophore
      – Combinatorial library
      – De novo Drug Design
      – QSAR
    • Molecular Docking

  • What is a drug?

    • Defined composition with a pharmacological effect
    • Approved by the Food and Drug Administration (FDA)
      – First, the drug company or sponsor performs laboratory and animal tests to discover how the drug works and whether it is likely to be safe and work well in humans. Next, a series of tests in humans is begun to determine whether the drug is safe when used to treat a disease and whether it provides a real health benefit.
    • Most drugs are small molecules, and the interactions they make with their targets determine their effects, side effects and toxicity in the human body

  • Sources of Drugs • Small Molecules

    – Natural products • fermentation broths • plant extracts • animal fluids (e.g., snake venoms)

    – Synthetic Medicinal Chemicals • Project medicinal chemistry derived • Combinatorial chemistry derived

    • Biologicals – Natural products (isolation) – Recombinant products – Chimeric or novel recombinant products

  • Drug Discovery and Drug Development

    • Drug Discovery
      – Concept, mechanism, assay, screening, hit identification, lead demonstration, lead optimization
      – In vivo proof of concept in animals and concomitant demonstration of a therapeutic index (= LD50 / ED50, the median lethal dose divided by the median effective dose)

    • Drug Development – Begins when the decision is made to put a

    molecule into phase I clinical trials
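The therapeutic index mentioned above is a simple ratio, so a worked example may help. The sketch below assumes both doses are given in the same units; the function name and the dose values are hypothetical and chosen only for illustration.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return LD50 / ED50 (median lethal dose over median effective dose).

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, not taken from any real compound.
ti = therapeutic_index(ld50=200.0, ed50=10.0)
print(ti)
```

A larger ratio means a wider margin between the effective dose and the lethal dose.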

  • Drug Discovery Processes

    Molecular Biological Hypothesis (Genomics)

    Chemical Hypothesis

    Physiological Hypothesis

    Primary Assays Biochemical Cellular Pharmacological Physiological

    Sources of Molecules Natural Products Synthetic Chemicals Combichem Biologicals

    + Initial Hit Compounds Screening

  • Drug Discovery Processes - II

    Initial Hit Compounds

    Secondary Evaluation - Mechanism Of Action - Dose Response

    Initial Synthetic Evaluation - analytics - first analogs

    Hit to Lead Chemistry - physical properties -in vitro metabolism

    First In Vivo Tests - PK, efficacy, toxicity

  • Drug Discovery Processes - III

    Lead Optimization Potency Selectivity Physical Properties PK Metabolism Oral Bioavailability Synthetic Ease Scalability

    Pharmacology

    Multiple In Vivo Models

    Chronic Dosing

    Preliminary Toxicity

    Development Candidate

  • Issues in Drug Discovery

    • Hits and Leads - Is it a “Druggable” target? • Resistance • Pharmacodynamics • Pharmacokinetics • Delivery - oral and otherwise • Metabolism • Solubility, toxicity • Patentability

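  • Issues in Drug Discovery

    • A (Absorption): the process by which a drug enters the systemic circulation from the site of administration

    • D (Distribution): the process by which an absorbed drug is transported across cell-membrane barriers into tissues, organs or body fluids

    • M (Metabolism, Biotransformation): the process by which a drug is structurally transformed in the body by enzyme systems or the intestinal flora

    • E (Excretion): the process by which a drug leaves the body as the parent compound or as metabolites

    • T (Toxicity): the toxicity of a drug to the organism

    Pharmacokinetics (PK)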

  • Drug Discovery Disciplines • Medicine • Physiology/pathology • Pharmacology • Molecular/cellular biology • Automation/robotics • Medicinal, analytical,and combinatorial

    chemistry • Structural and computational chemistry • Bioinformatics • Pharmacogenomics

  • Successful drug developments

    • HIV-1 Protease Inhibitors in the market: – Invirase (Hoffmann-La Roche, 1995) – Norvir (Abbott, 1996) – Crixivan (Merck, 1996) – Viracept (Agouron, 1997)

  • Costs of drug discovery and development

    Funnel: >10,000,000 compounds → >1,000 “hits” → 12 “leads” → 6 drug candidates → 1 drug

    10-15 years, $300 to >$800 million; stages: preclinical, clinical Phase I-III

    • The time from conception to approval of a new drug is typically 10-15 years

    • The vast majority of molecules fail along the way

    • The estimated cost to bring a successful drug to market is now $800 million (Dimasi, 2000)

  • Why do we need computers?

    • Clinical trials are the most expensive part of the pipeline; if failure can be predicted before this point, it saves time and money

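  • What can we do with computers?

    Pipeline stages shown on the figure: lead compound design and optimization; drug candidates (random screening of 1,000-2,000 compounds); preclinical studies; clinical studies; market

    Durations shown on the figure: 2~3 years, 2~3 years, 2~3 years, 2~3 years, 3~4 years

    Computational support (theoretical calculation, molecular simulation, computer-aided drug design): Database searching, Modeling, Docking, Molecular Simulation, alongside Clinical Experiments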

  • Computer aided drug design (CADD) in various stages of drug discovery

  • A Little History of Computer Aided Drug Design

    • 1960’s - review the target - drug interaction
    • 1980’s - Automation - high throughput target/drug selection
    • 1980’s - Databases (information technology) - combinatorial libraries
    • 1980’s - Fast computers - docking
    • 1990’s - Fast computers - genome assembly - genomic based target selection
    • 2000’s - Bioinformatics - pharmacogenomics
    • 2010’s - Translational medicine, Precision medicine

  • Discovery of wgx50 - an example of computer aided drug design

  • The screening (virtual drug screening)

    • DMXBA (GTS-21) is used as a template molecule
    • The homology model of the alpha7 dimer is used in the screening
    • Workflow:
      1. Similarity search and flexible alignment, to exclude molecules that do not match the template
      2. Lipinski’s rule of five and molecular volume, to exclude molecules that are too large or too polar
      3. Docking
      4. MD
      5. Analysis of hydrogen bonds, hydrophobic and hydrophilic interactions, and binding energy
      6. Drug candidate (agonist?)
    • Number of molecules remaining along the funnel, as listed on the slide: about 10000, 590, 100, 43, 21, 9, 7

    Ruo-Xu Gu et. al. Medicinal Chemistry, 2009
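The Lipinski filtering step in this screening can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the four descriptors have already been computed by some cheminformatics tool, uses the commonly quoted thresholds (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors), and tolerates one violation by default, which is a convention rather than something stated on the slide. The `Descriptors` class and the two library entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (values supplied by the caller)."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    checks = (
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    )
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Keep a molecule if it has at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


# Hypothetical library entries; the numbers are made up for illustration.
library = [
    Descriptors("hypothetical-small", 311.4, 3.0, 1, 3),
    Descriptors("hypothetical-large", 912.0, 6.5, 7, 14),
]
kept = [d.name for d in library if passes_rule_of_five(d)]
print(kept)
```

Setting `max_violations=0` gives the strictest form of the filter; the molecular-volume cutoff mentioned on the slide would be one more check of the same kind.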

  • gx-50 – The Best molecule found in TCM Database

    Maoping Tang et.al., Journal of Alzheimer’s Disease 2013

    DMXBA(GTS-21) gx-50

    N-(2-(3,4-DIMETHOXYPHENYL)ETHYL)-3-PHENYLACRYLAMIDE A molecule from Pricklyash Peel (Sichuan pepper)

    Ruo-Xu Gu et. al. Medicinal Chemistry, 2009

  • Optimization and Modification of gx-50

    [Figure: gx-50 scaffold with two MeO groups, showing the substitution sites R1, R2 and R3]

    Original R1: -CH2-CH2-
    Modified R1: (1) -CH2- ; (2) -(CH2)3- ; (3) -(CH2)4- ; (4) -CH(CH3)CH2- ; (5) -CH2CH(CH3)- ; (6) -C(CH3)=CH- ; (7) -CH=C(CH3)-

    Original: R2=O, with R3 as drawn in the figure
    Modified: (8)-(11) R2=O and (12) R2=S, each with a different R3 drawn in the figure

    Ruo-Xu Gu et. al. Medicinal Chemistry, 2009

  • Morris water maze

    Morris water maze tests on scopolamine-model mice and APP transgenic mice demonstrated that gx-50 markedly improves the memory of dementia mice

  • Aβ depolymerization - Atomic Force Microscope (AFM)

    Maoping Tang et.al., Journal of Alzheimer’s Disease 2013

  • Destabilization of Alzheimer’s Aβ42 Protofibrils with Wgx-50 by Molecular Dynamics Simulations

    [Figure, panels A-E: wgx-50 bound to the Aβ42 protofibril; labels include the fibril axis, strands β1 and β2, the loop, and residues L17, F19, F20, A21, D23, K28, I32, L34, M35, V36, G38, V40, A42]

    Huaimeng Fan et. al. JPCB 2015


  • Drug Targets • Proteins

    – Receptor – enzyme – ion channel

    • Nucleic acid • Therapeutic Target Database (TTD) http://bidd.nus.edu.sg/group/ttd/ttd.asp

    Biomacromolecules that can interact with drug molecule and generate physiological effect

  • Therapeutic Target Database (TTD)


  • Drugs and targets

    • One drug on one target
    • Multi-target agents
      – One drug on multiple targets
      – Side effects
      – Serendipity in Drug Discovery
    • Drug combinations
      – Complex biological metabolism
      – Interference between drugs
      – Cocktail to overcome drug resistance
    • Natural product derived drugs, traditional medicine

  • Serendipity in Drug Discovery

    • Unexpected effect of a drug designed for other purposes
      – Tamoxifen (birth control and breast cancer)
      – Viagra (hypertension and erectile dysfunction)
      – Salvarsan (sleeping sickness and congenital syphilis)
      – Interferon-α (hairy cell leukemia and hepatitis C)

  • Drug repositioning

    • Less tests on ADMET • Bioinformatics • Big data analysis

    Known drugs or compounds

    New treatment strategy

    • Buprenorphine • Requip • Thalidomide • Colesevelam • Plerixafor

  • Drug Combinations

    • Synergistic • Additive • Antagonistic • Potentiative • Reductive

    • due to anti-counteractive actions • due to complementary actions • due to facilitating actions

    Pharmacodynamic interference between drugs

  • Cocktail therapy of AIDS

    [Figure: HIV-1 virus; label: Integrase inhibitors]

    Nature Medicine 5, 740 - 742 (1999)

  • Cocktail therapy of AIDS

  • Computer aided Drug Target Identification

    • Genomics/Proteomics – Gene expression analysis – Biochips, Microfluidics, HPLC-MS – Bioinformatics, Big data analysis

    • Target Protein Structural Prediction – Protein structure modeling – Molecular simulation

  • Sequence Alignment

    • Dynamic programming sequence alignment
      – Needleman/Wunsch global alignment
      – Smith/Waterman local alignment
      – Linear and affine gap penalties
    • Improved algorithms
    • Heuristic algorithms
      – FASTA
      – BLAST
      – CLUSTAL
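As an illustration of dynamic programming alignment, here is a minimal sketch of Needleman/Wunsch global alignment scoring with a linear gap penalty. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions for the example; real protein alignments would use a substitution matrix such as BLOSUM, and often affine gaps.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column/row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell is the best of: (mis)match, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "ACGT"))
print(needleman_wunsch_score("ACGT", "AGT"))
```

Smith/Waterman local alignment uses the same recurrence but clamps every cell at zero and reports the maximum cell instead of the bottom-right one.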

  • Protein structure prediction • Sequence

    – From RNA sequence – Protein sequencing

    • Structure

    – Secondary – Tertiary

    • Function

    – Activity, specificity – Binding

  • Protein structure prediction

    • Secondary structure prediction
      – Distribution of amino acids
      – Featured sequence, featured domain
    • Tertiary structure prediction
      – The 3D structure of target proteins
      – Structure of active site, binding site, possible conformational changes
      – Molecular docking, molecular dynamics simulations

  • Protein 3D structure prediction

    • Homology Modeling (MOE, MODELLER, SWISS-MODEL)
      – Homolog template
      – Similar sequence → similar secondary structure → similar featured domain → similar backbone → similar 3D structure with optimized side chains
    • Threading method
      – Fold recognition method
      – Distantly homologous proteins
    • Ab initio prediction (Rosetta)
      – Sequence → structure
      – Conformer search
      – Energy minimization

  • Sequence identity and Accuracy

    % structure overlap: the fraction of equivalent residues (Cα atoms within 3.5 Å of each other).
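The structure-overlap measure can be written down directly. The sketch below assumes the model and the reference have already been superimposed and that their Cα coordinates are listed in the same residue order; the coordinates used in the example are made up.

```python
import math


def structure_overlap(ca_model, ca_reference, cutoff=3.5):
    """Percentage of equivalent residues whose C-alpha atoms lie within `cutoff` Å.

    Both arguments are sequences of (x, y, z) tuples, paired by index.
    """
    if len(ca_model) != len(ca_reference):
        raise ValueError("coordinate lists must have the same length")
    if not ca_model:
        raise ValueError("coordinate lists must not be empty")
    close = sum(
        1 for p, q in zip(ca_model, ca_reference) if math.dist(p, q) <= cutoff
    )
    return 100.0 * close / len(ca_model)


# Made-up coordinates: pairwise distances are 1, 0, 10 and 5 Å.
model = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (0, 5, 0)]
reference = [(0, 0, 1), (1, 0, 0), (0, 0, 0), (0, 0, 0)]
print(structure_overlap(model, reference))
```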

  • Sequence identity and applications

  • Molecular Simulations

    • Molecular Dynamics, Langevin Dynamics – Protein structure prediction – Molecular docking – Predictive ADMET

    • Monte-Carlo – Combinatorial library – De novo drug design – Molecular docking

  • Applications of Molecular Simulations

    • Search of conformations – Drug molecules – Target proteins – Binding complex

    • Energetic determination – Structural optimization – Binding energy – Free energy simulations

  • Calculation methods of forces and Scales of simulations

    http://www.sciencedirect.com/science/article/pii/S0079642509000565

  • QM/MM MD simulations

    “for the development of multiscale models for

    complex chemical systems"

    Martin Karplus Michael Levitt Arieh Warshel

  • Common workflow in running MD simulations

    1. Model generation
    2. Energy minimization
    3. Heating the system up to the desired temperature
    4. Equilibration
    5. Production run (sampling)
    6. Analysis

  • Examples of biomedical research in computer-aided drug design

    • Much larger systems
      – 10^4~10^8 atoms
    • Much more complicated
      – target protein
      – drug ligand
      – solution, membrane environment
    • Examples
      – Nicotinic acetylcholine receptor, nAChR
      – HIV-1 integrase (IN)
      – Influenza virus proton channel M2

  • nAChR (nicotinic acetylcholine receptor )

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

    Agonists in common : cationic center + Hbond acceptor

  • Building of nAChR model

    Sequence alignment by ClustalX

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.


  • Building of nAChR model

    Homology modeling by MODELLER

    Homology model of α7 nAChR

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

  • Drug Screening STITCH Database >100 million molecules

    Positive charge + Hbond acceptor

    Lipinski 5 rules 2610

    Molecular docking by Autodock

  • Docked compounds at the α7 nAChR binding site

    Chen et. al. Journal of Molecular Graphics and Modelling 2013

    A. all; B. compound 18; C. compound 60 in the aromatic cage

  • Biological analyses

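    104 and 115 are antihypertensive drugs;
    118 and 164 can be used to treat metabolic disorders in diabetic patients and as β-adrenergic receptor inhibitors;
    6, 14, 17, 19, 24-26, 28, 33, 44, 57, 58, 62-65, 73, 92, 101, 134, 143, 146, 153 and 163 all contain isoxazole and can be used as analgesics, anti-inflammatory drugs and antitussives.

    7, 8, 15, 16, 18, 37, 42, 60, 77, 78, 80, 94, 95, 98, 105-107, 111, 114, 125, 130, 135, 136, 142, 148, 149, 158, 159, 161, 165, 167, 168, 173 and 179 can be used as anesthetics and antispasmodic drugs;

    138 and 170 can be used to treat Parkinson's disease.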

  • Interactions between JN403 and three different nAChRs

    Cation–π interactions, the VdW interactions, the long-range electrostatic interactions

    Arias et. al. Biochemistry 2010

    α7 α3β4 α4β2

  • Interactions between JN403 and three different nAChRs

    Arias et. al. Biochemistry 2010

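    JN403-α7
    sites | Val108 (hydrophobic) | Leu119 (hydrophobic) | Phe104 (hydrophobic) | H-bonds
    Iab   | 43.9%  | 65.1%  | 0     | 0.5
    Ibc   | 99.8%  | 98.6%  | 0     | 0.7
    Icd   | 100%   | 100%   | 0     | 1.1
    Ide   | 99.8%  | 0      | 97.0% | 1.6
    Iea   | 100%   | 100%   | 0     | 1.6

    JN403-α3β4
    sites | Leu117 (hydrophobic) | Leu119 (hydrophobic) | Ile109 (hydrophobic) | H-bonds
    Ibc   | 100%   | 100%   | 0.4%  | 0.2
    Ide   | 100%   | 100%   | 0     | 0.4

    JN403-α4β2
    sites | Phe117 (hydrophobic) | Leu119 (hydrophobic) | Val109 (hydrophobic) | H-bonds
    Ibc   | -      | -      | -     | -
    Ide   | 91.6%  | 0      | 65.5% | 0.6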

  • Interactions between JN403 and three different nAChRs

    Arias et. al. Biochemistry 2010

    Energy components (kcal/mol) and ΔG + TS_MM (kcal/mol)

    sites | ΔE_ele | ΔE_vdw | ΔG_nonpolar | ΔG_polar (PBSA) | ΔG_polar (GBSA) | ΔG + TS_MM (PBSA) | ΔG + TS_MM (GBSA)
    α7/α7 | −167.9±6.5 | −43.3±1.9 | −5.5±0.2 | 185.8±6.5 | 170.6±7.1 | −30.8±3.3 | −46.0±2.8
    α3/β4 | −143.1±7.5 | −38.9±2.1 | −5.5±0.1 | 161.8±7.1 | 149.1±7.5 | −25.6±4.1 | −38.3±2.9
    α4/β2 | −230.5±6.0 | −32.4±1.8 | −5.1±0.2 | 247.5±4.5 | 235.9±5.1 | −20.5±3.5 | −32.2±2.6
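The totals in the last two columns appear to be the sum of the four energy components, with the polar solvation term taken from either the PB or the GB model and no entropy term included. That reading is inferred from the numbers rather than stated on the slide; the short check below reproduces the tabulated totals to within rounding (about 0.1 kcal/mol).

```python
def binding_energy(e_ele, e_vdw, g_nonpolar, g_polar):
    """Sum the MM and solvation components (kcal/mol), entropy excluded."""
    return e_ele + e_vdw + g_nonpolar + g_polar


# Mean values copied from the table: (E_ele, E_vdw, G_nonpolar, G_polar_PB, G_polar_GB)
components = {
    "α7/α7": (-167.9, -43.3, -5.5, 185.8, 170.6),
    "α3/β4": (-143.1, -38.9, -5.5, 161.8, 149.1),
    "α4/β2": (-230.5, -32.4, -5.1, 247.5, 235.9),
}

totals = {}
for site, (e_ele, e_vdw, g_np, g_pb, g_gb) in components.items():
    pbsa = binding_energy(e_ele, e_vdw, g_np, g_pb)
    gbsa = binding_energy(e_ele, e_vdw, g_np, g_gb)
    totals[site] = (pbsa, gbsa)
    print(f"{site}: PBSA {pbsa:.1f}, GBSA {gbsa:.1f}")
```

Both solvation models rank the three receptors in the same order, with α7 binding most favorably, even though the absolute values differ.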

  • Flexibilities of the receptor extracellular domain induced by agonists, varenicline

    Cα RMSF: $\mathrm{RMSF}_i = \langle (\mathbf{r}_i - \langle \mathbf{r}_i \rangle)^2 \rangle^{1/2}$

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.
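For reference, the root-mean-square fluctuation (RMSF) of an atom is the square root of the time average of its squared deviation from its mean position. The sketch below computes it per atom from a list of frames; it assumes the frames have already been aligned to a common reference so that overall rotation and translation do not contribute, and the tiny two-frame trajectory is made up.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF.

    trajectory[t][i] is the (x, y, z) position of atom i in frame t.
    Returns a list with one RMSF value per atom, in the input length units.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][d] for frame in trajectory) / n_frames for d in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


# Made-up trajectory: atom 0 oscillates along x, atom 1 does not move.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(rmsf(frames))
```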

  • Flexibilities of the receptor extracellular domain induced by agonists, varenicline

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

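  • The correlation of motions of the receptor extracellular domain induced by agonists, varenicline

    Cα of α subunits

    $r_{ij} = C_{ij} / (C_{ii}^{1/2} C_{jj}^{1/2})$

    $C_{ij} = \langle (\mathbf{r}_i - \langle \mathbf{r}_i \rangle) \cdot (\mathbf{r}_j - \langle \mathbf{r}_j \rangle) \rangle$

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.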
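The normalized correlation r_ij = C_ij / (C_ii^(1/2) C_jj^(1/2)) can be computed from a trajectory once C_ij is defined. The sketch below takes C_ij to be the usual covariance of atomic displacements, averaged over frames, which is the standard choice for dynamic cross-correlation maps but is an assumption here because the slide's definition did not survive extraction. The trajectory is made up.

```python
import math


def mean_position(trajectory, i):
    """Average (x, y, z) of atom i over all frames."""
    n = len(trajectory)
    return tuple(sum(frame[i][d] for frame in trajectory) / n for d in range(3))


def covariance(trajectory, i, j):
    """C_ij: frame average of the dot product of the displacements of atoms i and j."""
    mi, mj = mean_position(trajectory, i), mean_position(trajectory, j)
    total = 0.0
    for frame in trajectory:
        total += sum((frame[i][d] - mi[d]) * (frame[j][d] - mj[d]) for d in range(3))
    return total / len(trajectory)


def correlation(trajectory, i, j):
    """r_ij in [-1, 1]; undefined (division by zero) if either atom never moves."""
    return covariance(trajectory, i, j) / math.sqrt(
        covariance(trajectory, i, i) * covariance(trajectory, j, j)
    )


# Made-up trajectory: atoms 0 and 1 move together, atom 2 moves the opposite way.
frames = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(correlation(frames, 0, 1), correlation(frames, 0, 2))
```

Values near +1 indicate residues moving in the same direction, values near -1 indicate anti-correlated motion.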

  • The correlation of motions of a~h

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

  • Design of allosteric modulators

    Three classes

    12 compounds

    Arias et. al. Biochemistry 2011

    [Figure: chemical structures of compounds (1)-(12), grouped into Class 1, Class 2 and Class 3, with substituents R1 and R2]

  • Possible binding sites of the positive allosteric modulators (PAM)

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

  • MD simulations of compound 2 at the three postulated PAM sites

    A. RMSD of ligand

    B. subunit interface domain (black line)

    C. transmembrane domain (red line)

    D. extracellular-transmembrane junction domain (blue line)

    Before (blue) and after (green) 10ns

    Arias et. al. Biochemistry 2011

  • Binding of compound 2 at the α7 receptor

    A. Residues of agonists site in red, PAM site in green.

    B. Compound 2 at the PAM binding site (gray) compared to (+)-epibatidine at the agonist site (red).

    Arias et. al. Biochemistry 2011

  • PAMs at the binding site

    A.B.C Compound 2 (class 1); D. Compound 1 (class 1);

    E. compound 6 (Class 2) ; F. compound 9 (Class 3)

    Ruo-Xu Gu, 2012, Structural Bioinformatics Studies of Two Ion Channel Proteins.

  • Impact of resistance mutations on inhibitor binding to HIV-1 integrase(IN)

    Qi Chen , John K. Buolamwini , Jeremy C. Smith , Aixiu Li , Qin Xu , Xiaolin Cheng* , and Dong-Qing Wei*, J. Chem. Inf. Model. 2013, 53(12):3297-3307

  • Distribution of HIV-1 infection cases

    http://www.unaids.org

  • HIV replication. Yves Pommier et al.. Nature. 2005. 4

    HIV-1 infection and replication

    IN

  • Raltegravir (RAL)

    Raltegravir (RAL) the first IN strand transfer inhibitor (INSTI) approved by the FDA

  • Binding of RAL to HIV-1 integrase

    WT (black), E92Q/N155H (red), G140S (green), Q148H (orange), G140S/Q148H (blue)

  • Binding Energies of RAL to IN-DNA Complexes

    Energy (kcal/mol) | Wild-type | E92Q/N155H | G140S | Q148H | G140S/Q148H
    ∆Eelec | -53.1 ± 8.4 | -22.9 ± 3.6 | -59.3 ± 8.1 | -23.1 ± 4.1 | -7.7 ± 4.1
    ∆EVDW | -33.4 ± 3.5 | -53.0 ± 2.2 | -31.5 ± 4.7 | -54.3 ± 2.4 | -14.7 ± 2.7
    ∆Gpolar | 75.6 ± 11.6 | 75.7 ± 8.9 | 79.1 ± 7.9 | 70.1 ± 7.4 | 17.0 ± 3.4
    ∆Gnonpolar | -5.1 ± 0.3 | -6.1 ± 0.1 | -5.0 ± 0.3 | -6.3 ± 0.2 | -2.8 ± 0.3
    ∆Gbind | -16.1 ± 6.8 | -6.3 ± 7.7 | -16.7 ± 6.3 | -13.5 ± 6.9 | -8.2 ± 2.6

  • J. Chem. Inf. Model. 2013, 53(12):3297-3307

    Conformational changes from WT Integrase

    WT G140S/Q148H

    E92Q/N155H

    WT vs G140S/Q148H

    145 to 148

    140 to 148 MgB to DNA Pi

    140s’ loop


  • Principal Component Analysis (PCA) on 140s’ loop Cα atom fluctuations

  • Influenza Viral A M2 Proton Channel

    Behrens G., Stoll M., Kamps B.S., et al., Pathogenesis and immunology. Printed by Druckhaus Sud GmbH & Co. KG, D-50968 Cologne, 2006, 19.

    Bouvier, N. M.; Palese, P., The biology of influenza viruses. Vaccine, 2008, 26, D49-D53.

    Infection to host cell

  • Comparison of four representative structures

    Pore radius of the M2 channel in the PDB

    3BKD(2.0Å,S22-L46,I33M,[1]) 3C9J(3.5Å,S22-L46,G34A,[1])

    3KQT(S22-L46,WT,[2]) 3C9J(S22-L46,WT,[3])

    2L0J(S22-G62,WT,[4]) 2H95(S22-L46,WT,[5]) 2NYJ(S22-L46,WT,[6])

    2RLF (S23-L60,WT,[7]) 2KIH (S23-L60,S31N,[8]) 2KWX(S23-L60,V27A,[9])

    [1] A. L. Stouffer, et al., Nature, 2008, 451, 596-599
    [2] S. D. Cady, et al., Nature, 2010, 463, 689-692
    [3] S. D. Cady, et al., JMB, 2009, 385, 1127-1141
    [4] M. Sharma, et al., Science, 2010, 330, 509-511
    [5] K. J. Schweighofer, et al., BJ, 2000, 78, 150-163
    [6] V. Vijayvergiya, et al., BJ, 2004, 87, 1697-1704
    [7] J. Schnell, et al., Nature, 2008, 451, 591-595
    [8] R. M. Pielak, et al., PNAS, 2009, 106, 7379-7384
    [9] R. M. Pielak, et al., BBRC, 2010, 401, 58-63

    Comparison of M2 channel structures

    Ruo-Xu Gu, et al, JACS 2011

  • Structural and energetic analysis of drug inhibition

    of the influenza A M2 proton channel

    Ruo-Xu Gu, Limin Angela Liu*, Dong-Qing Wei*, Trends in Pharmacological Sciences, 34, 571(2013).

  • Simulations of binding at P-site

    [Figure: structures of amantadine and rimantadine, each carrying an NH3+ group]

    Rotation of the rimantadine molecule during the MD simulation explained the NMR result.

    Ruo-Xu Gu, Structural Bioinformatics Studies of Two Ion Channel Proteins.

  • Block of M2 channel

    • Amantadine bound

    • Rimantadine bound

    Ruo-Xu Gu, Limin Angela Liu, Yong-Hua Wang, Qin Xu, and Dong-Qing Wei*, JPCB 2013

  • Comparison of bindings to two sites

    ---- Ligand binding is stable at both sites;
    ---- Pore radius of M2 in the different simulations;
    ---- Interactions between ligands and the M2 channel.

    Preparation of the simulation system

         | Apo-form | P-binding site | S-binding site
    WT   | PDB structure (ID: 2RLF), ligand removed | Docking of rimantadine to the P-binding site (ID: 2RLF) | PDB structure (ID: 2RLF)
    S31N | Mutate Ser31 to Asn | Mutate Ser31 to Asn | Mutate Ser31 to Asn

    MD simulations: GROMACS, 6 ns + 15 ns; GROMOS and Berger UA force fields

    [1] J. Schnell, et al, Nature, 2008, 451, 591-595

  • S-binding Site

    P-binding Site

    Path 1

    Path 2

    Path 3 Solvent

    Lipid Bilayer

    Solvent

    Free Energy changes of bindings to two sites

    Snapshots from MD simulations

    90, 60 and 103 windows for Path1, Path2 and Path3

    Umbrella sampling force constant: 4500 kJ/mol/nm² (GROMOS, OPLSaa force fields)

    WHAM for Free energy calculations

  • Free Energy changes of the 3 paths GROMOS force field OPLS AA force field

    ---- Similar results of two force fields

    ---- The same energy differences of two sites (~7kcal/mol) by two force fields

    ---- OPLS FF got lower binding free energy Ruo-Xu Gu, et al, JACS 2011

  • Free energy of bindings to the two sites

    rimantadine in the M2-lipid environment

    P: thermodynamic preferred (hard to bind, but stably bound)

    S: kinetic preferred (easy to come, easy to go)

    Ruo-Xu Gu, et al, JACS 2011


  • Feature Selection

    • Face identification • Key features. • The same applies to

    molecules.

  • Descriptor Choice

    Tree or shrub

    Broadleaf or coniferous

    Deciduous or evergreen

  • Types of descriptor for molecules

    Topology, Electrostatic, Geometry, Quantum

    [Figure: example chemical structure (a polymer repeat unit)]

  • Fingerprint

    { is-aromatic, has-ring, has-C }

    { has-ring, has-C }

    { has-C, has-N, has-O }

    { is-aromatic, has-ring, has-C, has-N, has-F }

    If the universe of features is U = { is-aromatic, has-ring, has-C, has-N, has-O, has-S, has-P, has-halogen }, there are 2^8 = 256 possible fingerprints

    The Tanimoto coefficient between fingerprints X and Y is defined to be: # features in intersect(X,Y) / # features in union(X,Y)
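The Tanimoto coefficient is easy to compute when fingerprints are represented as sets of features. The sketch below uses the four example fingerprints from this slide; returning 0.0 when both fingerprints are empty is a convention chosen here, not something the slide specifies.

```python
def tanimoto(x: set, y: set) -> float:
    """Number of shared features divided by number of features in either set."""
    union = x | y
    if not union:
        return 0.0  # convention for two empty fingerprints (an assumption)
    return len(x & y) / len(union)


# The four example fingerprints from the slide.
fp_a = {"is-aromatic", "has-ring", "has-C"}
fp_b = {"has-ring", "has-C"}
fp_c = {"has-C", "has-N", "has-O"}
fp_d = {"is-aromatic", "has-ring", "has-C", "has-N", "has-F"}

universe = {"is-aromatic", "has-ring", "has-C", "has-N",
            "has-O", "has-S", "has-P", "has-halogen"}

print(2 ** len(universe))      # number of possible fingerprints over the universe
print(tanimoto(fp_a, fp_b))    # 2 shared features out of 3
print(tanimoto(fp_a, fp_c))    # 1 shared feature out of 5
print(tanimoto(fp_a, fp_d))    # 3 shared features out of 5
```

In practice fingerprints are stored as bit vectors, and the same ratio is computed with bitwise AND and OR followed by a population count.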

  • Pharmacophore

    • IUPAC (International Union of Pure and Applied Chemistry) – An ensemble of steric and electronic

    features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response

  • Pharmacophore

    • Molecular features – For molecular recognition between – a ligand – a biological macromolecule

    • Structural analysis – Superimposed active compounds – Binding site of the receptor

  • Pharmacophore in 2D

    Geometric arrangement of functional groups of the ligand that are required for “activity”

  • A pharmacophore is a spatial arrangement of atoms or functional groups believed to be responsible for biological activity

    Pharmacophore in 3D

  • Pharmacophore of KZ7088

    7 points pharmacophores

    The ligand KZ7088 in the active site of SARS-CoV Mpro

  • A pharmacophore scheme is a collection of functions that define the meaning, appearance and methods of calculation of ligand annotation points and their attached labels. The scheme defines how each ligand in the searched database is annotated. The default scheme is PCH (Polarity-Charge-Hydrophobicity).

    Label Definition Don Hydrogen bond donors, including tautomeric donors. Acc Hydrogen bond acceptors, including tautomeric acceptors. Cat Cations, including resonance cations. Ani Anions, including resonance anions. Hyd Hydrophobic areas. Aro Aromatic centers.

  • Pharmacophore

    • Applications
      – Virtual screening
      – De novo drug design
      – 3D-QSAR
    • Software
      – Sybyl
      – Discovery Studio
      – MOE

  • Virtual combinatorial library

    • Fragment screening (fragment enumeration)
    • RECAP (fragmentation and recombination)
      – RECAP analysis
      – RECAP synthesis
    • BREED (ligand breeding)
      – Crossover by Genetic algorithm
      – Superposition

  • Terms for combinatorial library

    Functional group

    R-group

    Leaving group

    Attachment point

    Scaffold

    Reagent

    Reactive atoms

  • Fragment screening

    • Molecular scaffold – Quinazoline

  • Fragment screening

    • R-group – -CH2-(2-thienyl) (2-thienylmethyl)

  • Fragment Enumeration • Screening attachment point A0 on A1->A4 • Combinatorial molecules

  • RECAP analysis

    • Generate fragments from source molecules
    • Extended SMILES
      – SMILES: Simplified molecular-input line-entry system

  • RECAP synthesis • Reconstruct molecules by

    fragments from RECAP analysis – Atom environment – Reaction rules – Attachment points – Databases from RECAP analysis

  • BREED

    • Crossover by Genetic algorithm

  • BREED - Ligand-based design • Crossover points defined by all

    superimposed bond pairs

  • BREED - Structure-based design

    With the target protein structure provided, the aligned input structures are assumed to define the binding pocket.

    Protein/ligand refinement can be performed for each new structure.
    1) Find all residues within a specified distance of the input structures. The atoms in these residues will be treated explicitly and the remainder ignored.
    2) Side-chain flexibility can be included using a user-specified force constant. If used, small backbone-atom displacements are permitted to reduce strain energies.
    3) A multi-step minimization process is used with decreasing tether strengths and increasing scope to remove steric clashes and then minimize the protein/ligand complex.
    4) MM/GBVI (Generalized Born / Volume Integral) energies of the protein/ligand pair are calculated as a score. The RMSD of the ligand heavy atoms is calculated as a goodness-of-fit metric.

  • De novo Drug Design • Design drug molecules fitting the active site • Not really de novo, but select parts of drugs by

    database searching – Fragment-based – Descriptor-based

    • Docking parts into sites • Put parts into drugs

    – Scaffold replacement – Fragment evolution – Link & Merge

  • Fragment-based

    Fragments

    (databases)

    Target active site

    (3D structure)

    Docking Evaluation Link & Merge

  • Descriptor-based

    Fragments

    (databases)

    Target active site

    (3D structure)

    Pharmacophore Query & Docking

    Receptor surface calculation

    Pharmacophore & Link features

  • Quantitative Structure-Activity Relationships (QSAR)

    • Mathematical Models Found a quantitative relationship

    Molecular Parameters

    log P σ

    MR MLP

    Biological activity IC50 Ki MIC Permeation …

    Statistical tool

  • The key of QSAR

    Descriptor Choice

    Statistical Model

    construction

  • QSAR models

  • Biological activity parameters

    – IC50 (50% inhibiting concentration), – Ki (inhibitory constant), – MIC (minimum inhibitory concentration), – Permeation

  • 3D-QSAR Parameters

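    • Traditional CoMFA fields (probe at grid point j, sum over the atoms k of the molecule)

      – Steric fields, Lennard-Jones potential:
        $$E_j^{steric} = \sum_{k=1}^{n_{atoms}} \varepsilon_{jk} \left[ \left( \frac{r_{probe} + r_k}{r_{jk}} \right)^{12} - 2 \left( \frac{r_{probe} + r_k}{r_{jk}} \right)^{6} \right]$$

      – Electrostatic fields, Coulomb potential:
        $$E_j^{ele} = \sum_{k=1}^{n_{atoms}} \frac{q_{probe} \, q_k}{\varepsilon \, r_{jk}}$$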
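To make the two CoMFA fields concrete, the sketch below evaluates a Lennard-Jones steric energy and a Coulomb electrostatic energy felt by a probe at one grid point. It is an illustration under stated assumptions, not the CoMFA implementation: a single well depth `epsilon` is used for all atom pairs, the Coulomb conversion constant is omitted so energies are in arbitrary units, no energy cutoff is applied, and the probe radius, probe charge and atom parameters are made-up values.

```python
import math


def probe_fields(grid_point, atoms, probe_radius=1.7, probe_charge=1.0,
                 epsilon=0.1, dielectric=1.0):
    """Return (steric, electrostatic) energy of a probe at `grid_point`.

    `atoms` is a list of (position, vdw_radius, partial_charge) tuples.
    The steric term is a 12-6 Lennard-Jones function whose minimum, -epsilon,
    sits at a distance equal to the sum of the two radii.
    """
    steric = 0.0
    electrostatic = 0.0
    for position, radius, charge in atoms:
        r = math.dist(grid_point, position)
        if r == 0.0:
            raise ValueError("grid point coincides with an atom")
        ratio = (radius + probe_radius) / r
        steric += epsilon * (ratio ** 12 - 2.0 * ratio ** 6)
        electrostatic += probe_charge * charge / (dielectric * r)
    return steric, electrostatic


# One made-up atom at the origin; the probe sits 3 Å away along x.
atoms = [((0.0, 0.0, 0.0), 1.3, -1.0)]
steric, electrostatic = probe_fields((3.0, 0.0, 0.0), atoms)
print(steric, electrostatic)
```

Evaluating this function on every point of a regular grid around each aligned molecule gives the steric and electrostatic columns that enter the PLS regression.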

  • More 3D-QSAR Parameters

    • Additional fields in CoMFA – Interaction energies with functional groups (GRID software)

    – hydrophobic field (HINT software)

    – Molecular Lipophilicity Field (CLIP software)

  • 3D-Molecular Field Calculation

    Mol3

    Mol1Mol2

    PLS Bio = cte + a*S001 + b*S002 + ..... + m*S999 +n*E001 + ..... + z*E999

    Bio S001 E001 E999....S002 S999....

  • Interpretation of CoMFA results

    Compounds with low activity

    Compounds with high activity

  • 3D-QSAR contour

    Contact Electrostatic

  • Statistical models

    • Linear (quantitative model)
      – LR (linear regression)
      – MLR (multi-linear regression)
      – PCR (principal component regression)
      – PLS (partial least squares)
      – SVR (support vector regression)
    • Nonlinear (qualitative model)
      – SVM (support vector machine)
      – Bayesian statistics

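  • Evaluation of QSAR models

    Goodness of fit

    • Correlation coefficient R:
      $$R^2 = 1 - \frac{\sum (y_{calc} - y_{exp})^2}{\sum (y_{calc} - y_{mean})^2}$$

    • Root-mean-square error RMSE:
      $$RMSE = \sqrt{\frac{\sum (y_{calc} - y_{exp})^2}{n - k - 1}}$$

    • Fisher test value F:
      $$F = \frac{R^2 (n - k - 1)}{k (1 - R^2)}$$

    $y_{calc}$ is the calculated activity; $y_{exp}$ is the experimental activity; $y_{mean}$ is the mean of the calculated activities; n is the number of samples; k is the number of variables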
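These three fit statistics can be computed in a few lines. In the sketch below, RMSE uses n - k - 1 degrees of freedom and F = R²(n - k - 1) / (k(1 - R²)). For R², the denominator is taken about the mean of the experimental activities, which is the usual textbook convention; that is an assumption and may differ from how the slide defines its mean. The activity values are made up.

```python
import math


def qsar_fit_statistics(y_exp, y_calc, k):
    """Return (R^2, RMSE, F) for a model with k descriptors.

    y_exp and y_calc are equal-length sequences of experimental and
    calculated activities; requires n > k + 1 and an imperfect fit for F.
    """
    n = len(y_exp)
    if n != len(y_calc):
        raise ValueError("y_exp and y_calc must have the same length")
    if n - k - 1 <= 0:
        raise ValueError("need more samples than descriptors plus one")

    mean_exp = sum(y_exp) / n
    ss_res = sum((c - e) ** 2 for c, e in zip(y_calc, y_exp))
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / (n - k - 1))
    f_value = r2 * (n - k - 1) / (k * (1.0 - r2))
    return r2, rmse, f_value


# Made-up activities for four compounds and a one-descriptor model.
y_exp = [1.0, 2.0, 3.0, 4.0]
y_calc = [1.1, 1.9, 3.2, 3.8]
r2, rmse, f_value = qsar_fit_statistics(y_exp, y_calc, k=1)
print(round(r2, 3), round(rmse, 4), round(f_value, 1))
```

A high R² on the training set says nothing about predictive power, which is why 3D-QSAR studies also report the cross-validated q².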

  • Statistical values for 3D-QSAR

    • For the cross-validation step:
      – Cross-validated correlation coefficient, q2.
      – Optimal number of components, N.
    • For the final model:
      – Squared correlation coefficient, r2.
      – Standard error of estimate, s.
      – Residuals.
      – F values.


  • Docking

    • Prediction of ligand conformation and orientation (or posing) within a targeted binding site – accurate structural modeling – correct prediction of activity

    • Free energy of binding (∆G)

  • Goals of Docking
    1) Characterize binding site - make an image of binding site with interaction points
    2) Orient ligand into binding site - grid search - descriptor mapping - energy search
    3) Evaluate strength of the interaction: ∆G_bind = ∆G_complex – (∆G_ligand + ∆G_target)

  • Molecular Docking • Multistep procedure

    – Prescreening: ligands or binding site – Pose optimization – Binding energy and evaluations

    • Representations of the receptor – Atomic – Surface – Grid

    • Scoring schemes – Force field-based: Electrostatic, VDW – Empirical – Knowledge-based scoring functions

  • Multi-step Process
    • First, find the poses of small molecules in the active site
    • Simple scoring functions evaluate how well each compound fits, based on approximate shape and electrostatic complementarity.
    • Pre-selected conformers are often further evaluated using more complex scoring schemes – more detailed treatment of electrostatic and van der Waals interactions – inclusion of at least some solvation or entropic effects – other empirical terms

  • Molecular Representation • Three basic representations of the receptor: – atomic, surface and grid

    • Atomic representation – only used in conjunction with a potential energy function – often only during final ranking procedures

    • Surface-based docking – Typically used in protein–protein docking – Connolly’s surface

    • Grid representation (potential energy grids) – store information about the receptor’s energetic contributions on grid points so that it only needs to be read during ligand scoring. – two types of potentials: electrostatic and van der Waals

  • Standard Potential Energy • Electrostatic potential energy:

    • Van der Waals interaction – Lennard-Jones 12-6 function
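
    In their usual textbook form the two potentials can be written as below. The notation is assumed here rather than taken from the slide: q are partial charges, ε_0 the vacuum permittivity, ε_r the relative dielectric constant, r_ij the distance between ligand atom i and receptor atom j, and ε_ij, σ_ij the Lennard-Jones well depth and contact distance.

```latex
E_{\mathrm{elec}} = \sum_{i \in \mathrm{ligand}} \sum_{j \in \mathrm{receptor}}
    \frac{q_i\, q_j}{4 \pi \varepsilon_0\, \varepsilon_r\, r_{ij}}

E_{\mathrm{vdW}} = \sum_{i \in \mathrm{ligand}} \sum_{j \in \mathrm{receptor}}
    4\, \epsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12}
                           - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right]
```

    The 12-6 term is equivalent to the form ε[(r_min/r)^12 − 2(r_min/r)^6] with r_min = 2^(1/6) σ.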

  • Docking Models • System: ligand, protein, and solvent molecules.

    • Solvent molecules – normally excluded from the problem, – or implicitly modeled in the scoring functions as a way to address the solvent effect.

    • Rigid-body approximation – very popular in early docking programs – treats both the ligand and the receptor as rigid and explores only the 6 degrees of translational and rotational freedom.

    • A more common approach – modeling ligand flexibility while assuming a rigid protein receptor, therefore considering only the conformational space of the ligand.
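
    The six rigid-body degrees of freedom can be illustrated with a short sketch: three numbers for the translation and three for a rotation vector (axis times angle) applied about the ligand centroid. The function names and the rotation-vector parameterisation are assumptions for illustration; docking programs may use Euler angles or quaternions instead.

```python
import numpy as np

def rotation_matrix(rotvec):
    """Rodrigues formula: rotation by |rotvec| radians about the axis rotvec."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    kx, ky, kz = rotvec / theta
    K = np.array([[0.0, -kz, ky],
                  [kz, 0.0, -kx],
                  [-ky, kx, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def place_ligand(coords, translation, rotvec):
    """Rotate a rigid ligand about its centroid, then translate it."""
    coords = np.asarray(coords, dtype=float)
    centre = coords.mean(axis=0)
    rotated = (coords - centre) @ rotation_matrix(rotvec).T
    return rotated + centre + np.asarray(translation, dtype=float)
```

    Because only these six numbers change, all internal distances of the ligand are preserved; flexible-ligand docking adds the torsion angles of rotatable bonds to this pose vector.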

  • Docking algorithms • Systematic docking algorithms

    – conformational search methods – fragmentation methods – database methods.

    • Random or stochastic algorithms – Monte Carlo methods (MC) – Genetic Algorithm methods (GA) – Tabu Search methods

    • Simulation methods – Molecular dynamics – Local minimization

    • Flexible protein docking – MD and MC methods – rotamer libraries – protein ensemble grids – soft-receptor modeling
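
    As an example of the stochastic class, the following is a minimal Metropolis Monte Carlo search over a pose vector. It is a sketch under stated assumptions: `metropolis_search`, the step size, the temperature (in the same units as the score) and the toy scoring function are all hypothetical, and a real docking program would score the pose with one of the scoring functions discussed in these slides.

```python
import math
import random

def metropolis_search(score, start, step=0.5, temperature=1.0, n_steps=2000, seed=0):
    """Minimise score(pose) by random perturbation with Metropolis acceptance."""
    rng = random.Random(seed)
    current, current_score = list(start), score(start)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        # Random move: perturb every pose variable a little.
        trial = [x + rng.uniform(-step, step) for x in current]
        trial_score = score(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse poses with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score

# Toy score: squared distance of the ligand centre from a hypothetical pocket centre.
pocket = (1.0, -2.0, 0.5)
toy_score = lambda pose: sum((x - c) ** 2 for x, c in zip(pose, pocket))
best_pose, best_value = metropolis_search(toy_score, start=(5.0, 5.0, 5.0))
```

    Occasionally accepting a worse pose is what lets the search escape local minima; genetic algorithms and tabu search pursue the same goal with a population of poses or a memory of visited poses.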

  • Scoring Functions • Lack of a suitable scoring function, both

    in terms of speed and accuracy, is the major bottleneck in docking

    • Generally able to predict binding free energies within 7–10 kJ/mol

    • Three major classes – Force field-based – Empirical – Knowledge-based scoring functions
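
    To put the 7–10 kJ/mol figure in perspective, the standard relation ∆G = RT ln K_d translates an error in binding free energy into a multiplicative error in the predicted dissociation constant. The short calculation below assumes a temperature of 298.15 K; the function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol K)

def affinity_fold_error(ddg_kj_per_mol, temperature=298.15):
    """Factor by which K_d is off for a given error in the binding free energy."""
    return math.exp(ddg_kj_per_mol / (R_KJ * temperature))

low = affinity_fold_error(7.0)    # roughly 17-fold
high = affinity_fold_error(10.0)  # roughly 56-fold
```

    An uncertainty of 7–10 kJ/mol therefore corresponds to more than an order of magnitude in affinity, which is why scoring is described as the bottleneck.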

  • Capabilities and Limitations • Predict known protein-bound poses with average accuracies of about 1.5–2 Å, with success rates in the range of 70–80%.

    • Significant improvement beyond this range seems for now unachievable, even with the inclusion of receptor flexibility.

    • Source of problems – Scoring function is the major limiting factor. – solvent effect and the direct participation of water molecules

    in protein–ligand interactions, – Limited resolutions of most crystallographic targets – Protein flexibility

    • Recent trends – Inclusion of solvation and rotational entropy contributions – development of search algorithms more able to describe and

    efficiently sample the conformational space of the protein–ligand system, within a flexible-target paradigm
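
    Pose accuracy of this kind is commonly reported as the root-mean-square deviation (RMSD) between the docked and the crystallographic ligand coordinates, computed in the receptor frame without re-superposition. The sketch below assumes that convention, matched atom ordering in both arrays, and a 2 Å success threshold; the function names are hypothetical.

```python
import numpy as np

def pose_rmsd(predicted, reference):
    """RMSD in Angstrom between two (n_atoms, 3) coordinate arrays, same atom order."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    sq_dist = np.sum((predicted - reference) ** 2, axis=1)  # per-atom squared deviation
    return float(np.sqrt(np.mean(sq_dist)))

def is_successful_pose(predicted, reference, threshold=2.0):
    """Count a docking run as a success if the pose is within the threshold."""
    return pose_rmsd(predicted, reference) <= threshold
```

    The success rate is then the fraction of test complexes for which the top-ranked pose passes this check.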

  • Software

  • Inhibition of the Enzymatic Activity of CYP1A2 by the F186L Mutation

    [Figure panels: WT and F186L; the initial state of MD; ensemble docking]

    Ma et al., Interdiscip Sci Comput Life Sci 2014

  • A key distance related to the reactivity (O-deethylation)

    7-ethoxyresorufin

    C13

    Ma et al., Interdiscip Sci Comput Life Sci 2014

  • Binding sites and catalytic pocket • The catalytic pocket

    – Residues I117, T118, T124, F125, F256, F260, N312, D313, G316, A317, D320, L497.

    – SASA (Solvent accessible surface area)

    – Shape & position

    Ma et al., Interdiscip Sci Comput Life Sci 2014

  • Important interactions between the substrate and the binding site

    WT: grey; F186L: color
    Hydrophobic interactions with F125, F226 and V227
    Hydrogen bonds with T118/T124.

    Ma et al., Interdiscip Sci Comput Life Sci 2014

  • The opening and closing of the active channels

    Percentage of open state

    Channel 2c         WT     F186L
    Apo enzyme         38%    43%
    Substrate bound    43%    10%

    Channel 2a         WT     F186L
    Apo enzyme         81%    71%
    Substrate bound    0%     0%

    F186L reduces the accessibility and limits product release.

    Ma et al., Interdiscip Sci Comput Life Sci 2014

  • Questions

                          quantitative?   ligand or receptor?   energy or geometry?   local or global?
    Pharmacophore
    De novo drug design
    QSAR
    Molecular Docking

  • Not strictly defined

                          quantitative?   ligand or receptor?   energy or geometry?   local or global?
    Pharmacophore         no              ligand & receptor     geometry              local
    De novo drug design   yes             receptor & ligand     energy and geometry   local
    QSAR                  yes             ligand or receptor    energy or geometry    global
    Molecular Docking     yes             ligand & receptor     energy and geometry   local
