Ligand search and data mining of Structural Genomics structures
Abhinav Kumar, Herbert Axelrod, Ashley Deacon
Structure Determination Core, Joint Center for Structural Genomics (JCSG), Stanford Synchrotron Radiation Laboratory, Menlo Park, CA, USA
UCSD & Burnham (Bioinformatics Core): John Wooley, Adam Godzik, Slawomir Grzechnik, Lukasz Jaroszewski, Dana Weekes, Lian Duan, Sri Krishna Subramanian, Natasha Sefcovic, Piotr Kozbial, Andrew Morse, Prasad Burra, Tamara Astakhova, Josie Alaoen, Cindy Cook
TSRI (NMR Core): Kurt Wüthrich, Reto Horst, Maggie Johnson, Amaranth Chatterjee, Michael Geralt, Wojtek Augustyniak, Pedro Serrano, Bill Pedrini, William Placzek
Stanford/SSRL (Structure Determination Core): Keith Hodgson, Ashley Deacon, Mitchell Miller, Debanu Das, Hsiu-Ju (Jessica) Chiu, Kevin Jin, Christopher Rife, Qingping Xu, Silvya Oommachen, Scott Talafuse, Henry van den Bedem, Ronald Reyes, Christine Trame
Scientific Advisory Board:
Sir Tom Blundell, Univ. Cambridge
Robert Stroud, Center for Structure of Membrane Proteins, Membrane Protein Expression Center, UC San Francisco
Homme Hellinga, Duke University Medical Center
James Naismith, The Scottish Structural Proteomics facility, Univ. St. Andrews
James Paulson, Consortium for Functional Glycomics, The Scripps Research Institute
Soichi Wakatsuki, Photon Factory, KEK, Japan
Todd Yeates, UCLA-DOE Inst. for Genomics and Proteomics
James Wells, UC San Francisco
The JCSG is supported by the NIH Protein Structure Initiative (PSI) Grant U54 GM074898 from NIGMS (www.nigms.nih.gov). Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory (SSRL). The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the NIH.
GNF & TSRI (Crystallomics Core): Scott Lesley, Mark Knuth, Dennis Carlton, Thomas Clayton, Kevin D. Murphy, Christina Trout, Marc Deller, Daniel McMullan, Heath Klock, Polat Abdubek, Claire Acosta, Linda M. Columbus, Julie Feuerhelm, Joanna C. Hale, Thamara Janaratne, Hope Johnson, Linda Okach, Edward Nigoghossian, Sebastian Sudek, Aprilfawn White, Bernhard Geierstanger, Glen Spraggon, Ylva Elias, Sanjay Agarwalla, Charlene Cho, Bi-Ying Yeh, Anna Grzechnik, Jessica Canseco, Mimmi Brown
4. JCSG Ligand Search
Search Results (35 hits)
Ligand Visualization Links
Ligand Depot: ACY ADP AMP BR CA CL EDO FMN GLC GOL IOD MG NCA NI ORO P33 PO4 SO4
HIC-Up: ACY ADP AMP BR CA CL EDO FMN GLC GOL IOD MG NCA NI ORO P33 PO4 SO4
N | Target | PDB | PFAM | Accession | Description | Organism | Ligands | PSI
35 | TB0885A | 1vp8 | PF08981 | NP_068944.1 | Crystal Structure of Hypothetical Protein (NP_068944.1) from Archaeoglobus Fulgidus at 1.30 Å resolution | Archaeoglobus Fulgidus Dsm 4304 | FMN UNL | JCSG
34 | SGT98480 | 1q45 | PF00724 | NP_178662.1 | 12-Oxo-Phytodienoate Reductase Isoform 3 | Arabidopsis Thaliana | FMN | CESG
…………………….
3 | FJ9446A | 2ou5 | PF01243 | YP_508196.1 | Crystal Structure of Pyridoxamine 5'-phosphate Oxidase-Related FMN-binding (YP_508196.1) from Jannaschia Sp. Ccs1 at 1.60 Å resolution | Jannaschia Sp. Ccs1 | FMN GOL SO4 | JCSG
2 | FH7614A | 2ig6 | PF01243 | NP_349178.1 | Crystal Structure of NIMC/NIMA Family Protein (NP_349178.1) from Clostridium Acetobutylicum at 1.80 Å resolution | Clostridium Acetobutylicum | EDO FMN SO4 UNL | JCSG
1 | FB10607B | 2r6v | PF01613 | NP_142786.1 | Crystal Structure of FMN-binding Protein (NP_142786.1) from Pyrococcus Horikoshii at 1.35 Å resolution | Pyrococcus Horikoshii Ot3 | EDO FMN NCA | JCSG
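The kind of query behind these search results can be sketched in a few lines of Python. This is a minimal illustration only, not the code behind the JCSG Ligand Search tool: the `Hit` record and the `search` function are hypothetical names, and the data are just the five rows visible above (the elided rows are omitted).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One row of the results table (hypothetical record layout)."""
    pdb: str
    pfam: str
    organism: str
    ligands: Tuple[str, ...]
    psi: str  # PSI center that deposited the structure


# The five rows shown in the search results; elided rows are not reproduced.
HITS = [
    Hit("1vp8", "PF08981", "Archaeoglobus Fulgidus Dsm 4304", ("FMN", "UNL"), "JCSG"),
    Hit("1q45", "PF00724", "Arabidopsis Thaliana", ("FMN",), "CESG"),
    Hit("2ou5", "PF01243", "Jannaschia Sp. Ccs1", ("FMN", "GOL", "SO4"), "JCSG"),
    Hit("2ig6", "PF01243", "Clostridium Acetobutylicum", ("EDO", "FMN", "SO4", "UNL"), "JCSG"),
    Hit("2r6v", "PF01613", "Pyrococcus Horikoshii Ot3", ("EDO", "FMN", "NCA"), "JCSG"),
]


def search(hits: List[Hit],
           ligand: Optional[str] = None,
           pfam: Optional[str] = None,
           psi: Optional[str] = None) -> List[Hit]:
    """Return the hits that satisfy every criterion that was supplied."""
    result = []
    for hit in hits:
        if ligand is not None and ligand.upper() not in hit.ligands:
            continue
        if pfam is not None and pfam.upper() != hit.pfam:
            continue
        if psi is not None and psi.upper() != hit.psi:
            continue
        result.append(hit)
    return result


if __name__ == "__main__":
    for hit in search(HITS, ligand="FMN", pfam="PF01243"):
        print(hit.pdb, hit.organism, " ".join(hit.ligands))
```

Combining criteria with a logical AND keeps the sketch close to what the table suggests: a ligand code narrows the hit list, and the PFAM or PSI columns narrow it further.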
8. Unique PSI Ligands
PDB | Ligand Name | Ligand | PSI
2A3L | Coformycin 5'-Phosphate | CF5 | CESG
2OU3 | 1H-Indole-3-Carbaldehyde | I3A | JCSG
1VR0 | (2R)-3-Sulfolactic Acid | 3SL | JCSG
2OD6 | 10-Oxohexadecanoic Acid | OHA | JCSG
1X92 | D-Glycero-D-Mannopyranose-7-Phosphate | M7P | MCSG
1O8B | Beta-D-Arabinofuranose-5'-Phosphate | ABF | MCSG
2OSU | 6-Diazenyl-5-Oxo-L-Norleucine | DON | MCSG
1M33 | 3-Hydroxy-Propanoic Acid | 3OH | MCSG
1RTW | (4-Amino-2-Methylpyrimidin-5-Yl)Methyl Dihydrogen Phosphate | MP5 | NESG
2NW9 | 6-Fluoro-L-Tryptophan | FT6 | NESG
1XKL | 2-Amino-4H-1,3-Benzoxathiin-4-Ol | STH | NESG
1LW4 | 3-Hydroxy-2-[(3-Hydroxy-2-Methyl-5-Phosphonooxymethyl-Pyridin-4-Ylmethyl)-Amino]-Butyric Acid | TLP | NYSGXRC
2B4B | N-Ethyl-N-[3-(Propylamino)Propyl]Propane-1,3-Diamine | B33 | NYSGXRC
1TUF | Azelaic Acid | AZ1 | NYSGXRC
2PUZ | N-(Iminomethyl)-L-Glutamic Acid | NIG | NYSGXRC
2Q09 | 3-[(4S)-2,5-Dioxoimidazolidin-4-Yl]Propanoic Acid | DI6 | NYSGXRC
2GVC | 1-Methyl-1,3-Dihydro-2H-Imidazole-2-Thione | MMZ | NYSGXRC
1Y0G | 2-[(2E,6E,10E,14E,18E,22E,26E)-3,7,11,15,19,23,27,31-Octamethyldotriaconta-2,6,10,14,18,22,26,30-Octaenyl]Phenol | 8PP | NYSGXRC
1Z2L | Allantoate Ion | 1AL | NYSGXRC
1Y80 | Co-5-Methoxybenzimidazolylcobamide | B1M | SECSG
1KPH | Didecyl-Dimethyl-Ammonium | 10A | TBSGC
1KPI | Didecyl-Dimethyl-Ammonium | 10A | TBSGC
1N2H | Pantoyl Adenylate | PAJ | TBSGC
1N2I | Pantoyl Adenylate | PAJ | TBSGC
1BVR | Trans-2-Hexadecenoyl-(N-Acetyl-Cysteamine)-Thioester | THT | TBSGC
1QPR | 5-Phosphoribosyl-1-(Beta-Methylene) Pyrophosphate | PPC | TBSGC
1P44 | 5-{[4-(9H-Fluoren-9-Yl)Piperazin-1-Yl]Carbonyl}-1H-Indole | GEQ | TBSGC
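The poster does not state how "unique" was computed. One plausible reading, taken here as an assumption, is that a ligand code counts as unique when it occurs in PSI structures but in no non-PSI structure. The sketch below illustrates that set difference. The PSI entries list only ligand codes mentioned on this poster; `NON_PSI_CODES` is an illustrative stand-in for a full scan of the PDB, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# PSI structures with the ligand codes mentioned for them on this poster.
PSI_STRUCTURES: Dict[str, Set[str]] = {
    "1VR0": {"3SL"},
    "2OU3": {"I3A"},
    "2OD6": {"OHA"},
    "2OU5": {"FMN", "GOL", "SO4"},
    "2IG6": {"EDO", "FMN", "SO4", "UNL"},
}

# ASSUMPTION: stand-in for "all ligand codes seen in non-PSI structures".
# A real analysis would derive this set from the whole PDB.
NON_PSI_CODES: Set[str] = {"FMN", "GOL", "SO4", "EDO"}

# Placeholder codes for unidentified density; excluded from the comparison.
PLACEHOLDERS: Set[str] = {"UNL", "UNX", "UNK"}


def unique_psi_ligands(psi: Dict[str, Set[str]],
                       non_psi_codes: Set[str],
                       placeholders: Set[str]) -> Dict[str, List[str]]:
    """Map each ligand code found only in PSI structures to its PDB entries."""
    found = defaultdict(list)
    for pdb_id, codes in psi.items():
        for code in codes - non_psi_codes - placeholders:
            found[code].append(pdb_id)
    return {code: sorted(ids) for code, ids in sorted(found.items())}


if __name__ == "__main__":
    for code, ids in unique_psi_ligands(PSI_STRUCTURES, NON_PSI_CODES, PLACEHOLDERS).items():
        print(code, ", ".join(ids))
```

Under these assumptions the sketch returns 3SL, I3A, and OHA, which are the three JCSG entries in the table.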
9. Unique Ligands
(R)-2-Hydroxy-3-Sulfopropanoic acid (3SL) bound to the structure of putative 2-phosphosulfolactate phosphatase from Clostridium Acetobutylicum (1VR0)
Indole-3-Carboxaldehyde (I3A) bound to the structure of tellurite resistance protein of COG3793 (ZP_00109916.1) from Nostoc Punctiforme PCC 73102 (2OU3)
10-Oxohexadecanoic acid (OHA) bound to the structure of Ferredoxin-like Protein (JCVI_PEP_1096682647733) from an environmental metagenome (Unidentified Marine Microbe) (2OD6)
Unknown Ligands (UNL)
FK9436A (2OH1): Acetyltransferase, GNAT family
FB8805A (2Q9K): Unknown protein
10. Ligands bound to JCSG new folds
Target | PDB | Description | Organism | Ligand
CL6107A | 2ICH | Putative ATTH (NP_841447.1) at 2.00 Å | Nitrosomonas Europaea | NHE
TB0797A | 1VR0 | Putative 2-phosphosulfolactate Phosphatase at 2.6 Å | Clostridium Acetobutylicum | 3SL
TM0160 | 1VJL | Predicted Protein related to Wound Inducive Proteins in Plants at 1.90 Å | Thermotoga Maritima | UNL
TM0449 | 1KQ4 | Thy1-complementing Protein at 2.25 Å | Thermotoga Maritima | FAD
TM0574 | 1VKY | S-adenosylmethionine tRNA Ribosyltransferase at 2.00 Å | Thermotoga Maritima | UNL
TM1394 | 1VQ0 | 33 kDa Chaperonin (Heat Shock Protein 33 Homolog) at 2.20 Å | Thermotoga Maritima | UNL
TM1464 | 1VKM | Conserved Hypothetical Protein Possibly Involved in Carbohydrate Metabolism at 1.90 Å | Thermotoga Maritima Msb8 | UNL
TM1506 | 1VK9 | Hypothetical Protein at 2.70 Å | Thermotoga Maritima | UNL
TM1553 | 1VRM | Hypothetical Protein at 1.58 Å | Thermotoga Maritima Msb8 | UNL
[Figure: structure images labeled 2ICH, 1VQ0, 1VR0, 1VJL, 1KQ4, 1VKY, 1VRM, 1VK9, 1VKM]
2. The JCSG Target Pipeline
Each project moves from target selection through publication along the Target Pipeline.
3. The Role of the Structure Determination Core in the JCSG
1. Screen Crystals and Collect Data
2. Automatically Process Data (Autoindex, Integrate, Scale, Solve, Trace)
3. Refine and Evaluate Structures
4. Disseminate Information*: Publish; Web-based Tools: TOPSAN (www.topsan.org), Ligand Search (smb.slac.stanford.edu/public/jcsg/cgi/jcsg_ligand_check.pl)
* in collaboration with BIC
1. The Joint Center for Structural Genomics (JCSG)
The JCSG (www.jcsg.org) is one of the four large-scale structural genomics centers funded by NIGMS as part of the production phase of the Protein Structure Initiative (PSI). As of 2007, the PSI centers have deposited more than 2600 structures into the PDB, of which the JCSG has contributed over 500.
11. Binding Modes of Ligands
There are over 340 structures in the PDB with the co-factor flavin mononucleotide (FMN) bound to the protein.
The binding poses of FMN display considerable variations due to the torsional flexibility in the molecule.
However, unique binding poses can be observed in proteins belonging to specific PFAM families.
Number of Structures
PFAM | PSI | Non-PSI | Total
PF01243 (Pyridoxamine 5'-phosphate oxidase) | 7 | 14 | 21
PF01613 (Flavin reductase like domain) | 2 | 8 | 10
PF04289 (Unknown Function DUF447) | 3 | 0 | 3
[Figure panels: PF01243, PF01613, PF04289; structure label 1y30]
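The torsional flexibility noted in this panel is usually quantified with dihedral (torsion) angles, each defined by four consecutive atoms, for example along the ribityl chain of FMN. The sketch below computes such an angle from four 3D positions with the standard atan2 formulation. The function name and the coordinates in the usage example are made up for illustration; the poster does not say which angles or tools were used to compare the binding poses.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_deg(p0: Sequence[float], p1: Sequence[float],
                 p2: Sequence[float], p3: Sequence[float]) -> float:
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    y = b2_len * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates: the fourth atom is rotated a quarter turn
    # about the p1-p2 bond relative to the first atom.
    print(dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                       (0.0, 1.0, 0.0), (0.0, 1.0, 1.0)))
```

Comparing a handful of such angles between two bound FMN molecules gives a compact, superposition-free description of how their poses differ.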
5. Summary of Ligands (1606 structures)
Ligands (269 structures; 140 different ligands): UNL(70), UNX(22), LLP(6), SIN(6), NDP(6), MA7(6), NAG(5), PLM(4), UNK(4), GUN(3), APC(3), SUC(3), BAL(3), GLC(3), PAF(3), APR(2), GAL(2), NCN(2), CSD(2), SAI(2), CEI(2), BIO(2), HMH(2), SAP(2), GNP(2), 144(2), NCA(2), G4P(2), MPO(2), SRT(2), ANP(2), PCP(2), BGC(2), PAJ(2), NIG(1), PRP(1), NIO(1), ABF(1), IPR(1), MTA(1), CP(1), MLT(1), DI6(1), MED(1), MLZ(1), 5GP(1), CSO(1), CDP(1), I3A(1), 2PL(1), HED(1), G1P(1), NBZ(1), CSY(1), FRU(1), PLG(1), THF(1), B1M(1), ACP(1), DU(1), MMZ(1), OHA(1), 16A(1), THT(1), M7P(1), 3GC(1), CF5(1), PEO(1), CTZ(1), ADE(1), FT6(1), KEG(1), LUM(1), XLS(1), BAM(1), ADN(1), PMP(1), ADQ(1), B33(1), DGI(1), G3H(1), OXG(1), NDS(1), SAL(1), 3SL(1), SIB(1), STH(1), FEO(1), G3P(1), OXN(1), FES(1), TYD(1), DGT(1), 8PP(1), CO2(1), MP5(1), NTM(1), PNS(1), AES(1), APK(1), UVW(1), TRE(1), PYR(1), NAI(1), TCL(1), NMN(1), MAN(1), BFD(1), HHP(1), RIP(1), RBF(1), ORO(1), SNN(1), DTP(1), ZID(1), DEP(1), UPG(1), HXA(1), AAT(1), DTY(1), DON(1), NPO(1), C2E(1), AGC(1), BDF(1), PHT(1), OSB(1), NVA(1), CRO(1), BDN(1), TNE(1), SOG(1), AGS(1), TLP(1), 1PS(1), DUT(1), CXS(1), GEQ(1), MRD(1), G6P(1)
Co-factors (211 structures; 21 different co-factors): FMN(36), NAD(29), COA(18), NAP(17), PLP(15), ADP(15), FAD(15), SAM(14), ATP(9), SAH(9), AMP(9), HEM(8), ACO(7), GDP(4), FS4(3), U5P(2), MLC(1), COD(1), CNC(1), UTP(1), CTP(1)
Metal Ions (647 structures; 30 different metal ions): MG(177), ZN(174), NA(102), CA(83), NI(40), MN(31), FE(26), K(16), FE2(9), CD(8), PT(8), HG(7), CO(5), SM(2), WO4(2), PR(2), AU(2), BA(1), CS(1), MW2(1), SE(1), ARS(1), ZN3(1), O4M(1), YT3(1), LI(1), MO2(1), MO3(1), VO4(1), MO6(1)
Non-metal Ions (692 structures; 22 different non-metal ions): SO4(324), CL(243), PO4(118), NO3(11), IOD(10), BR(10), SCN(8), CO3(4), CAC(4), POP(3), AZI(3), SUL(2), BCT(2), ALF(2), OXL(2), PER(1), SO3(1), MLI(1), PO3(1), THJ(1), 1AL(1), NH4(1)
Organics (90 structures; 26 different organics): IPA(14), EOH(13), BME(9), BEZ(5), TLA(5), SEO(5), AKG(5), ETX(4), TAR(4), PGO(4), DTT(4), OAA(2), ACE(2), DMS(2), MLA(1), DOX(1), XYL(1), MOH(1), 3OH(1), AZ1(1), PPI(1), IOH(1), FOR(1), MYR(1), GTT(1), LMT(1)
Buffers (240 structures; 15 different buffers): ACT(86), ACY(47), FMT(37), CIT(27), TRS(16), EPE(15), MES(12), IMD(8), TMN(2), 10A(2), BTB(2), ICT(1), CPS(1), FLC(1), NHE(1)
Precipitants (98 structures; 13 different precipitants): PEG(38), PG4(28), PGE(16), 1PE(8), P6G(7), 2PE(3), PE4(3), P33(3), PE5(2), PEF(1), BU3(1), 1PG(1), PE8(1)
Salts (3 structures; 3 different salts): DPO(1), AF3(1), PPC(1)
Detergents (2 structures; 1 detergent): BOG(2)
Cryos (502 structures; 5 different cryos): GOL(244), EDO(241), MPD(32), EGL(3), CRY(2)
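Each category line above follows the same `CODE(count)` pattern, so the tallies can be read back into a data structure for further mining. The sketch below parses one listing with a regular expression; `parse_counts` is a hypothetical helper name, and the input is the Cryos listing copied from the summary.

```python
import re
from collections import Counter

# A ligand code (letters and digits) immediately followed by a count in parentheses.
ENTRY = re.compile(r"([A-Za-z0-9]+)\((\d+)\)")


def parse_counts(listing: str) -> Counter:
    """Turn 'GOL(244), EDO(241), ...' into a Counter mapping code to count."""
    return Counter({code: int(count) for code, count in ENTRY.findall(listing)})


CRYOS = "GOL(244), EDO(241), MPD(32), EGL(3), CRY(2)"

if __name__ == "__main__":
    counts = parse_counts(CRYOS)
    print("distinct codes:", len(counts))
    print("sum of counts:", sum(counts.values()))
    print("most common:", counts.most_common(2))
```

For the Cryos listing the per-code counts add up to more than the 502 structures quoted in the heading, presumably because some structures contain more than one cryoprotectant, so the category totals should not be read as simple sums of the per-ligand counts.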