enhanced free energy based structure prediction in ... · 5 j. phys. chem. b (2015); j. mol....
TRANSCRIPT
Mark E. Tuckerman Dept. of Chemistry and Courant Institute of Mathematical Sciences
New York University, 100 Washington Square East, NY 10003 NYU-ECNU Center for Computational Chemistry at NYU Shanghai 200062, China
纽约大学 - 华东师范大学计算化学中心在上海纽约大学
Enhanced Free Energy Based Structure Prediction
in Materials Science
NEW YORK UNIVERSITY MRSEC
Crystallization and the Pharmaceutical Industry: Materials Science?
Pharmaceutical Materials Science:
An Active New Frontier in Materials
Research
J. Elliott and B. Hancock
Guest Editors
MRS Bulletin 31, Issue 11 (2006)
Pharmaceutical Industry
• $500 billion global market
• $250 billion US market
• Majority of pharmaceuticals are
small molecule crystals
“One of the continuing scandals in the physical sciences is that it remains impossible to
predict the structure of even the simplest crystalline solids from their chemical composition.”
-- John Maddox, Nature (1998)
Targets of the sixth blind structure prediction test
Accuracy of intermolecular interactions
Empirical potentials fit to SAPT(DFT) calculations
( )
( ) (1) ( )
1 bonded
6,8,10
( ) 1 ( , ) ( , ) ( )ij ij
n
r i j ijl l n
ij ij ij ij n ij ij nij l nij ij
q q CU A r e f r f r U
r r
R R
0
( , ) 1!
mn
r
n
m
rf r e
m
Machine-learning (‘mathematical’) potentials
Standard force fields
bonded12 6( ) ( )
i j ij ij
ij ij ij ij
q q a bU U
r r r
R R
Kohn-Sham density functional theory (plus dispersion):
2
{ }
1 1 ( ) ( )( ) [ ] [ , ] ( )
2 2 'min i i xc ext NN
i
n nU d d E n E n U
r r
R r r R Rr r
2( ) ( ) i i j ij
i
n r r
Structure prediction protocol for Target XXII
P212121
Lowest Energy Structures
P21/n P21/n
Pbca C2/c Pbca
Optimize monomer
geometry using DFT
[PBE0 + D3, 6-311G**(d,p)
Generate Unit Cells
(UPACK)
25,000 generated structures
4,042 unique after clustering
107 within 10 kJ/mol of lowest energy
Run MD at exptl. P,T
Summary of 50 predicted structures based on MD averaged energies
I: P21/n II: Pbca, E = 3.46 kJ/mol III: P21/c, E = 5.37 kJ/mol
IV: P21/n, E = 5.99 kJ/mol V: P21/c, E = 6.31 kJ/mol
Spc grp a b c β
Expt. P21/n 11.95 6.70 12.60 108.6
Calc. P21/n 12.02 6.72 12.48 108.7
Prediction of the experimental structure
At 150 K: RMSD20 from experimental structure = 0.15 Å. Rank = 4 on our list
Final ranking based on POLY1 SAPT(DFT) potential
At 300 K: RMSD20 from experimental structure = 0.18 Å. Rank = 4 on our list
Attempted predictions = 21, Times predicted = 12, Times ranked 5 or lower = 7
Pharmaceutical polymorphism
HIV-1 protease
Polymorphism refers to the ability
of a compound to form multiple
crystal structures.
Ritonavir: HIV protease inhibitor
• NIH: $3,500,000 investment
• Abbott Labs: $200,000,000 investment
• Originally dispensed in 1996 as ordinary
capsules, no refrigeration required
• Converted to lower energy (hitherto unknown)
polymorph (form I to form II) on the shelf
• Form II: Poor solubility, lower bioavailability
• Required recall (1998) and reformulation as
gel caps (2002).
Pharmaceutical Polymorphism
Ranitidine hydrochloride (Zantac) C13H22N4O3S
Bond et al. Amer. Pharma. Rev. (2007) Acetylsalicylic acid
Rough energy landscapes in proteins and crystals
Dunitz and Scheraga
PNAS 101, 14309 (2004)
Sampling via collective variables in molecular simulation
φ ψ
Peptides:
Crystals:
Packing:
Orientation:
Cell shape:
x x x
y y y
z z z
a b c
a b c
a b c
h
2
1
4( ) ( , )
(2 1)
bNl
l b lm b b
m l b
Q f r Yl N
( , )
1
1
( ,..., ) d d ( ( ) )n
H
nP s s e q s
p r
p r r
Adiabatic free energy (temperature-accelerated molecular) dynamics (AFED/TAMD)
L. Rosso, P. Minary, Z. Zhu, MET J. Chem. Phys. 116, 4389 (2002)
Margliano and Vanden-Eijnden, Chem. Phys. Lett. 426, 168 (2006); J. B. Abrams and MET, J. Phys. Chem. B 112, 14752 (2008)
For prediction of crystal structures: Yu and MET Phys. Rev. Lett. 107, 015701 (2011); Yu et al. J. Chem. Phys. 150, 214109 (2014)
1/2
2
{ }( ) lim exp ( )
2 2q s q s
r r
Write δ-functions as product of Gaussians:
Canonical probability distribution and free energy surface:
2
2
1 1
( , , , ) ( , ) ( )2 2
n ns
s
ps p H q s
m
p r p r rH
1( ,..., ) ( ) 1,...,Nq q q n r r r
Suppose n collective variables characterize a free energy landscape of interest
1 1( ,..., ) ln ( ,..., )n B nA s s k T P s s
Dynamically sample Gaussian centers s1,…,sn by introducing a kinetic energy for them
And extending the phase space, which gives the following Hamiltonian:
Adiabatically decoupled equations of motion:
( ) heat bath( )i i
i i
qHm s q T
r rr r
( ) heat bath( )sm s s q T r
Introduce high temperature for extended variables and high masses s iT T m m
Under adiabatic conditions, we generate a distribution ({ })
adb 1( ,..., , , )n sP s s T T
({ })
adb 1{ }
1 1{ }
lim ln ( ,..., , , )
lim ln ( ,..., , ) ( ,..., , )
s n s
s n n
s
kT P s s T T
TkT P s s T A s s T
T
/({ })
adb 1 1{ }lim ( ,..., , , ) ( ,..., , ) sT T
n s nP s s T T P s s T
Adiabatic free energy (temperature-accelerated molecular) dynamics (AFED/TAMD)
Alanine Decamer (gas phase)
Force field: CHARMM22
20 CVs: All (φ,ψ) pairs
CV Temperature: Ts = 900 K
Physical Temp: T = 300 K
CV mass: m(ψ,φ) = 600mH
Harmonic coupling: 300.5 kcal/mol/rad2
Other developments: 1. Combining with metadynamics used as a bias:
2. Optimal free energy reconstruction protocols:
M. Chen. M. Cuendet, MET J. Chem. Phys. 137, 024102 (2012)
M. Cuendet and MET J. Chem. Theor. Comput. 10, 2975 (2014)
GROMOS FF, 100 K, 32,000 40,000 KsT T
216 (3 3 3), 2 GPa, Total run time = 500 ps, 5 nsN P
Constant pressure version
with cell matrix as CVs
II(01)
X-ray powder diffraction patterns
Yonetani et al. (2001).
II(98)
98
01
MURI/ARO
Free energies of benzene polymorphs
Six-dimensional free energy surface analyzed using
clustering based on a Gaussian mixture model.
II(98) II(01)
Cmca
P41212
P21/c
C2/c
P21/c
(see, also, J. Yang et al.
Science 345, 640 (2014))
Polymorphs of resorcinol
• Resorcinol (C6H6O2) is a simple benzene derivative.
• At ambient conditions, forms two stable forms (α and β) depending on the relative orientations of the OH bonds, both having space group Pna21
1.
• High pressure phase (δ phase) exists and arises from an α → δ at pressures above 3 GPa2. Structure unknown.
• Another high pressure phase (ε phase) can be reached from β phase above 5.6 GPa. This phase is also thought to be disordered based on Raman scattering2.
• Another ambient polymorph (γ phase) in the Rb conformation recently found3. Its exact structure is unknown4.
• Recent new experimental activity in study of resorcinol.5
1Robertson and Ubbelohde Proc. Royal Soc. London 167 122 (1938) 2 Rao et al. Phys. Rev. B 65 054108 (2002). 3 Kahr et al. Cryst. Growth Design 11 2070 (2011). 4 Zhu et al. Angew. Chem. Intl. Ed. (submitted) 5 J. Phys. Chem. B (2015); J. Mol. Struct. (2015).
polymorph
12P
transition
at 3 GPa
Preliminary free energy estimates for resorcinol
form δ form
a/A b/A c/A a/A b/A c/A
Cell length 10.53 9.53 5.60 8.20 11.65 4.88
Cell angle 90 90 90 90 90 90
Space GP Pna21 Pna21
Supercell length A
Energetic Co-crystals: DADP.TxTNB
DADP
TCTNB
β
( iodo- )
α
( chloro- )
( bromo- )
Experimental Structures
Polarizable system without pre-determined force field parameters
= opportunity for ab initio enhanced sampling!
Exp. Structures (Maztger Group):
J. Am. Chem. Soc. 2015, 137, 5074−5079
Angew. Chem. Int. Ed. 2013, 52, 6468-6471
1 3
CP2K 2.5: PBE+D3/DZVP-MOLOPT-(SR)-GTH
For optimized unit cells
within PBE+D3, DZVP, 450 Ry in CP2K
Cl α < Cl β ΔE ≈ 1.6 kJ/mol
Br α > Br β ΔE ≈ 4.9 kJ/mol
I α >> I β ΔE ≈ 15.8 kJ/mol
Energetic Co-crystals: DADP.TxTNB
β
( iodo- )
α
( chloro- )
( bromo- )
• Observed DADP.TBTNB co-crystal is a
kinetic product – aggregation promotes
co-crystal growth?
• Can we obtain the actual free energy
difference between the two forms for
each co-crystal?
• Implementation of crystal-AFED in CP2K
Start with nuclei Compute ,i n ,i n F
Propagate nuclei a
short time Δt with F
Add electrons
Add electrons
Ab initio molecular dynamics
2
( ) 2 (0) ( ) (0)I I I I
I
tt t
M
R R R F
e.g. Verlet:
{ }min [{ }, ]I I IM E
R R
11.0 picosecond of ab initio crystal-AFED with CP2K for TCTNB/DADP co-crystal
Experimental structure: α New Polymorph Alternate polymorph: β
3
14.344
15.272
9.006
91.3
1.549 g/cm
a
b
c
3
25.797
9.450
12.638
99.6
1.542 g/cm
a
b
c
3
16.724
9.450
12.638
69.7
1.631 g/cm
a
b
c
MURI/ARO
Crystal structures of Xenon at 2700 K and 25 GPa
Using Q4 and Q6 only
as collective variables
Ts = 1.5 x 105 K
ms = 108 amu
Using Q4, Q6, and the
cell matrix hd as
collective variables
Ts = Th = 105 K,
ms = 108 amu
T. Q. Yu, P. –Y. Chen, A. Samanta, E. Vanden-Eijnden, MET J. Chem. Phys. (2014)
Experimental evidence mixed fcc-hcp structures
in pressure-induced Martensitic transitions
from fcc to hcp as intermediates:
Jephcoat et al. Phys. Rev. Lett. 59, 2670 (1987)
Cynn et al. Phys. Rev. Lett. 86, 4552 (2001)
The mechanism of equilibrium melting of a solid
A. Samanta, T.-Q. Yu, W. E, MET Science (November, 2014)
Classical nucleation theory:
3 24( ) 4 , < 0
3sG r r r
3
2
2
16 ( )
3
s
s
r
G r
r*
Crystal d-AFED simulations of the melting of Cu
Collective variables: hd, Q4, Q6, T = Tm, Ts=Th = 107 K
Potential model: Embedded Atom Model (EAM) of Mishin et al.
Phys. Rev. B 63, 224106 (2001).
A. Samanta, T.-Q. Yu, W. E, MET Science (November, 2014)
Free energy surfaces
One-dimensional profiles constructed using string method [Maragliano et al. JCP (2006)]
Navigating on a high-dimensional surface
From a minimum, we ask for the
nearest accessible saddle points.
These can be found using gentlest
ascent dynamics (GAD) on the FES.
A. Samanta, M. Chen, T. –Q. Yu, W. E, MET J. Chem. Phys. 140, 164109 (2014).
M. Chen, T. –Q. Yu, MET PNAS 112, 3235 (2015)
s F
Activation:
Combine with relaxation to next minimum:
Termed “STochastic Activation-Relaxation Technique” or START
Example: Alanine tripeptide
Landmarks determined using density-based clustering methods.
Minima and saddles are assigned as vertices of different colors.
An edge connects a saddle vertex to a minimum vertex if a relaxation initiated at
the saddle ends up at the corresponding minimum.
Graph representation of free energy surfaces
Full network diagram for
met-enkephalin using
10 Ramachandran angles
as CVs:
1081 minima, 1431 saddles
Graph nodes sorted based on
Sketch Map analysis
M. Ceriotti, G. A. Tribello, M. Parrinello Proc. Natl. Acad. Sci. 108, 13023 (2011)
22 ( ) ( )ij ij
i j
F R f r
If we want to compute free energies, use the minima obtained in a START run to
indicate where a set of bins should be placed and use AFED sampling to obtain the
populations in those bins.
Can be adapted for crystal structure prediction! [M. Chen, T. –Q. Yu, MET (in prep)]
Can handle explicit solvent!
Spc grp a b c β
Expt. P21/c 9.45 6.04 8.56 103.4
Calc. P21/c 9.95 5.90 8.79 106.6
Experimental structure first found in 1 ns using orientational CVs. RMSD4 of carbons = 0.16 Å, full RMSD4 = 0.44 Å using a simple, universal force field
M. Chen, T. –Q. Yu,
MET (in prep)
Conclusions and Perspectives
1. Protocol for crystal structure prediction including specialized force field
generation [SAPT(DFT)] combined with molecular dynamics refinement.
2. Enhanced sampling approaches further allow for polymorph prediction and
ranking based on free energy, which is the proper figure of merit.
3. Surface navigation combined with enhanced sampling looks to be a promising
approach going forward, however, it is important to have a robust set of
internal collective variables.
4. Considerable challenges remain including:
i. Highly flexible molecules.
ii. High-temperature and/or high-pressure polymorphs.
iii. Stacking faults, screw axes, mixed phases,….
iv. Isotopic polymorphism (requires, e.g., path integrals).
5. From Nature press release based on CCDC 6th blind test, “It is fair to claim that,
to a large extent, this blind test has shown that the problem of organic structure
prediction has been solved.” – M. Neumann. This claim might be exaggerated,
and the field of crystal structure prediction far from dead.
Acknowledgments
Postdocs • Elia Schneider • Linas Vilciauskas • Leslie Vogt
External
• Krzysztof Szalewicz (U. Del.) • Michael Metz (U. Del.) • Andrew Rohl (Curtin U.) • Bart Kahr (NYU) • Qiang Zhu (Stony Brook)
Students
• Ming Chen • Tang-Qing Yu • Manav Kumar • Hongxing Song