TRANSCRIPT
Quantum Mechanical Properties of Atoms in Molecules via Machine Learning
Matthias Rupp
Fritz Haber Institute of the Max Planck Society, Berlin, Germany
Joint work with Raghunathan Ramakrishnan and O. Anatole von Lilienfeld, University of Basel, Switzerland
Ψk 2015 Conference, September 6–10, San Sebastian, Spain
Overview
Problem: Computational cost of numerical approximations limits uses of electronic structure theory
Goal: Combining quantum mechanics with machine learning to handle larger systems, longer simulations, more systems, and higher accuracy
Approach: Interpolation between reference calculations across geometries and compositions
Idea of QM/ML models
• QM/ML = quantum mechanics + machine learning
• exploit redundancy in a series of QM calculations
• interpolate between QM calculations using ML
[Figure: property versus molecular structure. Markers: reference calculations; solid line: QM; dashed line: ML]
Rupp, Int J Quant Chem 115(16): 1058, 2015.
Kernel ridge regression
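model: $\hat{f}(x) = \sum_{i=1}^{n} \alpha_i \, k(x_i, x)$
optimization problem: $\operatorname{argmin}_{\alpha \in \mathbb{R}^n} \sum_{i=1}^{n} \big(\hat{f}(x_i) - y_i\big)^2 + \lambda \, \alpha^T K \alpha$
solution: $\alpha = (K + \lambda I)^{-1} y$
with $k$ positive definite, $K_{ij} = k(x_i, x_j)$, regularization strength $\lambda \in \mathbb{R}$.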
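As a short worked step (not on the slide), the closed-form solution can be recovered by writing the predictions on the training set as $K\alpha$ and setting the gradient of the objective to zero. This sketch assumes $K$ is symmetric, which holds for a kernel matrix, and $\lambda > 0$, so that $K + \lambda I$ is invertible; the symbol $J$ for the objective is introduced here for illustration.

```latex
\begin{align*}
J(\alpha) &= \lVert K\alpha - y \rVert^2 + \lambda\, \alpha^T K \alpha \\
\nabla_\alpha J(\alpha) &= 2K(K\alpha - y) + 2\lambda K\alpha
  = 2K\big((K + \lambda I)\alpha - y\big) \\
(K + \lambda I)\alpha = y
  &\;\Longrightarrow\; \nabla_\alpha J(\alpha) = 0
  \;\Longrightarrow\; \alpha = (K + \lambda I)^{-1} y
\end{align*}
```

Because the objective is convex in $\alpha$ for positive semidefinite $K$, a point with zero gradient is a minimizer.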
Rupp, Int J Quant Chem 115(16): 1058, 2015.
Kernel ridge regression example
Weighted basis functions placed on training samples x_i
[Figure: kernel ridge regression example. Solid line: f(x) = cos(x); markers: training samples; Gaussian basis functions; dashed line: prediction f̂]
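The example on this slide can be approximated with a few lines of NumPy. This is a minimal sketch, not the code behind the figure: the number of training samples, the kernel width `sigma`, and the regularization strength `lam` are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between 1-D sample arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(x_train, y_train, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(x_train, x_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_train, alpha, x_new, sigma):
    """Evaluate f_hat(x) = sum_i alpha_i k(x_i, x) at the points x_new."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

# Assumed setup: 10 equidistant training samples of cos(x) on [-pi, pi].
x_train = np.linspace(-np.pi, np.pi, 10)
y_train = np.cos(x_train)
alpha = krr_fit(x_train, y_train, sigma=1.0, lam=1e-6)

# Compare the prediction with the true function on a dense grid.
x_test = np.linspace(-np.pi, np.pi, 200)
y_hat = krr_predict(x_train, alpha, x_test, sigma=1.0)
max_err = np.max(np.abs(y_hat - np.cos(x_test)))
print(f"maximum absolute error on the grid: {max_err:.2e}")
```

Each training sample contributes one Gaussian centered on it, scaled by its weight in `alpha`; the prediction is the sum of these weighted basis functions.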
Vu et al., Int J Quant Chem 115(16): 1115, 2015.
Local environments
Local properties of atoms in molecules
[Figure: atom Q in its local environment; axis label z]
Rupp et al., J Phys Chem Lett 6(16): 3309, 2015.
Local environments
Local atom-centered coordinate systems.
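atomic Coulomb matrix
$$M^{(Q)}_{I,J} = \begin{cases} \tfrac{1}{2} Z_I^{2.4} & I = J \\ \dfrac{Z_I Z_J}{\lVert R_I - R_J \rVert} & I \neq J \end{cases}$$
principal component coordinates
$$\left( \tfrac{1}{n} X^T X \right) v_\ell = \lambda_\ell v_\ell$$
$X V^T$
augmented by $Z_I$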
Representations sorted by distance to atom Q.
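The atomic Coulomb matrix with distance-based ordering can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the function name, the optional `n_neighbors` cutoff, and the choice to place atom Q first are assumptions, and the units of the coordinates are left to the caller.

```python
import numpy as np

def atomic_coulomb_matrix(Z, R, q, n_neighbors=None):
    """Coulomb matrix centered on atom q, rows/columns sorted by distance to q.

    Z: nuclear charges, shape (n,); R: Cartesian coordinates, shape (n, 3).
    n_neighbors (hypothetical parameter): keep only the closest neighbors of q.
    """
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)

    # Order atoms by distance to atom q; q itself comes first (distance 0).
    order = np.argsort(np.linalg.norm(R - R[q], axis=1), kind="stable")
    if n_neighbors is not None:
        order = order[: n_neighbors + 1]
    Z, R = Z[order], R[order]

    # Pairwise distances; the diagonal is set to 1 to avoid division by zero
    # and is overwritten with the diagonal term afterwards.
    D = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(D, 1.0)

    M = np.outer(Z, Z) / D                # off-diagonal: Z_I Z_J / ||R_I - R_J||
    np.fill_diagonal(M, 0.5 * Z ** 2.4)   # diagonal: 0.5 * Z_I^2.4
    return M

# Toy geometry (made-up coordinates): an oxygen with two hydrogens.
M = atomic_coulomb_matrix(Z=[1, 8, 1],
                          R=[[0, 0, 1], [0, 0, 0], [0, 2, 0]],
                          q=1)
print(M.shape)  # (3, 3)
```

Sorting by distance to Q gives every atom-centered matrix the same row and column convention, so entries are comparable across atoms and molecules.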
Rupp et al., Phys Rev Lett 108: 058301, 2012. Rupp et al., J Phys Chem Lett 6: 3309, 2015.
Data set and properties
• 9 k small organic molecules
• C, N, O, H; 7–9 non-H atoms
• subset of GDB9
• forces: 100 conformations for each of 168 C7H10O2 isomers
• nuclear chemical shifts
• core level excitations
• forces
Calculations at DFT/PBE0/def2TZVP level using Gaussian
Blum & Reymond, J Am Chem Soc 131(25): 8732, 2009.
Results
Property           Ref.   Range        MAE           %     R²
13C δ / ppm        2.4    6 – 211      3.9 ± 0.28    1.9   0.988 ± 0.001
1H δ / ppm         0.11   0 – 10       0.28 ± 0.01   2.8   0.954 ± 0.005
1s C δ / mEh       7.5    -165 – -2    4.9 ± 0.12    3.0   0.971 ± 0.002
FC / mEh a0^-1     1      -99 – 96     3.6 ± 0.10    1.8   0.983 ± 0.002
FH / mEh a0^-1     1      -43 – 43     0.8 ± 0.02    0.9   0.996 ± 0.003
MAE = mean absolute error, R = correlation coefficient
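The error measures in the table can be computed as in the sketch below. The function names are hypothetical, and the reading of the % column as MAE divided by the width of the range is an assumption; it is consistent with the 13C row but is not stated on the slide.

```python
import numpy as np

def mean_absolute_error(y_ref, y_pred):
    """MAE between reference values and predictions."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_ref)))

def squared_correlation(y_ref, y_pred):
    """R^2 as the squared Pearson correlation coefficient R."""
    r = np.corrcoef(y_ref, y_pred)[0, 1]
    return float(r ** 2)

def mae_percent_of_range(mae, lo, hi):
    """MAE as a percentage of the property range (assumed meaning of '%')."""
    return 100.0 * mae / (hi - lo)

# 13C chemical shift row: MAE 3.9 ppm over the range 6 to 211 ppm.
print(round(mae_percent_of_range(3.9, 6, 211), 1))  # 1.9
```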
Rupp et al., J Phys Chem Lett 6(16): 3309, 2015.
Linear scaling
[Figure: RMSE / % and compute time / days versus polymer length / nm (axis ticks 4 to 35) and polymer size / # electrons (axis ticks 234 to 2250). Legend: ● 13C δ, ■ 1H δ, ▲ 1s C δ, ○ FC, □ FH]
Rupp et al., J Phys Chem Lett 6(16): 3309, 2015.
Prediction of chemical shifts
[Figure: counts (#, logarithmic axis from 10 to 10^4) versus 13C δ / ppm (0 to 200). Legend: DFT, ML 0.5k, ML 1k, ML 10k, GDB9]
Rupp et al., J Phys Chem Lett 6(16): 3309, 2015.