Expansion of bond dissociation prediction with machine learning to medicinally and environmentally relevant chemical space
Article information
Shree Sowndarya S. V., Yeonjoon Kim, Seonah Kim, Peter C. St. John, Robert S. Paton
Bond dissociation energetics underpin the thermodynamics of chemical transformations in which bonds are broken or formed, and they can also be used to predict reaction rates and selectivities. Current machine learning (ML) models for predicting bond dissociation energy (BDE) are largely limited in their elemental coverage to hydrogen and the second-row elements. This has restricted the applicability of ML-derived BDE predictions, particularly for molecules of medicinal relevance, since the heteroatoms S, Cl, F, P, Br, and I are commonly found in approved pharmaceuticals. Atmospherically and environmentally relevant molecules containing multiple halogen atoms have been similarly inaccessible. In this study, we considerably expand the size, elemental composition, and bond types of an extensive BDE database and train a new ML BDE model that covers C, H, N, O, S, Cl, F, P, Br, and I. We curate a new quantum chemical dataset of 531,244 unique zero-point-energy-inclusive homolytic dissociations of organic compounds. We investigate accuracy for out-of-sample molecules and implement iterative training and testing cycles during model development to improve model accuracy. Targeted augmentation of the training data with as few as eight additional molecules reduced prediction errors from 5.7 to 0.8 kcal mol⁻¹ for pharmaceutically relevant molecules containing multiple C(sp²)–halogen bonds and from 2.7 to 1.2 kcal mol⁻¹ for polyhaloalkyl compounds with multiple C(sp³)–halogen bonds. Our updated and expanded model (ALFABET) achieves a mean absolute error of 0.6 kcal mol⁻¹ for both enthalpies and free energies compared to the quantum chemical ground truth. The graph-based representations used here outperform traditional cheminformatics features such as radial fingerprints, and including more expensive QM-derived parameters, such as optimized bond lengths, gives no discernible improvement in accuracy.
Finally, we illustrate high accuracy in external prediction tasks for large halogenated natural products, pharmaceutically relevant halogenated molecules, atmospherically important halocarbons, and polyfluoroalkyl substances related to environmental toxicity.
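As a rough illustration of the two quantities the abstract relies on, the sketch below computes a homolytic bond dissociation enthalpy for A–B → A• + B• from the enthalpies of the parent molecule and its two radical fragments, and a mean absolute error between predicted and reference values. It is not taken from ALFABET: the function names, the choice of hartree as the input unit, and the conversion factor are assumptions made for illustration only.

```python
from statistics import fmean

# Assumed conversion factor from hartree to kcal/mol (approximate).
HARTREE_TO_KCAL_MOL = 627.5095


def bond_dissociation_enthalpy(h_parent, h_radical_a, h_radical_b):
    """Return the BDE in kcal/mol for A-B -> A. + B.

    All inputs are enthalpies in hartree (hypothetical convention);
    zero-point energy is assumed to be included in each value already.
    """
    return (h_radical_a + h_radical_b - h_parent) * HARTREE_TO_KCAL_MOL


def mean_absolute_error(predicted, reference):
    """Return the mean absolute deviation between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return fmean([abs(p - r) for p, r in zip(predicted, reference)])
```

Called with a list of model predictions and the matching quantum chemical reference values, `mean_absolute_error` gives the kind of summary statistic quoted above (for example, 0.6 kcal mol⁻¹); any numbers passed in when trying it out are placeholders, not data from the study.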