DFT-Machine Learning Approach for Accurate Prediction of pK(a)
- Authors
- Lawler, Robin; Liu, Yao-Hao; Majaya, Nessa; Allam, Omar; Ju, Hyunchul; Kim, Jin Young; Jang, Seung Soon
- Issue Date
- 2021-10-07
- Publisher
- AMER CHEMICAL SOC
- Citation
- JOURNAL OF PHYSICAL CHEMISTRY A, v.125, no.39, pp.8712 - 8722
- Abstract
- In this study, we propose a novel method for predicting pK(a) across a diverse set of acids that combines density functional theory (DFT) with machine learning (ML). First, DFT at the B3LYP/6-31++G**/SM8 level is used to predict pK(a), yielding a mean absolute error (MAE) of 1.85 pK(a) units. The DFT-predicted pK(a) values are then used as one of 10 molecular descriptors for ML models trained on experimental data. Kernel Ridge Regression (KRR), Gaussian Process Regression, and Artificial Neural Network models are optimized using three pipelines: Pipeline 1 involves only hyperparameter optimization (HPO); Pipeline 2 involves HPO followed by relative contribution analysis (RCA) and recursive feature elimination (RFE); and Pipeline 3 involves HPO followed by RCA and RFE on an expanded set of composite features. KRR with Pipeline 3 yields the best pK(a) prediction, with an MAE of 0.60 log units. This algorithm was then used to predict the pK(a) of 37 novel acids. The two most important features were found to be the number of hydrogen atoms in the molecule and the degree of oxidation of the acid. The predicted pK(a) values are documented for future reference.
- Keywords
- ACID DISSOCIATION-CONSTANTS; DENSITY-FUNCTIONAL METHODS; SOLVATION FREE-ENERGIES; COMPLETE BASIS-SET; PHOSPHONIC ACID; PROTON CONDUCTIVITY; PROTOGENIC GROUP; NEURAL-NETWORKS; SULFONIC-ACID; VALUES
- ISSN
- 1089-5639
- URI
- https://pubs.kist.re.kr/handle/201004/116268
- DOI
- 10.1021/acs.jpca.1c05031
- Appears in Collections:
- KIST Article > 2021
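The abstract describes using a DFT-predicted pK(a) as one descriptor in a Kernel Ridge Regression model tuned by hyperparameter optimization. The following is a minimal sketch of that idea only, not the authors' code. Everything in it is an assumption made for illustration: the data are synthetic, the "DFT" error model (a made-up dependence on hydrogen count) is invented, only two descriptors are used instead of ten, a small hold-out grid search stands in for HPO, and the RCA and RFE steps are omitted.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def krr_fit(X, y, gamma, lam):
    """Closed-form kernel ridge solution: alpha = (K + lam * I)^-1 y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, gamma):
    return rbf_kernel(X_new, X_train, gamma) @ alpha


def mae(a, b):
    return float(np.mean(np.abs(a - b)))


# Synthetic stand-in data (NOT from the paper).
rng = np.random.default_rng(0)
n = 200
pka_exp = rng.uniform(-2.0, 12.0, n)        # hypothetical "experimental" pK(a)
n_h = rng.integers(1, 11, n).astype(float)  # hypothetical hydrogen count
# Invented systematic DFT error that depends on the second descriptor.
pka_dft = pka_exp + 0.3 * n_h - 1.0 + rng.normal(0.0, 0.3, n)

# Descriptors: the DFT estimate itself plus one extra molecular feature.
X = np.column_stack([pka_dft, n_h])
train, val, test = np.arange(0, 120), np.arange(120, 160), np.arange(160, 200)

# Standardize with training statistics only; center the target.
mu, sd = X[train].mean(0), X[train].std(0)
Xs = (X - mu) / sd
y_mean = pka_exp[train].mean()

# Simple grid search on a hold-out set, standing in for HPO.
best = None
for gamma in (0.01, 0.1, 1.0):
    for lam in (1e-3, 1e-2, 1e-1, 1.0):
        alpha = krr_fit(Xs[train], pka_exp[train] - y_mean, gamma, lam)
        pred = krr_predict(Xs[train], alpha, Xs[val], gamma) + y_mean
        score = mae(pred, pka_exp[val])
        if best is None or score < best[0]:
            best = (score, gamma, lam)

# Refit with the selected hyperparameters and evaluate on the test split.
_, gamma, lam = best
alpha = krr_fit(Xs[train], pka_exp[train] - y_mean, gamma, lam)
pka_ml = krr_predict(Xs[train], alpha, Xs[test], gamma) + y_mean

mae_dft = mae(pka_dft[test], pka_exp[test])
mae_ml = mae(pka_ml, pka_exp[test])
print(f"MAE, raw DFT estimate: {mae_dft:.2f}")
print(f"MAE, KRR-corrected:    {mae_ml:.2f}")
```

Because the DFT estimate enters as a feature, the regression only has to learn the residual error pattern rather than pK(a) from scratch. The numbers this sketch prints reflect the synthetic error model and say nothing about the MAE values reported in the abstract.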