Protein Ensemble Generation Through Variational Autoencoder Latent Space Sampling

Authors
Mansoor, Sanaa; Baek, Minkyung; Park, Hahnbeom; Lee, Gyu Rie; Baker, David
Issue Date
2024-04
Publisher
American Chemical Society
Citation
Journal of Chemical Theory and Computation, v.20, no.7, pp.2689 - 2695
Abstract
Mapping the ensemble of protein conformations that contribute to function and can be targeted by small molecule drugs remains an outstanding challenge. Here, we explore the use of variational autoencoders for reducing the challenge of dimensionality in the protein structure ensemble generation problem. We convert high-dimensional protein structural data into a continuous, low-dimensional representation, carry out a search in this space guided by a structure quality metric, and then use RoseTTAFold guided by the sampled structural information to generate 3D structures. We use this approach to generate ensembles for the cancer relevant protein K-Ras, train the VAE on a subset of the available K-Ras crystal structures and MD simulation snapshots, and assess the extent of sampling close to crystal structures withheld from training. We find that our latent space sampling procedure rapidly generates ensembles with high structural quality and is able to sample within 1 Å of held-out crystal structures, with a consistency higher than that of MD simulation or AlphaFold2 prediction. The sampled structures sufficiently recapitulate the cryptic pockets in the held-out K-Ras structures to allow for small molecule docking.
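Illustrative sketch (not part of the published record)
The abstract describes a search in a low-dimensional latent space guided by a structure quality metric. The toy example below illustrates only that general idea and is not the authors' implementation. The decoder, the quality score, the two-dimensional latent space, and the hill-climbing rule are all hypothetical stand-ins chosen so that the code runs with the Python standard library alone. No trained VAE or RoseTTAFold call is involved.

```python
import random


def decode(z):
    """Toy stand-in for a trained VAE decoder (assumption, not the paper's model).

    Maps a 2-D latent point to four made-up "structural features".
    """
    z1, z2 = z
    return [z1 + z2, z1 - z2, 0.5 * z1, 0.5 * z2]


def quality(features):
    """Hypothetical quality score, higher is better (always <= 0 here)."""
    return -sum(f * f for f in features)


def latent_search(n_steps=200, step_size=0.3, seed=0):
    """Stochastic hill climb in latent space, guided by the quality score.

    Returns the list of accepted (latent_point, score) pairs.
    """
    rng = random.Random(seed)
    # Start from a draw from the standard normal prior over the latent space.
    z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    best = quality(decode(z))
    accepted = [(list(z), best)]
    for _ in range(n_steps):
        # Propose a small Gaussian perturbation of the current latent point.
        cand = [zi + rng.gauss(0.0, step_size) for zi in z]
        score = quality(decode(cand))
        # Keep the proposal only if the decoded output scores at least as well.
        if score >= best:
            z, best = cand, score
            accepted.append((list(z), best))
    return accepted


ensemble = latent_search()
```

In the workflow the abstract outlines, the decoded structural information for each accepted latent point would then guide RoseTTAFold to produce 3D structures. That step is omitted here because it depends on models not included in this record.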
Keywords
PREDICTION
ISSN
1549-9618
URI
https://pubs.kist.re.kr/handle/201004/149622
DOI
10.1021/acs.jctc.3c01057
Appears in Collections:
KIST Article > 2024