为了正常的体验网站,请在浏览器设置里面开启Javascript功能!
首页 > CHARMM General Force Field (CGenFF)

CHARMM General Force Field (CGenFF)

2012-06-07 42页 pdf 2MB 175阅读

用户头像

is_664412

暂无简介

举报
CHARMM General Force Field (CGenFF) CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov, an...
CHARMM General Force Field (CGenFF)
CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong, J. Shim, E. Darian, O. Guvench, P. Lopes, I. Vorobyov, and A. D. MacKerell Jr.* Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Baltimore, Maryland 21201 Abstract The widely used CHARMM additive all-atom force field includes parameters for proteins, nucleic acids, lipids and carbohydrates. In the present paper an extension of the CHARMM force field to drug-like molecules is presented. The resulting CHARMM General Force Field (CGenFF) covers a wide range of chemical groups present in biomolecules and drug-like molecules, including a large number of heterocyclic scaffolds. The parametrization philosophy behind the force field focuses on quality at the expense of transferability, with the implementation concentrating on an extensible force field. Statistics related to the quality of the parametrization with a focus on experimental validation are presented. Additionally, the parametrization procedure, described fully in the present paper in the context of the model systems, pyrrolidine, and 3-phenoxymethylpyrrolidine will allow users to readily extend the force field to chemical groups that are not explicitly covered in the force field as well as add functional groups to and link together molecules already available in the force field. CGenFF thus makes it possible to perform “all-CHARMM” simulations on drug-target interactions thereby extending the utility of CHARMM force fields to medicinally relevant systems. Introduction Background Computational biochemistry and biophysics is an ever growing field that is being applied to a wide range of heterogeneous systems of increasing size and complexity. Recent examples include simulations of the nucleosome,1 ion channels2 and the ribosome.3 While greater computational resources, including massively parallel architectures, and efficient codes such as NAMD4 and Desmond5 have made important contributions to these advances, the quality of the force fields that act as the framework of computational biochemistry and biophysics have improved to the point that stable simulations of these systems are possible. Towards this goal the CHARMM additive, all-atom force field6 has made an important contribution. Apart from proteins,7 it supports nucleic acids8–10 and lipids,11,12 and has limited support for carbohydrates,13–15 with a more complete carbohydrate extension in preparation,16 allowing simulations on all commonly encountered motifs in biological systems. However, its coverage of the wide range of chemical space required for the field of computational medicinal chemistry is limited. *To whom correspondence should be addressed: alex@outerbanks.umaryland.edu. Supplementary material A listing of target data and corresponding CGenFF values is available as supplementary material. Additionally, the most recent CGenFF topology and parameter files are available at http://mackerell.umaryland.edu/. NIH Public Access Author Manuscript J Comput Chem. Author manuscript; available in PMC 2011 March 1. Published in final edited form as: J Comput Chem. 2010 March ; 31(4): 671–690. doi:10.1002/jcc.21367. N IH -PA Author M anuscript N IH -PA Author M anuscript N IH -PA Author M anuscript To date, a number of force fields with wide coverage of drug-like molecules are available. Here, we will limit our discussion to force fields that were parametrized to reproduce condensed phase properties. Indeed, as classical additive force fields do not account for polarizability, a given additive force field will only perform well in a given dielectric medium. As a special case, Allinger’s most recent MM4 force field accurately predicts gas-phase conformational energetics of organic molecules and includes terms that account for polarizability in an approximate way.17 Nevertheless, this force field has not been evaluated in a condensed phase, high-dielectric medium, and may be assumed to be unsuitable for condensed phase studies of, for example, drug-protein interactions. The Merck Molecular Force Field (MMFF94), on the other hand, was developed with the explicit goal of performing MD simulations on pharmaceutically relevant systems in the condensed phase. Indeed, it was originally conceived as “a combined organic/protein force field that is equally applicable to proteins and other systems of biological significance”.18 One drawback of this approach is that the general nature of the force field inherently compromises its accuracy in representing classes of molecules in a well-defined chemical space, such as proteins. To overcome this problem, efforts were started to create general force fields that are meant to be used with existing, highly optimized and tested biomolecular force fields. As a result of this effort, Amber19,20 now includes the General Amber Force Field (GAFF)21 and the Antechamber toolkit,22 which allow the user to generate an Amber force field model for an arbitrary input molecule. Another example is OPLS-AA, whose optimizations emphasized condensed phase properties of small molecules and has been extended to cover a diverse set of small molecule model compounds.23,24 However, none of these force fields is expected to be applicable in combination with the CHARMM force field, as interactions between molecules are dominated by a delicate balance of parameters. Indeed, while it is tempting to combine a biomolecular force field (like CHARMM) for a system’s biological part with an organic force field (such as MMFF94) for its drug-like part, it is unlikely that this will yield properly balanced intermolecular interactions, because the nonbonded parameters are developed using different strategies for different force fields.25 To cure this problem, the CHARMM General Force Field (CGenFF) was created. CGenFF is an organic force field explicitly aimed at simulating drug-like molecules in a biological environment represented by the CHARMM additive biomolecular force fields. Consequently, CGenFF uses the same class I potential energy function as the other CHARMM force fields,6 and more generally, all the conventions and recommendations for usage of the biomolecular CHARMM force fields apply to CGenFF as well. The form of the CHARMM potential energy function used to calculate the energy, V(r), were r represents the Cartesian coordinates of the system, is shown in equation 1. (1) The intramolecular portion of the potential energy function includes terms for the bonds, valence angles, torsion or dihedral angles, improper dihedral angles and a Urey-Bradley 1,3- term, where bo, θo, ψo and r1,3,o are the bond, angle, improper and Urey-Bradley equilibrium terms, respectively, n and δ are the dihedral multiplicity and phase and the K’s are the respective Vanommeslaeghe et al. Page 2 J Comput Chem. Author manuscript; available in PMC 2011 March 1. N IH -PA Author M anuscript N IH -PA Author M anuscript N IH -PA Author M anuscript force constants. The intermolecular terms include electrostatic and van der Waals (vdW) interactions, where qi and qj is the partial atomic charge of atom i and j, respectively, εij is the well depth, Rmin,ij is the radius in the Lennard-Jones (LJ) 6–12 term used to treat the vdW interactions, and rij is the distance between i and j. In addition, the energy function in equation 1 has been extended to include a 2D dihedral energy correction map, referred to as CMAP, which has been used to improve the treatment of the conformational properties of the φ, ψ terms in the peptide backbone,26,27 though it may be applied to other systems. More details of the CHARMM potential energy function may be obtained from reference 25. Parametrization Philosophy The main challenge in creating a general force field is to cover enough of the vast chemical space occupied by drug-like molecules to make the model of utility. To attain this, a systematic optimization protocol is required that is 1) of a level of accuracy appropriate for the proposed application of the force field, 2) fast enough to parametrize large numbers of model compounds and 3) simple enough to enable CHARMM users to easily extend the force field to chemical groups of their interest. For these reasons, the standard CHARMM optimization procedure for biomolecular force fields has been simplified for the purpose of creating CGenFF, with a stronger emphasis on quantum mechanical (QM) calculations than with the biomolecular CHARMM force fields. Nevertheless, those simplifications are designed to maintain the consistency of the force field required for computational studies of heterogeneous systems. Indeed, improvements in computer power and QM methodology allow for QM calculations of a sufficiently high level of theory to attain the required accuracy to be performed in a timely fashion. In order to build a successful force field for drug-like molecules, it is essential to select appropriate model compounds. For CGenFF, two classes of model compounds were targeted. The first class comprises a wide range of heterocycles. As this class of compounds acts as the scaffold for the majority of pharmaceuticals, a comprehensive set of heterocycles would act as building blocks for a diverse range of compounds. Accordingly, emphasis was placed on accurate optimization of the intermolecular parameters in these molecules as well as reproduction of target geometries, vibrational spectra and conformational properties. The second class consists of simple functional groups. A wide variety of chemical groups occur in drug-like compounds, fulfilling roles that range from linking different parts of the molecule to being substituents on aromatic and heterocyclic groups, often in orientations that involve direct interactions with the biomacromolecular binding partner. Accordingly, a large number of functional groups were explicitly parametrized when building CGenFF. Given the availability of a wide range of heterocycles and functional groups, it is anticipated that eventually the majority of effort to extend the force field will involve linking those groups together to create the molecule of interest. This procedure requires the selection of the appropriate fragments from the palette of molecules in CGenFF, applying standard approaches for covalently linking those rings and functional groups, followed by evaluation of the model. The force field includes a wide range of default internal parameters for many of the links, such that once the user creates the topology for their molecule, they will generally be able to perform molecular mechanics calculations. However, it is strongly suggested that the user perform a series of well-defined QM calculations to test the conformational properties of the linkers between the rings comprising their molecule, compare the molecular mechanical (MM) and QM conformational properties and perform optimization of the appropriate (typically dihedral) parameters as described below. Accordingly, this manuscript focuses on presentation of the parameter optimization methodology, a detailed description of the methods required to extend the force field, and validation of the obtained model for ring systems and functional groups. Efforts to automate many of these tasks are in progress and will be presented in future works. Vanommeslaeghe et al. Page 3 J Comput Chem. Author manuscript; available in PMC 2011 March 1. N IH -PA Author M anuscript N IH -PA Author M anuscript N IH -PA Author M anuscript Methodology Principles Class I additive force fields (see equation 1), which do not explicitly treat electronic polarization, have been designed for use in polar environments typically found in proteins and in solution. To achieve this, the use of experimental target data, supplemented by QM data, was strongly emphasized during optimization of the nonbonded parameters in the biomolecular CHARMM force fields, in order to ensure physical behavior in the bulk phase. However, reproducing experimental data requires molecular dynamics (MD) simulations, which have to be set up carefully and repeated multiple times in the course of the parametrization, making the usage of experimental target data non-trivial and time-consuming. In addition, for many functional groups that may occur in drug-like molecules experimental data may not be available. Due to this lack of data, and since one of the main goals of CGenFF is easy and fast extensibility, a slightly different philosophy was adapted, with more emphasis on QM results as target data for parameter optimization. This is possible due to the wide range of functionalities already available whose parameters were optimized based largely on experimental data, along with the establishment of empirical scaling factors that can be applied to QM data in order to make them relevant for the bulk phase. The only cases where experimental data would be required are situations where novel atom types are present for which LJ parameters are not already available in CGenFF. These cases would require optimization of the LJ parameters, supplemented with Hartree-Fock (HF) model compound-water minimum interaction energies and distances (see step 2.a under “Generation of target data for parameter validation and optimization” and step 1 under “Parametrization procedure”), based on the reproduction of bulk phase properties, typically pure solvent molecular volumes and heats of vaporization or crystal lattice parameters and heats of sublimation. Descriptions of the optimization protocol have been published previously.7,9,25 However, it should be noted that CGenFF has been designed to cover the majority of atom types in pharmaceutical compounds, such that optimization of LJ parameters is typically not required. The remainder of this section includes 1) the procedure to add new model compounds and chemical groups to the force field, 2) the procedure for generating the QM target data, and 3) the procedure for application of the QM information to parametrize new molecules. To put these procedures in better context, example systems including pyrollidine, the addition of substituents to pyrollidine and the development of a linker between pyrollidine and benzene are presented. Addition of model compounds Model compounds are added in the context of a hierarchical and extensible force field. Accordingly, step one of extending CGenFF involves identifying available compounds that are chemically similar to the compound of interest. These available compounds are then used as the starting point to create an initial topology entry; information from the available compounds in CGenFF should be used to select the appropriate atom types and initial estimates of the partial atomic charges. At this stage the molecule, along with its bonded and nonbonded lists, can be generated in CHARMM,28,29 which will return a list of missing bonded parameters as an error message. These parameters may be either 1) treated with wildcards or 2) explicitly added to the force field via their inclusion in the parameter file. In case 1, energies as well as the full array of calculations available in CHARMM may readily be performed; however, it is recommended that the user perform validation calculations to determine that the parameters are yielding satisfactory geometries and conformational energies. This involves performing the appropriate QM calculations followed by the analogous MM calculations and checking Vanommeslaeghe et al. Page 4 J Comput Chem. Author manuscript; available in PMC 2011 March 1. N IH -PA Author M anuscript N IH -PA Author M anuscript N IH -PA Author M anuscript that the target properties are adequately reproduced. In case 2 the user would undertake the parameter optimization procedure on only the new parameters required for that molecule. This includes the QM calculations, followed by the analogous MM calculations and parameter optimization to minimize the difference between the MM and QM data (see below). It should be emphasized that in the context of a drug optimization project these procedures typically need to be performed only once on the parent compound as the addition of various functional groups, as required to create a congeneric series, may be performed without additional parameter optimization. Addition of functional groups to parent compounds/linking rings Following the assignment of parameters and their optimization for the new parent molecule, standard functional groups such as acids, amines, and alkyl moieties, may readily be added. The same approach is applicable both to adding functional groups to available compounds in CGenFF and to linking ring systems to create more complex chemical species. The first step of this procedure consists of removing hydrogen atoms from the heavy (ie. non-hydrogen) atoms between which the new covalent bond will be created. Next, taking advantage of the modular nature of the distribution of charges in CHARMM, the charges on those hydrogens are summed into their original parent heavy atoms, thereby preserving the integer charge of the molecule. In the case of adding functional groups to an existing molecule, typically, no additional parameters are required and the molecule is ready for study. However, if new parameters are required, which is often the case when more complex functional groups or linkers are being added, it will be necessary to assign wildcard or add explicit parameters followed by the appropriate level of validation and optimization. Generation of target data for parameter validation and optimization Central to the development of the CHARMM additive force fields was the use of a consistent optimization protocol, especially with the nonbonded parameters. To maintain this consistency, it is necessary to generate the appropriate target data. As discussed above, all the target data required for CGenFF extensions is obtained from QM calculations, with the exception of cases where explicit optimization of LJ parameters is required. Details of target data generation follow. 1. Internal parameters: Target data for the internal or bonded parameters include geometries, vibrational spectra and conformational energies. While not mutually exclusive, these may be partitioned into three aspects. a. Bond, valence angle, and Urey-Bradley equilibrium terms and the dihedral phase and multiplicity are based on geometries optimized at the MP2/6–31G (d) (MP2/6–31+G(d) in the case of anions) level. This level of theory has been shown to yield satisfactory agreement with experimental geometries for complex systems such as nucleic acid bases and sugars9,30,31 while being computationally accessible. b. Bond, valence angle, Urey-Bradley, improper and torsion force constants are based on MP2/6–31G(d) vibrational spectra. The F matrix is scaled by a factor 0.89, which corresponds to scaling all frequencies by a factor 0.943,32 and a symbolic potential energy distribution (PED) analysis is performed in the “local internal valence coordinate” space that was first proposed by Pulay et al.33 The PED analysis may be performed using the MOLVIB34 module in CHARMM.28,29 c. Force constants for torsions comprised of only non-hydrogen atoms are usually optimized based on (relaxed) one-dimensional potential energy scans performed at the MP2/6–31G(d) level. This level of theory has been Vanommeslaeghe et al. Page 5 J Comput Chem. Author manuscript; available in PMC 2011 March 1. N IH -PA Author M anuscript N IH -PA Author M anuscript N IH -PA Author M anuscript shown to yield energy surfaces consistent with distributions of dihedrals in oligonucleotides9,30,31. However, for systems in which dispersion dominates the conformation properties, like alkanes,35 higher levels of theory may be required. Single point calculations at th
/
本文档为【CHARMM General Force Field (CGenFF)】,请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑, 图片更改请在作品中右键图片并更换,文字修改请直接点击文字进行修改,也可以新增和删除文档中的内容。
[版权声明] 本站所有资料为用户分享产生,若发现您的权利被侵害,请联系客服邮件isharekefu@iask.cn,我们尽快处理。 本作品所展示的图片、画像、字体、音乐的版权可能需版权方额外授权,请谨慎使用。 网站提供的党政主题相关内容(国旗、国徽、党徽..)目的在于配合国家政策宣传,仅限个人学习分享使用,禁止用于任何广告和商用目的。

历史搜索

    清空历史搜索