1、一个使用gromacs进行蛋白质模拟的入门教程一个使用gromacs进行蛋白质模拟的入门教程GROMACS TutorialLysozyme in WaterJustin LemkulDepartment of Biochemistry, Virginia TechThis example will guide a new user through the process of setting up a simulation system containing a protein (lysozyme) in a box of water, with ions. Each step will c
2、ontain an explanation of input and output, using typical settings for general use.This tutorial assumes you are using a GROMACS version in the series.GROMACS TutorialStep One: Prepare the TopologyWe must download the protein structure file we will be working with. For this tutorial, we will utilize
3、hen egg white lysozyme (PDB code 1AKI). Go to the RCSB website and download the PDB text for the crystal structure.Once you have downloaded the structure, you can visualize the structure using a viewing program such as VMD, Chimera, PyMOL, etc. Once youve had a look at the molecule, you are going to
4、 want to strip out the crystal waters. Use a plain text editor like vi, emacs (Linux/Mac), or Notepad (Windows). Do not use word processing software! Delete the lines corresponding to these molecules (residue HOH in the PDB file). Note that such a procedure is not universally appropriate ., the case
5、 of a bound active site water molecule). For our intentions here, we do not need crystal water.Always check your .pdb file for entries listed under the comment MISSING, as these entries indicate either atoms or whole residues that are not present in the crystal structure. Terminal regions may be abs
6、ent, and may not present a problem for dynamics. Incomplete internal sequences or any amino acid residues that have missing atoms will cause pdb2gmx to fail. These missing atoms/residues must be modeled in using other software packages. Also note that pdb2gmx is not magic. It cannot generate topolog
7、ies for arbitrary molecules, just the residues defined by the force field (in the *.rtp files - generally proteins, nucleic acids, and a very finite amount of cofactors, like NAD(H) and ATP).Now that the crystal waters are gone and we have verified that all the necessary atoms are present, the PDB f
8、ile should contain only protein atoms, and is ready to be input into the first GROMACS tool, pdb2gmx. The purpose of pdb2gmx is to generate three files:1.The topology for the molecule.2.A position restraint file.3.A post-processed structure file.The topology by default) contains all the information
9、necessary to define the molecule within a simulation. This information includes nonbonded parameters (atom types and charges) as well as bonded parameters (bonds, angles, and dihedrals). We will take a more detailed look at the topology once it has been generated.Execute pdb2gmx by issuing the follo
10、wing command:pdb2gmx -f -o -water spceThe structure will be processed by pdb2gmx, and you will be prompted to choose a force field:Select the Force Field:From /usr/local/gromacs/share/gromacs/top: 1: AMBER03 force field (Duan et al., J. Comp. Chem. 24, 1999-2012, 2003) 2: AMBER94 force field (Cornel
11、l et al., JACS 117, 5179-5197, 1995) 3: AMBER96 force field (Kollman et al., Acc. Chem. Res. 29, 461-469, 1996) 4: AMBER99 force field (Wang et al., J. Comp. Chem. 21, 1049-1074, 2000) 5: AMBER99SB force field (Hornak et al., Proteins 65, 712-725, 2006) 6: AMBER99SB-ILDN force field (Lindorff-Larsen
12、 et al., Proteins 78, 1950-58, 2010) 7: AMBERGS force field (Garcia & Sanbonmatsu, PNAS 99, 2782-2787, 2002) 8: CHARMM27 all-atom force field (with CMAP) - version 9: GROMOS96 43a1 force field10: GROMOS96 43a2 force field (improved alkane dihedrals)11: GROMOS96 45a3 force field (Schuler JCC 2001 22
13、1205)12: GROMOS96 53a5 force field (JCC 2004 vol 25 pag 1656)13: GROMOS96 53a6 force field (JCC 2004 vol 25 pag 1656)14: OPLS-AA/L all-atom force field (2001 aminoacid dihedrals)15: DEPRECATED Encad all-atom force field, using full solvent charges16: DEPRECATED Encad all-atom force field, using scal
14、ed-down vacuum charges17: DEPRECATED Gromacs force field (see manual)18: DEPRECATED Gromacs force field with hydrogens for NMRThe force field will contain the information that will be written to the topology. This is a very important choice! You should always read thoroughly about each force field a
15、nd decide which is most applicable to your situation. For this tutorial, we will use the all-atom OPLS force field, so type 14 at the command prompt, followed by Enter.There are many other options that can be passed to pdb2gmx. Some are listed here:-ignh: Ignore H atoms in the PDB file; especially u
16、seful for NMR structures. Otherwise, if H atoms are present, they must be in the correct order and named exactly how GROMACS expects them to be.-ter: Interactively assign charge states for N- and C-termini.-inter: Interactively assign charge states for Glu, Asp, Lys, Arg, and His; assign disulfides
17、to Cys.You have now generated three new files: , , and . is a GROMACS-formatted structure file that contains all the atoms defined within the force field ., H atoms have been added to the amino acids in the protein). The file is the system topology (more on this in a minute). The file contains infor
18、mation used to restrain the positions of heavy atoms (more on this later).Step Two: Examine the TopologyLets look at what is in the output topology . Again, using a plain text editor, inspect its contents. After several comment lines (preceded by ;), you will find the following:#include This line ca
19、lls the parameters within the OPLS-AA force field. It is at the beginning of the file, indicating that all subsequent parameters are derived from this force field. The next important line is moleculetype , below which you will find; Name nrexclProtein_A 3The name Protein_A defines the molecule name,
20、 based on the fact that the protein was labeled as chain A in the PDB file. There are 3 exclusions for bonded neighbors. More information on exclusions can be found in the GROMACS manual; a discussion of this information is beyond the scope of this tutorial.The next section defines the atoms in the
21、protein. The information is presented as columns: atoms ; nr type resnr residue atom cgnr charge mass typeB chargeB massB; residue 1 LYS rtp LYSH q + 1 opls_287 1 LYS N 1 ; qtot 2 opls_290 1 LYS H1 1 ; qtot 3 opls_290 1 LYS H2 1 ; qtot 4 opls_290 1 LYS H3 1 ; qtot 5 opls_293B 1 LYS CA 1 ; qtot 6 opl
22、s_140 1 LYS HA 1 ; qtot 1The interpretation of this information is as follows: nr: Atom number type: Atom type resnr: Amino acid residue number residue: The amino acid residue nameNote that this residue was LYS in the PDB file; the use of .rtp entry LYSH indicates that the residue is protonated (the
23、 predominant state at neutral pH). atom: Atom name cgnr: Charge group numberCharge groups define units of integer charge; they aid in speeding up calculations charge: Self-explanatoryThe qtot descriptor keeps a running count of the total charge on the molecule mass: Also self-explanatory typeB, char
24、geB, massB: Used for free energy perturbation (not discussed here)Subsequent sections include bonds , pairs , angles ,and dihedrals . Some of these sections are self-explanatory (bonds, angles, and dihedrals). The parameters and function types associated with these sections are elaborated on in Chap
25、ter 5 of the GROMACS manual. Special 1-4 interactions are included under pairs (section of the GROMACS manual).The remainder of the file involves defining a few other useful/necessary topologies, starting with position restraints. The file was generated by pdb2gmx; it defines a force constant used t
26、o keep atoms in place during equilibration (more on this later).; Include Position restraint file#ifdef POSRES#include #endifThis ends the Protein_A moleculetype definition. The remainder of the topology file is dedicated to defining other molecules and providing system-level descriptions. The next
27、moleculetype (by default) is the solvent, in this case SPC/E water. Other typical choices for water include SPC, TIP3P, and TIP4P. We chose this by passing -water spce to pdb2gmx. For an excellent summary of the many different water models, click?here, but be aware that not all of these models are p
28、resent within GROMACS.; Include water topology#include #ifdef POSRES_WATER; Position restraint for each water oxygen position_restraints ; i funct fcx fcy fcz 1 1 1000 1000 1000#endifAs you can see, water can also be position-restrained, using a force constant (kpr) of 1000 kJ mol-1nm-2.Ion paramete
29、rs are included next:; Include generic topology for ions#include Finally come system-level definitions. The system directive gives the name of the system that will be written to output files during the simulation. The molecules directive lists all of the molecules in the system. system ; NameLYSOZYM
30、E molecules ; Compound #molsProtein_A 1A few key notes about the molecules directive:1.The order of the listed molecules mustexactlymatch the order of the molecules in the coordinate (in this case, .gro) file.2.The names listed must match the moleculetype name for each species, not residue names or
31、anything else.3.If you fail to satisfy these concrete requirements at any time, you will get fatal errors from grompp (discussed later) about mismatched names, molecules not being found, or a number of others.Now that we have examined the contents of a topology file, we can continue building our sys
32、tem.Step Three: Defining the Unit Cell & Adding SolventNow that you are familiar with the contents of the GROMACS topology, it is time to continue building our system. In this example, we are going to be simulating a simple aqueous system. It is possible to simulate proteins and other molecules in different solvents, provided that good parameters are available for all species involve
copyright@ 2008-2023 冰点文库 网站版权所有
经营许可证编号:鄂ICP备19020893号-2