Protein and Ligand Preparation

In this section, reduce (non-OpenEye software; where to obtain it is described below) is used to add and optimize explicit hydrogens. You are free to use alternative methods to perform this and other “prep” tasks.

Good protein and ligand preparation is vital before running SZMAP or GamePlan. This consists of trimming molecules to the relevant parts; adding hydrogens, partial charges, and atomic radii; and organizing them into separate protein and ligand files.

The commands below are shown in a form appropriate for Linux, Unix or Mac OS X. On Windows some of the auxiliary commands will not be available and a substitution will be necessary. For example, you may need to use winzip rather than gzip. When you install SZMAP on Windows, a pre-configured version of the DOS command prompt is constructed and can be used to run any SZMAP program or utility without any extra set-up. This window can be found under the Start menu in All Programs >> OpenEye >> SZMAP {version} >> Command Prompt.

First, examine your structure in VIDA to determine the number of subunits and where the ligand is with respect to the subunit interface. VIDA can read gzipped structure files as-is and has the File >> Open Special >> From PDB menu command to fetch structures from the Protein Data Bank directly.


File >> New Molecule >> From Split >> Selected

The structure for 4STD is a trimer with the ligand some distance from the protein/protein interface so we can edit the file to delete protein chains B and C (proteins and small molecules). To do this in VIDA, open the protein in the List window and drill down until you see the three chains. Clicking on Chain A and then right-clicking (control-click on the Mac) to bring up the pop-up menu will allow you to select chain A. Selecting the menu item File >> New Molecule >> From Split >> Selected will generate a new list item with two entries: one for chain A and one for the other chains. Selecting the split-out A chain and right-clicking will bring up a menu that will allow you to save this item to a file. Be sure to change the format to PDB as reduce will expect a .pdb file.

If you wish to delete detergents or other extraneous molecules, make sure they are not selected when you do the split operation. Alternatively, you may prefer to edit your molecule by hand as follows. If your structure file is a gzipped PDB file, unzip it (gzip is a widely available open-source command-line tool for doing this; on Windows a program like winzip may be more convenient). Edit it to delete extra subunits, detergent, or other extraneous molecules using whichever text editor you prefer. It is not necessary to remove the waters at this stage; they will be culled by Pch in a later step. It is also not necessary to prune the connection table, because any references to deleted atoms will be ignored.

> gzip -dc pdb4std.ent.gz > 4std.pdb
> edit 4std.pdb
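
If you would rather script the chain deletion than do it in a text editor, the following is a minimal sketch of one way to do so. It is not part of SZMAP: the function name keep_chain is hypothetical, and it assumes a standard fixed-column PDB file in which the chain identifier sits in column 22 of the ATOM, HETATM, ANISOU and TER records. All other records, including the connection table, are passed through unchanged.

```shell
# keep_chain CHAIN: read a PDB file on stdin and write it to stdout,
# dropping coordinate records that belong to any other chain.
# (Hypothetical helper, not an OpenEye utility.)
keep_chain() {
  awk -v chain="$1" '
    # The chain identifier is column 22 of these record types.
    /^(ATOM  |HETATM|ANISOU|TER)/ {
      if (substr($0, 22, 1) == chain) print
      next
    }
    # Header, CONECT and all other records pass through unchanged.
    { print }
  '
}
```

With this definition loaded, something like "gzip -dc pdb4std.ent.gz | keep_chain A > 4std.pdb" would unzip the file and keep only chain A in one step. Detergents or other extraneous molecules that carry the chain A identifier are kept, so they may still need to be deleted by hand.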

Since SZMAP and GamePlan require explicit hydrogen atoms on the molecules and most PDB structures do not include the hydrogen atoms, the next steps produce a protein structure with all the hydrogen atoms explicitly represented. There are many ways to do this. Here, we will use reduce, a free program to add and optimize hydrogens that is available from the Richardson laboratory at Duke University.

If your atom names contain duplicates (for example, if all hydrogens are named " H "), you need to convert them to unique atom names; see chapter FixDupAtomNames for instructions. In our example this is not required.
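
To find out whether a file has duplicate atom names before deciding if that conversion is needed, a quick check along the following lines may help. This is only a sketch: the function name dup_atom_names is hypothetical, and it assumes fixed-column PDB records with the atom name in columns 13-16, the alternate-location code in column 17, and the residue name, chain, number and insertion code in columns 18-27. Atoms that differ only in their alternate-location code are not reported.

```shell
# dup_atom_names: read a PDB file on stdin and print one line for each
# atom name that occurs more than once within the same residue.
# (Hypothetical helper, not an OpenEye utility.)
dup_atom_names() {
  awk '
    /^(ATOM  |HETATM)/ {
      # Key = residue identity (cols 18-27) + atom name and altloc (cols 13-17).
      key = substr($0, 18, 10) "|" substr($0, 13, 5)
      # Report each duplicated name once, on its second occurrence.
      if (++seen[key] == 2)
        print "duplicate atom name \"" substr($0, 13, 4) "\" in residue " substr($0, 18, 10)
    }
  '
}
```

Running "dup_atom_names < 4std.pdb" prints nothing when every atom name is unique within its residue. A file containing several models would be reported as full of duplicates, so this check only suits single-model files.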

Next, make a Protein Data Bank heterogen dictionary, a format reduce can use to work out how to protonate ligands.

> mkhetdict 4std.pdb 4std_hets.txt

If you need to use an ionization or tautomerization state other than the one MKHetDict assigns, you can edit the heterogen dictionary to add or delete hydrogens as required.

The next step is to add hydrogens and optimize OH, SH, His, Asn/Gln, etc. in the context of the complex. This is currently done using non-OpenEye software such as reduce [Word-1999] (free to license and available for download at kinemage.biochem.duke.edu). Windows users should use the most recent installer and, if using it with cygwin, see this discussion of auto restart.


Reduce is not produced or supported by OpenEye. Information is provided here for your convenience. You are free to use programs other than those described here as long as they produce similar results. Future versions of SZMAP will not require third-party software for this function.

Reduce requires both the input and the output structure files to be in .pdb format.

> reduce -db 4std_hets.txt -rotexist -build 4std.pdb >4stdH.pdb 2>4stdH_reduce.log
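
As a sanity check that hydrogens were actually added, you can compare the number of hydrogen atoms before and after this step. The sketch below is an illustration only: the function name count_hydrogens is hypothetical, it reads the element symbol from columns 77-78 when present, and otherwise falls back on the first letter of the atom name. That fallback is a heuristic and would miscount, for example, a mercury atom named HG.

```shell
# count_hydrogens: read a PDB file on stdin and print the number of
# ATOM/HETATM records that look like hydrogen atoms.
# (Hypothetical helper, not an OpenEye utility.)
count_hydrogens() {
  awk '
    /^(ATOM  |HETATM)/ {
      # Preferred source: element symbol in columns 77-78.
      elem = substr($0, 77, 2); gsub(/ /, "", elem)
      if (elem == "") {
        # Heuristic fallback: first letter of the atom name (cols 13-16),
        # ignoring blanks and leading digits such as in "1HD1".
        name = substr($0, 13, 4); gsub(/[ 0-9]/, "", name)
        elem = substr(name, 1, 1)
      }
      if (toupper(elem) == "H") n++
    }
    END { print n + 0 }
  '
}
```

Comparing "count_hydrogens < 4std.pdb" with "count_hydrogens < 4stdH.pdb" should show a much larger count for the second file. If the two counts are the same, inspect 4stdH_reduce.log.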

Next, split the structure into protein and ions in one file and the ligand in another and add partial charges and radii to all the atoms. If the structure contains multiple small molecules (ligand + cofactor, salt, etc.), pch -ligand_res LIG will ensure that only the residue LIG is put in the ligand file and the cofactors, etc. are added to the protein file (see chapter Pch for a complete list of options for distinguishing the ligand from other molecules).


The Pch utility is supplied for your convenience. If you have another mechanism for assigning partial charges to your molecules, SZMAP will accept the results.

Pch will assign partial charges to amino acids that contain covalent modifications, using AmberFF94 charges for any standard amino acid and AM1BCC for any other group. It will also eliminate alternate conformations from a structure, leaving only the conformation with the highest occupancy. Occasionally, X-ray structures contain mistakes where alternate location codes are scrambled, leading to incorrect bonds being assigned. These bonds are often much longer or much shorter than they should be. Errors such as these in your input may have to be resolved before it is suitable for use with Pch.

> pch -ligand_res BFS 4stdH.pdb 4std_prot.oeb.gz 4std_lig.oeb.gz

Warnings of "No Amber charges" and "Formal charge(#) is not equal to sum of partial charges(#)" indicate missing atoms and should be accompanied by "Missing atom" warnings listing the missing protein (non-hydrogen) atoms. Similarly, warnings of "bad MMFF types" in SZMAP can usually be traced to missing atoms or very poor geometry that leads to inappropriate bonds being assigned. Missing protein atoms or bad bonds may not actually be a problem if they are sufficiently distant from the binding site.

The charged molecules are usually written to OE binary files, but if you need to modify the charges (for example, to change iron II to iron III), the molecules can be written to editable DelPhi format .pdb files, which SZMAP can read. In these files the radii and partial charges are stored in the occupancy and B-factor fields.


If SZMAP is given an input .pdb file which still contains the usual occupancy and B-factor, rather than radii and partial charges, any energies it generates will be meaningless.
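
One rough way to catch this mistake before running SZMAP is to look for negative numbers in the B-factor field (columns 61-66). Crystallographic B-factors are normally non-negative, whereas a set of partial charges for a protein or ligand almost always includes negative values. The sketch below rests on that assumption and is a heuristic, not a validation of the file; the function name has_negative_bfactor is hypothetical.

```shell
# has_negative_bfactor: exit status 0 if any ATOM/HETATM record on stdin
# has a negative number in columns 61-66, which suggests that the field
# holds partial charges rather than B-factors. Exit status 1 otherwise.
# (Hypothetical helper, not an OpenEye utility.)
has_negative_bfactor() {
  awk '
    /^(ATOM  |HETATM)/ && (substr($0, 61, 6) + 0) < 0 { found = 1; exit }
    END { exit(found ? 0 : 1) }
  '
}
```

For a file you intend to pass to SZMAP, a test such as "has_negative_bfactor < input.pdb || echo 'input.pdb may still contain occupancy and B-factor values'" gives an early warning. Here input.pdb is a placeholder for your own file name.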