The pch utility helps prepare input files for use with szmap by adding partial charges (thus the name) and radii to atoms, and separating protein chains from any ligand(s) and any waters. It reads a structure file—preferably one where the hydrogens have already been added and oriented in optimal positions—and writes out two charged molecule files, one with the protein and any metals and one with other (non-water) molecules, such as ligands (waters can be written out to a separate file, if needed). The molecules in these output files have AmberFF94 partial charges assigned to protein atoms, formal charges on the ions, and AM1BCC partial charges on heterogen atoms. Modifications to residues, such as sugars or covalently-bonded small molecules or non-standard residues, are charged separately from the rest of the protein using AM1BCC and the charges are then transferred back to the modified protein.
The OpenEye Python Cookbook contains a recipe for assigning canonical AM1BCC partial charges to ligands which are much less dependent on ligand conformation, something pch cannot currently do.
By default, pch eliminates all alternate conformations from the input except the one with the highest occupancy. This behavior can be overridden with the option -keep_alts, but szmap will not process more than one conformation, so keeping multiple alternate conformations is not appropriate for input to szmap.
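The default filtering described above can be illustrated with a minimal sketch that works on fixed-column PDB records. This is not pch's actual implementation: the function name `keep_highest_occupancy` is hypothetical, and choosing the alternate location with the largest summed occupancy per residue is an assumption made for illustration.

```python
from collections import defaultdict

_ATOM_RECORDS = ("ATOM", "HETATM")


def _residue_key(line):
    # Chain ID (column 22) plus residue number and insertion code (columns 23-27).
    return (line[21], line[22:27])


def keep_highest_occupancy(pdb_lines):
    """Keep one alternate location per residue; pass other lines through.

    Assumption: the "best" alternate location is the one whose atoms have
    the largest summed occupancy within the residue.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for line in pdb_lines:
        # Column 17 holds the alternate-location label, columns 55-60 the occupancy.
        if line.startswith(_ATOM_RECORDS) and line[16] != " ":
            totals[_residue_key(line)][line[16]] += float(line[54:60])

    best = {res: max(alts, key=alts.get) for res, alts in totals.items()}

    kept = []
    for line in pdb_lines:
        is_atom = line.startswith(_ATOM_RECORDS)
        if not is_atom or line[16] == " " or line[16] == best[_residue_key(line)]:
            kept.append(line)
    return kept
```

The sketch expects well-formed records of at least 60 columns. For real input, pch's own default behavior remains the appropriate route.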
Because many proteins require co-factors to function, pch provides a rich set of options to define precisely which residue(s) will go into the ligand file, leaving the rest to be incorporated into the protein file.
A useful protein preparation procedure before running szmap or gameplan starts with deleting any unwanted subunits, detergent, and other non-essential molecules. Next, hydrogens are added and their orientations optimized (see the Tutorial and chapter mkhetdict for more information). Then, the protein and any ions are separated from any small molecules, and partial charges and radii are added using pch:
> pch structure.pdb prot+ions.oeb.gz small-mols.oeb.gz
A warning that the formal charge is not equal to the sum of the partial charges usually indicates one or more atoms were missing from the protein structure. The degree to which this affects calculation results depends on the distance of the group with missing atoms from the region where the szmap calculations are performed. Missing atoms more than 10 Å from the binding site usually do not alter the results significantly. pch will display a list of any non-hydrogen atoms missing from standard protein residues, which can help you to check the location of missing atoms with respect to the binding site.
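The 10 Å rule of thumb above can be applied with a short distance check. The following is a minimal sketch, assuming you have already collected coordinates for the residues pch reports as incomplete and for the binding-site atoms (for example, the ligand's heavy atoms). The function name `residues_near_site` and the input layout are hypothetical.

```python
import math


def residues_near_site(incomplete_residues, site_atoms, cutoff=10.0):
    """Return (label, distance) for incomplete residues within cutoff of the site.

    incomplete_residues: dict mapping a residue label to a list of (x, y, z)
        coordinates of the atoms that are present in that residue.
    site_atoms: list of (x, y, z) coordinates defining the binding site.
    cutoff: distance in angstroms; 10.0 follows the guideline in the text.
    """
    near = []
    for label, coords in incomplete_residues.items():
        # Smallest atom-to-atom distance between the residue and the site.
        closest = min(math.dist(a, s) for a in coords for s in site_atoms)
        if closest <= cutoff:
            near.append((label, closest))
    # Closest residues first, since they are the most likely to matter.
    return sorted(near, key=lambda item: item[1])
```

Residues returned by this check are the ones worth repairing before running szmap. Those beyond the cutoff usually do not alter the results significantly.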
If the structure contains small molecules other than the ligand, such as ions, detergent, or co-factors, use -lig_res or one of the other selection options to ensure that only the specified residue is placed in the second (ligand) output file and all other non-water molecules are placed in the first (protein) file.
> pch -lig_res cam 2cppH.pdb 2cpp_prot.oeb.gz 2cpp_lig.oeb.gz
To identify a peptide or nucleic acid as the ligand, select it by chain or by a range of residue numbers.
If you need to modify the charge pch assigns (for example, to change iron II to iron III), either modify the charge in VIDA using the builder, or save the output in .mol2 or (DelPhi) .pdb format rather than .oeb format so that it can easily be edited by hand. DelPhi format is a non-standard version of the PDB format in which the radius and partial charge are stored in the occupancy and B-factor fields, respectively.
> pch -lig_res cam 2cppH.pdb 2cpp_prot.pdb 2cpp_lig.oeb.gz
> edit 2cpp_prot.pdb
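As an alternative to editing the file by hand, the DelPhi-style fields can be read and rewritten with a few lines of Python. This is a sketch only. It assumes the radius and charge occupy the standard PDB occupancy (columns 55-60) and B-factor (columns 61-66) positions and are written with two decimals, and the exact precision pch writes may differ. The function names are hypothetical.

```python
def read_delphi_atom(line):
    """Extract name, residue, radius, and partial charge from one atom record."""
    return {
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "radius": float(line[54:60]),   # stored in the occupancy field
        "charge": float(line[60:66]),   # stored in the B-factor field
    }


def set_delphi_charge(line, charge):
    """Return a copy of the record with the charge field replaced.

    Assumption: a six-character field with two decimals, as in standard PDB.
    """
    body = line.rstrip("\n").ljust(66)
    return f"{body[:60]}{charge:6.2f}{body[66:]}"
```

To change an iron from II to III, for instance, you would locate the record whose atom name is FE and pass it through `set_delphi_charge(line, 3.0)`, leaving every other line untouched.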
If you store the protein and ligand you want to use as input to pch in separate files, run pch twice and discard the extra (empty) files. Just make sure the protein and the ligand both have hydrogens, and that these hydrogens are in positions that make all the appropriate interactions with the corresponding protein or ligand.
> pch proteinH.pdb prot.oeb.gz /tmp/junk.oeb.gz
> pch ligand.sdf /tmp/junk.oeb.gz lig.oeb.gz
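If this two-pass procedure is run often, it can be scripted. The sketch below only assembles the same positional arguments used in the commands above (input, protein output, ligand output) and hands them to `subprocess`. It assumes `pch` is on your PATH, and the helper names `build_pch_commands` and `run_pch_twice` are hypothetical.

```python
import subprocess


def build_pch_commands(protein_in, ligand_in, protein_out, ligand_out,
                       junk="/tmp/junk.oeb.gz"):
    """Return the two pch invocations as argument lists.

    The first run keeps the protein output and discards the ligand output;
    the second run does the opposite.
    """
    return [
        ["pch", protein_in, protein_out, junk],
        ["pch", ligand_in, junk, ligand_out],
    ]


def run_pch_twice(protein_in, ligand_in, protein_out, ligand_out,
                  junk="/tmp/junk.oeb.gz"):
    """Run both commands in order, stopping if either one fails."""
    for command in build_pch_commands(protein_in, ligand_in,
                                      protein_out, ligand_out, junk):
        subprocess.run(command, check=True)
```

Calling `run_pch_twice("proteinH.pdb", "ligand.sdf", "prot.oeb.gz", "lig.oeb.gz")` reproduces the two commands shown above.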
Because small molecules and modifications to amino acids are charged with AM1BCC and szmap uses MMFF van der Waals terms, there are restrictions on the elements that may be used with szmap and gameplan.
AM1BCC supports the following elements:
H, C, N, O, F, P, S, Cl, Br, I, and Si
MMFF supports the following elements:
H, B, C, N, O, F, P, S, Cl, Br, I, Si, Se, Li, Na, K, Ca, Fe, Zn, Cu, and Mg
pch will generate a warning if other elements are found in the input. If the offending atom happens to be a metal used to determine the crystallographic phase and it is not related to the binding site, you can usually edit it out of the input and rerun pch. If, on the other hand, you need to replace this atom with a “reasonable facsimile”, you can specify -fix_elements to request that pch attempt such a replacement, for example replacing with . Note that the oxidation state may be a problem regardless of the substitution: if molybdenum IV is converted to iron II, you can’t just change this to iron IV because MMFF only contains van der Waals parameters for iron II and iron III. Future versions of SZMAP will provide van der Waals parameters across a wider range of elements and oxidation states.
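The element restrictions above are easy to check before running pch. The following sketch encodes the two lists from this section and reports anything outside them. The function name `unsupported_elements` is hypothetical, and treating AM1BCC-charged atoms as needing to appear in both lists is an assumption drawn from the statement that szmap uses MMFF van der Waals terms.

```python
# Element lists copied from the text above.
AM1BCC_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I", "Si"}
MMFF_ELEMENTS = AM1BCC_ELEMENTS | {
    "B", "Se", "Li", "Na", "K", "Ca", "Fe", "Zn", "Cu", "Mg",
}


def unsupported_elements(elements, am1bcc_charged):
    """Return the sorted element symbols that fall outside the supported set.

    am1bcc_charged: True for small molecules and residue modifications
        (assumed to need both AM1BCC and MMFF support), False for atoms
        that only need MMFF van der Waals terms, such as metal ions.
    """
    allowed = AM1BCC_ELEMENTS & MMFF_ELEMENTS if am1bcc_charged else MMFF_ELEMENTS
    return sorted(set(elements) - allowed)
```

A non-empty result identifies the atoms to edit out of the input or to hand to -fix_elements.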
A description of the basic command line interface can be obtained by executing pch with no arguments.
Doing so generates output similar to the following:
[ASCII-art PCH logo]
Copyright (c) 2010-2015 OpenEye Scientific Software, Inc.
Version: 1.2.1
Release: 20150305
OEChem version: 1.9.2 20150305
Platform: redhat-RHEL5-g++4.1-x64

Licensed for the exclusive use of Company Name.
Licensed for use only in Site.
License expires on August 15, 2015.

No arguments specified on the command line

pch : add charges and radii and split into protein+ion and ligand files

Required parameters:
-input_mol : Input molecule file. Should have coordinates for hydrogens as well as heavy atoms.

For more help type:
pch --help
[keyless parameter 2; default pch_prot.oeb.gz]
Name for output protein file. This file is either a DelPhi format PDB file (radii and charges stored in the occupancy and B-factor fields) or another format that retains the partial charges: OEBinary or MOL2. Saving in an OEBinary format allows the charges and radii to be easily inspected in VIDA. The extension modifier .gz means the file is gzipped, which is usually more compact than the uncompressed format but otherwise identical.
PDB (DelPhi): .pdb .ent .pdb.gz .ent.gz
[keyless parameter 3; default pch_lig.oeb.gz]
Name for output ligand file. This file is either a DelPhi format PDB file (radii and charges stored in the occupancy and B-factor fields) or another format that retains the partial charges: OEBinary or MOL2. Saving in an OEBinary format allows the charges and radii to be easily inspected in VIDA. The extension modifier .gz means the file is gzipped, which is usually more compact than the uncompressed format but otherwise identical.
PDB (DelPhi): .pdb .ent .pdb.gz .ent.gz
Generate AM1BCC partial charges where topologically equivalent atoms are not forced to have the same charge. For example, if this option is false, the hydrogens on a methyl group will all have the same partial charge. If this option is true, the charges will generally differ for each of these “equivalent” hydrogens.
Since szmap does not vary the conformation of the ligand, non-symmetrized charges can be used to describe a specific conformation of each atom. But for any workflow down the road where the conformation may change, using symmetrized charges is probably more appropriate.
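To make the distinction concrete, the sketch below symmetrizes a set of charges by averaging over groups of topologically equivalent atoms. Averaging is an assumption made for illustration and may not match how AM1BCC symmetrization is actually carried out. The function name `symmetrize_charges`, the atom labels, and the equivalence groups are all hypothetical inputs that you would have to supply.

```python
def symmetrize_charges(charges, equivalence_groups):
    """Give every atom in each equivalence group the group's mean charge.

    charges: dict mapping an atom label to its partial charge.
    equivalence_groups: iterable of lists of atom labels that are
        topologically equivalent (for example, the three methyl hydrogens).
    Atoms not listed in any group keep their original charge.
    """
    result = dict(charges)
    for group in equivalence_groups:
        mean = sum(charges[atom] for atom in group) / len(group)
        for atom in group:
            result[atom] = mean
    return result
```

Because each group is replaced by its mean, the total charge of the molecule is preserved while the conformation-specific differences within each group are removed.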