The mkhetdict utility generates PDB format heterogen (small-molecule) dictionary files that are used by reduce to add hydrogens to ligand and cofactor atoms. Each non-standard residue will have an entry in the dictionary. The reduce option -db <dict.txt> ensures these residues will be protonated, as long as the “het-dict” is a .txt file and the input structure file is provided to reduce in .pdb format.

Reduce [Word-1999] is open source software from the Richardson laboratory at Duke University. It is free to license and available for download. Windows Vista or Windows 7 users should use the latest version, as some Windows versions of reduce do not process the heterogen dictionary correctly. This software is described here for your convenience only; you are free to use any software you prefer to add and optimize hydrogens.


OpenEye has not been involved in the development of reduce and does not provide support for reduce.


The usual workflow is to make a “het-dict” and then use it to generate a protonated output file. Reduce writes the protonated structure to standard output; in the example below, the ‘>’ operator redirects it to the file molH.pdb.

prompt> mkhetdict mol.pdb mol_hetdict.txt
prompt> reduce -db mol_hetdict.txt -build -rotexist mol.pdb > molH.pdb
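
The same two steps can be scripted. The following Python sketch is only an illustration: the function names build_commands and protonate are hypothetical, it assumes mkhetdict and reduce are both on the PATH, and it returns the exit codes unchecked because this page does not document them. It uses only the arguments shown in the commands above.

```python
import subprocess
from pathlib import Path


def build_commands(pdb_path, hetdict_path=None):
    """Return the (mkhetdict, reduce) argument lists for one structure.

    Hypothetical helper: it mirrors the two commands shown in the text
    and adds no other options.
    """
    pdb = Path(pdb_path)
    if hetdict_path is None:
        # Default name follows the mol.pdb -> mol_hetdict.txt pattern above.
        hetdict = pdb.with_name(pdb.stem + "_hetdict.txt")
    else:
        hetdict = Path(hetdict_path)
    # The text states that the het-dict must be a .txt file and that the
    # structure must be given to reduce in .pdb format.
    if hetdict.suffix != ".txt":
        raise ValueError("het-dict file must have the extension .txt")
    if pdb.suffix != ".pdb":
        raise ValueError("structure file must have the extension .pdb")
    mkhetdict_cmd = ["mkhetdict", str(pdb), str(hetdict)]
    reduce_cmd = ["reduce", "-db", str(hetdict), "-build", "-rotexist", str(pdb)]
    return mkhetdict_cmd, reduce_cmd


def protonate(pdb_path, out_path, hetdict_path=None):
    """Run both tools and write reduce's standard output to out_path.

    Returns the two exit codes without interpreting them.
    """
    mkhetdict_cmd, reduce_cmd = build_commands(pdb_path, hetdict_path)
    first = subprocess.run(mkhetdict_cmd)
    with open(out_path, "w") as out:
        # Equivalent of the shell redirection "> molH.pdb".
        second = subprocess.run(reduce_cmd, stdout=out)
    return first.returncode, second.returncode
```

Calling protonate("mol.pdb", "molH.pdb") would run the same commands as the example. To edit the het-dict by hand between the two steps, run the two command lists separately instead.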

If mkhetdict chooses a tautomer or ionization state other than the one you consider appropriate, the het-dict file (mol_hetdict.txt in the example above) can be modified in a text editor before running reduce.

Below is an example heterogen dictionary made by mkhetdict. Although this file contains only one entry, structures with more than one type of small molecule will contain multiple entries, one per type. Each CONECT record identifies an atom, specifies how many other atoms it is bonded to, and enumerates each bonded atom. Each atom therefore appears in multiple places: once as the parent atom and again as a bonded neighbor of each atom it is bonded to. When editing this file, take care to maintain the column positions, because blank spaces are significant. Also, when atoms are added to or removed from an entry, be sure to update the total number of atoms on the lines that begin with RESIDUE and HET.

RESIDUE   BFS     33
CONECT      C1     3 C2   C6   C
CONECT      C2     3 C1   C3   O2
CONECT      C3     3 C2   C4   H3
CONECT      C4     3 C3   C5   H4
CONECT      C5     3 C4   C6   F5
CONECT      C6     3 C1   C5   H6
CONECT      C      3 C1   N    O
CONECT      CE1    4 CE2  C1'  N    HE1
CONECT      CE2    4 CE1 HE21 HE22 HE23
CONECT      C1'    3 CE1  C2'  C6'
CONECT      C2'    3 C1'  C3'  H2'
CONECT      C3'    3 C2'  C4'  H3'
CONECT      C4'    3 C3'  C5' BR4
CONECT      C5'    3 C4'  C6'  H5'
CONECT      C6'    3 C1'  C5'  H6'
CONECT      N      3 C    CE1  HN
CONECT      O      1 C
CONECT      O2     2 C2   H2
CONECT      F5     1 C5
CONECT     BR4     1 C4'
CONECT      HN     1 N
CONECT      H3     1 C3
CONECT      H4     1 C4
CONECT      H6     1 C6
CONECT      HE1    1 CE1
CONECT     HE21    1 CE2
CONECT     HE22    1 CE2
CONECT     HE23    1 CE2
CONECT      H2'    1 C2'
CONECT      H3'    1 C3'
CONECT      H5'    1 C5'
CONECT      H6'    1 C6'
CONECT      H2     1 O2
HET    BFS             33
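
Because hand edits must keep the atom totals and the two-way bond listings consistent, a quick check before running reduce can catch slips. The Python sketch below is not part of mkhetdict or reduce. The function names and the three-atom XXX entry are made up for illustration, and the parser assumes fields are separated by whitespace, which holds for the example above but is not a general statement about the format. It does not check column positions.

```python
# Made-up three-atom entry used only to exercise the checker.
SAMPLE = """\
RESIDUE   XXX      3
CONECT      O1     2 H1   H2
CONECT      H1     1 O1
CONECT      H2     1 O1
HET    XXX              3
"""


def parse_hetdict(text):
    """Parse dictionary text into a dict keyed by residue name.

    Each value holds the atom totals declared on the RESIDUE and HET
    lines and a mapping of atom name -> (bond count, neighbor names).
    """
    entries = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or unrecognized line
        if fields[0] == "RESIDUE":
            current = entries.setdefault(fields[1], {"declared": [], "bonds": {}})
            current["declared"].append(int(fields[2]))
        elif fields[0] == "CONECT" and current is not None:
            current["bonds"][fields[1]] = (int(fields[2]), fields[3:])
        elif fields[0] == "HET" and fields[1] in entries:
            entries[fields[1]]["declared"].append(int(fields[2]))
    return entries


def check_hetdict(text):
    """Return a list of consistency problems (empty if none are found)."""
    problems = []
    for name, entry in parse_hetdict(text).items():
        bonds = entry["bonds"]
        # The RESIDUE and HET totals should equal the number of CONECT records.
        for total in entry["declared"]:
            if total != len(bonds):
                problems.append(
                    f"{name}: header says {total} atoms, found {len(bonds)} CONECT records"
                )
        for atom, (count, neighbors) in bonds.items():
            # The stated bond count should match the neighbors listed.
            if count != len(neighbors):
                problems.append(
                    f"{name}: {atom} declares {count} bonds, lists {len(neighbors)}"
                )
            # Every bond should be listed from both atoms.
            for other in neighbors:
                if other not in bonds:
                    problems.append(
                        f"{name}: {atom} is bonded to {other}, which has no CONECT record"
                    )
                elif atom not in bonds[other][1]:
                    problems.append(
                        f"{name}: {atom} lists {other}, but {other} does not list {atom}"
                    )
    return problems
```

Passing the contents of an edited het-dict file to check_hetdict returns a list of messages to review. An empty list means only that these three checks passed.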

Command Line Interface

A description of the basic command line interface can be obtained by executing mkhetdict with no arguments.

prompt> mkhetdict

will generate output similar to the following:

  mkhetdict, Copyright (c) 2010-2015
  OpenEye Scientific Software, Inc.
  Version: 1.2.1
  Release: 20150305
  OEChem version: 1.9.2 20150305
  Platform: redhat-RHEL5-g++4.1-x64

  Licensed for the exclusive use of Company Name.
  Licensed for use only in Site.
  License expires on August 15, 2015.

No arguments specified on the command line
Required parameters:
    -input_mol : Input molecule file.
For more help type:
  mkhetdict --help

Required Parameters

-input_mol <filename>
-i <filename>

[keyless parameter 1]

Input molecule file containing small-molecule residues.

File type    Extensions
OEBinary     .oeb .oeb.gz
PDB          .pdb .ent .pdb.gz .ent.gz
SDF          .sdf .mol .sdf.gz .mol.gz
MOL2         .mol2 .mol2.gz
MacroModel   .mmod .mmod.gz
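
A script that calls mkhetdict can check the input filename against this table first. The Python sketch below is a hypothetical helper, not something mkhetdict provides. It matches extensions exactly as listed, and it makes no claim about how mkhetdict itself treats upper-case or unlisted extensions.

```python
# Extensions copied from the table of input file types; the .gz
# variants are handled by stripping a trailing ".gz" first.
INPUT_FORMATS = {
    "OEBinary": (".oeb",),
    "PDB": (".pdb", ".ent"),
    "SDF": (".sdf", ".mol"),
    "MOL2": (".mol2",),
    "MacroModel": (".mmod",),
}


def input_file_type(filename):
    """Return the table's file type for filename, or None if unlisted."""
    name = filename
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    for file_type, extensions in INPUT_FORMATS.items():
        # str.endswith accepts a tuple of candidate suffixes.
        if name.endswith(extensions):
            return file_type
    return None
```

A caller might refuse to run when input_file_type returns None, or warn when the type is not PDB, since the same file is later passed to reduce in the workflow above.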

Command Line Options

Output Options

-output_txt <filename.txt>
-o <filename.txt>

[keyless parameter 2]

Output PDB format small-molecule dictionary file. The filename must have the extension .txt for reduce to recognize it.

Other Options

Print additional information to standard-error.
