Macromolecule Conformations
Alternate Locations
Because macromolecular structures are usually represented as static shapes, it is easy to get the mistaken impression that proteins, nucleic acids, etc. are rigid molecules. In truth, these molecules move around quite a lot, and in a crystal, loops and other flexible regions are often disordered. Crystallographers try to model multiple conformations in the parts of a structure where disorder is observed. Where they can, they include a separate copy of each moving atom for each conformation, marking each copy with an ‘alternate location code’, a fractional ‘occupancy’, and a ‘temperature factor’ quantifying the atom’s thermal motion. The section Biopolymer Residues discusses how these properties are stored in an OEResidue.
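To make the three per-atom fields concrete, here is a minimal sketch of how they can be pulled out of a single fixed-width PDB ATOM or HETATM record, where the alternate location code occupies column 17, the occupancy columns 55-60, and the temperature factor columns 61-66. The struct AltLocFields and the function ParseAltLocFields are hypothetical names used only for this illustration. They are not part of the OpenEye toolkits, which parse these fields for you.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical holder for the per-atom disorder fields of one PDB record.
struct AltLocFields
{
  char   altloc;     // alternate location code, ' ' when the atom has none
  double occupancy;  // fraction of molecules adopting this conformation
  double bfactor;    // temperature (B) factor
};

// Extracts the fields from one fixed-width record line. Returns false if the
// line is not an ATOM/HETATM record or is too short to hold all the fields.
bool ParseAltLocFields(const std::string &line, AltLocFields &out)
{
  if (line.size() < 66)
    return false;

  const std::string record = line.substr(0, 6);
  if (record != "ATOM  " && record != "HETATM")
    return false;

  out.altloc    = line[16];                                          // column 17
  out.occupancy = std::strtod(line.substr(54, 6).c_str(), nullptr);  // columns 55-60
  out.bfactor   = std::strtod(line.substr(60, 6).c_str(), nullptr);  // columns 61-66
  return true;
}
```

An atom without alternate locations normally carries a blank code and an occupancy of 1.00, so a non-blank altloc is the quickest sign that a record belongs to a disordered region.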
Because a structure with alternate locations describes an ensemble of molecules
rather than a single molecule, it is unsuitable as-is for calculating
molecular properties. Often, this is dealt with by dropping all but the first
alternate location from the molecule, and this is what OEReadMolecule does for
Protein Data Bank (PDB) files by default. The first step in dealing with
alternate locations is to retain all the alternate location atoms
by setting the input flavor before reading the molecule, as shown below.
ims.SetFlavor(OEFormat::PDB, OEIFlavor::PDB::ALTLOC);
With all the alternate atoms retained, you can use the predicate
OEHasAlternateLocation to identify these atoms.
Although alternate locations are atom properties, they usually
describe the coordinated motion of groups of atoms. Each connected
set of atoms with alternate location codes that move in a coordinated
fashion is called an alternate location group (represented by an
OEAltGroup), and each conformation of a group’s atoms is called an
alternate location (represented by an OEAltLocation). The
second step in dealing with alternate locations is to use the
OEAltLocationFactory, a class that will
manage these groups and locations for you.
Listing 1: Alternate location factory groups
#include <iostream>

#include <openeye.h>
#include <oesystem.h>
#include <oechem.h>
#include <oebio.h>

using namespace OESystem;
using namespace OEChem;
using namespace OEBio;

// Print the number of alternate location groups in mol and, for each
// group, its location count and location codes.
void PrintAltGroupInfo(OEMolBase &mol)
{
  // the factory needs residue information
  if (!OEHasResidues(mol))
    OEPerceiveResidues(mol, OEPreserveResInfo::All);

  OEAltLocationFactory alf(mol); // create factory for mol

  std::cout << mol.GetTitle() << "\t"
            << "(" << alf.GetGroupCount() << " groups)" << std::endl;

  for (OEIter<const OEAltGroup> grp = alf.GetGroups(); grp; ++grp)
  {
    std::cout << "\t" << grp->GetLocationCount() << " locs"
              << ":" << alf.GetLocationCodes(grp) << std::endl;
  }
}

int main(int argc, char *argv[])
{
  if (argc != 2)
    OEThrow.Usage("%s <mol-infile>", argv[0]);

  oemolistream ims;
  if (!ims.open(argv[1]))
    OEThrow.Fatal("Unable to open %s for reading", argv[1]);

  // need this flavor to read alt loc atoms
  ims.SetFlavor(OEFormat::PDB, OEIFlavor::PDB::ALTLOC);

  OEGraphMol mol;
  while (OEReadMolecule(ims, mol))
  {
    PrintAltGroupInfo(mol);
  }

  return 0;
}
In addition to providing methods to work with alternate locations and
groups, the OEAltLocationFactory
corrects its copy of the input source molecule for bond and formal charge problems
caused by atoms having multiple locations (something the standard
molecule perception routines are not set up to handle).
The OEAltLocationFactory
also provides methods for manufacturing subset molecules that represent
specific selections among the different groups of alternate locations.
The initial (primary) selection is the alternate location in each group
with the largest average occupancy. In the example below, the subset
combines the previous location selections with the location coded 'B'
whose group includes the residue of the specified atom.
Listing 2: Making an alternate location factory subset mol
// given OEAltLocationFactory alf and OEIter<OEAtomBase> atom ...
OEAltLocation loc = alf.GetLocation(atom, 'B');
OEGraphMol ssmol;
if (alf.MakeAltMol(ssmol, loc))
{
  // use the subset mol...
}
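The primary-selection rule described above (the location with the largest average occupancy in each group) can be illustrated without the toolkit. The following is a minimal sketch, not the factory's actual implementation: AltAtom and PrimaryLocation are hypothetical names, and breaking ties in favor of the lowest location code is an assumption made only for this example.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical record for one atom copy in an alternate location group.
struct AltAtom
{
  char   altloc;     // alternate location code of this atom copy
  double occupancy;  // fractional occupancy of this atom copy
};

// Returns the location code with the largest mean occupancy in one group,
// or '\0' if the group is empty. Ties go to the lowest code (an assumption
// of this sketch, not documented toolkit behavior).
char PrimaryLocation(const std::vector<AltAtom> &group)
{
  // code -> (sum of occupancies, number of atom copies)
  std::map<char, std::pair<double, int>> totals;
  for (const AltAtom &atom : group)
  {
    totals[atom.altloc].first  += atom.occupancy;
    totals[atom.altloc].second += 1;
  }

  char   best    = '\0';
  double bestAvg = -1.0;
  for (const auto &entry : totals)  // std::map iterates in ascending code order
  {
    const double avg = entry.second.first / entry.second.second;
    if (avg > bestAvg)
    {
      bestAvg = avg;
      best    = entry.first;
    }
  }
  return best;
}
```

Averaging over the atoms of a location, rather than reading the occupancy of a single atom, keeps the choice stable when a refinement program has assigned slightly different occupancies to atoms that move together.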
Dihedrals and Sidechain Rotamers
The function OEGetRotamers returns an iterator of ‘rotameric’
sidechain conformations for a given amino-acid type, while
OESetRotamer will set the sidechain chi angles of a specified
OEHierResidue or OEAtomBase to a given rotamer.
The Dunbrack [Dunbrack-1997], Richardson [Lovell-2000], and
newer Richardson_2016 [Hintze-2016] rotamer libraries are supported.
There are also functions that return backbone
and sidechain dihedral angles
(OEGetPhi, OEGetPsi, OEGetChis, OEGetTorsion)
and modify dihedrals (OESetTorsion).
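For readers who want to see what a dihedral such as phi, psi, or chi measures, the sketch below computes the signed torsion angle defined by four points using the common atan2 formulation. It is a standalone illustration of the geometry, not the code behind OEGetTorsion. Vec3 and DihedralDegrees are hypothetical names introduced for this example.

```cpp
#include <cmath>

// Hypothetical minimal 3D vector type for this sketch.
struct Vec3 { double x, y, z; };

static Vec3 Sub(const Vec3 &a, const Vec3 &b)
{
  return {a.x - b.x, a.y - b.y, a.z - b.z};
}

static Vec3 Cross(const Vec3 &a, const Vec3 &b)
{
  return {a.y * b.z - a.z * b.y,
          a.z * b.x - a.x * b.z,
          a.x * b.y - a.y * b.x};
}

static double Dot(const Vec3 &a, const Vec3 &b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Signed dihedral angle, in degrees within [-180, 180], for the atom
// sequence p0-p1-p2-p3; the rotation is about the central p1-p2 bond.
double DihedralDegrees(const Vec3 &p0, const Vec3 &p1,
                       const Vec3 &p2, const Vec3 &p3)
{
  const Vec3 b1 = Sub(p1, p0);
  const Vec3 b2 = Sub(p2, p1);
  const Vec3 b3 = Sub(p3, p2);

  const Vec3 n1 = Cross(b1, b2);  // normal of the plane through p0, p1, p2
  const Vec3 n2 = Cross(b2, b3);  // normal of the plane through p1, p2, p3

  const double y = std::sqrt(Dot(b2, b2)) * Dot(b1, n2);
  const double x = Dot(n1, n2);
  return std::atan2(y, x) * 180.0 / std::acos(-1.0);
}
```

With the standard backbone definitions, phi would be obtained by passing the coordinates of C of the preceding residue, N, CA, and C, and psi by passing N, CA, C, and N of the following residue.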
Swapping Ambiguous Isoelectronic Residue Atoms
The function OESwapAIEResidueAtoms
exchanges the coordinates
of nitrogen and oxygen atoms in aspartic acid, asparagine,
glutamic acid and glutamine sidechains
and the ND1/CD2 and CE1/NE2 atoms in histidine rings.
These are atoms that may be confused with one another in an
electron density map because they have the same or very similar
electron density, so swapping their coordinates is occasionally
required to correct an error in a structure.
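The coordinate exchange described above can be pictured with a small standalone sketch. It only illustrates the idea of a histidine ring flip on a plain list of named atoms and says nothing about how OESwapAIEResidueAtoms is implemented. SimpleAtom, SwapCoords, and FlipHistidine are hypothetical names introduced for this example.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal atom record: a PDB-style atom name plus coordinates.
struct SimpleAtom
{
  std::string name;
  double x, y, z;
};

// Exchanges the coordinates of the atoms named a and b, leaving their names
// (and hence their element identities) in place. Returns false, changing
// nothing, if either atom is missing from the residue.
bool SwapCoords(std::vector<SimpleAtom> &residue,
                const std::string &a, const std::string &b)
{
  SimpleAtom *pa = nullptr;
  SimpleAtom *pb = nullptr;
  for (SimpleAtom &atom : residue)
  {
    if (atom.name == a)
      pa = &atom;
    else if (atom.name == b)
      pb = &atom;
  }
  if (pa == nullptr || pb == nullptr)
    return false;

  std::swap(pa->x, pb->x);
  std::swap(pa->y, pb->y);
  std::swap(pa->z, pb->z);
  return true;
}

// Flips a histidine ring by exchanging ND1 with CD2 and CE1 with NE2.
// If only the first pair is present, the first swap is still applied;
// a more careful version would check for all four atoms up front.
bool FlipHistidine(std::vector<SimpleAtom> &his)
{
  return SwapCoords(his, "ND1", "CD2") && SwapCoords(his, "CE1", "NE2");
}
```

Because only coordinates move, the bonding pattern and atom names of the residue stay intact while the nitrogens end up where the carbons were modeled, which is the effect a flip is meant to achieve.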