Best Practices for Java

How to Reduce Memory Fragmentation

To reduce memory fragmentation, reuse molecule objects across iterations and call Clear() between uses instead of constructing a new molecule each time.

// Creating a new molecule on each iteration causes heavy memory allocation/deallocation activity
for (int i = 0; i < 10; i++) {
    OEMol mol = new OEMol();
    oechem.OESmilesToMol(mol, smiles[i]);
    ...
}

// Reusing a single molecule object reduces memory fragmentation
OEMol mol = new OEMol();
for (int i = 0; i < 10; i++) {
    oechem.OESmilesToMol(mol, smiles[i]);
    ...
    mol.Clear();
}

When to Call close()

Call close() on any object that provides the method (listed below) as soon as you are done using it.

Although Java is garbage collected, you must still call close() on any object with an underlying file handle, because the garbage collector (GC) is non-deterministic. If closing files were left to the GC, a stream could wait several GC cycles before flushing, and the file could be corrupted if another object wrote to it in the meantime.

The following example reads molecules from stdin in SMILES format and writes them to stdout in absolute SMILES format. Notice that the oemolstreambase.close method must be called explicitly on the molecule streams.
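The flushing concern described above can be illustrated with a minimal sketch that uses only the Java standard library. It is not the OpenEye example: a BufferedWriter stands in for a molecule output stream, and the class name CloseDemo and the temporary file are hypothetical. Whether OpenEye streams buffer in exactly this way is an assumption; the sketch only shows that buffered data may not reach the file until close() is called.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; a BufferedWriter stands in for a toolkit output stream.
final class CloseDemo {

    /** Returns {bytes on disk before close(), bytes on disk after close()}. */
    static long[] sizesBeforeAndAfterClose() {
        try {
            Path tmp = Files.createTempFile("close-demo", ".smi");
            try {
                BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8);
                long before;
                try {
                    out.write("c1ccccc1 benzene\n");
                    // The record is still sitting in the writer's buffer here.
                    before = Files.size(tmp);
                } finally {
                    // close() flushes the buffer and releases the file handle,
                    // even if the write above throws.
                    out.close();
                }
                long after = Files.size(tmp);
                return new long[] {before, after};
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesBeforeAndAfterClose();
        System.out.println("before close(): " + sizes[0] + " bytes on disk");
        System.out.println("after close():  " + sizes[1] + " bytes on disk");
    }
}
```

Placing close() in a finally block is one way to make the call deterministic; the important point is that it runs at a known place in the program rather than whenever the GC gets to it.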

When to Call delete()

Call delete() as soon as possible on objects from any of the OpenEye toolkits that define it. For scripts that create many objects (including implicitly created objects, such as those returned by OEMolBase.GetAtoms and OEAtomBase.GetBonds), relying solely on the GC can cause an out-of-memory error. The delete() method invokes the underlying C++ destructor to release the allocated memory, so calling it is the way to prevent such errors.

// Calling .delete() is recommended
OEMol mol = new OEMol();
...
mol.delete();

// If many objects are created in a loop, calling .delete()
// is mandatory to prevent an out-of-memory error
while (oechem.OEReadMolecule(ifs, mol)) {
    OEAtomBaseIter aiter = mol.GetAtoms();
    for (OEAtomBase atm : aiter) {
        ...
    }
    aiter.delete();
}

Warning

Calling any method on an object after .delete() will cause a hard crash.
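One way to make sure delete() always runs, and that nothing touches the object afterwards, is to confine the object to a try/finally block. The sketch below assumes a hypothetical stand-in class, NativeHandle, that merely counts live instances; it is not an OpenEye class. Its delete() is idempotent and its methods throw after deletion so that misuse is visible, whereas a real toolkit object would crash instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo; NativeHandle imitates an object that owns native (C++) memory.
final class DeleteDemo {

    static final class NativeHandle {
        // Number of handles created but not yet deleted.
        static final AtomicInteger LIVE = new AtomicInteger();
        private boolean deleted = false;

        NativeHandle() {
            LIVE.incrementAndGet();
        }

        int value() {
            // The stand-in throws on use after delete(); a real wrapper would hard-crash.
            if (deleted) {
                throw new IllegalStateException("used after delete()");
            }
            return 42;
        }

        void delete() {
            if (!deleted) {
                deleted = true;
                LIVE.decrementAndGet();
            }
        }
    }

    /** Creates one handle per iteration and releases it even if the loop body throws. */
    static int sumWithPromptDelete(int iterations) {
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            NativeHandle h = new NativeHandle();
            try {
                sum += h.value();
            } finally {
                h.delete();   // h goes out of scope right after, so it cannot be reused
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int sum = sumWithPromptDelete(1000);
        System.out.println("sum=" + sum + " live handles=" + NativeHandle.LIVE.get());
    }
}
```

Declaring the object inside the loop body keeps its scope as small as its lifetime, which makes an accidental call after delete() a compile-time error rather than a runtime crash.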

OEFloatArray vs float[]

Using a float[] instead of an OEFloatArray is at least twice as fast when the contents of an array need to be manipulated between OpenEye API calls. This is best explained with the following examples:

The performance difference becomes even more significant when the array is manipulated inside loops that run many times.

OpenEye Defines

  • -Doejava.libs.debug=[1|0] displays debugging output to the console while loading shared libraries.

  • -Doejava.libs.path=path loads OpenEye shared libraries directly from path. It makes no attempt to unpack them from the jar and fails if no libraries are found.
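Both defines use the JVM's -D syntax, so they are passed on the java command line before the main class, for example java -Doejava.libs.debug=1 followed by the usual classpath and class name. The JVM exposes any such flag as a system property. The short sketch below only shows how to check which values the JVM received; the class name OeDefinesDemo is hypothetical, and how the OpenEye loader itself consumes these properties is not shown here.

```java
// Hypothetical helper for inspecting the -D flags the JVM was started with.
final class OeDefinesDemo {

    /** Returns the value of a -D flag, or the fallback when the flag was not given. */
    static String define(String name, String fallback) {
        return System.getProperty(name, fallback);
    }

    public static void main(String[] args) {
        System.out.println("oejava.libs.debug = " + define("oejava.libs.debug", "(not set)"));
        System.out.println("oejava.libs.path  = " + define("oejava.libs.path", "(not set)"));
    }
}
```

Running this class with and without the flags is a quick way to confirm that a launcher script or build tool is actually forwarding them to the JVM.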

Keep in Mind

  • “You can be sure only about which objects are marked for garbage collection. You can never be sure exactly when the object will be garbage collected.” ([Gupta-2013])

  • “Garbage collection happens in a low priority thread.” ([Gupta-2013])

  • “… the cost of automatic dynamic memory management is highly dependent on application behavior.” ([Jones-2012])

References

Gupta-2013
M. Gupta,
OCA Java SE 7 Programmer I Certification Guide,
Manning Publications Company, pp. 120-124, ISBN: 9781617291043, 2013.

Jones-2012
R. Jones, A. Hosking, and E. Moss,
The Garbage Collection Handbook,
CRC Press, p. 9, 2012.