OEMCSFunc

class OEMCSFunc

OEMCSFunc is an abstract base class that defines the API used for scoring subgraph matches.

The scores generated by implementations of OEMCSFunc influence the sorting and retention of maximum common subgraph matches generated by the OEMCSSearch and the OECliqueSearch classes.

The following classes derive from this class:

Note

A custom scoring function can be implemented by deriving from the OEMCSFunc class and implementing all of the methods listed below.

Example of using a custom scoring function

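#include <openeye.h>
#include <oechem.h>

using namespace OEChem;

// Scores a match by the total number of mapped atoms and bonds.
class MyMaxAtomsBondsMCSFunc : public OEMCSFunc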
{
public:
  double operator()(const OEMolBase &pattern, const OEMolBase&,
                    OEAtomBase **amap, OEBondBase **bmap)
  {
    auto atomcount = 0u;
    for (auto i = 0u; i < pattern.GetMaxAtomIdx(); ++i)
    {
      if (amap[i] != nullptr)
        ++atomcount;
    }

    auto bondcount = 0u;
    for (auto i = 0u; i < pattern.GetMaxBondIdx(); ++i)
    {
      if (bmap[i] != nullptr)
        ++bondcount;
    }

    return oeCast(double, atomcount + bondcount);
  }

  OEMCSFunc *CreateCopy() const
  {
    return new MyMaxAtomsBondsMCSFunc;
  }
};

int main()
{
  OEGraphMol pattern;
  OEGraphMol target;
  OESmilesToMol(pattern, "c1cc(O)c(O)cc1CCN");
  OESmilesToMol(target,  "c1c(O)c(O)c(Cl)cc1CCCBr");

  const unsigned int atomexpr = OEExprOpts::DefaultAtoms;
  const unsigned int bondexpr = OEExprOpts::DefaultBonds;
  OEMCSSearch mcss(pattern, atomexpr, bondexpr, OEMCSType::Exhaustive);

  // Matches found by mcss are scored with the custom function from here on.
  mcss.SetMCSFunc(MyMaxAtomsBondsMCSFunc());

  return 0;
}

operator()

double operator()(const OEMolBase &pattern, const OEMolBase &target,
                  OEAtomBase **amap, OEBondBase **bmap)=0

This method is called automatically by the OEMCSSearch and OECliqueSearch classes when a common subgraph of two molecules is identified.

The ‘pattern’ (query) molecule and ‘target’ molecule currently being matched are passed as the first and second arguments to the function. The arrays of pointers to atoms (‘amap’) and bonds (‘bmap’) hold the atom and bond correspondences between the pattern and target. The length of ‘amap’ is the maximum atom index of the pattern molecule, and the length of ‘bmap’ is its maximum bond index. The index of an atom or bond in the pattern molecule is used to look up the corresponding atom or bond in the target molecule. A subgraph may not include all pattern atoms. Array positions for unmatched atoms and bonds are set to the null pointer.
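The lookup described above can be sketched without the toolkit. In this minimal sketch, AtomStub is a hypothetical stand-in for OEAtomBase and MappedPairs is an illustrative helper, neither of which is part of OEChem; maxAtomIdx plays the role of the value returned by pattern.GetMaxAtomIdx().

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for OEAtomBase; only an atom index is modeled.
struct AtomStub
{
  unsigned int idx;
};

// Collects (pattern atom index, target atom index) pairs from an
// amap-style array. Illustrative helper, not an OEChem function.
std::vector<std::pair<unsigned int, unsigned int>>
MappedPairs(AtomStub *const *amap, unsigned int maxAtomIdx)
{
  assert(amap != nullptr || maxAtomIdx == 0);

  std::vector<std::pair<unsigned int, unsigned int>> pairs;
  for (unsigned int i = 0; i < maxAtomIdx; ++i)
  {
    // A null entry marks a pattern atom that is not part of the subgraph.
    if (amap[i] != nullptr)
      pairs.emplace_back(i, amap[i]->idx);
  }
  return pairs;
}
```

The same loop applies to ‘bmap’, with the maximum bond index of the pattern as the upper bound.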

The integer part of the floating point value returned by the method is used to determine maximal common subgraphs.

OEMCSSearch discards every subgraph whose integer-part score is smaller than the maximum value computed for any subgraph. OECliqueSearch discards any subgraph whose score is smaller than the maximum score minus the ‘range’ set by OECliqueSearch::SetSaveRange.
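The retention rule can be illustrated with plain numbers. The sketch below is not how the toolkit is implemented; RetainedScores is a hypothetical helper that assumes non-negative scores and assumes that the integer part is also what is compared when a save range is in effect.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of OEChem. Returns the scores that survive
// the retention rule: range = 0 mimics the OEMCSSearch behavior, a positive
// range mimics OECliqueSearch with a save range.
std::vector<double> RetainedScores(const std::vector<double> &scores,
                                   unsigned int range)
{
  std::vector<double> kept;
  if (scores.empty())
    return kept;

  // Only the integer part of each score takes part in the comparison.
  double best = 0.0;
  for (double s : scores)
  {
    assert(s >= 0.0);  // assumption: scores are non-negative
    best = std::max(best, std::floor(s));
  }

  for (double s : scores)
  {
    if (std::floor(s) >= best - static_cast<double>(range))
      kept.push_back(s);
  }
  return kept;
}
```

For the scores 9.08, 9.10, 8.09 and 7.07, a range of 0 keeps the two scores with integer part 9, while a range of 1 also keeps 8.09.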

The fractional part of the floating-point value returned by the method is used to sort the matches found by OEMCSSearch or OECliqueSearch. For example, by scoring matches using the function

\(\text{number of mapped atoms} + \frac{\text{number of mapped bonds}}{100}\)

all matches that have the same number of subgraph atoms would be retained, but the matches would be returned in order of decreasing number of bonds matched.
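As a worked illustration of this scoring scheme, the sketch below computes the combined score and orders matches by it. MatchCounts, CombinedScore and SortByScore are hypothetical names used only here. The bond term stays in the fractional part only while fewer than 100 bonds are mapped, which the code checks as a precondition.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one match: only the mapped counts are modeled.
struct MatchCounts
{
  unsigned int atoms;
  unsigned int bonds;
};

// atoms + bonds / 100: the atom count is the integer part and the bond
// count is the fractional part of the score.
double CombinedScore(const MatchCounts &m)
{
  assert(m.bonds < 100);  // otherwise bonds would spill into the integer part
  return static_cast<double>(m.atoms) + static_cast<double>(m.bonds) / 100.0;
}

// Orders matches by decreasing combined score.
void SortByScore(std::vector<MatchCounts> &matches)
{
  std::sort(matches.begin(), matches.end(),
            [](const MatchCounts &a, const MatchCounts &b)
            { return CombinedScore(a) > CombinedScore(b); });
}
```

Three matches with 9 atoms and 8, 10 and 9 bonds all have integer part 9, so none is preferred on atom count alone; after sorting they appear in the order 10, 9, 8 bonds.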

CreateCopy

OEMCSFunc *CreateCopy() const =0

Returns a deep copy of the object. The memory for the returned OEMCSFunc object is dynamically allocated and owned by the caller.

The returned copy should be deallocated using the C++ delete operator in order to prevent a memory leak.
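One way to honor this ownership contract is to wrap the returned pointer in a std::unique_ptr so that delete runs automatically. The sketch below uses ScoreFuncBase as a hypothetical stand-in for the cloning part of the OEMCSFunc interface; WeightedScoreFunc, Weight and CloneOwned are illustrative names, not toolkit API.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the cloning part of the OEMCSFunc interface.
class ScoreFuncBase
{
public:
  virtual ~ScoreFuncBase() = default;
  virtual ScoreFuncBase *CreateCopy() const = 0;
  virtual double Weight() const = 0;  // illustrative state accessor
};

class WeightedScoreFunc : public ScoreFuncBase
{
public:
  explicit WeightedScoreFunc(double weight) : weight_(weight) {}

  // Copy-constructs so that any state held by the object is carried over.
  ScoreFuncBase *CreateCopy() const override
  {
    return new WeightedScoreFunc(*this);
  }

  double Weight() const override { return weight_; }

private:
  double weight_;
};

// Takes ownership of the copy; the unique_ptr deletes it when it goes
// out of scope.
std::unique_ptr<ScoreFuncBase> CloneOwned(const ScoreFuncBase &func)
{
  ScoreFuncBase *raw = func.CreateCopy();
  assert(raw != nullptr);
  return std::unique_ptr<ScoreFuncBase>(raw);
}
```

A scoring class without data members can return a default-constructed instance instead, but copy-constructing from *this keeps CreateCopy correct if state is added later.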