OEApplyChEMBL18SolubilityTransforms

OESystem::OEIterBase<OEChem::OEMolBase> *
   OEApplyChEMBL18SolubilityTransforms(OEChem::OEMolBase &input, int context, unsigned int minMMPThreshold=5)

Given an input molecule, apply transformations derived from solubility data obtained from the [ChEMBL18-2014] database. The context argument controls how much chemistry information is included in the transformation reaction; see OEMatchedPairContext. This function supports only the OEMatchedPairContext_Bond0 and OEMatchedPairContext_Bond2 context values. The minMMPThreshold argument restricts the applied transformations to those supported by at least the specified number of matched pairs. Use a minMMPThreshold value of 0 to apply all transformations regardless of the number of matched pairs associated with them.

In the example below, the input structures are transformed by the ChEMBL solubility transforms and exported to a file format that supports SD data. Each transformed structure carries information about the solubility transform (as SMIRKS) that generated it and the matched pair information associated with that transform (ChEMBL identifiers and solubility data). The added annotations comprise the data fields OEMMP_normalized_value (uM), OEMMP_published_value, OEMMP_examples (SMILES), and OEMMP_transform (SMILES) for subsequent analysis.

    totalmols = 0
    # number of bonds of chemistry context at site of change
    #  for the applied transforms
    xformctxt = OEMatchedPairContext_Bond2
    for molidx, mol in enumerate(ifs.GetOEGraphMols(), start=1):
        # consider only the largest input fragment
        OETheFunctionFormerlyKnownAsStripSalts(mol)

        smolcnt = 0
        # only consider solubility transforms having at least 5 matched pairs
        for solMol in OEApplyChEMBL18SolubilityTransforms(mol, xformctxt, 5):
            # compute net change in solubility from MMP data
            deltasol = []
            if OEHasSDData(solMol, "OEMMP_normalized_value (uM)"):
                for sditem in OEGetSDData(solMol,
                                          "OEMMP_normalized_value (uM)").split('\n'):
                    # fromIndex,toIndex,fromValue,toValue
                    sdvalues = sditem.split(',')
                    # skip incomplete records (too few fields or a missing value)
                    if len(sdvalues) < 4 or not sdvalues[2] or not sdvalues[3]:
                        continue
                    deltasol.append(float(sdvalues[3])-float(sdvalues[2]))
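            if not deltasol:
                continue

            # average() and stddev() are not Python builtins; they are assumed
            # to be helper functions defined elsewhere in the full example
            avgsol = deltasol[0]
            if len(deltasol) > 1:
                avgsol = average(deltasol)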

            # reject examples with net decrease in solubility
            if avgsol < 0.0:
                continue
            sdev = stddev(deltasol)

            # annotate with average,stddev,num
            OEAddSDData(solMol,
                        "OEMMP_average_delta_normalized_value",
                        "{0:.1F},{1:.2F},{2}".format(avgsol, sdev, len(deltasol)))

            # export solubility transformed molecule with SDData annotations
            if OEWriteMolecule(ofs, solMol) == OEWriteMolReturnCode_Success:
                smolcnt += 1

        OEThrow.Info("{0}: Exported molecule count, {1}".format(molidx, smolcnt))
        totalmols += smolcnt
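
The example above calls average and stddev without defining them, and it parses the OEMMP_normalized_value (uM) field by hand. The sketch below shows one way those pieces might look in plain Python, with no toolkit calls. The helper bodies, the parse_deltas name, the choice of the sample (n-1) standard deviation, and the sample field text are all assumptions made for illustration; they are not taken from the toolkit or from ChEMBL.

```python
import math


def average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def stddev(values):
    """Sample standard deviation; 0.0 when fewer than two values (assumed convention)."""
    if len(values) < 2:
        return 0.0
    mean = average(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return math.sqrt(variance)


def parse_deltas(sddata):
    """Return toValue - fromValue for each complete record in the field text.

    Each line is assumed to hold: fromIndex,toIndex,fromValue,toValue
    """
    deltas = []
    for record in sddata.split("\n"):
        fields = record.split(",")
        # skip short records and records missing either value
        if len(fields) < 4 or not fields[2] or not fields[3]:
            continue
        deltas.append(float(fields[3]) - float(fields[2]))
    return deltas


# made-up field text for illustration only, not real ChEMBL data
sample = "1,2,10.0,25.0\n3,4,,40.0\n5,6,5.0,15.0"
deltas = parse_deltas(sample)

# same average,stddev,num layout as the annotation written in the example
summary = "{0:.1F},{1:.2F},{2}".format(average(deltas), stddev(deltas), len(deltas))
print(summary)
```

The second record in the sample has no fromValue, so it is skipped in the same way the example skips incomplete matched pair entries. Returning 0.0 from stddev for a single value keeps the annotation step from failing when a transform has only one usable pair.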