Basic Filtering for a Molecule File
All filtering operations are controlled via the OEFilter object. The
OEFilter object is typically configured with a specified filter and then
applied iteratively over a molecule file. This example configures the
OEFilter object with the filter type selected on the command line (the
lead-like filter in the example run) and writes out the molecules that
pass the filter. The OEFilter.operator() method is used to test whether
each molecule passes the filter.
Note
The molecule will also be altered by all the specified Filter Preprocessing steps.
Command Line Interface
A description of the command line interface can be obtained by executing the program with the --help argument.
prompt> python molfilter.py --help
will generate the following output:
Simple parameter list
filter options :
-filtertype : filter type
input/output options :
-in : Input filename
-out : Output filename
Usage: ./molfilter <input> <output>
Code
#!/usr/bin/env python
# (C) 2022 Cadence Design Systems, Inc. (Cadence)
# All rights reserved.
# TERMS FOR USE OF SAMPLE CODE The software below ("Sample Code") is
# provided to current licensees or subscribers of Cadence products or
# SaaS offerings (each a "Customer").
# Customer is hereby permitted to use, copy, and modify the Sample Code,
# subject to these terms. Cadence claims no rights to Customer's
# modifications. Modification of Sample Code is at Customer's sole and
# exclusive risk. Sample Code may require Customer to have a then
# current license or subscription to the applicable Cadence offering.
# THE SAMPLE CODE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED. OPENEYE DISCLAIMS ALL WARRANTIES, INCLUDING, BUT
# NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. In no event shall Cadence be
# liable for any damages or liability in connection with the Sample Code
# or its use.
#############################################################################
# Filter a molecule file for "Lead-like" molecules
#############################################################################
import sys
from openeye import oechem
from openeye import oemolprop
def main(argv=[__name__]):
    itf = oechem.OEInterface(InterfaceData)
    oemolprop.OEConfigureFilterParams(itf)

    if not oechem.OEParseCommandLine(itf, argv):
        oechem.OEThrow.Fatal("Unable to interpret command line!")

    iname = itf.GetString("-in")
    oname = itf.GetString("-out")

    ifs = oechem.oemolistream()
    if not ifs.open(iname):
        oechem.OEThrow.Fatal("Cannot open input file!")

    ofs = oechem.oemolostream()
    if not ofs.open(oname):
        oechem.OEThrow.Fatal("Cannot create output file!")

    ftype = oemolprop.OEGetFilterType(itf)
    filt = oemolprop.OEFilter(ftype)
    filt.SetErrorLevel(oechem.OEErrorLevel_Info)

    for mol in ifs.GetOEGraphMols():
        if filt(mol):
            oechem.OEWriteMolecule(ofs, mol)
#############################################################################
# INTERFACE
#############################################################################
InterfaceData = '''
!BRIEF [-in] <input> [-out] <output>
!CATEGORY "input/output options :"
!PARAMETER -in
!ALIAS -i
!TYPE string
!REQUIRED true
!KEYLESS 1
!VISIBILITY simple
!BRIEF Input filename
!END
!PARAMETER -out
!ALIAS -o
!TYPE string
!REQUIRED true
!KEYLESS 2
!VISIBILITY simple
!BRIEF Output filename
!END
!END
'''
if __name__ == "__main__":
    sys.exit(main(sys.argv))
See also
OEFilter class
OEConfigureFilterParams and OEConfigureFilterType functions
OEGetFilterType function
Examples
prompt> python molfilter.py -in mcss.smi.gz -out .smi -filtertype Lead
The following is an example of the output:
CC1=CC(=O)C=CC1=O NSC 1,Minimum atom count(10) not reached: 9
c1ccc2c(c1)nc(s2)SSc3nc4ccccc4s3 NSC 2,Maximum disulfide(0) exceeded: 1
c1c(cc(c(c1[N+](=O)[O-])[O-])Cl)[N+](=O)[O-] NSC 3,Maximum heteroatom to carbon ratio(1.10) exceeded: 1.33
c1c(sc(n1)N)[N+](=O)[O-] NSC 4,Minimum atom count(10) not reached: 9
c1ccc2c(c1)C(=O)c3ccc(cc3C2=O)N NSC 5,Maximum dye(0) exceeded: 2
c1ccc(c(c1)c2c3ccc(c(c3oc-4c(c(=O)ccc24)Br)Br)O)C(=O)[O-] NSC 6,Maximum atom count(25) exceeded: 27
C[NH+](C)C1=C(C(=O)c2ccccc2C1=O)Cl NSC 7,Maximum alkyl_halide(0) exceeded: 1
Cc1ccc2c(c1[N+](=O)[O-])C(=O)c3ccccc3C2=O NSC 8,Maximum nitro(0) exceeded: 1
CC(C)(C)c1cc(c(cc1O)C(C)(C)C)O NSC 11,Pass
CC1=NN(C(=O)C1)c2ccccc2 NSC 12,Pass
By default, the OEFilter.operator() method emits information to OEThrow
about every molecule passed to it, and this example prints that
information to the screen. The next example, quietfilter, shows how to
suppress that output. Only the molecules that pass the filter are
written to the output file. Each emitted line has the following format:
[Isomeric SMILES]\t[Title],[Pass|Reason for failure]
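If the emitted lines are captured (for example, by redirecting the program's messages to a file), the format above is simple enough to post-process. The following is a minimal sketch in plain Python that does not use the OpenEye toolkits. The names FilterRecord, parse_filter_line, and rule_name are hypothetical helpers introduced for illustration only. The sketch assumes that the reason text contains no comma and that a failure reason begins with the rule name followed by "(", as in the sample output above.

```python
from collections import Counter
from typing import NamedTuple


class FilterRecord(NamedTuple):
    """One emitted line: [Isomeric SMILES]\\t[Title],[Pass|Reason for failure]."""
    smiles: str
    title: str
    verdict: str  # "Pass" or the reason for failure

    @property
    def passed(self) -> bool:
        return self.verdict == "Pass"


def parse_filter_line(line: str) -> FilterRecord:
    # SMILES strings contain no whitespace, so the first run of whitespace
    # (normally a tab) separates the SMILES from the rest of the line.
    smiles, rest = line.rstrip("\n").split(None, 1)
    # Assumption: the verdict contains no comma, so the LAST comma separates
    # the title from the verdict even if the title itself contains commas.
    title, _, verdict = rest.rpartition(",")
    return FilterRecord(smiles, title.strip(), verdict.strip())


def rule_name(verdict: str) -> str:
    # Assumption: a reason looks like "<rule name>(<limit>) ...: <value>",
    # so everything before the first "(" names the rule.
    return verdict.split("(", 1)[0].strip()


# Lines taken from the sample output above, with a tab after the SMILES.
sample = [
    "CC1=CC(=O)C=CC1=O\tNSC 1,Minimum atom count(10) not reached: 9",
    "c1c(sc(n1)N)[N+](=O)[O-]\tNSC 4,Minimum atom count(10) not reached: 9",
    "Cc1ccc2c(c1[N+](=O)[O-])C(=O)c3ccccc3C2=O\tNSC 8,Maximum nitro(0) exceeded: 1",
    "CC1=NN(C(=O)C1)c2ccccc2\tNSC 12,Pass",
]

records = [parse_filter_line(line) for line in sample]

# Tally how often each rule caused a failure.
failures = Counter(rule_name(r.verdict) for r in records if not r.passed)

for rule, count in failures.most_common():
    print(f"{rule}: {count}")
print("passed:", [r.title for r in records if r.passed])
```

A tally like this can help show which rules of a filter reject the most molecules in a given input file before deciding whether a different filter type is more appropriate.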