Dataset Filtering – Create Custom Filter
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Task-based/Data Science/Filtering
Solution-based/Virtual-screening/Analysis/Filtering
Role-based/Medicinal Chemist
Description
Create a custom molecule filter file compatible with the OEMolProp toolkit.
Further details on the parameter settings below can be found in the OEMolProp documentation.
Parameters in the “Filter Customization from Files or String” group can be used to incorporate filter properties from existing filter files or filter strings, or from built-in OpenEye filters. If a parameter is set both in a custom file, built-in file, or string and in the floe parameters, the floe parameter setting will override the setting from the file or string.
The filter file produced by this floe can be used by the ‘Dataset Filtering – Custom or Built-in Filter Types’ floe, and in any other floe using a customized filter file as input. It can also be used directly with the OpenEye OEMolProp toolkit in scripts to construct a custom OEFilter object.
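The override behavior described above can be sketched in a few lines of Python. This is only an illustration of the precedence rule, not the floe's implementation: the function name, the dictionary representation, and the example values are all hypothetical, and `None` is assumed to stand for a floe parameter that was left unset.

```python
from typing import Dict, Optional


def merge_filter_settings(
    file_settings: Dict[str, str],
    floe_params: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Combine settings so that any floe parameter that is set wins."""
    merged = dict(file_settings)  # start from the file or string settings
    for keyword, value in floe_params.items():
        if value is not None:  # None stands for a floe parameter left unset
            merged[keyword] = value
    return merged


# Hypothetical values, for illustration only.
file_settings = {"MIN_MOLWT": "130", "MAX_MOLWT": "781"}
floe_params = {"MAX_MOLWT": "500", "MIN_XLOGP": None}

merged = merge_filter_settings(file_settings, floe_params)
print(merged)
```

Here MAX_MOLWT takes the floe value, MIN_MOLWT keeps the file value, and the unset MIN_XLOGP is left out. For the scripting use mentioned above, the written file would typically be opened with `oechem.oeifstream` and passed to the `oemolprop.OEFilter` constructor; check the OEMolProp documentation for the exact call.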
Promoted Parameters
Title in user interface (promoted name)
Outputs
Custom Filter File Name (custom_filter_file_name): Name of the custom output filter file.
Required
Type: file_out
Default: custom_filter_file.txt
Filter Customization from Files or String
Base Filter (base_filter): One of the OpenEye default filters to optionally use as a base for typical drug filter settings. Settings that are set in the parameters below or in a custom provided filter file will override any corresponding settings in this file.
Type: string
Default: Not Set
Choices: [‘Lead’, ‘Drug’, ‘BlockBuster’, ‘PAINS’, ‘Not Set’]
Custom String to Add to Base Filter (custom_filter_string): Optional string of filter lines, separated by semicolons. Settings in this string will be overridden by base parameters set in either the base file above or the parameters provided in the floe UI.
Type: string
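As a rough illustration of the semicolon-separated format accepted by custom_filter_string, the sketch below splits such a string into individual filter lines. It assumes each line has the shape "KEYWORD value", which is an assumption made for this example; the helper name and the sample string are hypothetical.

```python
from typing import Dict


def parse_filter_string(filter_string: str) -> Dict[str, str]:
    """Split a semicolon-separated filter string into keyword/value pairs."""
    settings: Dict[str, str] = {}
    for raw_line in filter_string.split(";"):
        line = raw_line.strip()
        if not line:  # skip empty pieces, e.g. after a trailing semicolon
            continue
        parts = line.split(None, 1)  # keyword, then the rest of the line
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[keyword] = value
    return settings


# Hypothetical input, for illustration only.
example = "MIN_MOLWT 200; MAX_MOLWT 500;MAX_ROT_BONDS 10;"
parsed = parse_filter_string(example)
print(parsed)
```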
Overall Molecule Filter Parameters
Minimum Solubility (min_solubility): MIN_SOLUBILITY: Minimum solubility
Type: string
Default: not_set
Choices: [‘insoluble’, ‘poorly’, ‘moderately’, ‘soluble’, ‘very’, ‘highly’, ‘not_set’]
Minimum Molecular Weight (min_mol_wt): MIN_MOLWT: Minimum molecular weight for a molecule to pass filter.
Type: integer
Maximum Molecular Weight (max_mol_wt): MAX_MOLWT: Maximum molecular weight for a molecule to pass filter.
Type: integer
Minimum XLogP (min_xlogp): MIN_XLOGP: Minimum XLogP for a molecule.
Type: decimal
Maximum XLogP (max_xlogp): MAX_XLOGP: Maximum XLogP for a molecule.
Type: decimal
Minimum Number of Chiral Centers (min_chiral_centers): MIN_CHIRAL_CENTERS: Minimum number of chiral centers
Type: integer
Maximum Number of Chiral Centers (max_chiral_centers): MAX_CHIRAL_CENTERS: Maximum number of chiral centers.
Type: integer
Minimum Sum Formal Charges (min_sum_crg): MIN_COUNT_FORMAL_CRG: Minimum number of formal charges
Type: integer
Maximum Sum Formal Charges (max_sum_crg): MAX_COUNT_FORMAL_CRG: Maximum number of formal charges.
Type: integer
Screen for unusual valences or charges (type_check): TYPECHECK: screen for unusual valences or charges
Type: string
Screen for atoms with unknown MMFF atom types (mmff_type_check): MMFFTYPECHECK: screen for atoms with unknown MMFF atom types
Type: string
Atom Filter Parameters
Minimum Heavy Atoms (min_hvy_atoms): MIN_NUM_HVY: minimum number of heavy atoms.
Type: integer
Maximum number of heavy atoms (max_hvy_atoms): MAX_NUM_HVY: maximum number of heavy atoms
Type: integer
Minimum Unbranched Atoms (min_unbranched): MIN_UNBRANCHED: Minimum number of connected unbranched non-ring atoms
Type: integer
Maximum unbranched atoms (max_unbranched): MAX_UNBRANCHED: Maximum number of connected unbranched non-ring atoms
Type: integer
Minimum Unbranched Carbon Atoms (min_unbranched_carbon): MIN_UNBRANCHED_C: Minimum number of connected unbranched non-ring carbon atoms
Type: integer
Maximum Unbranched Carbon Atoms (max_unbranched_carbon): MAX_UNBRANCHED_C: Maximum number of connected unbranched non-ring carbon atoms
Type: integer
Minimum Carbons (min_carbons): MIN_CARBONS: Minimum number of carbons
Type: integer
Maximum Carbons (max_carbons): MAX_CARBONS: Maximum number of carbons
Type: integer
Minimum Anionic Carbons (min_anion_c): MIN_ANION_C: Minimum number of anionic carbons
Type: integer
Maximum Anionic Carbons (max_anion_c): MAX_ANION_C: Maximum number of anionic carbons
Type: integer
Ring System Filter Parameters
Estimate Degrees of Freedom in Rings (adjust_rot_for_ring): ADJUST_ROT_FOR_RING: boolean for whether to estimate degrees of freedom in rings
Type: string
Choices: [‘true’, ‘false’]
Minimum Ring Size (min_ring_size): MIN_RING_SIZE: minimum number of atoms in a ring system
Type: integer
Maximum Ring Size (max_ring_size): MAX_RING_SIZE: maximum number of atoms in a ring system
Type: integer
Minimum number of Ring Systems (min_ring_systems): MIN_RING_SYS: minimum number of ring systems
Type: integer
Maximum number of ring systems (max_ring_systems): MAX_RING_SYS: maximum number of ring systems
Type: integer
Bond Filter Parameters
Minimum Rotatable Bonds (min_rot_bonds): MIN_ROT_BONDS: minimum number of rotatable bonds
Type: integer
Maximum Rotatable Bonds (max_rot_bonds): MAX_ROT_BONDS: maximum number of rotatable bonds
Type: integer
Minimum Rigid Bonds (min_rigid_bonds): MIN_RIGID_BONDS: minimum number of rigid bonds
Type: integer
Maximum Rigid Bonds (max_rigid_bonds): MAX_RIGID_BONDS: maximum number of rigid bonds
Type: integer
Heteroatom Parameters
Minimum Heteroatoms (min_heteroatoms): MIN_HETEROATOMS: minimum number of heteroatoms
Type: integer
Maximum Heteroatoms (max_heteroatoms): MAX_HETEROATOMS: maximum number of heteroatoms
Type: integer
Minimum Heteroatom / Carbon Ratio (min_het_c_ratio): MIN_Het_C_Ratio: Minimum heteroatom to carbon ratio
Type: decimal
Maximum Heteroatom / Carbon Ratio (max_het_c_ratio): MAX_Het_C_Ratio: Maximum heteroatom to carbon ratio
Type: decimal
Minimum Halide Fraction (min_halide_fraction): MIN_HALIDE_FRACTION: Minimum halide fraction
Type: decimal
Maximum Halide Fraction (max_halide_fraction): MAX_HALIDE_FRACTION: Maximum halide fraction
Type: decimal
Hydrogen Bond Parameters
Minimum Number Hydrogen Donors (min_h_donors): MIN_HBOND_DONORS: minimum number of hydrogen-bond donors
Type: integer
Maximum Number Hydrogen Donors (max_h_donors): MAX_HBOND_DONORS: maximum number of hydrogen-bond donors
Type: integer
Minimum Number Hydrogen Acceptors (min_h_acceptors): MIN_HBOND_ACCEPTORS: minimum number of hydrogen-bond acceptors
Type: integer
Maximum Number Hydrogen Acceptors (max_h_acceptors): MAX_HBOND_ACCEPTORS: maximum number of hydrogen-bond acceptors
Type: integer
Minimum Number Lipinski Donors (min_lipinski_donors): MIN_LIPINSKI_DONORS: Minimum number of hydrogens on O & N atoms
Type: integer
Maximum Number Lipinski Donors (max_lipinski_donors): MAX_LIPINSKI_DONORS: Maximum number of hydrogens on O & N atoms
Type: integer
Minimum Number Lipinski Acceptors (min_lipinski_acceptors): MIN_LIPINSKI_ACCEPTORS: Minimum number of oxygen & nitrogen atoms
Type: integer
Maximum Number Lipinski Acceptors (max_lipinski_acceptors): MAX_LIPINSKI_ACCEPTORS: Maximum number of oxygen & nitrogen atoms
Type: integer
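The Lipinski counts in this group follow the simple definitions given above: donors are hydrogens on O and N atoms, and acceptors are O and N atoms. The sketch below applies those definitions to a hand-written atom list. The `(element, attached hydrogens)` representation and the helper names are hypothetical and stand in for a real molecule object; the toolkit's own counting may treat edge cases differently.

```python
from typing import List, Tuple

# Hypothetical atom representation: (element symbol, attached hydrogens).
Atom = Tuple[str, int]


def lipinski_donors(atoms: List[Atom]) -> int:
    """Count hydrogens attached to O and N atoms."""
    return sum(h_count for element, h_count in atoms if element in ("N", "O"))


def lipinski_acceptors(atoms: List[Atom]) -> int:
    """Count O and N atoms, regardless of attached hydrogens."""
    return sum(1 for element, _ in atoms if element in ("N", "O"))


# Ethanolamine, H2N-CH2-CH2-OH, written out by hand.
ethanolamine = [("N", 2), ("C", 2), ("C", 2), ("O", 1)]

donors = lipinski_donors(ethanolamine)
acceptors = lipinski_acceptors(ethanolamine)
print(donors, acceptors)
```

For ethanolamine this gives three donors (two N-H hydrogens and one O-H hydrogen) and two acceptors (one N and one O).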
Functional Group Parameters
Minimum Number of Functional Groups (min_functional_groups): MIN_FCNGRP: Minimum number of functional groups
Type: integer
Maximum Number of Functional Groups (max_functional_groups): MAX_FCNGRP: Maximum number of functional groups
Type: integer
Allowed Elements
Allowed Elements (allowed_elements): ALLOWED_ELEMENTS: symbols of elements to allow, separated by commas
Type: string
Eliminate Metals (eliminate_metals): ELIMINATE_METALS: symbols of metals to eliminate, separated by commas
Type: string
Polar Properties
Minimum 2D Polar Surface Area (min_2d_psa): MIN_2D_PSA: minimum 2D polar surface area
Type: decimal
Maximum 2D Polar Surface Area (max_2d_psa): MAX_2D_PSA: maximum 2D polar surface area
Type: decimal
Count S and P as polar atoms for polar surface area (psa_use_sandp): PSA_USE_SandP: Count S and P as polar atoms
Type: string
Choices: [‘true’, ‘false’]
Stereochemistry
Minimum Unspecified Atom Stereos (min_unspecified_atom_stereos): MIN_UNSPECIFIED_ATOM_STEREOS: Minimum unspecified atom stereos
Type: integer
Maximum Unspecified Atom Stereos (max_unspecified_atom_stereos): MAX_UNSPECIFIED_ATOM_STEREOS: Maximum unspecified atom stereos
Type: integer
Minimum Unspecified Bond Stereos (min_unspecified_bond_stereos): MIN_UNSPECIFIED_BOND_STEREOS: Minimum unspecified bond stereos
Type: integer
Maximum Unspecified Bond Stereos (max_unspecified_bond_stereos): MAX_UNSPECIFIED_BOND_STEREOS: Maximum unspecified bond stereos
Type: integer
Aggregators
Eliminate Known Aggregators (elim_known_agg): AGGREGATORS: eliminate known aggregators
Type: string
Choices: [‘true’, ‘false’]
Eliminate Predicted Aggregators (elim_pred_agg): PRED_AGG: eliminate predicted aggregators
Type: string
Choices: [‘true’, ‘false’]
Secondary Filters
GSK Veber filter (gsk_veber): GSK_VEBER: PSA>140 or >10 rot bonds
Type: string
Choices: [‘true’, ‘false’]
Maximum Lipinski Violations (max_lipinski_violations): MAX_LIPINSKI: maximum number of lipinski violations
Type: integer
Minimum ABS (min_abs): MIN_ABS: Minimum probability F>10% in rats
Type: decimal
Pharmacopia (pharmacopia): PHARMACOPIA: LogP > 5.88 or PSA > 131.6
Type: string
Choices: [‘true’, ‘false’]
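The two threshold-based secondary filters above can be written out directly from their descriptions. This is a minimal sketch that assumes a molecule fails when the stated condition is true and that the property values have already been computed elsewhere; the function names and the sample numbers are hypothetical.

```python
def fails_gsk_veber(psa: float, rot_bonds: int) -> bool:
    """GSK_VEBER condition as described: PSA > 140 or more than 10 rotatable bonds."""
    return psa > 140 or rot_bonds > 10


def fails_pharmacopia(logp: float, psa: float) -> bool:
    """PHARMACOPIA condition as described: LogP > 5.88 or PSA > 131.6."""
    return logp > 5.88 or psa > 131.6


# Hypothetical property values, for illustration only.
print(fails_gsk_veber(psa=150.0, rot_bonds=3))   # PSA above the threshold
print(fails_gsk_veber(psa=90.0, rot_bonds=5))    # both values within limits
print(fails_pharmacopia(logp=6.0, psa=50.0))     # LogP above the threshold
print(fails_pharmacopia(logp=3.0, psa=131.6))    # PSA equal to, not above, the limit
```

Both comparisons are strict, so a value exactly at the threshold does not trigger the condition in this sketch.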
Functional Group Maximums: A
Maximum number of acetal groups (acetal):
Type: integer
Maximum number of acid groups (acid):
Type: integer
Maximum number of acid_chloride groups (acid_chloride):
Type: integer
Maximum number of acid_halide groups (acid_halide):
Type: integer
Maximum number of acyclic_NCN groups (acyclic_NCN):
Type: integer
Maximum number of acyclic_NS groups (acyclic_NS):
Type: integer
Maximum number of acyl_cyanides groups (acyl_cyanides):
Type: integer
Maximum number of acylhydrazide groups (acylhydrazide):
Type: integer
Maximum number of alcohol groups (alcohol):
Type: integer
Maximum number of adehyde groups (adehyde):
Type: integer
Maximum number of alkene groups (alkene):
Type: integer
Maximum number of alkyl groups (alkyl):
Type: integer
Maximum number of alkyl_halide groups (alkyl_halide):
Type: integer
Maximum number of alkyl_phosphate groups (alkyl_phosphate):
Type: integer
Maximum number of alkylaniline groups (alkylaniline):
Type: integer
Maximum number of alkylating_agent groups (alkylating_agent):
Type: integer
Maximum number of alkyne groups (alkyne):
Type: integer
Maximum number of alphahalo_amine groups (alphahalo_amine):
Type: integer
Maximum number of alphahalo_ketone groups (alphahalo_ketone):
Type: integer
Maximum number of amide groups (amide):
Type: integer
Maximum number of aminal groups (aminal):
Type: integer
Maximum number of amine groups (amine):
Type: integer
Maximum number of amino_acid groups (amino_acid):
Type: integer
Maximum number of anhydride groups (anhydride):
Type: integer
Maximum number of aniline groups (aniline):
Type: integer
Maximum number of aniline_unsubstituted groups (aniline_unsubstituted):
Type: integer
Maximum number of arene groups (arene):
Type: integer
Maximum number of arenesulfonyl groups (arenesulfonyl):
Type: integer
Maximum number of aryl groups (aryl):
Type: integer
Maximum number of aryl_halide groups (aryl_halide):
Type: integer
Maximum number of aryl_mono_BrI groups (aryl_mono_BrI):
Type: integer
Maximum number of azide groups (azide):
Type: integer
Maximum number of aziridine groups (aziridine):
Type: integer
Maximum number of azo groups (azo):
Type: integer
Maximum number of azocyanamides groups (azocyanamides):
Type: integer
Functional Group Maximums: B
Maximum number of base groups (base):
Type: integer
Maximum number of benzyl_ether groups (benzyl_ether):
Type: integer
Maximum number of benzyloxycarbonyl_CBZ groups (benzyloxycarbonyl_CBZ):
Type: integer
Maximum number of beta_azo_carbonyl groups (beta_azo_carbonyl):
Type: integer
Maximum number of beta_carbonyl_quat_nitrogen groups (beta_carbonyl_quat_nitrogen):
Type: integer
Maximum number of beta_halo_carbonyl groups (beta_halo_carbonyl):
Type: integer
Functional Group Maximums: C to E
Maximum number of carbamate groups (carbamate):
Type: integer
Maximum number of carbamic_acid groups (carbamic_acid):
Type: integer
Maximum number of carbodiimide groups (carbodiimide):
Type: integer
Maximum number of carbonate groups (carbonate):
Type: integer
Maximum number of carbonyl groups (carbonyl):
Type: integer
Maximum number of carboxylic_acid groups (carboxylic_acid):
Type: integer
Maximum number of cation_C_Cl_I_P_or_S groups (cation_C_Cl_I_P_or_S):
Type: integer
Maximum number of charge groups (charge):
Type: integer
Maximum number of cyanohydrins groups (cyanohydrins):
Type: integer
Maximum number of cycloheximide_derivatives groups (cycloheximide_derivatives):
Type: integer
Maximum number of cyclopropyl groups (cyclopropyl):
Type: integer
Maximum number of cytochalasin_derivatives groups (cytochalasin_derivatives):
Type: integer
Maximum number of di_peptide groups (di_peptide):
Type: integer
Maximum number of dioxamide_6MR groups (dioxamide_6MR):
Type: integer
Maximum number of dioxolane_5MR groups (dioxolane_5MR):
Type: integer
Maximum number of disulfide groups (disulfide):
Type: integer
Maximum number of dithioacetal groups (dithioacetal):
Type: integer
Maximum number of dye groups (dye):
Type: integer
Maximum number of enamine groups (enamine):
Type: integer
Maximum number of enol_ether groups (enol_ether):
Type: integer
Maximum number of epoxide groups (epoxide):
Type: integer
Maximum number of ester groups (ester):
Type: integer
Maximum number of ether groups (ether):
Type: integer
Functional Group Maximums: F to H
Maximum number of fluorenylmethoxycarbonyl_Fmoc groups (fluorenylmethoxycarbonyl_Fmoc):
Type: integer
Maximum number of guanidine groups (guanidine):
Type: integer
Maximum number of halide groups (halide):
Type: integer
Maximum number of halo_alkene groups (halo_alkene):
Type: integer
Maximum number of halo_amine groups (halo_amine):
Type: integer
Maximum number of halopyrimidine groups (halopyrimidine):
Type: integer
Maximum number of hemiacetal groups (hemiacetal):
Type: integer
Maximum number of hemiaminal groups (hemiaminal):
Type: integer
Maximum number of hemiketal groups (hemiketal):
Type: integer
Maximum number of hetatm groups (hetatm):
Type: integer
Maximum number of hetero_hetero groups (hetero_hetero):
Type: integer
Maximum number of HOBT_esters groups (HOBT_esters):
Type: integer
Maximum number of hydrazine groups (hydrazine):
Type: integer
Maximum number of hydrazone groups (hydrazone):
Type: integer
Maximum number of hydroxamic_acid groups (hydroxamic_acid):
Type: integer
Maximum number of hydroxyl groups (hydroxyl):
Type: integer
Maximum number of hydroxylamine groups (hydroxylamine):
Type: integer
Functional Group Maximums: I to L
Maximum number of imidoyl_chlorides groups (imidoyl_chlorides):
Type: integer
Maximum number of imine groups (imine):
Type: integer
Maximum number of imino groups (imino):
Type: integer
Maximum number of iodine groups (iodine):
Type: integer
Maximum number of iodoso groups (iodoso):
Type: integer
Maximum number of iodoxy groups (iodoxy):
Type: integer
Maximum number of isocyanate groups (isocyanate):
Type: integer
Maximum number of isonitrile groups (isonitrile):
Type: integer
Maximum number of isothiocyanate groups (isothiocyanate):
Type: integer
Maximum number of ketal groups (ketal):
Type: integer
Maximum number of ketone groups (ketone):
Type: integer
Maximum number of lactam groups (lactam):
Type: integer
Maximum number of lactone groups (lactone):
Type: integer
Maximum number of lawesson_s_reagent groups (lawesson_s_reagent):
Type: integer
Maximum number of long_aliphatic_chain groups (long_aliphatic_chain):
Type: integer
Functional Group Maximums: M to N
Maximum number of malonic groups (malonic):
Type: integer
Maximum number of mercapto groups (mercapto):
Type: integer
Maximum number of methoxyethoxymethyl_MEM groups (methoxyethoxymethyl_MEM):
Type: integer
Maximum number of methyl_ketone groups (methyl_ketone):
Type: integer
Maximum number of michael_acceptor groups (michael_acceptor):
Type: integer
Maximum number of monensin_derivatives groups (monensin_derivatives):
Type: integer
Maximum number of mono_alkene groups (mono_alkene):
Type: integer
Maximum number of mono_alkyne groups (mono_alkyne):
Type: integer
Maximum number of nitrile groups (nitrile):
Type: integer
Maximum number of nitro groups (nitro):
Type: integer
Maximum number of nitroso groups (nitroso):
Type: integer
Maximum number of N_methoyl groups (N_methoyl):
Type: integer
Maximum number of nonacylhydrazone groups (nonacylhydrazone):
Type: integer
Maximum number of noxide groups (noxide):
Type: integer
Maximum number of N_P_S_Halides groups (N_P_S_Halides):
Type: integer
Maximum number of NS_beta_halothyl groups (NS_beta_halothyl):
Type: integer
Maximum number of nucleophile groups (nucleophile):
Type: integer
Functional Group Maximums: O to R
Maximum number of organometallic groups (organometallic):
Type: integer
Maximum number of oxalyl groups (oxalyl):
Type: integer
Maximum number of oxaziridine groups (oxaziridine):
Type: integer
Maximum number of oxime groups (oxime):
Type: integer
Maximum number of oxygen_cation groups (oxygen_cation):
Type: integer
Maximum number of paranitrophenyl_esters groups (paranitrophenyl_esters):
Type: integer
Maximum number of pentafluorophenyl_esters groups (pentafluorophenyl_esters):
Type: integer
Maximum number of perhalo_ketone groups (perhalo_ketone):
Type: integer
Maximum number of peroxide groups (peroxide):
Type: integer
Maximum number of phenol groups (phenol):
Type: integer
Maximum number of phosphanes groups (phosphanes):
Type: integer
Maximum number of phosphinic_acid groups (phosphinic_acid):
Type: integer
Maximum number of phosphonamide groups (phosphonamide):
Type: integer
Maximum number of phosphonic_acid groups (phosphonic_acid):
Type: integer
Maximum number of phosphonic_ester groups (phosphonic_ester):
Type: integer
Maximum number of phosphonylnitrile groups (phosphonylnitrile):
Type: integer
Maximum number of phosphoramides groups (phosphoramides):
Type: integer
Maximum number of phosphoranes groups (phosphoranes):
Type: integer
Maximum number of phosphoric_acid groups (phosphoric_acid):
Type: integer
Maximum number of phosphoric_ester groups (phosphoric_ester):
Type: integer
Maximum number of phosphoryl groups (phosphoryl):
Type: integer
Maximum number of phthalimides_PHT groups (phthalimides_PHT):
Type: integer
Maximum number of polyenes groups (polyenes):
Type: integer
Maximum number of primary_amine groups (primary_amine):
Type: integer
Maximum number of propiolactones groups (propiolactones):
Type: integer
Maximum number of pseudo_amine groups (pseudo_amine):
Type: integer
Maximum number of quinone groups (quinone):
Type: integer
Maximum number of ring groups (ring):
Type: integer
Functional Group Maximums: S
Maximum number of saponin_derivates groups (saponin_derivates):
Type: integer
Maximum number of SCN2 groups (SCN2):
Type: integer
Maximum number of secondary_amine groups (secondary_amine):
Type: integer
Maximum number of squalestatin_derivatives groups (squalestatin_derivatives):
Type: integer
Maximum number of suflide groups (suflide):
Type: integer
Maximum number of sulfinimine groups (sulfinimine):
Type: integer
Maximum number of sulfinylthio groups (sulfinylthio):
Type: integer
Maximum number of sulfonamide groups (sulfonamide):
Type: integer
Maximum number of sulfone groups (sulfone):
Type: integer
Maximum number of sulfonic_acid groups (sulfonic_acid):
Type: integer
Maximum number of sulfonic_ester groups (sulfonic_ester):
Type: integer
Maximum number of sulfonimine groups (sulfonimine):
Type: integer
Maximum number of sulfonyl_halide groups (sulfonyl_halide):
Type: integer
Maximum number of sulfonylnitrile groups (sulfonylnitrile):
Type: integer
Maximum number of sulfonylurea groups (sulfonylurea):
Type: integer
Maximum number of sulfoxide groups (sulfoxide):
Type: integer
Functional Group Maximums: T to U
Maximum number of t_butyldimethylsilyl_TBDMS groups (t_butyldimethylsilyl_TBDMS):
Type: integer
Maximum number of t_butyldiphenylsilyl_TBDPS groups (t_butyldiphenylsilyl_TBDPS):
Type: integer
Maximum number of t_butyl_ether groups (t_butyl_ether):
Type: integer
Maximum number of t_butoxycarbonyl_tBOC groups (t_butoxycarbonyl_tBOC):
Type: integer
Maximum number of terminal_vinyl groups (terminal_vinyl):
Type: integer
Maximum number of tertiary_amine groups (tertiary_amine):
Type: integer
Maximum number of tetrahydropyran_THP groups (tetrahydropyran_THP):
Type: integer
Maximum number of thioamide groups (thioamide):
Type: integer
Maximum number of thiocarbamate groups (thiocarbamate):
Type: integer
Maximum number of thiocarbonyl groups (thiocarbonyl):
Type: integer
Maximum number of thioester groups (thioester):
Type: integer
Maximum number of thiol groups (thiol):
Type: integer
Maximum number of thiourea groups (thiourea):
Type: integer
Maximum number of triacyloxime groups (triacyloxime):
Type: integer
Maximum number of triazine groups (triazine):
Type: integer
Maximum number of tricarbo_phosphene groups (tricarbo_phosphene):
Type: integer
Maximum number of triflates groups (triflates):
Type: integer
Maximum number of triisopropylsilyl_TIPS groups (triisopropylsilyl_TIPS):
Type: integer
Maximum number of trimethylsilyl_TMS groups (trimethylsilyl_TMS):
Type: integer
Maximum number of unbranched_chain groups (unbranched_chain):
Type: integer
Maximum number of urea groups (urea):
Type: integer
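The functional group maximums above each cap the number of occurrences of one named group. As a sketch of how such settings might be turned into filter-file lines, the example below assumes a line format of "RULE <maximum> <group name>"; that format, the helper name, and the sample values are assumptions for illustration and should be checked against the OEMolProp documentation.

```python
from typing import Dict, List, Optional


def functional_group_rules(maximums: Dict[str, Optional[int]]) -> List[str]:
    """Build one rule line per functional group that has a maximum set."""
    lines: List[str] = []
    for name, maximum in maximums.items():
        if maximum is None:  # parameter left unset in the floe
            continue
        if maximum < 0:
            raise ValueError(f"maximum for {name!r} must not be negative")
        # Assumed line format: RULE <maximum> <group name>
        lines.append(f"RULE {maximum} {name}")
    return lines


# Hypothetical settings, for illustration only.
rules = functional_group_rules({"quinone": 0, "alkyl_halide": 2, "ester": None})
print("\n".join(rules))
```

A maximum of 0 rejects any molecule containing the group, while an unset parameter produces no line at all.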