The argument for this flag is the name of a file containing control parameters. The control parameter file acts to either replace or augment the command line interface. All parameters necessary for program execution may be provided in the control parameter file, although any command given explicitly on the command line will supersede options found in the parameter file. FILTER generates a new parameter file containing the full set of execution parameters upon every execution. The name of the parameter file written by FILTER is created by combining the prefix base name with the ‘.param’ extension.
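The precedence rule described above can be sketched as follows. This is a minimal illustration, not FILTER's actual implementation: the function name `merge_parameters` and the file names are hypothetical, and options are modelled as a plain dictionary.

```python
def merge_parameters(file_params, cli_params):
    """Return the effective settings for a run.

    Values from the control parameter file are used as the base, and any
    option given explicitly on the command line replaces the file's value.
    """
    effective = dict(file_params)   # start from the parameter file
    effective.update(cli_params)    # explicit command-line options win
    return effective


# Hypothetical settings read from a parameter file.
file_params = {"-in": "library.sdf", "-out": "passed.sdf", "-prefix": "filter"}
# Hypothetical options typed on the command line for this run.
cli_params = {"-out": "run2_passed.sdf", "-prefix": "run2"}

effective = merge_parameters(file_params, cli_params)
print(effective["-in"])   # library.sdf (only in the parameter file)
print(effective["-out"])  # run2_passed.sdf (command line supersedes the file)
```

The merged dictionary corresponds to the full set of execution parameters that FILTER writes back out to a new parameter file after each run.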
This optional parameter specifies a filter file to be used in place of the default filter. If only simple additions to the default filters are desired, please see the -newrule parameter. The file format for this file is described in the Filter Files chapter. There are several special reserved values for this parameter. If the -filter parameter is:
- Blockbuster: The recommended and default filter.
- Lead: Corresponds to the [Oprea-2000] lead-like filters, useful for preparing HTS databases.
- Drug: Corresponds to the [Oprea-2000] drug-like filters, useful for later-stage drug development. This is provided as a backwards-compatible option; experience has shown it to be too restrictive.
- Macrocycle: Non-macrocycle molecules will be removed.
- Nonmacrocycle: Macrocycle molecules will be removed.
- Frag: Filter for extracting small molecules that have attachment points from a database.
- Pains: Corresponds to the [Baell-2010] filter, useful for removing common substructures that interfere with bioassays.
The default filter files for these reserved types can be found in the data directory included with the applications bundle.
This specifies an optional molecular output file where the molecules that fail to pass the filter will be written. If this parameter is specified, then every molecule from -in will either be written to -out or to -fail. [default = null]
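The pass/fail split can be pictured with the sketch below. It assumes a hypothetical predicate `passes` standing in for the whole filter, and toy molecule records with a single made-up property; none of these names come from FILTER itself.

```python
def split_by_filter(molecules, passes):
    """Partition molecules into (passed, failed) lists.

    Every input molecule lands in exactly one of the two lists, mirroring
    the behaviour of writing to -out or to -fail.
    """
    passed, failed = [], []
    for mol in molecules:
        (passed if passes(mol) else failed).append(mol)
    return passed, failed


# Toy records; the molecular weight cutoff is purely illustrative.
molecules = [
    {"name": "mol_a", "mw": 320.4},
    {"name": "mol_b", "mw": 912.0},
    {"name": "mol_c", "mw": 150.2},
]
passed, failed = split_by_filter(molecules, lambda m: m["mw"] <= 500.0)
print([m["name"] for m in passed])  # ['mol_a', 'mol_c']
print([m["name"] for m in failed])  # ['mol_b']
```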
For an execution of the FILTER program, three general-purpose files are written in addition to the output file specified on the command line. These files are the “info”, “log” and “param” files. Normally, they all begin with the prefix “filter”. However, this can be overridden with the -prefix parameter. This is particularly useful if you want to run multiple FILTER jobs in the same directory without overwriting files.
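A small sketch of how the prefix determines the general-purpose file names. The helper `general_purpose_files` is hypothetical, and it assumes the log file follows the same prefix-plus-extension pattern as the info and param files.

```python
def general_purpose_files(prefix="filter"):
    """Map each general-purpose file kind to its name for a given prefix."""
    # Assumption: all three files are named "<prefix>.<kind>".
    return {kind: f"{prefix}.{kind}" for kind in ("info", "log", "param")}


print(general_purpose_files())
# {'info': 'filter.info', 'log': 'filter.log', 'param': 'filter.param'}

# Two jobs in the same directory with different prefixes do not collide.
job1 = set(general_purpose_files("job1").values())
job2 = set(general_purpose_files("job2").values())
print(job1.isdisjoint(job2))  # True
```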
Normally, FILTER writes an “info” file during the progress of any execution. At regular intervals during the execution, the info file is updated to reflect the most recent progress. If you are interested in seeing the progress of a run, it is best to either use the -dots parameter or look at the info file. Normally, the info file is saved as filename.info where “filename” is the prefix specified by the -prefix flag. However, one may use the -info flag to specify an info file with a separate name. [default = null]
This optional parameter can specify a file that contains filter rules to supplement the default filter or the filter specified with the -filter parameter. This parameter can be used to extend the functional group list used to filter. New filter rules can also be added directly to the filter file specified with -filter. [default = null]
This boolean flag controls whether the valence states of atoms will be checked. This check identifies molecules that are poorly specified, or represent nonsensical chemical states. For example, an oxygen with eight hydrogens attached or a carbon with a +9 formal charge would be rejected. [default = true]
This parameter is a SMARTS string that allows a user to require a specific functional group or substructure be present in all molecules that pass the filter. This feature is particularly useful for identification of reagents for library design. Selection items can also be added directly to the filter file specified with -filter. The command-line argument only allows specification of the SMARTS pattern, and exactly one copy of that functional group is required. If a user wants to specify a selection SMARTS with minimum and maximum number of occurrences other than 1, then they can use a SELECT statement inside the filter file. [default = null]
This flag determines where the logging information is written. The logging information includes a listing of the filter used, followed by a one line comment about why each molecule failed, or if it passed, an assessment of the probability that the compound lies in drug-like space.
This flag specifies a file for a tab-separated format table that includes all of the filter data. These data files are ready for import into a spreadsheet program for easy examination. Each column of the table includes one of the filter categories (such as “Molecular Weight”) and each row of the table corresponds to a single molecule. The table contains complete entries for all of the molecules in the input file regardless of whether they pass or fail. NOTE: Setting this flag will cause the program to slow down. [default = null]
This flag specifies that, if a table is being written, any values in the table that would cause a molecule to fail a filter will be flagged with an asterisk. This provides a means of seeing all the filters a molecule might fail, as the log file typically only provides the first failure. [default = false]
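The table layout and the asterisk flagging can be illustrated with the following sketch. The property limits, the record layout, and the choice to append the asterisk after the value are all assumptions made for illustration; the actual columns and formatting are determined by FILTER and the active filter file.

```python
import csv
import io

# Hypothetical allowed ranges; real limits come from the filter file.
LIMITS = {"Molecular Weight": (130.0, 781.0), "Rotatable Bonds": (0, 13)}


def table_rows(molecules, limits, flag_failures=False):
    """Build one header row plus one row per molecule (pass or fail)."""
    rows = [["Name"] + list(limits)]
    for mol in molecules:
        row = [mol["Name"]]
        for prop, (low, high) in limits.items():
            value = mol[prop]
            cell = str(value)
            # Mark every out-of-range value, not just the first failure.
            if flag_failures and not (low <= value <= high):
                cell += "*"
            row.append(cell)
        rows.append(row)
    return rows


def to_tsv(rows):
    """Serialize rows as tab-separated text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


molecules = [
    {"Name": "mol_a", "Molecular Weight": 320.4, "Rotatable Bonds": 5},
    {"Name": "mol_b", "Molecular Weight": 900.5, "Rotatable Bonds": 20},
]
print(to_tsv(table_rows(molecules, LIMITS, flag_failures=True)))
```

In this toy run both out-of-range values for `mol_b` are flagged, whereas a log entry would typically report only the first failure.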
This is the interval at which data is written to the filter.info file. The filter.info file contains running totals that are relevant to a FILTER run. Examining the filter.info file is the best means of checking on the progress of a FILTER execution. If this flag is 50, then the filter.info file is re-written every 50 molecules. [default = 5000]
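As a rough illustration of the interval, the sketch below lists the molecule counts at which the info file would be rewritten. The function name is hypothetical, and whether FILTER also performs a final write at the end of a run is not stated here, so it is not modelled.

```python
def info_update_points(n_molecules, interval=5000):
    """Molecule counts at which the info file would be rewritten."""
    return list(range(interval, n_molecules + 1, interval))


print(info_update_points(220, interval=50))  # [50, 100, 150, 200]
print(len(info_update_points(1_000_000)))    # 200 rewrites at the default interval
```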
This boolean flag determines whether compounds will be modified to reflect a pH=7.4 model. Note that this modifies the molecules permanently. [default = true]
This flag indicates an optional SMIRKS file. This file should contain the set of reactions you wish to use to normalize the connection table of your molecules. Please note: These reactions are applied before the filtering process and can significantly slow the filtering process. [default = null]
This flag specifies a molecule file containing the fragments that you consider to be salts. If any molecular entries contain multiple disconnected fragments, then any fragment contained in the “salt” file will be removed. If no file is specified, or if there are multiple disconnected fragments in a molecule record that are not in the salt file, then the first largest remaining fragment will be retained and all others discarded. [default = null]
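The fragment selection rule can be sketched as below. This is a simplified model under several assumptions: fragments are represented as (identifier, heavy-atom count) pairs, salt matching is done by comparing identifier strings rather than chemical structures, "first largest" is read as "largest, with ties going to the earliest fragment", and a record consisting only of salts yields nothing.

```python
def strip_salts(fragments, salts=frozenset()):
    """Pick the fragment to keep from one molecule record.

    fragments: list of (identifier, heavy_atom_count) in record order.
    salts: identifiers listed in the hypothetical salt file.
    """
    remaining = [frag for frag in fragments if frag[0] not in salts]
    if not remaining:
        return None  # assumption: nothing is left to keep
    # max() returns the first maximal item, so ties go to the earliest fragment.
    return max(remaining, key=lambda frag: frag[1])


record = [("[Na+]", 1), ("CC(=O)[O-]", 4), ("c1ccccc1", 6), ("C1CCCCC1", 6)]
print(strip_salts(record, salts={"[Na+]"}))  # ('c1ccccc1', 6)
print(strip_salts(record))                   # ('c1ccccc1', 6)
```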
This boolean flag indicates whether you want the molecular properties used for the filtering run (see -filter) to be attached to output molecules as SD tag data. This parameter will only work for .sdf or .oeb formats. [default = false]
This boolean flag determines whether FILTER writes a single dot (.) to the terminal (stdout) for every 500 compounds that are processed.
This flag accepts the name of a file that contains molecules for FILTER to skip. Only unique molecules that do not appear in this file will be sent to the output. Molecules are checked for uniqueness after FILTER has processed them; therefore, a parameter such as -pkanorm could change an input structure from a duplicate to a unique molecule.
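The order of operations matters here, as the following sketch shows. It is a toy model: molecules are plain strings compared for exact equality, `process` is a hypothetical stand-in for whatever modifications FILTER applies (such as a pH model), and only the skip-file check is modelled.

```python
def filter_unique(molecules, skip, process):
    """Process each molecule first, then drop those found in the skip list."""
    skip_set = set(skip)
    return [p for p in map(process, molecules) if p not in skip_set]


def toy_process(smiles):
    """Hypothetical normalization: deprotonate acetic acid, leave others as-is."""
    return {"CC(=O)O": "CC(=O)[O-]"}.get(smiles, smiles)


molecules = ["CC(=O)O", "c1ccccc1"]
skip = ["CC(=O)O", "c1ccccc1"]

# Both inputs appear in the skip list, but the first one is changed by
# processing before the check, so it no longer matches and is kept.
print(filter_unique(molecules, skip, toy_process))  # ['CC(=O)[O-]']
```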