Substructure Search - Small Scale Substructure Matching

Substructure Search - Small Scale Substructure Matching is a tool for finding the molecules in a database that contain a pattern of interest, where the pattern is defined by an MDL query.

The minimal inputs to Substructure Search are an MDL query and a database of molecules to search, each in 1D (SMILES), 2D (SD, mol2), or 3D format.

The output of the Substructure Search floe is a hitlist ordered by molecular complexity, with the least complex molecules at the top.

Extra Required Parameters

  • Fragment Input (fragment_input) :
  • Unmatched Results (dataset_out) : Dataset of molecules that do not match the substructure query
    Default: Unmatched Results
  • Num Best Hits (integer) : Number of least complex molecules to keep
    Default: 500 Min: 1 Max: 20000
  • Float Sort Field (Field Type: Float) : Record field containing the key value to sort by
  • Export Complexity Terms (boolean) : Whether to output all the terms that are used to compute the total molecular complexity value
    Default: False
  • Total Molecular Complexity Field (Field Type: Float) : The name of the total molecular complexity field
    Default: Total Molecular Complexity
  • Search Results (dataset_out) : Output dataset for Substructure Search Results
    Default: Search Results
  • 2D Depiction Field (Field Type: String) : The name of the field that stores the 2D depiction in SVG image format.
    Default: “”
  • Highlight Tag (string) : The tag that is used to mark atoms and bonds being highlighted.
    Default: selected
  • Match Tag (string) : The tag that is used to identify matched atoms and bonds.
    Default: selected
  • Input Dataset (data_source) : Dataset to search
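
The way the parameters above fit together (a Search Results dataset, an Unmatched Results dataset, and a Num Best Hits cap on the least complex hits) can be sketched in plain Python. This is only an illustration under stated assumptions, not the floe's implementation: records are modeled as dictionaries, the function name split_and_rank is hypothetical, the substructure test is passed in as a caller-supplied predicate rather than performed here, and it is assumed that matched molecules beyond the cap are simply dropped.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple

# Field name taken from the "Total Molecular Complexity Field" default above.
COMPLEXITY_FIELD = "Total Molecular Complexity"

Record = Dict[str, object]


def split_and_rank(
    records: Iterable[Record],
    matches: Callable[[Record], bool],
    num_best_hits: int = 500,
) -> Tuple[List[Record], List[Record]]:
    """Hypothetical helper: split records into hits and non-hits,
    then keep the num_best_hits least complex hits, least complex first."""
    # Bounds mirror the documented Min/Max of the Num Best Hits parameter.
    if not 1 <= num_best_hits <= 20000:
        raise ValueError("num_best_hits must be between 1 and 20000")

    matched: List[Record] = []
    unmatched: List[Record] = []
    for rec in records:
        # 'matches' stands in for the real substructure test (an assumption).
        (matched if matches(rec) else unmatched).append(rec)

    # nsmallest returns the lowest-complexity records in ascending order.
    hits = heapq.nsmallest(
        num_best_hits, matched, key=lambda r: r[COMPLEXITY_FIELD]
    )
    return hits, unmatched


# Illustrative records; the complexity values and the 'has_query' flag
# are made up for this example.
example_records = [
    {"name": "a", COMPLEXITY_FIELD: 1.5, "has_query": True},
    {"name": "b", COMPLEXITY_FIELD: 0.5, "has_query": False},
    {"name": "c", COMPLEXITY_FIELD: 2.0, "has_query": True},
    {"name": "d", COMPLEXITY_FIELD: 3.0, "has_query": True},
]

search_results, unmatched_results = split_and_rank(
    example_records, lambda r: bool(r["has_query"]), num_best_hits=2
)
```

With the example records, search_results holds molecules "a" and "c" in that order, and unmatched_results holds only "b"; molecule "d" matches but falls outside the cap of two.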