Fingerprint Search - Small Scale 2D Similarity

Fingerprint Search - Small Scale 2D Similarity is a tool for measuring the similarity between each molecule in an input dataset and a query (template) molecule, based on molecular fingerprints.

The minimal inputs to 2D Similarity are a query molecule and a search database of molecules, both in 1D (SMILES), 2D (SD, mol2), or 3D format.

The output of the 2D Similarity floe is a hitlist with the most similar molecules at the top.
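
The idea of scoring a database against a query and sorting the results can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the floe's actual fingerprint type and similarity measure are not stated here, so the sketch uses a toy fingerprint (character n-grams of a SMILES string, which is not chemically meaningful) and the Tanimoto coefficient. The function names are hypothetical and are not part of the floe.

```python
def toy_fingerprint(smiles, width=2):
    """Hypothetical stand-in for a molecular fingerprint: the set of
    overlapping character n-grams in a SMILES string. A real fingerprint
    is derived from the molecular graph, not from the text."""
    if len(smiles) < width:
        return {smiles}
    return {smiles[i:i + width] for i in range(len(smiles) - width + 1)}


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def rank_by_similarity(query, database):
    """Return (score, smiles) pairs sorted so the most similar come first."""
    query_fp = toy_fingerprint(query)
    scored = [(tanimoto(query_fp, toy_fingerprint(s)), s) for s in database]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, smiles in rank_by_similarity("CCO", ["c1ccccc1", "CCN", "CCO"]):
        print(f"{score:.3f}  {smiles}")
```

Swapping `toy_fingerprint` for a real fingerprint generator would leave the ranking logic unchanged, which is why the two steps are kept separate.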

Extra Required Parameters

  • Output Dataset (dataset_out) : Output dataset of successful calculations
    Default: Output for Fingerprint Search - Small Scale 2D Similarity
  • Size cutoff (integer) : Used for performance optimization. Below this size, query molecules are passed to the similarity cube initialization port.
    Default: 10000
  • Database Molecules (data_source) : Dataset containing one or more molecules to compare against the query
  • Added Boolean Field (Field Type: Bool) : The added boolean field.
    Default: Added Boolean
  • Failed Dataset (dataset_out) : Output dataset of failed calculations
    Default: Failed Output for Fingerprint Search - Small Scale 2D Similarity
  • Num Best Hits (integer) : Number of best-scoring molecules to keep
    Default: 500 Min: 1 Max: 20000
  • Float Sort Field (Field Type: Float) : Record field containing the key value to sort by
  • Query Molecule Title Field (Field Type: String) : The title of the query molecule used to obtain the score.
    Default: Query Molecule Title Field
  • Similarity Score Field (Field Type: Float) : Name for the field that stores fingerprint similarity scores.
    Default: Similarity Score
  • Deduplicate Results (boolean) : If set to True and multiple input molecules are the same, retain only the similarity result for the query molecule with the highest score
    Default: True
  • Query Molecule (data_source) : Dataset containing a single molecule to use as the query
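
How Num Best Hits and Deduplicate Results might interact can be sketched as follows. This is one reading of the parameter descriptions above, not the floe's implementation: it assumes each scored record carries a molecule identifier, the query molecule title, and a similarity score, that "the same molecule" means an identical identifier, and that deduplication happens before the hitlist is truncated. The function name and record layout are hypothetical.

```python
import heapq


def best_hits(scored_records, num_best_hits=500, deduplicate=True):
    """Keep the top-scoring records.

    scored_records: iterable of (molecule_id, query_title, score) tuples.
    With deduplicate=True, only the highest-scoring record is kept for
    each molecule_id before the list is cut to num_best_hits entries.
    """
    if deduplicate:
        best = {}
        for mol_id, query_title, score in scored_records:
            # Replace the stored record only when this one scores higher.
            if mol_id not in best or score > best[mol_id][2]:
                best[mol_id] = (mol_id, query_title, score)
        scored_records = best.values()
    # nlargest returns the records sorted from highest to lowest score.
    return heapq.nlargest(num_best_hits, scored_records, key=lambda r: r[2])


if __name__ == "__main__":
    records = [
        ("mol-1", "query-A", 0.4),
        ("mol-1", "query-B", 0.9),
        ("mol-2", "query-A", 0.7),
        ("mol-3", "query-A", 0.1),
    ]
    for record in best_hits(records, num_best_hits=2):
        print(record)
```

Using a bounded selection such as `heapq.nlargest` avoids sorting the full result set when only a few hundred hits are kept.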