Fast Fingerprint Similarity Search
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Role-based/Medicinal Chemist
Task-based/Library Prep & Design/Substructure & Similarity Search
Solution-based/Virtual-screening/DB Search/2D Similarity and SubSearch
Description
Searches a collection (prepared using either of the floes ‘Prepare Collection for Fast Similarity or Substructure Search from Dataset or File’) for similar molecules using OEGraphSim fingerprints.
Promoted Parameters
Title in user interface (promoted name)
Inputs
Prepared Fingerprint Collection (input_collection): This collection must be created with the floe ‘Prepare Collection for Fast Fingerprint Similarity Search’
Required
Type: collection_source
Query Molecule (data_in): Select query molecule from input dataset or sketcher
Required
Type: data_source
Similarity Score Cutoff (cutoff): Molecules with scores above this cutoff will be sent to the output hit collection and dataset.
Required
Type: decimal
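The cutoff behaviour described above can be sketched in a few lines of plain Python. This is only an illustration, not the floe's implementation: the function name `filter_hits` and the (identifier, score) tuple layout are hypothetical, and treating "above" as a strict comparison is an assumption.

```python
from typing import Iterable, List, Tuple


def filter_hits(scored: Iterable[Tuple[str, float]], cutoff: float) -> List[Tuple[str, float]]:
    """Keep (identifier, score) pairs whose score is strictly above the cutoff.

    Assumption: "above" is read as a strict '>' comparison.
    """
    return [(name, score) for name, score in scored if score > cutoff]


# Hypothetical scores for three molecules against one query.
scored = [("mol-a", 0.91), ("mol-b", 0.70), ("mol-c", 0.42)]
hits = filter_hits(scored, cutoff=0.70)
```

With these illustrative values only `mol-a` passes, because `mol-b` sits exactly on the cutoff rather than above it.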
Similarity Search Settings
Similarity Score Type (score_type): OEGraphSim Score type used in similarity calculation.
Required
Type: string
Default: Tanimoto
Choices: [‘Tanimoto’, ‘Tversky’, ‘Manhattan’, ‘Dice’, ‘Cosine’, ‘Euclid’]
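To give a feel for how several of these score types differ, the sketch below computes the textbook Tanimoto, Dice, Cosine, and Tversky coefficients on fingerprints modelled as sets of "on" bit positions. It does not call OEGraphSim; the function names, the set representation, and the choice to return 0.0 for empty fingerprints are assumptions made for illustration. Manhattan and Euclid are left out because their normalization is not stated here.

```python
import math
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0  # 0.0 for empty input is an assumption


def dice(a: Set[int], b: Set[int]) -> float:
    """Twice the shared bits divided by the sum of both bit counts."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: Set[int], b: Set[int]) -> float:
    """Shared bits divided by the geometric mean of both bit counts."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def tversky(a: Set[int], b: Set[int], alpha: float = 1.0, beta: float = 1.0) -> float:
    """Asymmetric score; alpha weights bits only in a, beta weights bits only in b."""
    both = len(a & b)
    denom = both + alpha * len(a - b) + beta * len(b - a)
    return both / denom if denom else 0.0


# Two toy fingerprints sharing bits 3 and 4.
query = {1, 2, 3, 4}
target = {3, 4, 5, 6}
scores = {
    "Tanimoto": tanimoto(query, target),
    "Dice": dice(query, target),
    "Cosine": cosine(query, target),
    "Tversky": tversky(query, target, alpha=0.5, beta=0.5),
}
```

In this formulation Tversky with alpha = beta = 1 reduces to Tanimoto, and with alpha = beta = 0.5 it reduces to Dice, which is why a single cutoff value can be stricter or looser depending on the score type chosen.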
Fingerprint Type (fp_type): Fingerprint type used in similarity calculation.
Required
Type: string
Default: CircularVS
Choices: [‘CircularVS’, ‘Circular’, ‘TreeVS’, ‘Tree’, ‘PathVS’, ‘Path’]
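When preparing these two settings in a script before launching the floe, it can help to validate them against the listed choices. The following is a minimal sketch under that assumption; the `SearchSettings` class is hypothetical and not part of any Orion or OpenEye API. Only the promoted names, defaults, and choices come from the parameter list above.

```python
from dataclasses import dataclass

# Choices copied from the parameter descriptions above.
SCORE_TYPES = ("Tanimoto", "Tversky", "Manhattan", "Dice", "Cosine", "Euclid")
FP_TYPES = ("CircularVS", "Circular", "TreeVS", "Tree", "PathVS", "Path")


@dataclass(frozen=True)
class SearchSettings:
    """Hypothetical container for the two similarity search settings."""

    score_type: str = "Tanimoto"
    fp_type: str = "CircularVS"

    def __post_init__(self) -> None:
        # Reject values outside the documented choices.
        if self.score_type not in SCORE_TYPES:
            raise ValueError(f"score_type must be one of {SCORE_TYPES}")
        if self.fp_type not in FP_TYPES:
            raise ValueError(f"fp_type must be one of {FP_TYPES}")


defaults = SearchSettings()
custom = SearchSettings(score_type="Dice", fp_type="TreeVS")
```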
Outputs
Floe Report Name (floe_report_name): Name of report containing summary statistics.
Type: string
Default: Fast Fingerprint Search Similarity Score Report
Collection Name (out_coll): Name of the collection to create
Required
Type: collection_sink
Default: Fast Similarity Search Hits Collection
Output Dataset (data_out): Output dataset to write to
Required
Type: dataset_out
Default: Fast Similarity Search Hits
Advanced
Maximum Number of Records in Output Dataset (n_records): Dataset size will be restricted to this many records.
Required
Type: integer
Default: 10000
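The record limit can be pictured as a cap applied to the list of hits. The description does not say which records are kept when the limit is exceeded, so the sketch below assumes the highest-scoring hits are retained; the function name `cap_records` and the (identifier, score) tuples are hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple


def cap_records(hits: Iterable[Tuple[str, float]], n_records: int = 10000) -> List[Tuple[str, float]]:
    """Return at most n_records hits, best score first.

    Assumption: when there are more hits than n_records, the
    highest-scoring ones are the ones kept.
    """
    return heapq.nlargest(n_records, hits, key=lambda hit: hit[1])


# Hypothetical hits, capped to two records.
top_two = cap_records([("mol-a", 0.50), ("mol-b", 0.90), ("mol-c", 0.70)], n_records=2)
```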
Records per Shard (records_per_shard): The target number of records in a shard. A value of 0 fills each shard up to the max_shard_bytes limit.
Required
Type: integer
Default: 10000
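The interaction between a record-count target and a byte limit can be sketched as a simple chunking loop. This is an illustrative model, not the floe's code: the function `shard_records`, the representation of records as `bytes`, and the example value for `max_shard_bytes` are all assumptions, as is the choice to enforce the byte limit even when a record target is set.

```python
from typing import Iterable, Iterator, List


def shard_records(
    records: Iterable[bytes],
    records_per_shard: int = 10000,
    max_shard_bytes: int = 1_000_000,  # hypothetical value, not given in the source
) -> Iterator[List[bytes]]:
    """Group records into shards by target count and byte limit.

    A records_per_shard of 0 disables the count target, so a shard
    is closed only when adding a record would exceed max_shard_bytes.
    """
    shard: List[bytes] = []
    size = 0
    for rec in records:
        over_count = records_per_shard > 0 and len(shard) >= records_per_shard
        over_bytes = size + len(rec) > max_shard_bytes
        if shard and (over_count or over_bytes):
            yield shard
            shard, size = [], 0
        shard.append(rec)
        size += len(rec)
    if shard:
        yield shard


# Five 10-byte records, sharded by count and then by bytes only.
records = [b"x" * 10] * 5
by_count = [len(s) for s in shard_records(records, records_per_shard=2, max_shard_bytes=1000)]
by_bytes = [len(s) for s in shard_records(records, records_per_shard=0, max_shard_bytes=30)]
```

With a target of 2 records the five records split into shards of 2, 2, and 1; with the target set to 0 and a 30-byte limit they split into shards of 3 and 2.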