Generate 3D Similarity Matrix
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Description
This floe calculates N x N 3D similarity scores in parallel and outputs a basic distribution of those scores. Optionally, it also writes the similarity matrix as a 2D numpy array to a binary .npy file, together with a second .npy file containing a 1D array of SMILES strings that label the molecule for each row of the matrix.
If and only if the input molecules do not have coordinates assigned, the floe will generate a single conformer for each molecule using OMEGA. For multiconformer molecules, only the active conformer will be used for similarity calculation.
Note that writing the matrix for a large input set can require a large amount of memory. In that case, increase the Advanced: Matrix File Writer Memory parameter.
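To see why memory grows quickly with input size, the size of a dense N x N matrix can be estimated as in the sketch below. It assumes 8-byte floating-point values (numpy's default float64). The data type the floe actually writes is not stated here, and the writer cube will need working memory beyond the raw array. The function name is hypothetical.

```python
def dense_matrix_megabytes(n_molecules: int, bytes_per_value: int = 8) -> float:
    """Estimate the size in MB (10^6 bytes) of a dense N x N matrix.

    bytes_per_value=8 assumes float64 values; this is an assumption,
    not a documented property of the floe output.
    """
    return n_molecules * n_molecules * bytes_per_value / 1e6


# Print the estimate for a few input sizes.
for n in (1_000, 10_000, 50_000):
    print(f"{n:>6} molecules: ~{dense_matrix_megabytes(n):,.0f} MB")
```

Under these assumptions, a 10,000 x 10,000 matrix takes about 800 MB and a 50,000 x 50,000 matrix about 20,000 MB for the array alone.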
Promoted Parameters
Title in user interface (promoted name)
3D Similarity Calculation
3D Similarity Score Function (score_type):
Type: string
Default: Shape Tanimoto
Choices: ['Shape Tanimoto', 'Color Tanimoto', 'Tanimoto Combo']
Align Molecules (use_align): If set to true, molecules will be aligned before similarity calculation; otherwise they will retain input coordinates.
Type: boolean
Default: True
Choices: [True, False]
Similarity Score Cutoff (sim_cutoff): Similarity scores below this value will be set to 0.
Type: decimal
Default: 0.05
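The effect of the cutoff can be illustrated with a small numpy sketch. This is not the floe's implementation. It only shows the described behavior, assuming "below" means strictly less than the cutoff. The function name and the example scores are made up for illustration.

```python
import numpy as np


def apply_cutoff(scores: np.ndarray, cutoff: float = 0.05) -> np.ndarray:
    """Return a copy of scores with values strictly below cutoff replaced by 0."""
    return np.where(scores < cutoff, 0.0, scores)


# Made-up symmetric similarity scores for three molecules.
scores = np.array([[1.00, 0.03, 0.40],
                   [0.03, 1.00, 0.05],
                   [0.40, 0.05, 1.00]])

filtered = apply_cutoff(scores)
print(filtered)
```

Here the 0.03 entries become 0, while the entries equal to the cutoff (0.05) are kept.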
Outputs
Write Matrix To File (write_switch): Set to True to write the similarity matrix to a file. WARNING: setting this to True will cause the parallel calculation to run significantly more slowly, and the memory on the Matrix File Writer cube may need to be increased for matrices larger than 10,000 x 10,000.
Type: boolean
Default: False
Choices: [True, False]
Floe Report Name (floe_report_name): Name of report containing summary statistics.
Type: string
Default: 3D Similarity Score Report
Matrix File Name (similarity_matrix_filename): The .npy file extension is required. This is the numpy binary file containing the full similarity matrix as a 2D numpy ndarray.
Type: string
Default: 3D_similarity_matrix.npy
SMILES row labels (row_label_filename): The .npy file extension is required. This is the numpy binary file containing the SMILES labels for each row of the similarity matrix, as a 1D numpy ndarray.
Type: string
Default: 3D_similarity_matrix_SMILES_row_labels.npy
Output Text File (write_text): If set to True together with the Write Matrix To File switch above, the floe also writes text files for the row labels and the matrix, in addition to the binary .npy files.
Type: boolean
Default: False
Choices: [True, False]
Advanced: Matrix File Writer Memory (memory_mb): For large datasets, increase the memory available to the matrix writer cube.
Type: decimal
Default: 22000
Use Distance Matrix (use_distance): If True, distance, calculated as (1.0 - similarity), is output instead of similarity.
Type: boolean
Default: False
Choices: [True, False]
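Once the matrix and row-label files have been downloaded, they can be read with numpy. The sketch below is a minimal example under stated assumptions: the files use the default names listed above, the labels are stored as a plain string array, and the matrix holds similarities (Use Distance Matrix left at False). The helper functions are hypothetical, and the demo scores are made up so that the example runs without floe output.

```python
import os
import tempfile

import numpy as np


def load_matrix(matrix_path: str, labels_path: str):
    """Load the 2D score matrix and its 1D array of SMILES row labels."""
    matrix = np.load(matrix_path)
    # If the labels were saved as an object array, np.load needs
    # allow_pickle=True; only enable that for files you trust.
    labels = np.load(labels_path)
    if matrix.shape[0] != labels.shape[0]:
        raise ValueError("number of row labels does not match matrix rows")
    return matrix, labels


def most_similar(matrix: np.ndarray, labels: np.ndarray, row: int):
    """Return (label, score) of the best-scoring molecule other than row."""
    scores = matrix[row].astype(float)  # astype returns a copy
    scores[row] = -np.inf  # ignore the molecule's similarity to itself
    best = int(np.argmax(scores))
    return str(labels[best]), float(matrix[row, best])


def to_distance(similarity: np.ndarray) -> np.ndarray:
    """Convert similarity to distance as (1.0 - similarity)."""
    return 1.0 - similarity


# Made-up stand-in data, saved under the default file names.
demo_labels = np.array(["CCO", "CCN", "c1ccccc1"])
demo_matrix = np.array([[1.0, 0.8, 0.2],
                        [0.8, 1.0, 0.3],
                        [0.2, 0.3, 1.0]])

with tempfile.TemporaryDirectory() as tmp:
    matrix_path = os.path.join(tmp, "3D_similarity_matrix.npy")
    labels_path = os.path.join(
        tmp, "3D_similarity_matrix_SMILES_row_labels.npy")
    np.save(matrix_path, demo_matrix)
    np.save(labels_path, demo_labels)
    matrix, labels = load_matrix(matrix_path, labels_path)

neighbor, score = most_similar(matrix, labels, row=0)
distance = to_distance(matrix)
print(neighbor, score)
print(distance)
```

If the floe was run with Use Distance Matrix set to True, the file already contains distances, so the conversion step should be skipped and the nearest neighbor is the smallest off-diagonal value rather than the largest.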