Generate 3D Similarity Matrix

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

Description

This floe calculates NxN 3D similarity scores in parallel and reports a basic distribution of those scores. Optionally, it also writes the similarity matrix as a 2D numpy array to a .npy binary file, along with a second .npy file containing a 1D array of SMILES strings that label the rows of the 2D array.

If and only if the input molecules do not have coordinates assigned, the floe will generate a single conformer for each molecule using OMEGA. For multiconformer molecules, only the active conformer will be used for similarity calculation.

Writing the matrix can require a large amount of memory when the input is large. In that case, increase the Advanced: Matrix File Writer Memory parameter.

Promoted Parameters

Title in user interface (promoted name)

3D Similarity Calculation

3D Similarity Score Function (score_type):

  • Type: string

  • Default: Shape Tanimoto

  • Choices: ['Shape Tanimoto', 'Color Tanimoto', 'Tanimoto Combo']

Align Molecules (use_align): If set to true, molecules will be aligned before similarity calculation; otherwise they will retain input coordinates.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Similarity Score Cutoff (sim_cutoff): Similarity scores below this value will be set to 0.

  • Type: decimal

  • Default: 0.05
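
The effect of the cutoff can be illustrated outside the floe with a minimal numpy sketch. The function name apply_cutoff is hypothetical and not part of the floe; the sketch assumes that scores equal to the cutoff are kept, which follows from the wording "below this value".

```python
import numpy as np


def apply_cutoff(scores, cutoff=0.05):
    """Return a copy of `scores` with every value below `cutoff` set to 0.

    Hypothetical helper for illustration; the floe applies its own cutoff
    internally through the sim_cutoff parameter.
    """
    scores = np.asarray(scores, dtype=float)
    # Values strictly below the cutoff become 0; all others are unchanged.
    return np.where(scores < cutoff, 0.0, scores)


example = apply_cutoff([0.01, 0.05, 0.40])
```

With the default cutoff, only the first of the three example scores is zeroed.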

Outputs

Write Matrix To File (write_switch): Set to True to write the similarity matrix to a file. WARNING: setting this to True will cause the parallel calculation to run significantly more slowly, and the memory on the Matrix File Writer cube may need to be increased for matrices larger than 10,000 x 10,000.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Floe Report Name (floe_report_name): Name of report containing summary statistics.

  • Type: string

  • Default: 3D Similarity Score Report
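
For readers who want comparable statistics from a written matrix, the following is a rough sketch of how a score distribution could be summarized with numpy. It is not the floe's own report code. The function name summarize_scores is hypothetical, and the sketch assumes a symmetric matrix whose diagonal holds self-comparisons, so only the upper triangle is counted.

```python
import numpy as np


def summarize_scores(matrix, bins=10):
    """Summarize the pairwise scores in a square similarity matrix.

    Assumes the matrix is symmetric, so each pair is taken once from
    the upper triangle and the diagonal is skipped.
    """
    matrix = np.asarray(matrix, dtype=float)
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    pair_scores = matrix[rows, cols]
    counts, edges = np.histogram(pair_scores, bins=bins)
    return {
        "n_pairs": int(pair_scores.size),
        "mean": float(pair_scores.mean()),
        "min": float(pair_scores.min()),
        "max": float(pair_scores.max()),
        "histogram_counts": counts.tolist(),
        "histogram_edges": edges.tolist(),
    }


# Toy 3 x 3 matrix used only to exercise the function.
toy = np.array([[1.0, 0.2, 0.6],
                [0.2, 1.0, 0.4],
                [0.6, 0.4, 1.0]])
summary = summarize_scores(toy, bins=5)
```

The function expects at least two molecules; with fewer there are no pairs to summarize.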

Matrix File Name (similarity_matrix_filename): Name of the numpy binary file that will contain the full similarity matrix as a 2D numpy ndarray. The .npy file extension is required.

  • Type: string

  • Default: 3D_similarity_matrix.npy

SMILES row labels (row_label_filename): Name of the numpy binary file that will contain the SMILES labels for each row of the similarity matrix, as a 1D numpy ndarray. The .npy file extension is required.

  • Type: string

  • Default: 3D_similarity_matrix_SMILES_row_labels.npy
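
Once both files have been downloaded, they can be read back with numpy. The sketch below is one possible way to do this and to check that the labels match the matrix rows. The function name load_similarity_matrix is hypothetical, and the sketch assumes the label array is stored as a plain string array; if it was saved as an object array, np.load would need allow_pickle=True, which should only be used on trusted files. The toy files written at the end stand in for real floe output.

```python
import os
import tempfile

import numpy as np


def load_similarity_matrix(matrix_path, labels_path):
    """Load the matrix and its row labels, checking that the shapes agree."""
    matrix = np.load(matrix_path)
    labels = np.load(labels_path)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError(f"expected a square 2D matrix, got shape {matrix.shape}")
    if labels.shape != (matrix.shape[0],):
        raise ValueError("number of row labels does not match the matrix size")
    return matrix, labels


# Round trip on toy data, using the default file names from this page.
with tempfile.TemporaryDirectory() as tmp:
    matrix_path = os.path.join(tmp, "3D_similarity_matrix.npy")
    labels_path = os.path.join(tmp, "3D_similarity_matrix_SMILES_row_labels.npy")
    np.save(matrix_path, np.array([[1.0, 0.3], [0.3, 1.0]]))
    np.save(labels_path, np.array(["CCO", "c1ccccc1"]))
    matrix, labels = load_similarity_matrix(matrix_path, labels_path)
```

After loading, matrix[i, j] is the score between the molecules labeled labels[i] and labels[j].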

Output Text File (write_text): If set to True together with the Write Matrix To File switch above, the floe will also output text files for the row labels and the matrix, in addition to the binary .npy files.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Advanced: Matrix File Writer Memory (memory_mb): For large datasets, increase the memory available to the matrix writer cube.

  • Type: decimal

  • Default: 22000
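
As a rough guide for choosing this value, the size of the raw matrix alone can be estimated from the number of molecules. The sketch below assumes 8 bytes per score (float64) and that the parameter is expressed in megabytes, as the name memory_mb suggests; neither is confirmed by this page. Both function names are hypothetical. The result is a lower bound only, since the writer cube will need additional working memory beyond the array itself.

```python
import math


def dense_matrix_megabytes(n_molecules, bytes_per_value=8):
    """Size in MB of a dense N x N array, ignoring any overhead."""
    return n_molecules * n_molecules * bytes_per_value / 1_000_000


def max_molecules_for_budget(budget_mb, bytes_per_value=8):
    """Largest N whose raw N x N array fits within budget_mb megabytes."""
    return math.isqrt(budget_mb * 1_000_000 // bytes_per_value)


# Raw array size for the 10,000 x 10,000 case mentioned in the warning above.
estimate_10k = dense_matrix_megabytes(10_000)

# Upper limit on N if the default budget held nothing but the raw array.
limit_default = max_molecules_for_budget(22_000)
```

Under these assumptions, a 10,000 x 10,000 matrix occupies 800 MB before any overhead, so real limits will be lower than the raw-array figure suggests.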

Use Distance Matrix (use_distance): If True, distance, computed as (1.0 - similarity), will be output instead of similarity.

  • Type: boolean

  • Default: False

  • Choices: [True, False]
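
The same conversion can be applied afterwards to a similarity matrix that was written with this option left at False. This is a minimal sketch of the (1.0 - similarity) formula stated above; the function name similarity_to_distance is hypothetical.

```python
import numpy as np


def similarity_to_distance(similarity):
    """Convert similarity scores to distances as 1.0 - similarity."""
    return 1.0 - np.asarray(similarity, dtype=float)


# Toy 2 x 2 similarity matrix used only for illustration.
distance = similarity_to_distance([[1.0, 0.25],
                                   [0.25, 1.0]])
```

Identical molecules (similarity 1.0) map to a distance of 0, and a similarity of 0.25 maps to a distance of 0.75.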