Cluster Poses
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/FastROCS
Product-based/Gigadock
Role-based/Computational Chemist
Solution-based/Virtual-screening/Analysis/Clustering
Task-based/Data Science/Clustering
Description
Clusters poses based on 3D similarity.
The output cluster information is:
1) An integer cluster ID for each pose that identifies which cluster the pose belongs to.
2) An integer cluster rank for each pose that indicates the rank of the pose within its cluster (rank is based on the order of the poses in the dataset).
3) The Tanimoto of each pose to its cluster center; cluster centers will have a Tanimoto of 1.0.
The 3D similarity is calculated in place, i.e., the poses are not moved or overlaid before the 3D similarity is calculated.
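To illustrate what "in place" means, the sketch below computes a shape-only Tanimoto from Gaussian overlaps using the coordinates exactly as given, with no alignment step. It is a toy model under stated assumptions, not the floe's implementation: the functions `gaussian_overlap` and `tanimoto_in_place`, the equal-width Gaussians, and the example coordinates are all hypothetical, and the electrostatic part of the similarity is omitted.

```python
import math

def gaussian_overlap(coords_a, coords_b, falloff=2.0):
    """Sum of pairwise Gaussian overlaps, using the coordinates as given."""
    # Assumption: each atom has density exp(-alpha * r^2), equal to 0.5 at r = falloff.
    alpha = math.log(2.0) / falloff ** 2
    total = 0.0
    for ax, ay, az in coords_a:
        for bx, by, bz in coords_b:
            d2 = (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
            # Overlap of two equal-width Gaussians, up to a constant factor
            # that cancels in the Tanimoto ratio.
            total += math.exp(-0.5 * alpha * d2)
    return total

def tanimoto_in_place(coords_a, coords_b, falloff=2.0):
    """Tanimoto = O_AB / (O_AA + O_BB - O_AB), with no overlay of the poses."""
    o_ab = gaussian_overlap(coords_a, coords_b, falloff)
    o_aa = gaussian_overlap(coords_a, coords_a, falloff)
    o_bb = gaussian_overlap(coords_b, coords_b, falloff)
    return o_ab / (o_aa + o_bb - o_ab)

# Hypothetical three-atom pose and a copy translated by 3 distance units along x.
pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 3.0, y, z) for x, y, z in pose]

same = tanimoto_in_place(pose, pose)      # identical coordinates
moved = tanimoto_in_place(pose, shifted)  # same shape, different position
print(same, moved)
```

Because nothing is overlaid, the translated copy scores lower than the original even though its shape is identical. This is why two poses of the same molecule in different parts of a binding site can fall into different clusters.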
Promoted Parameters
Title in user interface (promoted name)
Inputs
Input Dataset (input_dataset): The dataset(s) from which to read records.
Required
Type: data_source
Outputs
Output Dataset (output_dataset): Dataset for the clustered output
Required
Type: dataset_out
Default: Clustered
Options
Cluster Tanimoto Threshold (cluster_tanimoto_threshold): Tanimoto similarity threshold used to determine cluster centers. Larger values result in more clusters, each containing fewer conformers/poses that are more similar to each other.
Type: decimal
Default: 0.9
Single Conformer/Pose Input (single_conformerpose_input): If ‘On’, the floe will assume that the input molecules are single conformer and place the clustering information directly on the output records. If ‘Off’, the floe will cluster all conformers of each molecule and place the clustering information on the child conformer records of the output records. If multi-conformer molecules are passed to this floe with this option ‘On’, the floe will use the active conformer of each molecule.
Type: boolean
Default: True
Choices: [True, False]
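The effect of the Cluster Tanimoto Threshold can be seen in a small sketch. The clustering algorithm used by the floe is not described here, so the code below assumes a simple greedy scheme purely for illustration: the unassigned pose with the most neighbours at or above the threshold becomes a center. The function `cluster_by_threshold` and the similarity matrix are hypothetical.

```python
def cluster_by_threshold(sim, threshold):
    """Greedy clustering on a precomputed similarity matrix (illustrative only)."""
    unassigned = set(range(len(sim)))
    cluster_id = [None] * len(sim)
    centers = []

    def neighbour_count(i):
        # Number of still-unassigned poses (including i itself) similar enough to i.
        return sum(1 for j in unassigned if sim[i][j] >= threshold)

    while unassigned:
        # max() keeps the first maximum, so ties go to the earliest pose in the dataset.
        center = max(sorted(unassigned), key=neighbour_count)
        members = [j for j in sorted(unassigned) if sim[center][j] >= threshold]
        for j in members:
            cluster_id[j] = len(centers)
        centers.append(center)
        unassigned -= set(members)
    return cluster_id, centers

# Hypothetical pairwise Tanimoto values for four poses, in dataset order.
sim = [
    [1.00, 0.95, 0.85, 0.20],
    [0.95, 1.00, 0.92, 0.20],
    [0.85, 0.92, 1.00, 0.20],
    [0.20, 0.20, 0.20, 1.00],
]

ids_090, centers_090 = cluster_by_threshold(sim, 0.90)
ids_093, centers_093 = cluster_by_threshold(sim, 0.93)
print(ids_090, centers_090)  # [0, 0, 0, 1] [1, 3]
print(ids_093, centers_093)  # [0, 0, 1, 2] [0, 2, 3]
```

Raising the threshold from 0.90 to 0.93 splits the first cluster, giving three smaller clusters instead of two. In the 0.90 run the center of the first cluster is the second pose in the dataset, so a pose that comes first within its cluster need not be the center.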
Options: Advanced
Charge Model (charge_model): Charge model to use in the electrostatic similarity part of the 3D similarity calculation.
Type: string
Default: elf10
Choices: [‘elf10’, ‘mmff’, ‘input’]
Shape Falloff (shape_falloff): Distance at which the Gaussian atom density is half its maximum value. This can be thought of roughly as the effective radius of the heavy atoms in the similarity model. Higher values mean that two poses with atoms that are not exactly on top of each other can still have a high similarity/Tanimoto.
Type: decimal
Default: 2.0
Charge Falloff (charge_falloff): Distance at which the Gaussian atom charge density is half its maximum value. Higher values mean that atoms with different partial charges are more likely to be considered similar and that poses with the same shape but differing partial charges can have a high similarity/Tanimoto.
Type: decimal
Default: 0.25
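The two falloff parameters are half-maximum distances, which can be related to a Gaussian width as follows. This is a minimal sketch assuming an unnormalized spherical Gaussian of the form exp(-alpha * r^2); the exact functional form used by the floe is not stated here, and `gaussian_density` is a hypothetical helper.

```python
import math

def gaussian_density(r, falloff):
    """Density relative to its maximum at distance r from the atom center.

    Assumes density(r) = exp(-alpha * r^2). Requiring density(falloff) = 0.5
    gives alpha = ln(2) / falloff^2.
    """
    alpha = math.log(2.0) / falloff ** 2
    return math.exp(-alpha * r ** 2)

# At the falloff distance the density is half its maximum, by construction.
shape_at_falloff = gaussian_density(2.0, falloff=2.0)     # default Shape Falloff
charge_at_falloff = gaussian_density(0.25, falloff=0.25)  # default Charge Falloff

# With the default Shape Falloff, an atom displaced by 1.0 still sees most of the density.
shape_at_one = gaussian_density(1.0, falloff=2.0)  # 2 ** -0.25, about 0.84

print(shape_at_falloff, charge_at_falloff, shape_at_one)
```

Under this assumption a larger falloff gives a smaller alpha and therefore a broader Gaussian, which is why higher values make the similarity more tolerant of atoms that do not sit exactly on top of each other.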
Output Fields
Pose Cluster ID (pose_cluster_id): Integer field with the identifier of the cluster the pose/conformer is associated with.
Type: field_parameter::int
Default: Pose Cluster ID
Pose Cluster Rank (pose_cluster_rank): Integer field with the rank of the pose/conformer within its cluster. This rank is based on the order in which the pose/conformer appears in the original dataset. Rank=1 is not necessarily a cluster center. Cluster centers have a ‘Pose Cluster Tanimoto’ of 1.0 (see the field parameter of that name).
Type: field_parameter::int
Default: Pose Cluster Rank
Pose Cluster Tanimoto (pose_cluster_tanimoto): Tanimoto similarity between the pose/conformer and its cluster center pose/conformer. A Tanimoto of 1.0 indicates the pose/conformer is the cluster center.
Type: field_parameter::float
Default: Pose Cluster Tanimoto
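The relationship between the three output fields can be sketched in a few lines. The records below are hypothetical values, not output from the floe, and `rank_within_clusters` is an illustrative helper that reproduces the documented ranking rule (order of appearance in the dataset, starting at 1).

```python
from collections import defaultdict

# Hypothetical records in dataset order:
# (pose name, Pose Cluster ID, Pose Cluster Tanimoto)
records = [
    ("pose_a", 0, 0.95),
    ("pose_b", 0, 1.0),
    ("pose_c", 1, 1.0),
    ("pose_d", 0, 0.92),
]

def rank_within_clusters(records):
    """Rank each pose by its order of appearance among poses with the same cluster ID."""
    seen = defaultdict(int)
    ranks = []
    for _name, cluster_id, _tanimoto in records:
        seen[cluster_id] += 1
        ranks.append(seen[cluster_id])
    return ranks

ranks = rank_within_clusters(records)

# Cluster centers are the poses whose Tanimoto to their own center is 1.0.
# (With real floating-point data a small tolerance may be safer than ==.)
centers = [name for name, _cid, tanimoto in records if tanimoto == 1.0]

print(ranks)    # [1, 2, 1, 3]
print(centers)  # ['pose_b', 'pose_c']
```

Here pose_a has rank 1 in cluster 0 because it appears first in the dataset, but the center of that cluster is pose_b. To find cluster centers, filter on the Tanimoto field, not on the rank.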
Input Fields
Molecule Field (Input Molecule Field): Field on the input records containing the molecules to cluster. If this field is left blank the primary (i.e., default) molecule field will be used.
Type: field_parameter::mol
Development
Catch exceptions (catch_exceptions): If Off, exception handling will be disabled for this cube.
Type: boolean
Default: True
Choices: [True, False]
Catch exceptions (parallel_catch_exception_methods): Specifies the methods of a parallel cube in which an exception will be caught and emitted to the exception port, if the port is connected. If the exception port is connected to an exception handler, this will stop the floe.
Type: string
Default: [‘begin’]
Choices: [‘begin’, ‘process’, ‘end’]
Enable cube timing report (time_all_cubes): If true, this cube will emit timing information to the timing_data port.
Type: boolean
Default: True
Choices: [True, False]