Antibody SiteHopper-based Clustering
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/SPRUCE
Product-based/SiteHopper
Role-based/Computational Chemist
Solution-based/Virtual-screening/Analysis
Solution-based/Hit to Lead/Target Preparation/Structural Data Preparation
Solution-based/Biologics/Antibody Design/Target Preparation/CDR Analysis
Solution-based/Biologics/Antibody Design/Target Preparation/Surface Patch Analysis
Task-based/Target Prep & Analysis/Protein Similarity Search
Description
This floe takes all antibody structures from the input dataset(s) and clusters them based on their CDR surface patches. These patches are produced by SiteHopper patch generation. The input can be a single antibody in multiple configurations, different antibodies, or a combination of both.
Limitations: Because of the limits of clustering, this floe is not suitable for inputs with only a handful of structures. The more input structures there are, the better the clusters can be defined.
Potential Input Sources: Antibody Sequences to 3D Models Floe, Antibody Experimental Structure Prep Floe
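The clustering step described above can be pictured with a small, self-contained sketch. It is an illustration only: this page does not state which clustering algorithm the floe uses or how SiteHopper patch scores are turned into distances, so single-linkage grouping and medoid selection are assumptions swapped in here. The function names `cluster_by_cutoff` and `medoid` and the toy distance values are hypothetical.

```python
from typing import Dict, List, Sequence


def cluster_by_cutoff(dist: Sequence[Sequence[float]], cutoff: float = 2.0) -> List[List[int]]:
    """Single-linkage grouping (an assumed algorithm, not the floe's own):
    i and j share a cluster when a chain of distances <= cutoff connects them."""
    n = len(dist)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= cutoff:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def medoid(dist: Sequence[Sequence[float]], members: List[int]) -> int:
    """Return the member with the smallest summed distance to its cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


# Toy symmetric distance matrix with made-up values (not SiteHopper output).
toy = [
    [0.0, 1.0, 5.0, 6.0],
    [1.0, 0.0, 5.5, 6.5],
    [5.0, 5.5, 0.0, 1.5],
    [6.0, 6.5, 1.5, 0.0],
]
clusters = cluster_by_cutoff(toy, cutoff=2.0)
centroids = [medoid(toy, members) for members in clusters]
```

With only four structures the grouping is trivially determined by one or two pairwise distances, which is one way to see why the floe is not recommended for very small inputs.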
Promoted Parameters
Title in user interface (promoted name)
Distance Cutoff (Distance_cutoff) type: decimal: Sets the distance cutoff when running clustering on the distance matrix. Default: 2.0
Cube memory for NxN cube (aggregator_memory) type: decimal: Controls the memory needed to process the NxN matrix. The memory requirement depends on the input size N: roughly 0.1 GB for N=10k and 10 GB for N=100k. Default: 1800, Min: 256.0, Max: 8589934592
Output dataset of centroid records (centroids) type: dataset_out: Output dataset to write to
Chunk Size (chunk_size) type: integer: Controls the chunk size for patch overlays. Default: 50
Failure output dataset of records (fail_out) type: dataset_out: Output dataset to write to
Input dataset of 3D Antibodies (in) type: data_source: The dataset(s) to read records from
Sequence Numbering Scheme (numbering_scheme) type: string: Sets the numbering scheme applied to antibodies. Default: IMGT. Choices: IMGT, Chothia, Martin, Kabat
Output dataset of cluster records (out) type: dataset_out: Output dataset to write to
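The two memory figures quoted for the NxN cube (N=10k ~0.1GB, N=100k ~10GB) are consistent with a dense matrix stored at about one byte per entry. That per-entry size is inferred from the two data points and is not stated on this page, so the sketch below treats it as an assumption. The helper name `estimate_matrix_gb` is hypothetical.

```python
def estimate_matrix_gb(n_structures: int, bytes_per_entry: float = 1.0) -> float:
    """Rough size, in GB (1e9 bytes), of a dense N x N matrix.

    bytes_per_entry=1.0 is an assumption inferred from the documented
    figures, not a documented property of the floe.
    """
    return n_structures * n_structures * bytes_per_entry / 1e9


for n in (10_000, 100_000):
    print(f"N={n:,}: ~{estimate_matrix_gb(n):g} GB")
```

The point to take away is the quadratic growth: ten times as many input structures need about one hundred times as much memory, so the cube memory parameter should be raised well ahead of large runs.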
Titles of required parameters (promoted names)
Collection Name (collection_name) type: collection_sink: Name of the collection to create. Default: temp_ab_sh_patch_collection
Collection Name (collection_name) type: collection_sink: Name of the collection to create. Default: temp_ab_sh_input_collection
Optional parameters (promoted names)
Input dataset of 3D Antibodies (data_in) type: data_source: The dataset(s) to read records from
Sequence Numbering Scheme (numbering_scheme) type: string: Sets the numbering scheme applied to antibodies. Default: IMGT. Choices: IMGT, Chothia, Martin, Kabat
Output dataset of centroid records (data_out) type: dataset_out: Output dataset to write to
Distance Cutoff (distance_cutoff) type: decimal: Sets the distance cutoff when running clustering on the distance matrix. Default: 2.0
Cube memory for NxN cube (memory_mb) type: decimal: Controls the memory needed to process the NxN matrix. The memory requirement depends on the input size N: roughly 0.1 GB for N=10k and 10 GB for N=100k. Default: 1800, Min: 256.0, Max: 8589934592
Output dataset of cluster records (data_out) type: dataset_out: Output dataset to write to
Failure output dataset of records (data_out) type: dataset_out: Output dataset to write to
Chunk Size (chunk_size) type: integer: Controls the chunk size for patch overlays. Default: 50
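Before launching a job it can help to check a set of values against the defaults, choices, and limits listed above. The following is a minimal local sketch using only the parameter names and values from this page. It does not call any Orion API, and the function name `check_parameters` is hypothetical. The memory limits are compared as plain numbers because this page does not state their units.

```python
# Values copied from the parameter list above.
NUMBERING_SCHEMES = ("IMGT", "Chothia", "Martin", "Kabat")
MEMORY_MIN = 256.0
MEMORY_MAX = 8589934592

DEFAULTS = {
    "numbering_scheme": "IMGT",
    "distance_cutoff": 2.0,
    "memory_mb": 1800,
    "chunk_size": 50,
}


def check_parameters(overrides=None):
    """Merge overrides into the documented defaults and check documented limits."""
    params = dict(DEFAULTS)
    params.update(overrides or {})

    unknown = sorted(set(params) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown parameter(s): {unknown}")
    if params["numbering_scheme"] not in NUMBERING_SCHEMES:
        raise ValueError(f"numbering_scheme must be one of {NUMBERING_SCHEMES}")
    if not MEMORY_MIN <= params["memory_mb"] <= MEMORY_MAX:
        raise ValueError(f"memory_mb must lie between {MEMORY_MIN} and {MEMORY_MAX}")
    # chunk_size is documented as an integer; bool is excluded explicitly
    # because Python treats it as a subclass of int.
    if isinstance(params["chunk_size"], bool) or not isinstance(params["chunk_size"], int):
        raise ValueError("chunk_size must be an integer")
    return params


settings = check_parameters({"numbering_scheme": "Kabat", "distance_cutoff": 1.5})
```

Anything not overridden keeps its documented default, so `settings` here still carries a chunk size of 50 and a memory value of 1800.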