C1. Cryptic Pocket Detection: Exposon Analysis
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/Molecular Dynamics
Solution-based/Virtual-screening/Target Preparation
Solution-based/Hit to Lead/Target Preparation/Enhanced Sampling
Solution-based/Target Identification/Target Preparation/Pocket Detection
Solution-based/Hit to Lead/Target Preparation/Cryptic Pocket Detection
Role-based/Computational Chemist
Task-based/Target Prep & Analysis/Pocket Detection
Description
This floe identifies cryptic pockets as groups of residues that undergo cooperative changes in their solvent exposure during Weighted Ensemble simulations.
Promoted Parameters
Title in user interface (promoted name)
Inputs from Protein Sampling
Solvated and Equilibrated Design Unit (du_data_in): This is the ‘Solvated and Equilibrated Design Unit’ output dataset from the ‘A1. Protein Sampling (for Cryptic Pockets): Solvate and Equilibrate Target Protein’ floe.
Required
Type: data_source
Topology File (top_file): PDB file specifying the system topology. This file is generated by the ‘A1. Protein Sampling (for Cryptic Pockets): Solvate and Equilibrate Target Protein’ floe.
Required
Type: file_in
Protein Sampling (Weighted Ensemble MD Simulation) Dataset (westdata_in): This is a ‘Protein Sampling Dataset’ output generated by ‘A3a. Protein Sampling (for Cryptic Pockets): Run a Weighted Ensemble MD Simulation’ or ‘A3b. Protein Sampling (for Cryptic Pockets): Continue a Weighted Ensemble MD Simulation’. The dataset should come from the most recent Protein Sampling job run for a given protein.
Required
Type: data_source
Inputs from Trajectory Analysis
Cluster Members Dataset (Per-Residue SASA) (clusters_data_in): This is the ‘Cluster Members’ dataset output from the ‘B2. Trajectory Analysis (for Cryptic Pockets): Cluster Conformations’ floe. The dataset contains the cluster labels assigned to each MD frame.
Required
Type: data_source
Cluster Medoids Dataset (Per-Residue SASA) (medoids_data_in): This is the ‘Cluster Medoids’ dataset output from the ‘B2. Trajectory Analysis (for Cryptic Pockets): Cluster Conformations’ floe. The dataset contains the MD features and atomic coordinates of the cluster medoids.
Required
Type: data_source
Outputs
Ranked Pockets (pockets_data_out): Output dataset containing information on each pocket, including the pocket residues, the center of mass (COM) distance from the functionally important site, and other pocket characteristics.
Required
Type: dataset_out
Default: Ranked Pockets - Exposon Analysis
MSM Weighted Medoids (msm_weights_data_out): Output dataset containing the cluster medoids and equilibrium populations of clusters derived from Markov state estimation.
Required
Type: dataset_out
Default: MSM Weighted Medoids - Exposon Analysis
Failure Output Dataset (failure_data_out): Failure output dataset to write to.
Required
Type: dataset_out
Default: Failure - Exposon Analysis
Floe Report Output Collection (floe_report_out):
Required
Type: string
Default: Floe Report - Exposon Analysis
Exposon Analysis Inputs
Important Residues (select_string_key_resids): String selecting functionally important residues, e.g. active site residues or a known disease mutation. The distance between the center of mass (COM) of the selected residues and the COM of the pocket residues is computed as a pocket ranking parameter. Residues should be specified in <residue number><chain id> format. For example, an active site consisting of residues 11 and 12 (chain A) and residue 23 (chain B) should be specified as 11A, 12A, 23B. Residue numbers and chain IDs should match those given in the PDB file generated by the ‘A1. Protein Sampling (for Cryptic Pockets): Solvate and Equilibrate Target Protein’ floe.
Required
Type: string
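The <residue number><chain id> format described above can be checked before launching the floe. The following is a minimal sketch, not the floe's own parser: the helper name parse_residue_selection is hypothetical, and it assumes single-letter chain IDs and comma-separated tokens as in the 11A, 12A, 23B example.

```python
import re

# Hypothetical helper for illustration; not part of the floe's API.
# Assumes each token is an integer residue number followed by a
# single-letter chain ID, e.g. "11A".
_RESIDUE_TOKEN = re.compile(r"^(-?\d+)([A-Za-z])$")


def parse_residue_selection(selection):
    """Split a string such as '11A, 12A, 23B' into (number, chain) pairs."""
    residues = []
    for token in selection.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray or trailing commas
        match = _RESIDUE_TOKEN.match(token)
        if match is None:
            raise ValueError(f"Cannot parse residue token: {token!r}")
        residues.append((int(match.group(1)), match.group(2)))
    return residues


parsed = parse_residue_selection("11A, 12A, 23B")
print(parsed)
```

A token that does not fit the assumed pattern (for example a chain ID placed before the residue number) raises a ValueError, which makes typos visible before a job is submitted.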
Pocket Ranking Parameter (pocket_ranking_metric): Metric used for ranking the pockets. ‘Key distances’ ranks the pockets in ascending order of the center of mass distance between the functionally important residue(s) and the pocket residues. ‘Intra-pocket cooperativity’ ranks the pockets by the average strength of cooperativity between pocket residues.
Required
Type: string
Default: Key distances
Choices: [‘Key distances’, ‘Intra-pocket cooperativity’]
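To illustrate the ‘Key distances’ option, the sketch below computes a mass-weighted center of mass for the important residues and for each pocket, then sorts the pockets by ascending distance. It is an assumption-laden illustration rather than the floe's implementation: the function names, the (mass, (x, y, z)) atom representation, and the example coordinates are all hypothetical.

```python
import math


def center_of_mass(atoms):
    """Mass-weighted mean position of atoms given as (mass, (x, y, z))."""
    atoms = list(atoms)
    total_mass = sum(mass for mass, _ in atoms)
    return tuple(
        sum(mass * xyz[i] for mass, xyz in atoms) / total_mass
        for i in range(3)
    )


def rank_by_key_distance(key_atoms, pockets):
    """Return (pocket name, COM distance) pairs, closest pocket first."""
    key_com = center_of_mass(key_atoms)
    distances = [
        (name, math.dist(key_com, center_of_mass(atoms)))
        for name, atoms in pockets.items()
    ]
    return sorted(distances, key=lambda item: item[1])


# Hypothetical coordinates (in Å) chosen only to make the arithmetic easy.
key_atoms = [(12.0, (0.0, 0.0, 0.0)), (12.0, (2.0, 0.0, 0.0))]
pockets = {
    "pocket_1": [(12.0, (1.0, 0.0, 4.0))],
    "pocket_2": [(12.0, (1.0, 3.0, 0.0)), (14.0, (1.0, 3.0, 0.0))],
}
ranked = rank_by_key_distance(key_atoms, pockets)
print(ranked)
```

In this example the important residues have their COM at (1, 0, 0), so pocket_2 (3 Å away) ranks ahead of pocket_1 (4 Å away).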
Exposed Residue Minimum SASA (threshold): Threshold (in Ų) used for binary classification of the residue exposure state when computing mutual information. If the solvent accessible surface area (SASA) of a residue is greater than the threshold value, the residue is classified as exposed (1). Otherwise, it is classified as buried (0).
Type: decimal
Default: 2.0
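The thresholding rule above, and the way binary exposure states feed a mutual information estimate, can be sketched as follows. This is a simplified illustration and not the floe's implementation: the function names and the SASA values are hypothetical, every frame is given equal weight, and mutual information is reported in nats.

```python
import math
from collections import Counter


def classify_exposure(sasa_values, threshold=2.0):
    """Label each frame 1 (exposed) if SASA > threshold, else 0 (buried)."""
    return [1 if sasa > threshold else 0 for sasa in sasa_values]


def mutual_information(x, y):
    """Plug-in mutual information (nats) between two equal-length series."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x = Counter(x)
    count_y = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


# Hypothetical per-frame SASA values (Ų) for three residues.
states_a = classify_exposure([0.5, 3.1, 4.0, 1.2])
states_b = classify_exposure([0.1, 2.5, 5.0, 0.3])
states_c = classify_exposure([3.0, 3.0, 0.0, 0.0])

mi_ab = mutual_information(states_a, states_b)
mi_ac = mutual_information(states_a, states_c)
print(states_a, states_b, states_c, mi_ab, mi_ac)
```

Residues a and b become exposed and buried in the same frames, so their mutual information is high, while a and c vary independently and score zero. Raising the threshold would classify fewer frames as exposed and can change which residue pairs appear coupled.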