MMDS 02. Generate Target and Family Dataset
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/SPRUCE
Product-based/MMDS
Role-based/MMDS Staff User/MMDS Data Prep
Solution-based/Virtual-screening/Target Preparation
Solution-based/Hit to Lead/Target Preparation
Solution-based/Hit to Lead/Target Preparation/Structural Data Preparation
Task-based/Data Science/Clustering
Task-based/Target Prep & Analysis/Protein Preparation
Task-based/Target Prep & Analysis/Protein Similarity Search
Description
MMDS requires protein targets to be categorized so that related target families can be determined. Targets are identified by their UniProt sequences, and drug-like targets are categorized using the curated categorization strategy from the Guide to Pharmacology. Any remaining targets are sorted into an ‘Uncategorized’ category. Targets with multiple UniProt features are further subdivided by their feature traits, and the associated PDB structures are sorted based on sequence similarity.
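The grouping described above can be pictured with a minimal Python sketch. It is not the floe's implementation: the function name, the lookup table, the accession strings, and the family names are all hypothetical placeholders. Only the fallback to an ‘Uncategorized’ category is taken from the description.

```python
from collections import defaultdict

# Name of the fallback category, as given in the floe description.
UNCATEGORIZED = "Uncategorized"


def group_by_family(accessions, curated_families):
    """Group accessions by curated family, falling back to 'Uncategorized'.

    `curated_families` is a hypothetical dict mapping an accession to a
    family name; it stands in for a curated categorization source.
    """
    families = defaultdict(list)
    for accession in accessions:
        family = curated_families.get(accession, UNCATEGORIZED)
        families[family].append(accession)
    return dict(families)


# Placeholder accessions and family names, for illustration only.
curated = {"ACC_A": "Family 1", "ACC_B": "Family 1", "ACC_C": "Family 2"}
grouped = group_by_family(["ACC_A", "ACC_B", "ACC_C", "ACC_D"], curated)
```

Here `ACC_D` has no curated entry, so it lands in the ‘Uncategorized’ group while the other accessions are grouped by their family.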
For each target, the floe attempts to find a reference structure using Spruce. If no viable structures, or no viable reference structure, can be found, the result is a family of targets that cannot be superposed. Successful targets contain a list of related PDB structures and a reference structure, and are saved in the Target dataset. Where possible, the floe also attempts to find a common reference structure from the list of related targets. These are saved in the Family dataset along with all the categorization information.
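One way to read how targets are split across the output datasets is sketched below. This is an interpretation based only on the output descriptions on this page, not the floe's actual code: the `TargetRecord` class, its fields, and `route_target` are hypothetical, and the returned strings are the default output dataset names listed under the parameters.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TargetRecord:
    """Hypothetical stand-in for one target and its structural data."""
    uniprot_id: str
    pdb_codes: List[str] = field(default_factory=list)
    reference_pdb: Optional[str] = None


def route_target(target: TargetRecord) -> str:
    """Return the default name of the dataset a target would be written to.

    Assumed order of checks: no structures at all, then structures but no
    reference structure, then success.
    """
    if not target.pdb_codes:
        return "Target Dataset No Structures"
    if target.reference_pdb is None:
        return "Target Dataset No Ref Structure"
    return "Target Dataset"


# Placeholder identifiers, for illustration only.
no_structures = TargetRecord("ACC_A")
no_reference = TargetRecord("ACC_B", pdb_codes=["PDB_1", "PDB_2"])
successful = TargetRecord("ACC_C", pdb_codes=["PDB_3"], reference_pdb="PDB_3")
```

Under these assumptions, only `successful` would reach the Target dataset; the other two records would go to the corresponding "no structures" and "no ref structure" outputs.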
Potential Sources for Inputs: MMDS 01. Make/Update RCSB PDB Collection
Related Floes: MMDS 03. Structure Prep, MMDS 04. Add family data to MMDS
Parameter title in user interface (promoted name)
Target dataset no ref structure (data_out) type: dataset_out: Output dataset of targets that are not able to generate reference structures. Default: Target Dataset No Ref Structure
Target dataset (data_out) type: dataset_out: Output Target dataset. Default: Target Dataset
Output Dataset (data_out) type: dataset_out: Output dataset to write to. Default: retrieve_failures
Failed Family dataset (data_out) type: dataset_out: Output failed family dataset. Default: Failed Family Dataset
UniProt dataset map (data_out) type: dataset_out: Output UniProt dataset map. Default: UniProt Dataset Map
Failed target dataset (data_out) type: dataset_out: Output failed target dataset. Default: Failed Target Dataset
Target dataset no structure (data_out) type: dataset_out: Output dataset of targets without any structures able to generate reference structures. Default: Target Dataset No Structures
Existing dataset failures (data_out) type: dataset_out: Output existing dataset failures. Default: Existing Dataset Failures
Family dataset (data_out) type: dataset_out: Output Family dataset. Default: Family Dataset
PDB Structure Collection (collection) type: collection_source: Collection containing input PDB structures from source