MMDS 02. Generate Target and Family Dataset

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Product-based/SPRUCE

  • Product-based/MMDS

  • Role-based/MMDS Staff User/MMDS Data Prep

  • Solution-based/Virtual-screening/Target Preparation

  • Solution-based/Hit to Lead/Target Preparation

  • Solution-based/Hit to Lead/Target Preparation/Structural Data Preparation

  • Task-based/Data Science/Clustering

  • Task-based/Target Prep & Analysis/Protein Preparation

  • Task-based/Target Prep & Analysis/Protein Similarity Search

Description

MMDS requires protein targets to be categorized in order to determine related target families. Targets are identified by their UniProt sequences, and drug-like targets are categorized using the curated categorization strategy from the Guide to Pharmacology. Any remaining targets are sorted into an ‘Uncategorized’ category. Targets with multiple UniProt features are further subdivided into their feature traits, and the associated PDB structures are sorted based on sequence similarity.
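
The categorization step described above can be pictured with a minimal sketch. This is not the floe's implementation: the lookup table, accession strings, and family names below are hypothetical placeholders standing in for the curated Guide to Pharmacology data, and only the ‘Uncategorized’ fallback is taken from the description.

```python
from collections import defaultdict

# Hypothetical lookup: UniProt accession -> curated family name.
# In the floe this information comes from the Guide to Pharmacology;
# these entries are placeholders for illustration only.
CURATED_FAMILIES = {
    "ACC0001": "Family A",
    "ACC0002": "Family B",
}

# Fallback category named in the description above.
UNCATEGORIZED = "Uncategorized"


def categorize_targets(accessions, curated=CURATED_FAMILIES):
    """Group accessions by curated family, with a fallback bucket."""
    families = defaultdict(list)
    for accession in accessions:
        # Anything missing from the curated table goes to the fallback.
        families[curated.get(accession, UNCATEGORIZED)].append(accession)
    return dict(families)


result = categorize_targets(["ACC0001", "ACC0002", "ACC0003"])
```

Here "ACC0003" has no curated entry, so it lands in the ‘Uncategorized’ group while the other two are grouped under their placeholder families.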

The floe attempts to find a reference structure for each target using Spruce. Failure to find viable structures, or a viable reference structure, results in a family of targets that cannot be superposed. Successful targets contain a list of related PDB structures and a reference structure, and are saved in a Target dataset. If possible, the floe also attempts to find a common reference structure from the list of related targets. These are saved in the Family dataset along with all the categorization information.
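
One way to read the outcomes above is as a routing decision per target. The sketch below is an assumption-laden illustration, not the floe's logic: the `Target` record and its field names are hypothetical, and the mapping of each condition to an output dataset is inferred from the default dataset names in the parameter list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    # Hypothetical record; these fields are not the floe's actual schema.
    uniprot_id: str
    pdb_codes: List[str] = field(default_factory=list)
    reference: Optional[str] = None  # PDB code chosen as reference, if any


def route_target(target: Target) -> str:
    """Return the default name of the output dataset a target would go to.

    Assumed routing: no structures at all, then no reference structure,
    then the successful case.
    """
    if not target.pdb_codes:
        return "Target Dataset No Structures"
    if target.reference is None:
        return "Target Dataset No Ref Structure"
    return "Target Dataset"


# Placeholder PDB codes, for illustration only.
no_structures = route_target(Target("ACC0001"))
no_reference = route_target(Target("ACC0002", pdb_codes=["1ABC"]))
success = route_target(Target("ACC0003", pdb_codes=["1ABC", "2XYZ"], reference="1ABC"))
```

Only targets that reach the last branch carry both a list of related PDB structures and a reference structure, which matches the description of what the Target dataset contains.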

Potential Sources for Inputs: MMDS 01. Make/Update RCSB PDB Collection

Related Floes: MMDS 03. Structure Prep, MMDS 04. Add family data to MMDS

Parameter title in user interface (promoted name)

  • Target dataset no ref structure (data_out) type: dataset_out: Output dataset of targets that are not able to generate reference structures.
    Default: Target Dataset No Ref Structure

  • Target dataset (data_out) type: dataset_out: Output Target dataset
    Default: Target Dataset

  • Output Dataset (data_out) type: dataset_out: Output dataset to write to
    Default: retrieve_failures

  • Failed Family dataset (data_out) type: dataset_out: Output failed family dataset
    Default: Failed Family Dataset

  • UniProt dataset map (data_out) type: dataset_out: Output UniProt dataset map
    Default: UniProt Dataset Map

  • Failed target dataset (data_out) type: dataset_out: Output failed target dataset
    Default: Failed Target Dataset

  • Target dataset no structure (data_out) type: dataset_out: Output dataset of targets that have no structures able to generate a reference structure.
    Default: Target Dataset No Structures

  • Existing dataset failures (data_out) type: dataset_out: Output existing dataset failures
    Default: Existing Dataset Failures

  • Family dataset (data_out) type: dataset_out: Output Family dataset
    Default: Family Dataset

  • PDB Structure Collection (collection) type: collection_source: Collection containing input PDB structures from source