MMDS 01. Make/Update RCSB PDB Collection
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/SPRUCE
Product-based/MMDS
Role-based/MMDS Staff User/MMDS Data Prep
Solution-based/Virtual-screening/Target Preparation
Solution-based/Hit to Lead/Target Preparation
Solution-based/Hit to Lead/Target Preparation/Structural Data Preparation
Task-based/Target Prep & Analysis/Protein Preparation
Description
MMDS requires an up-to-date collection of protein structure files. The .pdb format is preferred, but the floe will read .mmcif files if they are available. This floe creates a collection of protein structure files, or appends to the collection if it already exists.
Structures that are missing from the collection are downloaded from the RCSB database and added to it. If a provided structure is a newer version or a new entry, it is added to the collection.
AlphaFold structures are also saved in PDB format and updated as new versions are released, but these structures can be optionally excluded from the collection.
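The update rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the floe's actual implementation: `StructureEntry`, `select_entries_to_add`, the integer `version` field, and the `"rcsb"`/`"alphafold"` source labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureEntry:
    """Hypothetical record for one structure file; not an Orion or MMDS type."""
    entry_id: str          # e.g. a PDB ID such as "1ABC"
    version: int           # assumed integer revision counter
    source: str = "rcsb"   # "rcsb" or "alphafold" (labels assumed for this sketch)


def select_entries_to_add(collection, provided, include_alphafold=True):
    """Return the provided entries that are new or newer than the stored ones.

    `collection` maps entry_id -> StructureEntry already in the collection.
    """
    to_add = []
    for entry in provided:
        if entry.source == "alphafold" and not include_alphafold:
            continue  # optional exclusion of AlphaFold structures
        stored = collection.get(entry.entry_id)
        if stored is None or entry.version > stored.version:
            to_add.append(entry)
    return to_add


collection = {"1ABC": StructureEntry("1ABC", 1)}
provided = [
    StructureEntry("1ABC", 2),                     # newer version: added
    StructureEntry("2XYZ", 1),                     # new entry: added
    StructureEntry("AF-P12345", 1, "alphafold"),   # skipped when excluded
]
selected = select_entries_to_add(collection, provided, include_alphafold=False)
print([e.entry_id for e in selected])  # ['1ABC', '2XYZ']
```

An entry whose version matches the stored one is left alone, which is what keeps an update run cheaper than building the collection from scratch.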
Limitations: Because the floe makes a large number of internet calls to the RCSB, it has been throttled so that it does not overwhelm the servers at the Protein Data Bank.
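For readers unfamiliar with throttling, the idea can be sketched as a minimum delay between consecutive requests. The sketch below is an assumption about the general technique, not the floe's actual mechanism or rate limit: `Throttle`, `fetch_all`, and the 0.01 second interval are hypothetical, and `fetch` is a stand-in callable so the example makes no network calls.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(entry_ids, fetch, throttle):
    """Call `fetch` once per ID, spacing the calls with `throttle`."""
    results = {}
    for entry_id in entry_ids:
        throttle.wait()
        results[entry_id] = fetch(entry_id)
    return results


# Stand-in for a real download; it only builds a placeholder string.
def fake_fetch(entry_id):
    return f"contents of {entry_id}"


files = fetch_all(["1ABC", "2XYZ"], fake_fetch, Throttle(min_interval=0.01))
print(sorted(files))  # ['1ABC', '2XYZ']
```

Spacing requests this way trades run time for a predictable load on the remote server, which is consistent with the higher cost of building a collection from scratch.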
Related Floes: MMDS 02. Generate Target and Family Dataset, MMDS 03. Structure Prep, MMDS 06. Add structures to MMDS
Computational Cost Scaling: Creating a new PDB collection requires significantly more compute resources than updating a preexisting collection.
Parameter title in user interface (promoted name)
Output Dataset (data_out) type: dataset_out: Output dataset to write to. Default: retrieve_failures
Parameter title in user interface (promoted name)
Collection Name (coll_name) type: string: Name of a new or existing collection for biomolecular source data. For existing collections, an ID can also be used. When the name of an existing collection is supplied and multiple collections share that name, the latest one is updated. Default: RCSB PDB Collection
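The name-or-ID lookup described for Collection Name can be illustrated with a short sketch. Everything here is hypothetical: `CollectionRecord` and `resolve_collection` are not Orion types or functions, and the sketch assumes that "latest" means the most recently created collection.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CollectionRecord:
    """Hypothetical stand-in for a collection listing; not an Orion type."""
    id: int
    name: str
    created: datetime


def resolve_collection(identifier, records):
    """Return the record to update, or None if a new collection is needed."""
    # An exact ID match takes priority over a name match.
    for record in records:
        if str(record.id) == str(identifier):
            return record
    matches = [r for r in records if r.name == identifier]
    if not matches:
        return None  # no existing collection: a new one would be created
    # Assumption: "latest" is the most recently created collection.
    return max(matches, key=lambda r: r.created)


records = [
    CollectionRecord(101, "RCSB PDB Collection", datetime(2023, 1, 5)),
    CollectionRecord(202, "RCSB PDB Collection", datetime(2024, 3, 1)),
]
print(resolve_collection("RCSB PDB Collection", records).id)  # 202
print(resolve_collection(101, records).id)                    # 101
print(resolve_collection("Unknown name", records))            # None
```

Supplying an ID removes the ambiguity when several collections share a name.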