Protein Sequence to AI Folded Structure Prediction

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

Description

Protein sequences are used as input to predict protein structure with AI folding models. This floe supports the OmegaFold model for structure prediction.

OmegaFold is a third-party sequence-to-structure protein folding method that uses a Large Language Model (LLM) to predict protein structure without the use of Multiple Sequence Alignments (MSA). This floe and its defaults are based on the standard folding practices outlined by OmegaFold.

Limitations: OmegaFold does not currently support prediction of multi-chain sequences, also known as multimers. If a multimeric sequence is identified in the input, that sequence is skipped.

Longer sequences can be computationally demanding. If you want to run longer sequences, it is common practice to split the sequence with around 200 residues of overlap and do multiple sequence runs.
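The splitting practice described above can be sketched as a small helper. This is only an illustration, not part of the floe: the function name `split_with_overlap` and the window length of 1000 residues are assumptions chosen for the example, and only the overlap of roughly 200 residues comes from the text.

```python
def split_with_overlap(sequence, chunk_len=1000, overlap=200):
    """Split a sequence into windows of at most chunk_len residues.

    Consecutive windows share `overlap` residues. chunk_len is a
    hypothetical value; only the ~200-residue overlap is from the docs.
    """
    if overlap >= chunk_len:
        raise ValueError("overlap must be smaller than chunk_len")
    if len(sequence) <= chunk_len:
        return [sequence]

    step = chunk_len - overlap  # distance between window start positions
    chunks = []
    start = 0
    while True:
        chunks.append(sequence[start:start + chunk_len])
        if start + chunk_len >= len(sequence):
            break  # this window already reaches the end of the sequence
        start += step
    return chunks
```

Each returned window can then be submitted as its own independent run. The last window may be shorter than `chunk_len`.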

You can read more background information about OmegaFold.

Related Floes: SPRUCE - Protein Preparation, DU to PDB

Computational Cost Scaling: For best performance, batch all sequences into a single job rather than running many small jobs.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Sequence to run AI Folding (sequence): Sequence(s) for structure prediction. If a sequence contains a multimer, separate its chains with ‘:’. To perform independent runs of multiple sequences, separate the sequences with ‘^’ and ensure each sequence matches its counterpart in the ‘Sequence Title’ parameter. OmegaFold does not support multimers; if a multimer sequence is detected, that sequence is skipped.

  • Required

  • Type: string

Sequence Title (titles): Title(s) for the independent runs given in the ‘Sequence to run AI Folding’ parameter. Separate the titles with ‘^’ and ensure each title matches its counterpart in the ‘Sequence to run AI Folding’ parameter. If a mismatch is observed, distinct titles are generated in the format: ‘Sequence_1^Sequence_2^…’

  • Type: string
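As a rough illustration of the two string formats described above, the following sketch assembles the ‘^’-separated values for the ‘sequence’ and ‘titles’ parameters from Python lists. The helper `build_inputs` is hypothetical and runs on the user's side; it is not the floe's own code. It drops entries that contain ‘:’ because the floe would skip them anyway, and it imitates the documented ‘Sequence_1^Sequence_2^…’ fallback when the number of titles does not match the number of sequences.

```python
RUN_SEP = "^"    # separates independent runs (from the parameter description)
CHAIN_SEP = ":"  # separates chains of a multimer (from the parameter description)


def build_inputs(sequences, titles=None):
    """Return (sequence_param, titles_param, skipped_indices).

    Hypothetical client-side helper; the floe performs its own checks.
    """
    # Use the supplied titles only when there is exactly one per sequence.
    use_titles = titles is not None and len(titles) == len(sequences)

    monomers, monomer_titles, skipped = [], [], []
    for i, seq in enumerate(sequences):
        if CHAIN_SEP in seq:
            skipped.append(i)  # multimer: OmegaFold would skip this entry
            continue
        monomers.append(seq)
        # Fallback titles follow the documented Sequence_N pattern.
        monomer_titles.append(titles[i] if use_titles else f"Sequence_{len(monomers)}")

    return RUN_SEP.join(monomers), RUN_SEP.join(monomer_titles), skipped
```

For example, `build_inputs(["MKT", "AAA:GGG", "PLV"], ["a", "b", "c"])` keeps the first and third entries and reports index 1 as skipped.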

Parameter File(s) on Orion (params): Select the parameter weights to be used for AI folding. Selecting a large number of parameter weight files increases the disk space required by the cube. Selections are from the default set of parameters. Note that OmegaFold_model2 is much heavier, and you will likely need to increase the GPU memory requirements.

  • Type: string

  • Default: [‘OmegaFold_model1.pt’]

  • Choices: [‘OmegaFold_model1.pt’, ‘OmegaFold_model2.pt’]

Outputs

OmegaFold Results (out): Output dataset to write to

  • Required

  • Type: dataset_out

  • Default: OmegaFold_Predictions

OmegaFold Failures (fout): Output dataset to write to

  • Required

  • Type: dataset_out

  • Default: OmegaFold_Failures