3D QSAR Model: Builder

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Models

Description

The 3D QSAR Model: Builder Floe is a tool for building models with 3D descriptors. The floe incorporates: (1) 3D conformer generation and charge assignment; (2) hyperparameter optimization for ROCS- and EON-based kernel-PLS model building; (3) cross-validation; (4) model building; and (5) optional external validation.

Building models with this floe requires one or two datasets, depending on the usage scenario: (1) a dataset of molecules and their potencies, with or without a tagged external validation set; and (2) an optional dataset of pre-aligned receptors for Posit pose generation or reference molecules for FlexiROCS conformer generation.

Outputs from this floe contain: (1) a model dataset; (2) a training set conformer dataset used for model building (optional); and (3) an external validation dataset (optional).

The output model dataset stores the receptors or reference molecules provided, which are read by the 3D QSAR Model: Validation and 3D QSAR Model: Predictor Floes.

The floe also produces hyperparameter optimization, cross-validation, and (optionally) external validation reports.

Note: If the training set is relatively large (e.g., more than 100 molecules), leave-one-out cross-validation is not necessary and can slow down the floe. In such cases, consider switching Split Method to random under the “Cross Validation Parameters” section.

Promoted Parameters

Title in user interface (promoted name)

Cross Validation Parameters

Split Method (split_method): Way to split the dataset into training and validation sets

  • Type: string

  • Default: leave one out

  • Choices: [‘random’, ‘leave one out’]

Percentage (Random Split) (percentage): The percentage of records used for training in random split

  • Type: decimal

  • Default: 90.0

Number of Split Sets (Random Split) (num_random_set): Number of times to perform the random split

  • Type: integer

  • Default: 50
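The two split methods can be compared with a small sketch. This is not the floe's implementation; it only illustrates, under stated assumptions, why leave-one-out scales with the number of records while a random split is bounded by num_random_set. The function names and the seed argument are hypothetical, and records are represented by integer indices.

```python
import random


def leave_one_out_splits(n_records):
    """Yield (train, validation) index lists, holding out one record per split."""
    for held_out in range(n_records):
        train = [i for i in range(n_records) if i != held_out]
        yield train, [held_out]


def random_splits(n_records, percentage=90.0, num_random_set=50, seed=0):
    """Yield num_random_set random splits, with `percentage` % of records in training.

    `seed` is a hypothetical argument added only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    n_train = round(n_records * percentage / 100.0)
    for _ in range(num_random_set):
        shuffled = rng.sample(range(n_records), n_records)
        yield sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


n = 200  # example training set size
loo = list(leave_one_out_splits(n))
rnd = list(random_splits(n))
print(len(loo), "leave-one-out splits;", len(rnd), "random splits")
```

With 200 records, leave-one-out requires 200 model fits, while the default random split settings require 50, each trained on 180 records and validated on the remaining 20.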

External Validation Parameters

Do External Validation (do_ext_valid): Whether to do external validation. If true, the floe looks for the specified tag field with the specified tag value to identify the external validation set.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

External Validation Tag Field (in_test_tag_field): Field containing tag for external validation set

  • Type: field_parameter::int

  • Default: External validation tag

External Validation Set Tag Value (test_tag_value): Value of tag field for external validation set

  • Type: integer

  • Default: 1
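The tag-based selection described by these parameters can be sketched as follows. Plain dictionaries stand in for dataset records, and the function name split_by_tag is hypothetical. Sending records that lack the tag field to the training set is an assumption of this sketch, not documented floe behavior.

```python
def split_by_tag(records, tag_field="External validation tag", tag_value=1):
    """Partition records into (training, external_validation) by an integer tag field.

    Assumption: records without the tag field, or with any other value,
    are treated as training records.
    """
    training, external = [], []
    for record in records:
        if record.get(tag_field) == tag_value:
            external.append(record)
        else:
            training.append(record)
    return training, external


# Illustrative records only; the names and potencies are made up for the sketch.
records = [
    {"name": "mol_a", "potency": 6.2, "External validation tag": 0},
    {"name": "mol_b", "potency": 7.9, "External validation tag": 1},
    {"name": "mol_c", "potency": 5.4},
]
training, external = split_by_tag(records)
print([r["name"] for r in training], [r["name"] for r in external])
```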

Inputs

Ligand Database (in): Dataset containing the ligand molecules to process.

  • Required

  • Type: data_source

Receptors/Reference Molecules (receptors): Dataset containing pre-aligned receptors/reference molecules.

  • Type: data_source

Outputs

Output Model Dataset (out): Output dataset containing built models and receptors/reference molecules.

  • Required

  • Type: dataset_out

  • Default: Output for 3D QSAR Model: Builder

Failed Dataset (failed): Output dataset of failed calculations.

  • Required

  • Type: dataset_out

  • Default: Failed Output for 3D QSAR Model: Builder

Training Conformer Output Dataset (train_pose_out): Optional output dataset containing training set conformers if Output Training Conformers is On.

  • Required

  • Type: dataset_out

  • Default: Training Conformer Output

External Validation Output Dataset (ext_valid_out): Optional output dataset containing external validation results if Do External Validation is On.

  • Required

  • Type: dataset_out

  • Default: External Validation Output

3D Conformer Parameters

Use Input 3D (use_input_3d): Whether to use 3D input structures. Flag will be ignored for molecules without 3D input structures.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Output Training Conformers (output_train_pose): Whether to output training set conformers used for model building.

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Charge Method Parameters

Charge Type (method_type): Charge assignment method.

  • Type: string

  • Default: am1bcc

  • Choices: [‘am1bcc’, ‘mmff’, ‘current_charges’]

Potency Parameters

Input Potency field (in_potency_field): Field containing input potency data

  • Required

  • Type: field_parameter::float

  • Default: potency

Unit for Potency (potency_unit): Unit for input potency field (e.g. nanomolar or micromolar for IC50 and log for pIC50)

  • Type: string

  • Default: log

  • Choices: [‘micromolar’, ‘nanomolar’, ‘log’]

Minimum Potency (potency_min): Molecules with potency (log unit) at or below this value are considered untrustworthy and are discarded.

  • Type: decimal

  • Default: 0.0

Maximum Potency (potency_max): Molecules with potency (log unit) at or above this value are considered untrustworthy and are discarded.

  • Type: decimal

  • Default: 15.0
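The potency parameters can be illustrated with a short sketch. It assumes that concentration units are converted to the usual negative log10 molar scale (the pIC50 convention), which this page does not state explicitly. The function names are hypothetical. The bounds check follows the parameter descriptions: values at or below the minimum, or at or above the maximum, are discarded.

```python
import math

# Offsets for converting a concentration to -log10(molar):
# 1 nM = 1e-9 M and 1 uM = 1e-6 M.
_LOG_OFFSET = {"nanomolar": 9.0, "micromolar": 6.0}


def to_log_potency(value, potency_unit="log"):
    """Return potency in log units; assumes the pIC50-style convention for concentrations."""
    if potency_unit == "log":
        return value
    if potency_unit not in _LOG_OFFSET:
        raise ValueError(f"unknown potency unit: {potency_unit!r}")
    if value <= 0:
        raise ValueError("concentration must be positive")
    return _LOG_OFFSET[potency_unit] - math.log10(value)


def is_trusted(log_potency, potency_min=0.0, potency_max=15.0):
    """Keep only potencies strictly between the minimum and maximum."""
    return potency_min < log_potency < potency_max


for value, unit in [(10.0, "nanomolar"), (1.0, "micromolar"), (15.0, "log")]:
    log_value = to_log_potency(value, unit)
    print(value, unit, "->", log_value, "kept" if is_trusted(log_value) else "discarded")
```

Under these assumptions, 10 nanomolar corresponds to a log potency of 8 and is kept, while a log potency of exactly 15 sits on the upper bound and is discarded.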