Pareto Frontier Consensus

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Product-based/Gigadock

  • Product-based/FastROCS

  • Role-based/Computational Chemist

  • Solution-based/Virtual-screening/Analysis

  • Solution-based/Virtual-screening/Analysis/Consensus

  • Task-based/Data Science/Clustering

Description

This floe creates a consensus list using a Pareto Frontier method (a.k.a. Pareto Dominance). The consensus assigns a Pareto Dominance Rank to each record and outputs only those records whose rank does not exceed a specified maximum value (see the ‘Pareto Dominance Max Rank’ parameter). The Pareto Dominance Rank is based on the values in specified fields on the input dataset(s) (see the ‘Consensus Field(s) with High Values Preferred’ and ‘Consensus Field(s) with Low Values Preferred’ parameters) and is equal to the number of other records that have a better value in every field.
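The rank definition above can be illustrated with a minimal Python sketch. This is not the floe's actual implementation: the function names, the dictionary-based record layout, and the sample field names ("tanimoto", "energy") are all hypothetical, and the sketch assumes that "better in every field" means strictly better in each one.

```python
from typing import Dict, List, Sequence

Record = Dict[str, float]


def dominates(a: Record, b: Record,
              high_fields: Sequence[str], low_fields: Sequence[str]) -> bool:
    """Return True if record a has a strictly better value than b in every field."""
    better_high = all(a[f] > b[f] for f in high_fields)  # higher is better
    better_low = all(a[f] < b[f] for f in low_fields)    # lower is better
    return better_high and better_low


def pareto_dominance_ranks(records: List[Record],
                           high_fields: Sequence[str],
                           low_fields: Sequence[str]) -> List[int]:
    """Rank of each record = number of other records that dominate it."""
    return [
        sum(dominates(other, rec, high_fields, low_fields)
            for j, other in enumerate(records) if j != i)
        for i, rec in enumerate(records)
    ]


# Hypothetical data: higher "tanimoto" is better, lower "energy" is better.
records = [
    {"tanimoto": 0.9, "energy": -9.0},
    {"tanimoto": 0.8, "energy": -7.0},
    {"tanimoto": 0.5, "energy": -8.0},
    {"tanimoto": 0.4, "energy": -5.0},
]
ranks = pareto_dominance_ranks(records, ["tanimoto"], ["energy"])
print(ranks)  # [0, 1, 1, 3]
```

The first record is beaten by no other record in both fields, so it has rank 0 and sits on the Pareto frontier. The last record is beaten in both fields by all three others, so it has rank 3.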

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input Dataset (input_dataset): The dataset(s) from which to read records.

  • Required

  • Type: data_source

Consensus Field(s) with High Values Preferred (consensus_fields_with_high_values_preferred): Integer and/or Float field(s) on the input dataset(s) to use in the consensus and for which higher values are preferred (e.g., a value of 5 should be considered ‘better’ than a value of -1). Tanimoto similarities are an example of real-world data in which high values (i.e., higher similarity) are preferred. Multiple fields can be specified for this parameter.

  • Type: string

Consensus Field(s) with Low Values Preferred (consensus_fields_with_low_values_preferred): Integer and/or Float field(s) on the input dataset(s) to use in the consensus and for which lower values are preferred (e.g., a value of -1 should be considered ‘better’ than a value of 5). Binding energies are an example of real-world data in which lower values are typically preferred (i.e., the lower the binding energy, the better the binding). Multiple fields can be specified for this parameter.

  • Type: string
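One way to think about the two field parameters is that they put every consensus value on a common "higher is better" scale. The short Python sketch below shows that idea by negating the low-preferred values. It is only an illustration: the helper name and the sample field names are hypothetical, and the floe may handle the two field lists differently internally.

```python
from typing import Dict, Sequence, Tuple


def to_higher_is_better(record: Dict[str, float],
                        high_fields: Sequence[str],
                        low_fields: Sequence[str]) -> Tuple[float, ...]:
    """Return the consensus values oriented so that larger always means better."""
    kept = tuple(record[f] for f in high_fields)      # already higher-is-better
    flipped = tuple(-record[f] for f in low_fields)   # negate lower-is-better values
    return kept + flipped


# Hypothetical record with one field of each kind.
record = {"Tanimoto": 0.7, "Binding Energy": -8.5}
oriented = to_higher_is_better(record, ["Tanimoto"], ["Binding Energy"])
print(oriented)  # (0.7, 8.5)
```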

Outputs

Output Dataset (output_dataset): Name of the consensus output dataset.

  • Required

  • Type: dataset_out

  • Default: Pareto Frontier Consensus

Options

Pareto Dominance Max Rank (pareto_dominance_max_rank): This is the maximum allowed Pareto Dominance Rank a record may have and still make it onto the consensus output. The Pareto Dominance Rank of a given record is the number of other records that have a better value for every one of the values used in the consensus (see the ‘Consensus Field(s) with High Values Preferred’ and ‘Consensus Field(s) with Low Values Preferred’ input parameters).

  • Type: integer

  • Default: 4
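The effect of this cutoff can be sketched in a few lines of Python. The sketch assumes the records already carry a rank in a field named with the default ‘Pareto Dominance Rank’ title, and it assumes the cutoff is inclusive (a record whose rank equals the maximum is kept), which is one reading of "maximum allowed". The function name and the sample records are hypothetical.

```python
from typing import Dict, List


def filter_by_max_rank(ranked: List[Dict[str, object]],
                       max_rank: int = 4,
                       rank_field: str = "Pareto Dominance Rank") -> List[Dict[str, object]]:
    """Keep only records whose rank is at most max_rank (inclusive, by assumption)."""
    return [rec for rec in ranked if rec[rank_field] <= max_rank]


# Hypothetical records with precomputed ranks.
ranked = [
    {"name": "mol-a", "Pareto Dominance Rank": 0},
    {"name": "mol-b", "Pareto Dominance Rank": 2},
    {"name": "mol-c", "Pareto Dominance Rank": 4},
    {"name": "mol-d", "Pareto Dominance Rank": 5},
    {"name": "mol-e", "Pareto Dominance Rank": 7},
]
kept = [rec["name"] for rec in filter_by_max_rank(ranked)]
print(kept)  # ['mol-a', 'mol-b', 'mol-c']
```

Lowering the maximum to 0 would keep only the records that no other record beats in every field, i.e. the Pareto frontier itself.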

Output Fields

Pareto Dominance Rank Field (pareto_dominance_rank_field): Integer field on the output dataset holding the Pareto Dominance Rank of the record. A rank of 0 is the lowest and the best rank. The highest rank that will appear in the output is determined by the setting of the ‘Pareto Dominance Max Rank’ parameter.

  • Required

  • Type: field_parameter::int

  • Default: Pareto Dominance Rank

Development

Enable cube timing report (time_all_cubes): If true, this cube will emit timing information to the timing_data port.

  • Type: boolean

  • Default: True

  • Choices: [True, False]