Pareto Frontier Consensus
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/Gigadock
Product-based/FastROCS
Role-based/Computational Chemist
Solution-based/Virtual-screening/Analysis
Solution-based/Virtual-screening/Analysis/Consensus
Task-based/Data Science/Clustering
Description
This floe creates a consensus list using a Pareto Frontier method (a.k.a. Pareto Dominance). The consensus assigns a Pareto Dominance Rank to each record and outputs only those records whose rank does not exceed a specified maximum value (see the ‘Pareto Dominance Max Rank’ parameter). The Pareto Dominance Rank is based on the values in specified fields on the input dataset(s) (see the ‘Consensus Field(s) with High Values Preferred’ and ‘Consensus Field(s) with Low Values Preferred’ parameters) and is equal to the number of other records that have a better value in every one of those fields. A record that no other record beats in every field therefore has a rank of 0.
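The rank calculation described above can be sketched in a few lines of Python. This is only an illustration of the definition, not the floe's actual implementation. The function names, the field names `tanimoto` and `energy`, and the sample values are all hypothetical. The sketch assumes that "better" means strictly better (ties do not count) and that the maximum rank is inclusive.

```python
from typing import Any, List, Mapping, Sequence


def pareto_dominance_ranks(
    records: Sequence[Mapping[str, Any]],
    high_fields: Sequence[str],
    low_fields: Sequence[str],
) -> List[int]:
    """For each record, count the other records that are better in every field."""
    if not high_fields and not low_fields:
        raise ValueError("at least one consensus field is required")

    def oriented(rec: Mapping[str, Any]) -> List[float]:
        # Negate low-preferred fields so that larger is better everywhere.
        return [rec[f] for f in high_fields] + [-rec[f] for f in low_fields]

    vectors = [oriented(r) for r in records]
    ranks = []
    for i, mine in enumerate(vectors):
        # Assumption: "better" is a strict comparison, so ties do not dominate.
        rank = sum(
            1
            for j, other in enumerate(vectors)
            if j != i and all(o > m for m, o in zip(mine, other))
        )
        ranks.append(rank)
    return ranks


def filter_by_max_rank(records, ranks, max_rank):
    # Assumption: the maximum rank is inclusive.
    return [rec for rec, rank in zip(records, ranks) if rank <= max_rank]


# Hypothetical sample data: high similarity and low energy are preferred.
records = [
    {"name": "A", "tanimoto": 0.90, "energy": -9.0},
    {"name": "B", "tanimoto": 0.80, "energy": -8.0},
    {"name": "C", "tanimoto": 0.95, "energy": -5.0},
    {"name": "D", "tanimoto": 0.50, "energy": -4.0},
]

ranks = pareto_dominance_ranks(records, ["tanimoto"], ["energy"])
kept = filter_by_max_rank(records, ranks, max_rank=1)
print(ranks)                      # [0, 1, 0, 3]
print([r["name"] for r in kept])  # ['A', 'B', 'C']
```

In this sample, A and C each win on one field and lose on the other, so neither beats the other and both have rank 0. B is beaten in every field only by A, giving it rank 1. D is beaten in every field by all three other records, giving it rank 3, so it is dropped when the maximum rank is 1.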
Promoted Parameters
Title in user interface (promoted name)
Inputs
Input Dataset (input_dataset): The dataset(s) to read records from
Required
Type: data_source
Consensus Field(s) with High Values Preferred (consensus_fields_with_high_values_preferred): Integer and/or float fields on the input dataset to use in the consensus and for which high values are preferred (e.g., a value of 5 should be considered ‘better’ than a value of -1). Tanimoto similarities are an example of real-world data in which high values (i.e., higher similarity) are preferred. Multiple fields can be specified for this parameter.
Type: string
Consensus Field(s) with Low Values Preferred (consensus_fields_with_low_values_preferred): Integer and/or float fields on the input dataset to use in the consensus and for which low values are preferred (e.g., a value of -1 should be considered ‘better’ than a value of 5). Binding energies are an example of real-world data in which lower values are typically preferred (i.e., the lower the binding energy, the better the binding). Multiple fields can be specified for this parameter.
Type: string
Outputs
Output Dataset (output_dataset): Name of the consensus output dataset
Required
Type: dataset_out
Default: Pareto Frontier Consensus
Options
Pareto Dominance Max Rank (pareto_dominance_max_rank): The maximum Pareto Dominance Rank a record may have and still appear in the consensus output. The Pareto Dominance Rank of a given record is the number of other records that have a better value in every field used in the consensus (see the ‘Consensus Field(s) with High Values Preferred’ and ‘Consensus Field(s) with Low Values Preferred’ input parameters).
Type: integer
Default: 4
Output Fields
Pareto Dominance Rank Field (pareto_dominance_rank_field): Integer field on the output dataset holding the Pareto Dominance Rank of the record. A rank of 0 is the lowest and best rank. The highest rank that will appear in the output is determined by the ‘Pareto Dominance Max Rank’ parameter.
Required
Type: field_parameter::int
Default: Pareto Dominance Rank