Conformer String Hit List
This cube builds a hit list of records based on conformer data.
It accumulates a hit list of a given size by sorting input records on the ‘string’ value of the field named by the Sort Field parameter, read from each molecule's conformer records. If a molecule has multiple conformers whose records contain the Sort Field, the sort value is the best value of any conformer (as determined by the Descending parameter). The final hit list is sent to the hit_list port. The sort order of the records on the hit list is stable.
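The per-molecule sort value described above can be sketched as follows. This is a minimal illustration, not the cube's implementation: `best_sort_value` is a hypothetical helper, and it assumes the values are compared as plain strings (lexicographically), so that "best" is the smallest string when Descending is false and the largest when it is true.

```python
from typing import Optional, Sequence


def best_sort_value(
    conformer_values: Sequence[Optional[str]], descending: bool = False
) -> Optional[str]:
    """Pick the best string value among a molecule's conformers.

    Hypothetical helper: None stands for a conformer record that has
    no Sort Field. Returns None when no conformer carries a value.
    """
    present = [v for v in conformer_values if v is not None]
    if not present:
        return None
    # Assumption: "best" means first in the requested sort order.
    return max(present) if descending else min(present)


values = ["pose_b", None, "pose_a"]
ascending_best = best_sort_value(values)
descending_best = best_sort_value(values, descending=True)

# Plain string comparison is lexicographic, so "10" sorts before "9".
numeric_like_best = best_sort_value(["9", "10"])
```

If the field holds numbers stored as text, a lexicographic comparison like the one assumed here does not give numeric order, which is worth checking before relying on the result.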
Records that have a Sort Field but are not included in the hit list are written to the discard output port. The order of records emitted to the discard port is not stable. Records that do not contain a Sort Field are written to the missing output port. These records are written as they are encountered, so they keep their relative input order.
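The routing of records to the three output ports can be sketched as below. The record shape (an id paired with an optional sort value) and the `split_records` function are assumptions made for illustration. The real cube works on data records and accumulates the hit list as input arrives, rather than sorting everything at once.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record shape for illustration: (record id, sort value or None).
Record = Tuple[str, Optional[str]]


def split_records(
    records: List[Record], size: int, descending: bool = False
) -> Dict[str, List[Record]]:
    """Partition records into hit_list, discard, and missing groups."""
    # Records without a sort value keep their relative input order.
    missing = [r for r in records if r[1] is None]
    keyed = [r for r in records if r[1] is not None]
    # sorted() is stable, also with reverse=True, so equal keys keep input order.
    ranked = sorted(keyed, key=lambda r: r[1], reverse=descending)
    return {
        "hit_list": ranked[:size],
        # The cube does not guarantee any order on the discard port;
        # the order produced here should not be relied on.
        "discard": ranked[size:],
        "missing": missing,
    }


records = [("m1", "c"), ("m2", None), ("m3", "a"), ("m4", "b"), ("m5", None)]
ports = split_records(records, size=2)
```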
If the Keep Ties parameter is true, ties in the last position of the hit list are kept, and the output hit list may contain more than Hit List Size records. If the Keep Ties parameter is false, the hit list is truncated to exactly Hit List Size records, irrespective of ties in the last position.
If ties are allowed, the Hit List Truncate parameter guards against pathological behavior for lists of low cardinality. If there are a large number of ties, a hit list can grow much larger than the desired hit list size. The Hit List Truncate parameter sets the absolute maximum hit list size, expressed as a multiple of Hit List Size. If the hit list grows to that size, it is truncated without regard to ties.
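The interaction of Keep Ties and Hit List Truncate can be sketched on an already sorted list of values. `build_hit_list` is a hypothetical function, not the cube's code. It assumes that a hit list reaching the maximum size is cut at exactly Hit List Size multiplied by Hit List Truncate.

```python
from typing import List


def build_hit_list(
    sorted_values: List[str], size: int, keep_ties: bool = False, truncate: int = 4
) -> List[str]:
    """Select the hit list from values that are already in sort order."""
    hits = sorted_values[:size]
    if not keep_ties or len(sorted_values) <= size:
        return hits
    # Assumption: the absolute maximum is size * truncate entries.
    cap = size * truncate
    last_value = hits[-1]
    end = size
    # Extend past the desired size only while entries tie with the last hit.
    while end < len(sorted_values) and end < cap and sorted_values[end] == last_value:
        end += 1
    return sorted_values[:end]


values = ["a", "b", "b", "b", "c"]
exact = build_hit_list(values, size=2)                      # ties dropped
with_ties = build_hit_list(values, size=2, keep_ties=True)  # ties kept

# Low-cardinality input: many ties, so the cap of 2 * 4 = 8 applies.
many_ties = ["a"] + ["b"] * 20
capped = build_hit_list(many_ties, size=2, keep_ties=True, truncate=4)
```

With the default Hit List Truncate of 4 and a Hit List Size of 2, the capped list in this sketch never holds more than 8 entries, however many values tie.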
Parameter Details
Calculation Parameters
CPUs (integer) : The number of CPUs to run this cube with. Default: 1 Min: 1 Max: 128
Cube Metrics (string) : Set of metrics to be collected. Choices: cpu, disk, memory, network
Descending (boolean) : This parameter determines whether the list will be sorted in descending or ascending order. Default: False
Temporary Disk Space (MiB) (decimal) : The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0 Min: 128.0 Max: 8589934592
GPUs (integer) : The number of GPUs to run this cube with. Default: 0 Max: 16
Hit List Size (integer) : The desired size of the hit list. Default: 1 Min: 1
Hit List Truncate (integer) : The maximum size of hit list, as a multiple of desired size, if ties are allowed. Default: 4 Min: 1
Instance Tags (string) : Only run on machines with matching tags (comma separated). Default: “”
Instance Type (string) : The type of instance that this cube needs to be run on
Keep Ties (boolean) : This parameter indicates whether to keep identical values even when exceeding the desired hit list size. Default: False
Memory (MiB) (decimal) : The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800 Min: 256.0 Max: 8589934592
Metric Period (decimal) : How often to sample metrics, in seconds. Default: 60 Min: 1 Max: 300
Spot policy (string) : Control cube placement on spot market instances. Default: Prohibited. Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
Field parameters
None (Field Type: Chem.Mol) :
String Sort Field (Field Type: String) : Record field containing the key value to sort by
Hit List Parameters
Hit List Size (integer) : The desired size of the hit list. Default: 1 Min: 1
Descending (boolean) : This parameter determines whether the list will be sorted in descending or ascending order. Default: False
Keep Ties (boolean) : This parameter indicates whether to keep identical values even when exceeding the desired hit list size. Default: False
Hit List Truncate (integer) : The maximum size of hit list, as a multiple of desired size, if ties are allowed. Default: 4 Min: 1
Hardware Parameters
Machine hardware requirements
Memory (MiB) (decimal) : The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800 Min: 256.0 Max: 8589934592
Temporary Disk Space (MiB) (decimal) : The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0 Min: 128.0 Max: 8589934592
GPUs (integer) : The number of GPUs to run this cube with. Default: 0 Max: 16
CPUs (integer) : The number of CPUs to run this cube with. Default: 1 Min: 1 Max: 128
Instance Type (string) : The type of instance that this cube needs to be run on
Spot policy (string) : Control cube placement on spot market instances. Default: Prohibited. Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
Instance Tags (string) : Only run on machines with matching tags (comma separated). Default: “”
Metrics Parameters
Cube Metric Parameters
Metric Period (decimal) : How often to sample metrics, in seconds. Default: 60 Min: 1 Max: 300
Cube Metrics (string) : Set of metrics to be collected. Choices: cpu, disk, memory, network
Tip
filename: snowball/logic/hitlist.py