Generate Heads for 2D Directed Sphere Exclusion

This cube generates the cluster heads used for directed sphere exclusion, so that the sphere exclusion clustering itself can run in parallel.
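
To illustrate the general idea of choosing heads by sphere exclusion, here is a minimal sketch in plain Python. It is not the cube's implementation: the fingerprints are modeled as sets of on-bit indices, the helper names `tanimoto` and `pick_heads` are hypothetical, and the `radius` argument is an illustrative distance threshold (1 minus Tanimoto similarity) rather than a documented mapping to any parameter listed below. The input order stands in for the "directed" part, assuming the records were already sorted by score.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def pick_heads(fingerprints, radius):
    """Return indices of heads, visiting fingerprints in the order given.

    A fingerprint becomes a head only if its distance (1 - similarity)
    to every previously chosen head is greater than `radius`.
    """
    heads = []
    for idx, fp in enumerate(fingerprints):
        if all(1.0 - tanimoto(fp, fingerprints[h]) > radius for h in heads):
            heads.append(idx)
    return heads


# Toy input, assumed to be pre-sorted by score (best first).
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 4, 5}),   # close to the first entry
    frozenset({10, 11, 12}),
    frozenset({10, 11, 12, 13}),  # close to the third entry
    frozenset({20, 21}),
]
heads = pick_heads(fps, radius=0.3)
print(heads)
```

Once the heads are fixed, every remaining record can be compared against them independently of the others, which is the property that lets the later clustering step be distributed.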

Main Parameters

Parameter Name
  • Extended Log Field
  • Log Field

Calculation Parameters

  • None (batch_size) type: integer:
    Default: 10000
  • None (batch_size_floor) type: integer:
    Default: 400
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1, Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0, Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0, Max: 16
  • None (head_percentage) type: decimal: Fraction of batch size to find as cluster heads.
    Default: 0.03
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800, Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • None (num_clusters_per_cycle_floor) type: integer:
    Default: 20
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • None (sim_cutoff) type: decimal:
    Default: 0.2
  • Similarity Measure (sim_type) type: string: The similarity measure used for the 2D similarity calculation.
    Default: OETanimoto
    Choices: OECosine, OEDice, OEEuclid, OEManhattan, OETanimoto
  • Sort Order (sort_order) type: string: The sort order for the hitlist scores.
    Default: Descending
    Choices: Descending, Ascending
  • None (sphere_exclusion_radius) type: decimal:
    Default: 0.05
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • None (starting_batch_percentage) type: decimal:
    Default: 0.1
  • None (target_num_clusters) type: integer:
  • Use Score (use_rank) type: boolean:
    Default: False
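
As a rough worked example of how head_percentage relates to batch_size, the snippet below multiplies the two default values. This is only a sketch: it assumes the number of heads sought per batch is the simple product of the two parameters, and the rounding behavior and the variable name `heads_per_batch` are assumptions, not documented behavior of the cube.

```python
# Defaults taken from the parameter list above.
batch_size = 10000
head_percentage = 0.03  # documented as a fraction of the batch size

# Assumption: heads sought per batch = batch_size * head_percentage,
# rounded to the nearest integer.
heads_per_batch = round(batch_size * head_percentage)
print(heads_per_batch)
```

Under that assumption, the defaults correspond to about 300 heads per batch of 10000 records.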

Field parameters

  • Fingerprint X Vector Field (cluster_fingerprint_field) type: Field Type: Chem.FingerPrintVec:
    Default: fp_x
  • None (cluster_head_index_field) type: Field Type: Int:
    Default: cluster_head_index
  • Cluster ID Field (cluster_id_field) type: Field Type: String: The name for the field that will contain the unique cluster ID.
    Default: Cluster ID
  • None (count_field) type: Field Type: Int:
    Default: count
  • Extended Log Field (ext_log_field) type: Field Type: StringVec: Message extended log field
    Default: Extended Log Field
  • Fingerprint Field (fingerprint_field) type: Field Type: Chem.FingerPrint: Tag name for the field that stores fingerprints.
    Default: Fingerprint
  • None (finished_field) type: Field Type: String:
    Default: finished
  • UUID (id_field) type: Field Type: String: The field to store unique identifiers for mols
    Default: UUID
  • None (is_core) type: Field Type: Bool:
    Default: is_core
  • Log Field (log_field) type: Field Type: String: The field to store messages to floe report
    Default: Log Field
  • Hit Score Field (rank_field) type: Field Type: Float: The name for the field that will contain the hitlist score.
    Default: Score

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800, Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0, Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0, Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1, Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”

Metrics Parameters

Cube Metric Parameters
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network