String Fields Comparison

../../../../../_images/CompareStringFieldCubeIcon.svg

This cube sends records to one of the following output ports, based on the comparison of two string values on the record.

The ‘A’ value is read from the field specified by the Field A parameter, and the ‘B’ value is read from the field specified by the Field B parameter. The output ports are:

  • equal : if the constant value and field value are the same

  • unequal : if the constant value and field value are not the same

  • missing : if the record is missing either ‘Field A’ or ‘Field B’ fields

The string comparison can be customized using the parameters listed in the String Comparison Parameters section.

Calculation Parameters

  • Case Sensitive (case_sensitive) type: boolean: This parameter determines whether the string comparison is case sensitive.
    Default: True
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected

    Choices: cpu, disk, memory, network
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Max Backlog Wait (max_backlog_wait) type: integer: The max time (in seconds) that a cube will be backlogged on a group before being re-evaluated
    Default: 600 , Min: 300
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU
    Default: 32
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Strip Whitespace (strip_whitespace) type: boolean: This parameter determines whether leading and trailing white spaces are stripped prior to string comparison.
    Default: False

Field parameters

  • Field A (field_a) type: Field Type: String: The name of the field with the ‘Field A’ values.
  • Field B (field_b) type: Field Type: String: The name of the field with the ‘Field B’ values.

String Comparison Parameters

  • Case Sensitive (None) type: boolean: This parameter determines whether the string comparison is case sensitive.
    Default: True
  • Strip Whitespace (None) type: boolean: This parameter determines whether leading and trailing white spaces are stripped prior to string comparison.
    Default: False

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU
    Default: 32
  • Max Backlog Wait (max_backlog_wait) type: integer: The max time (in seconds) that a cube will be backlogged on a group before being re-evaluated
    Default: 600 , Min: 300
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”

Metrics Parameters

Cube Metric Parameters
  • Metric Period (None) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (None) type: string: Set of metrics to be collected

    Choices: cpu, disk, memory, network

Parallel String Fields Comparison

The parallel version adds these extra parameters.

  • Number of messages to distribute at a time (item_count) type: integer: The maximum number of messages to bundle together for a parallel cube.
    Default: 1 , Min: 1, Max: 65535
  • Maximum Failures (max_failures) type: integer: The maximum number of times to attempt processing a work item
    Default: 10 , Min: 1, Max: 100
  • Autoscale this Cube (autoscale) type: boolean: If True, let Orion manage the parallelism of this Cube
    Default: True
  • Maximum number of Cubes (max_parallel) type: integer: The maximum number of concurrently running copies of this Cube
    Default: 1000 , Min: 1
  • Minimum number of Cubes (min_parallel) type: integer: The minimum number of concurrently running copies of this Cube
    Default: 0