Export AIRR fields for dataset

Exports the AIRR fields associated with each row of an AbXtract dataset and writes them to a file. If the input files that produced the dataset were in AIRR format, the rows from the input files that correspond to the AbXtract output records are returned; the two can be linked using the sequence_id column. Otherwise, AbXtract fields are converted to the closest AIRR fields. To get the AbXtract values even when the input was AIRR-formatted, set the option Dataset was produced from an AIRR-compatible file to FALSE. The duplicate_count field gives the total number of identical observations of a sequence (UMIs are ignored). If paired chains were processed (e.g., PacBio data), light and heavy chains appear on separate rows that can be linked by sequence_id. If the data were not clustered and the input was Sanger, the clone_id column corresponds to the clone_id column in the Sanger floe outputs.
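As an illustration of linking heavy and light chain rows by sequence_id, here is a minimal Python sketch. It assumes the exported file is tab-separated with a header row and that each row carries a locus column (IGH, IGK, IGL) as in the AIRR Rearrangement schema; neither assumption is confirmed by this page, and the function name and toy data are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pair_chains_by_sequence_id(tsv_text):
    """Group rows of an AIRR-style TSV by sequence_id.

    Assumption: the file is tab-separated and has 'sequence_id' and
    'locus' columns. Rows with locus IGH are treated as heavy chains,
    everything else as light chains.
    """
    grouped = defaultdict(dict)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        chain = "heavy" if row["locus"] == "IGH" else "light"
        grouped[row["sequence_id"]][chain] = row
    return dict(grouped)


# Toy input: the sequence IDs and counts are invented for illustration.
example_tsv = (
    "sequence_id\tlocus\tduplicate_count\n"
    "read_1\tIGH\t3\n"
    "read_1\tIGK\t3\n"
    "read_2\tIGH\t1\n"
)

pairs = pair_chains_by_sequence_id(example_tsv)
```

In this toy input, read_1 has both a heavy and a light row, while read_2 has only a heavy row, so a caller can tell paired records from unpaired ones by checking which keys are present.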

Main Parameters

Parameter Name

Provide AbXtract cluster call in clone_id column

Count field for AIRR-compatible file

Dataset was produced from an AIRR-compatible file


Parameter Details

Calculation Parameters

  • Provide AbXtract cluster call in clone_id column (add_cluster_call) type: boolean: If AbXtract clustering was done and was_airr_input is TRUE (so the original rows are returned), inserts the cluster call into the clone_id column. If was_airr_input is FALSE and clustering was performed, the cluster call is inserted into the clone_id column automatically and this parameter is ignored. If clustering was not performed, this parameter is ignored. If the dataset is the result of Sanger processing and both a cluster column and a clone_id column exist, the cluster column overwrites the clone_id column.
    Default: False
  • Count field for AIRR-compatible file (count_method) type: string: The consensus_count field gives the number of reads contributing to the consensus sequence for a particular UMI. The duplicate_count field gives the number of UMIs sharing an identical sequence or, in the absence of UMIs, the total number of identical observations of this sequence. If the dataset was generated from an AIRR-compatible file, choose the same field that was converted to ‘count’ in the AIRR-to-AbXtract file conversion.
    Default: duplicate_count
    Choices: consensus_count, duplicate_count
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected

    Choices: cpu, disk, memory, network
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Dataset was produced from an AIRR-compatible file (was_airr_input) type: boolean: The dataset was produced from a file that had AIRR-compatible headers. Set to FALSE if you would rather return AbXtract values with AIRR-compatible headers.
    Default: False
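The interaction between add_cluster_call, was_airr_input, and whether clustering was performed can be summarized as a small decision function. The following Python sketch only restates the rules described above; the function name and the return labels are hypothetical and do not reflect the actual implementation.

```python
def clone_id_source(clustered, was_airr_input, add_cluster_call):
    """Return which value ends up in the clone_id column.

    'cluster'   -> the AbXtract cluster call is written to clone_id
    'unchanged' -> clone_id is left as it was (or as converted)

    This mirrors the documented rules only; it is not the cube's code.
    """
    if not clustered:
        # No clustering was performed, so add_cluster_call is ignored.
        return "unchanged"
    if not was_airr_input:
        # Cluster call is inserted automatically; add_cluster_call is ignored.
        return "cluster"
    # Original AIRR rows are returned; the parameter decides.
    return "cluster" if add_cluster_call else "unchanged"
```

For example, with clustering performed and was_airr_input set to FALSE, the result is 'cluster' regardless of add_cluster_call, which matches the documented behavior for Sanger datasets where the cluster column overwrites clone_id.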

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
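The documented minimum and maximum values for the hardware parameters can be checked before submitting a job. The Python sketch below copies the ranges from the list above; the dictionary, the function name, and the idea of validating on the client side are assumptions for illustration, not part of the product.

```python
# Ranges taken from the hardware parameter list; None means no documented bound.
HARDWARE_LIMITS = {
    "cpu_count": (1, 128),
    "gpu_count": (None, 16),
    "memory_mb": (256.0, 8589934592),
    "disk_space": (128.0, 8589934592),
}


def check_hardware(params):
    """Return a list of messages for values outside the documented ranges."""
    problems = []
    for name, value in params.items():
        if name not in HARDWARE_LIMITS:
            continue  # parameter has no documented numeric range
        low, high = HARDWARE_LIMITS[name]
        if low is not None and value < low:
            problems.append(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}={value} is above the maximum {high}")
    return problems


# The documented defaults should pass without complaints.
defaults = {"cpu_count": 1, "gpu_count": 0, "memory_mb": 1800, "disk_space": 5120.0}
default_problems = check_hardware(defaults)
```

Parameters without a documented numeric range, such as instance_type or spot_policy, are skipped by this check.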

Metrics Parameters

Cube Metric Parameters
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network