Subset AbXtract Fields

Writes a dataset that contains only the selected subset of AbXtract fields.

Main Parameters

Parameter Name
  • Identifier Fields to Keep
  • Sequence Fields to Keep
  • Write the Subsetted AbXtract Fields Output to CSV File

Calculation Parameters

  • Experimental Fields to Keep, If Present (assay_stats) type: string:
    Default: []
    Choices: KD, on_rate, off_rate
  • Biophysical Fields to Keep (biophysical_stats) type: string:
    Default: []
    Choices: cdr3_aa_1_charge, cdr3_aa_1_hydropathy, cdr3_aa_1_length, merged_cdrs_1_hydropathy, merged_cdrs_2_hydropathy, merged_cdrs_1_2_hydropathy, merged_cdrs_1_charge, merged_cdrs_2_charge, merged_cdrs_1_2_charge, merged_cdrs_1_length, merged_cdrs_2_length, merged_cdrs_1_2_length, cdr3_aa_2_charge, cdr3_aa_2_hydropathy, cdr3_aa_2_length, N_philic, N_phobic, isoelectric_point, charge_symmetric_parameter, high_viscosity_index
  • Cluster Fields to Keep (cluster_fields) type: string:
    Default: [‘cluster’]
    Choices: cluster_cdr3_1, cluster_cdr3_2, cluster, cluster_numeric
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1, Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0, Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0, Max: 16
  • Identifier Fields to Keep (id_fields) type: string: NOTE: Sanger Well ID (if used) is specified by the ‘id’ field
    Default: [‘seq_id’, ‘barcode_group’]
    Choices: id, sample_name, barcode_group, barcode_round, processed_roi, overlay_roi, seq_id
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Liability Fields to Keep (liability_stats) type: string:
    Default: []
    Choices: liability_string_cdr1_aa_1, liability_string_cdr2_aa_1, liability_string_cdr3_aa_1, liability_string_cdr1_aa_2, liability_string_cdr2_aa_2, liability_string_cdr3_aa_2, liability_quant_cdr1_aa_1, liability_quant_cdr2_aa_1, liability_quant_cdr3_aa_1, liability_quant_cdr1_aa_2, liability_quant_cdr2_aa_2, liability_quant_cdr3_aa_2, liability_quant_chain_1, liability_quant_chain_2, liability_quant_lcdr1_3_hcdr1_3
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800, Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Population Fields to Keep (population_stats) type: string:
    Default: [‘percent_roi_final’]
    Choices: count, count_roi_final, count_roi_early, percent_roi_early, percent_roi_final, fold_enrichment_roi, log2_enrichment_roi, overlap_population, count_fl_final, count_fl_early, percent_fl_final, percent_fl_early, percent_fl, ratio_to_top_early, ratio_to_top_final, ratio_to_top_early_final
  • Sanger Overlap Fields to Keep (sanger_stats) type: string: These items indicate overlap of NGS to Sanger based on the specified region of interest (ROI)
    Default: []
    Choices: well_id, overlap_to_sanger, overlap_to_ngs
  • Quality Fields to Keep (sequence_functional_status) type: string:
    Default: []
    Choices: functional_1, sequence_issue, votes_1, functional_2, votes_2
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Sequence Fields to Keep (string_fields) type: string:
    Default: [‘match_name_1’, ‘match_name_2’, ‘sequence_aa_1’, ‘sequence_aa_2’, ‘cdr3_aa_1’, ‘cdr3_aa_2’, ‘read’]
    Choices: read, sequence_1, sequence_aa_1, sequence_aa_1_2, match_name_1, match_name_1_2, fr1_1, fr2_1, fr3_1, fr4_1, cdr1_1, cdr2_1, cdr3_1, fr1_aa_1, fr2_aa_1, fr3_aa_1, fr4_aa_1, cdr1_aa_1, cdr2_aa_1, cdr3_aa_1, merged_cdrs_1, merged_cdrs_2, merged_cdrs_1_2, sequence_2, sequence_aa_2, match_name_2, fr1_2, fr2_2, fr3_2, fr4_2, cdr1_2, cdr2_2, cdr3_2, fr1_aa_2, fr2_aa_2, fr3_aa_2, fr4_aa_2, cdr1_aa_2, cdr2_aa_2, cdr3_aa_2
  • Write the Subsetted AbXtract Fields Output to CSV File (write_to_csv_file) type: boolean: Writes the subsetted fields to a CSV file, at the cost of additional run time. If turned off, an empty file is written and the CSV export can be done in a separate step.
    Default: True
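
The field selection described above can be pictured with a minimal sketch in plain Python. It is not the cube's implementation and does not use any Orion or AbXtract API: the names DEFAULT_SELECTION, fields_to_keep, subset_record, and write_csv are hypothetical, records are modeled as plain dictionaries, and silently skipping fields that a record does not carry is an assumption. Only the parameter identifiers and their default values are taken from the list above.

```python
import csv
import io

# Defaults copied from the parameter list above; the dict itself is illustrative.
DEFAULT_SELECTION = {
    "id_fields": ["seq_id", "barcode_group"],
    "string_fields": ["match_name_1", "match_name_2", "sequence_aa_1",
                      "sequence_aa_2", "cdr3_aa_1", "cdr3_aa_2", "read"],
    "cluster_fields": ["cluster"],
    "population_stats": ["percent_roi_final"],
    "assay_stats": [],
    "biophysical_stats": [],
    "liability_stats": [],
    "sanger_stats": [],
    "sequence_functional_status": [],
}


def fields_to_keep(selection):
    """Flatten the per-group selections into one ordered list without duplicates."""
    kept = []
    for group in selection.values():
        for name in group:
            if name not in kept:
                kept.append(name)
    return kept


def subset_record(record, keep):
    """Return a copy of the record with only the kept fields that are present."""
    return {name: record[name] for name in keep if name in record}


def write_csv(records, keep, handle):
    """Write the subsetted records as CSV; absent fields become empty cells."""
    writer = csv.DictWriter(handle, fieldnames=keep, restval="")
    writer.writeheader()
    for record in records:
        writer.writerow(subset_record(record, keep))


# Made-up record used only to exercise the sketch.
records = [{
    "seq_id": "s1",
    "barcode_group": "bg1",
    "cdr3_aa_1": "ARDY",
    "cluster": 3,
    "percent_roi_final": 1.5,
    "isoelectric_point": 7.2,  # not selected by default, so it is dropped
}]

keep = fields_to_keep(DEFAULT_SELECTION)
subset = [subset_record(record, keep) for record in records]

buffer = io.StringIO()
write_csv(records, keep, buffer)
```

Adding a name such as isoelectric_point to the biophysical_stats entry of the selection would carry that field through to both the subsetted records and the CSV output.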

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800, Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0, Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0, Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1, Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
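
The sizing advice for memory_mb and disk_space (request a couple hundred MiB more than required) can be worked through with a small helper. This is only a sketch: the function name request_mib is hypothetical, and the 200 MiB overhead is an assumed reading of "a couple hundred". The bytes-per-MiB constant and the Min/Max bounds come from the parameter descriptions above.

```python
MIB = 1048576  # bytes per MiB, as stated in the parameter descriptions


def request_mib(required_bytes, overhead_mib=200.0,
                minimum=256.0, maximum=8589934592.0):
    """Convert a byte requirement to MiB, add overhead, and clamp to the bounds.

    The defaults for minimum and maximum match memory_mb; pass minimum=128.0
    for disk_space. The overhead value is an assumption, not a documented figure.
    """
    needed = required_bytes / MIB + overhead_mib
    return min(max(needed, minimum), maximum)


# A job that needs 1600 MiB of memory would request 1800 MiB under this assumption.
memory_request = request_mib(1600 * MIB)

# The same helper applied to disk_space, which has a lower minimum.
disk_request = request_mib(4096 * MIB, minimum=128.0)
```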

Metrics Parameters

Cube Metric Parameters
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network