Enrichment and Depletion Tracing of Clones

Calculates the relative fold enrichment of clones over multiple rounds of selection for a given region of interest (ROI). To avoid redundancy, the output is organized around the final round, viewed in the context of earlier rounds. If a sequence is found in multiple rounds, only the latest-round copy of the full-length sequence is retained, and its region of interest is given a minimal frequency in any subsequent rounds, if they exist. Note that sequences that do not belong to samples listed in the Enrichment Table are excluded from the output.
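
As an illustration of how the two correction factors could enter such a calculation, here is a minimal Python sketch for a single early round and a final round. It is not the floe's implementation. The function name fold_enrichment, the definition of fold enrichment as a ratio of relative abundances, the choice to leave pseudocounts out of the round totals, and the example CDR3 strings and counts are all assumptions made for this example. Only the pseudocount rules follow the descriptions of correction_factor1 and correction_factor2 under Calculation Parameters.

```python
def fold_enrichment(early_counts, final_counts,
                    correction_factor1=2, correction_factor2=10):
    """Return {roi: fold enrichment} from ROI counts in an early and a final round."""
    early_total = sum(early_counts.values())
    final_total = sum(final_counts.values())

    # Pseudocounts as described for the correction factors:
    # final-round-only clones get (minimum early-round count) / correction_factor1,
    # early-round-only clones get (minimum final-round count) / correction_factor2.
    early_pseudo = min(early_counts.values()) / correction_factor1
    final_pseudo = min(final_counts.values()) / correction_factor2

    result = {}
    for roi in set(early_counts) | set(final_counts):
        early = early_counts.get(roi, early_pseudo)
        final = final_counts.get(roi, final_pseudo)
        # Assumed definition: final relative abundance / early relative abundance.
        result[roi] = (final * early_total) / (early * final_total)
    return result


# Hypothetical ROI counts for two rounds of one enrichment group.
early = {"ARDYW": 10, "ARGGW": 40, "ARSSW": 50}
final = {"ARDYW": 60, "ARGGW": 20, "ARNEW": 20}
enrichment = fold_enrichment(early, final)
```

With these made-up counts, ARNEW (seen only in the final round) is compared against an early pseudocount of 10 / 2 = 5, and ARSSW (lost in the final round) is given a final pseudocount of 20 / 10 = 2. Raising correction_factor1 shrinks the early pseudocount and so raises the score of final-round-only clones; raising correction_factor2 shrinks the final pseudocount and so lowers the score of clones that dropped out.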

Main Parameters

Parameter Name
  • Enrichment Table
  • Correction Factor Group 1
  • Correction Factor Group 2
  • Keep Only Functional Sequences
  • Metrics for Ranking
  • Region to Consider for Enrichment
  • Minimum Count for the Region of Interest (ROI)
  • Minimum Percent for the Region of Interest (ROI)
  • Write the Enrich and Relative Abundance Output to CSV File
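
Metrics for Ranking takes its metrics in order of priority. One way to read such an ordering is as a multi-key sort, sketched below in Python. This is an assumption about the behavior, not a description of the floe's code: the function name rank_clones, the "ROI" key, the example values, and the choice to sort every metric in descending order are hypothetical. The metric names are taken from the parameter's list of choices.

```python
def rank_clones(records, ranking_criteria):
    """Sort records by the listed metrics, earlier metrics taking priority."""
    def sort_key(record):
        return tuple(record[metric] for metric in ranking_criteria)
    # Assumption: larger values rank first for every metric.
    return sorted(records, key=sort_key, reverse=True)


default_criteria = [
    "Latest Selection Round Detected",
    "Net Fold Enrichment",
    "Last Round Relative Abundance",
    "Last Round Count",
]

# Hypothetical records, one per region of interest.
clones = [
    {"ROI": "ARSSW", "Latest Selection Round Detected": 2,
     "Net Fold Enrichment": 0.04, "Last Round Relative Abundance": 0.0,
     "Last Round Count": 0},
    {"ROI": "ARGGW", "Latest Selection Round Detected": 3,
     "Net Fold Enrichment": 0.5, "Last Round Relative Abundance": 20.0,
     "Last Round Count": 20},
    {"ROI": "ARDYW", "Latest Selection Round Detected": 3,
     "Net Fold Enrichment": 6.0, "Last Round Relative Abundance": 60.0,
     "Last Round Count": 60},
]
ranked = [clone["ROI"] for clone in rank_clones(clones, default_criteria)]
```

Under this reading, clones detected in the latest round come first, and ties within a round are broken by Net Fold Enrichment, then by the remaining metrics in the order given.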


Calculation Parameters

  • Enrichment Table (barcode_table) type: file_in: XLS/CSV/TSV file listing samples, one per row, in the format Name,round_enrich,enrich_group, where round_enrich is the selection round (e.g., 1, 2, 3). Do not include a header row. Rounds do not need to be sequential but must be numeric; they are considered in numerical order, with the largest value treated as the latest round. Only one sample is allowed per round per enrich_group. Name must match a sample name from the original barcode table, and each sample name can be used only once.
  • Correction Factor Group 1 (correction_factor1) type: integer: Final-round clones not found in earlier rounds are given a pseudocount equal to the minimum early-round count divided by this value; a larger value gives greater weight to clones found only in the final round.
    Default: 2
  • Correction Factor Group 2 (correction_factor2) type: integer: Early-round clones not found in the final round are given a pseudocount equal to the minimum final-round count divided by this value; a larger value means a bigger penalty for de-enrichment.
    Default: 10
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • Keep Only Functional Sequences (filter_functional) type: boolean: Eliminates non-functional sequences, such as those with truncations, stop codons, or frame-shifts.
    Default: True
  • None (floe_report_name) type: string:
    Default: Trace Enriched Populations
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • Input dataset CSV files (input_csv) type: file_in: Input annotated data in CSV format from upstream AbXtract floes. Accepts multiple inputs. Required for high-diversity datasets >50,000 total records.
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Metrics for Ranking (ranking_criteria) type: string: Place metrics in order of ranking
    Default: [‘Latest Selection Round Detected’, ‘Net Fold Enrichment’, ‘Last Round Relative Abundance’, ‘Last Round Count’]
    Choices: Latest Selection Round Detected, Last Round Count, Last Round Relative Abundance, Net Fold Enrichment
  • Region to Consider for Enrichment (roi) type: string: Indicate the region of interest that should be considered for the enrichment
    Default: CDR3 Chain 2
    Choices: Merged CDRs, CDR3 Chain 1, CDR3 Chain 2, HCDR3 and LCDR3, Full-Length
  • Minimum Count for the Region of Interest (ROI) (roi_count) type: integer: Sets the minimum count for a given region of interest; anything below it is removed. Applied only to the final round. Be aware that clones not present in the final round have pseudocount values that are often <1. These are retained if the minimum count is set to 1 or 0.
    Default: 1 , Max: 10000000000
  • Minimum Percent for the Region of Interest (ROI) (roi_percent) type: decimal: Sets the minimum percent for a given region of interest; anything below it is removed. Applied only to the final round. Be aware that clones not present in the final round have pseudocount values below the minimum percent for true clones.
    Default: 1e-12 , Min: 1e-12, Max: 100
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Write the Enrich and Relative Abundance Output to CSV File (write_to_csv_file) type: boolean: Optionally writes the output to CSV after the AbXtract Processing file, at the cost of additional time. If turned off, an empty file is written and the CSV can be produced in a separate step.
    Default: True
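
The Enrichment Table rules above can be checked before a run. The following Python sketch reads a headerless CSV and enforces the stated constraints. It is only an illustration: the function name read_enrichment_table, the sample names, and the group label target1 are hypothetical, and the check that Name matches the original barcode table is left out because that table is not available here.

```python
import csv
import io


def read_enrichment_table(handle):
    """Parse headerless rows of Name,round_enrich,enrich_group and validate them."""
    rows = []
    seen_names = set()
    seen_slots = set()
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        name, round_text, group = (field.strip() for field in row)
        round_number = float(round_text)  # raises ValueError if not numeric
        if name in seen_names:
            raise ValueError(f"sample name used more than once: {name}")
        if (round_number, group) in seen_slots:
            raise ValueError(
                f"more than one sample for round {round_text} in group {group}")
        seen_names.add(name)
        seen_slots.add((round_number, group))
        rows.append((name, round_number, group))
    # Rounds are taken in numerical order; the largest value is the latest round.
    return sorted(rows, key=lambda item: (item[2], item[1]))


# Hypothetical table: rounds are numeric but not sequential, and there is no header.
example = io.StringIO("sampleA,1,target1\nsampleB,4,target1\nsampleC,2,target1\n")
table = read_enrichment_table(example)
latest_sample = table[-1][0]
```

Here sampleB is treated as the latest round for target1 because 4 is the largest round value, even though round 3 is missing.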

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”

Metrics Parameters

Cube Metric Parameters
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network