Custom Select NGS Leads by Sequence ID with File

A cube that utilizes Sanger clones to select the top sequences of interest

Main Parameters

Parameter Name

Custom Input File with SEQ ID, REQUIRED

Edit Distance, only 100% homology is available at this time

Region of Interest to Remove from Output

Remove Non-Functional or Aberrant Sequences

Max number of unique NGS desired?

Edit Distance Method for Overlap (Only 100% Homology Available)

Metrics for Ranking

Region of Interest (ROI) to Select Top Representatives

Write the Custom Select Output to CSV File


Parameter Details

Calculation Parameters

  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network
  • Custom Input File with SEQ ID, REQUIRED (custom_number_select) type: file_in: Input file (column A = seq_id, column B = number of sequences desired) that specifies how many representatives to select for each given cluster or unique region of interest
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • Edit Distance, only 100% homology is available at this time (edit_dist) type: integer: Only 100% homology is available.
    Default: 0
  • Region of Interest to Remove from Output (eliminate_roi) type: string: This option is only available for the Sanger Select CUBE, not this custom NGS select CUBE
    Default: KEEP ALL MATCHING ROIs
    Choices: KEEP ALL MATCHING ROIs
  • Remove Non-Functional or Aberrant Sequences (filter_functional) type: boolean:
    Default: True
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Max number of unique NGS desired? (number_of_sequences_total) type: integer: The maximum number of full-length unique sequences desired per clone. If a custom input file is provided, it overrides this parameter.
    Default: 10 , Min: 1, Max: 1000000
  • Edit Distance Method for Overlap (Only 100% Homology Available) (overlap_criterion) type: string: Only 100% identity is available for this option
    Default: 100% Identity
    Choices: 100% Identity
  • Metrics for Ranking (predict_choices) type: string: Place metrics in order of ranking (if none are given, ranks by full-length count)
    Default: [‘ROI Percent, Final Round Only’, ‘Full Length (Corrects for Illumina or PacBio), Percent’, ‘Liabilities Both Chains’, ‘Liabilities CDR3_2’]
    Choices (separated by semicolons, because several names contain commas): Full Length (Corrects for Illumina or PacBio), Count; Full Length (Corrects for Illumina or PacBio), Percent; ROI Count, Final Round Only; ROI Percent, Final Round Only; ROI Fold Enrichment, Final Round Only; ROI Log2 Enrichment, Final Round Only; Liabilities Both Chains; Liabilities Chain_2; Liabilities Chain_1; Liabilities CDR1_1; Liabilities CDR2_1; Liabilities CDR3_1; Liabilities CDR1_2; Liabilities CDR2_2; Liabilities CDR3_2; ROI Count, Early Round Only; ROI Percent, Early Round Only; Cluster Count (e.g. unique sequences per cluster); Cluster Percent (e.g. unique rep per cluster)
  • Region of Interest (ROI) to Select Top Representatives (roi) type: string: Select the cluster or region of interest (ROI) that matches the desired sequence ID for SANGER. IMPORTANT: if Cluster is selected, all sequences should come from a dataset that was clustered at the same time
    Default: Cluster
    Choices: Cluster, Cluster_CDR3_1, Cluster_CDR3_2, Merged CDRs, CDR3 Chain_1, CDR3 Chain_2, HCDR3 and LCDR3, Full-Length
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Write the Custom Select Output to CSV File (write_to_csv_file) type: boolean: Writes the output to a CSV file after the AbXtract processing, at the cost of additional time. If turned off, an empty file is written and the CSV export can be done in a separate step.
    Default: True
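
The two-column layout described for custom_number_select (column A = seq_id, column B = number of sequences desired) can be prepared with a short script. The following is only a minimal sketch: it assumes a comma-separated file with no header row, and the helper names and the sequence IDs clone_001 and clone_002 are hypothetical. Check the delimiter and header conventions the cube actually expects before relying on it.

```python
import csv
import io


def write_custom_select(rows, handle):
    """Write (seq_id, count) pairs as two columns: A = seq_id, B = count.

    Assumption: plain CSV without a header row.
    """
    writer = csv.writer(handle)
    for seq_id, count in rows:
        count = int(count)
        if count < 1:
            raise ValueError(f"count for {seq_id!r} must be at least 1")
        writer.writerow([seq_id, count])


def read_custom_select(handle):
    """Read the two-column file back into a {seq_id: count} mapping."""
    requested = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        requested[row[0].strip()] = int(row[1])
    return requested


# Hypothetical sequence IDs, used here only for illustration.
buffer = io.StringIO()
write_custom_select([("clone_001", 5), ("clone_002", 2)], buffer)
buffer.seek(0)
requested = read_custom_select(buffer)
print(requested)
```

Reading the file back before uploading it is a cheap way to catch a non-integer count or a misplaced column.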

Hardware Parameters

Machine hardware requirements
  • Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 1800 , Min: 256.0, Max: 8589934592
  • Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address
    Default: 64
  • Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
    Default: 5120.0 , Min: 128.0, Max: 8589934592
  • GPUs (gpu_count) type: integer: The number of GPUs to run this cube with
    Default: 0 , Max: 16
  • CPUs (cpu_count) type: integer: The number of CPUs to run this cube with
    Default: 1 , Min: 1, Max: 128
  • Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
  • Spot policy (spot_policy) type: string: Control cube placement on spot market instances
    Default: Prohibited
    Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
  • Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated)
    Default: “”

Metrics Parameters

Cube Metric Parameters
  • Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds
    Default: 60
    Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
  • Cube Metrics (cube_metrics) type: string: Set of metrics to be collected
    Choices: cpu, disk, memory, network