Consolidating and Writing Datasets, Illumina
Cube to consolidate and write results. Receives records from any upstream or downstream cube. Consolidates all input into separate groups by sample_name (upstream) or barcode_group (downstream), and consolidates all input by chain amino acid sequence. Generates a Floe report at the end.
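The grouping described above can be pictured with a small sketch. This is not the cube's implementation: the record layout (plain dictionaries) and the field name `chain_aa` are assumptions made for illustration, and only `sample_name`, `barcode_group`, and `is_downstream` are names taken from this page.

```python
from collections import defaultdict


def consolidate(records, is_downstream=False):
    """Group records by sample or barcode group, then count identical chain sequences."""
    # Assumption: upstream records carry "sample_name", downstream ones "barcode_group".
    group_field = "barcode_group" if is_downstream else "sample_name"
    groups = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # "chain_aa" is a hypothetical field holding the chain amino acid sequence.
        groups[rec[group_field]][rec["chain_aa"]] += 1
    # Return plain dictionaries so the result is easy to inspect or serialize.
    return {group: dict(seqs) for group, seqs in groups.items()}


records = [
    {"sample_name": "S1", "chain_aa": "CARDYW"},
    {"sample_name": "S1", "chain_aa": "CARDYW"},
    {"sample_name": "S2", "chain_aa": "CAKGGW"},
]
result = consolidate(records)
```

Here `result` maps each sample to its distinct sequences and how often each occurred, which is roughly the shape of data a consolidation step would hand to a writer or report.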
Main Parameters
Parameter Name |
---|
Output Dataset Name |
Provides Report of the Selected Antibody Leads |
Metrics to Assess Sanger in Presence of NGS |
Are these already processed records? |
Is This A Downstream Processed File? |
Is This A Sanger Processed File? |
Split by cluster? Only applies to downstream records. |
ROI for sequence logo (choose Chain1 CDR3 if short-read/single-chain data) |
Write IgMatcher to File after Processing |
Write Records to Dataset |
Write Barcode Group to Their Own Dataset After Processing |
Write Report |
Parameter Details
Calculation Parameters
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
- Output Dataset Name (data_out) type: dataset_out: Output dataset to write to
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- Provides Report of the Selected Antibody Leads (downstream_ngs_selection) type: boolean: Provides detailed information on the biophysical characteristics of the selected antibodies. Default: False
- Metrics to Assess Sanger in Presence of NGS (downstream_sanger) type: boolean: Indicates whether additional metrics are to be included to identify Sanger sequences in NGS and vice versa. Default: False
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
- interfix (interfix) type: string: Name to add in the middle of the file name for identification (e.g. ‘cdr3’). Default: “”
- Are these already processed records? (is_analyzed) type: boolean: Indicates whether the input is to be analyzed post-processing to generate specific plots. Default: False
- Is This A Downstream Processed File? (is_downstream) type: boolean: Indicates whether the input contains data for downstream processing. Default: False
- Is This A Sanger Processed File? (is_sanger) type: boolean: Indicates whether the input contains Sanger (low-throughput) sequencing data. Default: False
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Split by cluster? Only applies to downstream records. (sequence_logo_by_cluster) type: boolean: Indicates whether to split sequences by cluster before creating sequence logos. Cluster logos are output only if there are no more than 500 records. Default: False
- ROI for sequence logo (choose Chain1 CDR3 if short-read/single-chain data) (sequence_logo_roi) type: string: Name of the regions to be aligned for the sequence logo. The logo is output only if there are no more than 500 records. Default: CDR3 Chain_2 (Downstream Chain), Choices: CDR3 Chain_1 (Upstream Chain), CDR3 Chain_2 (Downstream Chain), HCDR3 and LCDR3
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Write IgMatcher to File after Processing (write_csv) type: boolean: Write barcode group (if provided) to their own dataset after processing. Note: if there is only a single barcode group, no separate dataset will be written. Default: True
- Write Records to Dataset (write_dataset) type: boolean: Write out records to dataset. Default: True
- Write Barcode Group to Their Own Dataset After Processing (write_group) type: boolean: Write barcode group (if provided) to their own dataset after processing. Note: if there is only a single barcode group, no separate dataset will be written. Default: False
- Write Report (write_report) type: boolean: Write out a Floe report after consolidation. Default: False
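As a reading aid, the analysis switches and their documented defaults can be collected into a dictionary, with overrides checked against the documented choices. This is a minimal sketch rather than the platform's API: the helper `build_parameters` is hypothetical, and how the values are actually passed to a job is not covered by this page.

```python
# Defaults copied from the parameter list on this page.
DEFAULTS = {
    "downstream_ngs_selection": False,
    "downstream_sanger": False,
    "interfix": "",
    "is_analyzed": False,
    "is_downstream": False,
    "is_sanger": False,
    "sequence_logo_by_cluster": False,
    "sequence_logo_roi": "CDR3 Chain_2 (Downstream Chain)",
    "write_csv": True,
    "write_dataset": True,
    "write_group": False,
    "write_report": False,
}

ROI_CHOICES = (
    "CDR3 Chain_1 (Upstream Chain)",
    "CDR3 Chain_2 (Downstream Chain)",
    "HCDR3 and LCDR3",
)


def build_parameters(**overrides):
    """Hypothetical helper: merge overrides into the defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    if params["sequence_logo_roi"] not in ROI_CHOICES:
        raise ValueError(f"sequence_logo_roi must be one of {ROI_CHOICES}")
    return params


# Example: a downstream run that also writes the report.
params = build_parameters(
    is_downstream=True,
    write_report=True,
    sequence_logo_roi="HCDR3 and LCDR3",
)
```

Rejecting unknown keys early keeps a misspelled parameter name from silently falling back to its default.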
Hardware Parameters
- Machine hardware requirements
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
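The documented ranges for the hardware parameters lend themselves to a quick sanity check before a job is submitted. The sketch below is only illustrative: `check_hardware` is a hypothetical helper, not part of any library, and the limits are copied from the list above (no minimum is documented for `gpu_count`, so none is enforced).

```python
# (min, max) pairs taken from this page; None means no documented bound.
HARDWARE_LIMITS = {
    "cpu_count": (1, 128),
    "gpu_count": (None, 16),
    "memory_mb": (256.0, 8589934592),
    "disk_space": (128.0, 8589934592),
}

SPOT_POLICIES = {"Allowed", "Preferred", "NotPreferred", "Prohibited", "Required"}


def check_hardware(settings):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in settings.items():
        if name == "spot_policy":
            if value not in SPOT_POLICIES:
                problems.append(f"spot_policy: {value!r} is not a documented choice")
            continue
        if name not in HARDWARE_LIMITS:
            continue  # parameters without documented numeric limits are not checked
        low, high = HARDWARE_LIMITS[name]
        if low is not None and value < low:
            problems.append(f"{name}: {value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}: {value} is above the maximum {high}")
    return problems


ok = check_hardware({"cpu_count": 4, "memory_mb": 4096, "spot_policy": "Prohibited"})
bad = check_hardware({"cpu_count": 0, "gpu_count": 32, "spot_policy": "Sometimes"})
```

In this example `ok` is empty, while `bad` collects one message per out-of-range value, so all problems can be reported at once instead of stopping at the first.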
Metrics Parameters
- Cube Metric Parameters
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network