FASTQ Parser for Reads With UMIs
A Cube that takes in FASTQ reads with UMIs, corrects sequence errors, and stores the sequence and count for downstream use.
Main Parameters
Parameter Name |
---|
Directional reads |
Hamming distance threshold for clustering UMIs |
Read group size threshold |
Minimum number of unique UMIs per consensus sequence |
UMI extraction method |
Unique molecular identifier extraction pattern |
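
The three thresholds in the table (Hamming distance, read group size, minimum UMI count) could interact roughly as in the Python sketch below. It is a minimal illustration under stated assumptions, not the Cube's implementation: the greedy clustering around the most abundant UMI, the majority-vote consensus, and the function names `hamming`, `cluster_umis`, and `count_sequences` are all hypothetical. The keyword defaults mirror the documented defaults for `ed`, `min_seq_group_size`, and `min_umi_count`.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def cluster_umis(umi_counts, ed=2):
    """Greedy clustering (assumption): map each UMI to the first more
    abundant UMI that lies within `ed` mismatches, else make it a center."""
    centers = []
    assignment = {}
    # Visit UMIs from most to least abundant so likely-true UMIs become centers.
    for umi, _ in sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for center in centers:
            if len(center) == len(umi) and hamming(center, umi) <= ed:
                assignment[umi] = center
                break
        else:
            centers.append(umi)
            assignment[umi] = umi
    return assignment


def count_sequences(reads, ed=2, min_seq_group_size=5, min_umi_count=2):
    """reads: iterable of (umi, sequence) pairs.
    Returns {consensus_sequence: number_of_supporting_UMI_groups}."""
    reads = list(reads)
    assignment = cluster_umis(Counter(umi for umi, _ in reads), ed)
    groups = defaultdict(Counter)
    for umi, seq in reads:
        groups[assignment[umi]][seq] += 1
    umis_per_seq = Counter()
    for seqs in groups.values():
        if sum(seqs.values()) < min_seq_group_size:
            continue  # too few reads behind this UMI group
        consensus = seqs.most_common(1)[0][0]  # majority vote (assumption)
        umis_per_seq[consensus] += 1
    # Keep only sequences supported by enough distinct UMI groups.
    return {s: n for s, n in umis_per_seq.items() if n >= min_umi_count}


reads = (
    [("AAAAAA", "ACGT")] * 5
    + [("AAAAAT", "ACGA")]      # one-mismatch UMI, absorbed into AAAAAA
    + [("CCCCCC", "ACGT")] * 5
    + [("GGGGGG", "TTTT")] * 2  # group too small, dropped
)
result = count_sequences(reads)
```

In this toy input, `ACGT` is the only sequence that survives both filters, because it is backed by two UMI groups of at least five reads each.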
Parameter Details
Calculation Parameters
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
- Directional reads (directional) type: boolean: If True, reads are oriented 5’ to 3’ with respect to the UMI extraction pattern. If False, they are non-directional (the UMI could be at either end). Default: False
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- Hamming distance threshold for clustering UMIs (ed) type: integer: Default: 2, Max: 100
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Read group size threshold (min_seq_group_size) type: integer: Minimum number of sequencing reads per UMI. Default: 5, Min: 1
- Minimum number of unique UMIs per consensus sequence (min_umi_count) type: integer: Sequences represented by at least this many UMIs are retained. Default: 2, Min: 1
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- UMI extraction method (umi_extract_method) type: string: Method to use with the extraction pattern string to extract the UMI. Default: string, Choices: regex, string
- Unique molecular identifier extraction pattern (umi_regex) type: string: An extraction pattern for the unique molecular identifier (UMI), which may be a regular expression or a string using {N, C, X}. Be sure to include both 5’ and 3’ unique molecular identifiers. If you would like to demultiplex samples using a barcode table, DO NOT mark the sample barcode in the UMI extraction pattern as a region to be extracted. Default: “”
- Reverse unique molecular identifier extraction pattern (umi_regex_rev) type: string: For use with non-directional reads only. Ignored if directional is set to True. Default: “”
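
To make the regex extraction method more concrete, the Python sketch below pulls a 5’ and a 3’ UMI out of a read with named groups. It is only an illustration: the exact regular expression syntax the Cube accepts is not specified here, and the group names `umi_5`, `insert`, and `umi_3`, the 6-base UMI length, and the helper `extract_umi` are all assumptions. The {N, C, X} string syntax is not covered.

```python
import re

# Hypothetical pattern: 6-base 5' UMI, variable-length insert, 6-base 3' UMI.
# Group names and UMI lengths are illustrative assumptions only.
UMI_PATTERN = re.compile(
    r"^(?P<umi_5>[ACGTN]{6})(?P<insert>[ACGTN]+)(?P<umi_3>[ACGTN]{6})$"
)


def extract_umi(read, pattern=UMI_PATTERN):
    """Return (umi, insert), or None when the read does not match the pattern."""
    match = pattern.match(read)
    if match is None:
        return None
    # Join both ends so the 5' and 3' UMIs identify the molecule together.
    umi = match.group("umi_5") + match.group("umi_3")
    return umi, match.group("insert")


extracted = extract_umi("AACCGG" + "TTTTACGT" + "GGCCAA")
```

Here `extracted` holds the concatenated UMI `AACCGGGGCCAA` and the insert `TTTTACGT`. A read too short to contain both UMIs and at least one insert base returns `None`.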
Hardware Parameters
- Machine hardware requirements
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
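
The memory and disk descriptions above suggest padding the request by "a couple hundred MiB". A small Python sketch of that arithmetic follows. The helper name `mib_request` and the 200 MiB padding are assumptions chosen for illustration; only the 1048576 B per MiB factor and the minimum values come from the parameter descriptions.

```python
import math

MIB = 1048576  # bytes per MiB, as stated in the parameter descriptions


def mib_request(required_bytes, overhead_mib=200, minimum=256.0):
    """Convert a byte estimate to a MiB request with padding (assumed 200 MiB),
    never going below the parameter's documented minimum."""
    needed = math.ceil(required_bytes / MIB) + overhead_mib
    return float(max(needed, minimum))


memory_mb = mib_request(1_500_000_000)                   # about 1.5 GB of data
disk_space = mib_request(4_000_000_000, minimum=128.0)   # about 4 GB scratch
```

The resulting values could then be supplied as `memory_mb` and `disk_space`.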
Metrics Parameters
- Cube Metric Parameters
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network