Overlap Among Different Datasets - AbXtract
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Solution-based/Biologics/Antibody Design
Role-based/Bioinformatician
Role-based/Biologist
Product-based/AbXtract
Description
Provide all of the datasets from the different source populations (e.g., barcode groups) and select the region of interest (ROI). The floe creates an overlap_population field that lists all of the populations in which a given ROI is found. The Modify the Sample Name/Barcode Group Floe can be used to set the sample name or barcode group of a dataset. The stringency of the overlap among populations can also be relaxed by increasing the edit distance for the chosen method, either Levenshtein distance or Hamming distance.
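The two edit distance methods named above differ in what they count. The following is a minimal Python sketch of the standard definitions, not the floe's own implementation. The function names and example sequences are hypothetical, and returning None for unequal-length inputs in the Hamming case is an assumption (Hamming distance is only defined for sequences of equal length; how the floe handles that case is not stated here).

```python
from typing import Optional


def hamming(a: str, b: str) -> Optional[int]:
    """Count mismatched positions; None when the lengths differ (assumption)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

With these definitions, a single substitution counts as distance 1 under both methods, while a single inserted residue counts as distance 1 under Levenshtein distance and is undefined under Hamming distance because the lengths differ.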
Parameter title in user interface (promoted name)
Output CSV Filename (file_name) type: file_out: All records are written to a downstream CSV file. The filename must have the .csv extension.
Default: ngs_overlap.csv
Edit Distance Method For Overlap Among Different Barcode Groups (edit_distance_method_overlap) type: string: The edit distance method to apply when computing the overlap among populations. NOTE: Only in effect if the edit distance does not equal 0.
Default: Levenshstein Distance
Choices: Hamming Distance, Levenshstein Distance
Edit Distance for Overlap by ROI of Different Barcode Groups (edit_distance_overlap) type: integer: If there are multiple downstream barcode groups, these will be compared to one another.
Default: 0, Max: 100
Region of Interest For the Overlap (roi) type: string: Indicate the region of interest (ROI) for identifying regions of overlap among different barcode groups.
Default: CDR3 Chain_2 (Downstream Chain)
Choices: Merged CDRs, CDR3 Chain_1 (Upstream Chain), CDR3 Chain_2 (Downstream Chain), HCDR3 and LCDR3, Full-Length
Failed Dataset Output Name (data_out) type: dataset_out: Contains failed records from both upstream and downstream processes.
Default: problematic.ngs_overlap
Output Name of the Overlapped Dataset (data_out) type: dataset_out: A consolidated dataset in which the populations that share each ROI, identified by their IDs across the different datasets, are recorded in a field called overlap_population. NOTE: Populations are also compared to themselves, so overlap contains values N>=2.
Default: ngs_overlap
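To illustrate what an overlap_population value represents, here is a minimal Python sketch for the default edit distance of 0, where ROIs must match exactly. It is not the floe's implementation: the record layout, the population names, the ROI sequences, and the function name are all hypothetical, and the real field format may differ.

```python
from collections import defaultdict

# Hypothetical input: (population or barcode group, ROI sequence) pairs.
records = [
    ("round_1", "CARDYW"),
    ("round_2", "CARDYW"),
    ("round_2", "CTRGFW"),
    ("round_3", "CARDYW"),
]


def overlap_populations(records):
    """Map each ROI to the sorted list of populations in which it is found."""
    seen = defaultdict(set)
    for population, roi in records:
        seen[roi].add(population)  # exact match only (edit distance 0)
    return {roi: sorted(pops) for roi, pops in seen.items()}


overlap = overlap_populations(records)
```

In this sketch, overlap["CARDYW"] lists all three populations, while overlap["CTRGFW"] lists only the population it came from. A relaxed overlap would replace the exact dictionary lookup with a pairwise comparison against an edit distance threshold.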