Generate and Deduplicate SMILES for One or More Datasets
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Task-based/Cheminformatics/SMILES Gen & Deduplication
Role-based/Medicinal Chemist
Description
Combine the input datasets, add a string data field to each record that stores the SMILES representation of the record's primary molecule, then deduplicate the combined dataset based on that SMILES string.
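The routing described above can be sketched in plain Python. This is only an illustration, not the floe's implementation. It makes several assumptions: records are modeled as dictionaries, the helper names split_by_smiles and toy_smiles and the field name "SMILES" are hypothetical, the first record seen with a given SMILES is treated as the unique one, and a record counts as missing when no SMILES can be generated for it.

```python
from typing import Callable, Iterable, Optional


def split_by_smiles(
    records: Iterable[dict],
    to_smiles: Callable[[dict], Optional[str]],
    field: str = "SMILES",  # hypothetical field name
):
    """Route records into (unique, duplicate, missing) lists."""
    seen = set()
    unique, duplicate, missing = [], [], []
    for record in records:
        smiles = to_smiles(record)
        if not smiles:
            # No SMILES could be generated: route to the "missing" output.
            missing.append(record)
            continue
        # Store the SMILES on a copy of the record as a string field.
        tagged = {**record, field: smiles}
        if smiles in seen:
            duplicate.append(tagged)
        else:
            # Assumption: the first occurrence is kept as the unique record.
            seen.add(smiles)
            unique.append(tagged)
    return unique, duplicate, missing


def toy_smiles(record: dict) -> Optional[str]:
    """Toy stand-in for a real canonical SMILES generator."""
    return record.get("mol")


# Records from all input datasets, already concatenated.
combined = [
    {"id": 1, "mol": "CCO"},
    {"id": 2, "mol": "c1ccccc1"},
    {"id": 3, "mol": "CCO"},
    {"id": 4, "mol": None},
]
unique, duplicate, missing = split_by_smiles(combined, toy_smiles)
```

In this sketch the three returned lists correspond to the unique, duplicate, and missing outputs listed under Promoted Parameters. A real workflow would replace toy_smiles with a cheminformatics toolkit call that produces canonical SMILES, so that different spellings of the same molecule compare equal.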
Promoted Parameters
Title in user interface (promoted name)
Outputs
Output Dataset for Unique Records (unique): Output dataset to write to
Required
Type: dataset_out
Default: unique_SMILES
Output Dataset for Duplicate Records (duplicate): Output dataset to write to
Required
Type: dataset_out
Default: duplicate_SMILES
Output Dataset for Records Missing SMILES (missing): Output dataset to write to
Required
Type: dataset_out
Default: missing_SMILES
Inputs
Input Dataset (in): Dataset to deduplicate
Required
Type: data_source