Generate and Deduplicate SMILES for One or More Datasets

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

Task-based/Cheminformatics/SMILES Gen & Deduplication

Role-based/Medicinal Chemist

Description

Add a string data field to the combined dataset that stores the SMILES representation of each record's primary molecule, then deduplicate the dataset based on that SMILES.
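The routing this description implies can be sketched in plain Python. This is only an illustration under stated assumptions, not the floe's actual implementation: records are modeled as dictionaries, `split_by_smiles` and `to_smiles` are hypothetical names, the field name "SMILES" is assumed, and treating the first record seen for a given SMILES as the unique one is a guess about the floe's behavior. In practice `to_smiles` would be backed by a cheminformatics toolkit that returns a canonical SMILES, so that equivalent molecules compare equal.

```python
from typing import Callable, Iterable, Optional


def split_by_smiles(
    records: Iterable[dict],
    to_smiles: Callable[[dict], Optional[str]],
    field: str = "SMILES",  # assumed field name, not taken from the floe
):
    """Route records into (unique, duplicate, missing) lists by SMILES.

    Assumption: the first record seen for a SMILES counts as unique and
    every later record with the same SMILES counts as a duplicate.
    """
    unique, duplicate, missing = [], [], []
    seen = set()
    for record in records:
        smiles = to_smiles(record)
        if not smiles:
            # No SMILES could be produced for this record.
            missing.append(record)
            continue
        # Store the SMILES on a copy of the record as a string field.
        tagged = {**record, field: smiles}
        if smiles in seen:
            duplicate.append(tagged)
        else:
            seen.add(smiles)
            unique.append(tagged)
    return unique, duplicate, missing


# Toy input: "mol" holds a precomputed SMILES string as a stand-in
# for a real molecule object.
records = [
    {"id": 1, "mol": "CCO"},
    {"id": 2, "mol": "c1ccccc1"},
    {"id": 3, "mol": "CCO"},
    {"id": 4, "mol": None},
]
unique, duplicate, missing = split_by_smiles(records, lambda r: r.get("mol"))
```

The three returned lists correspond to the unique, duplicate, and missing outputs listed under Promoted Parameters.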

Promoted Parameters

Title in user interface (promoted name)

Outputs

Output Dataset for Unique Records (unique): Output dataset to write to

Required

Type: dataset_out

Default: unique_SMILES

Output Dataset for Duplicate Records (duplicate): Output dataset to write to

Required

Type: dataset_out

Default: duplicate_SMILES

Output Dataset for Records Missing SMILES (missing): Output dataset to write to

Required

Type: dataset_out

Default: missing_SMILES

Inputs

Input Dataset (in): Dataset to deduplicate

Required

Type: data_source