Dataset Similarity – Fingerprint Generation (User Defined)
This Floe generates user-customizable fingerprints of the circular, path, or tree type. The generated fingerprints can be customized with the following parameters:
The Fingerprint Type parameter determines the type of fingerprint: circular (ECFP-like), path, or tree.
The Fingerprint Size parameter determines the size of the generated fingerprint in bits.
The Minimum Fragment Size and Maximum Fragment Size parameters determine the minimum and maximum size of the fragments that are exhaustively enumerated during fingerprint generation.
The Fingerprint Atom Typing and Fingerprint Bond Typing parameters determine which atom and bond properties are encoded into the fingerprints.
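To make the interplay of these parameters more concrete, the following is a minimal toy sketch of a path-style fingerprint. It is not the algorithm this Floe uses; the function names, the dictionary-based molecule representation, and the SHA-256 hashing are all assumptions chosen for illustration. Atoms are typed by element symbol only (loosely analogous to the Atomic number atom typing) and bond typing is ignored.

```python
import hashlib


def enumerate_paths(adjacency, min_bonds, max_bonds):
    """Collect simple paths whose bond count lies in [min_bonds, max_bonds]."""
    paths = set()

    def extend(path):
        bonds = len(path) - 1
        if min_bonds <= bonds <= max_bonds:
            # Keep one direction only, so a path and its reverse count once.
            paths.add(min(path, path[::-1]))
        if bonds == max_bonds:
            return
        for neighbor in adjacency[path[-1]]:
            if neighbor not in path:
                extend(path + (neighbor,))

    for start in adjacency:
        extend((start,))
    return paths


def toy_path_fingerprint(atom_labels, adjacency, num_bits=4096,
                         min_bonds=0, max_bonds=4):
    """Hash each enumerated path into one of num_bits bit positions.

    Hypothetical illustration only: real fingerprint generators use their
    own fragment canonicalization and hashing schemes.
    """
    on_bits = set()
    for path in enumerate_paths(adjacency, min_bonds, max_bonds):
        labels = [atom_labels[i] for i in path]
        # Direction-independent text key built from the atom typing.
        key = "-".join(min(labels, labels[::-1]))
        digest = hashlib.sha256(key.encode()).digest()
        on_bits.add(int.from_bytes(digest[:8], "big") % num_bits)
    return sorted(on_bits)


# A three-atom chain C-C-O, described by atom labels and neighbor lists.
labels = {0: "C", 1: "C", 2: "O"}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

print(len(enumerate_paths(neighbors, 0, 4)))  # 6 fragments: 3 atoms, 2 bonds, 1 two-bond path
print(toy_path_fingerprint(labels, neighbors, num_bits=256))
```

In this sketch, widening the fragment size range increases the number of enumerated fragments, while a smaller fingerprint size makes it more likely that distinct fragments fold onto the same bit.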
Extra Required Parameters
Fingerprint Atom Typing (string) : The atom properties encoded into the fingerprints.
Default: [‘Atomic number’]
Choices: Atomic number, Aromaticity, Chiral, Formal charge, Heavy degree, Hybridization, In ring, Hydrogen count, Halogen equivalent, Aromatic equivalent, HBond acceptor equivalent, HBond donor equivalent

Fingerprint Bond Typing (string) : The bond properties encoded into the fingerprints.
Default: [‘Bond order’]
Choices: Bond order, Chiral, In ring

Fingerprint Field (Field Type: Chem.FingerPrint) : Tag name for the field that stores fingerprints.
Default: Fingerprint

Maximum Fragment Size (integer) : The largest fragments that are enumerated during fingerprint generation. For the path and tree fingerprint types, this is the maximum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms.
Default: 4
Min: 1
Max: 8

Minimum Fragment Size (integer) : The smallest fragments that are enumerated during fingerprint generation. For the path and tree fingerprint types, this is the minimum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms.
Default: 0
Max: 5

Fingerprint Size (integer) : The size of the fingerprint (in bits) generated for similarity calculation. It is recommended to generate fingerprints whose size is a multiple of 256.
Default: 4096
Min: 256
Max: 16384

Fingerprint Type (string) : The fingerprint type generated for similarity calculation.
Default: Tree
Choices: Circular, Path, Tree

Input Dataset (data_source) : Dataset to generate fingerprints for.

Output Dataset (dataset_out) : Output dataset of successful calculations.
Default: fingerprints

Failed Dataset (dataset_out) : Output dataset of failed calculations.
Default: Failed Output for Dataset Similarity – Fingerprint Generation (User Defined)
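The documented defaults and limits can be checked before a job is submitted. The sketch below is a hypothetical helper, not part of the Floe or any client library: the class name, field names, and `check_settings` function are invented for illustration. Two rules in it are assumptions rather than documented behavior: that Minimum Fragment Size cannot be negative, and that it should not exceed Maximum Fragment Size.

```python
from dataclasses import dataclass

FP_TYPES = ("Circular", "Path", "Tree")
ATOM_TYPES = (
    "Atomic number", "Aromaticity", "Chiral", "Formal charge", "Heavy degree",
    "Hybridization", "In ring", "Hydrogen count", "Halogen equivalent",
    "Aromatic equivalent", "HBond acceptor equivalent",
    "HBond donor equivalent",
)
BOND_TYPES = ("Bond order", "Chiral", "In ring")


@dataclass(frozen=True)
class FingerprintSettings:
    """Hypothetical container holding the documented default values."""
    fp_type: str = "Tree"
    fp_size: int = 4096
    min_fragment_size: int = 0
    max_fragment_size: int = 4
    atom_typing: tuple = ("Atomic number",)
    bond_typing: tuple = ("Bond order",)


def check_settings(s):
    """Return (errors, warnings) for a FingerprintSettings instance."""
    errors, warnings = [], []
    if s.fp_type not in FP_TYPES:
        errors.append(f"unknown fingerprint type: {s.fp_type!r}")
    if not 256 <= s.fp_size <= 16384:
        errors.append("fingerprint size must be between 256 and 16384 bits")
    elif s.fp_size % 256 != 0:
        # Documented as a recommendation, so only warn.
        warnings.append("fingerprint size is not a multiple of 256")
    if not 1 <= s.max_fragment_size <= 8:
        errors.append("maximum fragment size must be between 1 and 8")
    # Assumption: only Max: 5 is documented; a lower bound of 0 is assumed.
    if not 0 <= s.min_fragment_size <= 5:
        errors.append("minimum fragment size must be between 0 and 5")
    # Assumption: a minimum larger than the maximum is treated as an error.
    if s.min_fragment_size > s.max_fragment_size:
        errors.append("minimum fragment size exceeds maximum fragment size")
    for name in s.atom_typing:
        if name not in ATOM_TYPES:
            errors.append(f"unknown atom typing: {name!r}")
    for name in s.bond_typing:
        if name not in BOND_TYPES:
            errors.append(f"unknown bond typing: {name!r}")
    return errors, warnings


print(check_settings(FingerprintSettings()))  # ([], [])
print(check_settings(FingerprintSettings(fp_type="Circular", fp_size=300)))
```

Returning warnings separately from errors keeps the multiple-of-256 guidance advisory, matching how the parameter description words it.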