Fingerprint Generation (User-Defined)
This cube generates custom 2D fingerprints for input molecules.
The molecules are read from the field specified by the Input Molecule Field parameter. The generated fingerprints can be customized with the following parameters:
- The Fingerprint Type parameter determines the type of fingerprint.
- The Fingerprint Size parameter determines the size of the generated fingerprint (in bits).
- The Minimum Fragment Size and Maximum Fragment Size parameters determine the minimum and maximum size of the fragments that are exhaustively enumerated during fingerprint generation.
- The Fingerprint Atom Typing and Fingerprint Bond Typing parameters determine which atom and bond properties are encoded into the fingerprints.
The generated fingerprint is stored in the field specified by the Fingerprint Field parameter, and the record is sent to the success port.
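To make the roles of the fragment-size and fingerprint-size parameters concrete, here is a toy sketch of a path-style fingerprint in plain Python. It is not the GraphSim TK algorithm: the function names are hypothetical, atoms are typed by element symbol only (loosely analogous to the Atomic number atom typing), bond typing is ignored, and the use of SHA-256 to fold fragments into bit positions is an assumption made purely for illustration.

```python
import hashlib


def enumerate_paths(adjacency, min_bonds, max_bonds):
    """Yield each simple path (tuple of atom indices) with min_bonds..max_bonds bonds.

    A path and its reverse are treated as the same fragment and yielded once.
    """
    seen = set()

    def extend(path):
        bonds = len(path) - 1
        if bonds >= min_bonds:
            canonical = min(path, path[::-1])
            if canonical not in seen:
                seen.add(canonical)
                yield canonical
        if bonds == max_bonds:
            return
        for neighbor in adjacency[path[-1]]:
            if neighbor not in path:
                yield from extend(path + (neighbor,))

    for start in adjacency:
        yield from extend((start,))


def toy_path_fingerprint(symbols, adjacency, size=4096, min_bonds=0, max_bonds=4):
    """Return the set of bit positions that are on in a toy path fingerprint."""
    bits = set()
    for path in enumerate_paths(adjacency, min_bonds, max_bonds):
        labels = [symbols[i] for i in path]
        # Direction-independent fragment key built from element symbols only.
        key = "-".join(min(labels, labels[::-1]))
        digest = hashlib.sha256(key.encode()).digest()
        # Fold the hash into the requested number of bits.
        bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits


# Ethanol heavy atoms: C-C-O
symbols = ["C", "C", "O"]
adjacency = {0: [1], 1: [0, 2], 2: [1]}
on_bits = toy_path_fingerprint(symbols, adjacency, size=4096, min_bonds=0, max_bonds=4)
```

Raising the minimum fragment size drops the smallest fragments (single atoms at size 0), while a smaller fingerprint size makes it more likely that different fragments fold onto the same bit.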
Downstream Cubes
See also
Fingerprint Generation section in the GraphSim TK manual.
User-defined Fingerprint section in the GraphSim TK manual.
Calculation Parameters
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- Fingerprint Atom Typing (fingerprint_atom_type) type: string: The atom properties encoded into the fingerprints. Default: [‘Atomic number’], Choices: Atomic number, Aromaticity, Chiral, Formal charge, Heavy degree, Hybridization, In ring, Hydrogen count, Halogen equivalent, Aromatic equivalent, HBond acceptor equivalent, HBond donor equivalent
- Fingerprint Bond Typing (fingerprint_bond_type) type: string: The bond properties encoded into the fingerprints. Default: [‘Bond order’], Choices: Bond order, Chiral, In ring
- Maximum Fragment Size (fingerprint_max_frag_size) type: integer: The largest fragments that are enumerated during fingerprint generation. For path and tree fingerprint types, this is the maximum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms. Default: 4, Min: 1, Max: 8
- Minimum Fragment Size (fingerprint_min_frag_size) type: integer: The smallest fragments that are enumerated during fingerprint generation. For path and tree fingerprint types, this is the minimum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms. Default: 0, Max: 5
- Fingerprint Size (fingerprint_size) type: integer: The size of the fingerprint (in bits) generated for similarity calculation. It is recommended to generate fingerprints whose size is a multiple of 256. Default: 4096, Min: 256, Max: 16384
- Fingerprint Type (fingerprint_type) type: string: The fingerprint type generated for similarity calculation. Default: Tree, Choices: Circular, Path, Tree
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Max Backlog Wait (max_backlog_wait) type: integer: The maximum time (in seconds) that a cube will be backlogged on a group before being re-evaluated. Default: 600, Min: 300
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU. Default: 32
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
Field Parameters
- Fingerprint Field (fingerprint_field) type: Field Type: Chem.FingerPrint: Tag name for the field that stores fingerprints.
- Input Molecule Field (in_mol_field) type: Field Type: Chem.Mol: The field from which the input molecules are read.
2D Similarity Parameters
- The parameters of the 2D fingerprint similarity calculation.
- Fingerprint Type (fingerprint_type) type: string: The fingerprint type generated for similarity calculation. Default: Tree, Choices: Circular, Path, Tree
- Fingerprint Size (fingerprint_size) type: integer: The size of the fingerprint (in bits) generated for similarity calculation. It is recommended to generate fingerprints whose size is a multiple of 256. Default: 4096, Min: 256, Max: 16384
- Fingerprint Atom Typing (fingerprint_atom_type) type: string: The atom properties encoded into the fingerprints. Default: [‘Atomic number’], Choices: Atomic number, Aromaticity, Chiral, Formal charge, Heavy degree, Hybridization, In ring, Hydrogen count, Halogen equivalent, Aromatic equivalent, HBond acceptor equivalent, HBond donor equivalent
- Fingerprint Bond Typing (fingerprint_bond_type) type: string: The bond properties encoded into the fingerprints. Default: [‘Bond order’], Choices: Bond order, Chiral, In ring
- Minimum Fragment Size (fingerprint_min_frag_size) type: integer: The smallest fragments that are enumerated during fingerprint generation. For path and tree fingerprint types, this is the minimum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms. Default: 0, Max: 5
- Maximum Fragment Size (fingerprint_max_frag_size) type: integer: The largest fragments that are enumerated during fingerprint generation. For path and tree fingerprint types, this is the maximum number of bonds in a fragment. For the circular fingerprint type, this number is the bond distance from the central atoms. Default: 4, Min: 1, Max: 8
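The documented ranges above can be checked before a job is submitted. The following is a minimal sketch, not part of Orion or GraphSim TK: the `FingerprintSettings` class and its `problems` method are hypothetical names, the lower bound of 0 for the minimum fragment size is an assumption (the reference lists only a maximum), and so is the rule that the minimum must not exceed the maximum.

```python
from dataclasses import dataclass

FINGERPRINT_TYPES = ("Circular", "Path", "Tree")


@dataclass(frozen=True)
class FingerprintSettings:
    """Hypothetical container mirroring the documented parameter names and defaults."""

    fingerprint_type: str = "Tree"
    fingerprint_size: int = 4096
    fingerprint_min_frag_size: int = 0
    fingerprint_max_frag_size: int = 4

    def problems(self):
        """Return a list of human-readable issues; an empty list means no issue was found."""
        issues = []
        if self.fingerprint_type not in FINGERPRINT_TYPES:
            issues.append(f"fingerprint_type must be one of {FINGERPRINT_TYPES}")
        if not 256 <= self.fingerprint_size <= 16384:
            issues.append("fingerprint_size must be between 256 and 16384")
        elif self.fingerprint_size % 256:
            issues.append("fingerprint_size is not a multiple of 256 (recommended)")
        # Documented maximum is 5; the lower bound of 0 is an assumption.
        if not 0 <= self.fingerprint_min_frag_size <= 5:
            issues.append("fingerprint_min_frag_size must be between 0 and 5")
        if not 1 <= self.fingerprint_max_frag_size <= 8:
            issues.append("fingerprint_max_frag_size must be between 1 and 8")
        # Assumption: a minimum larger than the maximum is not meaningful.
        if self.fingerprint_min_frag_size > self.fingerprint_max_frag_size:
            issues.append("fingerprint_min_frag_size exceeds fingerprint_max_frag_size")
        return issues


default_issues = FingerprintSettings().problems()
odd_size_issues = FingerprintSettings(fingerprint_size=1000).problems()
```

Here the defaults pass cleanly, while a size of 1000 is within range but flagged because it is not a multiple of 256.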
Hardware Parameters
- Machine hardware requirements
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU. Default: 32
- Max Backlog Wait (max_backlog_wait) type: integer: The maximum time (in seconds) that a cube will be backlogged on a group before being re-evaluated. Default: 600, Min: 300
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
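The advice to request a couple hundred MiB more than required can be written as simple arithmetic. This is only a sketch: the helper name is hypothetical, and the 200 MiB overhead is an assumed reading of "a couple hundred MiB", not a documented value. The clamping bounds are the documented Min and Max of memory_mb.

```python
def memory_request_mib(estimated_mib, overhead_mib=200.0,
                       minimum=256.0, maximum=8589934592.0):
    """Add headroom to an estimated memory need and clamp to the documented range."""
    return min(max(estimated_mib + overhead_mib, minimum), maximum)


request = memory_request_mib(1600.0)  # 1600 MiB estimated plus 200 MiB headroom
```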
Metrics Parameters
- Cube Metric Parameters
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
Parallel Fingerprint Generation (User-Defined)
The parallel version adds these extra parameters.
- Number of messages to distribute at a time (item_count) type: integer: The maximum number of messages to bundle together for a parallel cube. Default: 1, Min: 1, Max: 65535
- Maximum Failures (max_failures) type: integer: The maximum number of times to attempt processing a work item. Default: 10, Min: 1, Max: 100
- Autoscale this Cube (autoscale) type: boolean: If True, let Orion manage the parallelism of this Cube. Default: True
- Maximum number of Cubes (max_parallel) type: integer: The maximum number of concurrently running copies of this Cube. Default: 1000, Min: 1
- Minimum number of Cubes (min_parallel) type: integer: The minimum number of concurrently running copies of this Cube. Default: 0
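As a rough mental model of item_count and max_failures, the sketch below bundles messages into groups and retries a work item a bounded number of times. It does not reflect how Orion actually schedules parallel cubes: `bundle` and `process_with_retries` are hypothetical helpers, and only the parameter names, defaults, and ranges come from the list above.

```python
from itertools import islice


def bundle(messages, item_count=1):
    """Yield lists of at most item_count messages, preserving order."""
    if not 1 <= item_count <= 65535:
        raise ValueError("item_count must be between 1 and 65535")
    iterator = iter(messages)
    while True:
        batch = list(islice(iterator, item_count))
        if not batch:
            return
        yield batch


def process_with_retries(work_item, handler, max_failures=10):
    """Call handler(work_item) up to max_failures times; re-raise the last error."""
    if not 1 <= max_failures <= 100:
        raise ValueError("max_failures must be between 1 and 100")
    for attempt in range(1, max_failures + 1):
        try:
            return handler(work_item)
        except Exception:
            if attempt == max_failures:
                raise


batches = list(bundle(range(5), item_count=2))
```

A larger item_count means fewer, larger bundles per parallel worker; the default of 1 sends each message on its own.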