Polymorph Search with IEFF Crystal Force Field (Part II of CSP Protocol: Generation and Filtering)
This Floe is the second part of the Crystal Structure Prediction (CSP) protocol developed by OpenEye. Starting from a conformational ensemble, it predicts the most stable crystal geometries. The energy function is the Intermolecular Energy Force Field (IEFF).
For each conformer in the ensemble, the electrostatic multipoles required by IEFF are evaluated. The Floe then samples over a list of space groups, generating crystal geometries and optimizing them with IEFF. The lowest-energy structures are deduplicated and form the main result of this workflow.
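The final step described above (keep the lowest-energy structures, with duplicates removed) can be illustrated with a small sketch. This is not the Floe's implementation: the record layout (plain dictionaries) and the `packing_id` key are hypothetical, and the key stands in for a real crystal-similarity comparison, which is considerably more involved. Only the energy field name is taken from the parameter list of this page.

```python
import heapq

# Field name taken from the "Energy tag for global minimum" parameter on this page.
ENERGY_FIELD = "IEFF Lattice Full Energy (kcal/mol)"


def lowest_energy_unique(records, n, key_field="packing_id"):
    """Return the n lowest-energy records, keeping one record per key_field value.

    `records` is assumed to be an iterable of dicts; `key_field` is a
    hypothetical stand-in for a proper duplicate-structure test.
    """
    best = {}
    for rec in records:
        key = rec[key_field]
        # Keep only the lowest-energy representative of each duplicate group.
        if key not in best or rec[ENERGY_FIELD] < best[key][ENERGY_FIELD]:
            best[key] = rec
    # Rank the surviving representatives by energy, lowest first.
    return heapq.nsmallest(n, best.values(), key=lambda r: r[ENERGY_FIELD])
```

Deduplicating before ranking means a cluster of near-identical packings cannot crowd distinct polymorphs out of the result list.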
The table below lists the 20 most frequent space groups with their respective frequencies (data taken from the spacegroup frequencies Table). Frequencies are given both for space groups in general and for chiral space groups.
Space group | Frequency | Chiral space group | Frequency |
---|---|---|---|
14 | 35.1 | 19 | 47.9 |
2 | 19.3 | 4 | 30.1 |
19 | 9.01 | 1 | 5.16 |
15 | 7.16 | 5 | 4.32 |
4 | 5.66 | 18 | 2.50 |
61 | 3.78 | 92 | 1.40 |
62 | 1.54 | 20 | 1.06 |
33 | 1.53 | 146 | 0.787 |
9 | 0.999 | 96 | 0.675 |
1 | 0.970 | 76 | 0.627 |
60 | 0.897 | 152 | 0.590 |
5 | 0.813 | 144 | 0.431 |
29 | 0.725 | 173 | 0.399 |
11 | 0.675 | 198 | 0.367 |
12 | 0.515 | 169 | 0.324 |
13 | 0.507 | 145 | 0.324 |
148 | 0.485 | 154 | 0.282 |
18 | 0.470 | 78 | 0.271 |
7 | 0.367 | 170 | 0.229 |
56 | 0.354 | 155 | 0.229 |
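To get a feel for how quickly the most common space groups cover the observed structures, the frequencies can be accumulated in table order. The sketch below copies the first five rows of each column from the table above and assumes the frequencies are percentages (the table does not state units). The function names are illustrative only and are not part of any OpenEye API.

```python
from itertools import accumulate

# (space group number, frequency) pairs copied from the first five table rows.
GENERAL_TOP5 = [(14, 35.1), (2, 19.3), (19, 9.01), (15, 7.16), (4, 5.66)]
CHIRAL_TOP5 = [(19, 47.9), (4, 30.1), (1, 5.16), (5, 4.32), (18, 2.50)]


def cumulative_coverage(rows):
    """Running total of the frequencies, in the order given."""
    return list(accumulate(freq for _, freq in rows))


def groups_needed(rows, target):
    """Number of leading space groups whose summed frequency reaches `target`.

    Returns None if the rows supplied never reach the target.
    """
    for count, total in enumerate(cumulative_coverage(rows), start=1):
        if total >= target:
            return count
    return None
```

With these rows, the chiral list reaches a target of 75 after two space groups, whereas the general list needs all five.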
Promoted Parameters
- unique_confs (dataset_out) : Resulting dataset with all unique conformers in top IEFF crystal structures after rigid packing. Default: unique_confs
- in (data_source) : Dataset with input molecules on which crystal polymorph predictions need to be performed.
- failure (dataset_out) : Dataset containing records with failed jobs from three stages of computation: qm multipoles, IEFF, or crystal visualization. Default: failure
- out (dataset_out) : Resulting dataset with lowest in energy, deduplicated crystal structures (in the CIF format) predicted with IEFF Crystal Force Field. Default: top_structures
- qm_mults (dataset_out) : Resulting dataset with computed QM Multipoles, useful to store in case random packing stage needs to be re-done without recomputing QM Multipoles. Default: qm_mults
Extra Required Parameters
- Temp Collection Name (collection_sink) : Name for the created collections. Default: IEFF Temp Crystal Packings Collection
- Switch (boolean) : This parameter controls whether records are sent to the ‘true’ or ‘false’ port. Default: True
- Collection Name (collection_sink) : Name for the created collections. Default: IEFF Crystal Packings Collection
- Output Shard Format (string) : The format of the data that shards will contain. Default: oedb. Choices: oedb, ism.gz, oez, oeb.gz, oeb
- Records per shard parameter (integer) : Number of records in each shard. For optimal performance the combination of parameters ‘Size of the batch’ (batch_size), ‘Parallel Group Item Count’ (item_count), and ‘Records per shard parameter’ (records_per_shard) needs to satisfy: records_per_shard = batch_size * item_count. Default: 50
- Random Packing Switch (boolean) : Controls if Random Packing of monomer in crystal is performed or skipped. Default: True
- Output Shard Format (string) : The format of the data that shards will contain. Default: oedb. Choices: oedb, ism.gz, oez, oeb.gz, oeb
- records_per_shard (integer) : The target number of records in a shard. 0 indicates to run up to the max_shard_bytes limit per shard. Default: 10
- Hit List Size (integer) : The desired size of the hit list. Default: 1. Min: 1
- Energy tag for global minimum (Field Type: Float) : Energy tag for lattice energy in order to find the global minimum. Default: IEFF Lattice Full Energy (kcal/mol)
- Switch (boolean) : This parameter controls whether records are sent to the ‘true’ or ‘false’ port. Default: True
- QM Multipoles Switch (boolean) : Controls if QM Multipoles are computed or this step is skipped. Default: True
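The sizing rule given for ‘Records per shard parameter’ (records_per_shard = batch_size * item_count) is easy to check before launching a job. The helper below is a sketch for that check only; it does not call any Orion or OpenEye API, and the function names are hypothetical. The values 10 and 5 in the comment are illustrative; only the default of 50 comes from this page.

```python
def recommended_records_per_shard(batch_size, item_count):
    """Shard size implied by the rule records_per_shard = batch_size * item_count."""
    if batch_size < 1 or item_count < 1:
        raise ValueError("batch_size and item_count must be positive integers")
    return batch_size * item_count


def satisfies_shard_rule(records_per_shard, batch_size, item_count):
    """True when the three settings obey the sizing rule stated above."""
    return records_per_shard == batch_size * item_count


# Illustrative values: a batch size of 10 with 5 items per parallel group
# is one combination consistent with the default records_per_shard of 50.
```

If only two of the three values are fixed, `recommended_records_per_shard` gives the third when the shard size is the free parameter.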