BROOD Fragment Database

An essential feature of these methods is the generation of a database of potential fragments. While it may be tempting to generate fragments de novo, such approaches often produce unrealistic chemical fragments. Particularly for a method that is related to a common medicinal chemistry technique, we feel it is important to propose known fragments.

The following databases are available with BROOD.

ChEMBL31 Fragment Database

The ChEMBL31 fragment database is derived from the ChEMBL 31 molecule database. The compounds are fragmented and passed through the OpenEye fragment filter, resulting in 20.7 million fragments.

ChEMBL31_lite Fragment Database

The ChEMBL31_lite fragment database is also derived from the ChEMBL 31 molecule database. It is more tractable in size and contains 8.8 million fragments. In addition to the OpenEye fragment filter, further filters based on the number of heavy atoms and the minimum frequency of fragments were applied in creating this database.
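
As an illustration only, the heavy-atom and minimum-frequency filters described above can be sketched in plain Python. The thresholds, the function name filter_fragments, and the toy input are hypothetical; the actual cutoffs used to build ChEMBL31_lite are not stated here, and this sketch does not use any OpenEye API.

```python
from collections import Counter

# Hypothetical toy input: one (fragment string, heavy-atom count) pair per
# occurrence of a fragment in a parent molecule. This is not real ChEMBL data.
occurrences = [
    ("c1ccccc1", 6),
    ("c1ccccc1", 6),
    ("c1ccccc1", 6),
    ("C1CCNCC1", 6),
    ("C1CCNCC1", 6),
    ("CC", 2),
    ("CC", 2),
    ("CC", 2),
    ("c1ccc2ccccc2c1", 10),
]


def filter_fragments(occurrences, min_heavy, max_heavy, min_count):
    """Keep fragments that occur at least min_count times and whose
    heavy-atom count lies within [min_heavy, max_heavy]."""
    # How often each fragment was observed across all parent molecules.
    counts = Counter(frag for frag, _ in occurrences)
    # Heavy-atom count per fragment (identical for every occurrence).
    heavy = {frag: n for frag, n in occurrences}
    kept = [
        frag
        for frag, count in counts.items()
        if count >= min_count and min_heavy <= heavy[frag] <= max_heavy
    ]
    return sorted(kept)


# Assumed thresholds, chosen only to make the example visible.
kept = filter_fragments(occurrences, min_heavy=3, max_heavy=8, min_count=2)
print(kept)
```

With these assumed thresholds, the two-atom fragment is rejected for size and the ten-atom fragment is rejected because it occurs only once (it also exceeds the assumed size limit).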

brood-database-chembl-3.0.0

brood-database-chembl-3.0.0 is derived from the ChEMBL 20 molecule database. The compounds are fragmented, resulting in approximately 11 million unique molecular fragments after standard property filtering. The fragments are prioritized according to their medicinal and geometric relevance to fragment replacement, and a final collection of approximately six million fragments is retained for the database.

Users may also provide their own fragment database for searching. These fragment databases can be prepared from molecule collections using the CHOMP program. CHOMP breaks the molecules into fragments, filters the fragments, enumerates undefined stereochemistry, tracks the molecules from which the fragments came, and identifies the unique collection of fragments. Once this set is generated, CHOMP generates or reads multiconformer representations for each fragment. The conformers can either be generated by OMEGA technology within CHOMP or extracted from small-molecule crystal structure databases passed into CHOMP. As a final step in database generation, CHOMP precalculates physical and geometric properties, organizes the fragments for efficient retrieval, and writes a database in a format optimized for efficient BROOD searching.
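
To make the bookkeeping concrete, the following is a minimal conceptual sketch of two of the steps described above: identifying the unique collection of fragments and tracking the molecules from which each fragment came. It is not CHOMP and does not call any OpenEye API. The fragmentation step is assumed to have already run, and the molecule identifiers, fragment strings, and the function name collect_unique_fragments are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: parent molecule ID -> fragment strings that an earlier
# fragmentation step is assumed to have produced in a canonical form.
parent_fragments = {
    "mol_A": ["c1ccccc1", "C(=O)N"],
    "mol_B": ["c1ccccc1", "C1CCNCC1"],
    "mol_C": ["C(=O)N", "c1ccccc1"],
}


def collect_unique_fragments(parent_fragments):
    """Return a mapping of each unique fragment to the sorted list of
    parent molecule IDs in which it was found."""
    sources = defaultdict(set)
    for parent_id, fragments in parent_fragments.items():
        for fragment in fragments:
            # A set stores each parent once, even if a fragment repeats.
            sources[fragment].add(parent_id)
    return {frag: sorted(parents) for frag, parents in sources.items()}


unique = collect_unique_fragments(parent_fragments)
for fragment, parents in sorted(unique.items()):
    print(fragment, parents)
```

Keying the collection on a canonical fragment string is what makes deduplication a simple dictionary lookup in this sketch; how CHOMP itself represents and compares fragments is not described here.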