Overview
Chemical similarity analysis is a widely used technique in drug design. Numerous topological (2D) and superposition (3D) methods exist for measuring chemical similarity. Methods that work in three dimensions have traditionally been much slower than 2D methods, in large part because 3D methods must account for the energetically accessible conformational ensemble of each molecule, while 2D methods work with a single structure. 3D methods, however, can find chemically less intuitive structures that have approximately the same shape and chemical properties. ROCS is designed to perform large-scale 3D database searches using a superposition method that finds the similar but non-intuitive compounds that are so valuable in the drug discovery process.
ROCS is a shape-based superposition method. Molecules are aligned by a solid-body optimization process that maximizes the overlap volume between them (see Shape based alignment method in ROCS). Volume overlap in this context is not the hard-sphere overlap volume, but rather a Gaussian-based overlap parameterized to reproduce hard-sphere volumes (see Shape Characteristics and the use of Gaussians). ROCS uses only the heavy atoms of a ligand; hydrogens are ignored. Because shape and volume are so closely related in this context, maximizing volume overlap is an effective way to identify molecules with similar shapes. Although ROCS is primarily a shape-based method, user-specified definitions of chemistry can be included in the superposition and similarity analysis, which facilitates the identification of compounds that are similar in both shape and chemistry.
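To make the idea of a Gaussian overlap that reproduces hard-sphere volumes concrete, the following is a minimal sketch and not the ROCS implementation. It assumes that every heavy atom is a spherical Gaussian with the same radius (1.7 Å) and amplitude p = 2√2, that only pairwise (first-order) overlap terms are summed, and that a Tanimoto-style ratio is an acceptable way to normalize the overlap. The function names and all of these choices are illustrative assumptions. No alignment optimization is performed; the score is evaluated for the coordinates as given.

```python
import math

# Illustrative assumptions, not values taken from the ROCS documentation:
P = 2.0 * math.sqrt(2.0)  # Gaussian amplitude shared by all atoms
RADIUS = 1.7              # one carbon-like radius (angstroms) for every heavy atom


def gaussian_width(radius, p=P):
    """Exponent alpha such that p * (pi / alpha)**1.5 equals the sphere volume."""
    return math.pi * (3.0 * p / (4.0 * math.pi)) ** (2.0 / 3.0) / radius ** 2


def pair_overlap(a, b, alpha_a, alpha_b, p=P):
    """Analytic overlap integral of two spherical Gaussians centered at a and b."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    s = alpha_a + alpha_b
    return p * p * math.exp(-alpha_a * alpha_b * d2 / s) * (math.pi / s) ** 1.5


def overlap_volume(mol_a, mol_b, radius=RADIUS):
    """First-order overlap: sum of pairwise terms over heavy-atom coordinates."""
    alpha = gaussian_width(radius)
    return sum(pair_overlap(a, b, alpha, alpha) for a in mol_a for b in mol_b)


def shape_tanimoto(mol_a, mol_b):
    """Overlap normalized by the self-overlaps (assumed normalization)."""
    o_ab = overlap_volume(mol_a, mol_b)
    o_aa = overlap_volume(mol_a, mol_a)
    o_bb = overlap_volume(mol_b, mol_b)
    return o_ab / (o_aa + o_bb - o_ab)


# Hypothetical three-atom fragment and a copy shifted by 1 angstrom along x.
fragment = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in fragment]

print(shape_tanimoto(fragment, fragment))  # identical poses score highest
print(shape_tanimoto(fragment, shifted))   # a displaced pose scores lower
```

Because the overlap is a smooth analytic function of the atomic coordinates, it is well suited to the kind of solid-body optimization described above; a hard-sphere overlap would not have this property.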
Molecular superposition has had limited impact in 3D database searching because of the slow speed (1-2 molecules/second) of previously reported superposition methods. ROCS can routinely perform global shape and color alignments at a rate of 600-800 conformers per second. At this rate, searching medium-sized databases (tens of millions of conformers) becomes tractable, though still slow. Distributed computing makes screening larger numbers of compounds and conformers far more practical. ROCS can automatically split similarity searches across entire networks of computers in an efficient manner, taking full advantage of parallel virtual machines. The coupling of shape and chemistry screening with a distributed architecture makes ROCS a powerful tool for searching large 3D databases.
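The throughput figures above translate directly into search times. The sketch below is a back-of-the-envelope estimate, not a description of how ROCS distributes work. It assumes ideal linear scaling across workers and an even split of the database; the function names `search_hours` and `split_evenly` are hypothetical.

```python
def search_hours(n_conformers, rate_per_second, n_workers=1):
    """Wall-clock hours, assuming ideal linear scaling across workers."""
    return n_conformers / (rate_per_second * n_workers) / 3600.0


def split_evenly(n_items, n_workers):
    """Return (start, stop) index ranges that cover n_items as evenly as possible."""
    base, extra = divmod(n_items, n_workers)
    ranges, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional item.
        stop = start + base + (1 if worker < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


n_conformers = 10_000_000
for rate in (600, 800):
    single = search_hours(n_conformers, rate)
    sixteen = search_hours(n_conformers, rate, n_workers=16)
    print(f"{rate} conf/s: {single:.2f} h on 1 worker, {sixteen:.2f} h on 16")

print(split_evenly(10, 3))
```

Under these assumptions, 10 million conformers take roughly 3.5 to 4.6 hours on a single processor at 600-800 conformers per second, which is why distributing the search matters for larger databases. Real speedups will be lower than the ideal figure because of communication and load-balancing overhead.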