InChI validation

The InChI strings returned by OECreateInChI were rigorously tested to ensure that they were identical to the InChI strings generated by the standalone InChI utility program.
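The comparison described above can be sketched as a small harness. This is only an illustration under stated assumptions, not the procedure actually used for the validation: `validate_inchi` and its `make_inchi` parameter are hypothetical names, and `make_inchi` stands in for any callable that returns an InChI string for a structure (for example a thin wrapper around OECreateInChI), while the reference strings are assumed to come from the standalone InChI utility.

```python
from typing import Callable, Iterable, Tuple


def validate_inchi(
    records: Iterable[Tuple[str, str]],
    make_inchi: Callable[[str], str],
) -> Tuple[int, int, float]:
    """Return (total, failures, success rate in percent).

    records    -- pairs of (structure, reference InChI); hypothetical layout
    make_inchi -- hypothetical callable producing an InChI for a structure
    """
    total = 0
    failures = 0
    for structure, reference in records:
        total += 1
        # A record counts as a failure unless both strings are identical.
        if make_inchi(structure) != reference:
            failures += 1
    rate = 100.0 * (total - failures) / total if total else 0.0
    return total, failures, rate
```

Passing the generator in as a callable keeps the counting logic independent of any particular toolkit, so the same harness could be pointed at a different InChI producer without modification.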

Performance of OECreateInChI to generate standard InChI
Database     Size   Number of failures   Success rate
eMolecules   9.1M   1,350                99.97 %
ChEMBL_23    1.7M   8,255                99.52 %

Many failures arise because OEChem TK and InChI handle some incorrect stereo configurations differently. See the example in Differences due to atom stereo perception.

Additionally, there are approximately 7,300 structures in ChEMBL_23 where the InChI, SMILES, and SDF forms of the record disagree, which makes the expected (reference) InChI ambiguous.

Differences due to atom stereo perception
  ../_images/InChI-problem-atom-stereo-A.png
InChI standalone   InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)
OECreateInChI      InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1
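To see where two such strings diverge, it can help to compare them layer by layer. The following is a minimal sketch using only the Python standard library; `inchi_layers` and `layer_diff` are hypothetical helper names, and the parsing is deliberately simplified: it assumes the first segment after the version is the formula and that each later layer prefix occurs at most once, which does not hold for every InChI (for instance, stereo layers can repeat inside an isotopic layer).

```python
def inchi_layers(inchi: str) -> dict:
    """Split an InChI string into a {layer prefix: layer body} mapping."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Simplifying assumption: the second segment is the formula layer.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # The first character names the layer (c, h, t, m, s, ...).
            layers[part[0]] = part[1:]
    return layers


def layer_diff(first: str, second: str) -> dict:
    """Return {layer: (body in first, body in second)} for differing layers."""
    a, b = inchi_layers(first), inchi_layers(second)
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


standalone = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
oe_result = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

for layer, (left, right) in layer_diff(standalone, oe_result).items():
    print(layer, left, right)
```

For the pair above, only the `t`, `m`, and `s` layers differ: they are absent from the standalone result and present in the OECreateInChI result, while the formula, connectivity, and hydrogen layers agree.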
