InChI validation

The InChI strings returned by OECreateInChI were rigorously tested to ensure that they were identical to the InChI strings generated by the standalone InChI utility program.
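A comparison of this kind can be sketched as a simple tally. The snippet below is only an illustration, not the procedure actually used for the validation: `tally_inchi_agreement` and its `generate` argument are hypothetical names, and `generate` stands in for any callable that turns a structure into an InChI string (for example, a wrapper around OECreateInChI), while the reference string is assumed to come from the standalone InChI utility.

```python
from typing import Callable, Iterable, List, Tuple


def tally_inchi_agreement(
    records: Iterable[Tuple[str, str, str]],
    generate: Callable[[str], str],
) -> Tuple[int, List[str], float]:
    """Compare generated InChI strings against reference strings.

    Each record is (record_id, structure, reference_inchi). `generate` is a
    hypothetical stand-in for the InChI generator under test.
    Returns (total, ids_of_failures, success_rate_percent).
    """
    total = 0
    failures: List[str] = []
    for record_id, structure, reference in records:
        total += 1
        # A record counts as a failure when the two strings are not identical.
        if generate(structure) != reference:
            failures.append(record_id)
    rate = 100.0 * (total - len(failures)) / total if total else 0.0
    return total, failures, rate
```

Requiring exact string equality is the strictest possible criterion: any difference in any layer counts as a failure.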

Performance of OECreateInChI to generate standard InChI

Database      Size    Number of failures    Success rate
eMolecules    9.1M    1,350                 99.97 %
ChEMBL_23     1.7M    8,255                 99.52 %

Many failures occur because OEChem TK and InChI handle some incorrect bond stereo configurations differently. See the example in Differences due to atom stereo perception.

Additionally, there are approximately 7,300 structures in ChEMBL_23 where the InChI, SMILES, and SDF forms of the record disagree, which makes the expected (reference) InChI ambiguous.

Differences due to atom stereo perception
[Figure: InChI-problem-atom-stereo-A.png]

InChI standalone

InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)

OECreateInChI

InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1
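The two strings above share the same formula, connectivity, and hydrogen layers and differ only in the trailing stereo layers. A minimal sketch for locating such differences is shown below; `split_layers` and `diff_layers` are hypothetical helpers written for this illustration and are not part of OEChem TK or the InChI software. It assumes that simply splitting the string on "/" is sufficient, which holds for strings like the ones shown here.

```python
from typing import List, Tuple


def split_layers(inchi: str) -> List[str]:
    """Split an InChI string into its '/'-separated layers (prefix removed)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string: %r" % inchi)
    return inchi[len(prefix):].split("/")


def diff_layers(a: str, b: str) -> Tuple[List[str], List[str]]:
    """Return the layers found only in `a` and the layers found only in `b`."""
    layers_a, layers_b = split_layers(a), split_layers(b)
    only_a = [layer for layer in layers_a if layer not in layers_b]
    only_b = [layer for layer in layers_b if layer not in layers_a]
    return only_a, only_b


standalone = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
oechem = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

only_standalone, only_oechem = diff_layers(standalone, oechem)
print(only_standalone)  # []
print(only_oechem)      # ['t2-', 'm1', 's1']
```

Here the only extra layers are the tetrahedral stereo layers (`/t`, `/m`, `/s`) in the OECreateInChI output, which is consistent with a difference in atom stereo perception.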
