Lexichem TK 1.6
On a benchmark of 250251 compounds in the NCI00 database, mol2nam is able to convert 233010 structures (93.11%) to names that do not contain ‘BLAH’. Of these 233010 names, nam2mol is able to convert 221331 (94.99%) back into structures.

This release includes a significant number of improvements to both name generation and name parsing. For example, both name generation and parsing now do a much better job on ring fusion nomenclature, for names like ‘5,6,7,8-tetrahydro[1,2,4]triazolo[4,3-a]pyridine’. There is also much improved handling of charged ring systems. The name parsing conversion rate for the 71367 compound names in the 2003 Maybridge catalog is now 93.25% in v1.6, up from 80.80% in v1.5.

In name generation, new naming styles have been added for MDL/Beilstein AutoNom style names and for CAS permuted index style names, and there are new placeholder styles for IUPAC79 and IUPAC93 naming. A large number of improvements have been made to names generated using the ‘traditional’ naming style. A new OECapitalizeName API function is available for capitalizing the appropriate first letter of a generated name, such as ‘p-tert-Butylbenzoic acid’.

Several bug fixes have been made to the Cahn-Ingold-Prelog (CIP) chirality perception implementation.
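
To illustrate what "capitalizing the appropriate first letter" involves for a name such as ‘p-tert-Butylbenzoic acid’, here is a minimal standalone sketch in plain Python. It does not call Lexichem and is not the OECapitalizeName implementation. The function name capitalize_name, the prefix list and the locant heuristic are all assumptions made for illustration, and the heuristic is deliberately incomplete (for example, it does not recognise locants such as ‘4a’).

```python
import re

# Hypothetical, deliberately incomplete set of italicised prefixes that are
# skipped when choosing which letter to capitalize.
SKIPPED_PREFIXES = {"o", "m", "p", "n", "sec", "tert", "cis", "trans"}

# A leading token: everything up to (and including) the next hyphen.
LEADING_TOKEN = re.compile(r"([^-\s]+)-")


def capitalize_name(name: str) -> str:
    """Capitalize the first letter that follows any leading locants/prefixes."""
    pos = 0
    while True:
        match = LEADING_TOKEN.match(name, pos)
        if match is None:
            break
        token = match.group(1)
        # Assumption: locants and stereodescriptors such as "2,4", "N,N"
        # or "(2R)" contain no lowercase letters.
        is_locant = not any(ch.islower() for ch in token)
        if not (is_locant or token in SKIPPED_PREFIXES):
            break
        pos = match.end()
    # Upper-case only the single character at the chosen position.
    return name[:pos] + name[pos:pos + 1].upper() + name[pos + 1:]


print(capitalize_name("p-tert-butylbenzoic acid"))
print(capitalize_name("5,6,7,8-tetrahydro[1,2,4]triazolo[4,3-a]pyridine"))
```

The sketch leaves the hyphenated prefixes and numeric locants untouched and upper-cases only the first letter of the parent name, which is the behaviour the example name in these notes suggests.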
The OEParseIUPACName function is now able to return supplementary locant annotations for each atom. It stores an integer locant code/identifier in the integer atom type field of each atom, which may be retrieved using the OEAtomBase::GetIntType method and converted into a readable/displayable string using the recently exposed OENameLocant function. This functionality is a recent addition, and most but not all supported ring systems and parents have locant annotations in this initial release.

Finally, for the adventurous, new APIs for translating compound names from foreign languages into English are available as the experimental OEFromJapanese, OEFromSwedish and OEFromSpanish functions. Additionally, an OEFromUTF8 function is available for converting UTF-8 encoded strings into the escaped sequences expected by these functions (effectively the inverse of OEToUTF8).
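
The OEFromUTF8/OEToUTF8 pairing described above is an encode/decode round trip between UTF-8 and an ASCII-only escaped form. The sketch below shows that idea using only the Python standard library. It does not call Lexichem, and these notes do not specify the escape format the toolkit uses, so HTML-style numeric character references are used purely as a stand-in. The function names escape_non_ascii and unescape_to_utf8 are hypothetical, and the Swedish name is just sample input.

```python
import html


def escape_non_ascii(raw: bytes) -> str:
    """Decode UTF-8 bytes and replace each non-ASCII character with an
    ASCII escape (stand-in format: numeric character references)."""
    text = raw.decode("utf-8")
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def unescape_to_utf8(escaped: str) -> bytes:
    """Inverse direction: expand the escapes and re-encode as UTF-8."""
    return html.unescape(escaped).encode("utf-8")


# Sample input: the Swedish name for acetic acid, as UTF-8 bytes.
swedish = "ättiksyra".encode("utf-8")
escaped = escape_non_ascii(swedish)
print(escaped)  # &#228;ttiksyra

# The two helpers are inverses of each other for this input.
assert unescape_to_utf8(escaped) == swedish
```

Keeping the escaped form pure ASCII means a translation function only has to handle one well-defined input alphabet, which is presumably why a separate conversion step from UTF-8 is provided.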