Glossary
- canonical SMILES
In OEChem TK, the name canonical SMILES is used for a unique SMILES string that encodes the connection table of a molecule but no chiral or isotopic information. Consequently, two stereoisomers always share the same canonical SMILES, since their stereo information is ignored during the canonicalization process. To generate a canonical SMILES, use the OECreateCanSmiString function.
Note
OEChem TK’s canonical SMILES terminology corresponds to Daylight’s ‘unique’ SMILES definition.
- canonical isomeric SMILES
In OEChem TK, the name canonical isomeric SMILES is used for a unique SMILES string that also encodes isotopic and stereo information. Because a canonical isomeric SMILES is unambiguous, it can be used as a universal identifier for a specific chemical structure. To generate a canonical isomeric SMILES, use the OECreateIsoSmiString function or the preferred high-level OEMolToSmiles function.
Note
OEChem TK’s canonical isomeric SMILES terminology corresponds to Daylight’s ‘absolute’ SMILES definition.
- chiral atom
In OEChem TK, an atom is considered chiral if it is connected to four different substituent groups, i.e. its mirror image is non-superimposable.
Note
In OEChem TK, an easily invertible nitrogen, i.e. a non-planar nitrogen with one attached hydrogen, is not considered to be chiral. This is because trivalent nitrogen compounds undergo rapid inversion that interconverts the enantiomers.
See also
stereo atom definition
OEAtomBase.SetChiral method
OEAtomBase.IsChiral method
Atom Chirality section
- chiral bond
In OEChem TK, a double bond is considered chiral if the cis and trans forms of this bond represent two distinct isomers. A chiral bond can be either a chain bond or a ring bond that does not belong to any ring with fewer than eight members.
See also
stereo bond definition
OEBondBase.SetChiral method
OEBondBase.IsChiral method
Bond Stereochemistry section
- CSV
Comma-separated-values file format.
See also
CSV standard at RFC 4180
CSV File Format section
- CXSMILES
The Chemaxon Extended SMILES format adds an optional appendix to the SMILES string that encodes a wide variety of additional features that are not part of the SMILES representation proper.
- InChI
From http://www.inchi-trust.org/about-the-inchi-standard/,
Originally developed by the International Union of Pure and Applied Chemistry (IUPAC), the IUPAC International Chemical Identifier (InChI) is a character string generated by computer algorithm. It is a tool to be used in software applications designed and developed by those who choose to use it.
The InChI algorithm turns chemical structures into machine-readable strings of information. InChIs are unique to the compound they describe and can encode absolute stereochemistry making chemicals and chemistry machine-readable and discoverable.
The InChI format and algorithm are non-proprietary and the software is open source, with ongoing development done by the community. A number of IUPAC working groups is currently creating standard for those areas of chemistry that are not yet handled by the InChI algorithm.
- InChIKey
From http://www.inchi-trust.org/about-the-inchi-standard/,
The InChIKey has been designed so that Internet search engines can search and find the links to a given InChI.
To make the InChIKey the InChI string is subjected to a compression algorithm to create a fixed-length string of upper-case characters. While the InChI to InChIKey hash compression is irreversible, there are a number of InChI resolvers available to look up an InChI given an InChIKey.
- LINGO
LINGO is a very fast text-based molecular similarity search method. It is based on fragmentation of canonical isomeric SMILES strings into overlapping substrings.
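The fragmentation idea can be illustrated with a minimal pure-Python sketch. It is not the OEChem TK implementation: the substring length of 4, the helper names `lingos` and `lingo_tanimoto`, and the plain multiset Tanimoto score are all assumptions made for illustration, and the input strings are assumed to be canonical isomeric SMILES already.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count every overlapping length-q substring of a SMILES string.

    q = 4 is an assumed fragment length, not a value taken from the glossary.
    """
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingo_tanimoto(a: str, b: str, q: int = 4) -> float:
    """Multiset Tanimoto over the two fragment profiles (illustrative scoring)."""
    la, lb = lingos(a, q), lingos(b, q)
    shared = sum((la & lb).values())  # element-wise minimum of the counts
    total = sum((la | lb).values())   # element-wise maximum of the counts
    return shared / total if total else 0.0


if __name__ == "__main__":
    print(sorted(lingos("CCOC(=O)C")))
    print(lingo_tanimoto("CCOC(=O)C", "CCOC(=O)CC"))
```

Because the comparison works on text only, its result depends on both molecules being written with the same canonicalization.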
- non-terminal atom
An atom is considered non-terminal if it is connected to two or more non-hydrogen atoms, i.e. OEAtomBase.GetHvyDegree() >= 2.
- rotatable bond
In OEChem TK, a bond is considered rotatable only if it is a single non-ring bond between two non-terminal, non-triple-bonded atoms. For example, the following structures have no rotatable bonds:
CCC
CCC#CCC
C1CCCCC1
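The definition can be checked by hand on a small heavy-atom graph. The sketch below is a pure-Python illustration, not OEChem TK code: the function name `rotatable_bonds` and the `(atom, atom, bond order)` tuple encoding are hypothetical, hydrogens are assumed to be left out of the graph, and ring membership is found with a simple connectivity search.

```python
from collections import defaultdict


def rotatable_bonds(bonds):
    """Return the bonds that satisfy the glossary definition of rotatable.

    bonds: list of (atom_a, atom_b, order) tuples over heavy atoms only
    (a hypothetical encoding chosen for this sketch).
    """
    neighbors = defaultdict(set)
    triple_bonded = set()
    for a, b, order in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
        if order == 3:
            triple_bonded.update((a, b))

    def in_ring(a, b):
        # A bond is a ring bond if b is still reachable from a without using it.
        seen, stack = {a}, [a]
        while stack:
            cur = stack.pop()
            for nxt in neighbors[cur]:
                if cur == a and nxt == b:
                    continue  # ignore the bond being tested
                if nxt == b:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    result = []
    for a, b, order in bonds:
        if order != 1 or in_ring(a, b):
            continue
        # Both ends must be non-terminal (heavy degree >= 2) and not triple-bonded.
        if all(len(neighbors[x]) >= 2 and x not in triple_bonded for x in (a, b)):
            result.append((a, b))
    return result


propane = [(0, 1, 1), (1, 2, 1)]                                  # CCC
hex_3_yne = [(0, 1, 1), (1, 2, 1), (2, 3, 3), (3, 4, 1), (4, 5, 1)]  # CCC#CCC
cyclohexane = [(i, (i + 1) % 6, 1) for i in range(6)]             # C1CCCCC1
butane = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]                        # CCCC

if __name__ == "__main__":
    for name, mol in [("propane", propane), ("hex-3-yne", hex_3_yne),
                      ("cyclohexane", cyclohexane), ("butane", butane)]:
        print(name, rotatable_bonds(mol))
```

The three structures listed above yield no rotatable bonds, while butane, added here for contrast, yields its central bond.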
Note
Since the ‘rotatable’ property depends on the ‘in ring’ property, the OEFindRingAtomsAndBonds function must be called before accessing the rotatable bond property via OEBondBase.IsRotor.
- SMARTS
SMARTS is a language for specifying substructures using a number of primitive symbols that describe atomic and bond properties. Atom and bond primitive specifications may be combined to form expressions by using logical operators. An introduction to SMARTS syntax is provided in SMARTS Pattern Matching. For more information, see http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
- SMILES
A SMILES string represents a molecule by describing only its molecular graph (i.e. atoms and bonds in the connection table, but no chiral or isotopic information). There are usually a large number of valid SMILES which represent a given structure. For example, CCO, OCC and C(O)C all specify the structure of ethanol. To generate an arbitrary SMILES string, use the OECreateSmiString function. An introduction to SMILES syntax is provided in chapter SMILES Line Notation. For more information, see http://www.daylight.com/smiles/.
- SMIRKS
SMIRKS is a reaction transform language. A reaction is considered valid according to the strict SMIRKS semantics if:
all mapped product atoms have corresponding mapped reactant atoms
all atom maps are pairwise (i.e. every map class has exactly one reactant and one product atom)
The strict semantics also require that unmapped reactant atoms are destroyed in the reaction.
Strict semantics means full compliance with SMIRKS as defined by its originator, Daylight Inc. For more information about the semantics of the SMIRKS language, visit http://www.daylight.com/dayhtml_tutorials/languages/smirks/index.html.
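The two mapping conditions can be approximated with a short text-level check. This is a rough pure-Python sketch, not a SMIRKS parser and not part of OEChem TK: the helper names are hypothetical, and map indices are assumed to appear only as `:N]` at the end of a bracket atom. It does not cover the rule about unmapped reactant atoms.

```python
import re
from collections import Counter

# Assumed pattern: a map index is the ":N" just before a closing bracket.
MAP_INDEX = re.compile(r":(\d+)\]")


def map_counts(side: str) -> Counter:
    """Count how often each atom map index occurs on one side of the reaction."""
    return Counter(int(idx) for idx in MAP_INDEX.findall(side))


def has_pairwise_maps(smirks: str) -> bool:
    """True if every map class has exactly one reactant and one product atom."""
    reactants, _agents, products = smirks.split(">")
    r, p = map_counts(reactants), map_counts(products)
    same_classes = r.keys() == p.keys()
    unique = all(n == 1 for n in r.values()) and all(n == 1 for n in p.values())
    return same_classes and unique


if __name__ == "__main__":
    print(has_pairwise_maps("[C:1](=[O:2])[OH].[N:3]>>[C:1](=[O:2])[N:3]"))
    print(has_pairwise_maps("[C:1][O:2]>>[C:1]"))
```

Requiring identical map classes on both sides also covers the first condition, since no mapped product atom can then lack a mapped reactant partner.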
- stereo atom
In OEChem TK, atom stereo information is stored as the relative positions of neighboring atoms around a tetrahedral center. If an atom has specified stereochemistry, then the OEAtomBase.HasStereoSpecified method returns true.
Warning
In OEChem TK, atom stereochemistry is internally represented by two properties: stereo atom and chiral atom. These properties are completely independent, which allows OEChem TK to retain configuration information around atoms that are not chiral atoms, or to identify chiral atoms whose configuration is not specified.
Note
In the current version of OEChem TK, the only class of stereochemistry supported for atoms is OEAtomStereo_Tetrahedral, which corresponds to \(sp^3\) tetrahedral chirality. Valid return values for the OEAtomStereo_Tetrahedral stereochemistry class are OEAtomStereo_Left and OEAtomStereo_Right.
See also
chiral atom definition
OEAtomBase.SetStereo method
OEAtomBase.GetStereo method
Atom Chirality section
- stereo bond
In OEChem TK, bond stereo information is stored as the relative positions of neighboring atoms around a bond. If a bond has specified stereochemistry, then the OEBondBase.HasStereoSpecified method returns true.
Warning
In OEChem TK, bond stereochemistry is internally represented by two properties: stereo bond and chiral bond. These properties are completely independent, which allows OEChem TK to retain configuration information around bonds that are not chiral bonds, or to identify chiral bonds whose configuration is not specified.
Note
In the current version of OEChem TK, the only class of stereochemistry supported for bonds is OEBondStereo_CisTrans, which corresponds to conjugated E/Z chirality. Valid return values for the OEBondStereo_CisTrans stereochemistry class are OEBondStereo_Cis and OEBondStereo_Trans.
See also
chiral bond definition
OEBondBase.SetStereo method
OEBondBase.GetStereo method
Bond Stereochemistry section