Chemical Identifier Resolver documentation

Overview

This service works as a resolver for different chemical structure identifiers and allows one to convert a given structure identifier into another representation or structure identifier. You can either use the web form of the resolver or the following simple URL API scheme:

http:///chemical/structure/"structure identifier"/"representation"

The service returns the requested structure representation with a corresponding MIME-Type specification (in most cases MIME-Type: "text/plain"). If the service cannot resolve a requested URL, an HTTP 404 status message is returned. In the (unlikely) case of an error, an HTTP 500 status message is generated.
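The URL scheme and the status codes above can be exercised from a script. The following is a minimal Python sketch, not an official client: base_url stands in for the resolver's host (which is not spelled out on this page), and build_url and resolve are hypothetical helper names.

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen


def build_url(base_url, identifier, representation):
    """Assemble <base_url>/chemical/structure/<identifier>/<representation>."""
    # '/' and '=' are left as they are (they occur in InChI strings and in the
    # 'InChIKey=' prefix); everything else that is reserved, such as '#' or a
    # space, is percent-encoded.
    return "{}/chemical/structure/{}/{}".format(
        base_url.rstrip("/"), quote(identifier, safe="/="), representation
    )


def resolve(base_url, identifier, representation, timeout=30):
    """Return the response body as text, or None if the service answers 404."""
    url = build_url(base_url, identifier, representation)
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except HTTPError as err:
        if err.code == 404:  # identifier not resolvable
            return None
        raise  # 500 or anything else: leave the decision to the caller
```

A call such as resolve(base_url, "aspirin", "stdinchi") would then return the text/plain body, or None for an unknown identifier.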

Structure Identifiers

For new developments and recent additions, please also read our blog.

The service is able to resolve any of the chemical structure identifier types listed in the following table:

Structure Identifier Type / Description
Structure identifier string resolvable by CACTVS

The part of the URL encoding the chemical structure identifier is passed to the structure decoder available in CACTVS. This works for molecular structures encoded as:

  • SMILES
  • InChI/Standard InChI
  • different CACTVS formats like CACTVS Minimol, CACTVS Serialized Object String

Note: Triple bonds in SMILES strings represented by '#' have to be URL-escaped as '%23' (e.g. the SMILES string of ethyne has to be specified as 'C%23C' instead of 'C#C' if encoded as part of a URL).
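The reason for this escaping rule is that an unescaped '#' starts the fragment part of a URL, so everything after it never reaches the service as part of the path. The short Python sketch below shows the effect with the standard library; the host resolver.example is only a placeholder.

```python
from urllib.parse import quote, urlsplit

# Placeholder host; substitute the resolver's real address.
BASE = "http://resolver.example/chemical/structure"

raw = urlsplit(BASE + "/C#C/smiles")
escaped = urlsplit(BASE + "/" + quote("C#C", safe="") + "/smiles")

# Unescaped: the path stops at the first 'C', the rest became the fragment.
print(raw.path, raw.fragment)
# Escaped: the whole identifier stays in the path as 'C%23C'.
print(escaped.path)
```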

Standard InChIKeys

Standard InChIKeys are a hashed structure representation. The service can be used as a Standard InChIKey resolver which converts Standard InChIKeys into a full structure representation by a database lookup. The service currently has approx. 100 million unique Standard InChIKeys and their corresponding chemical structures on file.

Note: The service accepts a Standard InChIKey either prefixed by 'InChIKey=' or prefix-less. The current version of the service only accepts full Standard InChIKeys with all three InChIKey layers specified. This will change in future versions.
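Because only full Standard InChIKeys are accepted, it can be useful to check a key before sending it. The Python sketch below relies on the published InChIKey layout (14 letters, then 8 letters followed by 'S' for standard and 'A' for version 1, then one letter); it is a client-side format check under that assumption, not the validation the service itself performs, and full_std_inchikey is a hypothetical helper name.

```python
import re

# 14 letters, '-', 8 letters + 'S' (standard) + 'A' (version 1), '-', 1 letter.
_STD_INCHIKEY = re.compile(r"(?:InChIKey=)?([A-Z]{14}-[A-Z]{8}SA-[A-Z])")


def full_std_inchikey(text):
    """Return the prefix-less key if text is a full Standard InChIKey, else None."""
    match = _STD_INCHIKEY.fullmatch(text.strip())
    return match.group(1) if match else None
```

Both the prefixed and the prefix-less form of the ethanol key used in the examples on this page pass the check; a key with a missing layer does not.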

NCI/CADD Identifiers (FICTS, FICuS, uuuuu) and CACTVS HASHISY hashcodes

Like Standard InChIKeys, NCI/CADD Structure Identifiers are also a hashed structure representation and are resolved into a full structure representation by a database lookup. The database currently holds 96 million unique chemical structures and their NCI/CADD Identifiers.

Chemical names

Chemical names are resolved by a database lookup into a full structure representation. The service currently has approx. 68 million chemical names available, linked to approx. 16 million unique structure records. The set of available names includes trivial names, synonyms, systematic names, registry numbers, etc.

Note: Chemical names are currently resolved only by a full string search. To specify a chemical name containing spaces as part of a URL, type the spaces as usual in the URL field of your web browser (e.g. 'sodium chloride'); the browser encodes each space as '%20'.

Disclaimer: Although we have taken great care to make the conversion of chemical names and registry numbers to structures work well, we are still improving this functionality. One known issue is that a name which includes information about the stereo configuration of a compound might return a structure lacking stereochemistry completely.

If no specific resolver module is named, the resolver tries to recognize the type of a given identifier string in the order in which the identifier types are listed in the table. If a structure identifier is ambiguous, i.e. the specified string can be regarded as more than one of the aforementioned identifier types, only the result for the highest-precedence identifier type is returned. However, the results for any lower-precedence identifier types can be obtained by using the XML format of the service described in the following.

XML Format

For an XML-formatted response add "/xml" to the URL scheme:

http:///chemical/structure/"structure identifier"/"representation"/xml

Example: The structure identifier "CCO" is resolved as a SMILES string, but it can also be found as a chemical name string (scroll down to the end of the received XML document):

http:///chemical/structure/CCO/names/xml
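To inspect the lower-precedence results from a script, the XML response can be parsed with any XML library. The element names of the response are not documented on this page, so the Python sketch below makes no assumption about them and simply lists the direct children of the document root; xml_url and top_level_entries are hypothetical helper names, and base_url stands in for the resolver's host.

```python
import xml.etree.ElementTree as ET


def xml_url(base_url, identifier, representation):
    """Append '/xml' to the plain URL scheme."""
    return "{}/chemical/structure/{}/{}/xml".format(
        base_url.rstrip("/"), identifier, representation
    )


def top_level_entries(xml_text):
    """Return (tag, attributes) for each direct child of the document root."""
    root = ET.fromstring(xml_text)
    return [(child.tag, dict(child.attrib)) for child in root]
```

Printing the result of top_level_entries for the "CCO" example is a quick way to see how many alternative interpretations the service returned before deciding how to process them.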

Representation Methods

For recent additions to the list of available representation methods, please also read our blog.

The following describes which methods are available as "representation" part of the URL scheme above.

Method: stdinchi

Returns the Standard InChI of the structure.

Example: Standard InChI for chemical name string "aspirin":

http:///chemical/structure/aspirin/stdinchi

(MIME-Type: "text/plain")

Method: stdinchikey

Returns the Standard InChIKey of the structure.

Example: Standard InChIKey of "ethanol" specified as SMILES string "CCO":

http:///chemical/structure/CCO/stdinchikey

(MIME-Type: "text/plain")

Method: smiles

Returns the Unique SMILES of the structure as calculated by the chemoinformatics toolkit CACTVS. A Unique SMILES calculated by CACTVS might differ from a SMILES string calculated by Daylight's official implementation, which has not been fully published, so CACTVS can only approximate that calculation. However, as long as a SMILES string has been calculated by the Chemical Structure Identifier Resolver, it is unique for a specific structure.

Example: Unique SMILES string of chemical name string "benzene":

http:///chemical/structure/benzene/smiles

(MIME-Type: "text/plain")

Example: Unique SMILES string of the non-unique SMILES string "C(O)C" (ethanol):

http:///chemical/structure/C(O)C/smiles

(MIME-Type: "text/plain")

Method: ficts

Returns the NCI/CADD FICTS identifier. For this method a timeout can occur, in which case the CACTVS HASHISY hashcode instead of the full NCI/CADD Identifier is returned (see method hashisy).

Example: FICTS for SMILES string "c1ccccc1":

http:///chemical/structure/c1ccccc1/ficts

(MIME-Type: "text/plain")

Method: ficus

Returns the NCI/CADD FICuS identifier. For this method a timeout can occur, in which case the CACTVS HASHISY hashcode instead of the full NCI/CADD Identifier is returned (see method hashisy).

Example: FICuS for chemical name string "aspirin":

http:///chemical/structure/aspirin/ficus

(MIME-Type: "text/plain")

Method: uuuuu

Returns the NCI/CADD uuuuu identifier. For this method a timeout can occur, in which case the CACTVS HASHISY hashcode instead of the full NCI/CADD Identifier is returned (see method hashisy).

Example: uuuuu for Standard InChIKey "InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N":

http:///chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/uuuuu

(MIME-Type: "text/plain")

Method: sdf

Returns the SD file of the structure. For file formats other than SD, please read this blog article: http:///blog/?p=68

Example: SD File for chemical name string "morphine":

http:///chemical/structure/morphine/file?format=sdf

(MIME-Type: "text/plain")
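The example URL above passes the file format as a query parameter rather than as a path segment. A small Python sketch of building such a URL follows; file_url is a hypothetical helper name, base_url stands in for the resolver's host, and only the "sdf" value is taken from this page (other format values are described in the blog article, not here).

```python
from urllib.parse import quote, urlencode


def file_url(base_url, identifier, file_format="sdf"):
    """Build <base_url>/chemical/structure/<identifier>/file?format=<file_format>."""
    query = urlencode({"format": file_format})
    return "{}/chemical/structure/{}/file?{}".format(
        base_url.rstrip("/"), quote(identifier, safe="/="), query
    )
```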

Method: names

Returns a list of chemical names for the structure. The names currently available via this method comprise trivial names, systematic names, registry numbers and original structure provider IDs. Note that not all entries in our database have a name associated with them.

Example: Chemical names for Standard InChIKey "InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N" (Standard InChIKey of "ethanol"):

http:///chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/names

(MIME-Type: "text/plain")

Example: Synonyms for chemical name string "aspirin":

http:///chemical/structure/aspirin/names

(MIME-Type: "text/plain")

Method: hashisy

Returns the CACTVS HASHISY hashcode of the given chemical structure identifier. The HASHISY hashcode is a 16-digit hexadecimal (64-bit) hashcode representation of a chemical structure and also forms the hashcode part of the NCI/CADD Structure Identifiers (FICTS, FICuS, uuuuu). However, this method applies only some basic structure normalization steps compared to the methods ficts, ficus, and uuuuu.

Example: HASHISY hashcode for SMILES string "c1ccccc1":

http:///chemical/structure/c1ccccc1/hashisy

(MIME-Type: "text/plain")
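Since the methods ficts, ficus, and uuuuu can fall back to the bare HASHISY hashcode after a timeout, a client may want to detect that case. The Python sketch below checks only for the 16 hexadecimal digits described above; it assumes that a full NCI/CADD Identifier is longer than the bare hashcode, and is_bare_hashisy is a hypothetical helper name.

```python
import re

# Exactly 16 hexadecimal digits, i.e. a 64-bit value.
_HASHISY = re.compile(r"[0-9A-Fa-f]{16}")


def is_bare_hashisy(text):
    """True if the response body is just a 16-digit hexadecimal hashcode."""
    return _HASHISY.fullmatch(text.strip()) is not None
```

If is_bare_hashisy returns True for the body of a ficts, ficus, or uuuuu request, the request most likely timed out and could be retried.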

Method: image

Returns a GIF or PNG image of the structure. Several options are available to control how the image is generated. An overview is available in this blog article: http:///blog/?p=136

Example: GIF for "aspirin":

http:///chemical/structure/aspirin/image

(MIME-Type: "image/gif")
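Because the method can return either GIF or PNG, a script that saves the image should take the file extension from the MIME-Type of the response rather than hard-coding it. A minimal Python sketch follows; image_filename is a hypothetical helper name, and the mapping covers only the two image types named above.

```python
_EXTENSIONS = {"image/gif": ".gif", "image/png": ".png"}


def image_filename(stem, content_type):
    """Pick a file name for the image based on the response's MIME type."""
    # Drop any parameters after ';' and normalize case before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime not in _EXTENSIONS:
        raise ValueError("unexpected MIME type: " + content_type)
    return stem + _EXTENSIONS[mime]
```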