Chemical Identifier Resolver

Twirl Your Identifier

Posted in Chemical Identifier Resolver, Uncategorized on November 20th, 2009 by Markus – 3 Comments

A few months ago I stumbled over the fantastic TwirlyMol JavaScript posted by Noel O’Boyle on his blog. TwirlyMol creates a live 3D model of a chemical structure right in your web page without the need for any plugins (the only requirement is a modern browser). Well, I thought it would be really cool to link TwirlyMol with the Resolver, because this would offer a simple way to use it as a 3D molecule viewer, or to embed 3D structure models into web pages, all by just crafting a URL. O.k., here it is.

As the structure representation you can use any chemical structure identifier that is accepted by the Resolver, including arbitrary SMILES and InChI strings.

Use it as a 3D structure viewer:

http://cactus.nci.nih.gov/chemical/structure/aspirin/twirl
http://cactus.nci.nih.gov/chemical/structure/CN(Cc1cnc2nc(N)nc(N)c2n1)c3ccc(cc3)C(=O)NC(CCC(O)=O)C(O)=O/twirl
http://cactus.nci.nih.gov/chemical/structure/InChI=1S/C6H12/c1-2-4-6-5-3-1/h1-6H2/twirl

You can resize it:

http://cactus.nci.nih.gov/chemical/structure/aspirin/twirl?height=200&width=200
http://cactus.nci.nih.gov/chemical/structure/aspirin/twirl?height=400&width=400
http://cactus.nci.nih.gov/chemical/structure/aspirin/twirl?height=800&width=800
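The viewer URLs above follow a simple pattern, so they are easy to build in a script. The following is only a minimal sketch in Python: the helper name `twirl_url` is made up for illustration, and the set of characters left unescaped in the identifier is an assumption based on the example URLs above, not a documented rule of the service.

```python
from urllib.parse import quote, urlencode

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def twirl_url(identifier, width=None, height=None):
    """Build a TwirlyMol viewer URL for a name, SMILES, or InChI string.

    Hypothetical helper, not part of the Resolver itself.
    """
    # Assumption: keep the characters that the example URLs leave unescaped
    # and percent-encode everything else (notably '#', which would otherwise
    # start a URL fragment and cut off a SMILES triple bond).
    path = quote(identifier, safe="/()=+@[]:,")
    url = f"{BASE}/{path}/twirl"
    options = {}
    if height is not None:
        options["height"] = height
    if width is not None:
        options["width"] = width
    if options:
        url += "?" + urlencode(options)
    return url


print(twirl_url("aspirin"))
print(twirl_url("aspirin", width=400, height=400))
print(twirl_url("C#N"))  # SMILES with a triple bond; '#' becomes %23
```

Names and InChI strings pass through unchanged; only characters with a special meaning in URLs are rewritten.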

Embed it into your own web page:

a. Create a div element in your page, tag it with an id (this can be an arbitrary name), and set the height and width attributes of the div element:

<div id="twirler" height="300" width="300"></div>

b. Load the twirl script from cactus.nci.nih.gov and add the URL option “div_id”, which names the id of the div element:

<script src="http://cactus.nci.nih.gov/chemical/structure/aspirin/twirl?div_id=twirler"></script>

That’s it. The rest is done by the JavaScript coming from our web server. Since TwirlyMol depends on the dojo and dojo.gfx JavaScript libraries, these are first loaded into your web page from the Google AJAX Libraries API. The 3D coordinates needed for rendering the structure are calculated by CORINA. You can also use it for more than one 3D model in the same web page; just use a unique id for each div element.
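If you want several models on one page, the div/script pairs can be generated rather than typed by hand. Here is a rough sketch in Python, assuming nothing beyond the two steps described above; the function names and the `twirler0`, `twirler1`, ... id scheme are invented for this example, and the sketch writes an explicit closing script tag.

```python
from html import escape
from urllib.parse import quote

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def embed_snippet(identifier, div_id, size=300):
    """Return the div + script pair that embeds one TwirlyMol model.

    Hypothetical helper; it only produces the HTML shown in steps a and b.
    """
    # Assumption: same unescaped character set as in the example URLs.
    path = quote(identifier, safe="/()=+@[]:,")
    src = f"{BASE}/{path}/twirl?div_id={quote(div_id, safe='')}"
    return (
        f'<div id="{escape(div_id)}" height="{size}" width="{size}"></div>\n'
        f'<script src="{escape(src)}"></script>'
    )


def embed_many(identifiers):
    """Embed several models, giving each div its own unique id."""
    return "\n".join(
        embed_snippet(identifier, f"twirler{i}")
        for i, identifier in enumerate(identifiers)
    )


print(embed_many(["aspirin", "morphine"]))
```

Each script tag names its own div through `div_id`, which is what keeps the models apart on the page.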

I tested how it works on the following browsers:

  • Firefox. It works well on Firefox<3.5, but it is much slower there than on Firefox>3.5. I used Firefox on Linux and Windows XP.
  • Google Chrome. It works fine there and it is fast. The OS was Windows XP.
  • Safari. I tested it only briefly there and I don’t know the particular version numbers (of either the OS or the browser), but it seems to work fine there, too.
  • Internet Explorer. Well, well: it works (I tested it on IE7 and IE8), but it is more reminiscent of a slide show than of a molecule viewer.

If you experience any problems with TwirlyMol embedded by the Chemical Identifier Resolver, please report them to us. One thing I haven’t tested much is how it works with websites that use the dojo and dojo.gfx libraries themselves. I would also be happy to hear about interference (or the lack of it) with other JavaScript libraries like mootools, jquery, prototype.js, etc. I have used TwirlyMol with prototype.js myself and so far I haven’t run into any issues.

Also read Noel’s post about this on his blog; he has some nice live examples using this service (embarrassingly, I still have to figure out how to get TwirlyMol working in my WordPress blog here :-) ).

Molecular Weight, Formula and IUPAC Name

Posted in Chemical Identifier Resolver on November 19th, 2009 by Markus – Be the first to comment

We added a few more structure representations that can be calculated from a structure identifier by the Chemical Identifier Resolver:

molecular weight and monoisotopic mass

http://cactus.nci.nih.gov/chemical/structure/aspirin/mw
http://cactus.nci.nih.gov/chemical/structure/InChIKey=RCINICONZNJXQF-MZXODVADSA-N/monoisotopic_mass

chemical formula

http://cactus.nci.nih.gov/chemical/structure/50-00-0/formula
http://cactus.nci.nih.gov/chemical/structure/nsc740/formula

IUPAC name

http://cactus.nci.nih.gov/chemical/structure/aspirin/iupac_name
http://cactus.nci.nih.gov/chemical/structure/acetone/iupac_name

If you use the IUPAC name representation for a structure identifier, please read the following notice: we took great care during the implementation of the name index for this web service; however, we are aware that it is far from perfect and has quite a few errors in it. Unfortunately, these errors are not easy to find when you have to deal with millions of names and their proper assignment to the correct chemical structure. If you find any mistakes, please tell us. Our plan is to improve the name index over time, and we are of course happy about any contributions that help with this process. Thanks!

Beta 2

Posted in Chemical Identifier Resolver on November 19th, 2009 by Markus – Be the first to comment

We moved to the second beta version of the Chemical Identifier Resolver tonight. The new beta version includes several new features; I will post about them over the next few days. If you encounter any problems that were not there before, please report them to us.

Slow Resolver

Posted in Chemical Identifier Resolver on October 19th, 2009 by Markus – Be the first to comment

Unfortunately, there was a problem with the Resolver during the last few days: it reacted very slowly to any kind of request. As a matter of fact, these things always seem to pop up when you are out of town, and this time proved it again. It is fixed now, and we apologize for any inconvenience this might have caused.

On an additional note: we have some database updates scheduled for the next few days, so the service might react a little more slowly again. However, it will not be even close to the degree of the previous days (which was a software bug combined with a network problem, and both worked “perfectly” together).

Database Update (Standard InChIKeys)

Posted in Chemical Identifier Resolver on October 6th, 2009 by Markus – 2 Comments

We ran an update of the Chemical Identifier Resolver database tonight, which grew the number of chemical structures known to the service quite a bit. The number of Standard InChIKeys indexed with their respective full structure representations is now approx. 93 million (92,939,226 to be exact). The service uses this index to work as a Standard InChIKey Resolver, e.g.

http://cactus.nci.nih.gov/chemical/structure/InChIKey=RZVAJINKPMORJF-UHFFFAOYSA-N/file?format=sdf
http://cactus.nci.nih.gov/chemical/structure/InChIKey=RZVAJINKPMORJF-UHFFFAOYSA-N/file?format=cdxml
http://cactus.nci.nih.gov/chemical/structure/InChIKey=RZVAJINKPMORJF-UHFFFAOYSA-N/smiles
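Since an InChIKey carries no structure information, a typo in the key simply leads to a failed lookup. A quick format check before building the URL can catch that early. The sketch below is in Python and rests on my understanding of the Standard InChIKey layout (14 letters, then 8 letters followed by “SA”, then one letter); the helper name `inchikey_url` is hypothetical.

```python
import re

BASE = "http://cactus.nci.nih.gov/chemical/structure"

# Assumed Standard InChIKey layout: 14 letters, a hyphen, 8 letters plus
# "S" (standard) and "A" (version), a hyphen, and one final letter.
STD_INCHIKEY = re.compile(r"[A-Z]{14}-[A-Z]{8}SA-[A-Z]")


def inchikey_url(key, representation):
    """Build a Resolver URL for a Standard InChIKey (hypothetical helper).

    Accepts the key with or without the "InChIKey=" prefix and raises
    ValueError if it does not look like a Standard InChIKey.
    """
    prefix = "InChIKey="
    if key.startswith(prefix):
        key = key[len(prefix):]
    if not STD_INCHIKEY.fullmatch(key):
        raise ValueError(f"not a Standard InChIKey: {key!r}")
    return f"{BASE}/{prefix}{key}/{representation}"


print(inchikey_url("RZVAJINKPMORJF-UHFFFAOYSA-N", "smiles"))
print(inchikey_url("InChIKey=RZVAJINKPMORJF-UHFFFAOYSA-N", "file?format=sdf"))
```

The check only looks at the shape of the key; whether the key is actually in the index is something only the service can tell you.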

Hiccups

Posted in Chemical Identifier Resolver on September 22nd, 2009 by Markus – 4 Comments

Sorry, the Chemical Identifier Resolver might currently suffer hiccups from time to time, i.e. it might react slowly or with an error message, because we are working on a larger update of the service. I hope it will be finished tonight.

Markus

Create Structure Images from Standard InChIKeys

Posted in Chemical Identifier Resolver on August 3rd, 2009 by Markus – 2 Comments

As you might already have found out, the Chemical Identifier Resolver lets you create a GIF image from a Standard InChIKey very easily:

http://cactus.nci.nih.gov/chemical/structure/InChIKey=BSYNRYMUTXBXSQ-UHFFFAOYSA-N/image

The same can be done for any chemical structure identifier accepted by the Resolver:

http://cactus.nci.nih.gov/chemical/structure/morphine/image
http://cactus.nci.nih.gov/chemical/structure/InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H/image
http://cactus.nci.nih.gov/chemical/structure/CC(=O)Oc1ccccc1C(O)=O/image

The images are all created by CACTVS. So far, the service always returned a 250×250 GIF image, but you might of course ask for more control over how the structure image is created. So we added a few (URL) options to the image method of the Resolver. For instance, the following URL creates a 500-pixel-wide image with the InChIKey as a footer:

http://cactus.nci.nih.gov/chemical/structure/InChIKey=BSYNRYMUTXBXSQ-UHFFFAOYSA-N/image?footer=BSYNRYMUTXBXSQ-UHFFFAOYSA-N&width=500

More options are:

Create a PNG image instead of GIF:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?format=png

Change width, height, linewidth and fontsize:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?width=500&height=500&linewidth=2&symbolfontsize=16

Add some background color:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?bgcolor=yellow

You can also use HTML hex color codes (the ‘#’ character has to be URL-escaped as ‘%23’ in this case):

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?bgcolor=%23AADDEE

For an image with transparent background use ‘transparent’ as color name and switch off antialiasing:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?bgcolor=transparent&antialiasing=0

Show black atom labels instead of the default color scheme for the different atom element types:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?atomcolor=black

Control which hydrogen atoms are shown:

The default value is special, i.e. only hydrogen atoms in functional groups or defining stereochemistry are shown.

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?hsymbol=special
http://cactus.nci.nih.gov/chemical/structure/aspirin/image?hsymbol=all

Control how carbon atoms are shown:

The default value is special; if all is used, all carbon atoms are shown with their atom symbol:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?csymbol=special
http://cactus.nci.nih.gov/chemical/structure/aspirin/image?csymbol=all

Change the color of hydrogen atoms:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?hcolor=gray

Use another color for bonds:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?bondcolor=red

Show R/S stereo labels:

http://cactus.nci.nih.gov/chemical/structure/taxol/image?showstereo=0
http://cactus.nci.nih.gov/chemical/structure/taxol/image?showstereo=1

Add some text to the image:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?header="Aspirin on the top"
http://cactus.nci.nih.gov/chemical/structure/aspirin/image?footer="Aspirin on the bottom"

Add a frame:

http://cactus.nci.nih.gov/chemical/structure/aspirin/image?frame=1
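With this many options it is easy to forget the escaping, for instance of the ‘#’ in a hex color. Letting a URL library do the encoding avoids that. The following Python sketch is only an illustration: `image_url` is a made-up helper, the option names are the ones listed above, and the characters left unescaped in the identifier are an assumption taken from the example URLs.

```python
from urllib.parse import quote, urlencode

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def image_url(identifier, **options):
    """Build an image URL with optional rendering options (hypothetical helper)."""
    # Assumption: keep the characters the example URLs leave unescaped.
    path = quote(identifier, safe="/()=+@[]:,")
    url = f"{BASE}/{path}/image"
    if options:
        # quote_via=quote percent-encodes '#' as %23 and spaces as %20.
        url += "?" + urlencode(options, quote_via=quote)
    return url


print(image_url("aspirin", bgcolor="#AADDEE"))
print(image_url("aspirin", format="png", width=500, height=500))
print(image_url(
    "InChIKey=BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    footer="BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    width=500,
))
```

The options appear in the URL in the order they are passed, so the last call reproduces the footer example from earlier in this post.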

There are more options and we will document them more exhaustively later. If you are familiar with all options CACTVS has available for controlling the GIF/PNG generation, try them – chances are good that they might work. Please also visit our GIF Generator at http://cactus.nci.nih.gov.

Resolve a structure identifier as SDF, CML, MRV, PDB …

Posted in Chemical Identifier Resolver on July 21st, 2009 by Markus – 2 Comments

We’d like to present a new feature of the Chemical Identifier Resolver: in addition to the already available SD file format representation

http://cactus.nci.nih.gov/chemical/structure/aspirin/sdf

the service can now also represent a structure (identifier) in many different text-based structure (file) formats. The general URL format is:

http://cactus.nci.nih.gov/chemical/structure/"identifier"/file?format="format"

The different chemical structure representations are generated by the cheminformatics toolkit CACTVS. Although CACTVS offers a whole lot more formats (including binary ones), we make the following (few) available here:

alc (Alchemy format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=alc

cdxml (CambridgeSoft ChemDraw XML format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=cdxml

cerius (MSI Cerius II format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=cerius

charmm (Chemistry at HARvard Macromolecular Mechanics file format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=charmm

cif (Crystallographic Information File)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=cif

cml (Chemical Markup Language)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=cml

ctx (Gasteiger Clear Text format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=ctx

gjf (Gaussian input data file)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=gjf

gromacs (GROMACS file format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=gromacs

hyperchem (HyperChem file format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=hyperchem

jme (Java Molecule Editor format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=jme

maestro (Schroedinger MacroModel structure file format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=maestro

mol (Symyx molecule file)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=mol

mol2 (Tripos Sybyl MOL2 format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=sybyl2
http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=mol2

mrv (ChemAxon MRV format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=mrv

pdb (Protein Data Bank)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=pdb

sdf (Symyx Structure Data Format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=sdf

sdf3000 (Symyx Structure Data Format 3000)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=sdf3000

sln (SYBYL Line Notation)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=sln

smiles (SMILES)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=smiles

xyz (xyz file format)

http://cactus.nci.nih.gov/chemical/structure/aspirin/file?format=xyz

All of these of course also work with a Standard InChIKey, SMILES or NCI/CADD Identifier as the structure identifier:

http://cactus.nci.nih.gov/chemical/structure/InChIKey=BSYNRYMUTXBXSQ-UHFFFAOYSA-N/file?format=pdb
http://cactus.nci.nih.gov/chemical/structure/CC(=O)Oc1ccccc1C(O)=O/file?format=mrv
http://cactus.nci.nih.gov/chemical/structure/045DA3288E1A0233-FICuS-01-39/file?format=cml
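If you need one structure in several of these formats, the URLs can be produced in a loop. This is a small Python sketch under the same assumptions as the general URL format given above; `file_url` and `file_urls` are illustrative names, and no check is made that the service knows the requested format.

```python
from urllib.parse import quote, urlencode

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def file_url(identifier, fmt):
    """Build the URL for one structure in one file format (hypothetical helper)."""
    # Assumption: keep the characters the example URLs leave unescaped.
    path = quote(identifier, safe="/()=+@[]:,")
    return f"{BASE}/{path}/file?" + urlencode({"format": fmt})


def file_urls(identifier, formats):
    """Map each requested format name to its URL."""
    return {fmt: file_url(identifier, fmt) for fmt in formats}


for fmt, url in file_urls("CC(=O)Oc1ccccc1C(O)=O", ["pdb", "mrv", "cml"]).items():
    print(fmt, url)
```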

The Chemical Identifier Resolver

Posted in Chemical Identifier Resolver on June 25th, 2009 by Markus – 2 Comments

The Chemical Identifier Resolver at http://cactus.nci.nih.gov/chemical/structure

This new service on http://cactus.nci.nih.gov is a resolver for different chemical structure representations and identifiers, including those that do not carry any information about the structure itself. For instance, it can work as a Standard InChIKey Resolver, an NCI/CADD Identifier Resolver or a Chemical Name Resolver. The service also allows one to convert a given structure identifier into another representation or structure identifier.

Representations/identifiers supported are: Standard InChI/InChIKey, NCI/CADD Identifiers (FICuS, FICTS, uuuuu), SMILES, SDF, names, and a few other types of IDs.  See the web page for more information.

For those identifiers that require lookup, the underlying database currently contains about 67 million unique structure records, from which the respective Standard InChIKeys and NCI/CADD Identifiers have been calculated. For lookup by chemical names, 68 million names associated with 16 million unique structure records are currently available in the database. The database continues to grow.

Closely related are the new capabilities of resolving/converting chemical structure identifiers by simply using a URL adhering to the following scheme:

http://cactus.nci.nih.gov/chemical/structure/"structure identifier"/"representation"[/xml]

We just list a few examples here that should give you an idea of what’s possible with this service.  For more detailed explanations, see the above web page.

Example: Standard InChI for chemical name string “aspirin”:

http://cactus.nci.nih.gov/chemical/structure/aspirin/stdinchi
http://cactus.nci.nih.gov/chemical/structure/aspirin/stdinchi/xml

Example: Standard InChIKey of “ethanol” specified as SMILES string “CCO”:

http://cactus.nci.nih.gov/chemical/structure/CCO/stdinchikey

Example: Unique SMILES string of chemical name string “benzene”:

http://cactus.nci.nih.gov/chemical/structure/benzene/smiles

Example: SD File for chemical name string “morphine”:

http://cactus.nci.nih.gov/chemical/structure/morphine/sdf

Example: Chemical names for Standard InChIKey “InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N” (Standard InChIKey of “ethanol”):

http://cactus.nci.nih.gov/chemical/structure/InChIKey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N/names

Example: Synonyms for chemical name string “aspirin”:

http://cactus.nci.nih.gov/chemical/structure/aspirin/names
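The examples above can also be used from a program. What follows is a minimal Python sketch of the URL scheme, including the optional /xml suffix. The function names are invented for this post, the escaping rules are an assumption based on the example URLs, and `resolve` assumes the service answers with UTF-8 text; it needs network access and is therefore not called here.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "http://cactus.nci.nih.gov/chemical/structure"


def resolver_url(identifier, representation, xml=False):
    """Build a URL following the identifier/representation[/xml] scheme."""
    # Assumption: keep the characters the example URLs leave unescaped.
    path = quote(identifier, safe="/()=+@[]:,")
    url = f"{BASE}/{path}/{representation}"
    return url + "/xml" if xml else url


def resolve(identifier, representation, timeout=10):
    """Fetch a representation as text (requires the live service)."""
    url = resolver_url(identifier, representation)
    with urlopen(url, timeout=timeout) as response:
        # Assumption: the response body is UTF-8 encoded text.
        return response.read().decode("utf-8")


print(resolver_url("aspirin", "stdinchi", xml=True))
print(resolver_url("CCO", "stdinchikey"))
# resolve("CCO", "stdinchikey") would query the service over the network.
```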