Using SDF as “Identifier” Format

The Resolver now accepts SD files as an input (or “identifier”) format. Here is an example of how it works in Python. In words: replace all line feeds with the literal characters “\n” and URL-encode the resulting string:

import urllib.parse
import urllib.request

sdf = """C2H3O
SItclcactv02251111172D 0   0.00000     0.00000

  6  5  0  0  0  0  0  0  0  0999 V2000
    2.0000    0.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.8660   -0.2500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7321    0.2500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.4675   -0.7249    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.2646   -0.7249    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.2690   -0.0600    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  2  4  1  0  0  0  0
  2  5  1  0  0  0  0
  3  6  1  0  0  0  0
M  END
$$$$
"""
# Replace each real line feed with the two characters "\n"
sdf = sdf.replace('\n', '\\n')
# URL-encode the result and use it as the structure identifier in the URL
url = 'http://cactus.nci.nih.gov/chemical/structure/%s/smiles' % urllib.parse.quote(sdf)
resolver = urllib.request.urlopen(url)
# The response body is bytes; decode it to get the SMILES string
smiles = resolver.read().decode('utf-8')
print(smiles)
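
To make the two encoding steps concrete, here is a small Python 3 sketch that applies them to a two-line toy string instead of a full SD file. The helper name encode_for_resolver is made up for illustration and is not part of the Resolver.

```python
from urllib.parse import quote, unquote

def encode_for_resolver(text):
    """Replace real line feeds with the two characters '\\n', then URL-encode."""
    escaped = text.replace("\n", "\\n")
    return quote(escaped)

encoded = encode_for_resolver("line 1\nline 2")
print(encoded)  # line%201%5Cnline%202

# Reversing both steps gives back the original text
decoded = unquote(encoded).replace("\\n", "\n")
print(decoded == "line 1\nline 2")  # True
```

The backslash of each “\n” ends up as %5C and each space as %20, so the fixed-column layout of the molfile survives the trip through the URL.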

Here is how it works in JavaScript if, for instance, you have a molfile held in a variable and want to prepare the URL for an AJAX call:

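// Replace each real line feed with the two characters "\n"
molfile = molfile.replace(/\n/g, '\\n');
// URL-encode the result and place it between the base path and the requested representation
var url = 'http://cactus.nci.nih.gov/chemical/structure/' + encodeURIComponent(molfile) + '/smiles';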

The maximum length of the (encoded!) string accepted by the Resolver is 32 kB. If you send a multiple-record file, only the first structure record is currently used.
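
Both limits can be checked on the client before a request is sent. The following Python 3 sketch is one way to do that. The names first_record, build_resolver_url and MAX_ENCODED_LENGTH are invented for this example, and reading “32kB” as 32 * 1024 characters of encoded text is an assumption, not documented Resolver behaviour.

```python
from urllib.parse import quote

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"
# Assumption: "32kB" is read here as 32 * 1024 characters of encoded text.
MAX_ENCODED_LENGTH = 32 * 1024

def first_record(sdf):
    """Return the first record of an SD file, up to and including its '$$$$' line."""
    kept = []
    for line in sdf.splitlines():
        kept.append(line)
        if line.strip() == "$$$$":
            break
    return "\n".join(kept) + "\n"

def build_resolver_url(sdf, representation="smiles"):
    """Build a Resolver URL from the first record of an SD file."""
    record = first_record(sdf)
    encoded = quote(record.replace("\n", "\\n"))
    if len(encoded) > MAX_ENCODED_LENGTH:
        raise ValueError("encoded structure is %d characters, over the limit of %d"
                         % (len(encoded), MAX_ENCODED_LENGTH))
    return "%s/%s/%s" % (BASE_URL, encoded, representation)

# Toy two-record input: only record "A" ends up in the URL
two_records = "A\nM  END\n$$$$\nB\nM  END\n$$$$\n"
url = build_resolver_url(two_records)
print(url)
```

Since the Resolver only looks at the first record anyway, dropping the others before encoding keeps large files from hitting the length limit unnecessarily.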

Markus

One thought on “Using SDF as “Identifier” Format”

  1. Markus,

    Three questions:
    1. Is there an offline version available? If so, how complicated would it be to get it, and what would the cost or collaboration scenario be?

    2. Do you provide something like
    http://cactus.nci.nih.gov/chemical/structure/“structure identifier”/dbid/“database name”
    where “database name” could be chembl, pubchem, etc., or
    http://cactus.nci.nih.gov/chemical/structure/“structure identifier”/dbid_all_vendors

    3. Could you take care of the maintenance of all those databases and provide an offline version of the identifier matchings, including dbid to original vendors, for download? Again, what would the cost or collaboration scenario be?
