General Information

The NCI/CADD group is a research unit within the Chemical Biology Laboratory at the National Cancer Institute. Read more about the CADD Group's Chemoinformatics Tools and User Services. Or, visit our blog.

Happy 25th anniversary, cactus!

To search and display chemical structures here, you will need Java/JavaScript enabled in your browser.

Accessibility (Section 508): We want to hear from users with disabilities, especially visually impaired users, about where they have experienced particular difficulties in using our site. If you are a visually impaired user, please e-mail M.C. Nicklaus and team with your experiences and suggestions.

Chemical Activity Predictor

This service was the first of our new Apps (beta version). It provided predictions of a (growing) number of small-molecule properties calculated by QSAR models created with the GUSAR software. Due to ever stricter security mandates, this service unfortunately had to be turned off. As an alternative, you can try out the NIH/NCATS Predictor service, which contains many more models than the Chemical Activity Predictor.

Chemical Identifier Resolver

This service works as a resolver for different chemical structure identifiers and allows the conversion of a given structure identifier into another representation or structure identifier. It can be used via a web form or a simple URL API.
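As a rough illustration of the "simple URL API", the sketch below builds a request URL of the form `<base>/<identifier>/<representation>`. The base URL, the helper names `build_resolver_url` and `resolve`, and the representation names used (such as `smiles` or `stdinchikey`) are assumptions made for this example and should be checked against the service's own documentation.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the resolver's URL API (not stated on this page).
BASE_URL = "https://cactus.nci.nih.gov/chemical/structure"


def build_resolver_url(identifier: str, representation: str) -> str:
    """Build a resolver URL: <base>/<identifier>/<representation>.

    The identifier is percent-encoded so that characters common in
    SMILES or InChI strings (such as '/', '=', '#') stay in one path segment.
    """
    return f"{BASE_URL}/{quote(identifier, safe='')}/{quote(representation, safe='')}"


def resolve(identifier: str, representation: str, timeout: float = 10.0) -> str:
    """Fetch the resolved representation as text (requires network access)."""
    with urlopen(build_resolver_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Only prints the URL; call resolve("aspirin", "smiles") to query the service.
    print(build_resolver_url("aspirin", "smiles"))
```

Keeping URL construction separate from the network call makes the encoding step easy to check offline before any request is sent.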

Enhanced NCI Database Browser Release 2.2

A web service for the compounds of the open NCI database (>250,000 structures), with different kinds of output features and links to other services for continued processing. (Version 2.2 is a 2013 technical overhaul.)


Allows you to test a set of 86 different tautomeric transforms with your own molecules. In addition to the standard rules of the chemoinformatics toolkit CACTVS, there are 60+ additional rules compiled in the context of the IUPAC project of Redesign of the Handling of Tautomerism in InChI V2.

Chemical Structure Lookup Service (CSLS)

Look up whether a structure occurs in many different databases, both public and commercial. Currently loaded: pointers to over 74 million entries from more than 100 databases, representing more than 46 million unique chemical structures.

PROSIT: Online Pseudorotation Tool Version 2

Pseudo-Rotational Online Service and Interactive tool (PROSIT). Calculates, and displays in tabular format, the pseudorotation parameters (P, chi, nu_max) and other useful information for 3D structures of nucleosides, nucleotides and analogs thereof, as well as for DNA and RNA single and double strands, whether by themselves or complexed with a protein. You can request a visual output of the found nucleoside (analog) or (oligo-)nucleotide in 2D or 3D.

Pseudorotation Visualization: Ancillary tool to modify the pseudorotation parameters P and nu_max interactively, and observe the resulting conformational changes in a 3D model of a five-membered ring.
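For readers unfamiliar with the pseudorotation parameters, the sketch below shows one common way to obtain the phase angle P and the puckering amplitude nu_max from the five endocyclic torsion angles of a five-membered ring, using the widely cited Altona-Sundaralingam relation. This is an illustrative assumption, not necessarily the procedure PROSIT itself uses; the function names are hypothetical, and the glycosidic torsion chi is not covered.

```python
import math

# Constant from the Altona-Sundaralingam relation: sin 36 deg + sin 72 deg.
_S = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))


def pseudorotation(nu0, nu1, nu2, nu3, nu4):
    """Return (P, nu_max) in degrees from endocyclic torsions nu0..nu4 (degrees).

    tan P = ((nu4 + nu1) - (nu3 + nu0)) / (2 * nu2 * (sin 36 + sin 72))
    atan2 places P in the correct quadrant, which replaces the usual
    "add 180 degrees when nu2 is negative" rule.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * _S
    phase = math.degrees(math.atan2(numerator, denominator)) % 360.0
    # For an ideal ring this equals nu2 / cos(P), but it avoids dividing
    # by zero when P is close to 90 or 270 degrees.
    nu_max = math.hypot(numerator, denominator) / (2.0 * _S)
    return phase, nu_max


def torsions_from(phase, nu_max):
    """Inverse mapping: nu_j = nu_max * cos(P + 144 deg * (j - 2)), j = 0..4."""
    return [nu_max * math.cos(math.radians(phase + 144.0 * (j - 2))) for j in range(5)]


if __name__ == "__main__":
    torsions = torsions_from(18.0, 38.0)
    print(pseudorotation(*torsions))
```

The `torsions_from` helper mirrors what the visualization tool lets you do interactively: choose P and nu_max, then see which ring torsions result.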

Online SMILES Translator

Web-based SMILES Translation Service. Converts SMILES strings, SDF, PDB, MOL and other formats into USMILES as well as into SDF, PDB and MOL file formats.

GIF Creator for Chemical Structures

Computer-generated GIF and PNG images of chemical structures for WWW pages etc. from your 2D or 3D input files. Forms-based interface, automatic generation of 2D display coordinates for structures without them.

VRML Creator for Chemical Structures

Generation of VRML scenes from your 2D or 3D data files. Automatic generation of 3D coordinates if not contained in the input structure. Many display options to control the visual appearance.

Optical Structure Recognition (OSRA)

Converts graphical representations of chemical structures in journal articles, patent documents, textbooks, trade magazines etc., into SMILES. OSRA can read over 90 graphical formats including GIF, JPEG, PNG, TIFF, PDF, PS etc.

OSRA Web Interface: An online demonstration of the capabilities of OSRA.

SDF Toolkit

Toolkit (programmed in Perl 5) providing functions to read and parse structure files in MDL's SDF format, such as filtering records, adding or removing properties, and selecting individual records out of large SD files. A Windows version is also available.
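To make the kind of operation described above concrete, here is a minimal Python sketch (not the Perl toolkit itself, and not its API) that splits SD file text into records at the `$$$$` delimiter, reads the `> <NAME>` data items of each record, and selects records by a property value. All function names are hypothetical, and the parser covers only the common layout of data items, not every detail of the format.

```python
def parse_record(lines):
    """Split one SD record into (molblock_text, properties_dict)."""
    molblock, props = [], {}
    i = 0
    # The connection table runs up to and including the "M  END" line.
    while i < len(lines):
        molblock.append(lines[i])
        i += 1
        if molblock[-1].startswith("M  END"):
            break
    # Data items: a header line "> <NAME>", value lines, then a blank line.
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line.startswith(">"):
            continue
        start = line.find("<")
        end = line.find(">", start + 1)
        if start == -1 or end == -1:
            continue
        values = []
        while i < len(lines) and lines[i].strip():
            values.append(lines[i])
            i += 1
        props[line[start + 1:end]] = "\n".join(values)
    return "\n".join(molblock), props


def iter_sdf_records(text):
    """Yield (molblock, properties) for each record in SD file text."""
    record = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            if record:
                yield parse_record(record)
            record = []
        else:
            record.append(line)
    if any(line.strip() for line in record):
        yield parse_record(record)  # tolerate a missing final delimiter


def select_records(text, name, value):
    """Return the records whose property `name` equals `value`."""
    return [rec for rec in iter_sdf_records(text) if rec[1].get(name) == value]
```

For very large SD files, the same record-splitting logic can be applied to a file object line by line instead of to a string held in memory.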


A fast clustering program for computation of a representative subset of a large dataset. Applicable, but not limited, to chemical databases.


ReactionCode is a versatile format for the searching, analysis, classification, transformation, and encoding/decoding of reactions.

PubChem Structure + Assay Download Page

Download SD files of structures from PubChem with assay data included as properties, suitable for building QSAR or other types of models.

NCI Database Download Page

Download the "raw" data in bulk format that were used in building the Enhanced NCI Database Browser.

FDA SPL Download Page

Download a Mapping File to, or (older) SD file versions of, Structured Product Labeling (SPL) index files of substances indexed by FDA.

HIV-1 Integrase Inhibitor Download Page

Download structures and annotations of HIV-1 integrase inhibitors collected from literature.

SAVI Products Download Page

Download products and other associated data of our Synthetically Accessible Virtual Inventory (SAVI) project: Computational generation of a very large database of reliably and inexpensively synthesizable novel compounds with desirable properties for drug development.

Tautomerism Database Download Page

Download a spreadsheet with 5,977 structures extracted from experimental literature containing 2,819 cases of tautomeric tuples, annotated with experimental conditions, structure identifiers, bibliographic references, and preliminary analysis of the tautomerism involved in each case. (Release 3)

Multi-Species Acute Toxicity Database Download Page

Download a spreadsheet with toxicity measurements for 80,081 unique compounds (compounds that also went into the RTECS® database).

iRL-Based Screening Sample Download Page

Download an NCI/CADD Group curated set of over 140 million compounds based on iResearch™ Library (iRL) commercial screening samples (

Claimed Small-Molecule Structures in NCI/NIH Patents

Download a database of about 12,700 structures extracted from patents filed by NCI/NIH, granted through September 2019.

Chemistry Search Services on the Web

A table of searchable small molecule databases available on the web, listing URLs and an (incomplete) survey of the services' features and capabilities. Contains U.S. government, academic, and commercial web sites.

Meeting Presentations & Other Documents

Presentations from various meetings, conferences and other associated documents.



Historical Page: Other Public Chemical Data

An early list of URLs that point to other public chemical (or chemistry-related) information. Currently limited to U.S. government web sites. These sites may contain search capabilities and/or other public datasets. Note: many of these sites and links do not work any more as of 2023.

M.C. Nicklaus and team

Last Update: 2024-05-13