Description of the Cambridge Structural Database


Most of the information given here is taken from the original CCDC pages.


Overview

The Cambridge Structural Database (CSD) is the only comprehensive collection of small-molecule organic and organometallic crystal structures determined by X-ray and neutron diffraction techniques. It is maintained at the Cambridge Crystallographic Data Centre (CCDC).

As such, the CSD is one of the most important sources of experimental data on the shape and dimensions of molecules and a unique source of data on intermolecular interactions.

A set of specialised programs developed at the CCDC enables users to search the database, analyse the results statistically, and display results and structures visually.

The CSD and the software together constitute the CSD System.

The CSD System is used by over 100 industrial companies in many areas of research, including rational drug and agrochemical design, formulation of novel non-linear optical materials, and design of catalysts for the petrochemical industry. It is also used for fundamental research in some 1,000 universities and research institutes worldwide.

The following pages provide a brief account of the different parts of the CSD System.


Information Content

The Cambridge Structural Database (CSD) contains crystal structure information for over 150,000 organic and organometallic compounds. All of these crystal structures have been analysed using X-ray or neutron diffraction techniques.

For each crystallographic entry in the CSD, three distinct types of information are stored. These are conveniently categorized in terms of their "dimensionality".

1D Information

This data field holds all of the bibliographic material for the entry and summarises the structural and experimental information for the crystal structure. The text and numerical information includes the authors' names and the full journal reference, as well as the crystallographic cell dimensions and space group. (Example.)

2D Information

A conventional chemical diagram of the molecule is stored in this information field. This is encoded as a chemical connection table comprising atom and bond properties. Atom properties include element symbol, number of connected non-hydrogen atoms, number of connected hydrogen atoms, and the net charge. Seven different bond types can be specified in bond properties. (Example.)
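The connection table described above can be pictured as a small data structure. The following Python sketch is only an illustration and not the CSD's actual storage format: the class names, the choice to store hydrogens as counts rather than as explicit atoms, and the numbering of the seven bond types (1 to 7, with 1 used here for a single bond) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atom:
    element: str      # element symbol
    n_non_h: int      # number of connected non-hydrogen atoms
    n_h: int          # number of connected hydrogen atoms
    charge: int = 0   # net charge


@dataclass
class Bond:
    a: int            # index of the first atom
    b: int            # index of the second atom
    bond_type: int    # hypothetical code 1..7; the real CSD codes are not given here

    def __post_init__(self) -> None:
        if not 1 <= self.bond_type <= 7:
            raise ValueError("bond_type must be one of seven codes (1..7)")


@dataclass
class ConnectionTable:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)

    def neighbours(self, index: int) -> List[int]:
        """Return the sorted indices of atoms bonded to the given atom."""
        found = []
        for bond in self.bonds:
            if bond.a == index:
                found.append(bond.b)
            elif bond.b == index:
                found.append(bond.a)
        return sorted(found)

    def is_consistent(self) -> bool:
        """Check that each atom's non-hydrogen count matches its bonds."""
        return all(
            atom.n_non_h == len(self.neighbours(i))
            for i, atom in enumerate(self.atoms)
        )


# Ethanol (C-C-O) with hydrogens held as counts on each heavy atom.
ethanol = ConnectionTable(
    atoms=[Atom("C", 1, 3), Atom("C", 2, 2), Atom("O", 1, 1)],
    bonds=[Bond(0, 1, 1), Bond(1, 2, 1)],
)
```

Calling `ethanol.is_consistent()` checks the stored neighbour counts against the bond list, which is the kind of cross-check a connection table makes possible.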

3D Information

A 3D representation of the molecule can be generated from the information stored in this field. This data includes the atomic coordinates, the space group symmetry, the covalent radii and the crystallographic connectivity established by using those radii. The 3D representation is matched with the 2D chemical structure. (Example.)
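The idea of establishing connectivity from covalent radii can be sketched as a simple distance test: two atoms are treated as bonded when their separation does not exceed the sum of their radii plus a tolerance. The Python sketch below assumes Cartesian coordinates in ångströms, ignores space group symmetry, and uses illustrative radii and a 0.4 Å tolerance; none of these values or names are taken from the text above.

```python
import math
from itertools import combinations

# Illustrative covalent radii in angstroms (assumed values for this sketch).
COVALENT_RADII = {"H": 0.23, "C": 0.68, "N": 0.68, "O": 0.68}
TOLERANCE = 0.4  # assumed tolerance in angstroms


def assign_bonds(atoms, radii=COVALENT_RADII, tolerance=TOLERANCE):
    """Return index pairs (i, j) of atoms judged to be bonded.

    `atoms` is a list of (element, (x, y, z)) tuples in Cartesian
    coordinates. A pair is bonded when its distance is at most the
    sum of the two covalent radii plus the tolerance.
    """
    bonds = []
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(enumerate(atoms), 2):
        distance = math.dist(xyz_i, xyz_j)
        if distance <= radii[el_i] + radii[el_j] + tolerance:
            bonds.append((i, j))
    return bonds


# A rough water geometry: each O-H distance is about 0.96 angstroms.
water = [
    ("O", (0.0, 0.0, 0.0)),
    ("H", (0.96, 0.0, 0.0)),
    ("H", (-0.24, 0.93, 0.0)),
]
water_bonds = assign_bonds(water)
```

In this example the two O-H pairs fall within the threshold while the H-H pair does not, so only the two O-H bonds are assigned.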


QUEST

The QUEST program is the basic search engine for the CSD. It allows the user to design and specify a query which is then used to interrogate the database. The database is searched sequentially until the program locates a CSD entry which matches the search criteria.

QUEST can be used to search the database for textual and numerical information and for 2D structures and substructures. It can also be used for 3D (geometric) searches and to search the extended crystal structure for inter- and intra-molecular non-bonded interactions. Various geometric parameters (bond lengths, bond angles, etc.) can be defined for a search, and their values will be tabulated for subsequent analysis with VISTA.

QUEST allows the user to view all of the information available for a hit entry. The 3D representation can be manipulated and analysed using the tools available in the 3D display window. These include viewing the packing diagram, performing a variety of geometry calculations, and rotating, translating and enlarging the molecule.

The user can elect to keep or reject a hit and the search then continues until the next hit is located. This process is repeated until the end of the database is encountered or until the user stops the search. At any stage the user can elect to allow the search to run non-interactively, in which case all hits located by QUEST will be retained.
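The keep/reject loop described above can be sketched as follows. This is an illustration of the workflow only, not QUEST's implementation: the function name, the decision strings ("keep", "reject", "auto", "stop"), and the sample entries are all hypothetical. The sketch also assumes that choosing "auto" keeps the current hit and that choosing "stop" discards it, which the description above does not specify.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Entry = Dict[str, Any]


def sequential_search(
    entries: Iterable[Entry],
    matches: Callable[[Entry], bool],
    review: Optional[Callable[[Entry], str]] = None,
) -> List[Entry]:
    """Scan entries in order and collect the hits the user keeps.

    With review=None the search runs non-interactively and keeps every hit.
    """
    kept: List[Entry] = []
    interactive = review is not None
    for entry in entries:
        if not matches(entry):
            continue                  # not a hit; move to the next entry
        if not interactive:
            kept.append(entry)        # non-interactive: retain all hits
            continue
        decision = review(entry)
        if decision == "stop":
            break                     # user ends the search early
        if decision == "auto":
            interactive = False       # switch to non-interactive mode
            kept.append(entry)
        elif decision == "keep":
            kept.append(entry)
        # any other decision (e.g. "reject") drops the hit
    return kept


# Illustrative data only; apart from ACACVO02 the refcodes are made up.
TRANSITION_METALS = {"V", "Fe", "Cu"}  # tiny subset for the example
entries = [
    {"refcode": "ENTRY01", "elements": {"C", "H", "O"}},
    {"refcode": "ACACVO02", "elements": {"C", "H", "O", "V"}},
    {"refcode": "ENTRY03", "elements": {"C", "H", "Fe"}},
    {"refcode": "ENTRY04", "elements": {"C", "H", "N", "Cu"}},
]


def has_transition_metal(entry: Entry) -> bool:
    return bool(entry["elements"] & TRANSITION_METALS)


scripted = iter(["keep", "reject", "keep"])  # stands in for user input
reviewed = sequential_search(entries, has_transition_metal, lambda e: next(scripted))
reviewed_codes = [e["refcode"] for e in reviewed]
all_codes = [e["refcode"] for e in sequential_search(entries, has_transition_metal)]
```

Passing a `review` callback models the interactive mode; omitting it models a search that runs to the end of the database and retains every hit.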

Default display of hit entries in QUEST.
The entry is a vanadium complex (ACACVO02), retrieved during a search for entries containing transition metal complexes. The required fragment is highlighted in red in the 2D diagram (right). The 3D structure is shown on the left.


VISTA

VISTA is an interactive analytical and statistical program. It reads the table files that QUEST generates automatically when 3D parameters are defined for a CSD search. Data is read directly into the VISTA spreadsheet, from which all analysis of the search results is carried out. VISTA can generate histograms, scattergrams and polar plots, and can perform principal component analysis and correlation/covariance analysis.
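To make the kinds of analysis listed above concrete, the following Python sketch computes a histogram and a covariance/correlation pair for a two-column table of geometric parameters. It is a generic illustration using only the standard library, not VISTA's code; the function names are hypothetical, and the bond lengths and angles are made-up numbers rather than CSD data.

```python
import math


def histogram(values, lo, hi, n_bins):
    """Count values falling into n_bins equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for value in values:
        if lo <= value <= hi:
            # Clamp so that value == hi lands in the last bin.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation coefficient of two columns."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up table rows: (bond length in angstroms, bond angle in degrees).
rows = [(1.515, 109.5), (1.541, 110.1), (1.533, 109.8), (1.508, 109.2), (1.552, 110.4)]
lengths = [length for length, _ in rows]
angles = [angle for _, angle in rows]

length_counts = histogram(lengths, 1.50, 1.56, 3)
length_angle_r = correlation(lengths, angles)
```

A principal component analysis would start from the same covariance values, arranged as a matrix, and then extract its eigenvectors; that step is omitted here to keep the sketch short.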

Information can be selected from the VISTA spreadsheet and written to user-defined tables for input into other analytical packages. Plots can be edited by the user and saved as PostScript files.

CSD entries that were saved during the QUEST search can be reviewed within VISTA. The 1D information, the 2D chemical diagram, and the 3D structural diagram can all be displayed. The options available in QUEST for manipulating the 3D structure are also available in VISTA.

Polar scatterplot generated by VISTA for the conformational study of 8-membered rings. (Allen, Howard & Pitchford, Acta Cryst. B, 1996, in press.)


PLUTO

PLUTO is a visualization tool for displaying the 3D structures of database entries. It incorporates many features not available within the QUEST program, including the ability to explore the non-bonded networks present in crystal structures along user-defined atomic pathways. PLUTO also lets the user create alternative views of the molecule and print them to a PostScript file. These views can include packing diagrams and crystal network information.

Expanding the crystal network with PLUTO

