
The compound library

Compound libraries contain information on chemical compounds, such as their connectivity, chemical class and one-letter-code. The compound library has several uses, but the most important one is to provide the connectivity information for the :class:`rule-based processor <RuleBasedBuilder>`.

The compound definitions for standard PDB files are taken from the components.cif dictionary provided by the PDB. The dictionary is updated with every PDB release and augmented with the compound definitions of newly crystallized compounds.

If you downloaded the bundle, a recent version of the compound library is already included. If you are compiling from source or want to incorporate the latest compound definitions, follow :ref:`these instructions <mmcif-convert>` to build the compound library manually.

Holds the description of a chemical compound, such as its three-letter-code and chemical class.

Definition of an atom

Definition of a bond

Example: Translating SEQRES entries

In this example we will translate the three-letter-codes given in the SEQRES record to one-letter-codes. Note that this automatically takes care of modified amino acids such as selenomethionine (MSE).

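from ost import conop

# Load the compound library from disk
compound_lib=conop.CompoundLib.Load('compounds.chemlib')
seqres='ALA GLY MSE VAL PHE'
sequence=''
for tlc in seqres.split():
  compound=compound_lib.FindCompound(tlc)
  # Three-letter-codes that are not found in the library are skipped
  if compound:
    sequence+=compound.one_letter_code
print(sequence) # prints 'AGMVF'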
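
The loop above skips any three-letter-code that is not found, so the resulting sequence can be shorter than the SEQRES entry. The following is a minimal sketch of one way to keep such gaps visible, assuming a placeholder letter such as 'X' is acceptable for unknown residues. To stay runnable without OpenStructure, a plain dictionary (``ONE_LETTER``, a hypothetical stand-in) replaces the compound library lookup, and ``translate_seqres`` is an illustrative helper, not part of the API.

```python
# Hypothetical stand-in for the compound library: maps three-letter-codes
# to one-letter-codes. With OpenStructure, look the code up with
# compound_lib.FindCompound(tlc) and read compound.one_letter_code instead.
ONE_LETTER = {'ALA': 'A', 'GLY': 'G', 'MSE': 'M', 'VAL': 'V', 'PHE': 'F'}

def translate_seqres(seqres, unknown='X'):
    """Translate space-separated three-letter-codes to a one-letter string.

    Codes missing from the lookup are replaced by `unknown` instead of
    being dropped, so the output has one letter per input code.
    """
    return ''.join(ONE_LETTER.get(tlc, unknown) for tlc in seqres.split())

print(translate_seqres('ALA GLY MSE VAL PHE'))
print(translate_seqres('ALA ZZZ GLY'))
```

Keeping one letter per residue preserves the alignment between the translated sequence and the original SEQRES positions.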

Creating a compound library

The simplest way to create a compound library is to use the :program:`chemdict_tool`. The program allows you to import the chemical description of the compounds from an mmCIF dictionary, e.g. the components.cif dictionary provided by the PDB. The latest dictionary can be downloaded from the wwPDB site. Because the files are rather large, it is recommended to download the gzipped version.
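
Since the dictionary is large, it can be handy to inspect a download without unpacking it first. The following is a minimal sketch, assuming each compound in the dictionary opens a data block with a line starting with ``data_`` (standard mmCIF data block syntax). ``count_compounds`` is a hypothetical helper and not part of OpenStructure.

```python
import gzip

def count_compounds(path):
    """Count the data blocks in an mmCIF dictionary, plain or gzipped.

    Assumes one data block (a line starting with 'data_') per compound.
    """
    # Pick the opener from the file extension; both are read in text mode.
    opener = gzip.open if path.endswith('.gz') else open
    count = 0
    with opener(path, 'rt') as handle:
        for line in handle:
            if line.startswith('data_'):
                count += 1
    return count
```

For example, ``count_compounds('components.cif.gz')`` would stream through a gzipped download line by line, so the whole file never has to be held in memory.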

After downloading the file, use :program:`chemdict_tool` to convert the mmCIF dictionary into our internal format.

chemdict_tool create <components.cif> <compounds.chemlib>

Note that the :program:`chemdict_tool` only understands .cif and .cif.gz files. If you would like to use other sources for the compound definitions, consider writing a script using the :doc:`compound library <compoundlib>` API.

If you are working with CHARMM trajectory files, you will also have to add the definitions for CHARMM. Assuming you are in the top-level source directory of OpenStructure, this can be achieved by:

chemdict_tool update modules/conop/data/charmm.cif <compounds.chemlib> charmm
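
If the library is rebuilt regularly, the two steps can be scripted. The following is a minimal sketch that assembles exactly the ``create`` and ``update`` invocations shown above and runs them in order. The function names ``chemdict_commands`` and ``build_library`` are hypothetical, and the sketch assumes :program:`chemdict_tool` is available on the ``PATH``.

```python
import subprocess

def chemdict_commands(components_cif, chemlib, charmm_cif=None):
    """Return the chemdict_tool invocations as argument lists.

    The first command creates the library from the PDB dictionary. If a
    CHARMM dictionary is given, a second command adds its definitions.
    """
    commands = [['chemdict_tool', 'create', components_cif, chemlib]]
    if charmm_cif is not None:
        commands.append(
            ['chemdict_tool', 'update', charmm_cif, chemlib, 'charmm'])
    return commands

def build_library(components_cif, chemlib, charmm_cif=None):
    """Run the commands in order, raising on the first failure."""
    for command in chemdict_commands(components_cif, chemlib, charmm_cif):
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(command, check=True)
```

Separating the command construction from its execution makes it possible to print or log the commands before running them.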

Once your library has been created, you need to tell cmake where to find it and make sure it gets staged.

cmake -DCOMPOUND_LIB=compounds.chemlib
make