Commit 404baa79 authored by Studer Gabriel
mmcif writer: docu updates

parent faa3024a
......@@ -1678,6 +1678,9 @@ significant impact on how chains are assigned to mmCIF entities, chain names and
residue numbers. Ideally, the input is *mmcif_conform* which is the case
when loading a structure from a valid mmCIF file with :func:`ost.io.LoadMMCIF`.
Behaviour when *mmcif_conform* is True
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Expected properties when *mmcif_conform* is enabled:
* The residues in a chain all represent the same mmCIF entity. That is for
......@@ -1696,7 +1699,7 @@ Expected properties when *mmcif_conform* is enabled:
type "branched". There, a subtype such as CHAINTYPE_OLIGOSACCHARIDE is
expected.
* The residue numbers in "polymer" chains must match the SEQRES of the
underlying entity with 1-based indexing.Insertion codes are not allowed
underlying entity with 1-based indexing. Insertion codes are not allowed
and raise an error.
* Each residue must have a valid chem class assigned (available as
:func:`ost.mol.ResidueHandle.GetChemClass`). Even though this information
......@@ -1743,6 +1746,54 @@ a few special cases:
_atom_site.pdbx_PDB_ins_code
Behaviour when *mmcif_conform* is False
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
If *mmcif_conform* is not enabled, the only expectation is that chem classes
(available as :func:`ost.mol.ResidueHandle.GetChemClass`) are set. OpenStructure
delegates this to the :class:`ost.conop.Processor` and thus requires a valid
:class:`ost.conop.CompoundLib` when reading a structure. The structure then
undergoes significant preprocessing in which chains are split purely based on
the assigned chem classes. Each chain is split according to the following rules:
* separate chain of _entity.type "non-polymer" for each residue with chem class
:class:`NON_POLYMER`/:class:`UNKNOWN`
* if any residue has chem class :class:`WATER`, all of them are collected
into one separate chain with _entity.type "water"
* if any residue is a saccharide, i.e. has chem class
:class:`SACCHARIDE`/:class:`L_SACCHARIDE`/:class:`D_SACCHARIDE`, all of them
are collected into one separate chain of _entity.type "branched" and
_pdbx_entity_branch.type "oligosaccharide".
* if any residue has chem class :class:`RNA_LINKING`, all of them are collected
into one separate chain of _entity.type "polymer" and
_entity_poly.type "polyribonucleotide".
* if any residue has chem class :class:`DNA_LINKING`, all of them are collected
into one separate chain of _entity.type "polymer" and
_entity_poly.type "polydeoxyribonucleotide".
* if any residue is peptide linking, all of them are collected into one separate
chain of _entity.type "polymer" and _entity_poly.type
"polypeptide(L)"/"polypeptide(D)". Only the following combinations of chem
classes are allowed: either
:class:`L_PEPTIDE_LINKING`/:class:`PEPTIDE_LINKING` or
:class:`D_PEPTIDE_LINKING`/:class:`PEPTIDE_LINKING`. Mixing
:class:`L_PEPTIDE_LINKING` and :class:`D_PEPTIDE_LINKING` raises an error.
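
The splitting rules above can be illustrated with a minimal, OpenStructure-independent sketch. It is not the writer's actual implementation: the function ``split_chain``, the use of plain strings for chem classes and the dictionary layout are hypothetical and chosen for illustration only. The order of the resulting chains and the choice of "polypeptide(L)" for a chain containing only :class:`PEPTIDE_LINKING` residues are assumptions that the text above does not specify.

```python
SACCHARIDE_CLASSES = {"SACCHARIDE", "L_SACCHARIDE", "D_SACCHARIDE"}
PEPTIDE_CLASSES = {"PEPTIDE_LINKING", "L_PEPTIDE_LINKING", "D_PEPTIDE_LINKING"}


def split_chain(residues):
    """Split one chain given as a list of (residue_name, chem_class) tuples.

    Hypothetical helper: returns a list of dicts with the keys
    "entity_type", "subtype" and "residues".
    """
    classes = {chem_class for _, chem_class in residues}
    if {"L_PEPTIDE_LINKING", "D_PEPTIDE_LINKING"} <= classes:
        raise ValueError("cannot mix L_PEPTIDE_LINKING and D_PEPTIDE_LINKING")
    # Assumption: a chain with only PEPTIDE_LINKING residues is labelled
    # polypeptide(L); the documentation does not cover this case.
    peptide_type = ("polypeptide(D)" if "D_PEPTIDE_LINKING" in classes
                    else "polypeptide(L)")

    chains = []  # resulting chains, in order of first appearance (assumption)
    pooled = {}  # one shared chain per pooled category

    def pooled_chain(key, entity_type, subtype):
        # Create the shared chain on first use, then reuse it.
        if key not in pooled:
            pooled[key] = {"entity_type": entity_type, "subtype": subtype,
                           "residues": []}
            chains.append(pooled[key])
        return pooled[key]

    for name, chem_class in residues:
        if chem_class in ("NON_POLYMER", "UNKNOWN"):
            # Every such residue becomes its own "non-polymer" chain.
            chains.append({"entity_type": "non-polymer", "subtype": None,
                           "residues": [name]})
            continue
        if chem_class == "WATER":
            chain = pooled_chain("water", "water", None)
        elif chem_class in SACCHARIDE_CLASSES:
            chain = pooled_chain("sugar", "branched", "oligosaccharide")
        elif chem_class == "RNA_LINKING":
            chain = pooled_chain("rna", "polymer", "polyribonucleotide")
        elif chem_class == "DNA_LINKING":
            chain = pooled_chain("dna", "polymer", "polydeoxyribonucleotide")
        elif chem_class in PEPTIDE_CLASSES:
            chain = pooled_chain("peptide", "polymer", peptide_type)
        else:
            raise ValueError("unsupported chem class: " + chem_class)
        chain["residues"].append(name)
    return chains
```

For example, a chain with two peptide residues, two waters and two ligands would be split into four chains: one "polymer" chain, one "water" chain and two "non-polymer" chains.
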
Chain names are generated by iterating over
"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz", starting with
AA, AB, AC etc. once the first cycle is through. There can therefore be as many
chains as needed. The mmCIF entities are built the same way as for
*mmcif_conform* with two differences: 1) the extracted SEQRES of a chain is the
ATOMSEQ, i.e. the exact sequence of its residues; 2) entity matching happens
through exact matches of SEQRES and is independent of residue numbers. As a
consequence, the residue numbers written as _atom_site.label_seq_id no longer
correspond to the actual residue numbers but refer to the position in
ATOMSEQ.
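
The naming scheme and the ATOMSEQ-based entity matching can be sketched in plain Python. This is an illustration under stated assumptions, not the writer's code: ``chain_names``, ``match_entities`` and ``label_seq_ids`` are hypothetical helpers, multi-character names are assumed to be enumerated in the same alphabet order as single-character ones, and positions in ATOMSEQ are assumed to be 1-based.

```python
import itertools

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz"


def chain_names():
    """Yield A, B, ..., z, then AA, AB, ... without an upper limit."""
    for length in itertools.count(1):
        for letters in itertools.product(ALPHABET, repeat=length):
            yield "".join(letters)


def match_entities(atomseqs):
    """Group chains by exact ATOMSEQ match.

    atomseqs: one list of residue names per chain.
    Returns the unique sequences and, per chain, the index of its entity.
    """
    entities = []
    assignment = []
    for atomseq in atomseqs:
        seq = tuple(atomseq)
        if seq not in entities:
            entities.append(seq)
        assignment.append(entities.index(seq))
    return entities, assignment


def label_seq_ids(atomseq):
    """Position of each residue in ATOMSEQ (assumed 1-based)."""
    return list(range(1, len(atomseq) + 1))
```

With this sketch, two chains with identical residue sequences map to the same entity even if their original residue numbers differ, and a chain whose residues were originally numbered 10, 11 and 15 would get the label_seq_id values 1, 2 and 3.
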
Once the chains are split and new chain names are assigned, the rest is
straightforward. The special cases listed above (_atom_site.auth_asym_id,
_pdbx_poly_seq_scheme.pdb_strand_id, _atom_site.auth_seq_id etc.) are
treated the same as if *mmcif_conform* were True.
.. class:: MMCifWriterEntity
Defines mmCIF entity which will be written in :class:`MMCifWriter`
......@@ -1752,7 +1803,7 @@ a few special cases:
Static constructor function for entities of type "polymer"
:param entity_poly_type: Entity poly type from restricted alphabet for
:param entity_poly_type: Entity poly type from restricted vocabulary for
`_entity_poly.type <https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_entity_poly.type.html>`_
:type entity_poly_type: :class:`str`
:param mon_ids: Full names of all compounds defining the SEQRES of that
......@@ -1806,7 +1857,7 @@ a few special cases:
.. method:: SetStructure(ent, mmcif_conform=True, entity_info=list())
Extracts mmCIF categories/attributes based on the description above.
An object of type :class:`MMCifWriter` can only be associated to one
An object of type :class:`MMCifWriter` can only be associated with one
Structure. Calling this function more than once raises an error.
:param ent: The structure to write
......