:mod:`~ost.bindings.tmtools` - Structural superposition
The :mod:`~ost.bindings.tmtools` module provides access to the structural superposition programs TMscore, TMalign and MMalign developed by Y. Zhang and J. Skolnick. These programs superpose a model onto a reference structure, using the positions of the Calpha atoms only. While these programs essentially use the same algorithm at their core, they differ in how the Calpha atoms are paired. TMscore pairs the Calpha atoms based on the residue number, whereas TMalign calculates an optimal pairing of Calpha atoms based on heuristics. MMalign addresses higher-order complexes consisting of several chains.
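To make the residue-number pairing concrete, the following is a minimal sketch in plain Python. It is not the code used by TMscore; the function name ``pair_by_residue_number`` and the dictionary layout (residue number mapped to a coordinate tuple) are assumptions made for illustration only.

```python
def pair_by_residue_number(model_ca, reference_ca):
    """Pair Calpha positions that share a residue number.

    Both arguments map residue number -> (x, y, z). Returns a list of
    (residue_number, model_xyz, reference_xyz) tuples sorted by number.
    """
    shared = sorted(set(model_ca) & set(reference_ca))
    return [(num, model_ca[num], reference_ca[num]) for num in shared]


# Toy data: residue 3 is missing in the model, residue 5 in the reference.
model_ca = {
    1: (0.0, 0.0, 0.0),
    2: (3.8, 0.0, 0.0),
    4: (11.4, 0.0, 0.0),
    5: (15.2, 0.0, 0.0),
}
reference_ca = {
    1: (0.1, 0.0, 0.0),
    2: (3.9, 0.0, 0.0),
    3: (7.6, 0.0, 0.0),
    4: (11.5, 0.0, 0.0),
}

pairs = pair_by_residue_number(model_ca, reference_ca)
print([num for num, _, _ in pairs])  # [1, 2, 4]
```

Only residues present in both structures end up in a pair, which is why consistent residue numbering between model and reference matters for this kind of pairing. An alignment-based pairing does not rely on the numbering.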
Citation:
Y. Zhang and J. Skolnick, Proteins 2004 57: 702-710
Y. Zhang and J. Skolnick, Nucl. Acids Res. 2005 33: 2302-9
Besides using the standalone TM-align program, ost also provides a wrapper around USalign as published in:
Chengxin Zhang, Morgan Shine, Anna Marie Pyle, Yang Zhang (2022) Nat Methods
The advantage is that no intermediate files need to be generated, because a wrapper at the C++ layer is used instead.
Distance measures used by TMscore
There are many different ways to describe the structural similarity of two protein structures at the Calpha level. TMscore calculates several of these measures. The most common is to describe the difference in terms of the root mean square deviation of the Calpha positions, the RMSD. Despite its common use, RMSD has several drawbacks when working with incomplete models. Since the RMSD depends strongly on the set of included atoms, it is relatively easy to obtain a smaller RMSD by omitting flexible parts of a protein structure. This has led to the introduction of the global distance test (GDT). A model is compared to a reference by calculating the fraction of Calpha atoms that can be superposed below a certain cutoff, e.g. 1 Å. The fractions for four such cutoffs are summed and divided by four to obtain GDT_TS (cutoffs of 1, 2, 4 and 8 Å) or GDT_HA (cutoffs of 0.5, 1, 2 and 4 Å). In contrast to RMSD, GDT is an agreement measure: the higher the value, the more similar the two structures are. TM-score (not to be confused with TMscore, the program) additionally adds a size dependence to the GDT measure by taking the protein length into account. As with GDT, the higher the value, the more similar the two structures are.
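The measures described above can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the implementation used by the TMscore program: it evaluates a single, fixed superposition given as a list of Calpha distances in Å, whereas the actual programs search for superpositions that maximize the respective score. The function names are hypothetical. The TM-score expression follows the formula published by Zhang and Skolnick (2004) and is only meaningful here for references longer than about 20 residues.

```python
import math


def rmsd(distances):
    """Root mean square deviation over the paired Calpha distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt(distances, cutoffs, n_reference):
    """Average fraction of reference Calphas within each distance cutoff."""
    fractions = [
        sum(1 for d in distances if d <= cutoff) / n_reference
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


def tm_score(distances, n_reference):
    """TM-score of one fixed superposition, normalized by reference length."""
    # Length-dependent distance scale d0 (assumes n_reference > ~20)
    d0 = 1.24 * (n_reference - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / n_reference


# Toy data: 100 paired Calpha atoms, reference of 100 residues
distances = [0.4] * 50 + [1.5] * 30 + [3.0] * 10 + [6.0] * 10
n_reference = 100

gdt_ts = gdt(distances, (1.0, 2.0, 4.0, 8.0), n_reference)
gdt_ha = gdt(distances, (0.5, 1.0, 2.0, 4.0), n_reference)
print(round(gdt_ts, 3))  # 0.8
print(round(gdt_ha, 3))  # 0.675
print(rmsd(distances))
print(tm_score(distances, n_reference))
```

Because both GDT and TM-score are divided by the reference length rather than by the number of paired atoms, leaving out poorly modelled residues lowers the score instead of improving it. Normalizing the TM-score by the length of the other structure generally gives a different value.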
Common Usage
The following example shows how to use TMscore to superpose two protein structures and print the RMSD as well as the GDT_TS and GDT_HA similarity measures.
from ost import io
from ost.bindings import tmtools

# load chain A of both structures
pdb1=io.LoadPDB('1ake.pdb', restrict_chains='A')
pdb2=io.LoadPDB('4ake.pdb', restrict_chains='A')
# superpose pdb1 onto pdb2 with TMscore
result=tmtools.TMScore(pdb1, pdb2)
print(result.rmsd_below_five) # 1.9
print(result.gdt_ha) # 0.41
print(result.gdt_ts) # 0.56
Usage of TMalign
Usage of TMscore
TMalign C++ wrapper
Instead of calling the TMalign executable, ost also provides a wrapper around its C++ implementation. The advantage is that no intermediate files need to be generated in order to call the executable.
from ost import io
from ost import bindings

# restrict both structures to peptide residues
pdb1=io.LoadPDB('1ake.pdb').Select("peptide=true")
pdb2=io.LoadPDB('4ake.pdb').Select("peptide=true")
# align the first chain of each structure
result = bindings.WrappedTMAlign(pdb1.chains[0], pdb2.chains[0],
                                 fast=True)
print(result.tm_score)
print(result.alignment.ToString(80))
All parameters of the constructor are available as attributes of the class
:param rmsd: RMSD of the superposed residues
:param tm_score: TM-score of the superposed residues
:param tm_score_swapped: TM-score when the reference is swapped
:param aligned_length: Number of superposed residues
:param transform: Transformation matrix to superpose the first chain onto the reference
:param alignment: The sequence alignment given the structural superposition
:type rmsd: :class:`float`
:type tm_score: :class:`float`
:type aligned_length: :class:`int`
:type transform: :class:`geom.Mat4`
:type alignment: :class:`ost.seq.AlignmentHandle`
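To illustrate what a 4x4 transformation matrix such as ``transform`` encodes, the sketch below applies one to a single point in plain Python. It assumes the common homogeneous convention, with the rotation in the upper-left 3x3 block and the translation in the last column. The nested list stands in for a matrix and is not an OpenStructure object, and ``apply_transform`` is a hypothetical helper.

```python
def apply_transform(matrix, point):
    """Apply a 4x4 homogeneous transformation to an (x, y, z) point."""
    x, y, z = point
    # Rotation part (first three columns) plus translation (last column)
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in matrix[:3]
    )


# Rotation by 90 degrees around the z axis, then translation by (1, 2, 3)
transform = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]

print(apply_transform(transform, (1.0, 0.0, 0.0)))  # (1.0, 3.0, 3.0)
```

Applying the same matrix to every atom position of the first chain would place it in the coordinate frame of the reference.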
For higher-order complexes, ost provides access to the MMalign functionality from USalign. This corresponds to calling USalign in the preferred way for comparing full biounits:
USalign mdl.pdb ref.pdb -mm 1 -ter 0
All parameters of the constructor are available as attributes of the class
:param rmsd: RMSD of the superposed residues
:param tm_score: TM-score of the superposed residues
:param tm_score_swapped: TM-score when the reference is swapped
:param aligned_length: Number of superposed residues
:param transform: Transformation matrix to superpose mdl onto the reference
:param alignments: Alignments of all mapped chains, with the first sequence being from ent1 and the second sequence from ent2
:param ent1_mapped_chains: All mapped chains from ent1
:param ent2_mapped_chains: The respective mapped chains from ent2
:type rmsd: :class:`float`
:type tm_score: :class:`float`
:type aligned_length: :class:`int`
:type transform: :class:`geom.Mat4`
:type alignments: :class:`ost.seq.AlignmentList`
:type ent1_mapped_chains: :class:`ost.StringList`
:type ent2_mapped_chains: :class:`ost.StringList`
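Since ``ent1_mapped_chains`` and ``ent2_mapped_chains`` are described as lists of respective chains, they can be read as two parallel lists. The sketch below turns such a pair into a lookup table. Plain Python lists of chain names stand in for the actual result attributes, and ``chain_mapping`` is a hypothetical helper that relies on this parallel-list assumption.

```python
def chain_mapping(ent1_mapped_chains, ent2_mapped_chains):
    """Build a dict mapping each ent1 chain name to its ent2 partner."""
    if len(ent1_mapped_chains) != len(ent2_mapped_chains):
        raise ValueError("mapped chain lists must have the same length")
    return dict(zip(ent1_mapped_chains, ent2_mapped_chains))


# Toy data standing in for the attributes of an MMalign result
mapping = chain_mapping(["A", "B", "C"], ["B", "A", "D"])
print(mapping)  # {'A': 'B', 'B': 'A', 'C': 'D'}
```

Under the same assumption, the entry at a given position in ``alignments`` would belong to the chain pair at that position.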