Skip to content
Snippets Groups Projects
Commit 732bbf2f authored by Valerio Mariani's avatar Valerio Mariani
Browse files

DRMSD Test first commit

Needs improvements. For example, distance differences should be always capped to cap_difference
parent e2cb0d86
Branches
Tags
No related merge requests found
......@@ -116,6 +116,73 @@
:returns: the Local Distance Difference Test score
.. function:: DistanceRMSDTest(model, distance_list, cap_difference, sequence_separation=0, local_drmsd_property_string="")
This function performs a Distance RMSD Test on a provided model, and calculates the two values that are necessary to determine
the Distance RMSD Score, namely the sum of squared distance deviations and the number of distances on which the sum was computed.
The Distance RMSD Test (or DRMSD Test) computes the deviation in the length of local contacts between a model and
a reference structure and expresses it in the form of a score value. The score has an an RMSD-like form, with the deviations
in the RMSD formula computed as contact distance differences. The score is open-ended, with a value of zero meaning complete
agreement of local contact distances, and a positive value revealing a disagreement of magnitude proportional to the score value
itself. This score does not require any superposition between the model and the reference.
This function processes a list of distances provided by the user, together with their length in the reference structure.
For each distance that is found in the model, its difference with the reference length is computed and used as deviation term in the RMSD-like
formula. When a distance is not present in the model because one or both the atoms are missing, a default deviation value provided by the
user is used.
The function only processes distances between atoms that do not belong to the same residue, and considers only standard residues
in the first chain of the model. For residues with symmetric sidechains (GLU, ASP, ARG, VAL, PHE, TYR), the
naming of the atoms is ambiguous. For these residues, the function computes the Distance RMSD Test score that each
naming convention would generate when considering all non-ambiguous surrounding atoms.
The solution that gives the lower score is then picked to compute the final Distance RMSD Score for the whole model.
A sequence separation parameter can be passed to the function. If this happens, only distances between residues
whose separation is higher than the provided parameter are considered when computing the score
If a string is passed as last parameter to the function, the function computes the Distance RMSD Score
for each residue and saves it as a float property in the ResidueHandle, with the passed string
as property name. Additionally, the actual sum of squared deviations and the number of distances on which it was computed
are stored as properties in the ResidueHandle. The property names are respectively <passed string>_sum (a float property) and
<passed string>_count (an integer property)
:param model: the model structure
:type model: :class:`~ost.mol.EntityView`
:param distance_list: the list of distances to check.
:type distance_list: :class:`~ost.mol.alg.GlobalRDMap`. This class stores two length values for each distance. Only the first is used by this function as
reference length. The second is ignored.
:param cap_difference: a default deviation value to be used when a distance is not found in the model
:param sequence_separation: sequence separation parameter used when computing the score
:param local_ldt_property_string: the base name for the ResidueHandle properties that store the local scores
:returns: a tuple containing the sum of squared distance deviations, and the number of distances on which it was computed.
.. function:: DRMSD(model, distance_list, cap_difference, sequence_separation=0);
This function calculates the Distance RMSD Test score (see previous function).
The function only considder distances between atoms not belonging to the same residue, and only compares the input distance list to the first
chain of the model structure. It requires, in addition to the model and the list themselves, a default deviation value to be used in the DRMSD
Test when a distance is not found in the model.
The local Local Distance Difference Test score values are stored in the ResidueHandles of the model passed to the
function in a float property called "localdrmsd"
A sequence separation parameter can be passed to the function. If this happens, only distances between residues
whose separation is higher than the provided parameter are considered when computing the score
:param model: the model structure
:type model: :class:`~ost.mol.EntityView`
:param distance_list: the list of distances to check for conservation
:type distance_list: :class:`~ost.mol.alg.GlobalRDMap`. This class stores two length values for each distance. Only the first is used by this function as
reference length. The second is ignored.
:param sequence_separation: sequence separation parameter used when computing the score
:param cap_difference: a default deviation value to be used when a distance is not found in the model
:returns: the Distance RMSD Test score
.. function:: CreateDistanceList(reference, radius);
.. function:: CreateDistanceListFromMultipleReferences(reference_list, tolerance_list, sequence_separation, radius);
......@@ -153,7 +220,7 @@
:param sequence_separation: sequence separation parameter used when computing the LDDT score
:param radius: inclusion radius (in Angstroms) used to determine the distances included in the list
:returns: class `~ost.mol.alg.GlobalRDMap`
:returns: :class:`~ost.mol.alg.GlobalRDMap`
.. class:: UniqueAtomIdentifier
......@@ -203,15 +270,38 @@
of a single structure
.. function: PrintResidueRDMap(residue_distance_list)
.. function:: PrintResidueRDMap(residue_distance_list)
Prints to standard output all the distances contained in a :class:`~ost.mol.alg.ResidueRDMap` object
Prints to standard output all the distances contained in a :class:`~ost.mol.ResidueRDMap` object
.. function:: PrintGlobalRDMap(global_distance_list)
Prints to standard output all the distances contained in each of the :class:`~ost.mol.alg.ResidueRDMap` objects that
make up a :class:`~ost.mol.alg.GlobalRDMap` object
.. function: PrintGlobalRDMap(global_distance_list)
Prints to standard output all the distances contained in each of the :class:`~ost.mol.ResidueRDMap` objects that
make up a :class:`~ost.mol.GlobalRDMap` object
.. function:: Swappable(residue_name,atom_name)
This function checks if an atom in a residue has a symmetry equivalent. It returns true if the atom belongs to a residue with a symmetric side-chain
and a symmetry equivalent atom exists, otherwise it returns false.
:param residue_name: a string containing the name of the residue to which the atom belongs
:param atom_name: a string containing the name of the atom
:returns: True if the atom has a symmetry equivalent, false otherwise
.. function:: SwappedName(atom_name)
If the atom does belongs to a residue with a symmetric side-chain and if the atom has a symmetry
equivalent, the function returns the name of the symmetry equivalent atom, otherwise it returns the name of the original
atom
:param atom_name: a string containing the name of the atom
:returns: A string containing the same of the symmetry equivalent atom if the input atom has one, otherwise the name of the input atom itself.
.. _steric-clashes:
......
......@@ -22,6 +22,7 @@
#include <ost/config.hh>
#include <ost/mol/alg/local_dist_diff_test.hh>
#include <ost/mol/alg/distance_test_common.hh>
#include <ost/mol/alg/distance_rmsd_test.hh>
#include <ost/mol/alg/superpose_frames.hh>
#include <ost/mol/alg/filter_clashes.hh>
#include <ost/mol/alg/consistency_checks.hh>
......@@ -108,8 +109,7 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
export_entity_to_density();
#endif
to_python_converter<std::pair<Real,Real>,
PairToTupleConverter<Real, Real> >();
def("LocalDistDiffTest", lddt_a, (arg("sequence_separation")=0,arg("local_lddt_property_string")=""));
def("LocalDistDiffTest", lddt_c, (arg("local_lddt_property_string")=""));
......@@ -122,7 +122,7 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
def("CreateDistanceList",&mol::alg::CreateDistanceList);
def("CreateDistanceListFromMultipleReferences",&create_distance_list_from_multiple_references);
def("DistanceRMSDTest", &mol::alg::DistanceRMSDTest, (arg("sequence_separation")=0,arg("local_lddt_property_string")=""));
def("SuperposeFrames", superpose_frames1,
(arg("source"), arg("sel")=ost::mol::EntityView(), arg("begin")=0,
......@@ -159,14 +159,6 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
.def("GetQualifiedAtomName",&mol::alg::UniqueAtomIdentifier::GetQualifiedAtomName)
;
class_<mol::alg::ResidueRDMap>("ResidueRDMap")
.def(map_indexing_suite<mol::alg::ResidueRDMap>())
;
class_<mol::alg::GlobalRDMap>("GlobalRDMap")
.def(map_indexing_suite<mol::alg::GlobalRDMap>())
;
def("FillClashingDistances",&fill_clashing_distances_wrapper);
def("FillStereoChemicalParams",&fill_stereochemical_params_wrapper);
def("IsStandardResidue",&mol::alg::IsStandardResidue);
......@@ -217,9 +209,6 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
.def("GetClashList",&mol::alg::ClashingInfo::GetClashList)
;
to_python_converter<std::pair<mol::EntityView,mol::alg::ClashingInfo>,
PairToTupleConverter<mol::EntityView, mol::alg::ClashingInfo> >();
class_<mol::alg::StereoChemistryInfo> ("StereoChemistryInfo" ,init <>())
.def(init<Real,int,int,Real,int,int, const std::map<String,mol::alg::BondLengthInfo>&,
const std::vector<mol::alg::StereoChemicalBondViolation>&,
......@@ -235,9 +224,20 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
.def("GetAngleViolationList",&mol::alg::StereoChemistryInfo::GetAngleViolationList)
;
to_python_converter<std::pair<ost::mol::alg::UniqueAtomIdentifier,ost::mol::alg::UniqueAtomIdentifier>,
PairToTupleConverter<ost::mol::alg::UniqueAtomIdentifier, ost::mol::alg::UniqueAtomIdentifier> >();
to_python_converter<std::pair<Real,Real>,
PairToTupleConverter<Real, Real> >();
to_python_converter<std::pair<Real,long int>,
PairToTupleConverter<Real, long int> >();
to_python_converter<std::pair<mol::EntityView,mol::alg::StereoChemistryInfo>,
PairToTupleConverter<mol::EntityView, mol::alg::StereoChemistryInfo> >();
PairToTupleConverter<mol::EntityView, mol::alg::StereoChemistryInfo> >();
to_python_converter<std::pair<mol::EntityView,mol::alg::ClashingInfo>,
PairToTupleConverter<mol::EntityView, mol::alg::ClashingInfo> >();
to_python_converter<std::vector<mol::alg::ClashEvent>,
VectorToListConverter<mol::alg::ClashEvent> >();
......@@ -248,4 +248,14 @@ BOOST_PYTHON_MODULE(_ost_mol_alg)
to_python_converter<std::vector<mol::alg::StereoChemicalAngleViolation>,
VectorToListConverter<mol::alg::StereoChemicalAngleViolation> >();
class_<mol::alg::ResidueRDMap>("ResidueRDMap")
.def(map_indexing_suite<mol::alg::ResidueRDMap,true>())
;
class_<mol::alg::GlobalRDMap>("GlobalRDMap")
.def(map_indexing_suite<mol::alg::GlobalRDMap,true>())
;
def("DRMSD",&mol::alg::DRMSD);
}
......@@ -4,6 +4,7 @@ set(OST_MOL_ALG_HEADERS
sec_structure_segments.hh
local_dist_diff_test.hh
distance_test_common.hh
distance_rmsd_test.hh
superpose_frames.hh
filter_clashes.hh
construct_cbeta.hh
......@@ -19,6 +20,7 @@ set(OST_MOL_ALG_SOURCES
sec_structure_segments.cc
local_dist_diff_test.cc
distance_test_common.cc
distance_rmsd_test.cc
superpose_frames.cc
filter_clashes.cc
construct_cbeta.cc
......
#include <ost/log.hh>
#include <ost/mol/mol.hh>
#include <ost/mol/alg/distance_rmsd_test.hh>
#include <ost/mol/alg/distance_test_common.hh>
#include <boost/concept_check.hpp>
namespace ost { namespace mol { namespace alg {
namespace {
std::pair<Real, long int> calc_rmsd(const ResidueRDMap& res_distance_list, const ResNum& rnum,
ChainView mdl_chain, int sequence_separation, Real cap_distance,
bool only_fixed, bool swap, std::vector<std::pair<Real, long int> >& local_rmsd_data_list, bool log )
{
std::pair<Real, long int> rmsd_data(0, 0);
ResidueView mdl_res=mdl_chain.FindResidue(rnum);
for (ResidueRDMap::const_iterator ai=res_distance_list.begin(); ai!=res_distance_list.end(); ++ai) {
const UAtomIdentifiers& uais = ai->first;
const std::pair <Real,Real>& values = ai->second;
const UniqueAtomIdentifier& first_atom=uais.first;
const UniqueAtomIdentifier& second_atom=uais.second;
String name=swap ? SwappedName(first_atom.GetAtomName()) : first_atom.GetAtomName();
AtomView av1=mdl_res ? mdl_res.FindAtom(name) : AtomView();
if (only_fixed) {
if (std::abs(first_atom.GetResNum().GetNum()-second_atom.GetResNum().GetNum())<=sequence_separation) {
continue;
}
if (Swappable(second_atom.GetResidueName(), second_atom.GetAtomName())) {
continue;
}
}
if (!only_fixed) {
if (first_atom.GetResNum().GetNum()<=(second_atom.GetResNum().GetNum()+sequence_separation)) {
continue;
}
}
ResidueView rv2=mdl_chain.FindResidue(second_atom.GetResNum());
rmsd_data.second+=1;
int rindex2=0;
int rindex1=mdl_res ? mdl_res.GetIndex() : -1;
if (!only_fixed && rindex1!=-1)
local_rmsd_data_list[rindex1].second+=1;
if (!rv2) {
rmsd_data.first+=(cap_distance*cap_distance);
local_rmsd_data_list[rindex1].first+=(cap_distance*cap_distance);
continue;
}
rindex2=rv2.GetIndex();
if (!only_fixed)
local_rmsd_data_list[rindex2].second+=1;
AtomView av2=rv2.FindAtom(second_atom.GetAtomName());
if (!(av1 && av2)) {
rmsd_data.first+=(cap_distance*cap_distance);
local_rmsd_data_list[rindex1].first+=(cap_distance*cap_distance);
local_rmsd_data_list[rindex2].first+=(cap_distance*cap_distance);
continue;
}
Real mdl_dist=geom::Length(av1.GetPos()-av2.GetPos());
Real diff = mdl_dist-values.first;
if (log) {
// LOG_VERBOSE("drmsd:" << " " << av1.GetResidue().GetChain() << " " << av1.GetResidue().GetName() << " " << av1.GetResidue().GetNumber() << " " << av1.GetName()
// << " " << av2.GetResidue().GetChain() << " " << av2.GetResidue().GetName() << " " << av2.GetResidue().GetNumber() << " " << av2.GetName() << " "
// << mdl_dist << " " << values.first << " " << diff)
LOG_VERBOSE("drmsd:" << " " << av1.GetResidue().GetChain() << " " << av1.GetResidue().GetName() << " " << av1.GetResidue().GetNumber() << " " << av1.GetName()
<< " " << av1.GetPos() << " " << av2.GetResidue().GetChain() << " " << av2.GetResidue().GetName() << " " << av2.GetResidue().GetNumber() << " " << av2.GetName() << " "
<< av2.GetPos() << " " << mdl_dist << " " << values.first << " " << diff)
}
rmsd_data.first+=(diff*diff);
if (!only_fixed) {
local_rmsd_data_list[rindex1].first+=(diff*diff);
local_rmsd_data_list[rindex2].first+=(diff*diff);
}
}
return rmsd_data;
}
void drmsdt_check_and_swap(const GlobalRDMap& glob_dist_list, const EntityView& mdl, int sequence_separation, Real cap_distance, std::vector<std::pair<Real, long int> > local_rmsd_data_list)
{
ChainView mdl_chain=mdl.GetChainList()[0];
XCSEditor edi=mdl.GetHandle().EditXCS(BUFFERED_EDIT);
for (GlobalRDMap::const_iterator i=glob_dist_list.begin(); i!=glob_dist_list.end(); ++i) {
ResNum rnum = i->first;
if (i->second.size()==0) {
continue;
}
ResidueView mdl_res=mdl_chain.FindResidue(rnum);
if (!mdl_res) {
continue;
}
String rname = mdl_res.GetName();
if (!(rname=="GLU" || rname=="ASP" || rname=="VAL" || rname=="TYR" ||
rname=="PHE" || rname=="LEU" || rname=="ARG")) {
continue;
}
// std::pair<long int, long int> ov1=calc_rmsd(i->second, rnum,mdl_chain, sequence_separation,
// cap_distance, true, false, local_rmsd_data_list,false);
// std::pair<long int, long int> ov2=calc_rmsd(i->second, rnum, mdl_chain, sequence_separation,
// cap_distance, true, true, local_rmsd_data_list,false);
std::pair<long int, long int> ov1=calc_rmsd(i->second, rnum,mdl_chain, sequence_separation,
cap_distance, true, false, local_rmsd_data_list,true);
std::pair<long int, long int> ov2=calc_rmsd(i->second, rnum, mdl_chain, sequence_separation,
cap_distance, true, true, local_rmsd_data_list,true);
if (std::sqrt(ov1.first/static_cast<Real>(ov1.second))>(std::sqrt(ov2.first/static_cast<Real>(ov2.second)))) {
AtomViewList atoms=mdl_res.GetAtomList();
for (AtomViewList::iterator j=atoms.begin(), e2=atoms.end(); j!=e2; ++j) {
if (Swappable(rname, j->GetName())) {
edi.RenameAtom(j->GetHandle(), SwappedName(j->GetName()));
}
}
}
}
}
}
std::pair<Real,long int> DistanceRMSDTest(const EntityView& mdl, const GlobalRDMap& glob_dist_list,
Real cap_distance, int sequence_separation, const String& local_drmsdt_property_string)
{
if (!mdl.GetResidueCount()) {
LOG_WARNING("model structures doesn't contain any residues");
return std::make_pair<long int,long int>(0,0);
}
if (glob_dist_list.size()==0) {
LOG_WARNING("global reference list is empty");
return std::make_pair<long int,long int>(0,0);
}
std::vector<std::pair<Real, long int> > local_rmsd_data_list(mdl.GetResidueCount(), std::pair<Real, long int>(0, 0));
drmsdt_check_and_swap(glob_dist_list,mdl,sequence_separation, cap_distance, local_rmsd_data_list);
ChainView mdl_chain=mdl.GetChainList()[0];
local_rmsd_data_list.clear();
std::pair<Real, long int> total_ov(0, 0);
for (GlobalRDMap::const_iterator i=glob_dist_list.begin(), e=glob_dist_list.end(); i!=e; ++i) {
ResNum rn = i->first;
if (i->second.size()!=0) {
std::pair<Real, long int> ov1=calc_rmsd(i->second, rn, mdl_chain, sequence_separation, cap_distance, false, false, local_rmsd_data_list,true);
total_ov.first+=ov1.first;
total_ov.second+=ov1.second;
}
}
for (GlobalRDMap::const_iterator i=glob_dist_list.begin(),
e=glob_dist_list.end();i!=e; ++i) {
ResNum rn = i->first;
if(local_drmsdt_property_string!="") {
ResidueView mdlr=mdl_chain.FindResidue(rn);
if (mdlr.IsValid()) {
int mdl_res_index =mdlr.GetIndex();
Real local_rmsd=sqrt(local_rmsd_data_list[mdl_res_index].first/(static_cast<Real>(local_rmsd_data_list[mdl_res_index].second) ? static_cast<Real>(local_rmsd_data_list[mdl_res_index].second) : 1));
mdlr.SetFloatProp(local_drmsdt_property_string, local_rmsd);
mdlr.SetIntProp(local_drmsdt_property_string+"_sum", local_rmsd_data_list[mdl_res_index].first);
mdlr.SetIntProp(local_drmsdt_property_string+"_count", local_rmsd_data_list[mdl_res_index].second);
}
}
}
local_rmsd_data_list.clear();
return std::make_pair<Real,long int>(total_ov.first,total_ov.second);
}
Real DRMSD(const EntityView& v, const GlobalRDMap& global_dist_list, Real cap_distance, int sequence_separation)
{
String label="localdrmsd";
std::pair<Real,long int> total_ov=alg::DistanceRMSDTest(v, global_dist_list, cap_distance, sequence_separation, label);
Real calcdrmsd = static_cast<Real>(total_ov.first)/(static_cast<Real>(total_ov.second) ? static_cast<Real>(total_ov.second) : 1);
return std::sqrt(calcdrmsd);
}
}}}
//------------------------------------------------------------------------------
// This file is part of the OpenStructure project <www.openstructure.org>
//
// Copyright (C) 2008-2011 by the OpenStructure authors
//
// This library is free software; you can redistribute it and/or modify it under
// the terms of the GNU Lesser General Public License as published by the Free
// Software Foundation; either version 3.0 of the License, or (at your option)
// any later version.
// This library is distributed in the hope that it will be useful, but WITHOUT
// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
// details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with this library; if not, write to the Free Software Foundation, Inc.,
// 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
//------------------------------------------------------------------------------
#ifndef OST_MOL_ALG_DISTANCE_RMSD_TEST_HH
#define OST_MOL_ALG_DISTANCE_RMSD_TEST_HH
#include <ost/mol/entity_view.hh>
#include <ost/mol/alg/module_config.hh>
#include <ost/mol/alg/local_dist_diff_test.hh>
namespace ost { namespace mol { namespace alg {
/// \brief Calculates the Distance RMSD in a model, given a list of distances with their refence length
///
/// Calculates the two values needed to determine the Distance RMSD Test score for a given model, i.e.
/// the sum of the squared distance deviations and the total number of distances on which the sum was
/// calculated.
/// The function requires a list of distances for which the distance deviation has to be computed, together
/// with a reference length for each of them. The function also requires a model on which the returned
/// values are computed. Furthermore, the function requires a default deviation value to be used in the
/// calculations when a distance is not present in the model.
///
/// The distance information needs to be stored in ian instance of the GlobalRDMap object. This object
/// stores two distance lengths for each distance. This function uses the first of them as reference length
/// and ignores the second.
///
/// The function only processes standard residues in the first chains of the model and of the reference
/// For residues with symmetric sidechains (GLU, ASP, ARG, VAL, PHE, TYR), the
/// naming of the atoms is ambigous. For these residues, the function computes the Distance RMSD
/// score that each naming convention would generate when considering all non-ambigous surrounding atoms.
/// The solution that gives lower score is then picked to compute the final Distance RMSD score for the whole
/// model.
///
/// A sequence separation parameter can be passed to the function. If this happens, only distances between residues
/// whose separation is higher than the provided parameter are considered when computing the score.
///
/// If a string is provided as an argument to the function, residue-per-residue statistics are stored as
/// residue properties. Specifically, the local residue-based Distance RMSD score is stored in a float property
/// as the provided string, while the residue-based sum of squared distances and the number of distances checked
/// are saved in two properties named <string>_sum (a float property) and <string>_count (an int property).
std::pair<Real,long int> DistanceRMSDTest(const EntityView& mdl,
const GlobalRDMap& glob_dist_list,
Real cap_distance, int sequence_separation = 0,
const String& local_drmsdt_property_string="");
/// \brief Computes the Distance RMSD Test given a list of distances to check and a model
///
/// Computes the Local Distance RMSD Test on the provided model. Requires a list of distances to check and a
/// model for which the score is computed, plus a default deviation value to be used when a distance is not
/// present in the model.
///
/// A sequence separation parameter can be passed to the function. If this happens, only distances between residues
/// whose separation is higher than the provided parameter are considered when computing the score.
Real DLLEXPORT_OST_MOL_ALG DRMSD(const EntityView& v, const GlobalRDMap& global_dist_list, Real cap_distance, int sequence_separation=0);
}}}
#endif
......@@ -24,7 +24,17 @@
namespace ost { namespace mol { namespace alg {
/// \brief Returns the name of the symmetry equivalent atom in residues with symmetric side-chains
///
/// If the atom does belongs to a residue with a symmetric side-chain and if the atom has a symmetry
/// equivalent, the function returns the name of the symmetry equivalent atom, otherwise it returns
/// the name of the original atom
String DLLEXPORT_OST_MOL_ALG SwappedName(const String& name);
/// \brief Checks if an atom in a residue has a symmetry equivalent
///
/// Returns true if the atom belongs to a residue with a symmetric side-chain and a symmetry equivalent
/// atom exists. Returns false otherwise
bool DLLEXPORT_OST_MOL_ALG Swappable(const String& rname, const String& aname);
/// \brief Contains the infomation needed to uniquely identify an atom in a structure
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment