Commit aaec9ede authored 1 year ago by Studer Gabriel

lddt: speedup for big complexes

Pairwise distance computation for the reference distances was
performed with N squared complexity and some funny guy had the idea
to throw a 180mer at it...

One possibibility would be the use of some KD tree data structure.
However, the construction itself comes with computational cost.
The implemented solution makes use of the expected spatial proximity
of atoms in the same chain and distances are computed as follows:

- process each chain individually
- perform crude collision detection
- process potentially interacting chain pairs
- concatenate distances from all processing steps

The new algorithm has been tested and compared to the previous
implementation by randomly selecting 3 models of each CASP15 oligo
target. Global lDDT has been tested for a match within 0.0001 and
per-residue lDDT for a match within 0.001. Reason for lower threshold
in per-residue lDDT is floating point accuracy. Also for the changed
unit test, one distance difference was within floating point accuracy
of one of the thresholds (see comments there). Accuracy of 0.001 still
means that we only allow a discrepancy of one for 1000 checked distances...

Observed speedups are size dependent and range from lower 2 digit
percentages up to several fold speedup for larger CASP15 targets.
The mentioned 180mer now concludes in a few minutes as oposed to
almost a day.

parent 0356d1b9

No related branches found

No related tags found

No related merge requests found

Hide whitespace changes

Inline Side-by-side

Showing with 130 additions and 55 deletions

Please register or to comment