Skip to content
Snippets Groups Projects
  • Studer Gabriel's avatar
    aaec9ede
    lddt: speedup for big complexes · aaec9ede
    Studer Gabriel authored
    Pairwise distance computation for the reference distances was
    performed with N squared complexity and some funny guy had the idea
    to throw a 180mer at it...
    
    One possibibility would be the use of some KD tree data structure.
    However, the construction itself comes with computational cost.
    The implemented solution makes use of the expected spatial proximity
    of atoms in the same chain and distances are computed as follows:
    
    - process each chain individually
    - perform crude collision detection
    - process potentially interacting chain pairs
    - concatenate distances from all processing steps
    
    The new algorithm has been tested and compared to the previous
    implementation by randomly selecting 3 models of each CASP15 oligo
    target. Global lDDT has been tested for a match within 0.0001 and
    per-residue lDDT for a match within 0.001. Reason for lower threshold
    in per-residue lDDT is floating point accuracy. Also for the changed
    unit test, one distance difference was within floating point accuracy
    of one of the thresholds (see comments there). Accuracy of 0.001 still
    means that we only allow a discrepancy of one for 1000 checked distances...
    
    Observed speedups are size dependent and range from lower 2 digit
    percentages up to several fold speedup for larger CASP15 targets.
    The mentioned 180mer now concludes in a few minutes as oposed to
    almost a day.
    aaec9ede
    History
    lddt: speedup for big complexes
    Studer Gabriel authored
    Pairwise distance computation for the reference distances was
    performed with N squared complexity and some funny guy had the idea
    to throw a 180mer at it...
    
    One possibibility would be the use of some KD tree data structure.
    However, the construction itself comes with computational cost.
    The implemented solution makes use of the expected spatial proximity
    of atoms in the same chain and distances are computed as follows:
    
    - process each chain individually
    - perform crude collision detection
    - process potentially interacting chain pairs
    - concatenate distances from all processing steps
    
    The new algorithm has been tested and compared to the previous
    implementation by randomly selecting 3 models of each CASP15 oligo
    target. Global lDDT has been tested for a match within 0.0001 and
    per-residue lDDT for a match within 0.001. Reason for lower threshold
    in per-residue lDDT is floating point accuracy. Also for the changed
    unit test, one distance difference was within floating point accuracy
    of one of the thresholds (see comments there). Accuracy of 0.001 still
    means that we only allow a discrepancy of one for 1000 checked distances...
    
    Observed speedups are size dependent and range from lower 2 digit
    percentages up to several fold speedup for larger CASP15 targets.
    The mentioned 180mer now concludes in a few minutes as oposed to
    almost a day.