    chain mapping: reduce likelihood of grouping dissimilar sequences together · fe259ddf
    Studer Gabriel authored
    In principle one can have the following alignment:
    
    XXXXXXXXXXA--------
    ----------AYYYYYYYY
    
    It has 100% sequence identity, even though only a single column is
    aligned! The previously implemented gap-threshold logic was also not very
    helpful for filtering out these cases, as it operated on the fraction of
    gaps between the first and last aligned column in the alignment. Here
    that fraction is 0.0 and thus perfect.
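
To make the problem concrete, here is a minimal Python sketch that reproduces both numbers for the alignment above. The helper names (`aligned_columns`, `sequence_identity`, `gap_fraction_between_aligned`) are hypothetical and written for illustration only; they are not the functions used in the code base, and the gap-fraction definition is an assumption based on the description in this message.

```python
def aligned_columns(s1, s2, gap="-"):
    """Indices of columns where neither sequence has a gap."""
    return [i for i, (a, b) in enumerate(zip(s1, s2)) if a != gap and b != gap]


def sequence_identity(s1, s2, gap="-"):
    """Fraction of identical residues among the aligned (gap-free) columns."""
    cols = aligned_columns(s1, s2, gap)
    if not cols:
        return 0.0
    return sum(1 for i in cols if s1[i] == s2[i]) / len(cols)


def gap_fraction_between_aligned(s1, s2, gap="-"):
    """Assumed old criterion: fraction of gapped columns between the first
    and the last aligned column (both inclusive)."""
    cols = aligned_columns(s1, s2, gap)
    if not cols:
        return 1.0
    span = range(cols[0], cols[-1] + 1)
    gapped = sum(1 for i in span if s1[i] == gap or s2[i] == gap)
    return gapped / len(span)


# The alignment from this commit message
s1 = "X" * 10 + "A" + "-" * 8
s2 = "-" * 10 + "A" + "Y" * 8

print(aligned_columns(s1, s2))               # a single aligned column
print(sequence_identity(s1, s2))             # 1.0
print(gap_fraction_between_aligned(s1, s2))  # 0.0
```

Both criteria look perfect because they are evaluated only on the one column that is actually aligned.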
    
    This commit simplifies the logic: sequences are grouped together only if
    they pass a sequence identity threshold and have a minimum number of
    aligned columns. This should make grouping such cases together very
    unlikely.
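
The combined check can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed implementation: the function name `can_group` and the default thresholds (`min_identity=0.95`, `min_aligned=20`) are hypothetical placeholders chosen for the example.

```python
def can_group(s1, s2, min_identity=0.95, min_aligned=20, gap="-"):
    """Return True if two aligned sequences may be grouped together.

    Hypothetical sketch: requires a minimum number of aligned (gap-free)
    columns AND a minimum sequence identity over those columns.
    """
    aligned = [(a, b) for a, b in zip(s1, s2) if a != gap and b != gap]
    if len(aligned) < min_aligned:
        # Too few aligned columns to trust the identity value at all
        return False
    identity = sum(1 for a, b in aligned if a == b) / len(aligned)
    return identity >= min_identity


# The pathological alignment: 100% identity, but only one aligned column
bad_1 = "X" * 10 + "A" + "-" * 8
bad_2 = "-" * 10 + "A" + "Y" * 8
print(can_group(bad_1, bad_2))  # False

# Two identical 30-residue sequences pass both checks
seq = "ACDEFGHIKL" * 3
print(can_group(seq, seq))  # True
```

Checking the number of aligned columns first means a high identity computed from a handful of columns can never trigger grouping on its own.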