Add 2-pass mapping
Description
For increased sensitivity in novel splice junction detection, STAR offers a 2-pass mapping mode.
From the manual (section 8):
Per-sample 2-pass mapping. Annotated junctions will be included in both the 1st and 2nd passes. To run STAR 2-pass mapping for each sample separately, use --twopassMode Basic option. STAR will perform the 1st pass mapping, then it will automatically extract junctions, insert them into the genome index, and, finally, re-map all reads in the 2nd mapping pass. This option can be used with annotations, which can be included either at the run-time (see #1), or at the genome generation step. --twopass1readsN defines the number of reads to be mapped in the 1st pass. The default and most sensitive approach is to set it to -1 (or make it bigger than the number of reads in the sample) - in which case all reads in the input read file(s) are used in the 1st pass. While it can reduce mapping time by ∼ 40%, it is not recommended to use a small portion of the reads in the 1st step, since it will significantly reduce sensitivity for the low expressed novel junctions. The idea to use a portion of the reads in the 1st pass was inspired by Kim, Langmead and Salzberg in Nature Methods 12, 357–360 (2015).
Suggested solution
Enable the --twopassMode option in the relevant Snakemake rules, passing as its argument the value of the pass_mode field in the sample table.
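As a rough illustration of the suggested solution, the sketch below turns a sample-table row into STAR arguments. It assumes the sample table is tab-separated with a `sample` column next to `pass_mode`, and that `pass_mode` holds one of the values STAR accepts for --twopassMode (`None` or `Basic`). The helper name `twopass_args`, the table layout, and the fallback to `None` for an empty field are hypothetical choices, not part of the existing workflow.

```python
import csv
import io

# Values STAR accepts for --twopassMode, per the STAR manual.
VALID_PASS_MODES = {"None", "Basic"}


def twopass_args(sample_row, reads_first_pass=-1):
    """Build the STAR 2-pass arguments for one sample-table row.

    `sample_row` is a dict-like row; the `pass_mode` column name follows
    the issue text. A missing or empty value falls back to "None"
    (1-pass mapping), which is an assumption of this sketch.
    """
    mode = (sample_row.get("pass_mode") or "None").strip()
    if mode not in VALID_PASS_MODES:
        raise ValueError(
            f"pass_mode must be one of {sorted(VALID_PASS_MODES)}, got {mode!r}"
        )
    args = ["--twopassMode", mode]
    if mode == "Basic":
        # -1 maps all reads in the 1st pass (STAR's default, most sensitive).
        args += ["--twopass1readsN", str(reads_first_pass)]
    return args


# Hypothetical sample table, inlined so the example is self-contained.
SAMPLE_TABLE = "sample\tpass_mode\nsampleA\tBasic\nsampleB\tNone\n"
samples = {
    row["sample"]: row
    for row in csv.DictReader(io.StringIO(SAMPLE_TABLE), delimiter="\t")
}

print(" ".join(twopass_args(samples["sampleA"])))
```

In a Snakemake rule, a helper like this could be called from a `params` lambda keyed on the sample wildcard, with the resulting string appended to the STAR command in the `shell` section. Validating the value up front means a typo in the sample table fails early instead of surfacing as a STAR error mid-run.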