Documentation

Closed CJHerrmann requested to merge documentation into master
1 unresolved thread
1 file
+ 24
2
@@ -5,6 +5,7 @@
* read samples table
* create log directories
* **create_index_star**
* **extract_transcriptome**
* **create_index_salmon**
* **create_index_kallisto**
* **extract_transcripts_as_bed12**
@@ -20,6 +21,7 @@
* **pe_index_genomic_alignment_samtools**
* **pe_quantification_salmon**
* **pe_genome_quantification_kallisto**
* **star_rpm_paired_end**
@@ -76,6 +78,10 @@ Create index for STAR alignments. Supply the reference genome sequences (FASTA f
**Parameters:** sjdbOverhang (This is the `index_size` specified in the samples table).
**Output:** chrNameLength.txt will be used for STAR mapping; chrName.txt
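
To make the role of `index_size` concrete, here is a minimal sketch of how a STAR index command could be assembled. The function name, file names and the use of a GTF annotation are assumptions for illustration and are not taken from the pipeline itself; only the STAR options shown are real STAR parameters.

```python
import shlex


def star_index_command(genome_fasta, gtf, genome_dir, index_size, threads=1):
    """Build a STAR genome-generation command line as a list of arguments.

    `index_size` is passed through as --sjdbOverhang, as described above.
    The GTF argument is an assumption: --sjdbOverhang only has an effect
    when splice-junction annotations are supplied.
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", genome_fasta,
        "--sjdbGTFfile", gtf,
        "--sjdbOverhang", str(index_size),
    ]


# Hypothetical file names, for illustration only.
cmd = star_index_command("genome.fa", "annotation.gtf", "star_index", index_size=99)
print(shlex.join(cmd))
```

The STAR manual suggests a value of read length minus 1 for `--sjdbOverhang`, so an `index_size` of 99 would correspond to 100 bp reads.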
 
#### extract_transcriptome
 
> TODO
 
 
#### create_index_salmon
Create index for Salmon quantification. If you want to use Salmon in mapping-based mode, then you first have to build a Salmon index for your transcriptome. This will build the mapping-based index, using an auxiliary k-mer hash over k-mers of length 31. While the mapping algorithms will make use of arbitrarily long matches between the query and reference, the k size selected here will act as the minimum acceptable length for a valid match. Thus, a smaller value of k may slightly improve sensitivity. We find that a k of 31 seems to work well for reads of 75 bp or longer, but you might consider a smaller k if you plan to deal with shorter reads. [Salmon manual](https://salmon.readthedocs.io/en/latest/salmon.html)
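
The choice of k described above could be sketched as follows. The rule used for reads shorter than 75 bp (roughly half the read length, kept odd, never below 11) is a hypothetical heuristic chosen for illustration, not a recommendation from the Salmon manual or from this pipeline; the function and file names are likewise assumptions.

```python
def choose_kmer_length(read_length, default_k=31):
    """Pick a k-mer length for `salmon index`.

    Reads of 75 bp or longer use the default k of 31. For shorter reads a
    smaller k is returned. ASSUMPTION: the "half the read length" rule and
    the lower bound of 11 are illustrative only.
    """
    if read_length >= 75:
        return default_k
    k = max(11, read_length // 2)
    if k % 2 == 0:
        k -= 1  # keep k odd
    return min(k, default_k)


def salmon_index_command(transcriptome_fasta, index_dir, read_length):
    """Build a `salmon index` command line as a list of arguments."""
    k = choose_kmer_length(read_length)
    return ["salmon", "index", "-t", transcriptome_fasta, "-i", index_dir, "-k", str(k)]


# Hypothetical file names, for illustration only.
print(" ".join(salmon_index_command("transcriptome.fa", "salmon_index", 50)))
```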
@@ -203,7 +209,7 @@ Spliced Transcripts Alignment to a Reference
#### (pe_)index_genomic_alignment_samtools
Index the genomic alignment with [samtools index](http://quinlanlab.org/tutorials/samtools/samtools.html#samtools-index). Indexing a genome-sorted BAM file allows one to quickly extract alignments overlapping particular genomic regions. Moreover, indexing is required by genome viewers such as IGV so that the viewers can quickly display alignments in each genomic region to which you navigate.
Needed for TIN score calculation.
Needed for TIN score calculation and bedgraph coverage calculation.
**Input:** bam file
**Output:** bam.bai index file
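
As a rough illustration of the indexing step and of the region lookups it enables, the commands could be built like this. The helper names and file names are hypothetical; `samtools index` and `samtools view` with a `chrom:start-end` region are standard samtools usage.

```python
def samtools_index_command(bam):
    """Command that writes the index next to the BAM file (bam + '.bai')."""
    return ["samtools", "index", bam]


def index_path(bam):
    """Path of the index file produced for `bam`."""
    return bam + ".bai"


def region_query_command(bam, chrom, start, end):
    """Command that extracts alignments overlapping chrom:start-end.

    This lookup needs a sorted BAM file and its .bai index.
    """
    return ["samtools", "view", bam, f"{chrom}:{start}-{end}"]


# Hypothetical sample file, for illustration only.
bam = "sample.sorted.bam"
print(" ".join(samtools_index_command(bam)))
print(index_path(bam))
print(" ".join(region_query_command(bam, "chr1", 1000, 2000)))
```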
@@ -246,4 +252,20 @@ Needed for TIN score calculation.
*additionally for single end:*
* -l: fragment length, user specified as `mean`
* -s: fragment length SD, user specified as `sd`
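
The single-end options above could be wired into a command as in the following sketch. It assumes these are the `kallisto quant` options (`--single`, `-l`, `-s`); the function name, argument names and file names are hypothetical.

```python
def kallisto_quant_command(index, out_dir, reads, mean=None, sd=None):
    """Build a `kallisto quant` command line as a list of arguments.

    For a single FASTQ file, kallisto cannot estimate the fragment length
    distribution itself, so `mean` (-l) and `sd` (-s) must be supplied.
    """
    cmd = ["kallisto", "quant", "-i", index, "-o", out_dir]
    if len(reads) == 1:
        if mean is None or sd is None:
            raise ValueError("single-end runs need both mean and sd")
        cmd += ["--single", "-l", str(mean), "-s", str(sd)]
    cmd += list(reads)
    return cmd


# Hypothetical inputs, for illustration only.
print(" ".join(kallisto_quant_command("kallisto.idx", "out", ["reads.fq.gz"], mean=200, sd=20)))
```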
\ No newline at end of file
 
 
 
#### star_rpm_paired_end
 
Create stranded bedGraph coverage with STAR's RPM normalisation.
 
Described [here](https://ycl6.gitbooks.io/rna-seq-data-analysis/visualization.html)
 
 
**Input:** .bam, .bam.bai index
 
**Output:** coverage bedGraphs
 
 
**Arguments not configurable by the user:**
 
--outWigStrand "Stranded"
 
--outWigNorm "RPM"
 
 
 
*Same for single- and paired-end.*
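
A minimal sketch of how this step could call STAR is given below. The function names, file names and output prefix are hypothetical; the STAR options are real parameters, but the exact invocation used by the pipeline is not shown in this document. The `rpm` helper only illustrates what reads-per-million scaling means.

```python
def star_bedgraph_command(bam, out_prefix):
    """Build a STAR command that converts a BAM file to stranded bedGraphs."""
    return [
        "STAR",
        "--runMode", "inputAlignmentsFromBAM",
        "--inputBAMfile", bam,
        "--outWigType", "bedGraph",
        "--outWigStrand", "Stranded",
        "--outWigNorm", "RPM",
        "--outFileNamePrefix", out_prefix,
    ]


def rpm(raw_coverage, total_mapped_reads):
    """Scale a raw coverage value to reads per million mapped reads."""
    return raw_coverage * 1_000_000 / total_mapped_reads


# Hypothetical inputs, for illustration only.
print(" ".join(star_bedgraph_command("sample.sorted.bam", "sample_")))
print(rpm(5, 2_000_000))
```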
 
\ No newline at end of file