Commit 53b95906 authored by BIOPZ-Gypas Foivos

Merge branch 'feature/conda-support' into 'dev'

Conda support in ZARP

See merge request !89
parents 2ab716c0 10885dca
Pipeline #12809 passed
Showing 541 additions and 21 deletions
@@ -13,9 +13,10 @@ test:
# add unit tests here
# add script tests here
- bash tests/test_scripts_prepare_inputs_table/test.sh
#- bash tests/test_scripts_prepare_inputs_labkey/test.sh
# - bash tests/test_scripts_prepare_inputs_labkey/test.sh
#- bash tests/test_alfa/test.sh
# add integration tests here
- bash tests/test_integration_workflow_with_conda/test.local.sh
- bash tests/test_create_dag_image/test.sh
- bash tests/test_create_rule_graph/test.sh
- bash tests/test_integration_workflow/test.local.sh
......
@@ -14,7 +14,7 @@ Below is a schematic representation of the individual steps of the workflow
> ![rule_graph][rule-graph]
For a more detailed description of each step, please refer to the [workflow
documentation][workflow-documentation].
documentation][pipeline-documentation].
## Requirements
@@ -48,12 +48,8 @@ Other versions are not guaranteed to work as expected.
### Installing dependencies
For improved reproducibility and reusability of the workflow,
each individual step of the workflow runs in its own [Singularity][singularity]
container. As a consequence, running this workflow has very few
individual dependencies. However, it requires Singularity to be installed
on the system where the workflow is executed. As the functional installation of
Singularity requires root privileges, and Conda currently only provides
Singularity for Linux architectures, the installation instructions are
each individual step of the workflow runs either in its own [Singularity][singularity]
container or in its own [Conda][conda] virtual environment. As a consequence, running this workflow has very few individual dependencies. However, for the **container execution** it requires Singularity to be installed on the system where the workflow is executed. As the functional installation of Singularity requires root privileges, and Conda currently only provides Singularity for Linux architectures, the installation instructions are
slightly different depending on your system/setup:
#### For most users
@@ -115,12 +111,17 @@ need to be installed.
### Test workflow on local machine
Execute the following command to run the test workflow on your local machine:
Execute the following command to run the test workflow on your local machine (with Singularity):
```bash
bash tests/test_integration_workflow/test.local.sh
```
Alternatively, execute the following command to run the test workflow on your local machine (with Conda):
```bash
bash tests/test_integration_workflow_with_conda/test.local.sh
```
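If it is unclear which of the two local tests applies to a given machine, the choice can be scripted. The following is a minimal sketch and not part of the repository: the helper name `pick_test_script` is hypothetical, and it assumes that finding a `singularity` executable on the `PATH` is a good enough reason to prefer the container-based test.

```bash
#!/bin/bash
# Hypothetical helper (not part of the repository): print the path of the
# local integration test that matches the tooling available on this machine.
# The optional argument is the command to look for (default: singularity).
pick_test_script() {
    local container_cmd="${1:-singularity}"
    if command -v "$container_cmd" > /dev/null 2>&1; then
        # Container runtime found: use the Singularity-based test
        echo "tests/test_integration_workflow/test.local.sh"
    else
        # Otherwise fall back to the Conda-based test
        echo "tests/test_integration_workflow_with_conda/test.local.sh"
    fi
}
```

From the repository root, the selected test could then be started with `bash "$(pick_test_script)"`.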
### Test workflow via Slurm
Execute the following command to run the test workflow on a
@@ -130,6 +131,12 @@ Execute the following command to run the test workflow on a
bash tests/test_integration_workflow/test.slurm.sh
```
or
```bash
bash tests/test_integration_workflow_with_conda/test.slurm.sh
```
> **NOTE:** Depending on the configuration of your Slurm installation or if
> using a different workload manager, you may need to adapt file `cluster.json`
> and the arguments to options `--config` and `--cores` in the file
@@ -177,7 +184,7 @@ your run.
cat << "EOF" > run.sh
#!/bin/bash
snakemake \
--snakefile="../../snakemake/Snakefile" \
--snakefile="/path/to/Snakefile" \
--configfile="config.yaml" \
--cores=4 \
--printshellcmds \
@@ -198,7 +205,7 @@ your run.
#!/bin/bash
mkdir -p logs/cluster_log
snakemake \
--snakefile="../../snakemake/Snakefile" \
--snakefile="/path/to/Snakefile" \
--configfile="config.yaml" \
--cluster-config="cluster.json" \
--cluster="sbatch --cpus-per-task={cluster.threads} --mem={cluster.mem} --qos={cluster.queue} --time={cluster.time} --job-name={cluster.name} -o {cluster.out} -p scicore" \
@@ -210,6 +217,8 @@ your run.
EOF
```
When running the pipeline with Conda, use the `--use-conda` flag instead of `--use-singularity` and `--singularity-args`.
5. Start your workflow run:
```bash
......
@@ -20,7 +20,7 @@ cd $script_dir
# Run tests
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--dag \
--printshellcmds \
......
@@ -20,7 +20,7 @@ cd $script_dir
# Run tests
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--rulegraph \
--printshellcmds \
......
@@ -26,7 +26,7 @@ cd $script_dir
# Run tests
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--cores=4 \
--printshellcmds \
@@ -39,7 +39,7 @@ snakemake \
# Create a Snakemake report after the workflow execution
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--report="snakemake_report.html"
......
@@ -26,7 +26,7 @@ cd $script_dir
# Run tests
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.mutliple_lanes.yml" \
--cores=4 \
--printshellcmds \
@@ -39,7 +39,7 @@ snakemake \
# Create a Snakemake report after the workflow execution
snakemake \
--snakefile="../../Snakefile" \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.mutliple_lanes.yml" \
--report="snakemake_report.html"
......
results/kallisto_indexes/homo_sapiens/kallisto.idx
results/salmon_indexes/homo_sapiens/31/salmon.idx/versionInfo.json
results/salmon_indexes/homo_sapiens/31/salmon.idx/duplicate_clusters.tsv
results/salmon_indexes/homo_sapiens/31/salmon.idx/info.json
results/star_indexes/homo_sapiens/75/STAR_index/chrLength.txt
results/star_indexes/homo_sapiens/75/STAR_index/chrNameLength.txt
results/star_indexes/homo_sapiens/75/STAR_index/chrName.txt
results/star_indexes/homo_sapiens/75/STAR_index/chrStart.txt
results/star_indexes/homo_sapiens/75/STAR_index/exonGeTrInfo.tab
results/star_indexes/homo_sapiens/75/STAR_index/exonInfo.tab
results/star_indexes/homo_sapiens/75/STAR_index/geneInfo.tab
results/star_indexes/homo_sapiens/75/STAR_index/Genome
results/star_indexes/homo_sapiens/75/STAR_index/SA
results/star_indexes/homo_sapiens/75/STAR_index/SAindex
results/star_indexes/homo_sapiens/75/STAR_index/sjdbInfo.txt
results/star_indexes/homo_sapiens/75/STAR_index/sjdbList.fromGTF.out.tab
results/star_indexes/homo_sapiens/75/STAR_index/sjdbList.out.tab
results/star_indexes/homo_sapiens/75/STAR_index/transcriptInfo.tab
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_adapters_mate1.fastq
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_polya_mate1.fastq
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_adapters_mate2.fastq
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_polya_mate2.fastq
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/map_genome/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.SJ.out.tab
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/fastqc_data.txt
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/fastqc.fo
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/summary.txt
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/fastqc_data.txt
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/fastqc.fo
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/summary.txt
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/adapter_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/duplication_levels.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_n_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_sequence_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_sequence_gc_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_sequence_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_tile_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/sequence_length_distribution.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/adapter_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/duplication_levels.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_n_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_sequence_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_sequence_gc_content.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_sequence_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_tile_quality.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/sequence_length_distribution.png
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/quant_kallisto/abundance.tsv
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/quant_kallisto/pseudoalignments.bam
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/quant_kallisto/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.kallisto.pseudo.sam
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/lib_format_counts.json
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/aux_info/ambig_info.tsv
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/aux_info/expected_bias
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/aux_info/observed_bias
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/aux_info/observed_bias_3p
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/aux_info/unmapped_names.txt
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.remove_adapters_mate1.fastq
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.remove_polya_mate1.fastq
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/map_genome/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.SJ.out.tab
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/fastqc_data.txt
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/fastqc.fo
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/summary.txt
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/adapter_content.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/duplication_levels.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_n_content.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_quality.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_sequence_content.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_sequence_gc_content.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_sequence_quality.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_tile_quality.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/sequence_length_distribution.png
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/quant_kallisto/abundance.tsv
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/quant_kallisto/pseudoalignments.bam
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/quant_kallisto/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.kallisto.pseudo.sam
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/lib_format_counts.json
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/aux_info/ambig_info.tsv
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/aux_info/expected_bias
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/aux_info/observed_bias
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/aux_info/observed_bias_3p
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/aux_info/unmapped_names.txt
results/transcriptome/homo_sapiens/transcriptome.fa
results/alfa_indexes/homo_sapiens/75/ALFA/sorted_genes.stranded.ALFA_index
results/alfa_indexes/homo_sapiens/75/ALFA/sorted_genes.unstranded.ALFA_index
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.UniqueMultiple.minus.bg
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.UniqueMultiple.plus.bg
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.Unique.minus.bg
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.Unique.plus.bg
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.UniqueMultiple.minus.bg
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.UniqueMultiple.plus.bg
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.Unique.minus.bg
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.Unique.plus.bg
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.ALFA_feature_counts.tsv
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.ALFA_feature_counts.tsv
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.ALFA_feature_counts.tsv
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.ALFA_feature_counts.tsv
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_UniqueMultiple_minus.bw
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_UniqueMultiple_plus.bw
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_Unique_minus.bw
results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_Unique_plus.bw
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired_UniqueMultiple_minus.bw
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired_UniqueMultiple_plus.bw
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired_Unique_minus.bw
results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired_Unique_plus.bw
results/multiqc_summary/multiqc_data/multiqc_fastqc.txt
results/multiqc_summary/multiqc_data/multiqc_cutadapt.txt
results/multiqc_summary/multiqc_data/multiqc_cutadapt_1.txt
results/multiqc_summary/multiqc_data/multiqc_star.txt
results/multiqc_summary/multiqc_data/multiqc_kallisto.txt
results/multiqc_summary/multiqc_data/multiqc_general_stats.txt
results/summary_kallisto/tx2geneID.tsv
results/summary_kallisto/genes_counts.tsv
results/summary_kallisto/transcripts_counts.tsv
results/summary_kallisto/transcripts_tpm.tsv
results/summary_kallisto/genes_tpm.tsv
\ No newline at end of file
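A list of expected output paths such as the one above lends itself to a simple existence check after a workflow run. How the repository's test scripts actually consume this file is not shown in this diff, so the following is only a sketch; the function name `report_missing_files` and the list file name in the usage note are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: read a file containing one path per line and print
# every path that does not exist as a regular file (relative to the current
# directory). Returns 0 if nothing is missing, 1 otherwise.
report_missing_files() {
    local list_file="$1"
    local path
    local missing=0
    # '|| [ -n "$path" ]' also handles a last line without a trailing newline
    while IFS= read -r path || [ -n "$path" ]; do
        # Skip empty lines
        [ -z "$path" ] && continue
        if [ ! -f "$path" ]; then
            echo "$path"
            missing=1
        fi
    done < "$list_file"
    return "$missing"
}
```

Run from the directory that contains `results/`, a call such as `report_missing_files expected_output.files` would print any listed path that the workflow did not produce.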
cbaebdb67aee4784b64aff7fec9fda42 results/kallisto_indexes/homo_sapiens/kallisto.idx
204865f645102587c4953fccb256797c results/salmon_indexes/homo_sapiens/31/salmon.idx/versionInfo.json
51b5292e3a874119c0e1aa566e95d70c results/salmon_indexes/homo_sapiens/31/salmon.idx/duplicate_clusters.tsv
4e10114bb8f9096d594776181424a302 results/salmon_indexes/homo_sapiens/31/salmon.idx/info.json
dee7cdc194d5d0617552b7a3b5ad8dfb results/star_indexes/homo_sapiens/75/STAR_index/chrLength.txt
8e2e96e2d6b7f29940ad5de40662b7cb results/star_indexes/homo_sapiens/75/STAR_index/chrNameLength.txt
d0826904b8afa45352906ad9591f2bfb results/star_indexes/homo_sapiens/75/STAR_index/chrName.txt
8d3291e6bcdbe9902fbd7c887494173f results/star_indexes/homo_sapiens/75/STAR_index/chrStart.txt
83ea3c15ab782b5c55bfaefda8e7aad8 results/star_indexes/homo_sapiens/75/STAR_index/exonGeTrInfo.tab
bad9d837f9a988694cc7080ee6d2997a results/star_indexes/homo_sapiens/75/STAR_index/exonInfo.tab
0c0b013fb8cbb8f3cb7a7bf92f3b1544 results/star_indexes/homo_sapiens/75/STAR_index/geneInfo.tab
00dda17b3c3983873d1474e9a758d6e6 results/star_indexes/homo_sapiens/75/STAR_index/Genome
c0d91c3af633d9439bfd0160d11efe4d results/star_indexes/homo_sapiens/75/STAR_index/SA
a8dfc49713c053a8a1a2cc2527f15186 results/star_indexes/homo_sapiens/75/STAR_index/SAindex
bae93882f9148a6c55816b733c32a3a2 results/star_indexes/homo_sapiens/75/STAR_index/sjdbInfo.txt
875030141343fca11f0b5aa1a37e1b66 results/star_indexes/homo_sapiens/75/STAR_index/sjdbList.fromGTF.out.tab
ea36f062eedc7f54ceffea2b635a25a8 results/star_indexes/homo_sapiens/75/STAR_index/sjdbList.out.tab
65e794aa5096551254af18a678d02264 results/star_indexes/homo_sapiens/75/STAR_index/transcriptInfo.tab
500dd49da40b16799aba62aa5cf239ba results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_adapters_mate1.fastq
500dd49da40b16799aba62aa5cf239ba results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_polya_mate1.fastq
e90e31db1ce51d930645eb74ff70d21b results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_adapters_mate2.fastq
1c0796d7e0bdab0e99780b2e11d80c19 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.remove_polya_mate2.fastq
9896744dd90ff3eef00c91fa1f721366 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/fastqc_data.txt
6946ba80af318b9c1052b264dc674a51 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/fastqc.fo
2603f3031242e97411a71571f6ad9e53 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/summary.txt
c39fc9108e6f6c0df45acc9391daad9c results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/fastqc_data.txt
82c37e4cb9c1e167383d589ccb5c80b4 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/fastqc.fo
2029b1ecea0c5fb3c54238813cf02a26 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/summary.txt
caf24c834f9f8aa31473c3d5826227ac results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/adapter_content.png
909c316306050c8f7dfb9ad72dfe0334 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/duplication_levels.png
3f7d7acd0b42a4e3642f3cc8f81e7b8d results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_n_content.png
5475f0266800b9febf00979b8dc561e6 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_quality.png
42462f1beeecb7820682284f7d5518cf results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_base_sequence_content.png
dc6b69c56474f492bbc9824631ac84d3 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_sequence_gc_content.png
b5a5a126e3f85478abdac1074aaf2fe1 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_sequence_quality.png
f399fa5792cdfb72fac7ae2226723122 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/per_tile_quality.png
e342ab7fae5112b9ebca5a04cc6230a2 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq1/synthetic_10_reads_paired_synthetic_10_reads_paired.fq1_fastqc/Images/sequence_length_distribution.png
caf24c834f9f8aa31473c3d5826227ac results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/adapter_content.png
909c316306050c8f7dfb9ad72dfe0334 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/duplication_levels.png
3f7d7acd0b42a4e3642f3cc8f81e7b8d results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_n_content.png
5475f0266800b9febf00979b8dc561e6 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_quality.png
1c5ddee8a651c196e1b0ecdd8b406e71 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_base_sequence_content.png
eedd2f45539f47e163c2b390ba6fbcfc results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_sequence_gc_content.png
b5a5a126e3f85478abdac1074aaf2fe1 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_sequence_quality.png
f399fa5792cdfb72fac7ae2226723122 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/per_tile_quality.png
e342ab7fae5112b9ebca5a04cc6230a2 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/fastqc/fq2/synthetic_10_reads_paired_synthetic_10_reads_paired.fq2_fastqc/Images/sequence_length_distribution.png
d41d8cd98f00b204e9800998ecf8427e results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/quant_kallisto/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.kallisto.pseudo.sam
500dd49da40b16799aba62aa5cf239ba results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.remove_adapters_mate1.fastq
500dd49da40b16799aba62aa5cf239ba results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.remove_polya_mate1.fastq
fdb8c6ddd39b606414b2785d6ec2da8a results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/fastqc_data.txt
3cb70940acdcca512207bd8613085538 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/fastqc.fo
fc276a1711cc35f7a9d5328bdbbab810 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/summary.txt
caf24c834f9f8aa31473c3d5826227ac results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/adapter_content.png
909c316306050c8f7dfb9ad72dfe0334 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/duplication_levels.png
3f7d7acd0b42a4e3642f3cc8f81e7b8d results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_n_content.png
5475f0266800b9febf00979b8dc561e6 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_quality.png
42462f1beeecb7820682284f7d5518cf results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_base_sequence_content.png
dc6b69c56474f492bbc9824631ac84d3 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_sequence_gc_content.png
b5a5a126e3f85478abdac1074aaf2fe1 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_sequence_quality.png
f399fa5792cdfb72fac7ae2226723122 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/per_tile_quality.png
e342ab7fae5112b9ebca5a04cc6230a2 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/fastqc/fq1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.fq1_fastqc/Images/sequence_length_distribution.png
d41d8cd98f00b204e9800998ecf8427e results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/quant_kallisto/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.kallisto.pseudo.sam
3ce47cb1d62482c5d62337751d7e8552 results/transcriptome/homo_sapiens/transcriptome.fa
6b44c507f0a1c9f7369db0bb1deef0fd results/alfa_indexes/homo_sapiens/75/ALFA/sorted_genes.stranded.ALFA_index
2caebc23faf78fdbbbdbb118d28bd6b5 results/alfa_indexes/homo_sapiens/75/ALFA/sorted_genes.unstranded.ALFA_index
bcccf679a8c083d01527514c9f5680a0 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.UniqueMultiple.minus.bg
ea91b4f85622561158bff2f7c9c312b3 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.UniqueMultiple.plus.bg
bcccf679a8c083d01527514c9f5680a0 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.Unique.minus.bg
ea91b4f85622561158bff2f7c9c312b3 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.Unique.plus.bg
90ae442ebf35015eab2dd4e804c2bafb results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.UniqueMultiple.minus.bg
16652c037090f3eed1123618a2e75107 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.UniqueMultiple.plus.bg
90ae442ebf35015eab2dd4e804c2bafb results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.Unique.minus.bg
16652c037090f3eed1123618a2e75107 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.Unique.plus.bg
c1254a0bae19ac3ffc39f73099ffcf2b results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.ALFA_feature_counts.tsv
c1254a0bae19ac3ffc39f73099ffcf2b results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/ALFA/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.ALFA_feature_counts.tsv
53fd53f884352d0493b2ca99cef5d76d results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired.ALFA_feature_counts.tsv
53fd53f884352d0493b2ca99cef5d76d results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/ALFA/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired.ALFA_feature_counts.tsv
ed3428feeb7257b0a69ead76a417e339 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_UniqueMultiple_minus.bw
2767ca6a648f3e37b7e3b05ce7845460 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/UniqueMultiple/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_UniqueMultiple_plus.bw
ed3428feeb7257b0a69ead76a417e339 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_Unique_minus.bw
2767ca6a648f3e37b7e3b05ce7845460 results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/bigWig/Unique/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1_Unique_plus.bw
69e2bf688165e9fb7c9c49a8763f5632 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired_UniqueMultiple_minus.bw
ec5aab1b79e7880dfa590e5bc7db5232 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/UniqueMultiple/synthetic_10_reads_paired_synthetic_10_reads_paired_UniqueMultiple_plus.bw
69e2bf688165e9fb7c9c49a8763f5632 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired_Unique_minus.bw
ec5aab1b79e7880dfa590e5bc7db5232 results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/bigWig/Unique/synthetic_10_reads_paired_synthetic_10_reads_paired_Unique_plus.bw
ba090b1b4a2473891de97493d3244956 results/multiqc_summary/multiqc_data/multiqc_fastqc.txt
d8118d944149eecc691d182448696e7f results/multiqc_summary/multiqc_data/multiqc_cutadapt.txt
a127fabda5c3aad9d95414dc4fbc11c3 results/multiqc_summary/multiqc_data/multiqc_cutadapt_1.txt
0c6363588cf6ff74d49f27c164185918 results/multiqc_summary/multiqc_data/multiqc_star.txt
dd81441ca97912a62292d317af2c107c results/multiqc_summary/multiqc_data/multiqc_kallisto.txt
0703b4cb7ec2abfab13ccd5f58c2d536 results/multiqc_summary/multiqc_data/multiqc_general_stats.txt
9f22fcd1b38d9dd692e77cb27f2e52f2 results/summary_kallisto/tx2geneID.tsv
a9514da3fe2c94b9dca71d9e0160be69 results/summary_kallisto/genes_counts.tsv
0d288e71d017a090152384fd915dc2a1 results/summary_kallisto/transcripts_counts.tsv
58a075e4e938d690748b556141912d1c results/summary_kallisto/transcripts_tpm.tsv
8f29696ede8e5d290513f56d3a0b4bff results/summary_kallisto/genes_tpm.tsv
#!/bin/bash
# Tear down test environment
cleanup () {
rc=$?
rm -rf .cache/
rm -rf .config/
rm -rf .fontconfig/
rm -rf .java/
rm -rf .snakemake/
rm -rf logs/
rm -rf results/
rm -rf snakemake_report.html
cd "$user_dir"
echo "Exit status: $rc"
}
trap cleanup EXIT
# Set up test environment
set -eo pipefail # ensures that script exits at first command that exits with non-zero status
set -u # ensures that script exits when unset variables are used
set -x # facilitates debugging by printing out executed commands
user_dir=$PWD
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
cd "$script_dir"
# Run tests
snakemake \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--cores=4 \
--printshellcmds \
--rerun-incomplete \
--use-conda \
--notemp \
--no-hooks \
--verbose
# Create a Snakemake report after the workflow execution
snakemake \
--snakefile="../../workflow/Snakefile" \
--configfile="../input_files/config.yaml" \
--report="snakemake_report.html"
# Check md5 sum of some output files
find results/ -type f -name \*\.gz -exec gunzip '{}' \;
find results/ -type f -name \*\.zip -exec sh -c 'unzip -o {} -d $(dirname {})' \;
md5sum --check "expected_output.md5"
# Checksum file generated with
#find results/ \
# -type f \
# -name \*\.gz \
# -exec gunzip '{}' \;
#find results/ \
# -type f \
# -name \*\.zip \
# -exec sh -c 'unzip -o {} -d $(dirname {})' \;
#md5sum $(cat expected_output.files) > expected_output.md5
# Check whether STAR produces expected alignments
# STAR alignments need to be fully within ground truth alignments for tests to pass; not checking
# vice versa because processing might cut off parts of reads (if testing STAR directly, add '-f 1'
# as additional option)
echo "Verifying STAR output"
result=$(bedtools intersect -F 1 -v -bed \
-a ../input_files/synthetic.mate_1.bed \
-b results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/map_genome/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.Aligned.sortedByCoord.out.bam \
| wc -l)
if [ $result != "0" ]; then
echo "Alignments for mate 1 reads are not consistent with ground truth"
exit 1
fi
result=$(bedtools intersect -F 1 -v -bed \
-a <(cat ../input_files/synthetic.mate_1.bed ../input_files/synthetic.mate_2.bed) \
-b results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/map_genome/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.Aligned.sortedByCoord.out.bam \
| wc -l)
if [ $result != "0" ]; then
echo "Alignments for paired-end reads are not consistent with ground truth"
exit 1
fi
# Check whether Salmon assigns reads to expected genes
echo "Verifying Salmon output"
diff \
<(cat results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/quant.genes.sf | cut -f1,5 | tail -n +2 | sort -k1,1) \
<(cat ../input_files/synthetic.mate_1.bed | cut -f7 | sort | uniq -c | sort -k2nr | awk '{printf($2"\t"$1"\n")}')
diff \
<(cat results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/quant.genes.sf | cut -f1,5 | tail -n +2 | sort -k1,1) \
<(cat ../input_files/synthetic.mate_1.bed | cut -f7 | sort | uniq -c | sort -k2nr | awk '{printf($2"\t"$1"\n")}')
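The checksum step in the script above (`md5sum --check "expected_output.md5"`) compares each listed file against a recorded MD5 digest. A minimal Python sketch of that idea follows. It is not part of this commit: the function names `md5_of_file` and `check_manifest` are hypothetical, and the manifest is assumed to hold one `<digest> <relative path>` pair per line, as in the listing above.

```python
import hashlib
from pathlib import Path
from typing import List


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest: Path, base_dir: Path) -> List[str]:
    """Return the listed paths that are missing or whose digest differs."""
    failures = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # md5sum prefixes binary-mode names with '*'
        target = base_dir / rel_path
        if not target.is_file() or md5_of_file(target) != expected:
            failures.append(rel_path)
    return failures
```

An empty list means every file matched. Note that the digest `d41d8cd98f00b204e9800998ecf8427e` recorded for the kallisto `pseudo.sam` file is the MD5 of an empty file, so that entry only asserts that the file exists and is empty.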
#!/bin/bash
# Tear down test environment
cleanup () {
rc=$?
# rm -rf .cache/
# rm -rf .config/
# rm -rf .fontconfig/
# rm -rf .java/
# rm -rf .snakemake/
# rm -rf logs/
# rm -rf results/
# rm -rf snakemake_report.html
cd "$user_dir"
echo "Exit status: $rc"
}
trap cleanup EXIT
# Set up test environment
set -eo pipefail # ensures that script exits at first command that exits with non-zero status
set -u # ensures that script exits when unset variables are used
set -x # facilitates debugging by printing out executed commands
user_dir=$PWD
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" >/dev/null 2>&1 && pwd)"
cd "$script_dir"
# Run tests
snakemake \
--snakefile="../../Snakefile" \
--configfile="../input_files/config.yaml" \
--cluster-config="../input_files/cluster.json" \
--cluster="sbatch --cpus-per-task={cluster.threads} --mem={cluster.mem} --qos={cluster.queue} --time={cluster.time} --job-name={cluster.name} -o {cluster.out} -p scicore" \
--cores=256 \
--printshellcmds \
--rerun-incomplete \
--use-conda \
--notemp \
--no-hooks \
--verbose
# Create a Snakemake report after the workflow execution
snakemake \
--snakefile="../../Snakefile" \
--configfile="../input_files/config.yaml" \
--report="snakemake_report.html"
# Check md5 sum of some output files
find results/ -type f -name \*\.gz -exec gunzip '{}' \;
find results/ -type f -name \*\.zip -exec sh -c 'unzip -o {} -d $(dirname {})' \;
md5sum --check "expected_output.md5"
# Checksum file generated with
# find results/ \
# -type f \
# -name \*\.gz \
# -exec gunzip '{}' \;
# find results/ \
# -type f \
# -name \*\.zip \
# -exec sh -c 'unzip -o {} -d $(dirname {})' \;
# md5sum $(cat expected_output.files) > expected_output.md5
# Check whether STAR produces expected alignments
# STAR alignments need to be fully within ground truth alignments for tests to pass; not checking
# vice versa because processing might cut off parts of reads (if testing STAR directly, add '-f 1'
# as additional option)
echo "Verifying STAR output"
result=$(bedtools intersect -F 1 -v -bed \
-a ../input_files/synthetic.mate_1.bed \
-b results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/map_genome/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.se.Aligned.sortedByCoord.out.bam \
| wc -l)
if [ $result != "0" ]; then
echo "Alignments for mate 1 reads are not consistent with ground truth"
exit 1
fi
result=$(bedtools intersect -F 1 -v -bed \
-a <(cat ../input_files/synthetic.mate_1.bed ../input_files/synthetic.mate_2.bed) \
-b results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/map_genome/synthetic_10_reads_paired_synthetic_10_reads_paired.pe.Aligned.sortedByCoord.out.bam \
| wc -l)
if [ $result != "0" ]; then
echo "Alignments for paired-end reads are not consistent with ground truth"
exit 1
fi
# Check whether Salmon assigns reads to expected genes
echo "Verifying Salmon output"
diff \
<(cat results/samples/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1/synthetic_10_reads_mate_1_synthetic_10_reads_mate_1.salmon.se/quant.genes.sf | cut -f1,5 | tail -n +2 | sort -k1,1) \
<(cat ../input_files/synthetic.mate_1.bed | cut -f7 | sort | uniq -c | sort -k2nr | awk '{printf($2"\t"$1"\n")}')
diff \
<(cat results/samples/synthetic_10_reads_paired_synthetic_10_reads_paired/synthetic_10_reads_paired_synthetic_10_reads_paired.salmon.pe/quant.genes.sf | cut -f1,5 | tail -n +2 | sort -k1,1) \
<(cat ../input_files/synthetic.mate_1.bed | cut -f7 | sort | uniq -c | sort -k2nr | awk '{printf($2"\t"$1"\n")}')
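The Salmon check that closes both test scripts compares per-gene read counts derived from column 7 of the ground-truth BED file with columns 1 and 5 (`Name`, `NumReads`) of `quant.genes.sf`. The sketch below shows the same comparison in Python. It is an illustration only, not code from this commit: the function names are hypothetical, and unlike the textual `diff` in the scripts it compares values numerically and ignores row order.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import Dict


def expected_gene_counts(bed_path: Path, gene_column: int = 7) -> Dict[str, int]:
    """Count reads per gene from a tab-separated BED file (1-based column index)."""
    counts: Counter = Counter()
    with bed_path.open(newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if row:
                counts[row[gene_column - 1]] += 1
    return dict(counts)


def observed_gene_counts(quant_path: Path) -> Dict[str, float]:
    """Read gene name (column 1) and NumReads (column 5), skipping the header."""
    observed = {}
    with quant_path.open(newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # header line
        for row in reader:
            if row:
                observed[row[0]] = float(row[4])
    return observed


def counts_match(bed_path: Path, quant_path: Path, tol: float = 1e-6) -> bool:
    """True if both files list the same genes with the same read counts."""
    expected = expected_gene_counts(bed_path)
    observed = observed_gene_counts(quant_path)
    if expected.keys() != observed.keys():
        return False
    return all(abs(observed[gene] - n) <= tol for gene, n in expected.items())
```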
......@@ -35,7 +35,7 @@ python "../../scripts/prepare_inputs.py" \
# Check if dry run completes
snakemake \
- --snakefile="../../Snakefile" \
+ --snakefile="../../workflow/Snakefile" \
--configfile="config.yaml" \
--dryrun \
--verbose
......
......@@ -100,8 +100,8 @@ def parse_rule_config(rule_config: dict, current_rule: str, immutable: Tuple[str
localrules: start, finish, rename_star_rpm_for_alfa, prepare_multiqc_config
# Include subworkflows
- include: os.path.join("workflow", "rules", "paired_end.snakefile.smk")
- include: os.path.join("workflow", "rules", "single_end.snakefile.smk")
+ include: os.path.join("rules", "paired_end.snakefile.smk")
+ include: os.path.join("rules", "single_end.snakefile.smk")
rule finish:
......@@ -223,6 +223,9 @@ rule fastqc:
singularity:
"docker://quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1"
conda:
os.path.join(workflow.basedir, "envs", "fastqc.yaml")
log:
stderr = os.path.join(
config["log_dir"],
......@@ -304,6 +307,9 @@ rule create_index_star:
singularity:
"docker://quay.io/biocontainers/star:2.7.8a--h9ee0642_1"
conda:
os.path.join(workflow.basedir, "envs", "STAR.yaml")
threads: 12
log:
......@@ -373,6 +379,9 @@ rule extract_transcriptome:
singularity:
"docker://quay.io/biocontainers/gffread:0.12.1--h2e03b76_1"
conda:
os.path.join(workflow.basedir, "envs", "gffread.yaml")
shell:
"(gffread \
-w {output.transcriptome} \
......@@ -463,6 +472,9 @@ rule create_index_salmon:
singularity:
"docker://quay.io/biocontainers/salmon:1.4.0--h84f40af_1"
conda:
os.path.join(workflow.basedir, "envs", "salmon.yaml")
log:
stderr = os.path.join(
......@@ -518,6 +530,9 @@ rule create_index_kallisto:
singularity:
"docker://quay.io/biocontainers/kallisto:0.46.2--h60f4f9f_2"
conda:
os.path.join(workflow.basedir, "envs", "kallisto.yaml")
log:
stderr = os.path.join(
config['log_dir'],
......@@ -563,6 +578,9 @@ rule extract_transcripts_as_bed12:
singularity:
"docker://quay.io/biocontainers/zgtf:0.1.1--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "zgtf.yaml")
threads: 1
log:
......@@ -611,6 +629,9 @@ rule index_genomic_alignment_samtools:
singularity:
"docker://quay.io/biocontainers/samtools:1.3.1--h1b8c3c0_8"
conda:
os.path.join(workflow.basedir, "envs", "samtools.yaml")
threads: 1
log:
......@@ -700,6 +721,9 @@ rule calculate_TIN_scores:
singularity:
"docker://quay.io/biocontainers/tin-score-calculation:0.4--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "tin-score-calculation.yaml")
shell:
"(calculate-tin.py \
-i {input.bam} \
......@@ -781,6 +805,9 @@ rule salmon_quantmerge_genes:
singularity:
"docker://quay.io/biocontainers/salmon:1.4.0--h84f40af_1"
conda:
os.path.join(workflow.basedir, "envs", "salmon.yaml")
shell:
"(salmon quantmerge \
--quants {params.salmon_in} \
......@@ -864,6 +891,9 @@ rule salmon_quantmerge_transcripts:
singularity:
"docker://quay.io/biocontainers/salmon:1.4.0--h84f40af_1"
conda:
os.path.join(workflow.basedir, "envs", "salmon.yaml")
shell:
"(salmon quantmerge \
--quants {params.salmon_in} \
......@@ -946,6 +976,9 @@ rule kallisto_merge_genes:
singularity:
"docker://quay.io/biocontainers/r-merge-kallisto:0.6--hdfd78af_0"
conda:
os.path.join(workflow.basedir, "envs", "r-merge-kallisto.yaml")
shell:
"(merge_kallisto.R \
--input {params.tables} \
......@@ -1027,6 +1060,9 @@ rule kallisto_merge_transcripts:
singularity:
"docker://quay.io/biocontainers/r-merge-kallisto:0.6--hdfd78af_0"
conda:
os.path.join(workflow.basedir, "envs", "r-merge-kallisto.yaml")
shell:
"(merge_kallisto.R \
--input {params.tables} \
......@@ -1074,6 +1110,9 @@ rule pca_salmon:
singularity:
"docker://quay.io/biocontainers/zpca:0.8.3.post1--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "zpca.yaml")
shell:
"(zpca-tpm \
--tpm {input.tpm} \
......@@ -1120,6 +1159,9 @@ rule pca_kallisto:
singularity:
"docker://quay.io/biocontainers/zpca:0.8.3.post1--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "zpca.yaml")
shell:
"(zpca-tpm \
--tpm {input.tpm} \
......@@ -1210,6 +1252,9 @@ rule star_rpm:
singularity:
"docker://quay.io/biocontainers/star:2.7.8a--h9ee0642_1"
conda:
os.path.join(workflow.basedir, "envs", "STAR.yaml")
log:
stderr = os.path.join(
config["log_dir"],
......@@ -1356,6 +1401,9 @@ rule generate_alfa_index:
singularity:
"docker://quay.io/biocontainers/alfa:1.1.1--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "alfa.yaml")
log:
os.path.join(
config["log_dir"],
......@@ -1459,6 +1507,9 @@ rule alfa_qc:
singularity:
"docker://quay.io/biocontainers/alfa:1.1.1--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "alfa.yaml")
log:
os.path.join(
config["log_dir"],
......@@ -1484,7 +1535,6 @@ rule prepare_multiqc_config:
input:
script = os.path.join(
workflow.basedir,
- "workflow",
"scripts",
"zarp_multiqc_config.py")
......@@ -1632,6 +1682,9 @@ rule multiqc_report:
singularity:
"docker://quay.io/biocontainers/zavolan-multiqc-plugins:1.3--pyh5e36f6f_0"
conda:
os.path.join(workflow.basedir, "envs", "zavolan-multiqc-plugins.yaml")
shell:
"(multiqc \
--outdir {output.multiqc_report} \
......@@ -1676,6 +1729,9 @@ rule sort_bed_4_big:
singularity:
"docker://quay.io/biocontainers/bedtools:2.27.1--h9a82719_5"
conda:
os.path.join(workflow.basedir, "envs", "bedtools.yaml")
log:
stderr = os.path.join(
config["log_dir"],
......@@ -1736,6 +1792,9 @@ rule prepare_bigWig:
singularity:
"docker://quay.io/biocontainers/ucsc-bedgraphtobigwig:377--h0b8a92a_2"
conda:
os.path.join(workflow.basedir, "envs", "ucsc-bedgraphtobigwig.yaml")
log:
stderr = os.path.join(
config["log_dir"],
......
---
channels:
- conda-forge
- bioconda
dependencies:
- STAR=2.7.8a
...
---
channels:
- conda-forge
- bioconda
dependencies:
- alfa=1.1.1
...
---
channels:
- conda-forge
- bioconda
dependencies:
- bedtools=2.27.1
...
---
channels:
- conda-forge
- bioconda
dependencies:
- cutadapt=3.4
...
---
channels:
- conda-forge
- bioconda
dependencies:
- fastqc=0.11.9
...
---
channels:
- conda-forge
- bioconda
dependencies:
- gffread=0.12.1
...
---
channels:
- conda-forge
- bioconda
dependencies:
- kallisto=0.46.2
...
---
channels:
- conda-forge
- bioconda
dependencies:
- r-merge-kallisto=0.6
...
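Each rule in the Snakefile hunks above now carries both a `singularity:` image and a `conda:` environment file, so the two version pins need to stay in step. The following sketch shows one way to check that a Biocontainers image URI and a Conda pin name the same tool and version. It is not part of this commit: the helper names are hypothetical, and it assumes image tags of the form `<version>--<build>` and pins of the form `<name>=<version>`, as seen in the diff. Names are compared case-insensitively because the environment file pins `STAR` while the image is named `star`.

```python
from typing import Tuple


def parse_container(uri: str) -> Tuple[str, str]:
    """Split '.../<tool>:<version>--<build>' into (tool, version)."""
    image = uri.rsplit("/", 1)[-1]
    tool, tag = image.split(":", 1)
    version = tag.split("--", 1)[0]
    return tool.lower(), version


def parse_conda_pin(spec: str) -> Tuple[str, str]:
    """Split '<name>=<version>' into (name, version)."""
    name, version = spec.split("=", 1)
    return name.strip().lower(), version.strip()


def pins_agree(container_uri: str, conda_spec: str) -> bool:
    """True if the image and the Conda pin refer to the same tool and version."""
    return parse_container(container_uri) == parse_conda_pin(conda_spec)


if __name__ == "__main__":
    # Pairs copied from the rules and environment files shown in this diff.
    pairs = [
        ("docker://quay.io/biocontainers/star:2.7.8a--h9ee0642_1", "STAR=2.7.8a"),
        ("docker://quay.io/biocontainers/fastqc:0.11.9--hdfd78af_1", "fastqc=0.11.9"),
        ("docker://quay.io/biocontainers/alfa:1.1.1--pyh5e36f6f_0", "alfa=1.1.1"),
    ]
    for uri, spec in pairs:
        print(spec, "matches" if pins_agree(uri, spec) else "DIFFERS from", uri)
```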