Skip to content
Snippets Groups Projects
To learn more about this project, read the wiki.
README.md 8.25 KiB

ZARP (Zavolan-Lab Automated RNA-Seq Pipeline) is a generic RNA-Seq analysis workflow that allows users to process and analyze Illumina short-read sequencing libraries with minimum effort. The workflow relies on publicly available bioinformatics tools and currently handles single or paired-end stranded bulk RNA-seq data. The workflow is developed in Snakemake, a widely used workflow management system in the bioinformatics community.

According to the current ZARP implementation, reads are analyzed (pre-processed, aligned, quantified) with state-of-the-art tools to give meaningful initial insights into the quality and composition of an RNA-Seq library, reducing hands-on time for bioinformaticians and giving experimentalists the possibility to rapidly assess their data. Additional reports summarise the results of the individual steps and provide useful visualisations.

Note: For a more detailed description of each step, please refer to the workflow documentation.

Requirements

The workflow has been tested on:

  • CentOS 7.5
  • Debian 10
  • Ubuntu 16.04, 18.04

NOTE: Currently, we only support Linux execution.

Installation

1. Clone the repository

Go to the desired directory/folder on your file system, then clone/get the repository and move into the respective directory with:

git clone ssh://git@git.scicore.unibas.ch:2222/zavolan_group/pipelines/zarp.git
cd zarp

2. Conda installation

Workflow dependencies can be conveniently installed with the Conda package manager. We recommend that you install Miniconda for your system (Linux). Be sure to select Python 3 option. The workflow was built and tested with miniconda 4.7.12. Other versions are not guaranteed to work as expected.

3. Dependencies installation

For improved reproducibility and reusability of the workflow, each individual step of the workflow runs either in its own Singularity container or in its own Conda virtual environemnt. As a consequence, running this workflow has very few individual dependencies. The container execution requires Singularity to be installed on the system where the workflow is executed. As the functional installation of Singularity requires root privileges, and Conda currently only provides Singularity for Linux architectures, the installation instructions are slightly different depending on your system/setup:

For most users

If you do not have root privileges on the machine you want to run the workflow on or if you do not have a Linux machine, please install Singularity separately and in privileged mode, depending on your system. You may have to ask an authorized person (e.g., a systems administrator) to do that. This will almost certainly be required if you want to run the workflow on a high-performance computing (HPC) cluster.

NOTE: The workflow has been tested with the following Singularity versions:

  • v2.6.2
  • v3.5.2