Commit 2bdcf390 authored by Alex Kanitz's avatar Alex Kanitz

Merge branch 'patch-1' into 'dev'

docs: revised terminology for non-experts

See merge request !73
# ZARP
[Snakemake][snakemake] workflow that covers common steps of short read RNA-Seq
library analysis developed by the [Zavolan lab][zavolan-lab].
Reads are analyzed (pre-processed, aligned, quantified) with state-of-the-art
tools to give meaningful initial insights into the quality and composition
of an RNA-Seq library, reducing hands-on time for bioinformaticians and giving
experimentalists the possibility to rapidly assess their data.

Below is a schematic representation of the individual steps of the workflow
("pe" refers to "paired-end"):

> ![rule_graph][rule-graph]

For a more detailed description of each step, please refer to the [workflow
documentation][workflow-documentation].
## Requirements
### Cloning the repository
Traverse to the desired directory/folder on your file system, then clone/get the
repository and move into the respective directory with:

```bash
git clone ssh://git@git.scicore.unibas.ch:2222/zavolan_group/pipelines/zarp.git
```
For improved reproducibility and reusability of the workflow,
each individual step of the workflow runs in its own [Singularity][singularity]
container. As a consequence, running this workflow has very few
individual dependencies. However, it requires Singularity to be installed
on the system where the workflow is executed. As the functional installation of
Singularity requires root privileges, and Conda currently only provides
Singularity for Linux architectures, the installation instructions are
slightly different depending on your system/setup:
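Regardless of how you install it, you can quickly check whether a working Singularity client is available before attempting a workflow run. The snippet below is a sketch (the exact version string format differs between Singularity releases):

```shell
# Sketch: check that a Singularity client is on the PATH; the workflow
# itself invokes Singularity through Snakemake's container support.
if command -v singularity >/dev/null 2>&1; then
    echo "Singularity available: $(singularity --version)"
else
    echo "Singularity not found on PATH" >&2
fi
```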
## Testing the installation
We have prepared several tests to check the integrity of the workflow and its
components. These can be found in subdirectories of the `tests/` directory.
The most critical of these tests enable you to execute the entire workflow on a
set of small example input files. Note that for this and other tests to complete
successfully, [additional dependencies](#installing-non-essential-dependencies)
need to be installed.
### Test workflow on local machine
Execute the following command to run the test workflow on your local machine:

```bash
bash tests/test_integration_workflow/test.local.sh
```
### Test workflow via Slurm
Execute the following command to run the test workflow on a
[Slurm][slurm]-managed high-performance computing (HPC) cluster:

```bash
bash tests/test_integration_workflow/test.slurm.sh
```
> **NOTE:** Depending on the configuration of your Slurm installation or if
> using a different workload manager, you may need to adapt file `cluster.json`
> and the arguments to options `--config` and `--cores` in the file
> `test.slurm.sh`, both located in directory `tests/test_integration_workflow`.
> Consult the manual of your workload manager as well as the section of the
> Snakemake manual dealing with [cluster execution].
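If you need a starting point for adapting `cluster.json`, the sketch below writes a minimal file following Snakemake's cluster-configuration convention, where the `__default__` key holds settings applied to every rule. The resource values here are purely illustrative; compare them with the file shipped in `tests/test_integration_workflow`:

```shell
# Write a minimal, illustrative cluster.json; adjust values to your cluster.
# "__default__" is Snakemake's convention for settings applied to all rules.
cat > cluster.json <<'EOF'
{
    "__default__": {
        "time": "01:00:00",
        "threads": "1",
        "mem": "4G"
    }
}
EOF
```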
## Running the workflow on your own samples
1. Assuming that your current directory is the repository's root directory,
create a directory for your workflow run and traverse inside it.

2. Create empty sample table and workflow configuration files:

```bash
touch samples.tsv
touch config.yaml
touch cluster.json
```
3. Use your editor of choice to populate these files with appropriate
values. Have a look at the examples in the `tests/` directory to see what the
files should look like, specifically:
4. Create a runner script. Pick one of the following choices for either local
or cluster execution. Before execution of the respective command, you must
replace the data directory placeholders in the argument of the
`--singularity-args` option with a comma-separated list of _all_ directories
containing input data files (samples and any annotation files etc.) required
for your run.
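For orientation, a local runner might look like the following sketch. It only assembles and prints the Snakemake call so you can review it before executing it yourself; the bind paths are placeholders, the `--cores` value should match your machine, and the relative Snakefile path assumes your run directory sits directly under the repository root:

```shell
#!/usr/bin/env bash
# Hypothetical local runner sketch: assembles a Snakemake call and prints it
# for review instead of executing it. Replace the --bind paths with all
# directories containing your input data.
set -euo pipefail

cmd=(snakemake
    --snakefile="../Snakefile"
    --configfile="config.yaml"
    --cores=4
    --use-singularity
    --singularity-args="--bind /path/to/samples,/path/to/annotations")

printf '%s\n' "${cmd[*]}"
# To actually run the workflow, execute: "${cmd[@]}"
```

The `--use-singularity` and `--singularity-args` options are standard Snakemake flags for running each rule inside its container; consult the Snakemake CLI documentation for further options.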
Our lab stores metadata for sequencing samples in a locally deployed
LabKey instance. We provide two scripts that allow
programmatic access to the LabKey data table and convert it to the
corresponding workflow inputs (`samples.tsv` and `config.yaml`), respectively.
As such, these scripts largely automate step 3 of the above instructions.
However, as these scripts were written specifically for the needs of our lab,
they are likely not directly usable or, at least, will require considerable
modification for other setups (e.g., a different LabKey table structure).
Nevertheless, they can serve as an example for interfacing between LabKey and
your workflow.
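As a rough illustration of the kind of conversion such a script performs, the sketch below derives a minimal sample table from an exported metadata TSV. All file names and column names here are invented, and the real workflow inputs require more fields:

```shell
# Hypothetical example only: file and column names are invented.
# 1. Simulate a metadata table exported from a LabKey-like system.
printf 'sample\tcondition\tfq1\nctl_1\tcontrol\t/data/ctl_1.fq.gz\n' > metadata_export.tsv
# 2. Keep only the columns needed for the sample table (here: 1 and 3).
cut -f1,3 metadata_export.tsv > samples.tsv
cat samples.tsv
```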
> **NOTE:** All of the below steps assume that your current working directory
> is the repository's root directory.