Commit c915fcbc authored by BIOPZ-Gypas Foivos
parents 63e1a63c 61b3b6b9
README.md 0 → 100644
# Riboseq pipeline
## Requirements
* wget
* git
* singularity
* Slurm Workload Manager
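Before installing anything else, it can help to confirm that the listed tools are on your `PATH`. The following is a minimal sketch, not part of the repository: the `check_tools` helper is hypothetical, and using `sbatch` as a proxy for a working Slurm installation is an assumption.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# sbatch stands in for "Slurm is available" (an assumption).
check_tools wget git singularity sbatch || echo "Install the missing tools before continuing."
```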
## Features
Pipeline for Ribo-Seq data. It consists of two snakemake workflows:
* prepare_annotation: Prepares the annotation files
* process_data: Processes the Ribo-Seq data
## Installation
The recommended way is to create a virtual environment via conda and install the snakemake dependencies.
**To run the workflows, you need a system where singularity is available.**
### Step 1: Download miniconda 3 installation file (if not already installed)
for Linux:
```
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
```
### Step 2: Install miniconda 3
Make sure that you are running the 'bash' shell, then execute the installer.
for Linux:
```
bash Miniconda3-latest-Linux-x86_64.sh
```
### Step 3: Create a new conda environment
Create a new conda environment
```
conda create --name riboseq_pipeline --channel bioconda --channel conda-forge snakemake=4.8.1
```
Activate the virtual environment
```
conda activate riboseq_pipeline
```
You can deactivate the virtual environment later with
```
conda deactivate
```
Check if snakemake was installed properly
```
snakemake --help
```
### Step 4: Clone the repository
```
git clone ssh://git@git.scicore.unibas.ch:2222/AnnotationPipelines/riboseq_pipeline.git
```
## Configure pipeline
### Download annotation files
Go to the snakemake directory and create a new annotation directory
```
cd riboseq_pipeline/snakemake
mkdir annotation
```
Download an annotation file (e.g. gtf from ENSEMBL) and uncompress it
```
wget ftp://ftp.ensembl.org/pub/release-90/gtf/homo_sapiens/Homo_sapiens.GRCh38.90.chr.gtf.gz
gunzip Homo_sapiens.GRCh38.90.chr.gtf.gz
```
Download a chromosome sequences file (e.g. soft-masked fasta from ENSEMBL) and uncompress it
```
wget ftp://ftp.ensembl.org/pub/release-90/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
```
Then download rRNA sequences (e.g. from RefSeq). Members of the group can use the following file:
```
cp /scicore/home/zavolan/gypas/projects/resources/human/rRNA/txome_rRNAs_joao.fa .
```
Finally, copy or create a file with oligos. Members of the group can use the following file:
```
cp /scicore/home/zavolan/gypas/projects/resources/human/rRNA/oligos.txt .
```
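Once the downloads finish, a quick check that all four inputs exist and are non-empty can save a failed workflow run later. The sketch below is not part of the repository: `check_annotation_files` is a hypothetical helper, and the file names are simply the example names used above. Pass whichever directory you downloaded the files into.

```shell
# Check that the example annotation inputs exist and are non-empty.
# Usage: check_annotation_files [directory]   (defaults to the current directory)
check_annotation_files() {
    dir="${1:-.}"
    status=0
    for f in \
        Homo_sapiens.GRCh38.90.chr.gtf \
        Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
        txome_rRNAs_joao.fa \
        oligos.txt
    do
        if [ -s "$dir/$f" ]; then
            echo "ok:      $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

check_annotation_files . || echo "Some annotation inputs are missing or empty."
```

If you use a different organism or release, adjust the file names in the loop accordingly.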
### Configure and run workflows
As mentioned earlier, two snakemake workflows are available: one that prepares the annotation files (e.g. generates index files) and one that processes the Ribo-Seq data.
#### Prepare annotation workflow
First, go to the 'snakemake/prepare_annotation' directory and fill in the 'config.yaml' file. To make sure that everything is configured properly, create a DAG (directed acyclic graph) of the workflow.
```
bash create_snakemake_flowchart.sh
```
Finally, run the pipeline. This script is configured for the Slurm Workload Manager.
```
nohup bash run_snakefile.sh &
```
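Because the run is started in the background with `nohup`, it is useful to have a way to look at its progress. The sketch below assumes that the output ends up in `nohup.out` in the current directory (the `nohup` default when the script does not redirect its own output) and that Slurm's `squeue` is available. `show_run_status` is a hypothetical helper, not part of the repository.

```shell
# Print the tail of the nohup log and, if Slurm is available, the user's queued jobs.
# Usage: show_run_status [logfile]   (defaults to nohup.out)
show_run_status() {
    log="${1:-nohup.out}"
    if [ -f "$log" ]; then
        tail -n 20 "$log"
    else
        echo "no log file at $log yet"
    fi
    # squeue lists Slurm jobs; skip it quietly on systems without Slurm.
    if command -v squeue >/dev/null 2>&1; then
        squeue -u "$(id -un)"
    fi
}

show_run_status
```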
#### Process data workflow
Once the prepare_annotation workflow is complete, move to the 'snakemake/process_data' directory. Copy the Ribo-Seq samples you want to process into the 'samples' directory, or create hard links to them there. Then fill in the 'config.yaml' file.
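Placing a sample into the 'samples' directory can be sketched as follows. A hard link avoids duplicating large files but only works within a single filesystem, so this version falls back to a copy. `link_sample` and the path in the example call are hypothetical, not part of the repository.

```shell
# Hard-link a sample into the samples directory; copy it if linking fails
# (for example, when source and destination are on different filesystems).
# Usage: link_sample <source-file> [destination-directory]
link_sample() {
    src="$1"
    dest_dir="${2:-samples}"
    target="$dest_dir/$(basename "$src")"
    mkdir -p "$dest_dir"
    if [ -e "$target" ]; then
        echo "already present: $target"
        return 0
    fi
    ln "$src" "$target" 2>/dev/null || cp "$src" "$target"
}

# Example call (hypothetical path):
# link_sample /path/to/my_sample.fastq.gz samples
```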
Create the DAG
```
bash create_snakemake_flowchart.sh
```
Finally, run the pipeline. This script is configured for the Slurm Workload Manager.
```
nohup bash run_snakefile.sh &
```