Merge branch 'master' of ssh://git.scicore.unibas.ch:2222/AnnotationPipelines/riboseq_pipeline

c915fcbc · BIOPZ-Gypas Foivos · 63e1a63c · 61b3b6b9 · c915fcbc
Commit c915fcbc authored 6 years ago by BIOPZ-Gypas Foivos
--- a/README.md
+++ b/README.md
+# Riboseq pipeline
+
+## Requirements
+
+* wget
+* git
+* singularity
+* Slurm Workload Manager
+
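A quick way to confirm that the requirements above are present is to look each tool up on the `PATH`. The sketch below is illustrative only: the helper name `check_tools` is hypothetical, and using `sbatch` as a proxy for a working Slurm installation is an assumption, not something the README states.

```shell
# Report which of the given command-line tools can be found on PATH.
# Assumption: the presence of `sbatch` indicates that Slurm is available.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found
    return "$missing"
}

check_tools wget git singularity sbatch || echo "Install the missing tools before continuing."
```
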
+## Features
+
+Pipeline for Ribo-Seq data. It consists of two snakemake workflows:
+* prepare_annotation: Prepares the annotation files
+* process_data: Processes the Ribo-Seq data
+
+## Installation
+
+The recommended way is to create a virtual environment with conda and install snakemake and its dependencies into it.
+**The workflows must be run on a system where singularity is available.**
+
+### Step 1: Download miniconda 3 installation file (if not already installed)
+
+
+for Linux:
+```
+wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
+```
+
+### Step 2: Install miniconda 3
+
+Make sure that you run the 'bash' shell and execute:
+
+for Linux:
+```
+bash Miniconda3-latest-Linux-x86_64.sh
+```
+
+### Step 3: Create a new conda environment
+
+Create a new conda environment
+```
+conda create --name riboseq_pipeline --channel bioconda --channel conda-forge snakemake=4.8.1
+```
+
+Activate the virtual environment
+```
+conda activate riboseq_pipeline
+```
+
+You can later deactivate the virtual environment with
+```
+conda deactivate
+```
+
+Check if snakemake was installed properly
+```
+snakemake --help
+```
+
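Because the conda command pins snakemake to 4.8.1, it may also be worth checking the installed version rather than only the help output. This is a minimal sketch: `version_matches` is a hypothetical helper, and the script only assumes that `snakemake --version` prints the bare version string.

```shell
# Compare the installed snakemake version with the one pinned in the conda command.
expected_version="4.8.1"

version_matches() {
    # $1: expected version, $2: version string reported by the tool
    [ "$1" = "$2" ]
}

if command -v snakemake >/dev/null 2>&1; then
    installed_version=$(snakemake --version)
    if version_matches "$expected_version" "$installed_version"; then
        echo "snakemake $installed_version matches the pinned version"
    else
        echo "snakemake $installed_version found, expected $expected_version"
    fi
else
    echo "snakemake not found; is the riboseq_pipeline environment active?"
fi
```
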
+### Step 4: Clone the repository
+```
+git clone ssh://git@git.scicore.unibas.ch:2222/AnnotationPipelines/riboseq_pipeline.git
+```
+
+## Configure pipeline
+
+### Download annotation files
+
+Go to the snakemake directory and create a new annotation directory
+```
+cd riboseq_pipeline/snakemake
+mkdir annotation
+```
+
+Download an annotation file (e.g. gtf from ENSEMBL) and uncompress it
+```
+wget ftp://ftp.ensembl.org/pub/release-90/gtf/homo_sapiens/Homo_sapiens.GRCh38.90.chr.gtf.gz
+gunzip Homo_sapiens.GRCh38.90.chr.gtf.gz
+```
+
+Download a chromosome sequences file (e.g. soft-masked fasta from ENSEMBL) and uncompress it
+```
+wget ftp://ftp.ensembl.org/pub/release-90/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
+gunzip Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz
+```
+
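Large FTP downloads are occasionally truncated, so an optional safeguard is to test each archive before uncompressing it. The sketch below wraps `gzip -t` (gzip's built-in integrity test) around `gunzip`; the helper name `safe_gunzip` is hypothetical and not part of the pipeline.

```shell
# Uncompress a .gz file only if it passes gzip's built-in integrity test.
safe_gunzip() {
    archive="$1"
    if gzip -t "$archive" 2>/dev/null; then
        gunzip "$archive"
    else
        echo "corrupt or incomplete download: $archive" >&2
        return 1
    fi
}

# Example call, using a file name from the steps above:
# safe_gunzip Homo_sapiens.GRCh38.90.chr.gtf.gz
```

If the check fails, downloading the file again is usually the simplest fix.
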
+Then download rRNA sequences (e.g. from RefSeq). Members of the group can use the following file:
+```
+cp /scicore/home/zavolan/gypas/projects/resources/human/rRNA/txome_rRNAs_joao.fa .
+```
+
+Finally, copy or create a file with oligos. Members of the group can use the following file:
+```
+cp /scicore/home/zavolan/gypas/projects/resources/human/rRNA/oligos.txt .
+```
+
+### Configure and run workflows
+
+As mentioned earlier, two snakemake workflows are available: one that prepares the annotation files (e.g. generating index files) and one that processes the Ribo-Seq data.
+
+#### Prepare annotation workflow
+
+First, go to the 'snakemake/prepare_annotation' directory and fill in the 'config.yaml' file. To make sure that everything is configured properly, create a DAG (directed acyclic graph) of the workflow.
+
+```
+bash create_snakemake_flowchart.sh
+```
+
+Finally, run the pipeline. The script is configured for the Slurm Workload Manager.
+```
+nohup bash run_snakefile.sh &
+```
+
+#### Process data workflow
+
+Once the prepare_annotation workflow is complete, move to the 'snakemake/process_data' directory. Copy or hard-link the Ribo-Seq samples you want to process into the 'samples' directory, then fill in the 'config.yaml' file.
+
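Hard-linking the samples avoids duplicating large files, but it only works when the source and the 'samples' directory are on the same filesystem. The following sketch shows one way to do it; the helper name `link_samples`, the `*.fastq.gz` pattern, and the example source path are all assumptions, since the README does not specify how sample files are named or where they live.

```shell
# Hard-link every matching sample file from a source directory into a target directory.
# Assumption: samples are gzip-compressed FASTQ files named *.fastq.gz.
link_samples() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    for sample in "$src_dir"/*.fastq.gz; do
        # Skip the unexpanded pattern when no file matches
        [ -e "$sample" ] || continue
        ln "$sample" "$dest_dir/"
    done
}

# Example call from within 'snakemake/process_data' (hypothetical source path):
# link_samples /path/to/raw_data samples
```
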
+Create the DAG
+```
+bash create_snakemake_flowchart.sh
+```
+
+Finally, run the pipeline. The script is configured for the Slurm Workload Manager.
+```
+nohup bash run_snakefile.sh &
+```
+
+