Workflow for the analysis of the synthetic data

This workflow automates the analysis of the synthetic data. It takes the following inputs and produces the following intermediate outputs:

Inputs (I#)

I1. Path to genome sequence file
I3. Path to gene expression table (csv)
I14. Output directory
I16. Pattern specifying the reads file name for an individual cell

Intermediate outputs

O1. Sampled transcript structures (gtf)
O11. Files with read-to-genome alignments (bam)
O12. Path to estimates of gene expression
O13. Plot with mean vs. standard deviation for individual genes
O14. Path to table with mean vs. standard deviation for individual genes (csv)
O15. Plot with initial vs. inferred expression levels for all genes
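As an illustration of what the table O14 might contain, here is a minimal Python sketch that computes the per-gene mean and sample standard deviation across cells and writes them as csv. The input layout (a dict mapping gene to a list of per-cell estimates), the column names `gene`, `mean`, `sd`, and the function names are assumptions made for this example, not something the workflow prescribes.

```python
import csv
import statistics


def mean_sd_table(expression):
    """Return one row per gene with mean and sample standard deviation.

    `expression` maps a gene identifier to a list of per-cell expression
    estimates (hypothetical layout, chosen for this sketch).
    """
    rows = []
    for gene, values in sorted(expression.items()):
        mean = statistics.fmean(values)
        # The sample standard deviation needs at least two cells.
        sd = statistics.stdev(values) if len(values) > 1 else 0.0
        rows.append({"gene": gene, "mean": mean, "sd": sd})
    return rows


def write_table(rows, handle):
    """Write the rows as csv to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=["gene", "mean", "sd"])
    writer.writeheader()
    writer.writerows(rows)
```

The same rows could feed the plot O13; the plotting library is left open here.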

The analysis proceeds in the following steps:

Mapping reads to genome (#19)
- Inputs I1, I14, I16
- Outputs O11
Quantifying gene expression (#11)
- Inputs O1, O11
- Outputs O12
MA plot (#12)
- Inputs O12
- Outputs O13, O14
Estimating accuracy of gene expression inference
- Inputs I3, O14
- Outputs O15
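
The dependencies between the steps can be sketched in Python as a small graph, which also checks that every step only consumes items that are either provided up front or produced by another step. This is a hedged illustration, not the workflow implementation: the step names are hypothetical, it assumes that the mapping step produces the alignments (O11) and the accuracy step produces the plot O15, and it treats O1 as already available from an upstream step.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical step names -> (inputs, outputs), using the identifiers above.
STEPS = {
    "map_reads": ({"I1", "I14", "I16"}, {"O11"}),
    "quantify_expression": ({"O1", "O11"}, {"O12"}),
    "mean_sd_plot": ({"O12"}, {"O13", "O14"}),
    "estimate_accuracy": ({"I3", "O14"}, {"O15"}),
}

# Items assumed to exist before the first step runs.
AVAILABLE = {"I1", "I3", "I14", "I16", "O1"}


def execution_order(steps, available):
    """Return the step names in an order that respects their dependencies."""
    # Map each output to the step that produces it.
    producer = {out: name for name, (_, outs) in steps.items() for out in outs}
    graph = {}
    for name, (inputs, _) in steps.items():
        missing = {i for i in inputs if i not in available and i not in producer}
        if missing:
            raise ValueError(f"{name} needs undefined items: {sorted(missing)}")
        # A step depends on the producers of its inputs.
        graph[name] = {producer[i] for i in inputs if i in producer}
    return list(TopologicalSorter(graph).static_order())


print(execution_order(STEPS, AVAILABLE))
```

A workflow manager would express the same dependencies through its own rule or process definitions; the check above is only meant to make the input/output bookkeeping explicit.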

Edited Dec 01, 2021 by MihaelaZavolan